Commit Graph

17 Commits

Author SHA1 Message Date
Andrei Stefanescu
0be59e5815
Update Macaw to use HasLLVMAnn. (#122) 2020-04-02 17:58:47 -07:00
Tristan Ravitch
80125921a9 This commit re-implements the memory model used by macaw-symbolic
There are two major changes:
- The interface to memory models in Data.Macaw.Symbolic has changed
- The suggested implementation in Data.Macaw.Symbolic.Memory has changed

The change improves performance and fixes a soundness bug.

* `macawExtensions` (Data.Macaw.Symbolic) takes a new argument: a `MkGlobalPointerValidityPred`.  Use `mkGlobalPointerValidityPred` to provide one.
* `mapRegionPointers` no longer takes a default pointer argument (delete it at call sites)
* `GlobalMap` returns an `LLVMPtr sym w` instead of a `Maybe (LLVMPtr sym w)`

Users of the suggested memory model do not need to worry about the last change,
as it has been migrated.  If you provided your own address mapping function, it
must now be total.  This is annoying, but the old API was unsound because
macaw-symbolic did not have enough information to correctly handle the `Nothing`
case.  The idea of the change is that the mapping function should translate any
concrete pointers as appropriate, while symbolic pointers should generate a mux
over all possible allocations.  Unfortunately, macaw-symbolic does not have
enough information to generate the mux itself, as there may be allocations
created externally.

This interface and implementation are concerned with handling pointers to static
memory in a binary.  These are distinguished from pointers to
dynamically-allocated or stack memory because many machine code instructions
compute bitvectors and treat them as pointers.  In the LLVM memory model used by
macaw-symbolic, each memory allocation has a block identifier (a natural
number).  The stack and each heap allocation get unique block identifiers.
However, pointers to static memory have no block identifier and must be mapped
to a block in order to fit into the LLVM memory model.

The previous implementation assigned each mapped memory segment in a binary to
its own LLVM memory allocation.  This had the nice property of implicitly
proving that no memory access was touching unmapped memory.  Unfortunately, it
was especially inefficient in the presence of symbolic reads and writes, as it
generated mux trees over all possible allocations and significantly slowed
symbolic execution.

The new memory model implementation (in Data.Macaw.Symbolic.Memory) instead uses
a single allocation for all static memory.  This pushes more of the logic
for resolving reads and writes into the SMT solver and the theory of arrays.  In
cases where sufficient constraints exist in path conditions, this means that we
can support symbolic reads and writes.  Additionally, since we have only a
single SMT array backing all allocations, mapping bitvectors to LLVM pointers in
the memory model is trivial: we just change their block identifier from zero
(denoting a bitvector) to the block identifier of the unique allocation backing
static data.
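
The block-identifier rewrite described above can be illustrated with a small
standalone sketch.  This is a toy model only: `Ptr`, `isBitvector`, and
`toStaticPtr` are hypothetical names invented for illustration and are not part
of macaw-symbolic or crucible-llvm, which represent pointers symbolically rather
than with concrete numbers.

```haskell
import Data.Word (Word64)
import Numeric.Natural (Natural)

-- | Toy stand-in for a pointer in the LLVM memory model (hypothetical type):
-- a block identifier paired with an offset into that block.
data Ptr = Ptr
  { ptrBlock  :: Natural
  , ptrOffset :: Word64
  } deriving (Eq, Show)

-- | Block identifier zero denotes a raw bitvector rather than a real pointer.
isBitvector :: Ptr -> Bool
isBitvector p = ptrBlock p == 0

-- | Reinterpret a raw bitvector as a pointer into the single allocation that
-- backs static data.  Pointers that already carry a block are left untouched.
toStaticPtr :: Natural -> Ptr -> Ptr
toStaticPtr staticBlock p
  | isBitvector p = p { ptrBlock = staticBlock }
  | otherwise     = p

main :: IO ()
main = do
  -- A bitvector computed by machine code is rebased onto the static block.
  print (toStaticPtr 7 (Ptr 0 0x400000))
  -- A heap or stack pointer (non-zero block) passes through unchanged.
  print (toStaticPtr 7 (Ptr 3 16))
```

The offset is preserved as-is in this sketch, which mirrors the idea that only
the block identifier changes; how the real implementation lays out the backing
array is not modeled here.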

This change has to do some extra work to ensure safety (especially that unmapped
memory is never written to or read from).  This is handled with the
MkGlobalPointerValidityPred interface in Data.Macaw.Symbolic.  This function,
which is passed to the macaw-symbolic initialization, constructs well-formedness
predicates for all pointers used to access memory.  Symbolic execution tasks
that do not need to enforce this property can simply provide a function that
never returns any predicates to check.  Implementations that want a reasonable
default can use the mkGlobalPointerValidityPred from Data.Macaw.Symbolic.Memory.
The default implementation ensures that no reads or writes touch unmapped memory
and that writes to static data never write to read-only segments.
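
As a rough illustration of the property the default predicate is described as
enforcing, the following sketch checks concrete accesses against a list of
mapped segments.  All names here (`Segment`, `Access`, `accessIsValid`) are
assumptions made for this example; the real `mkGlobalPointerValidityPred` builds
symbolic predicates, and its actual signature is not reproduced here.

```haskell
import Data.Word (Word64)

-- | Kind of memory access being checked (hypothetical type).
data Access = Read | Write
  deriving (Eq, Show)

-- | A mapped region of static memory (hypothetical type).
data Segment = Segment
  { segBase     :: Word64  -- ^ first mapped address
  , segSize     :: Word64  -- ^ number of mapped bytes
  , segWritable :: Bool    -- ^ False for read-only segments
  } deriving (Show)

-- | True when the byte range [addr, addr + len) lies entirely inside the
-- segment.  The comparisons are ordered so that no subtraction underflows.
inSegment :: Word64 -> Word64 -> Segment -> Bool
inSegment addr len seg =
  addr >= segBase seg
    && len <= segSize seg
    && addr - segBase seg <= segSize seg - len

-- | An access is valid if it falls within some mapped segment and, for
-- writes, that segment is writable.
accessIsValid :: [Segment] -> Access -> Word64 -> Word64 -> Bool
accessIsValid segs access addr len = any ok segs
  where
    ok seg = inSegment addr len seg && (access == Read || segWritable seg)

main :: IO ()
main = do
  let segs = [ Segment 0x1000 0x100 False   -- read-only data
             , Segment 0x2000 0x100 True ]  -- writable data
  print (accessIsValid segs Read  0x1000 8)  -- read of mapped memory
  print (accessIsValid segs Write 0x1000 8)  -- write to a read-only segment
  print (accessIsValid segs Read  0x3000 8)  -- access to unmapped memory
```

A client that does not need these guarantees would, in this toy setting, use a
check that always succeeds, which corresponds to the "never returns any
predicates" option mentioned above.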

This change also converts the examples in macaw-symbolic haddocks to use doctest
to ensure that they do not get out of date.  These are checked as part of CI.
2020-02-11 09:58:53 -08:00
Andrew Kent
587aa7ea6b
Update crux/crucible code to use float mode reprs; bump submodules 2019-11-05 15:23:51 -08:00
Joe Hendrix
df95e65987
Various changes to support VCG.
The changes include:

  Clean up elf loading to fix a bug in rel addend parsing.

  Introduce block preconditions for populating reopt-vcg fields.

  Change load options to match reopt's interface.
2019-09-04 23:21:23 -07:00
Tristan Ravitch
0c3ea57a62 Update the macaw-x86-symbolic tests 2019-08-09 10:56:50 -07:00
Kevin Quick
40eff5802c
[x86_symbolic] updates for crucible nonce change from (ST h) to IO
Changes for compatibility with Crucible pull request
285 (https://github.com/GaloisInc/crucible/pull/285) and the
corresponding changes in macaw symbolic.
2019-07-19 13:15:44 -07:00
Joe Hendrix
4267dca987
Get x86_symbolic test cases in a runnable state. 2019-03-25 20:29:35 -07:00
Brian Huffman
a3d7376179 Adapt to changed crucible-llvm exports. 2018-08-27 16:16:32 -07:00
Joe Hendrix
dc4a4f0f5f
Merge remote-tracking branch 'public/stable' into jhx-x86-improvements 2018-07-20 20:32:09 -07:00
Joe Hendrix
2184fab0bc
Update macaw-symbolic tests so they at least compile. 2018-07-20 10:07:49 -07:00
Rob Dockins
f74d999896 Bump crucible submodule again 2018-05-17 14:06:24 -07:00
Joe Hendrix
007405db1d
Improve robustness of elf loader, and start trying to parse relocations in objects. 2018-03-29 15:21:31 -07:00
Iavor Diatchki
ef1d277c12 Make test build (but not yet function) 2018-01-30 15:55:22 -08:00
Iavor Diatchki
737d4fc0c5 Fix test (still does not work, but for other reasons) 2018-01-30 15:50:34 -08:00
Joe Hendrix
d1bdff9866
Additional code for macaw-symbolic. 2018-01-22 16:58:33 -08:00
Joe Hendrix
8b97faa731
More progress on Macaw symbolic; compile fixes for Macaw changes. 2018-01-22 15:28:20 -08:00
Joe Hendrix
b7e06e64ee
Progress on macaw-symbolic and macaw-x86-symbolic. 2018-01-16 15:06:31 -08:00