Commit Graph

26 Commits

Author SHA1 Message Date
Tristan Ravitch
dbb4c83f08
Add a testing framework for macaw-symbolic (#184)
The new test suites cover x86_64, PowerPC, and ARM. They test that the semantics are actually correct (rather than just seeing if symbolic execution produces any result). The `Data.Macaw.Symbolic.Testing` module in macaw-symbolic provides some common utilities for symbolic execution engine setup, while there are tailored test harnesses for each architecture.

The semantics of the test harnesses are documented in each architecture test suite, but they:
1. Discover all of the test binaries (which are generated from the included makefiles)
2. Treat each function whose name begins with `test_` as a test entry point
3. Symbolically executes each test case with fully symbolic register states
4. Extracts the return value after symbolic execution, which is treated as the predicate to an assertion that must be proved
    - If the test case is in the `pass` subdirectory, it is proved and expected to hold
    - If the test case is in the `fail` subdirectory, it is proved and expected to not hold.

Each test harness supports two options for debugging:
- Dumping generated SMT queries
- Dumping generated Macaw IR for inspection

This testing uncovered a bug in the (previously untested) macaw-aarch32-symbolic code. It required a number of submodule updates to:

- Adapt to some what4 changes
- Fix a bug in the LLVM memory model that lets these tests pass
- Adapt to changes to some crucible APIs

This change also modifies the CI configuration to install SMT solvers earlier (which are now needed for all of the symbolic package tests).
2021-03-01 09:21:44 -08:00
robdockins
a58f1e25dd
Update to follow changes in What4. Nat is no longer a base type (#190)
Nat is no longer a what4 base type, so we have to adapt various APIs to accommodate that. The template haskell in macaw-semmc is updated to remove Nat cases. Changes to the `SymFn` type required removing a type parameter.

This commit also adds macaw-refinement to CI (which requires installing SMT solvers); that code had to be updated due to the what4 changes.


Co-authored-by: Tristan Ravitch <tristan@galois.com>
2021-02-19 15:44:56 -08:00
Brian Huffman
2a620d41de Switch from ansi-wl-pprint to the prettyprinter package.
This patch relies on the following submodule updates:
- GaloisInc/what4#77
- GaloisInc/elf-edit#20
- GaloisInc/crucible#586
- GaloisInc/asl-translator#28

This patch updates the following packages:
- macaw-base
- macaw-symbolic
- macaw-x86
- macaw-x86-symbolic
- macaw-aarch32
- macaw-ppc
- macaw-semmc
- macaw-refinement
2020-12-02 11:38:19 -08:00
Daniel Matichuk
3765c5eaab remove 'constraints' dependency 2020-10-06 11:20:40 -07:00
Daniel Matichuk
2da172b3ea generalize ArchInfo over memory model 2020-09-01 17:07:04 -07:00
Ben Selfridge
039b8497fc
updates what4, crucible, etc. (#146)
* update to bv-sized branch of what4 and other things

* removed parameterized-utils submodule completely

* Updates submodules

* Fixes macaw-symbolic w.r.t. crucible-llvm changes

Co-authored-by: Ben Selfridge <ben@000548-benselfridge.local>
2020-06-16 16:49:55 -07:00
Daniel Wagner
d39ad7a024 Merge branch 'master' into wip/equiv 2020-04-03 00:20:53 -04:00
Daniel Wagner
97c9e20089 add memory model as type argument in a few places 2020-03-18 00:21:15 -04:00
Daniel Wagner
ef91508325 naming only: trace, not 2 2020-03-12 16:57:01 -04:00
Tristan Ravitch
80125921a9 This commit re-implements the memory model used by macaw symbolic
There are two major changes:
- The interface to memory models in Data.Macaw.Symbolic has changed
- The suggested implementation in Data.Macaw.Symbolic.Memory has changed

The change improves performance and fixes a soundness bug.

* `macawExtensions` (Data.Macaw.Symbolic) takes a new argument: a `MkGlobalPointerValidityPred`.  Use `mkGlobalPointerValidityPred` to provide one.
* `mapRegionPointers` no longer takes a default pointer argument (delete it at call sites)
* `GlobalMap` returns an `LLVMPtr sym w` instead of a `Maybe (LLVMPtr sym w)`

Users of the suggested memory model do not need to worry about the last change,
as it has been migrated.  If you provided your own address mapping function, it
must now be total.  This is annoying, but the old API was unsound because
macaw-symbolic did not have enough information to correctly handle the `Nothing`
case.  The idea of the change is that the mapping function should translate any
concrete pointers as appropriate, while symbolic pointers should generate a mux
over all possible allocations.  Unfortunately, macaw-symbolic does not have
enough information to generate the mux itself, as there may be allocations
created externally.

This interface and implementation is concerned with handling pointers to static
memory in a binary.  These are distinguished from pointers to
dynamically-allocated or stack memory because many machine code instructions
compute bitvectors and treat them as pointers.  In the LLVM memory model used by
macaw-symbolic, each memory allocation has a block identifier (a natural
number).  The stack and each heap allocation get unique block identifiers.
However, pointers to static memory have no block identifier and must be mapped
to a block in order to fit into the LLVM memory model.

The previous implementation assigned each mapped memory segment in a binary to
its own LLVM memory allocation.  This had the nice property of implicitly
proving that no memory access was touching unmapped memory.  Unfortunately, it
was especially inefficient in the presence of symbolic reads and writes, as it
generated mux trees over all possible allocations and significantly slowed
symbolic execution.

The new memory model implementation (in Data.Macaw.Symbolic.Memory) instead uses
a single allocation for all static allocations.  This pushes more of the logic
for resolving reads and writes into the SMT solver and the theory of arrays.  In
cases where sufficient constraints exist in path conditions, this means that we
can support symbolic reads and writes.  Additionally, since we have only a
single SMT array backing all allocations, mapping bitvectors to LLVM pointers in
the memory model is trivial: we just change their block identifier from zero
(denoting a bitvector) to the block identifier of the unique allocation backing
static data.

This change has to do some extra work to ensure safety (especially that unmapped
memory is never written to or read from).  This is handled with the
MkGlobalPointerValidityPred interface in Data.Macaw.Symbolic.  This function,
which is passed to the macaw-symbolic initialization, constructs well-formedness
predicates for all pointers used to access memory.  Symbolic execution tasks
that do not need to enforce this property can simply provide a function that
never returns any predicates to check.  Implementations that want a reasonable
default can use the mkGlobalPointerValidityPred from Data.Macaw.Symbolic.Memory.
The default implementation ensures that no reads or writes touch unmapped memory
and that writes to static data never write to read-only segments.

This change also converts the examples in macaw-symbolic haddocks to use doctest
to ensure that they do not get out of date.  These are checked as part of CI.
2020-02-11 09:58:53 -08:00
Daniel Wagner
b5c15af0da start on alternative memory model 2020-01-13 22:11:40 -05:00
Kevin Quick
45f5c5e7af
Update cabal version for fields used. 2019-02-27 11:52:18 -08:00
Kevin Quick
a1e6f1b841
Update NoStarIsType spec for backward compatibility. 2019-02-27 09:30:26 -08:00
Tristan Ravitch
cc85cfe657 Clean up and document the macaw-symbolic API
This cleanup consolidates the interface to macaw symbolic into two (and a half)
modules:

 - Data.Macaw.Symbolic for clients who just need to symbolically simulate
   machine code
 - Data.Macaw.Symbolic.Backend for clients that need to implement new
   architectures
 - Data.Macaw.Symbolic.Memory provides a reusable example implementation of
   machine pointer to LLVM memory model pointer mapping

Most functions are now documented and are grouped by use case.  There are two
worked (compiling) examples in the haddocks that show how to translate Macaw
into Crucible and then symbolically simulate the results (including setting up
all aspects of Crucible).  The examples are included in the symbolic/examples
directory and can be loaded with GHCi to type check them.

The Data.Macaw.Symbolic.Memory module still needs a worked example.

There were very few changes to actual code as part of this overhaul, but there
are a few places where complicated functions were hidden behind newtypes, as
users never need to construct the values themselves (e.g., MacawArchEvalFn and
MacawSymbolicArchFunctions).  There was also a slight consolidation of
constraint synonyms to reduce duplication.  All callers will have to be updated.

There is also now a README for macaw-symbolic that explains its purpose and
includes pointers to the new haddocks.

This commit also fixes up the (minor) breakage in the macaw-x86-symbolic
implementation from the API changes.
2019-01-10 18:20:54 -08:00
Kevin Quick
98807daee2
Added -Wcompat for warnings about future compatibility. 2019-01-10 13:43:27 -08:00
Joe Hendrix
494f6c176d
Updates to Macaw. 2018-06-06 11:48:45 -07:00
Rob Dockins
c382b59bed Bump crucible submodule to pull in crucible/what4 split refactor,
and update macaw-symbolic and macax-x86-symbolic.
2018-05-15 15:58:14 -07:00
Rob Dockins
4a4b9d7a8a Minor update to track crucible API 2018-03-15 17:21:42 -07:00
Iavor Diatchki
86c730043e Cleanup pointer manipulation operations. 2018-02-09 11:41:39 -08:00
Iavor Diatchki
561d211972 Get started on integrating the LLVM memory model. 2018-02-06 16:47:13 -08:00
Joe Hendrix
f6699b1b57
Port to use crucible syntax extensions. 2018-01-09 10:40:14 -08:00
Joe Hendrix
deab99869d
Update for parameterized-utils compat. 2018-01-02 17:31:42 -08:00
Joe Hendrix
478e7db31a
Refactor X86 semantics types; add tuples to Macaw. 2017-12-01 13:58:20 -08:00
Joe Hendrix
c77d1ac421
Remove generic quot/rem ops, and BoolMux; Refactor macaw-symbolic. 2017-10-02 14:40:14 -07:00
Joe Hendrix
7c33bf82e4
Update license information. 2017-09-27 15:59:06 -07:00
Joe Hendrix
0242a88fa6
Rename macaw to macaw-base and move directory. 2017-09-27 15:41:37 -07:00