
252 Commits

Author SHA1 Message Date
Ben Selfridge
039b8497fc
updates what4, crucible, etc. (#146)
* update to bv-sized branch of what4 and other things

* removed parameterized-utils submodule completely

* Updates submodules

* Fixes macaw-symbolic w.r.t. crucible-llvm changes

Co-authored-by: Ben Selfridge <ben@000548-benselfridge.local>
2020-06-16 16:49:55 -07:00
Tristan Ravitch
5ba28484f9
symbolic: Add some documentation on pointer operations (#145)

Their behavior is not entirely obvious, so hopefully this should be useful to
someone in the future.
2020-06-13 10:27:43 -07:00
Brian Huffman
f65c80d7b1 Make code compile without warnings in ghc-8.6 and ghc-8.8. 2020-04-23 20:22:30 -07:00
Kevin Quick
34c40e086b
Update doctest fix to support older GHC 8.4. 2020-04-13 07:25:54 +00:00
Kevin Quick
dd368540fd
Small fixes to macaw-symbolic tests for recent changes. 2020-04-13 06:59:16 +00:00
Tristan Ravitch
c0717413e5 Fix warnings 2020-04-12 20:15:20 -07:00
Daniel Wagner
4ffec20d0a complete the merge 2020-04-03 22:49:34 -04:00
Daniel Wagner
d39ad7a024 Merge branch 'master' into wip/equiv 2020-04-03 00:20:53 -04:00
Andrei Stefanescu
0be59e5815
Update Macaw to use HasLLVMAnn. (#122) 2020-04-02 17:58:47 -07:00
Daniel Wagner
c8328006ac warning police 2020-04-02 13:49:41 -04:00
Daniel Wagner
97c9e20089 add memory model as type argument in a few places 2020-03-18 00:21:15 -04:00
Tristan Ravitch
e024646860
macaw-refinement (#114)
This commit updates macaw-refinement to work with the latest macaw/crucible and makes a few improvements along the way.

The major changes involved in this are:
* Block labels were removed from macaw, so we had to come up with an alternative approach to making synthetic blocks to represent dispatch resolved by macaw-refinement that is not really a jump table. We considered adding a new terminator that encoded "computed IP-based dispatch", but there was concern about the impact on client code. Instead, we added a field to the `DiscoveryFunInfo` that records "external" resolutions to indirect control flow (e.g., as by an SMT solver in macaw-refinement). The hook by which we feed SMT-based resolutions back into macaw was modified accordingly (`addDiscoveredFunctionBlockTargets`).
* Solver invocation changed to allow solver selection and parallel solver application.
* Logging is now done via the `lumberjack` library.
* macaw-symbolic now uses the "external" resolutions in `DiscoveryFunInfo` while building crucible CFGs.
* The path creation code in macaw-refinement was simplified significantly and the approach to path creation has been documented.
* The run-refinement tool is now more featureful.
* The test suite is a bit more structured and no longer depends on the printed output of the discovery process.
2020-03-12 17:15:08 -07:00
Daniel Wagner
1daee7da6b write a bit about how CFG-edge-killing is implemented 2020-03-12 18:11:07 -04:00
Daniel Wagner
ef91508325 naming only: trace, not 2 2020-03-12 16:57:01 -04:00
Daniel Wagner
f4daaa7e81 start working on execMacawStmtExtension for new memory model 2020-03-09 23:46:58 -04:00
Tristan Ravitch
c825332f39
Update/ghc 8.8 (#112)
Updates for GHC 8.8

The two main classes of update are related to MonadFail and type alias expansion.

The MonadFail updates introduce explicit MonadFail instances and backward-compatible `fail` implementations under `Monad` for older GHC versions.

The type alias expansion rules changed in GHC 8.8 in a way that breaks the `Simple Lens` idiom; instead, we have to use `Lens'`.  Lens started supporting this alias in version 3.8, which was released in 2013.

This change includes necessary submodule updates, as well as the update for the split of what4 into its own repository.
2020-03-03 13:28:26 -08:00
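The two classes of change described in c825332f39 can be sketched without any external libraries. This is only an illustration, not code from the commit: `Parser`, `firstOf`, `Config`, `depth`, `view`, and `set` are hypothetical names, and `Lens'` is redefined locally with the usual van Laarhoven shape so the example needs only `base`.

```haskell
{-# LANGUAGE RankNTypes #-}
module Ghc88Sketch where

import qualified Control.Monad.Fail as Fail
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Hypothetical parser-like monad, used only for illustration.
newtype Parser a = Parser { runParser :: Either String a }

instance Functor Parser where
  fmap f (Parser r) = Parser (fmap f r)

instance Applicative Parser where
  pure = Parser . Right
  Parser f <*> Parser a = Parser (f <*> a)

instance Monad Parser where
  Parser r >>= k = Parser (r >>= runParser . k)

-- With GHC 8.8, `fail` is a method of MonadFail only, so a monad that
-- uses failable patterns in do-blocks needs an explicit instance.
instance Fail.MonadFail Parser where
  fail = Parser . Left

-- The failable pattern (x : _) is what requires MonadFail here.
firstOf :: [a] -> Parser a
firstOf xs = do
  (x : _) <- pure xs
  pure x

-- Local stand-in for the `Lens'` alias from the lens library.
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Hypothetical record with one lens written by hand.
data Config = Config { _depth :: Int, _name :: String }

depth :: Lens' Config Int
depth f c = fmap (\d -> c { _depth = d }) (f (_depth c))

-- Minimal getter and setter built from the lens itself.
view :: Lens' s a -> s -> a
view l = getConst . l Const

set :: Lens' s a -> a -> s -> s
set l a = runIdentity . l (const (Identity a))
```

In this sketch, `runParser (firstOf [])` yields a `Left` carrying the pattern-match failure message instead of raising an exception, and signatures are written as `Lens' Config Int` rather than `Simple Lens Config Int`.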
Daniel Wagner
5506e05486 Merge branch 'tr/new-macaw-symbolic-entry' into wip/equiv 2020-02-19 17:38:27 -05:00
Daniel Wagner
66c2fc4a98 provide a way to mark certain jumps as terminal 2020-02-19 17:37:18 -05:00
Tristan Ravitch
80125921a9 This commit re-implements the memory model used by macaw symbolic
There are two major changes:
- The interface to memory models in Data.Macaw.Symbolic has changed
- The suggested implementation in Data.Macaw.Symbolic.Memory has changed

The change improves performance and fixes a soundness bug.

* `macawExtensions` (Data.Macaw.Symbolic) takes a new argument: a `MkGlobalPointerValidityPred`.  Use `mkGlobalPointerValidityPred` to provide one.
* `mapRegionPointers` no longer takes a default pointer argument (delete it at call sites)
* `GlobalMap` returns an `LLVMPtr sym w` instead of a `Maybe (LLVMPtr sym w)`

Users of the suggested memory model do not need to worry about the last change,
as it has been migrated.  If you provided your own address mapping function, it
must now be total.  This is annoying, but the old API was unsound because
macaw-symbolic did not have enough information to correctly handle the `Nothing`
case.  The idea of the change is that the mapping function should translate any
concrete pointers as appropriate, while symbolic pointers should generate a mux
over all possible allocations.  Unfortunately, macaw-symbolic does not have
enough information to generate the mux itself, as there may be allocations
created externally.

This interface and implementation are concerned with handling pointers to static
memory in a binary.  These are distinguished from pointers to
dynamically-allocated or stack memory because many machine code instructions
compute bitvectors and treat them as pointers.  In the LLVM memory model used by
macaw-symbolic, each memory allocation has a block identifier (a natural
number).  The stack and each heap allocation get unique block identifiers.
However, pointers to static memory have no block identifier and must be mapped
to a block in order to fit into the LLVM memory model.

The previous implementation assigned each mapped memory segment in a binary to
its own LLVM memory allocation.  This had the nice property of implicitly
proving that no memory access was touching unmapped memory.  Unfortunately, it
was especially inefficient in the presence of symbolic reads and writes, as it
generated mux trees over all possible allocations and significantly slowed
symbolic execution.

The new memory model implementation (in Data.Macaw.Symbolic.Memory) instead uses
a single allocation for all static allocations.  This pushes more of the logic
for resolving reads and writes into the SMT solver and the theory of arrays.  In
cases where sufficient constraints exist in path conditions, this means that we
can support symbolic reads and writes.  Additionally, since we have only a
single SMT array backing all allocations, mapping bitvectors to LLVM pointers in
the memory model is trivial: we just change their block identifier from zero
(denoting a bitvector) to the block identifier of the unique allocation backing
static data.

This change has to do some extra work to ensure safety (especially that unmapped
memory is never written to or read from).  This is handled with the
MkGlobalPointerValidityPred interface in Data.Macaw.Symbolic.  This function,
which is passed to the macaw-symbolic initialization, constructs well-formedness
predicates for all pointers used to access memory.  Symbolic execution tasks
that do not need to enforce this property can simply provide a function that
never returns any predicates to check.  Implementations that want a reasonable
default can use the mkGlobalPointerValidityPred from Data.Macaw.Symbolic.Memory.
The default implementation ensures that no reads or writes touch unmapped memory
and that writes to static data never write to read-only segments.

This change also converts the examples in macaw-symbolic haddocks to use doctest
to ensure that they do not get out of date.  These are checked as part of CI.
2020-02-11 09:58:53 -08:00
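The two ideas in 80125921a9 (a total mapping from bitvectors to pointers into one static allocation, plus a separate validity check) can be illustrated with a small concrete model. This is a sketch under strong assumptions: the real implementation works on symbolic values and builds predicates for the solver, whereas everything below is concrete, and `Ptr`, `Segment`, `Access`, `toStaticPtr`, and `validAccess` are hypothetical names that do not appear in macaw-symbolic.

```haskell
module StaticMemorySketch where

import Data.Word (Word64)
import Numeric.Natural (Natural)

-- Toy pointer: a block identifier plus an offset.
-- Block 0 denotes a plain bitvector, as in the commit message.
data Ptr = Ptr { ptrBlock :: Natural, ptrOffset :: Word64 }
  deriving (Eq, Show)

-- Toy description of one mapped segment of the binary.
data Segment = Segment
  { segBase     :: Word64
  , segSize     :: Word64
  , segWritable :: Bool
  }

data Access = Read | Write
  deriving (Eq, Show)

-- Total mapping: a bitvector (block 0) is re-tagged with the block
-- identifier of the single static allocation; real pointers are unchanged.
-- There is no Maybe in the result, mirroring the API change.
toStaticPtr :: Natural -> Ptr -> Ptr
toStaticPtr staticBlock p
  | ptrBlock p == 0 = p { ptrBlock = staticBlock }
  | otherwise       = p

-- Concrete analogue of a validity predicate: the access must lie entirely
-- inside one mapped segment, and writes must target a writable segment.
validAccess :: [Segment] -> Access -> Word64 -> Word64 -> Bool
validAccess segs acc addr len = any ok segs
  where
    ok s =
      addr >= segBase s
        && len <= segSize s
        -- written this way to avoid overflow in addr + len
        && addr - segBase s <= segSize s - len
        && (acc == Read || segWritable s)
```

A client that does not care about unmapped accesses would correspond, in this toy model, to replacing `validAccess` with a function that always returns `True`.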
Tristan Ravitch
39cf7d9682
Merge branch 'master' into tr/new-macaw-symbolic-entry 2020-02-05 10:43:24 -08:00
Daniel Wagner
95dd08bce9 Merge branch 'master' into wip/equiv 2020-02-04 12:21:51 -05:00
Daniel Wagner
c22f140a3b Merge branch 'tr/new-macaw-symbolic-entry' into wip/equiv 2020-01-13 22:21:51 -05:00
Daniel Wagner
b5c15af0da start on alternative memory model 2020-01-13 22:11:40 -05:00
Daniel Wagner
e5daa87dd0 fix typo in comment 2020-01-13 22:10:17 -05:00
Tristan Ravitch
6b490a8193 Update the crucible submodule
The only real code change required is that simulation failure messages have an
extra argument.  The goal with this update is to pull in some fixes to the
solver feature detection for yices in the latest crucible.
2019-12-19 15:03:09 -08:00
Tristan Ravitch
b76bfdb395 Add a new entry point to macaw-symbolic
This version constructs a Crucible CFG for a collection of blocks while
preserving control flow between them.  It allows the caller to specify blocks
that are considered "terminal": those blocks return the current register state.
Control flow to blocks not included in the "slice" is directed to synthetic
blocks that assume False in order to stop the symbolic simulator from exploring
those branches.
2019-12-19 11:21:05 -08:00
Rob Dockins
13aefd82f2 Update macaw-symbolic with changes to string literals in what4 2019-11-15 14:39:38 -08:00
Daniel Wagner
8d275627ba export a few handy internals 2019-10-30 16:07:21 -04:00
Joe Hendrix
821d434370
Add support for equalities in jump table bounds. 2019-08-27 16:39:41 -07:00
Joe Hendrix
494aff6ff0
This makes a number of changes to abstract domains.
The goal is to support a jumptable testcase that is not supported by
the current jump bounds check.  The jump bounds check needs to be
augmented so that it understands equality relationships between stack
values and registers, and bounds on both.

This patch tracks when a register points to a concrete stack offset.

As part of this, we dropped the AbsDomain instance for AbsBlockState.
Clients should now likely use `fnStartAbsBlockState` in lieu of `top`.

The other client visible change is that the ClassifyFailure
constructor now has an extra argument with details about why
classification failure occurred.
2019-08-21 23:29:16 -07:00
Tristan Ravitch
eaee8e0dc0 Remove an unused parameter from macaw-symbolic
Most of the interface functions took a map from addresses to segments, however this map
was never actually used in macaw-symbolic.

The migration for this change is simply to remove the unused parameter from all
call sites in client code.
2019-08-08 16:02:19 -07:00
Kevin Quick
419b977d6b
Add the new function handle return type, used for recursion bounding. 2019-08-07 09:51:57 -07:00
Kevin Quick
2353ad9f6d
Merge branch 'master' into nonce_handle_deparameterize 2019-07-19 17:06:50 -07:00
Kevin Quick
48c3ba1fed
[symbolic] additional nonce-related adjustments from 'ST h' to 'IO'. 2019-07-19 09:40:24 -07:00
Kevin Quick
80de5d94e5
[symbolic] update for use of safe Nonce in crucible.
Update for compatibility with Crucible changes in
https://github.com/GaloisInc/crucible/pull/285.
2019-07-19 00:13:00 -07:00
Kevin Quick
f525351621
Handle conversions for Float Mux in macaw-ppc. 2019-07-11 13:55:01 -07:00
Joe Hendrix
6a4b75852f
Fix missing case in macaw-symbolic 2019-05-30 23:39:38 -07:00
Joe Hendrix
c6a7ba7cd6
Rename pblock fields to be more descriptive. 2019-04-29 22:21:10 -07:00
Joe Hendrix
315cd2f9f0
Cleanups to macaw-symbolic 2019-04-29 21:30:59 -07:00
Joe Hendrix
70ea5b9036
Remove ParsedIte 2019-04-29 20:46:54 -07:00
Joe Hendrix
8aa4650683
Introduce ParsedBranch constructor. 2019-04-29 10:49:00 -07:00
Joe Hendrix
3331a19571
Drop support for branches within blocks. 2019-04-28 13:19:20 -07:00
Joe Hendrix
15676a2e45
Bump versions; Update macaw-symbolic for conditional write. 2019-04-17 21:36:49 -07:00
Joe Hendrix
4267dca987
Get x86_symbolic test cases in runnable state. 2019-03-25 20:29:35 -07:00
Joe Hendrix
82b96fb62a
Fix warnings; improve PLTStub comment. 2019-03-25 19:27:46 -07:00
Aaron Tomb
70981ba8ea Initial attempt at adapting to structured errors 2019-03-06 10:26:41 -08:00
Iavor Diatchki
65ca5447ea Merge branch 'master' of github.com:GaloisInc/macaw 2019-02-27 16:20:55 -08:00
Iavor Diatchki
1471740282 Only skip assignment if we are assigned to our own initial value. 2019-02-27 15:45:12 -08:00
Kevin Quick
45f5c5e7af
Update cabal version for fields used. 2019-02-27 11:52:18 -08:00
Kevin Quick
f8fce2175e
Sorting of pragma declarations in CrucGen. 2019-02-27 11:51:53 -08:00
Kevin Quick
b53b79471c
Add NondecreasingIndentation pragma to match code formatting. 2019-02-27 11:51:16 -08:00
Kevin Quick
60a3d39e98
More haddock updates for function argument information. 2019-02-27 11:35:20 -08:00
Kevin Quick
fcff1b7c3d
Update all source files for NoStarIsType pragma. 2019-02-27 11:35:09 -08:00
Kevin Quick
8ea677f7ed
[symbolic] Haddock fixes. 2019-02-27 09:30:55 -08:00
Kevin Quick
a1e6f1b841
Update NoStarIsType spec for backward compatibility. 2019-02-27 09:30:26 -08:00
Joe Hendrix
89ffdf088a
Fix macaw-x86-symbolic and update crucible. 2019-02-27 01:43:53 -08:00
Joe Hendrix
e8d2efcaae
Implement bitcast changes to macaw-symbolic 2019-02-26 17:53:34 -08:00
Kevin Quick
b80ab8fb67
[symbolic] update to latest crucible. 2019-02-19 08:24:45 -08:00
Luke Maurer
b23ba5820d CrucGen: Support for jump tables
This needed a bit more surgery than one would like, since we need to
translate the indirect jump to an if-else chain, which means creating a
bunch of "extra" blocks from within the `CrucGen` monad.
2019-02-15 18:35:25 -08:00
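The translation described in b23ba5820d, turning an indirect jump through a table into an if-else chain, can be sketched on a toy statement type. This is not the `CrucGen` code: `Stmt`, `lowerJumpTable`, and `eval` are hypothetical, and ending the chain in `Unreachable` for an out-of-range index is an assumption of this sketch, since the commit message does not say how that case is handled.

```haskell
module JumpTableSketch where

-- Toy control-flow statements.
data Stmt
  = Jump String          -- unconditional jump to a labelled block
  | IfEq Int Stmt Stmt   -- if index == n then first branch else second
  | Unreachable          -- assumed fallthrough for out-of-range indices
  deriving (Eq, Show)

-- Build one comparison per table entry, in table order. Each comparison
-- corresponds to an "extra" block created during translation.
lowerJumpTable :: [String] -> Stmt
lowerJumpTable targets = foldr step Unreachable (zip [0 ..] targets)
  where
    step (i, t) rest = IfEq i (Jump t) rest

-- Tiny interpreter: which label does the chain reach for a given index?
eval :: Int -> Stmt -> Maybe String
eval _ (Jump t)     = Just t
eval i (IfEq n a b) = if i == n then eval i a else eval i b
eval _ Unreachable  = Nothing
```

For a table of n entries the chain has n comparisons, which is why the translation has to allocate additional blocks rather than emit a single terminator.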
Tristan Ravitch
1d3610a783 Revert "Fill in undefineds with nonsense so pretty printing works"
This reverts commit 86ef62645d.

This commit generates invalid Crucible CFGs, which causes downstream analysis to
fail.  Reverting the commit will allow downstream clients to discard invalid CFGs.
2019-02-15 08:35:17 -08:00
Langston Barrett
274808a8ae update parameterized-utils submodule 2019-02-11 11:47:19 -08:00
Kevin Quick
5f12c3da88
[symbolic] Remove unneeded debug import. 2019-02-08 17:31:30 -08:00
Kevin Quick
edb486c6b3
Added toCrucibleEndian in symbolic and use for memory setup in refinement.
Requires updated macaw-loader BinaryFormat information.
2019-02-08 17:30:18 -08:00
Andrei Stefanescu
45e4251bf3 [refinement] Add an unbounded memory allocation at the bottom of the allocation stack. 2019-02-07 17:23:02 -08:00
Andrei Stefanescu
cca17b1c39 [refinement] Handle a code address as an LLVM pointer. 2019-02-06 20:13:47 -08:00
Andrei Stefanescu
cb8245d009 Map each memory segment in the binary using a What4 array. 2019-02-05 20:53:40 -08:00
Andrei Stefanescu
f7f7b6cac3 Merge branch 'master' into refinement 2019-02-04 15:39:19 -08:00
Andrei Stefanescu
87fc7c5439 Fix termStmtToJump. 2019-02-04 13:36:48 -08:00
Andrei Stefanescu
595c98efb8 Add proper mkBlockPathCFG. 2019-02-01 16:13:35 -08:00
Andrei Stefanescu
8ee9196cf6 Add Crucible translation of block paths. 2019-01-31 21:48:44 -08:00
Andrei Stefanescu
620b54e5af Handle all ParsedTermStmt constructors in termStmtToReturn. 2019-01-30 20:24:03 -08:00
Luke Maurer
957addd204 CrucGen: Use SetStruct rather than making a new one from scratch
This means far fewer instructions (and hence fewer registers), and in
turn a lot less heap space.  Peak memory usage is cut in half running
Brittle on a PPC64 exe with standard library.
2019-01-29 10:37:38 -08:00
Luke Maurer
b049a52ae9 CrucGen: Cache BV<->Ptr conversions
There was a fair amount of churn in generated Crucible CFGs due to
values redundantly getting converted back and forth.
2019-01-29 10:01:36 -08:00
Andrei Stefanescu
533dc131a4 Add register lookup and update functions in ArchVals. 2019-01-29 01:21:25 -08:00
Luke Maurer
5a8fba6d08 Cache TypeRepr and Position values
Generating the type of the register structure on demand was causing
`TypeRepr` to be the biggest chunk of the heap.  Similarly, we only need
to create a new `Position` when we change the offset.
2019-01-28 14:47:06 -08:00
Luke Maurer
12daa3a17b Make CrucGen stricter
Most crucially, make the `CrucGen` monad itself strict.  The heap was
filling up with old `CrucGenState`s being held onto by unevaluated
computations, since *every* computation was lazy.  Plugged a few other
sources of `CrucGenState` leaks as well.
2019-01-28 14:47:06 -08:00
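The kind of leak described in 12daa3a17b arises when a state monad leaves each updated state as an unevaluated thunk that still references the previous state. A minimal sketch of the general remedy, assuming a plain state monad rather than the actual `CrucGen` definition (`StrictState`, `modify'`, and `countTo` are hypothetical names):

```haskell
{-# LANGUAGE BangPatterns #-}
module StrictStateSketch where

-- A state monad whose bind forces the intermediate state.
newtype StrictState s a = StrictState { runStrictState :: s -> (a, s) }

instance Functor StrictState where
  fmap f (StrictState g) = StrictState $ \s ->
    case g s of
      (a, !s') -> (f a, s')

instance Applicative StrictState where
  pure a = StrictState $ \s -> (a, s)
  StrictState mf <*> StrictState ma = StrictState $ \s ->
    case mf s of
      (f, !s') -> case ma s' of
        (a, !s'') -> (f a, s'')

instance Monad StrictState where
  StrictState m >>= k = StrictState $ \s ->
    case m s of
      -- The bang evaluates the new state to weak head normal form
      -- before continuing, so old states are not retained by thunks.
      (a, !s') -> runStrictState (k a) s'

-- Strict state update: the new state is evaluated before it is stored.
modify' :: (s -> s) -> StrictState s ()
modify' f = StrictState $ \s -> let !s' = f s in ((), s')

-- Example: count to n without building a chain of (+ 1) thunks.
countTo :: Int -> Int
countTo n = snd (runStrictState (mapM_ (\_ -> modify' (+ 1)) [1 .. n]) 0)
```

Forcing to weak head normal form is enough for a flat state such as an `Int`; a record state would additionally need strict fields for the same effect.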
Andrei Stefanescu
d0dd34a5bd [refinement] Initial setup for symbolic execution of a parsed block. 2019-01-25 22:46:57 -08:00
Joe Hendrix
ab066e2743
Merge remote-tracking branch 'public/master' into jhx/block-addr-removal 2019-01-22 11:12:25 -05:00
Joe Hendrix
8bf0d00e66
Fix warnings; crucible changes. 2019-01-22 10:25:45 -05:00
Joe Hendrix
0eac4d6b49
Remove blockAddr; update dependencies 2019-01-22 05:07:52 -05:00
Kevin Quick
7eabf2d01a
Handle additional side conditions returned by loadRawWithSideConditions. 2019-01-21 12:20:48 -08:00
Kevin Quick
f2b98011ce
Use initSimContext to create a Crucible SimContext.
This helps to immunize against changes in SimContext, e.g. the
addition of the profilingMetrics field, for which initSimContext provides a
default value.
2019-01-21 12:20:00 -08:00
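The motivation in f2b98011ce is the usual argument for going through an initializer instead of a raw record constructor. A generic sketch, where `Ctx`, its fields, and `initCtx` are hypothetical and unrelated to the real `SimContext` or the real signature of `initSimContext`:

```haskell
module SmartConstructorSketch where

-- Hypothetical context record, standing in for a library type.
data Ctx = Ctx
  { ctxName      :: String
  , ctxVerbosity :: Int
  , ctxProfiling :: Bool  -- imagine this field was added in a later release
  }
  deriving (Eq, Show)

-- Hypothetical initializer maintained by the library. When a field is
-- added, only this function needs to learn its default value.
initCtx :: String -> Ctx
initCtx name =
  Ctx { ctxName = name, ctxVerbosity = 0, ctxProfiling = False }

-- Client code names only the fields it cares about, so it keeps
-- compiling when the record grows.
clientCtx :: Ctx
clientCtx = (initCtx "client") { ctxVerbosity = 2 }
```

Had the client written `Ctx "client" 2` positionally, adding `ctxProfiling` would have been a compile error at every call site.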
Nathan Collins
86ef62645d Fill in undefineds with nonsense so pretty printing works 2019-01-17 14:25:59 -08:00
Kevin Quick
190ed07121
[symbolic] add imports for mappend operator for GHC 8.2.2. 2019-01-12 18:10:16 -08:00
Tristan Ravitch
379f89ee78 Update to the latest crucible version
The llvm memory model was extended with better diagnostics and configurable
handling of undefined behavior.  macaw-symbolic uses no undefined behavior
checking, as those operations are only undefined in C.
2019-01-11 23:01:07 -08:00
Tristan Ravitch
7b57ac0c34 Additional haddocks 2019-01-11 13:58:15 -08:00
Tristan Ravitch
bda8ace256 symbolic: Clean up the memory mapping API
The API is now cleaner and includes more documentation (with an example).  Some
unnecessary types are removed/combined.
2019-01-11 13:21:04 -08:00
Tristan Ravitch
81f8f5a849 Add an extra comment to the backend docs 2019-01-11 13:11:40 -08:00
Tristan Ravitch
68c5578f03 symbolic: Translate the InstructionStart metadata statement into Crucible
Before, we just discarded them during the translation.  They are useful metadata
for generating diagnostics in Crucible, so this commit translates them.  They
are no-ops during symbolic evaluation.

To make them truly useful, they need to include the address of the block that
they belong to (their data payload in macaw is just an offset from the start of
a block).  This information wasn't available before, so it has to be plumbed
through in macaw-x86.
2019-01-10 22:23:39 -08:00
Tristan Ravitch
694e463e5d symbolic: Export another useful value wrapper in the user-facing API
This is a data wrapper used to convert macaw to crucible values
2019-01-10 22:22:44 -08:00
Tristan Ravitch
cc85cfe657 Clean up and document the macaw-symbolic API
This cleanup consolidates the interface to macaw symbolic into two (and a half)
modules:

 - Data.Macaw.Symbolic for clients who just need to symbolically simulate
   machine code
 - Data.Macaw.Symbolic.Backend for clients that need to implement new
   architectures
 - Data.Macaw.Symbolic.Memory provides a reusable example implementation of
   machine pointer to LLVM memory model pointer mapping

Most functions are now documented and are grouped by use case.  There are two
worked (compiling) examples in the haddocks that show how to translate Macaw
into Crucible and then symbolically simulate the results (including setting up
all aspects of Crucible).  The examples are included in the symbolic/examples
directory and can be loaded with GHCi to type check them.

The Data.Macaw.Symbolic.Memory module still needs a worked example.

There were very few changes to actual code as part of this overhaul, but there
are a few places where complicated functions were hidden behind newtypes, as
users never need to construct the values themselves (e.g., MacawArchEvalFn and
MacawSymbolicArchFunctions).  There was also a slight consolidation of
constraint synonyms to reduce duplication.  All callers will have to be updated.

There is also now a README for macaw-symbolic that explains its purpose and
includes pointers to the new haddocks.

This commit also fixes up the (minor) breakage in the macaw-x86-symbolic
implementation from the API changes.
2019-01-10 18:20:54 -08:00
Kevin Quick
98807daee2
Added -Wcompat for warnings about future compatibility. 2019-01-10 13:43:27 -08:00
Kevin Quick
b5ef20067d
Explicit results checking instead of implicit pattern monad fail. 2019-01-10 13:39:09 -08:00
Kevin Quick
16a867efd2
Haddock and README fixes. 2019-01-08 16:38:38 -08:00
Tristan Ravitch
b398db41b2 Merge branch 'master' of github.com:GaloisInc/macaw into HEAD 2019-01-07 20:43:32 -08:00
Tristan Ravitch
9c19e1b37d macaw-symbolic: Export an extra constructor
This constructor is very useful for traversing terms externally
2019-01-07 20:42:52 -08:00
Luke Maurer
46cdd8be82 Adapt to Nonce-based registerized CFGs 2019-01-03 12:10:24 -08:00
Luke Maurer
c43a0c24d8 Add INLINE pragmas to CrucGen monad instance 2018-12-26 18:42:50 -08:00
Brian Huffman
8dc4a54ca2 Use new constant noAlignment instead of literal 0 :: Alignment. 2018-12-20 14:03:38 -08:00
Brian Huffman
00c08376e5 Bump crucible version; adapt to crucible-llvm changes. 2018-12-18 17:47:50 -08:00