Commit Graph

714 Commits

Author SHA1 Message Date
Kevin Quick
190ed07121 [symbolic] add imports for mappend operator for GHC 8.2.2. 2019-01-12 18:10:16 -08:00
Kevin Quick
fb0d5c4776 Merge branch 'master' into refinement 2019-01-12 17:12:57 -08:00
Tristan Ravitch
379f89ee78 Update to the latest crucible version
The llvm memory model was extended with better diagnostics and configurable
handling of undefined behavior.  macaw-symbolic performs no undefined behavior
checking, as those operations are undefined only in C.
2019-01-11 23:01:07 -08:00
Tristan Ravitch
7b57ac0c34 Additional haddocks 2019-01-11 13:58:15 -08:00
Tristan Ravitch
bda8ace256 symbolic: Clean up the memory mapping API
The API is now cleaner and includes more documentation (with an example).  Some
unnecessary types are removed/combined.
2019-01-11 13:21:04 -08:00
Tristan Ravitch
81f8f5a849 Add an extra comment to the backend docs 2019-01-11 13:11:40 -08:00
Kevin Quick
7b50a38d30 Merge branch 'master' into refinement 2019-01-11 09:04:26 -08:00
Tristan Ravitch
68c5578f03 symbolic: Translate the InstructionStart metadata statement into Crucible
Before, we just discarded them during the translation.  They are useful metadata
for generating diagnostics in Crucible, so this commit translates them.  They
are no-ops during symbolic evaluation.

To make them truly useful, they need to include the address of the block that
they belong to (their data payload in macaw is just an offset from the start of
a block).  This information wasn't available before, so it has to be plumbed
through in macaw-x86.
2019-01-10 22:23:39 -08:00
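The address computation that commit 68c5578f03 describes can be illustrated with a minimal, language-neutral sketch (Python here, although macaw itself is Haskell). The class and field names are hypothetical stand-ins, not the macaw or Crucible API; the sketch only shows why the block address has to travel with the offset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstructionStart:
    """Hypothetical stand-in for the metadata statement (not the macaw type)."""

    block_addr: int  # address of the enclosing block (the newly plumbed value)
    offset: int      # offset of the instruction from the start of that block

    def absolute_addr(self) -> int:
        # The offset alone is ambiguous; adding the block address makes it
        # usable in a diagnostic.
        return self.block_addr + self.offset


stmt = InstructionStart(block_addr=0x400000, offset=0x12)
print(hex(stmt.absolute_addr()))
```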
Tristan Ravitch
694e463e5d symbolic: Export another useful value wrapper in the user-facing API
This is a data wrapper used to convert macaw values to crucible values.
2019-01-10 22:22:44 -08:00
Tristan Ravitch
cc85cfe657 Clean up and document the macaw-symbolic API
This cleanup consolidates the interface to macaw symbolic into two (and a half)
modules:

 - Data.Macaw.Symbolic for clients who just need to symbolically simulate
   machine code
 - Data.Macaw.Symbolic.Backend for clients that need to implement new
   architectures
 - Data.Macaw.Symbolic.Memory provides a reusable example implementation of
   machine pointer to LLVM memory model pointer mapping

Most functions are now documented and are grouped by use case.  There are two
worked (compiling) examples in the haddocks that show how to translate Macaw
into Crucible and then symbolically simulate the results (including setting up
all aspects of Crucible).  The examples are included in the symbolic/examples
directory and can be loaded with GHCi to type check them.

The Data.Macaw.Symbolic.Memory module still needs a worked example.

There were very few changes to actual code as part of this overhaul, but there
are a few places where complicated functions were hidden behind newtypes, as
users never need to construct the values themselves (e.g., MacawArchEvalFn and
MacawSymbolicArchFunctions).  There was also a slight consolidation of
constraint synonyms to reduce duplication.  All callers will have to be updated.

There is also now a README for macaw-symbolic that explains its purpose and
includes pointers to the new haddocks.

This commit also fixes up the (minor) breakage in the macaw-x86-symbolic
implementation from the API changes.
2019-01-10 18:20:54 -08:00
Kevin Quick
d87482c949 Add run-refinement --unrefined flag to show pre- and post- refinement. 2019-01-10 17:25:12 -08:00
Kevin Quick
d04bdf9ac3 Add run-refinement tool for cmdline dumping of exe file info.
This tool is similar to run-refurbish but it is intended to dump
information about additional refinements provided by this library.
2019-01-10 14:53:12 -08:00
Kevin Quick
f0087c9ea2 Enable warnings for future compatibility issues. 2019-01-10 14:41:44 -08:00
Kevin Quick
8d3a333cb9 Merge branch 'master' into refinement 2019-01-10 13:56:18 -08:00
Kevin Quick
51efbf6392 Merge branch 'master' into refinement 2019-01-10 13:46:19 -08:00
Kevin Quick
98807daee2 Added -Wcompat for warnings about future compatibility. 2019-01-10 13:43:27 -08:00
Kevin Quick
b5ef20067d Explicit results checking instead of implicit pattern monad fail. 2019-01-10 13:39:09 -08:00
Kevin Quick
16a867efd2 Haddock and README fixes. 2019-01-08 16:38:38 -08:00
Tristan Ravitch
b398db41b2 Merge branch 'master' of github.com:GaloisInc/macaw into HEAD 2019-01-07 20:43:32 -08:00
Tristan Ravitch
9c19e1b37d macaw-symbolic: Export an extra constructor
This constructor is very useful for traversing terms externally.
2019-01-07 20:42:52 -08:00
Kevin Quick
d62bf8f26e Add README and Changelog and update cabal synopsis/description. 2019-01-07 15:13:50 -08:00
Kevin Quick
d4d7f1b9be Add refinement library.
The refinement library provides supplemental functionality for
discovery of elements that macaw-symbolic is not able to discover via
pattern matching.  This library will use crucible symbolic analysis to
attempt to determine elements that could not be identified by
macaw-symbolic.  The identification provided by macaw-symbolic is
incomplete, and so is the identification provided by macaw-refinement,
but macaw-refinement attempts to "refine" the analysis to recover
additional information, which can then be provided back to the macaw
analysis.  The elements targeted for refinement include:

  * Terminator effects for incomplete blocks.  For example, determining
    the target IP address by symbolic evaluation (e.g. of jump tables).
    If the current block does not provide sufficient information to
    symbolically identify the target, previous blocks can be added to
    the analysis (back to the entry block or a loop point).

  * Argument liveness.  Determining which registers and memory
    locations are used/live by a block allows determination of ABI
    compliance (for transformations) and specific block
    requirements (which currently start with a full register state and
    blank memory).

  * Call graphs.  Using symbolic evaluation to determine the targets of
    call instructions that cannot be identified by pattern matching,
    with techniques similar to those for identifying incomplete blocks.
2019-01-07 14:16:03 -08:00
Luke Maurer
46cdd8be82 Adapt to Nonce-based registerized CFGs 2019-01-03 12:10:24 -08:00
Luke Maurer
b93302a536 Cache map with arch registers as keys
The use of `Data.Parameterized.Map.fromList` in `mkRegStateM` was
showing up in profiling as a huge time sink.  We don't actually need to
build the map from scratch there, though, since the keys are known ahead
of time.  Adding an `archRegSet` variable to the `RegisterInfo` class
(with the obvious default implementation) ensures that a `MapF` with the
right keys will be built once and then reused.
2018-12-27 11:32:56 -08:00
Luke Maurer
64a1c01a7b Use RULE to optimize uses of boundValue as getter
GHC was leaving `boundValue` in its higher-order form, which was causing
slowdowns accounting for ~3% of runtime in Brittle.
2018-12-27 11:32:46 -08:00
Luke Maurer
c43a0c24d8 Add INLINE pragmas to CrucGen monad instance 2018-12-26 18:42:50 -08:00
Brian Huffman
8dc4a54ca2 Use new constant noAlignment instead of literal 0 :: Alignment. 2018-12-20 14:03:38 -08:00
Brian Huffman
a8ad3121ef Bump crucible submodule. 2018-12-20 14:02:52 -08:00
Andrei Stefanescu
2ce1157af6 Merge pull request #19 from GaloisInc/fix/keep-return-address-stack-write
Keep the write of the return address to the stack (x86)
2018-12-20 13:15:10 -08:00
Brian Huffman
00c08376e5 Bump crucible version; adapt to crucible-llvm changes. 2018-12-18 17:47:50 -08:00
Andrei Stefanescu
76ac547995 Merge branch 'master' of github.com:GaloisInc/macaw into fix/keep-return-address-stack-write 2018-12-18 14:31:08 -08:00
Brian Huffman
7e6582fa07 Bump submodules, adapt to changes in crucible-llvm api. 2018-12-18 13:47:51 -08:00
Tristan Ravitch
96129be6de Keep the write of the return address to the stack (x86)
This mostly affects x86.  Previously, we threw away the write of the return
address to the stack when identifying calls for macaw-x86.  This was partly for
hygiene and partly to support the "addresses written to memory are function
pointers" heuristic.  Treating the return address as a potential function
pointer breaks function identification, so excluding it from that heuristic
is important.

The problem comes in the translation of macaw into crucible - we never write the
return address to the stack, but returns still read the return address from the
stack.  If it wasn't written in the first place, this leads to a read
from (potentially) uninitialized memory, which causes errors in the symbolic
simulator.  There are two solutions:

1. Make returns not read from the stack
2. Keep the write of the return address to the stack

Solution 1 is a problem, as we have a data dependency on the read.  Eliding it
breaks Crucible generation later and produces an invalid CFG.

Solution 2 works well.  The implementation is actually simple.  We can keep
identifyCall the same for x86 and just construct the basic block not from the
return value but from the original list of statements (unaltered).  We do need
to have identifyCall still give us the reduced statement list, which we use for
identifying possible function pointers written onto the stack (but not the
return address, which we do not want to treat as a function pointer).
2018-12-07 15:11:39 -08:00
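The failure mode that commit 96129be6de describes can be reproduced in a toy model. This is a minimal sketch in Python, not the macaw or Crucible translation (which is Haskell). `ToyMachine`, `UninitializedRead`, and `run` are hypothetical names invented for illustration. A return always reads the return address from the stack, so dropping the call's write leaves that read pointing at memory that was never initialized.

```python
class UninitializedRead(Exception):
    """Raised when a read hits memory that was never written."""


class ToyMachine:
    """A tiny stack model: memory is a dict from address to value."""

    def __init__(self, stack_pointer=0x1000):
        self.memory = {}
        self.sp = stack_pointer

    def call(self, return_addr, keep_return_write):
        # A call pushes the return address. With keep_return_write=False we
        # model the old behavior, where the write itself was discarded.
        self.sp -= 8
        if keep_return_write:
            self.memory[self.sp] = return_addr

    def ret(self):
        # A return always reads the return address back from the stack.
        if self.sp not in self.memory:
            raise UninitializedRead(hex(self.sp))
        target = self.memory[self.sp]
        self.sp += 8
        return target


def run(keep_return_write):
    """Do one call and one return; None signals an uninitialized read."""
    machine = ToyMachine()
    machine.call(return_addr=0x401000, keep_return_write=keep_return_write)
    try:
        return machine.ret()
    except UninitializedRead:
        return None
```

Calling `run(True)` models solution 2 from the commit message (the write is kept, so the return finds its address), while `run(False)` models the earlier behavior that caused errors in the symbolic simulator.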
Brian Huffman
3fc657782d Add Semigroup instance to make GHC 8.4 happy. 2018-12-07 13:48:38 -08:00
Joe Hendrix
3dd2f15dd6 Add mapsRegsWith; 8.6 compatibility. 2018-12-04 13:41:07 -08:00
Joe Hendrix
146ec121c3 Merge pull request #17 from GaloisInc/jhx/plt-support
Add PLT support
2018-12-04 09:30:08 -08:00
Joe Hendrix
25e922ef83 Fix previous commit 2018-12-04 09:02:27 -08:00
Joe Hendrix
ebc5d9575e Merge remote-tracking branch 'public/master' into jhx/plt-support 2018-12-04 08:04:32 -08:00
Joe Hendrix
f03941d607 Add test-plt test case, and fix discovery to use trust symbols. 2018-12-04 00:04:23 -08:00
Joe Hendrix
a0a89083e8 Support X86 Relative; other minor changes. 2018-12-03 20:52:44 -08:00
Andrei Stefanescu
3f39c614e9 Add support for RepMovs and RepStos. 2018-11-27 02:23:36 -08:00
Kevin Quick
3c7e222676 Add missing import for previous change. 2018-11-26 11:26:01 -08:00
Kevin Quick
b92f008676 Merge branch 'master' of github.com:GaloisInc/macaw 2018-11-25 23:53:28 -08:00
Kevin Quick
3f8769a424 [x86_symbolic] add semantics for X86Div, X86Rem, X86IDiv, and X86IRem. 2018-11-25 22:02:18 -08:00
Kevin Quick
01b8175e7f [x86_symbolic] Update cabal specification for compliance. 2018-11-25 22:01:40 -08:00
Kevin Quick
0ee3f7df2d [x86_symbolic] more info for unimplemented statement and termstmt semantics. 2018-11-25 22:00:19 -08:00
Kevin Quick
3f77e763e9 Implement NoStarIsType for GHC 8.6. 2018-11-21 18:27:42 +00:00
Kevin Quick
1d7cdc87eb Implement NoStarIsType and MonadFail for GHC 8.6. 2018-11-21 00:08:33 +00:00
Kevin Quick
7a64cb614f Explicit NoStarIsType with Data.Kind.Type and increasing do indentation (for GHC 8.6) 2018-11-20 09:43:48 +00:00
Joe Hendrix
1547712176 Bump parameterized-util version. 2018-11-17 16:03:34 -08:00