Commit Graph

681 Commits

Author SHA1 Message Date
Kevin Quick
63d24be712
[refinement] add binary versions of test samples. 2019-01-22 23:36:48 -08:00
Kevin Quick
fb869eedf7
[refinement] add some test sources to the test/samples. 2019-01-22 23:35:01 -08:00
Kevin Quick
08d4dcd832
[refinement] Add test framework to compare output to expected output. 2019-01-22 20:04:26 -08:00
Joe Hendrix
0451046cab
Merge pull request #22 from GaloisInc/jhx/exports
Additional exports
2019-01-22 16:26:38 -05:00
Joe Hendrix
a5e3ba7247
Additional exports 2019-01-22 15:51:38 -05:00
Joe Hendrix
ed05584dcf
Merge pull request #21 from GaloisInc/jhx/block-addr-removal
block addr removal
2019-01-22 14:21:25 -05:00
Joe Hendrix
3eb92f34e1
Add x86_tests 2019-01-22 13:25:37 -05:00
Joe Hendrix
6d1cc603d0
Merge remote-tracking branch 'public/jhx/minor-additions' into jhx/block-addr-removal
Also fixes some warnings.
2019-01-22 11:32:00 -05:00
Joe Hendrix
ab066e2743
Merge remote-tracking branch 'public/master' into jhx/block-addr-removal 2019-01-22 11:12:25 -05:00
Joe Hendrix
8bf0d00e66
Fix warnings; crucible changes. 2019-01-22 10:25:45 -05:00
Joe Hendrix
23186a4991
Minor comments; fix stack.yaml 2019-01-22 05:36:17 -05:00
Joe Hendrix
0eac4d6b49
Remove blockAddr; update dependencies 2019-01-22 05:07:52 -05:00
Kevin Quick
fa891367ef Merge branch 'master' into refinement 2019-01-21 12:22:53 -08:00
Kevin Quick
7eabf2d01a
Handle additional side conditions returned by loadRawWithSideConditions. 2019-01-21 12:20:48 -08:00
Kevin Quick
f2b98011ce
Use initSimContext to create a Crucible SimContext.
This helps to immunize against changes in SimContext... e.g. the
addition on the profilingMetrics field that initSimContext provides a
default value for.
2019-01-21 12:20:00 -08:00
Nathan Collins
86ef62645d Fill in undefineds with nonsense so pretty printing works 2019-01-17 14:25:59 -08:00
Kevin Quick
82f4b15a02 Merge branch 'master' into refinement 2019-01-16 10:36:44 -08:00
Kevin Quick
190ed07121
[symbolic] add imports for mappend operator for GHC 8.2.2. 2019-01-12 18:10:16 -08:00
Kevin Quick
fb0d5c4776 Merge branch 'master' into refinement 2019-01-12 17:12:57 -08:00
Tristan Ravitch
379f89ee78 Update to the latest crucible version
The llvm memory model was extended with better diagnostics and configurable
handling of undefined behavior.  macaw-symbolic uses no undefined behavior
checking, as those operations are only undefined in C.
2019-01-11 23:01:07 -08:00
Tristan Ravitch
7b57ac0c34 Additional haddocks 2019-01-11 13:58:15 -08:00
Tristan Ravitch
bda8ace256 symbolic: Clean up the memory mapping API
The API is now cleaner and includes more documentation (with an example).  Some
unnecessary types are removed/combined.
2019-01-11 13:21:04 -08:00
Tristan Ravitch
81f8f5a849 Add an extra comment to the backend docs 2019-01-11 13:11:40 -08:00
Kevin Quick
7b50a38d30 Merge branch 'master' into refinement 2019-01-11 09:04:26 -08:00
Tristan Ravitch
68c5578f03 symbolic: Translate the InstructionStart metadata statement into Crucible
Before, we just discarded them during the translation.  They are useful metadata
for generating diagnostics in Crucible, so this commit translates them.  They
are no-ops during symbolic evaluation.

To make them truly useful, they need to include the address of the block that
they belong to (their data payload in macaw is just an offset from the start of
a block).  This information wasn't available before, so it has to be plumbed
through in macaw-x86.
2019-01-10 22:23:39 -08:00
Tristan Ravitch
694e463e5d symbolic: Export another useful value wrapper in the user-facing API
This is a data wrapper used to convert macaw to crucible values
2019-01-10 22:22:44 -08:00
Tristan Ravitch
cc85cfe657 Clean up and document the macaw-symbolic API
This cleanup consolidates the interface to macaw symbolic into two (and a half)
modules:

 - Data.Macaw.Symbolic for clients who just need to symbolically simulate
   machine code
 - Data.Macaw.Symbolic.Backend for clients that need to implement new
   architectures
 - Data.Macaw.Symbolic.Memory provides a reusable example implementation of
   machine pointer to LLVM memory model pointer mapping

Most functions are now documented and are grouped by use case.  There are two
worked (compiling) examples in the haddocks that show how to translate Macaw
into Crucible and then symbolically simulate the results (including setting up
all aspects of Crucible).  The examples are included in the symbolic/examples
directory and can be loaded with GHCi to type check them.

The Data.Macaw.Symbolic.Memory module still needs a worked example.

There were very few changes to actual code as part of this overhaul, but there
are a few places where complicated functions were hidden behind newtypes, as
users never need to construct the values themselves (e.g., MacawArchEvalFn and
MacawSymbolicArchFunctions).  There was also a slight consolidation of
constraint synonyms to reduce duplication.  All callers will have to be updated.

There is also now a README for macaw-symbolic that explains its purpose and
includes pointers to the new haddocks.

This commit also fixes up the (minor) breakage in the macaw-x86-symbolic
implementation from the API changes.
2019-01-10 18:20:54 -08:00
Kevin Quick
d87482c949
Add run-refinement --unrefined flag to show pre- and post- refinement. 2019-01-10 17:25:12 -08:00
Kevin Quick
d04bdf9ac3
Add run-refinement tool for cmdline dumping of exe file info.
This tool is similar to run-refurbish but it is intended to dump
information about additional refinements provided by this library.
2019-01-10 14:53:12 -08:00
Kevin Quick
f0087c9ea2
Enable warnings for future compatibility issues. 2019-01-10 14:41:44 -08:00
Kevin Quick
8d3a333cb9 Merge branch 'master' into refinement 2019-01-10 13:56:18 -08:00
Kevin Quick
51efbf6392 Merge branch 'master' into refinement 2019-01-10 13:46:19 -08:00
Kevin Quick
98807daee2
Added -Wcompat for warnings about future compatibility. 2019-01-10 13:43:27 -08:00
Kevin Quick
b5ef20067d
Explicit results checking instead of implicit pattern monad fail. 2019-01-10 13:39:09 -08:00
Kevin Quick
16a867efd2
Haddock and README fixes. 2019-01-08 16:38:38 -08:00
Tristan Ravitch
b398db41b2 Merge branch 'master' of github.com:GaloisInc/macaw into HEAD 2019-01-07 20:43:32 -08:00
Tristan Ravitch
9c19e1b37d macaw-symbolic: Export an extra constructor
This constructor is very useful for traversing terms externally
2019-01-07 20:42:52 -08:00
Kevin Quick
d62bf8f26e Add README and Changelog and update cabal synopsis/description. 2019-01-07 15:13:50 -08:00
Kevin Quick
d4d7f1b9be Add refinement library.
The refinement library provides supplemental functionality for
discovery of elements that macaw-symbolic is not able to discover via
pattern matching.  This library will use crucible symbolic analysis to
attempt to determine elements that could not be identified by
macaw-symbolic.  The identification provided by macaw-symbolic is
incomplete, and so is the identification by this macaw-refinement, but
macaw-refinement attempts to additionally "refine" the analysis to
achieve even more information which can then be provided back to the
macaw analysis.

  * Terminator effects for incomplete blocks.  For example, the target
    IP address by symbolic evaluation (e.g. of jump tables).  If the
    current block does not provide sufficient information to
    symbolically identify the target, previous blocks can be added to
    the analysis (back to the entry block or a loop point).

  * Argument liveness (determining which registers and memory
    locations are used/live by a block allows determination of ABI
    compliance (for transformations) and specific block
    requirements (which currently start with a full register state and
    blank memory).

  * Call graphs.  Determination of targets of call instructions that
    cannot be identified by pattern matching via symbolic evaluation,
    using techniques similar to those for identifying incomplete blocks.
2019-01-07 14:16:03 -08:00
Luke Maurer
46cdd8be82 Adapt to Nonce-based registerized CFGs 2019-01-03 12:10:24 -08:00
Luke Maurer
b93302a536 Cache map with arch registers as keys
The use of `Data.Parameterized.Map.fromList` in `mkRegStateM` was
showing up in profiling as a huge time sink.  We don't actually need to
build the map from scratch there, though, since the keys are known ahead
of time.  Adding an `archRegSet` variable to the `RegisterInfo` class
(with the obvious default implementation) ensures that a `MapF` with the
right keys will be built once and then reused.
2018-12-27 11:32:56 -08:00
Luke Maurer
64a1c01a7b Use RULE to optimize uses of boundValue as getter
GHC was leaving `boundValue` in its higher-order form, which was causing
slowdowns accounting for ~3% of runtime in Brittle.
2018-12-27 11:32:46 -08:00
Luke Maurer
c43a0c24d8 Add INLINE pragmas to CrucGen monad instance 2018-12-26 18:42:50 -08:00
Brian Huffman
8dc4a54ca2 Use new constant noAlignment instead of literal 0 :: Alignment. 2018-12-20 14:03:38 -08:00
Brian Huffman
a8ad3121ef Bump crucible submodule. 2018-12-20 14:02:52 -08:00
Andrei Stefanescu
2ce1157af6
Merge pull request #19 from GaloisInc/fix/keep-return-address-stack-write
Keep the write of the return address to the stack (x86)
2018-12-20 13:15:10 -08:00
Brian Huffman
00c08376e5 Bump crucible version; adapt to crucible-llvm changes. 2018-12-18 17:47:50 -08:00
Andrei Stefanescu
76ac547995 Merge branch 'master' of github.com:GaloisInc/macaw into fix/keep-return-address-stack-write 2018-12-18 14:31:08 -08:00
Brian Huffman
7e6582fa07 Bump submodules, adapt to changes in crucible-llvm api. 2018-12-18 13:47:51 -08:00
Tristan Ravitch
96129be6de Keep the write of the return address to the stack (x86)
This mostly affects x86.  Previously, we threw away the write of the return
address to the stack when identifying calls for macaw-x86.  This was partly for
hygiene and partly to support the "addresses written to memory are function
pointers" heuristic.  Treating the return address as a potential function
pointer breaks function identification, so that is important.

The problem comes in the translation of macaw into crucible - we never write the
return address to the stack, but returns still read the return address from the
stack.  If it wasn't written in the first place, this leads to a read
from (potentially) uninitialized memory, which causes errors in the symbolic
simulator.  There are two solutions:

1. Make returns not read from the stack
2. Keep the write of the return address to the stack

Solution 1 is a problem, as we have a data dependency on the read.  Eliding it
breaks Crucible generation later and produces an invalid CFG.

Solution 2 works well.  The implementation is actually simple.  We can keep
identifyCall the same for x86 and just construct the basic block not from the
return value but from the original list of statements (unaltered).  We do need
to have identifyCall still give us the reduced statement list, which we use for
identifying possible function pointers written onto the stack (but not the
return address, which we do not want to treat as a function pointer).
2018-12-07 15:11:39 -08:00