Execution framework for determining the best refinement (if any) for a
particular block by extracting the CFG for that function and iterating
over successively larger paths leading to the unknown transfer block.
The core solution-generation via SMT/Crucible/What4 is still mocked out.
Updates the unknown transfer resolution module to iterate through the
set of blocks with unknown transfer results, attempting to refine the
unknown transfer failures recursively so that any newly discovered
blocks are also attempted (if necessary) and generating a (possibly
updated) DiscoveryState where any refined unknown transfer conditions
replace the original information.
Does not yet perform the actual refinement, just provides the
framework that would attempt to refine each unknown transfer.
These `getCallTarget` and `doJump` were calling `fail` if they saw an argument
type that we hadn't thought to handle yet. This change turns those errors into
TranslationError statements, allowing macaw to continue exploring code.
This came up recently in a glibc-based example where macaw ended up exploring
unaligned code and creating a strange jump to a far pointer, which doesn't make
much sense in x86_64 mode.
This helps to immunize against changes in SimContext... e.g. the
addition on the profilingMetrics field that initSimContext provides a
default value for.
The llvm memory model was extended with better diagnostics and configurable
handling of undefined behavior. macaw-symbolic uses no undefined behavior
checking, as those operations are only undefined in C.
This includes a minor change: a new required field for the blocks returned by
the machine-specific disassembler. The information was already readily
available in this backend.
Before, we just discarded them during the translation. They are useful metadata
for generating diagnostics in Crucible, so this commit translates them. They
are no-ops during symbolic evaluation.
To make them truly useful, they need to include the address of the block that
they belong to (their data payload in macaw is just an offset from the start of
a block). This information wasn't available before, so it has to be plumbed
through in macaw-x86.
This cleanup consolidates the interface to macaw symbolic into two (and a half)
modules:
- Data.Macaw.Symbolic for clients who just need to symbolically simulate
machine code
- Data.Macaw.Symbolic.Backend for clients that need to implement new
architectures
- Data.Macaw.Symbolic.Memory provides a reusable example implementation of
machine pointer to LLVM memory model pointer mapping
Most functions are now documented and are grouped by use case. There are two
worked (compiling) examples in the haddocks that show how to translate Macaw
into Crucible and then symbolically simulate the results (including setting up
all aspects of Crucible). The examples are included in the symbolic/examples
directory and can be loaded with GHCi to type check them.
The Data.Macaw.Symbolic.Memory module still needs a worked example.
There were very few changes to actual code as part of this overhaul, but there
are a few places where complicated functions were hidden behind newtypes, as
users never need to construct the values themselves (e.g., MacawArchEvalFn and
MacawSymbolicArchFunctions). There was also a slight consolidation of
constraint synonyms to reduce duplication. All callers will have to be updated.
There is also now a README for macaw-symbolic that explains its purpose and
includes pointers to the new haddocks.
This commit also fixes up the (minor) breakage in the macaw-x86-symbolic
implementation from the API changes.