The goal is to support a jumptable testcase that is not supported by
the current jump bounds check. The jump bounds check needs to be
augmented so that it understands equality relationships between stack
values and registers, and bounds on both.
This patch tracks when a register points to a concrete stack offset.
As part of this, we droped the AbsDomain instance for AbsBlockState.
Clients should now likely use `fnStartAbsBlockState` in lieu of `top`.
The other client visible change is that the ClassifyFailure
constructor now has an extra argument with details about why
classification failure occured.
Most of the interface functions took a map from addresses to segments, however this map
was never actually used in macaw-symbolic.
The migration for this change is simply to remove the unused parameter from all
call sites in client code.
This needed a bit more surgery than one would like, since we need to
translate the indirect jump to an if-else chain, which means creating a
bunch of "extra" blocks from within the `CrucGen` monad.
This reverts commit 86ef62645d.
This commit generates invalid Crucible CFGs, which causes downstream analysis to
fail. Reverting the commit will allow downstream clients to discard invalid CFGs.
This means far fewer instructions (and hence fewer registers), and in
turn a lot less heap space. Peak memory usage is cut in half running
Brittle on a PPC64 exe with standard library.
Generating the type of the register structure on demand was causing
`TypeRepr` to be the biggest chunk of the heap. Similarly, we only need
to create a new `Position` when we change the offset.
Most crucially, make the `CrucGen` monad itself strict. The heap was
filling up with old `CrucGenState`s being held onto by unevaluated
computations, since *every* computation was lazy. Plugged a few other
sources of `CrucGenState` leaks as well.
This helps to immunize against changes in SimContext... e.g. the
addition on the profilingMetrics field that initSimContext provides a
default value for.
The llvm memory model was extended with better diagnostics and configurable
handling of undefined behavior. macaw-symbolic uses no undefined behavior
checking, as those operations are only undefined in C.
Before, we just discarded them during the translation. They are useful metadata
for generating diagnostics in Crucible, so this commit translates them. They
are no-ops during symbolic evaluation.
To make them truly useful, they need to include the address of the block that
they belong to (their data payload in macaw is just an offset from the start of
a block). This information wasn't available before, so it has to be plumbed
through in macaw-x86.
This cleanup consolidates the interface to macaw symbolic into two (and a half)
modules:
- Data.Macaw.Symbolic for clients who just need to symbolically simulate
machine code
- Data.Macaw.Symbolic.Backend for clients that need to implement new
architectures
- Data.Macaw.Symbolic.Memory provides a reusable example implementation of
machine pointer to LLVM memory model pointer mapping
Most functions are now documented and are grouped by use case. There are two
worked (compiling) examples in the haddocks that show how to translate Macaw
into Crucible and then symbolically simulate the results (including setting up
all aspects of Crucible). The examples are included in the symbolic/examples
directory and can be loaded with GHCi to type check them.
The Data.Macaw.Symbolic.Memory module still needs a worked example.
There were very few changes to actual code as part of this overhaul, but there
are a few places where complicated functions were hidden behind newtypes, as
users never need to construct the values themselves (e.g., MacawArchEvalFn and
MacawSymbolicArchFunctions). There was also a slight consolidation of
constraint synonyms to reduce duplication. All callers will have to be updated.
There is also now a README for macaw-symbolic that explains its purpose and
includes pointers to the new haddocks.
This commit also fixes up the (minor) breakage in the macaw-x86-symbolic
implementation from the API changes.
We were getting "Unterminated crucible block" errors for any code
containing the X86 HLT instruction. An ErrorStmt is perhaps not
precisely what HLT means, but we're going with that for the moment.
This update renames many of the declarations exported by
Data.Macaw.Memory so that we have more consistent names.
The majority of the existing names are now exported with DEPRECATION
warnings. Some of the symbol declarations that were not used by the
Memory datatype have been moved to other modules.
The minor version of macaw-base has been incremented.
Now one can either directly produce an SSA CFG or produce a registerized
one, perhaps mess with it (as with the new
`Lang.Crucible.Utils.RegRewrite` module), then translate it to SSA.
It needs to take (and return) a Crucible state so that we can insert the new
function handle into the handle map (so that the Crucible Call statement can
find it).
The GlobalMap is mapping from virtual addresses computed by a program to the
corresponding logical address in the LLVM memory model during symbolic
simulation. It is needed because addresses in binaries are computed from
bitvectors, which are not valid pointers in the LLVM memory model.
This change turns the GlobalMap from a Data.Map into a function, which is more
flexible and allows for a wider range of possible implementations of this
functionality, especially implementations that introduce numerous disjoint
segments for the original binary contents.