The old formulation (with system calls as block terminators) proved to be
impossible to implement properly. Handlers for syntax overrides have very
limited types (`IO`, rather than `OverrideSim`), which made symbolic branching
and reusing overrides impossible.
This change replaces the system call block terminator with an arch-specific
function that is translated into a function handle lookup (which is then
dispatched to with a call).
Unfortunately, this refactoring required combining the AArch32 simplification
module with the architecture extension definitions, due to the new translation
relying on the simplifier instance.
Adds support in macaw-aarch32 for conditional returns. These are not supported in core macaw, and are thus architecture-specific block terminators.
This required changes to the type of arch-specific block terminators. Before, `ArchTermStmt` was only parameterized by a state thread (`ids`). This meant that they could not contain macaw (or crucible) values. Some work on. AArch32 requires being able to store condition values in arch terminators (to support conditional returns). This change modifies the `ArchTermStmt` to enable this, which requires a bit of plumbing through various definitions and some extra instances.
In support of actually using this, it also became necessary to plumb fallthrough block labels through the architecture-specific terminator translation in macaw-symbolic.
Note that this change was overdue, as the PowerPC backend was storing macaw values in a way that would have rendered them unusable in the macaw-ppc-symbolic translation, had any interpretation been provided. These new changes will enable a handler to be written for the conditional PowerPC trap instructions.
PowerPC, x86, and ARM have been updated.
Improves the macaw-aarch32 tests. There is now a command line option to save the generated macaw IR for each
discovered function to /tmp. Note that this reuses some infrastructure from the macaw-symbolic tests. This
shared functionality should be extracted into a macaw-testing library.
This change adds an optional argument to `genArchVals` that allows client code to override the backend translation behavior of `MacawArchStmtExtension`s on a statement-by-statement basis. The new argument has type `Maybe (MacawArchStmtExtensionOverride arch)`, where `MacawArchStmtExtensionOverride` is a function that takes a statement and a crucible state, and returns an optional tuple containing the value produced by the statement, as well as an updated state. Returning 'Nothing' indicates that the backend should use its default handler for the statement.
Client code that wishes to maintain the existing default behavior in all cases can simply pass `Nothing` for the new argument to `genArchVals`.
GaloisInc/crucible#794 increases the number of functions that use
implicit `MemOptions`, including a handful of key LLVM memory model–related
functions. As a result, many parts of `macaw` need to add implicit `?memOpts`
parameters to accommodate to this change.
The new test suites cover x86_64, PowerPC, and ARM. They test that the semantics are actually correct (rather than just seeing if symbolic execution produces any result). The `Data.Macaw.Symbolic.Testing` module in macaw-symbolic provides some common utilities for symbolic execution engine setup, while there are tailored test harnesses for each architecture.
The semantics of the test harnesses are documented in each architecture test suite, but they:
1. Discover all of the test binaries (which are generated from the included makefiles)
2. Treat each function whose name begins with `test_` as a test entry point
3. Symbolically executes each test case with fully symbolic register states
4. Extracts the return value after symbolic execution, which is treated as the predicate to an assertion that must be proved
- If the test case is in the `pass` subdirectory, it is proved and expected to hold
- If the test case is in the `fail` subdirectory, it is proved and expected to not hold.
Each test harness supports two options for debugging:
- Dumping generated SMT queries
- Dumping generated Macaw IR for inspection
This testing uncovered a bug in the (previously untested) macaw-aarch32-symbolic code. It required a number of submodule updates to:
- Adapt to some what4 changes
- Fix a bug in the LLVM memory model that lets these tests pass
- Adapt to changes to some crucible APIs
This change also modifies the CI configuration to install SMT solvers earlier (which are now needed for all of the symbolic package tests).
The previous implementation used all of the registers defined in ASL translator.
It turns out that we don't actually use all of them in macaw; the significant
difference is that the `__memory` pseudo-location is used by asl-translator, but
not macaw (as macaw handles memory through implicit context). This change
modifies the register assignment to include everything except memory (which
requires an update to the asl-translator submodule).
aarch32: Support mixed ARM/Thumb1 binaries
This updates the aarch32 backend to decode Thumb instructions and generate the Thumb semantics. The major implementation change is to use the `ArchBlockPrecond` feature of macaw to track the Thumb state (`PSTATE_T`) across block boundaries.
The ARM code discovery decides whether or not a function entry point should be decoded as Thumb by examining the low bit of the function address. If the low bit is set, it is a Thumb entry point. This has the slightly odd effect of causing macaw to say that the function is at the address with the low bit set, which is not technically true. This is documented in the README, but not obvious on inspection. Most use cases should not care, and can in any case account for it. In the future, it should be possible to fix this (though it will require some changes to the core of macaw).
aarch32-symbolic: Implement most of the remaining macaw-aarch32-symbolic bits
It should be usable now, modulo some execution-time semantics for the floating
point operations. There will be a separate ticket covering the changes required
for them (some refactoring of how they are handled during translation is required).