This change adds an optional argument to `genArchVals` that allows client code to override the backend translation behavior of `MacawArchStmtExtension`s on a statement-by-statement basis. The new argument has type `Maybe (MacawArchStmtExtensionOverride arch)`, where `MacawArchStmtExtensionOverride` is a function that takes a statement and a crucible state, and returns an optional tuple containing the value produced by the statement, as well as an updated state. Returning 'Nothing' indicates that the backend should use its default handler for the statement.
Client code that wishes to maintain the existing default behavior in all cases can simply pass `Nothing` for the new argument to `genArchVals`.
Implement support for symbolically executing system calls in macaw-symbolic. **To update code that does not need to symbolically execute system calls (i.e., most clients of macaw-symbolic), just pass the new `unsupportedSyscalls` default handler as the fifth argument of `macawExtensions`.**
The primary interface is via the new `LookupSyscallHandle` callback passed to `macawExtensions`. This callback inspects the environment and returns a Crucible `FunctionHandle` that models the behavior of the requested system call. Note that this mechanism only supports concrete system calls (i.e., system calls where the system call number is concrete). The x86 backend has been updated to support this new functionality.
The representation of system calls in macaw is still architecture-specific (because there are interesting differences between system call instructions across architectures). The idea is that system calls are now treated in two steps:
1. A macaw-symbolic extension statement that looks up the override to invoke for the given syscall (returned as a Crucible FunctionHandle)
2. A call to that handle
We need this two step approach because the handlers that interpret syntax extension statements cannot symbolically branch (and thus cannot call overrides). The extension interpreter just looks up the necessary handle and uses the standard call/override machinery to handle any branching required to support the system call model functionality.
The major complication to this approach is that system calls need to update values in registers when they return. To capture these updates, the architecture-specific syntax extension needs to explicitly update any machine registers that could possibly be affected. The explicit updates are necessary because machine registers do not exist anymore at the macaw-symbolic level (at least within a block). To handle all of these constraints:
1. System calls are represented as extension functions at the macaw level when lifted from machine code.
2. During translation into crucible (via macaw-symbolic), the extension functions are translated into two statements: a function handle lookup and then a function call (with the return values being explicitly threaded through the Crucible function).
3. During symbolic execution, the lookup statement examines the environment to return the necessary function handle, while the handle is called via the normal machinery.
Note that the feature is entirely controlled by the `LookupSyscallHandle` function, which determines the system call dispatch policy. No system call models are included with this change.
Co-authored-by: Brett Boston <boston@galois.com>
GaloisInc/crucible#794 increases the number of functions that use
implicit `MemOptions`, including a handful of key LLVM memory model–related
functions. As a result, many parts of `macaw` need to add implicit `?memOpts`
parameters to accommodate to this change.
Nat is no longer a what4 base type, so we have to adapt various APIs to accommodate that. The template haskell in macaw-semmc is updated to remove Nat cases. Changes to the `SymFn` type required removing a type parameter.
This commit also adds macaw-refinement to CI (which requires installing SMT solvers); that code had to be updated due to the what4 changes.
Co-authored-by: Tristan Ravitch <tristan@galois.com>
This commit updates macaw-refinement to work with the latest macaw/crucible and makes a few improvements along the way.
The major changes involved in this are:
* Block labels were removed from macaw, so we had to come up with an alternative approach to making synthetic blocks to represent dispatch resolved by macaw-refinement that is not really a jump table. We considered adding a new terminator that encoded "computed IP-based dispatch", but there was concern about the impact on client code. Instead, we added a field to the `DiscoveryFunInfo` that records "external" resolutions to indirect control flow (e.g., as by an SMT solver in macaw-refinement). The hook by which we feed SMT-based resolutions back into macaw was modified accordingly (`addDiscoveredFunctionBlockTargets`).
* Solver invocation changed to allow solver selection and parallel solver application.
* Logging is now done via the `lumberjack` library.
* macaw-symbolic now uses the "external" resolutions in `DiscoveryFunInfo` while building crucible CFGs.
* The path creation code in macaw-refinement was simplified significantly and the approach to path creation has been documented.
* The run-refinement tool is now more featureful.
* The test suite is a bit more structured and no longer depends on the printed output of the discovery process.
Attempting to refine the switching test for PPC executables ends up
with a non-terminating Z3 process, so this test is disabled until this
is diagnosed.