There were two identical definitions of `toCrucibleEndian`, one in
`D.M.S.Memory` and another in `D.M.S.Testing`. This commit removes the
latter in favor of the former, which is actually exported.
Have all additions (at any bit width) go through the special PtrAdd
handler (rather than BVAdd). Also add special handlers for truncation and
extension.
These changes support architectures that do pointer operations at non-pointer
widths (e.g., to detect overflow). These new operations apply the named
operations over just the offset of pointers, preserving the block id.
The `Data.Macaw.Symbolic.Memory` module provides a default memory model and
initial memory setup that is suitable for many symbolic execution
workloads. However, the defaults cannot handle dynamically-linked programs, as
it calls `error` when it attempts to determine an initial value for relocations
it finds in memory.
There are no good defaults for this, as what those values should be depend a lot
on what the verifier wants to prove.
This commit adds some hooks to configure this behavior in the verifier, and is
designed to be extensible and enable other configuration choices where
reasonable.
The original API is unchanged, as it calls the added `newGlobalMemoryWith`
function with a default set of hooks. Callers with special memory handling
needs are directed to use that function.
This updates the `crucible` submodule to include GaloisInc/crucible#906
(`Control granularity of reading uninitialized memory`), as well as the
`semmc` submodule to bring in corresponding changes on its side
(GaloisInc/semmc#69). Some additional `?memOpts :: MemOptions` constraints
needed to be added to some functions in `macaw-symbolic` and
`macaw-refinement` as a result.
Adds support in macaw-aarch32 for conditional returns. These are not supported in core macaw, and are thus architecture-specific block terminators.
This required changes to the type of arch-specific block terminators. Before, `ArchTermStmt` was only parameterized by a state thread (`ids`). This meant that they could not contain macaw (or crucible) values. Some work on. AArch32 requires being able to store condition values in arch terminators (to support conditional returns). This change modifies the `ArchTermStmt` to enable this, which requires a bit of plumbing through various definitions and some extra instances.
In support of actually using this, it also became necessary to plumb fallthrough block labels through the architecture-specific terminator translation in macaw-symbolic.
Note that this change was overdue, as the PowerPC backend was storing macaw values in a way that would have rendered them unusable in the macaw-ppc-symbolic translation, had any interpretation been provided. These new changes will enable a handler to be written for the conditional PowerPC trap instructions.
PowerPC, x86, and ARM have been updated.
Improves the macaw-aarch32 tests. There is now a command line option to save the generated macaw IR for each
discovered function to /tmp. Note that this reuses some infrastructure from the macaw-symbolic tests. This
shared functionality should be extracted into a macaw-testing library.
This change adds an optional argument to `genArchVals` that allows client code to override the backend translation behavior of `MacawArchStmtExtension`s on a statement-by-statement basis. The new argument has type `Maybe (MacawArchStmtExtensionOverride arch)`, where `MacawArchStmtExtensionOverride` is a function that takes a statement and a crucible state, and returns an optional tuple containing the value produced by the statement, as well as an updated state. Returning 'Nothing' indicates that the backend should use its default handler for the statement.
Client code that wishes to maintain the existing default behavior in all cases can simply pass `Nothing` for the new argument to `genArchVals`.
Implement support for symbolically executing system calls in macaw-symbolic. **To update code that does not need to symbolically execute system calls (i.e., most clients of macaw-symbolic), just pass the new `unsupportedSyscalls` default handler as the fifth argument of `macawExtensions`.**
The primary interface is via the new `LookupSyscallHandle` callback passed to `macawExtensions`. This callback inspects the environment and returns a Crucible `FunctionHandle` that models the behavior of the requested system call. Note that this mechanism only supports concrete system calls (i.e., system calls where the system call number is concrete). The x86 backend has been updated to support this new functionality.
The representation of system calls in macaw is still architecture-specific (because there are interesting differences between system call instructions across architectures). The idea is that system calls are now treated in two steps:
1. A macaw-symbolic extension statement that looks up the override to invoke for the given syscall (returned as a Crucible FunctionHandle)
2. A call to that handle
We need this two step approach because the handlers that interpret syntax extension statements cannot symbolically branch (and thus cannot call overrides). The extension interpreter just looks up the necessary handle and uses the standard call/override machinery to handle any branching required to support the system call model functionality.
The major complication to this approach is that system calls need to update values in registers when they return. To capture these updates, the architecture-specific syntax extension needs to explicitly update any machine registers that could possibly be affected. The explicit updates are necessary because machine registers do not exist anymore at the macaw-symbolic level (at least within a block). To handle all of these constraints:
1. System calls are represented as extension functions at the macaw level when lifted from machine code.
2. During translation into crucible (via macaw-symbolic), the extension functions are translated into two statements: a function handle lookup and then a function call (with the return values being explicitly threaded through the Crucible function).
3. During symbolic execution, the lookup statement examines the environment to return the necessary function handle, while the handle is called via the normal machinery.
Note that the feature is entirely controlled by the `LookupSyscallHandle` function, which determines the system call dispatch policy. No system call models are included with this change.
Co-authored-by: Brett Boston <boston@galois.com>
This bumps the `crucible` submodule to bring in:
* `crucible-symio` (GaloisInc/crucible#788). This requires adding a new
project dependency in `cabal.project.dist`.
* GaloisInc/crucible#808, which adds yet another `?memOpts :: MemOptions`
constraint, this time in `doPtrAddOffset`.
GaloisInc/crucible#794 increases the number of functions that use
implicit `MemOptions`, including a handful of key LLVM memory model–related
functions. As a result, many parts of `macaw` need to add implicit `?memOpts`
parameters to accommodate to this change.
The new test suites cover x86_64, PowerPC, and ARM. They test that the semantics are actually correct (rather than just seeing if symbolic execution produces any result). The `Data.Macaw.Symbolic.Testing` module in macaw-symbolic provides some common utilities for symbolic execution engine setup, while there are tailored test harnesses for each architecture.
The semantics of the test harnesses are documented in each architecture test suite, but they:
1. Discover all of the test binaries (which are generated from the included makefiles)
2. Treat each function whose name begins with `test_` as a test entry point
3. Symbolically executes each test case with fully symbolic register states
4. Extracts the return value after symbolic execution, which is treated as the predicate to an assertion that must be proved
- If the test case is in the `pass` subdirectory, it is proved and expected to hold
- If the test case is in the `fail` subdirectory, it is proved and expected to not hold.
Each test harness supports two options for debugging:
- Dumping generated SMT queries
- Dumping generated Macaw IR for inspection
This testing uncovered a bug in the (previously untested) macaw-aarch32-symbolic code. It required a number of submodule updates to:
- Adapt to some what4 changes
- Fix a bug in the LLVM memory model that lets these tests pass
- Adapt to changes to some crucible APIs
This change also modifies the CI configuration to install SMT solvers earlier (which are now needed for all of the symbolic package tests).
Nat is no longer a what4 base type, so we have to adapt various APIs to accommodate that. The template haskell in macaw-semmc is updated to remove Nat cases. Changes to the `SymFn` type required removing a type parameter.
This commit also adds macaw-refinement to CI (which requires installing SMT solvers); that code had to be updated due to the what4 changes.
Co-authored-by: Tristan Ravitch <tristan@galois.com>
* reify block exit in macaw extension
* add return address to block end classification
* use global variable to retain block endings
* update to mkBlockSliceCFG signature
* add haddocks for MacawBlockEnd
Co-authored-by: Tristan Ravitch <tristan@galois.com>