380d732d0e
Implement support for symbolically executing system calls in macaw-symbolic. **To update code that does not need to symbolically execute system calls (i.e., most clients of macaw-symbolic), just pass the new `unsupportedSyscalls` default handler as the fifth argument of `macawExtensions`.** The primary interface is via the new `LookupSyscallHandle` callback passed to `macawExtensions`. This callback inspects the environment and returns a Crucible `FunctionHandle` that models the behavior of the requested system call. Note that this mechanism only supports concrete system calls (i.e., system calls where the system call number is concrete). The x86 backend has been updated to support this new functionality. The representation of system calls in macaw is still architecture-specific (because there are interesting differences between system call instructions across architectures). The idea is that system calls are now treated in two steps: 1. A macaw-symbolic extension statement that looks up the override to invoke for the given syscall (returned as a Crucible FunctionHandle) 2. A call to that handle We need this two step approach because the handlers that interpret syntax extension statements cannot symbolically branch (and thus cannot call overrides). The extension interpreter just looks up the necessary handle and uses the standard call/override machinery to handle any branching required to support the system call model functionality. The major complication to this approach is that system calls need to update values in registers when they return. To capture these updates, the architecture-specific syntax extension needs to explicitly update any machine registers that could possibly be affected. The explicit updates are necessary because machine registers do not exist anymore at the macaw-symbolic level (at least within a block). To handle all of these constraints: 1. System calls are represented as extension functions at the macaw level when lifted from machine code. 2. During translation into crucible (via macaw-symbolic), the extension functions are translated into two statements: a function handle lookup and then a function call (with the return values being explicitly threaded through the Crucible function). 3. During symbolic execution, the lookup statement examines the environment to return the necessary function handle, while the handle is called via the normal machinery. Note that the feature is entirely controlled by the `LookupSyscallHandle` function, which determines the system call dispatch policy. No system call models are included with this change. Co-authored-by: Brett Boston <boston@galois.com> |
||
---|---|---|
.. | ||
examples | ||
src/Data/Macaw | ||
test | ||
LICENSE | ||
macaw-symbolic.cabal | ||
README.org |
Overview
The macaw-symbolic library provides a mechanism for translating machine code functions discovered by macaw into Crucible CFGs that can then be symbolically simulated.
The core macaw-symbolic library supports translating generic macaw into crucible, but is not a standalone library. To translate actual machine code, an architecture-specific backend is required. For example, macaw-x86-symbolic can be used to translate x86_64 binaries into crucible. Examples for using macaw-symbolic (and architecture-specific backends) are available in Data.Macaw.Symbolic.
In order to avoid API bloat, the definitions required to implement a new architecture-specific backend are exported through the Data.Macaw.Symbolic.Backend module.
An additional module, Data.Macaw.Symbolic.Memory, provides an example of how to handle memory address translation in the simulator for machine code programs. There are other possible ways to translate memory addresses, but this module provides a versatile example that can serve many common use cases.