Adds support in macaw-aarch32 for conditional returns. These are not supported in core macaw, and are thus architecture-specific block terminators.
This required changes to the type of arch-specific block terminators. Before, `ArchTermStmt` was only parameterized by a state thread (`ids`). This meant that they could not contain macaw (or crucible) values. Some work on. AArch32 requires being able to store condition values in arch terminators (to support conditional returns). This change modifies the `ArchTermStmt` to enable this, which requires a bit of plumbing through various definitions and some extra instances.
In support of actually using this, it also became necessary to plumb fallthrough block labels through the architecture-specific terminator translation in macaw-symbolic.
Note that this change was overdue, as the PowerPC backend was storing macaw values in a way that would have rendered them unusable in the macaw-ppc-symbolic translation, had any interpretation been provided. These new changes will enable a handler to be written for the conditional PowerPC trap instructions.
PowerPC, x86, and ARM have been updated.
Improves the macaw-aarch32 tests. There is now a command line option to save the generated macaw IR for each
discovered function to /tmp. Note that this reuses some infrastructure from the macaw-symbolic tests. This
shared functionality should be extracted into a macaw-testing library.
This change makes the block classifier heuristic part of the `ArchitectureInfo`
structure. This enables clients and architecture backends to customize the
block classification heuristics. This is most useful for architectures that
have complex architecture-specific block terminators that require analysis to
generate (e.g., conditional returns). It will also make macaw-refinement
simpler in the future, as the SMT-based refinement is just an additional block
classifier (but is currently implemented via a hacky side channel).
This change introduces an ancillary change, which should not be very
user-visible.
It splits the Macaw.Discovery and Macaw.Discovery.State modules to break
module import cycles in a way that enables us to expose the classifier. This
should not be user-visible, as Macaw.Discovery still exports the same
names (with one minor exception that should not appear in user code).
It also moves the definition of the `ArchBlockPrecond` type family; the few
affected places should be updated. User code should probably not be able to see
this.
See `Note [Sign-extending immediate operands in push]` in
`Data.Macaw.X86.Semantics` for the full story. I have also added a test case
in `macaw-x86-symbolic` which ensures that the stack-pointer-decrementing
logic behaves as one would expect.
Bumps in the `flexdis86` submodule to bring in GaloisInc/flexdis86#37.
Fixes#235.
Implement support for symbolically executing system calls in macaw-symbolic. **To update code that does not need to symbolically execute system calls (i.e., most clients of macaw-symbolic), just pass the new `unsupportedSyscalls` default handler as the fifth argument of `macawExtensions`.**
The primary interface is via the new `LookupSyscallHandle` callback passed to `macawExtensions`. This callback inspects the environment and returns a Crucible `FunctionHandle` that models the behavior of the requested system call. Note that this mechanism only supports concrete system calls (i.e., system calls where the system call number is concrete). The x86 backend has been updated to support this new functionality.
The representation of system calls in macaw is still architecture-specific (because there are interesting differences between system call instructions across architectures). The idea is that system calls are now treated in two steps:
1. A macaw-symbolic extension statement that looks up the override to invoke for the given syscall (returned as a Crucible FunctionHandle)
2. A call to that handle
We need this two step approach because the handlers that interpret syntax extension statements cannot symbolically branch (and thus cannot call overrides). The extension interpreter just looks up the necessary handle and uses the standard call/override machinery to handle any branching required to support the system call model functionality.
The major complication to this approach is that system calls need to update values in registers when they return. To capture these updates, the architecture-specific syntax extension needs to explicitly update any machine registers that could possibly be affected. The explicit updates are necessary because machine registers do not exist anymore at the macaw-symbolic level (at least within a block). To handle all of these constraints:
1. System calls are represented as extension functions at the macaw level when lifted from machine code.
2. During translation into crucible (via macaw-symbolic), the extension functions are translated into two statements: a function handle lookup and then a function call (with the return values being explicitly threaded through the Crucible function).
3. During symbolic execution, the lookup statement examines the environment to return the necessary function handle, while the handle is called via the normal machinery.
Note that the feature is entirely controlled by the `LookupSyscallHandle` function, which determines the system call dispatch policy. No system call models are included with this change.
Co-authored-by: Brett Boston <boston@galois.com>
* Update macaw-x86-tests to build properly.
* Fix off by two error in memMapOverwrite
* Introduce some special handling for unsigned-extension in stack
analysis so it knows one value is the unsigned extension of another.
* Error report formating improvements
* Slightly more precise treatment of archfn is bound updates.
Consolidate three different checks that control when to explore
a function into a single one defined in exploreFunPred.
Modify noreturn function calls to not treat the return address
as a potential function entry point.
Add basic checking of LSDA address to compare-dwarfdump.
Minor code refactoring and submodule updates.
* x86: Add semantics for the vpsrld and vpsrlq instructions
* x86: Add semantics for vpaddq
* Fix Haddock for PointwiseLogicalShiftR
* x86: Change vpsubd to PtSub rather than PtAdd
This change treats them as no-ops (which is what they do on all released
hardware). We could represent them with arch extensions. This has a supporting
change in flexdis86 (included as a submodule).
Updates for GHC 8.8
The two main classes of update are related to MonadFail and type alias expansion.
The MonadFail updates introduce explicit MonadFail instances and backward-compatible `fail` implementations under `Monad` for older GHC versions.
The type alias expansion rules changed in GHC 8.8 in a way that breaks the `Simple Lens` idiom; instead, we have to use `Lens'`. Lens started supporting this alias in version 3.8, which was released in 2013.
This change includes necessary submodule updates, as well as the update for the split of what4 into its own repository.
The new registerUse analysis uses a three phase process:
Phase 1 computes invariants about the start state of each block. It
will indicate when registers/stack locations store stack offsets, and
where callee saved registers are stashed. It also memoizes
information about stack reads and writes to simplify later passes.
Phase 2 is a demand analysis that computes which registers and stack
locations must be available to execute the program. It then
propagates those constraints across blocks in the function.
Phase 3 combines the information into a form relevant for function
recovery.
The changes include:
Clean up elf loading to fix a bug in rel addend parsing.
Introduce block preconditions for populating reopt-vcg fields.
Change load options to match reopt's interface.
The goal is to support a jumptable testcase that is not supported by
the current jump bounds check. The jump bounds check needs to be
augmented so that it understands equality relationships between stack
values and registers, and bounds on both.
This patch tracks when a register points to a concrete stack offset.
As part of this, we droped the AbsDomain instance for AbsBlockState.
Clients should now likely use `fnStartAbsBlockState` in lieu of `top`.
The other client visible change is that the ClassifyFailure
constructor now has an extra argument with details about why
classification failure occured.
This introduces a new datatype CValue for representing constants
in Macaw programs, modifies the existing Value datatype to use then,
and introduces patterns for compatibility with existing datatypes.
The patch also updates the function argument analysis to use more
explicit argument passing rather than monadic updates. The intent is
to help clarify when data is initialized rather than updated.
Finally this updates a README and does some minor updates.
This primarily refines the abstract state propagated to branch
pairs. It was needed on the ARM platform to support the IT blocks
with the changes to the Core representation in macaw-base 0.3.6.
This also includes a few simplifications added and comment
improvements.