Commit Graph

26 Commits

Author SHA1 Message Date
Ryan Scott
c2c2a3d6bd macaw-symbolic: Implement MacawFreshSymbolic for all macaw Types
Rather than `error`ing, we now generate fresh constants for all possible
`macaw` `Type`s that are supplied to the `MacawFreshSymbolic` operation.

Fixes #301.
2023-11-17 17:08:34 -05:00
Ryan Scott
c3c5330f7f Don't use deprecated TypeInType extension
As of GHC 8.6, `TypeInType` is simply an alias for `DataKinds` + `PolyKinds`.
And as of GHC 9.6, `TypeInType` is deprecated. Let's just remove our uses of
`TypeInType` to avoid deprecation warnings.
2023-08-21 08:16:10 -04:00
Ryan Scott
7e346081a5 Fix -Wtype-equality-requires-operators warnings
GHC 9.4 adds `-Wtype-equality-requires-operators` to `-Wall`, which warns about
certain uses of type equalities that are not forward-compatible with planned
changes in GHC. See [this
section](https://gitlab.haskell.org/ghc/ghc/-/wikis/migration/9.4?version_id=b60e52482a666d25638d59cd7e86851ddf971dc1#-is-now-a-type-operator)
of the GHC 9.4 Migration Guide. These warnings are easily fixed by enabling the
`TypeOperators` extension.
2023-04-18 15:17:23 -04:00
Ryan Scott
dc7c1759f1 macaw-symbolic: Test both memory model configurations in test suites 2023-03-14 13:27:07 -04:00
Ryan Scott
97c61e471a Add basic support for simulating PLT stubs and shared libraries
This extends `Data.Macaw.Symbolic.Testing` in `macaw-symbolic` to be able to
handle binaries that depend on shared libraries. This is fully functional for
the x86-64 and AArch32 symbolic backends, and I have added test cases to the
respective repos demonstrating that it works. (The PowerPC backend is not yet
supported. At a minimum, this is blocked on GaloisInc/elf-edit#35.)

To implement this, I also needed to add some additional infrastructure to
`macaw-base` (I put this infrastructure here as it doesn't depend on any
Crucible-specific functionality):

* `Data.Macaw.Memory.ElfLoader.DynamicDependencies`: a basic ELF dynamic
  loader that performs a breadth-first search over all `DT_NEEDED` entries
  that an ELF binary depends on (both directly and indirectly).
* `Data.Macaw.Memory.ElfLoader.PLTStubs`: a collection of heuristics for
  detecting the addresses of PLT stubs in a dynamically linked binary.

It is worth noting that shared libraries are rife with nuance and subtlety,
and the way `macaw` models shared libraries is not 100% accurate. I have
written a length `Note [Shared libraries]` in `Data.Macaw.Symbolic.Testing`
to describe where corners had to be cut.

Fixes #318.
2023-02-23 17:16:12 -05:00
Ryan Scott
da06413771 Add test case for movt semantics
This requires bumping the `asl-translator` submodule to bring the changes from
GaloisInc/asl-translator#47, which are necessary for the test case to work.
2022-10-26 18:15:06 -04:00
Ryan Scott
6e020bcde6 Fix -Wincomplete-uni-patterns warnings
GHC 9.2 adds `-Wincomplete-uni-patterns` to `-Wall`, which uncovers a slew of
previously unnoticed warnings in `macaw`. This patch fixes them, mostly by
adding explicit fall-through cases.
2022-05-31 15:50:48 -04:00
Tristan Ravitch
659cfff6c9
aarch32: Implement support for conditional calls (#289)
The core of macaw cannot represent conditional calls because the existing block terminators are not sufficiently expressive and it doesn't support creating synthetic blocks to represent control flow not directly tied to machine addresses.

To work around this, we introduce ARM-specific block terminators for conditional calls and plumb them through up to macaw-aarch32-symbolic.

Fixes #288
2022-05-20 15:17:26 -07:00
Tristan Ravitch
8e10643b0f
Fix tail call classification (#286)
The tail call classifier came after the jump classifier, which was a problem because it is less strict than the tail call classifier, meaning it would always fire.  This commit moves direct jump to be the last classifier applied, giving the others a chance.

Includes a test case in the ARM backend.

This requires some updates to some of the expected test results, as a few blocks are now classified as tail calls that were
plain jumps before.  They really could be considered either.  I think it would be nice if these could be classified as jumps instead, but the reason they are flagged as tail calls is mostly down to the fact that their surrounding context is so simple that either interpretation works.

Correcting this would require some heuristics based on additional analysis passes.

The test harness for macaw symbolic required a few changes because the new detection of some jumps as tail calls introduces new calls into the symbolic test suites. However, the symbolic testing harness did not support calls before.  Adding support required a bit of plumbing, including a more extensive code discovery pass.


Fixes #285
2022-05-10 07:29:55 -07:00
Daniel Matichuk
bbc0b6a40a
Bump ASL revision to handle slicing issues with UBFX instruction on ARM (#280)
* bump asl-translator

* add ubfx tests

* add CI version to workflow
2022-04-20 10:08:36 -07:00
Kevin Quick
78b45a10c6 Only fix personality to (MS.MacawSimulatorState sym) where required. 2022-02-21 13:27:20 -05:00
Rob Dockins
465a84ee49 Update with changes flowing from GaloicInc/crucible#945.
This mostly deals with the splitting of the old `sym` type into
two: one for dealing with expression creation, and a new simulator
backend type for dealing with control-flow and assertions.
2022-01-24 16:24:07 -08:00
Tristan Ravitch
63a65c3d85 x86: Fix failing proof obligations due to EvenParity
See the writeup in Crucible.hs in this commit for details. In short, the recent
changes to generalize `PtrAdd` triggered a failing proof obligation due to a use
of `llvmPointer_bv`.  The new implementation is as sound as the previous one,
but more general.

Fixes #260
2022-01-21 15:33:10 -08:00
Tristan Ravitch
3e918f8b51
Revise handling of syscalls in AArch32 to match X86 (#246)
The old formulation (with system calls as block terminators) proved to be
impossible to implement properly. Handlers for syntax overrides have very
limited types (`IO`, rather than `OverrideSim`), which made symbolic branching
and reusing overrides impossible.

This change replaces the system call block terminator with an arch-specific
function that is translated into a function handle lookup (which is then
dispatched to with a call).

Unfortunately, this refactoring required combining the AArch32 simplification
module with the architecture extension definitions, due to the new translation
relying on the simplifier instance.
2021-11-24 11:59:56 -08:00
Tristan Ravitch
9ce3d43188
AArch32: Support conditional returns (#243)
Adds support in macaw-aarch32 for conditional returns. These are not supported in core macaw, and are thus architecture-specific block terminators.

This required changes to the type of arch-specific block terminators. Before, `ArchTermStmt` was only parameterized by a state thread (`ids`).  This meant that they could not contain macaw (or crucible) values.  Some work on. AArch32 requires being able to store condition values in arch terminators (to support conditional returns). This change modifies the `ArchTermStmt` to enable this, which requires a bit of plumbing through various definitions and some extra instances.

In support of actually using this, it also became necessary to plumb fallthrough block labels through the architecture-specific terminator translation in macaw-symbolic.

Note that this change was overdue, as the PowerPC backend was storing macaw values in a way that would have rendered them unusable in the macaw-ppc-symbolic translation, had any interpretation been provided.  These new changes will enable a handler to be written for the conditional PowerPC trap instructions.

PowerPC, x86, and ARM have been updated.

Improves the macaw-aarch32 tests. There is now a command line option to save the generated macaw IR for each
discovered function to /tmp. Note that this reuses some infrastructure from the macaw-symbolic tests. This
shared functionality should be extracted into a macaw-testing library.
2021-11-19 16:20:50 -08:00
Brett Boston
a336895da7
Add optional override for MacawArchStmtExtensions to genArchVals (#230)
This change adds an optional argument to `genArchVals` that allows client code to override the backend translation behavior of `MacawArchStmtExtension`s on a statement-by-statement basis.  The new argument has type `Maybe (MacawArchStmtExtensionOverride arch)`, where `MacawArchStmtExtensionOverride` is a function that takes a statement and a crucible state, and returns an optional tuple containing the value produced by the statement, as well as an updated state.  Returning 'Nothing' indicates that the backend should use its default handler for the statement.

Client code that wishes to maintain the existing default behavior in all cases can simply pass `Nothing` for the new argument to `genArchVals`.
2021-09-14 18:24:47 -07:00
Ryan Scott
7f7de2a59b
Adapt to GaloisInc/crucible#794 (#224)
GaloisInc/crucible#794 increases the number of functions that use
implicit `MemOptions`, including a handful of key LLVM memory model–related
functions. As a result, many parts of `macaw` need to add implicit `?memOpts`
parameters to accommodate to this change.
2021-08-23 20:39:08 -04:00
Tristan Ravitch
dbb4c83f08
Add a testing framework for macaw-symbolic (#184)
The new test suites cover x86_64, PowerPC, and ARM. They test that the semantics are actually correct (rather than just seeing if symbolic execution produces any result). The `Data.Macaw.Symbolic.Testing` module in macaw-symbolic provides some common utilities for symbolic execution engine setup, while there are tailored test harnesses for each architecture.

The semantics of the test harnesses are documented in each architecture test suite, but they:
1. Discover all of the test binaries (which are generated from the included makefiles)
2. Treat each function whose name begins with `test_` as a test entry point
3. Symbolically executes each test case with fully symbolic register states
4. Extracts the return value after symbolic execution, which is treated as the predicate to an assertion that must be proved
    - If the test case is in the `pass` subdirectory, it is proved and expected to hold
    - If the test case is in the `fail` subdirectory, it is proved and expected to not hold.

Each test harness supports two options for debugging:
- Dumping generated SMT queries
- Dumping generated Macaw IR for inspection

This testing uncovered a bug in the (previously untested) macaw-aarch32-symbolic code. It required a number of submodule updates to:

- Adapt to some what4 changes
- Fix a bug in the LLVM memory model that lets these tests pass
- Adapt to changes to some crucible APIs

This change also modifies the CI configuration to install SMT solvers earlier (which are now needed for all of the symbolic package tests).
2021-03-01 09:21:44 -08:00
Tristan Ravitch
52f3efb2c9 aarch32: Change the register context
The previous implementation used all of the registers defined in ASL translator.
It turns out that we don't actually use all of them in macaw; the significant
difference is that the `__memory` pseudo-location is used by asl-translator, but
not macaw (as macaw handles memory through implicit context).  This change
modifies the register assignment to include everything except memory (which
requires an update to the asl-translator submodule).
2021-01-16 17:42:01 -08:00
Tristan Ravitch
37861df8c7
Support for mixed ARM/Thumb binaries (#174)
aarch32: Support mixed ARM/Thumb1 binaries

This updates the aarch32 backend to decode Thumb instructions and generate the Thumb semantics. The major implementation change is to use the `ArchBlockPrecond` feature of macaw to track the Thumb state (`PSTATE_T`) across block boundaries.

The ARM code discovery decides whether or not a function entry point should be decoded as Thumb by examining the low bit of the function address. If the low bit is set, it is a Thumb entry point. This has the slightly odd effect of causing macaw to say that the function is at the address with the low bit set, which is not technically true. This is documented in the README, but not obvious on inspection. Most use cases should not care, and can in any case account for it. In the future, it should be possible to fix this (though it will require some changes to the core of macaw).
2020-11-02 12:48:01 -08:00
Daniel Matichuk
8f9ee8624e fixup GenArchInfo instance 2020-10-05 12:54:22 -07:00
Daniel Matichuk
d0349a7af3 Merge remote-tracking branch 'origin/master' into memgen 2020-10-05 12:41:36 -07:00
Tristan Ravitch
cbc7a3ca31
Feature/aarch32 symbolic backend (#162)
aarch32-symbolic: Implement most of the remaining macaw-aarch32-symbolic bits

It should be usable now, modulo some execution-time semantics for the floating
point operations.  There will be a separate ticket covering the changes required
for them (some refactoring of how they are handled during translation is required).
2020-10-05 12:31:39 -07:00
Daniel Matichuk
b04e0a2fa6 fix ArchInfo instance for macaw-aarch32-symbolic 2020-10-05 12:22:07 -07:00
Tristan Ravitch
b9672eb7f9 Add a missing symbolic instance 2020-04-05 22:08:58 -07:00
Tristan Ravitch
f3b6b6ba4a Add a (dummy) symbolic backend for AArch32 2020-04-05 21:16:03 -07:00