Commit Graph

45 Commits

Author SHA1 Message Date
Your Name
dfaa7cbf68 x86-symbolic: Add mm* aliases for st* registers 2024-09-24 15:17:09 -04:00
Your Name
f4f1e1fe43 x86-symbolic: Indexes for the remaining registers 2024-09-24 14:28:28 -04:00
Your Name
896d35cba9 x86-symbolic: Remove terse register names from top-level exports
These names are so short that they might conflict with local names in
files that import this module. Instead, export them from `Regs` and
encourage qualified use of that module.
2024-09-24 13:50:39 -04:00
Your Name
79cf6eb04d x86-symbolic: Add Index definitions, type aliases for X87 status regs 2024-09-24 13:23:29 -04:00
Your Name
5f28a031c3 x86-symbolic: Simplify register Index definitions, add flags registers 2024-09-24 13:15:26 -04:00
Your Name
33af2814d9 symbolic-aarch32: Use ABI-compatible stack setup code in test harness 2024-09-23 13:16:43 -04:00
Langston Barrett
f80b275f9c symbolic: Sort imports 2024-09-11 17:58:29 -04:00
Langston Barrett
edc9635fe7 x86-symbolic: Use SysV-compatible stack setup in test suite
Fix a logic bug (bytes, not bits!) along the way
2024-09-11 17:57:06 -04:00
Langston Barrett
9f2da79c29 symbolic: Try proving *all* safety conditions 2024-09-11 13:11:48 -04:00
Langston Barrett
be0a57f555 symbolic: Don't feed the stack pointer to the ResultExtractor
None of the supported architectures return values via the stack, and
tracking the stack pointer needlessly complicates the code.
2024-09-05 11:22:55 -04:00
Langston Barrett
813d057881 x86-{symbolic,syntax}: Indexs and names for GP registers 2024-08-16 15:32:49 -04:00
Langston Barrett
97751b0895 x86-symbolic: Add Indexs into MacawCrucibleRegTypes for named registers 2024-08-16 15:32:49 -04:00
Ryan Scott
1add47389a macaw-x86: Fix call semantics when call target involves the stack pointer
Previously, the `macaw-x86` semantics for `call` would retrieve the call target
*after* pushing the next instruction's address to the stack, but if the call
target involves the stack pointer, then this would mean that it would get the
next instruction's address when retrieving the call target. This is not what is
intended!

This patch fixes the issue by always retrieving the call target *before*
pushing the next instruction's address to the stack. I have added a test case
to the `macaw-x86-symbolic` test suite which demonstrates that this fix works
as intended.

Fixes #420.
2024-08-13 12:31:09 -04:00
Ryan Scott
a6ff58f473 macaw-x86-symbolic: Fix idiv/div semantics
When converting a Macaw value with the Macaw type `TupleType [x_1, ..., x_n]`
to Crucible, the resulting Crucible value will have the Crucible type
`StructType (EmptyCtx ::> ToCrucibleType x_n ::> ... ::> ToCrucibleType x_1)`.
(See `macawListToCrucible(M)` in `Data.Macaw.Symbolic.PersistentState` for
where this is implemented.) Note that the order of the tuple's fields is
reversed in the process of converting it to a Crucible struct. This is a
convention that one must keep in mind when dealing with Macaw tuples at the
Crucible level.

As it turns out, the part of `macaw-x86-symbolic` reponsible for interpreting
the semantics of the `idiv` instruction (for signed quotient/remainder) and the
`div` instruction (for unsigned quotient/remainder) were _not_ respecting this
convention. This is because the `macaw-x86-symbolic` semantics were returning a
Crucible struct consisting of `Empty :> quotient :> remainder)`, but at the
Macaw level, this was interpreted as the tuple `(remainder, quotient)`, which
is the opposite of the intended order. This led to subtle bugs such as those
observed in #393.

The solution is straightforward: have the `macaw-x86-symbolic` semantics
compute `Empty :> remainder :> quotient` instead. Somewhat counterintuitive,
but it does work.

Fixes #393.
2024-07-12 16:56:51 -04:00
Ryan Scott
dc7c1759f1 macaw-symbolic: Test both memory model configurations in test suites 2023-03-14 13:27:07 -04:00
Ryan Scott
97c61e471a Add basic support for simulating PLT stubs and shared libraries
This extends `Data.Macaw.Symbolic.Testing` in `macaw-symbolic` to be able to
handle binaries that depend on shared libraries. This is fully functional for
the x86-64 and AArch32 symbolic backends, and I have added test cases to the
respective repos demonstrating that it works. (The PowerPC backend is not yet
supported. At a minimum, this is blocked on GaloisInc/elf-edit#35.)

To implement this, I also needed to add some additional infrastructure to
`macaw-base` (I put this infrastructure here as it doesn't depend on any
Crucible-specific functionality):

* `Data.Macaw.Memory.ElfLoader.DynamicDependencies`: a basic ELF dynamic
  loader that performs a breadth-first search over all `DT_NEEDED` entries
  that an ELF binary depends on (both directly and indirectly).
* `Data.Macaw.Memory.ElfLoader.PLTStubs`: a collection of heuristics for
  detecting the addresses of PLT stubs in a dynamically linked binary.

It is worth noting that shared libraries are rife with nuance and subtlety,
and the way `macaw` models shared libraries is not 100% accurate. I have
written a length `Note [Shared libraries]` in `Data.Macaw.Symbolic.Testing`
to describe where corners had to be cut.

Fixes #318.
2023-02-23 17:16:12 -05:00
Ryan Scott
6e020bcde6 Fix -Wincomplete-uni-patterns warnings
GHC 9.2 adds `-Wincomplete-uni-patterns` to `-Wall`, which uncovers a slew of
previously unnoticed warnings in `macaw`. This patch fixes them, mostly by
adding explicit fall-through cases.
2022-05-31 15:50:48 -04:00
Tristan Ravitch
8e10643b0f
Fix tail call classification (#286)
The tail call classifier came after the jump classifier, which was a problem because it is less strict than the tail call classifier, meaning it would always fire.  This commit moves direct jump to be the last classifier applied, giving the others a chance.

Includes a test case in the ARM backend.

This requires some updates to some of the expected test results, as a few blocks are now classified as tail calls that were
plain jumps before.  They really could be considered either.  I think it would be nice if these could be classified as jumps instead, but the reason they are flagged as tail calls is mostly down to the fact that their surrounding context is so simple that either interpretation works.

Correcting this would require some heuristics based on additional analysis passes.

The test harness for macaw symbolic required a few changes because the new detection of some jumps as tail calls introduces new calls into the symbolic test suites. However, the symbolic testing harness did not support calls before.  Adding support required a bit of plumbing, including a more extensive code discovery pass.


Fixes #285
2022-05-10 07:29:55 -07:00
Rob Dockins
465a84ee49 Update with changes flowing from GaloicInc/crucible#945.
This mostly deals with the splitting of the old `sym` type into
two: one for dealing with expression creation, and a new simulator
backend type for dealing with control-flow and assertions.
2022-01-24 16:24:07 -08:00
Tristan Ravitch
63a65c3d85 x86: Fix failing proof obligations due to EvenParity
See the writeup in Crucible.hs in this commit for details. In short, the recent
changes to generalize `PtrAdd` triggered a failing proof obligation due to a use
of `llvmPointer_bv`.  The new implementation is as sound as the previous one,
but more general.

Fixes #260
2022-01-21 15:33:10 -08:00
Ryan Scott
5547632f65 macaw-x86: Handle sign-extended immediates in def_push
See `Note [Sign-extending immediate operands in push]` in
`Data.Macaw.X86.Semantics` for the full story. I have also added a test case
in `macaw-x86-symbolic` which ensures that the stack-pointer-decrementing
logic behaves as one would expect.

Bumps in the `flexdis86` submodule to bring in GaloisInc/flexdis86#37.

Fixes #235.
2021-10-12 16:37:21 -04:00
Brett Boston
a336895da7
Add optional override for MacawArchStmtExtensions to genArchVals (#230)
This change adds an optional argument to `genArchVals` that allows client code to override the backend translation behavior of `MacawArchStmtExtension`s on a statement-by-statement basis.  The new argument has type `Maybe (MacawArchStmtExtensionOverride arch)`, where `MacawArchStmtExtensionOverride` is a function that takes a statement and a crucible state, and returns an optional tuple containing the value produced by the statement, as well as an updated state.  Returning 'Nothing' indicates that the backend should use its default handler for the statement.

Client code that wishes to maintain the existing default behavior in all cases can simply pass `Nothing` for the new argument to `genArchVals`.
2021-09-14 18:24:47 -07:00
Ryan Scott
7f7de2a59b
Adapt to GaloisInc/crucible#794 (#224)
GaloisInc/crucible#794 increases the number of functions that use
implicit `MemOptions`, including a handful of key LLVM memory model–related
functions. As a result, many parts of `macaw` need to add implicit `?memOpts`
parameters to accommodate to this change.
2021-08-23 20:39:08 -04:00
Sam Breese
135fb062bb
x86: Fix semantics for BSF and BSR instructions (#216)
* x86: Fix semantics for BSF and BSR instructions

* Add a test for BSR and BSF
2021-07-13 14:14:59 -04:00
Tristan Ravitch
dbb4c83f08
Add a testing framework for macaw-symbolic (#184)
The new test suites cover x86_64, PowerPC, and ARM. They test that the semantics are actually correct (rather than just seeing if symbolic execution produces any result). The `Data.Macaw.Symbolic.Testing` module in macaw-symbolic provides some common utilities for symbolic execution engine setup, while there are tailored test harnesses for each architecture.

The semantics of the test harnesses are documented in each architecture test suite, but they:
1. Discover all of the test binaries (which are generated from the included makefiles)
2. Treat each function whose name begins with `test_` as a test entry point
3. Symbolically executes each test case with fully symbolic register states
4. Extracts the return value after symbolic execution, which is treated as the predicate to an assertion that must be proved
    - If the test case is in the `pass` subdirectory, it is proved and expected to hold
    - If the test case is in the `fail` subdirectory, it is proved and expected to not hold.

Each test harness supports two options for debugging:
- Dumping generated SMT queries
- Dumping generated Macaw IR for inspection

This testing uncovered a bug in the (previously untested) macaw-aarch32-symbolic code. It required a number of submodule updates to:

- Adapt to some what4 changes
- Fix a bug in the LLVM memory model that lets these tests pass
- Adapt to changes to some crucible APIs

This change also modifies the CI configuration to install SMT solvers earlier (which are now needed for all of the symbolic package tests).
2021-03-01 09:21:44 -08:00
Joe Hendrix
3991d3a1f1 Fix macaw-x86-symbolic tests. 2020-11-12 13:58:40 -08:00
Rob Dockins
99f8cb0bdf Update to use new HasLLVMAnn API, which requires an action for
recording (or discarding) annotations rather than a map.
2020-09-11 14:40:02 -07:00
Kevin Quick
22f2a3122c
[x86_symbolic] Create badBehaviorMap earlier in test. 2020-08-10 09:48:50 -07:00
Andrei Stefanescu
0be59e5815
Update Macaw to use HasLLVMAnn. (#122) 2020-04-02 17:58:47 -07:00
Tristan Ravitch
80125921a9 This commit re-implements the memory model used by macaw symbolic
There are two major changes:
- The interface to memory models in Data.Macaw.Symbolic has changed
- The suggested implementation in Data.Macaw.Symbolic.Memory has changed

The change improves performance and fixes a soundness bug.

* `macawExtensions` (Data.Macaw.Symbolic) takes a new argument: a `MkGlobalPointerValidityPred`.  Use `mkGlobalPointerValidityPred` to provide one.
* `mapRegionPointers` no longer takes a default pointer argument (delete it at call sites)
* `GlobalMap` returns an `LLVMPtr sym w` instead of a `Maybe (LLVMPtr sym w)`

Users of the suggested memory model do not need to worry about the last change,
as it has been migrated.  If you provided your own address mapping function, it
must now be total.  This is annoying, but the old API was unsound because
macaw-symbolic did not have enough information to correctly handle the `Nothing`
case.  The idea of the change is that the mapping function should translate any
concrete pointers as appropriate, while symbolic pointers should generate a mux
over all possible allocations.  Unfortunately, macaw-symbolic does not have
enough information to generate the mux itself, as there may be allocations
created externally.

This interface and implementation is concerned with handling pointers to static
memory in a binary.  These are distinguished from pointers to
dynamically-allocated or stack memory because many machine code instructions
compute bitvectors and treat them as pointers.  In the LLVM memory model used by
macaw-symbolic, each memory allocation has a block identifier (a natural
number).  The stack and each heap allocation get unique block identifiers.
However, pointers to static memory have no block identifier and must be mapped
to a block in order to fit into the LLVM memory model.

The previous implementation assigned each mapped memory segment in a binary to
its own LLVM memory allocation.  This had the nice property of implicitly
proving that no memory access was touching unmapped memory.  Unfortunately, it
was especially inefficient in the presence of symbolic reads and writes, as it
generated mux trees over all possible allocations and significantly slowed
symbolic execution.

The new memory model implementation (in Data.Macaw.Symbolic.Memory) instead uses
a single allocation for all static allocations.  This pushes more of the logic
for resolving reads and writes into the SMT solver and the theory of arrays.  In
cases where sufficient constraints exist in path conditions, this means that we
can support symbolic reads and writes.  Additionally, since we have only a
single SMT array backing all allocations, mapping bitvectors to LLVM pointers in
the memory model is trivial: we just change their block identifier from zero
(denoting a bitvector) to the block identifier of the unique allocation backing
static data.

This change has to do some extra work to ensure safety (especially that unmapped
memory is never written to or read from).  This is handled with the
MkGlobalPointerValidityPred interface in Data.Macaw.Symbolic.  This function,
which is passed to the macaw-symbolic initialization, constructs well-formedness
predicates for all pointers used to access memory.  Symbolic execution tasks
that do not need to enforce this property can simply provide a function that
never returns any predicates to check.  Implementations that want a reasonable
default can use the mkGlobalPointerValidityPred from Data.Macaw.Symbolic.Memory.
The default implementation ensures that no reads or writes touch unmapped memory
and that writes to static data never write to read-only segments.

This change also converts the examples in macaw-symbolic haddocks to use doctest
to ensure that they do not get out of date.  These are checked as part of CI.
2020-02-11 09:58:53 -08:00
Andrew Kent
587aa7ea6b
Update crux/crucible code to use float mode reprs; bump submodules 2019-11-05 15:23:51 -08:00
Joe Hendrix
df95e65987
Various changes to support VCG.
The changes include:

  Clean up elf loading to fix a bug in rel addend parsing.

  Introduce block preconditions for populating reopt-vcg fields.

  Change load options to match reopt's interface.
2019-09-04 23:21:23 -07:00
Tristan Ravitch
0c3ea57a62 Update the macaw-x86-symbolic tests 2019-08-09 10:56:50 -07:00
Kevin Quick
40eff5802c
[x86_symbolic] updates for crucible nonce change from (ST h) to IO
Changes for compatibility with Crucible pull request
285 (https://github.com/GaloisInc/crucible/pull/285) and the
corresponding changes in macaw symbolic.
2019-07-19 13:15:44 -07:00
Joe Hendrix
4267dca987
Get x86_symbolic test cases in runable state. 2019-03-25 20:29:35 -07:00
Brian Huffman
a3d7376179 Adapt to changed crucible-llvm exports. 2018-08-27 16:16:32 -07:00
Joe Hendrix
dc4a4f0f5f
Merge remote-tracking branch 'public/stable' into jhx-x86-improvements 2018-07-20 20:32:09 -07:00
Joe Hendrix
2184fab0bc
Update macaw-symbol tests so they at least compile. 2018-07-20 10:07:49 -07:00
Rob Dockins
f74d999896 Bump crucible submodule again 2018-05-17 14:06:24 -07:00
Joe Hendrix
007405db1d
Improve robustness of elf loader, and start trying to parse relocations in objects. 2018-03-29 15:21:31 -07:00
Iavor Diatchki
ef1d277c12 Make test build (but not yet function) 2018-01-30 15:55:22 -08:00
Iavor Diatchki
737d4fc0c5 Fix test (still does not work, but for other reasons) 2018-01-30 15:50:34 -08:00
Joe Hendrix
d1bdff9866
Additional code for macaw-symbolic. 2018-01-22 16:58:33 -08:00
Joe Hendrix
8b97faa731
More progress on Macaw symbolic; compile fixes for Macaw changes. 2018-01-22 15:28:20 -08:00
Joe Hendrix
b7e06e64ee
Progress on macaw-symbolic and macaw-x86-symbolic. 2018-01-16 15:06:31 -08:00