Commit Graph

249 Commits

Author SHA1 Message Date
Ryan Scott
f9b300a3d1 macaw-symbolic: Add fromCrucibleEndian function
This is like `toCrucibleEndian`, but in the opposite direction.
2022-01-04 16:27:57 -06:00
Ryan Scott
45f991ccdf macaw-symbolic: Consolidate duplicate definitions of toCrucibleEndian
There were two identical definitions of `toCrucibleEndian`, one in
`D.M.S.Memory` and another in `D.M.S.Testing`. This commit removes the
latter in favor of the former, which is actually exported.
2022-01-04 16:27:57 -06:00
Ryan Scott
c900dc4de8 macaw-symbolic: Replace mkName with safeSymbol
Fixes #251.
2022-01-04 16:27:24 -06:00
Tristan Ravitch
9c426986ff Revise handling of some pointer operations
Have all additions (at any bit width) go through the special PtrAdd
handler (rather than BVAdd). Also add special handlers for truncation and
extension.

These changes support architectures that do pointer operations at non-pointer
widths (e.g., to detect overflow).  These new operations apply the named
operations over just the offset of pointers, preserving the block id.
2021-12-23 13:54:29 -08:00
Tristan Ravitch
21366abc23
symbolic: Make relocation handling configurable in the symbolic backend (#250)
The `Data.Macaw.Symbolic.Memory` module provides a default memory model and
initial memory setup that is suitable for many symbolic execution
workloads. However, the defaults cannot handle dynamically-linked programs, as
it calls `error` when it attempts to determine an initial value for relocations
it finds in memory.

There are no good defaults for this, as what those values should be depend a lot
on what the verifier wants to prove.

This commit adds some hooks to configure this behavior in the verifier, and is
designed to be extensible and enable other configuration choices where
reasonable.

The original API is unchanged, as it calls the added `newGlobalMemoryWith`
function with a default set of hooks.  Callers with special memory handling
needs are directed to use that function.
2021-12-16 15:22:38 -08:00
Ryan Scott
d3a53a6769 Update crucible, semmc submodules; adapt to GaloisInc/crucible#906
This updates the `crucible` submodule to include GaloisInc/crucible#906
(`Control granularity of reading uninitialized memory`), as well as the
`semmc` submodule to bring in corresponding changes on its side
(GaloisInc/semmc#69). Some additional `?memOpts :: MemOptions` constraints
needed to be added to some functions in `macaw-symbolic` and
`macaw-refinement` as a result.
2021-11-22 18:27:46 -05:00
Tristan Ravitch
9ce3d43188
AArch32: Support conditional returns (#243)
Adds support in macaw-aarch32 for conditional returns. These are not supported in core macaw, and are thus architecture-specific block terminators.

This required changes to the type of arch-specific block terminators. Before, `ArchTermStmt` was only parameterized by a state thread (`ids`).  This meant that they could not contain macaw (or crucible) values.  Some work on. AArch32 requires being able to store condition values in arch terminators (to support conditional returns). This change modifies the `ArchTermStmt` to enable this, which requires a bit of plumbing through various definitions and some extra instances.

In support of actually using this, it also became necessary to plumb fallthrough block labels through the architecture-specific terminator translation in macaw-symbolic.

Note that this change was overdue, as the PowerPC backend was storing macaw values in a way that would have rendered them unusable in the macaw-ppc-symbolic translation, had any interpretation been provided.  These new changes will enable a handler to be written for the conditional PowerPC trap instructions.

PowerPC, x86, and ARM have been updated.

Improves the macaw-aarch32 tests. There is now a command line option to save the generated macaw IR for each
discovered function to /tmp. Note that this reuses some infrastructure from the macaw-symbolic tests. This
shared functionality should be extracted into a macaw-testing library.
2021-11-19 16:20:50 -08:00
Tristan Ravitch
952fe5578d Submodule updates 2021-11-18 21:40:09 -08:00
Ryan Scott
2152108e81
Make LookupFunctionHandle a newtype (#238)
`LookupSyscallHandle` is already a newtype. `LookupFunctionHandle` deserves to
share the love.
2021-10-13 12:10:36 -04:00
Ryan Scott
fbd7bce217
Add Show{,F} instances for MacawCrucibleValue (#232)
I found these useful while printf debugging.
2021-09-28 08:52:05 -04:00
Brett Boston
a336895da7
Add optional override for MacawArchStmtExtensions to genArchVals (#230)
This change adds an optional argument to `genArchVals` that allows client code to override the backend translation behavior of `MacawArchStmtExtension`s on a statement-by-statement basis.  The new argument has type `Maybe (MacawArchStmtExtensionOverride arch)`, where `MacawArchStmtExtensionOverride` is a function that takes a statement and a crucible state, and returns an optional tuple containing the value produced by the statement, as well as an updated state.  Returning 'Nothing' indicates that the backend should use its default handler for the statement.

Client code that wishes to maintain the existing default behavior in all cases can simply pass `Nothing` for the new argument to `genArchVals`.
2021-09-14 18:24:47 -07:00
Tristan Ravitch
380d732d0e
Implement system call support for x86 (#226)
Implement support for symbolically executing system calls in macaw-symbolic. **To update code that does not need to symbolically execute system calls (i.e., most clients of macaw-symbolic), just pass the new `unsupportedSyscalls` default handler as the fifth argument of `macawExtensions`.**

The primary interface is via the new `LookupSyscallHandle` callback passed to `macawExtensions`. This callback inspects the environment and returns a Crucible `FunctionHandle` that models the behavior of the requested system call. Note that this mechanism only supports concrete system calls (i.e., system calls where the system call number is concrete). The x86 backend has been updated to support this new functionality.

The representation of system calls in macaw is still architecture-specific (because there are interesting differences between system call instructions across architectures). The idea is that system calls are now treated in two steps:
1. A macaw-symbolic extension statement that looks up the override to invoke for the given syscall (returned as a Crucible FunctionHandle)
2. A call to that handle

We need this two step approach because the handlers that interpret syntax extension statements cannot symbolically branch (and thus cannot call overrides). The extension interpreter just looks up the necessary handle and uses the standard call/override machinery to handle any branching required to support the system call model functionality.

The major complication to this approach is that system calls need to update values in registers when they return. To capture these updates, the architecture-specific syntax extension needs to explicitly update any machine registers that could possibly be affected. The explicit updates are necessary because machine registers do not exist anymore at the macaw-symbolic level (at least within a block). To handle all of these constraints:
1. System calls are represented as extension functions at the macaw level when lifted from machine code.
2. During translation into crucible (via macaw-symbolic), the extension functions are translated into two statements: a function handle lookup and then a function call (with the return values being explicitly threaded through the Crucible function).
3. During symbolic execution, the lookup statement examines the environment to return the necessary function handle, while the handle is called via the normal machinery.

Note that the feature is entirely controlled by the `LookupSyscallHandle` function, which determines the system call dispatch policy. No system call models are included with this change.

Co-authored-by: Brett Boston <boston@galois.com>
2021-08-27 15:47:40 -07:00
Ryan Scott
7c95f5d874
Adapt to crucible-symio and even more MemOptions (#225)
This bumps the `crucible` submodule to bring in:

* `crucible-symio` (GaloisInc/crucible#788). This requires adding a new
  project dependency in `cabal.project.dist`.
* GaloisInc/crucible#808, which adds yet another `?memOpts :: MemOptions`
  constraint, this time in `doPtrAddOffset`.
2021-08-26 08:31:28 -04:00
Ryan Scott
7f7de2a59b
Adapt to GaloisInc/crucible#794 (#224)
GaloisInc/crucible#794 increases the number of functions that use
implicit `MemOptions`, including a handful of key LLVM memory model–related
functions. As a result, many parts of `macaw` need to add implicit `?memOpts`
parameters to accommodate to this change.
2021-08-23 20:39:08 -04:00
Tristan Ravitch
df839de678 Update submodules
This commit adapts to recent changes in crucible and raises the version bounds
to admit newer versions of what4.
2021-07-15 21:22:30 -07:00
Sam Breese
135fb062bb
x86: Fix semantics for BSF and BSR instructions (#216)
* x86: Fix semantics for BSF and BSR instructions

* Add a test for BSR and BSF
2021-07-13 14:14:59 -04:00
Joe Hendrix
1cb86f771c Cleanup discovery; fix macaw-symbolic 2021-04-21 11:27:27 -07:00
Daniel Matichuk
1d9d5efe1d
correctly set block endings for all slice exits (#205) 2021-04-14 08:38:16 -07:00
Kevin Quick
b79e313cfc
Revert unintentional type parameter change in MacawEvalStmtFunc. 2021-04-05 12:07:49 -07:00
Kevin Quick
ff4af631fd
[symbolic] Rename EvalStmtFunc to MacawEvalStmtFunc for clarity relative to Crucible EvalStmtFunc. 2021-04-05 11:07:19 -07:00
Kevin Quick
72ddaaf1d8
[symbolic] Fix comment reference in Testing module. 2021-04-05 08:45:00 -07:00
Kevin Quick
3a63547a0f
Haddock spelling fix 2021-04-05 08:44:59 -07:00
Kevin Quick
0c14c7151b
[symbolic] update doctests for FnBindings newtype wrapper. 2021-04-03 23:16:39 -07:00
Kevin Quick
2df7ce1126
[symbolic] add memory variable name to Crucible mkMemVar call. 2021-04-03 16:12:55 -07:00
Kevin Quick
8fb68d439e
[symbolic] Update for What4 removal of BaseNatType 2021-04-03 16:12:04 -07:00
Joe Hendrix
cfc3c646f3 Add new vector operations to macaw-symbolic 2021-03-23 22:36:25 -07:00
Andrei Stefanescu
aa7bfc8c22
Remove isValidPtr checks in doPtrSub and doPtrEq. (#193)
The operations are correct even when the pointers are not valid LLVM
pointers.
2021-03-19 13:24:44 -07:00
Sam Breese
bd11e25695
symbolic: Some special-case handling for bitwise operations (#192)
* Handle XOR of pointer and bitvector

* Remove comment
2021-03-03 17:52:17 -05:00
Tristan Ravitch
dbb4c83f08
Add a testing framework for macaw-symbolic (#184)
The new test suites cover x86_64, PowerPC, and ARM. They test that the semantics are actually correct (rather than just seeing if symbolic execution produces any result). The `Data.Macaw.Symbolic.Testing` module in macaw-symbolic provides some common utilities for symbolic execution engine setup, while there are tailored test harnesses for each architecture.

The semantics of the test harnesses are documented in each architecture test suite, but they:
1. Discover all of the test binaries (which are generated from the included makefiles)
2. Treat each function whose name begins with `test_` as a test entry point
3. Symbolically executes each test case with fully symbolic register states
4. Extracts the return value after symbolic execution, which is treated as the predicate to an assertion that must be proved
    - If the test case is in the `pass` subdirectory, it is proved and expected to hold
    - If the test case is in the `fail` subdirectory, it is proved and expected to not hold.

Each test harness supports two options for debugging:
- Dumping generated SMT queries
- Dumping generated Macaw IR for inspection

This testing uncovered a bug in the (previously untested) macaw-aarch32-symbolic code. It required a number of submodule updates to:

- Adapt to some what4 changes
- Fix a bug in the LLVM memory model that lets these tests pass
- Adapt to changes to some crucible APIs

This change also modifies the CI configuration to install SMT solvers earlier (which are now needed for all of the symbolic package tests).
2021-03-01 09:21:44 -08:00
robdockins
a58f1e25dd
Update to follow changes in What4. Nat is no longer a base type (#190)
Nat is no longer a what4 base type, so we have to adapt various APIs to accommodate that. The template haskell in macaw-semmc is updated to remove Nat cases. Changes to the `SymFn` type required removing a type parameter.

This commit also adds macaw-refinement to CI (which requires installing SMT solvers); that code had to be updated due to the what4 changes.


Co-authored-by: Tristan Ravitch <tristan@galois.com>
2021-02-19 15:44:56 -08:00
Joe Hendrix
6d1e47623d Provide jumptable layout info 2021-01-27 15:27:53 -08:00
Daniel Matichuk
cd5dfe8c65
macaw-symbolic: record original block endings when making a CFG slice (#185)
* reify block exit in macaw extension

* add return address to block end classification

* use global variable to retain block endings

* update to mkBlockSliceCFG signature

* add haddocks for MacawBlockEnd

Co-authored-by: Tristan Ravitch <tristan@galois.com>
2021-01-11 16:56:54 -08:00
Brian Huffman
2a620d41de Switch from ansi-wl-pprint to the prettyprinter package.
This patch relies on the following submodule updates:
- GaloisInc/what4#77
- GaloisInc/elf-edit#20
- GaloisInc/crucible#586
- GaloisInc/asl-translator#28

This patch updates the following packages:
- macaw-base
- macaw-symbolic
- macaw-x86
- macaw-x86-symbolic
- macaw-aarch32
- macaw-ppc
- macaw-semmc
- macaw-refinement
2020-12-02 11:38:19 -08:00
Joe Hendrix
402f99071f Fix warnings and ensure travis uses werror on some packages. 2020-11-06 13:58:39 -08:00
Joe Hendrix
c356694627 Update to work with latest elf-edit. 2020-11-06 13:48:26 -08:00
Daniel Matichuk
adeeac988a
Merge branch 'master' into memgen 2020-10-28 10:47:21 -07:00
Kevin Quick
923089d767
Extract single Assignment element without redundant pattern case 2020-10-20 21:09:15 -07:00
Sam Breese
3a071e317b
symbolic: Fix PackBits and UnpackBits (#166) 2020-10-12 14:20:05 -04:00
Sam Breese
00d8530255
symbolic: Add special cases for null pointers in doPtrEq (#165) 2020-10-08 14:38:46 -04:00
Daniel Matichuk
11ceebb592 more haddock fixes 2020-10-06 16:52:21 -07:00
Daniel Matichuk
aab0010d0a more haddock fixes 2020-10-06 14:57:43 -07:00
Daniel Matichuk
a595bdc6a4 fix haddocks 2020-10-06 11:20:59 -07:00
Daniel Matichuk
3765c5eaab remove 'constraints' dependency 2020-10-06 11:20:40 -07:00
Daniel Matichuk
af7758032c
Merge branch 'master' into memgen 2020-10-05 12:23:36 -07:00
Rob Dockins
99f8cb0bdf Update to use new HasLLVMAnn API, which requires an action for
recording (or discarding) annotations rather than a map.
2020-09-11 14:40:02 -07:00
Daniel Matichuk
70d35d44e4 parameterize ArchInfo and ArchVals over mem type 2020-09-02 10:21:39 -07:00
Daniel Matichuk
84f2fac999 cleanup unused language extensions 2020-09-01 17:55:11 -07:00
Daniel Matichuk
2da172b3ea generalize ArchInfo over memory model 2020-09-01 17:07:04 -07:00
Kevin Quick
1db2d1669c
[symbolic] Defer badBehaviorMap creation to caller instead. 2020-08-10 14:50:45 -07:00
Kevin Quick
ad46e191c6
Update for addition of badBehaviorMap implicit for LLVM annotations.
See https://github.com/GaloisInc/crucible/pull/453
2020-08-08 22:16:48 -07:00