Commit Graph

1970 Commits

Author SHA1 Message Date
Brett Boston
a5796fc955
Reverse syscall override return register ordering (#284)
When a user overrides a system call on an architecture that supports returning two values from a system call and they provide a context containing the result of the system call in the form

```
empty :> v0 :> v1
```

macaw will perform the register assignment

```
r0 := v1
r1 := v0
```

This change reverses this behavior so that the assignment becomes

```
r0 := v0
r1 := v1
```

This brings the expected ordering of the result context in agreement
with the left-to-right ordering of the argument context:

```
empty :> arg1 :> arg2 :> ...
```
2022-05-04 12:41:02 -07:00
Daniel Matichuk
bbc0b6a40a
Bump ASL revision to handle slicing issues with UBFX instruction on ARM (#280)
* bump asl-translator

* add ubfx tests

* add CI version to workflow
2022-04-20 10:08:36 -07:00
Daniel Matichuk
5b4d16dd8a
generalize treatment for terminal statements in CFG slicing code (#278)
* remove/generalize MacawBlockEnd from CFG slicing

* expose functions in symbolic backend

* hide bvLit from Backend import

* add CI version to workflow
2022-04-20 10:08:15 -07:00
Ryan Scott
7300149315 CI: Use CACHE_VERSION 2022-04-19 12:47:24 -04:00
Ryan Scott
c86e13d35c Adapt to dynSymEntry being definition-aware in elf-edit
This bump the `elf-edit` submodule to bring in the changes from
https://github.com/GaloisInc/elf-edit/pull/29, which adds an additional
`VersionDefMap` argument to `elf-edit` to make it aware of version definitions.
This requires some changes to the API in `Data.Macaw.Memory.ElfLoader` to
accommodate.
2022-04-19 12:47:24 -04:00
Ryan Scott
ef0ece6a72 Make newMergedGlobalMemoryWith work over any Foldable collection
This would come in handy for an application where I wish to pass a
`NonEmptyVector` to `newMergedGlobalMemoryWith`. Currently, I have to convert
the `NonEmptyVector` to a `NonEmpty` list to accomplish this, wish seems
wasteful given that `newMergedGlobalMemoryWith` only needs to use the
`Foldable` interface.
2022-04-04 15:03:56 -04:00
Brett Boston
93ca88c2fc
Export Data.Macaw.Discovery.Classifier.classifierEndBlock (#275) 2022-04-04 11:57:19 -07:00
Tristan Ravitch
11c5a04b3d Fix a bug in handling return detection in thumb mode
The return address gets masked and has the low-bit set in an obtuse way due to
the semantics. This threw off the call detection.

This change matches against the quirky pattern.
2022-03-30 20:19:06 -07:00
simonjwinwood
4e8efbf09b
Merge pull request #272 from GaloisInc/wip/zero-sized-reloc-workaround
Ignore relocations with zero size when constructing Memory
2022-03-24 07:42:15 +11:00
Simon Winwood
7ca8e4c237
Ignore relocations with zero size when constructing Memory 2022-03-23 01:36:07 -07:00
Brett Boston
3f895bed1b
Add RV32GC support to macaw-riscv (#269)
This change adds support for RV32GC RISCV binaries.  Specifically, it:

* Updates the return matcher to recognize returns in 32-bit binaries
* Updates detection of unsupported binaries to allow RV32GC binaries
* Adds RV32GC versions of the RV64GC tests
2022-03-21 14:08:50 -07:00
Tristan Ravitch
dd9448db3c Export an extra function from the discovery API
This is necessary to implement custom logic using the incremental computation
monad, as the even lower-level infrastructure is not exposed at all.
2022-03-21 09:20:35 -07:00
Brett Boston
adf3d67614
Add ability to load multiple memories into a flat address space (#268)
This change adds a function `newMergedGlobalMemoryWith`, which acts like
`newGlobalMemoryWith` but takes a list of macaw memories and merges them
into a flat address space.  This aids in reasoning dynamically linked
programs.
2022-03-17 20:34:30 -07:00
Tristan Ravitch
45f8af1e5a
[symbolic] Optimize how initial binary memories are populated (#267)
Before, the API provided by macaw-symbolic asserted the initial value of each byte of memory individually. This was fairly expensive for large binaries, as each such assertion flushed the solver pipe.

This change generates a large conjunction of assertions and sends them all at once. In unscientific testing, this saved half an hour on a large binary.

API Changes:

- Note that it introduces a minor API change. The optimization required that the `sym` parameter be concretely an `ExprBuilder`.
2022-03-10 16:43:00 -08:00
Brett Boston
9d0d9d762c
Update softfloat-hs repo to eliminate manual build step (#265)
* Update softfloat-hs repo to eliminate manual build step

* Fix small typo in README (softfloat -> softfloat-hs)
2022-03-04 15:49:36 -08:00
Brett Boston
82640e7a4b
Add RISC-V backend (#259)
* riscv: added grift as submodule

* added macaw-riscv project

* make arch polykinded everywhere in macaw base

* stubbed out riscv_info

* update grift

* started on RISCVReg

* started on RISCVReg

* RegisterInfo instance for RISCVReg (a few unimplemented fields)

* filled out archRegSet

* filled out withArchConstraints, archAddrWidth, and archEndianness

* added Arch module

* RISCV initialBlockRegs

* preliminary work on disassembleFn

* wip: disassembleFn

* made things more lens-y

* wip: disassemble instruction

* finished disassembly of grift assignment statements

* separated out DisInstM into separate module

* disassembly wip

* finished disassembleBlock

* Finished riscvDisassembleFn

* bump grift submodule

* made macaw discovery poly-kinded

* added risc-v test suite

* added risc-v test suite

* fixed macaw semantics to hardware x0 to constant value 0

* added riscvPreserveReg based on assembler's manual

* riscvDemandContext

* successfully disassembled a block!

* enhanced tests to allow optional entry point spec

* stubbed out identifyCall

* stubbed out identifyReturn

* passing initial test

* added checkForReturnAddr stub

* fleshed out identifyCall and identifyReturn

* update grift submodule

* bug fix and exception handling

* added EXC register, which tracks whether or not we've attempted to
read from/write to any system registers.

* Replaced custom CSR type with GRIFT's (but we're not using it
currently)

* added better show instance for GPRs we should migrate this to a
GRIFT pretty printer at some point)

* Fixed a vicious bug in the semantics; unsigned and signed LT were
getting swapped in translation

* added pattern synonyms for GPRs

* improved docs and fixed RISCVReg bug (GP was 3 instead of 4)

* changed undefineds to errors

* changed RISCV class to RISCVConstraints

* wrapped GRIFT's "RV" parameter in a type to remove the need to make
macaw architecture parameter polykinded

* rolled back all changes to macaw base that made things poly-kinded

* reverted two more macaw core changes, updated license, removed old PPC test

* macaw: update to upstream changes in bv-sized and grift

* address code review comments

* macaw-riscv: expose fewer modules

* Update RISCVTermStmt definition

* Update riscv_info.  macaw-riscv now builds against master

* Update bv-sized and cabal freeze files

* Update cabal freeze files with satisfying lens version

* Get tests building

* Fix printf runtime error

* Add simpler tests

* Change RISCV target version and update grift pointer

[skip ci]

* Compressed branch test passes

[skip ci]

* Add additional small tests

[skip ci]

* Introduce a syscall PrimFn

* Syscalls now correctly classified

* Fix return regs from syscall

* Extract syscall arguments

* Update expected riscv test results

* Add macaw-riscv build + test to CI

* Get building with GHC 9.0.2

* Revert "Update cabal freeze files with satisfying lens version"

This reverts commit 4aa95c19c3.

* Install softfloat in CI

* Update Grift

* Some initial cleanup

* More cleanup

* Resolve FIXME on getReg

* Detect and only accept rv64gc rvreprs

* Address Tristan's PR comments

* Update Grift pointer

* Add info on installing Softfloat to README for macaw-riscv

* Add missing submodule step to softfloat build instructions

Co-authored-by: Ben Selfridge <benselfridge@000279.local>
Co-authored-by: Valentin Robert <val@galois.com>
2022-03-04 12:44:46 -08:00
Ryan Scott
a43151963d Note personality-related tweaks in the changelog 2022-02-21 13:27:20 -05:00
Ryan Scott
5c801c5803 Universally quantify personality type in Lookup{Function,Syscall}Handle
This is needed for situations where you want to be able to inspect the
personality type in function/syscall lookups.
2022-02-21 13:27:20 -05:00
Kevin Quick
78b45a10c6 Only fix personality to (MS.MacawSimulatorState sym) where required. 2022-02-21 13:27:20 -05:00
Brett Boston
719aa8fb01
Translate PLTStubs as tail calls (#263)
* Translate PLTStubs as tail calls

This change modifies `addMacawParsedTermStmt` to translate `PLTStub`s as
tail calls.

* Replace CR.Call + CR.Return with CR.TailCall
2022-02-18 15:02:59 -08:00
Brett Boston
ad51ae3c54
Add support for standalone PIEs (#262)
* Add support for standalone PIEs

This changeset adds support for standalone position independent
executables (PIEs) that do not make use of procedure linkage tables.  It
does so by adding relative address support to `populateSegmentChunk` and
adding an additional simplification rule for Aarch32.

This covers part of the work for #234.

* Remove NoOp + replace mult with left shift
2022-02-03 13:51:11 -08:00
robdockins
d1d71fd973
Merge pull request #257 from GaloisInc/separate-backend
Update with changes flowing from GaloicInc/crucible#945.
2022-01-25 08:54:44 -08:00
Rob Dockins
c572e372db Update changelog 2022-01-24 16:28:32 -08:00
Rob Dockins
465a84ee49 Update with changes flowing from GaloicInc/crucible#945.
This mostly deals with the splitting of the old `sym` type into
two: one for dealing with expression creation, and a new simulator
backend type for dealing with control-flow and assertions.
2022-01-24 16:24:07 -08:00
Rob Dockins
2845d6fd74 Update what4 and crucible submodules 2022-01-24 16:24:07 -08:00
Tristan Ravitch
63a65c3d85 x86: Fix failing proof obligations due to EvenParity
See the writeup in Crucible.hs in this commit for details. In short, the recent
changes to generalize `PtrAdd` triggered a failing proof obligation due to a use
of `llvmPointer_bv`.  The new implementation is as sound as the previous one,
but more general.

Fixes #260
2022-01-21 15:33:10 -08:00
Ryan Scott
729c65054b Replace deprecated prettyprinter import with modern equivalent
Also bump the lower version bounds on `prettyprinter`.
2022-01-10 16:40:23 -05:00
Ryan Scott
ce10bc9243 Drop support for GHC 8.6
This allows us to remove gobs of CPP as a consequence.
2022-01-10 16:40:23 -05:00
Ryan Scott
c3b8bac579 CI: Test GHC 9.0.2 2022-01-10 16:40:23 -05:00
Ryan Scott
ce3e9cba13 Add notes on freeze file generation to README 2022-01-10 16:40:23 -05:00
Ryan Scott
dfeef5f332 Regenerate freeze files 2022-01-10 16:40:23 -05:00
Ryan Scott
7819377d84 macaw-x86-symbolic: Remove unused FloatInfoFromSSEType definition
GHC 9.0 uncovered this type family as being unused. (See
https://gitlab.haskell.org/ghc/ghc/-/issues/18470, which made
`-Wunused-top-binds` more clever about detecting unused, closed
type families like `FloatInfoFromSSEType`.) Let's remove it to avoid
an `-Wunused-top-binds` warning.
2022-01-10 16:40:23 -05:00
Ryan Scott
2f2ba118bd macaw-semmc: Avoid arithmetic overflow from negative shift amounts
While pre-9.0 versions of GHC would silently turn negative right shifts into
left shifts, GHC 9.0 will throw an `arithmetic overflow` exception instead.
This patch makes this behavior explicit in `macaw-semmc` to allow the code to
work on GHC 9.0.

Fixes #212.
2022-01-10 16:40:23 -05:00
Ryan Scott
049096c506 Support building with GHC 9.0
This contains a variety of fixes needed to make the packages in the `macaw`
repo compile with GHC 9.0:

* GHC 9.0 implements simplified subsumption (see
  [here](https://gitlab.haskell.org/ghc/ghc/-/wikis/migration/9.0?version_id=5fcd0a50e0872efb3c38a32db140506da8310d87#simplified-subsumption)).
  In most cases, adapting to this is a matter of manually eta expanding
  definitions, such as in `base:Data.Macaw.Analysis.RegisterUse`. In the case
  of `macaw-x86-symbolic:Data.Macaw.X86.Crucible`, the type signature of
  `evalExt` had to be made more specific to adapt to the loss of contravariance
  when typechecking `(->)`.
* GHC's constraint solver now solves constraints in each top-level group
  sooner (see
  [here](https://gitlab.haskell.org/ghc/ghc/-/wikis/migration/9.0?version_id=5fcd0a50e0872efb3c38a32db140506da8310d87#the-order-of-th-splices-is-more-important)).
  This affects `macaw-aarch32` and `macaw-symbolic`, as they separate top-level
  groups with `$(return [])` Template Haskell splices. The previous locations
  of these splices made it so that the TH-generated instances in that package
  were not available to any code before the splice, resulting in type errors
  when compiled with GHC 9.0.

  To overcome this, I rearranged the TH-generated instances so that they appear
  before the top-level groups that make use of them.
* GHC 9.0 now enables `-Wstar-is-type` in `-Wall`, so this patch replaces some
  uses of `*` with `Data.Kind.Type`. `Data.Kind` requires the use of GHC 8.0 or
  later, so this patch also updates thes lower bounds on `base` to `>= 4.9` in
  the appropriate `.cabal` files. (I'm fairly certain that this requirement was
  already present implicity, but better to be explicit about it.)
* The `asl-translator`, `crucible`, and `semmc` submodules were updated to
  allow them to build with GHC 9.0. The `llvm-pretty` and
  `llvm-pretty-bc-parser` submodules were also bumped to accommodate unrelated
  changes in `crucible` that were brought in.
* The upper version bounds on `doctest` in `macaw-symbolic`'s test suite were
  raised to allow it to build with GHC 9.0.
2022-01-10 16:40:23 -05:00
Ryan Scott
f9b300a3d1 macaw-symbolic: Add fromCrucibleEndian function
This is like `toCrucibleEndian`, but in the opposite direction.
2022-01-04 16:27:57 -06:00
Ryan Scott
45f991ccdf macaw-symbolic: Consolidate duplicate definitions of toCrucibleEndian
There were two identical definitions of `toCrucibleEndian`, one in
`D.M.S.Memory` and another in `D.M.S.Testing`. This commit removes the
latter in favor of the former, which is actually exported.
2022-01-04 16:27:57 -06:00
Ryan Scott
c900dc4de8 macaw-symbolic: Replace mkName with safeSymbol
Fixes #251.
2022-01-04 16:27:24 -06:00
Tristan Ravitch
9c426986ff Revise handling of some pointer operations
Have all additions (at any bit width) go through the special PtrAdd
handler (rather than BVAdd). Also add special handlers for truncation and
extension.

These changes support architectures that do pointer operations at non-pointer
widths (e.g., to detect overflow).  These new operations apply the named
operations over just the offset of pointers, preserving the block id.
2021-12-23 13:54:29 -08:00
Tristan Ravitch
21366abc23
symbolic: Make relocation handling configurable in the symbolic backend (#250)
The `Data.Macaw.Symbolic.Memory` module provides a default memory model and
initial memory setup that is suitable for many symbolic execution
workloads. However, the defaults cannot handle dynamically-linked programs, as
it calls `error` when it attempts to determine an initial value for relocations
it finds in memory.

There are no good defaults for this, as what those values should be depend a lot
on what the verifier wants to prove.

This commit adds some hooks to configure this behavior in the verifier, and is
designed to be extensible and enable other configuration choices where
reasonable.

The original API is unchanged, as it calls the added `newGlobalMemoryWith`
function with a default set of hooks.  Callers with special memory handling
needs are directed to use that function.
2021-12-16 15:22:38 -08:00
Tristan Ravitch
8047a5d0ed
arm: Add transfer functions for division (#249)
Add support for abstract interpretation of division operations in the ARM
backend (when the operands to division are concretely known). 

This commit also adds extended documentation on the semantics of these
operations.

The concrete evaluation eliminates constant division operations.  The abstract
cases are probably obsolete in light of that, but are still interesting...

These changes were motivated by insufficient simplification around some of the syscall/errno handling code in musl
2021-12-07 14:11:59 -08:00
Tristan Ravitch
3e918f8b51
Revise handling of syscalls in AArch32 to match X86 (#246)
The old formulation (with system calls as block terminators) proved to be
impossible to implement properly. Handlers for syntax overrides have very
limited types (`IO`, rather than `OverrideSim`), which made symbolic branching
and reusing overrides impossible.

This change replaces the system call block terminator with an arch-specific
function that is translated into a function handle lookup (which is then
dispatched to with a call).

Unfortunately, this refactoring required combining the AArch32 simplification
module with the architecture extension definitions, due to the new translation
relying on the simplifier instance.
2021-11-24 11:59:56 -08:00
Ryan Scott
d3a53a6769 Update crucible, semmc submodules; adapt to GaloisInc/crucible#906
This updates the `crucible` submodule to include GaloisInc/crucible#906
(`Control granularity of reading uninitialized memory`), as well as the
`semmc` submodule to bring in corresponding changes on its side
(GaloisInc/semmc#69). Some additional `?memOpts :: MemOptions` constraints
needed to be added to some functions in `macaw-symbolic` and
`macaw-refinement` as a result.
2021-11-22 18:27:46 -05:00
Ryan Scott
9efbbf1753 Update llvm-pretty-bc-parser submodule
This updates the `llvm-pretty-bc-parser` to include these PRs:

* GaloisInc/llvm-pretty-bc-parser#159 (`Support parsing fneg instructions`)
* GaloisInc/llvm-pretty-bc-parser#166 (`Support parsing freeze instructions`)
* GaloisInc/llvm-pretty-bc-parser#164 (`Parse DebugInfoEnumerator properly on LLVM 12+`)
* GaloisInc/llvm-pretty-bc-parser#162 (`Support parsing dict{Associated,Allocated,Rank} fields introduced in LLVM 12`)

The `llvm-pretty` submodule had corresponding changes as well. These were
included as part of a previous commit to update the `llvm-pretty` submodule
(952fe5578d), but the `llvm-pretty-bc-parser`
submodule was not updated at the same time. This commit brings the two back
into harmony.
2021-11-22 18:27:46 -05:00
Tristan Ravitch
9ce3d43188
AArch32: Support conditional returns (#243)
Adds support in macaw-aarch32 for conditional returns. These are not supported in core macaw, and are thus architecture-specific block terminators.

This required changes to the type of arch-specific block terminators. Before, `ArchTermStmt` was only parameterized by a state thread (`ids`).  This meant that they could not contain macaw (or crucible) values.  Some work on. AArch32 requires being able to store condition values in arch terminators (to support conditional returns). This change modifies the `ArchTermStmt` to enable this, which requires a bit of plumbing through various definitions and some extra instances.

In support of actually using this, it also became necessary to plumb fallthrough block labels through the architecture-specific terminator translation in macaw-symbolic.

Note that this change was overdue, as the PowerPC backend was storing macaw values in a way that would have rendered them unusable in the macaw-ppc-symbolic translation, had any interpretation been provided.  These new changes will enable a handler to be written for the conditional PowerPC trap instructions.

PowerPC, x86, and ARM have been updated.

Improves the macaw-aarch32 tests. There is now a command line option to save the generated macaw IR for each
discovered function to /tmp. Note that this reuses some infrastructure from the macaw-symbolic tests. This
shared functionality should be extracted into a macaw-testing library.
2021-11-19 16:20:50 -08:00
Tristan Ravitch
952fe5578d Submodule updates 2021-11-18 21:40:09 -08:00
Tristan Ravitch
c6eae87724 Update the aarch32 disassembler
This fixes some critical bugs in thumb2 disassembly.  This change adds some
macaw-level tests to ensure that thumb2 is working properly.
2021-11-05 20:01:11 -07:00
Tristan Ravitch
2c85dce18e Expose block classification in the ArchitectureInfo
This change makes the block classifier heuristic part of the `ArchitectureInfo`
structure.  This enables clients and architecture backends to customize the
block classification heuristics.  This is most useful for architectures that
have complex architecture-specific block terminators that require analysis to
generate (e.g., conditional returns).  It will also make macaw-refinement
simpler in the future, as the SMT-based refinement is just an additional block
classifier (but is currently implemented via a hacky side channel).

This change introduces an ancillary change, which should not be very
user-visible.

It splits the Macaw.Discovery and Macaw.Discovery.State modules to break
module import cycles in a way that enables us to expose the classifier.  This
should not be user-visible, as Macaw.Discovery still exports the same
names (with one minor exception that should not appear in user code).

It also moves the definition of the `ArchBlockPrecond` type family; the few
affected places should be updated. User code should probably not be able to see
this.
2021-11-05 18:25:03 -07:00
Ryan Scott
2152108e81
Make LookupFunctionHandle a newtype (#238)
`LookupSyscallHandle` is already a newtype. `LookupFunctionHandle` deserves to
share the love.
2021-10-13 12:10:36 -04:00
Ryan Scott
5547632f65 macaw-x86: Handle sign-extended immediates in def_push
See `Note [Sign-extending immediate operands in push]` in
`Data.Macaw.X86.Semantics` for the full story. I have also added a test case
in `macaw-x86-symbolic` which ensures that the stack-pointer-decrementing
logic behaves as one would expect.

Bumps in the `flexdis86` submodule to bring in GaloisInc/flexdis86#37.

Fixes #235.
2021-10-12 16:37:21 -04:00
Ryan Scott
4fc8a46ed6 Bump semmc submodule to bring in GaloisInc/semmc#67 2021-10-12 16:37:21 -04:00