The bug arose in the handling of `StackOffsetAbsVal`, which tracks an abstraction of references relative to the stack pointer. The offsets in `StackOffsetAbsVal` are `Int64`; they are signed because references can be both above and below the stack pointer. The code constructing new values of this type was incorrectly zero-extending new offsets instead of sign-extending them. This did not matter on 64-bit architectures, where both extensions happen to produce the same values, but it substantially corrupted the abstract stack on 32-bit PowerPC. It did not seem to affect AArch32, but that is likely just luck: the code the compiler generates there does not require this level of precision in the abstract stack.

The resulting errors manifest in the `absEvalCall` function. Without sign extension in `StackOffsetAbsVal`s, the current stack pointer looked like a huge number, which caused *all* stack entries to be dropped after function calls.

This fix simplifies the stack offset abstract value computation substantially and ensures that signs are extended correctly. The commit adds a PowerPC32 test case that only passes with this fix.
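The difference between the two extensions can be illustrated with plain shell arithmetic. This is only a sketch and is not taken from the Macaw sources: the function names `zero_extend32` and `sign_extend32` are hypothetical, and the example assumes a shell whose arithmetic is 64 bits wide (as in bash or dash on common 64-bit platforms).

```sh
# Widen a 32-bit value to 64 bits by keeping only the raw bit pattern.
# This mirrors the buggy behaviour: a negative offset becomes a huge positive one.
zero_extend32() {
  echo $(( $1 & 0xFFFFFFFF ))
}

# Widen a 32-bit value to 64 bits by propagating bit 31 into the upper bits.
# This mirrors the intended behaviour: a negative offset stays negative.
sign_extend32() {
  echo $(( (($1 & 0xFFFFFFFF) ^ 0x80000000) - 0x80000000 ))
}

# 0xFFFFFFF0 is the 32-bit two's-complement encoding of -16.
zero_extend32 0xFFFFFFF0   # prints 4294967280
sign_extend32 0xFFFFFFF0   # prints -16
```

A stack offset of -16 that is read back as 4294967280 is the kind of "huge number" the commit message describes.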
This is the main repository for the Macaw binary analysis framework. The framework is designed to support multiple architectures in an extensible way.
Overview
The main algorithm implemented so far is a code discovery procedure
that discovers reachable code in a binary given one or more
entry points, such as _start or the current symbols.
The Macaw libraries are:
- macaw-base -- The core architecture-independent operations and algorithms.
- macaw-symbolic -- Library that provides symbolic simulation of Macaw programs via Crucible.
- macaw-x86 -- Provides definitions enabling Macaw to be used on X86_64 programs.
- macaw-x86-symbolic -- Adds Macaw-symbolic extensions needed to support x86.
- macaw-semmc -- Contains the architecture-independent components of the translation from semmc semantics into macaw IR. This provides the shared infrastructure for all of our backends, including the Template Haskell function that creates a state transformer function from the learned semantics files provided by the semmc library.
- macaw-arm -- Enables macaw for ARM (32-bit) binaries by reading the semantics files generated by semmc and using Template Haskell to generate a function that transforms machine states according to the learned semantics.
- macaw-arm-symbolic -- Enables macaw/crucible symbolic simulation for ARM (32-bit) architectures.
- macaw-ppc -- Enables macaw for PPC (32-bit and 64-bit) binaries by reading the semantics files generated by semmc and using Template Haskell to generate a function that transforms machine states according to the learned semantics.
- macaw-ppc-symbolic -- Enables macaw/crucible symbolic simulation for PPC architectures.
- macaw-riscv -- Enables macaw for RISC-V (RV32GC and RV64GC variants) binaries.
- macaw-refinement -- Enables additional architecture-independent refinement of code discovery. This can enable discovery of more functionality than is revealed by the analysis in macaw-base.
The libraries that make up Macaw are released under the BSD license.
These Macaw core libraries depend on a number of different supporting libraries, including:
- elf-edit -- loading and parsing of ELF binary files
- galois-dwarf -- retrieval of Dwarf debugging information from binary files
- flexdis86 -- disassembly and semantics for x86 architectures
- dismantle -- disassembly for ARM and PPC architectures
- semmc -- semantics definitions for ARM and PPC architectures
- crucible -- symbolic execution and analysis
- what4 -- symbolic representation for the crucible backend
- parameterized-utils -- utilities for working with parameterized types
Building
Preparation
Dependencies for building Macaw that are not obtained from Hackage are supported via Git submodules:
$ git submodule update --init
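To confirm that the submodules were actually checked out, one option is to inspect the output of `git submodule status`, which prefixes every uninitialized submodule with `-`. The helper below is a hypothetical convenience, not part of the Macaw repository; it counts such entries in status output supplied on standard input.

```sh
# Count uninitialized submodules in `git submodule status` output read from stdin.
# Lines for uninitialized submodules start with '-'.
count_uninitialized() {
  # grep -c prints 0 and exits non-zero when nothing matches; ignore that status.
  grep -c '^-' || true
}
```

Running `git submodule status | count_uninitialized` from the top of the checkout should print 0 once the update has completed.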
Preparing Softfloat for RISC-V Backend
The RISC-V backend depends on softfloat-hs, which in turn depends on the
softfloat library. Macaw's build system will automatically build softfloat,
but the softfloat-hs repo must be recursively cloned to enable this. If you
are not building macaw-riscv, you can skip this step. To recursively clone
softfloat-hs, run:
$ cd deps/softfloat-hs
$ git submodule update --init --recursive
Building with Cabal
The Macaw libraries can be built individually with Cabal v1, but building them as a group is easier with Cabal v2:
$ ln -s cabal.project.dist cabal.project
$ cabal v2-configure
$ cabal v2-build all
To build a single library, either specify that library name instead of all, or change to that library's subdirectory before building:
$ cabal v2-build macaw-refinement
or
$ cd refinement
$ cabal v2-build
Building with Stack
To build with Stack, first create a top-level stack.yaml file by
symlinking to one of the provided stack-ghc-<version>.yaml files. For example:
$ ln -s stack-ghc-8.6.3.yaml stack.yaml
$ stack build
Status
This codebase is a work in progress. Support for PowerPC (both 32-bit and 64-bit) and X86_64 is reasonably robust. Support for ARM is ongoing.
Notes on Freeze Files
We use the cabal.project.freeze.ghc-* files to constrain dependency versions
in CI. We recommend using the following command for best results before building
locally:
$ ln -s cabal.project.freeze.ghc-<VER> cabal.project.freeze
These freeze files were generated using the .github/update-freeze script.
Note that at present, these configuration files assume a Unix-like operating
system, as we do not currently test Windows on CI. If you would like to use
these configuration files on Windows, you will need to make some manual changes
to remove certain packages and flags:
regex-posix
tasty +unix
unix
unix-compat
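The manual changes listed above could be sketched as a small filter. This is an assumption-laden example rather than a supported tool: the helper name `strip_unix_constraints` is hypothetical, and it assumes the freeze file follows the usual `cabal freeze` layout, with one constraint per line, package entries of the form `any.<package> ==<version>,`, and a flags line starting with `tasty`. The output should be checked by hand, in particular if a removed entry was the first or last constraint in the file or if `+unix` was the only tasty flag listed.

```sh
# Read a freeze file on stdin and write a copy without the Unix-only entries.
strip_unix_constraints() {
  # First expression: drop the version constraints for regex-posix, unix and unix-compat.
  # Second expression: remove the +unix flag from the tasty flags line.
  sed -E \
    -e '/^[[:space:]]*any\.(regex-posix|unix|unix-compat) /d' \
    -e '/^[[:space:]]*tasty /s/ \+unix//'
}
```

To use it, redirect one of the cabal.project.freeze.ghc-* files into the function and write the result to cabal.project.freeze instead of creating a symlink.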
License
This code is made available under the BSD3 license and without any support.