* Overview The high level goal is to write and/or generate architecture-specific backends for /macaw/ based on the semantics discovered by /semmc/. In particular, we are interested in making /macaw-ppc/ and /macaw-arm/. We will hand-write some of the code, but we will generate as much as possible automatically. We will read in the semantics files generated by /semmc/ and use Template Haskell to generate a function that transforms machine states according to the learned semantics. We will implement a base package (/macaw-semmc/) that provides shared infrastructure for all of our backends; this will include the Template Haskell function to create a state transformer function from learned semantics files. ** Repository Layout - ~macaw-semmc~ contains the architecture-independent components of the translation from semmc semantics into macaw IR. - ~macaw-ppc~ implements the PowerPC-specific backend of the translation. - ~macaw-ppc-symbolic~ implements a translation of macaw IR (with PowerPC architecture-specific functions) into Crucible IR, which is suitable for symbolic execution. - ~macaw-arm~ implements the ARM-specific backend of the translation. ** Building The dependencies of this project that are not available on Hackage are tracked via git submodules. To build with a reasonably modern version of ~cabal~ (i.e., one that supports ~new-build~): #+BEGIN_SRC $ git submodule update --init $ ln -s cabal.project.dist cabal.project $ cabal new-configure $ cabal new-build macaw-ppc #+END_SRC To build with ~stack~: #+BEGIN_SRC $ git submodule update --init # Choose one of the provided GHC version-specific # stack-ghc-.yaml files, e.g. stack-ghc-8.2.yaml. $ ln -s stack-ghc-.yaml stack.yaml $ stack build #+END_SRC ** Code dependencies and related packages - macaw (binary code discovery) - macaw-x86 (x86_64 backend for macaw) - semmc (semantics learning and code synthesis) - semmc-ppc (PowerPC backend for synthesis) - dismantle-tablegen (disassembler infrastructure) - dismantle-ppc (PowerPC disassembler) - crucible (interface to SMT solvers) - parameterized-utils (utilities for working with parameterized types) * Semantics background The /semmc/ library is designed to learn semantics for machine code instructions. Its output, for each Instruction Set Architecture (ISA), is a directory of files where each file contains a formula corresponding to the semantics for an opcode in the ISA. For example, the ~ADDI.sem~ file contains the semantics for the add immediate instruction in PowerPC. There are functions in /semmc/ for dealing with this representation. Formulas are loaded into a data type called ~ParameterizedFormula~, which contains formula fragments based on the ~ExprBuilder~ representation of /crucible/. This can be thought of as a convenient representation of SMT formulas. * Status This codebase is a work in progress. PowerPC support (both 32 and 64 bit) is reasonably robust. Support for ARM is ongoing. * License This code is made available under the BSD3 license and without any support.