The previous generator put all of the code for each matcher in a single large
case expression. While there were individual functions broken out for each case
body, they were all still in the same let expression, which created a huge term.
This refactoring lifts all of the semantics definition bodies to the top
level (with NOINLINE pragmas) to give the code generator less to chew on at a
time.
This improves compile times a little, but, more importantly, works around a bug
in the register allocator in GHC 8.4 that caused a crash in the PowerPC
semantics functions.
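A minimal hand-written sketch of the shape this refactoring aims for. The opcode type, the semADD/semSUB/semNEG names, and the Integer-based "semantics" are hypothetical stand-ins, not the generated code: each case body becomes its own NOINLINE top-level binding, and the dispatcher only selects one.

```haskell
-- Hypothetical miniature instruction set; not the real PowerPC opcode type.
data Opcode = ADD | SUB | NEG
  deriving (Show, Eq)

-- Each semantics body is a separate top-level definition. NOINLINE keeps
-- GHC from inlining the bodies back into the dispatcher and rebuilding
-- one huge term.
semADD :: [Integer] -> Maybe Integer
semADD [a, b] = Just (a + b)
semADD _      = Nothing
{-# NOINLINE semADD #-}

semSUB :: [Integer] -> Maybe Integer
semSUB [a, b] = Just (a - b)
semSUB _      = Nothing
{-# NOINLINE semSUB #-}

semNEG :: [Integer] -> Maybe Integer
semNEG [a] = Just (negate a)
semNEG _   = Nothing
{-# NOINLINE semNEG #-}

-- The matcher is now only a small dispatch table.
evalOpcode :: Opcode -> [Integer] -> Maybe Integer
evalOpcode opcode operands = case opcode of
  ADD -> semADD operands
  SUB -> semSUB operands
  NEG -> semNEG operands
```

For example, evalOpcode ADD [2, 3] evaluates to Just 5, and a wrong operand count yields Nothing.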
The semantics for these instructions cannot be represented in our semantics
language, as they have side effects (i.e., they trap). This currently means
that we have to implement their semantics by hand in macaw-ppc.
It is now (optionally) pure via the MonadThrow class. It also exposes a new
binary format repr, which currently only has constructors for ELF containers.
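A rough sketch of what "optionally pure via MonadThrow" can look like, assuming the exceptions package (Control.Monad.Catch). The BinaryFormat, LoadError, elfMagic, and detectFormat names are hypothetical and only illustrate the idea; they are not the interface added here.

```haskell
import Control.Monad.Catch (Exception, MonadThrow, throwM)
import Data.Word (Word8)

-- Hypothetical format tag; only an ELF constructor exists, as in the text.
data BinaryFormat = ELF
  deriving (Show, Eq)

-- Hypothetical error type for this sketch.
data LoadError
  = TruncatedHeader Int
  | UnknownFormat
  deriving (Show, Eq)

instance Exception LoadError

-- The four magic bytes at the start of every ELF file.
elfMagic :: [Word8]
elfMagic = [0x7f, 0x45, 0x4c, 0x46]

-- Works in any MonadThrow: pure in Maybe or Either SomeException,
-- exception-throwing in IO.
detectFormat :: MonadThrow m => [Word8] -> m BinaryFormat
detectFormat bytes
  | length bytes < 4        = throwM (TruncatedHeader (length bytes))
  | take 4 bytes == elfMagic = pure ELF
  | otherwise               = throwM UnknownFormat
```

Callers pick the monad: (detectFormat bs :: Maybe BinaryFormat) stays pure and turns failures into Nothing, while running the same function in IO raises LoadError as an exception.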
The generic binary loading interface is instantiated once for each
architecture/binary container pair. This isn't great, but there is enough
custom work in each setting to justify it.
The binary loading interface isn't finished yet, and needs to learn some
additional operations to support relocation. It already supports additional
information that is architecture specific and binary container format
specific (that operations will have to use on a per-format basis).
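One way to picture an interface instantiated per architecture/container pair is a two-parameter class with associated types for the architecture-specific and format-specific data. Everything in this sketch (BinaryLoader, ArchData, FormatData, LoadedBinary, the PPC64 and ELF tags, and the stand-in payload types) is a hypothetical illustration, not the actual interface.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Word (Word64, Word8)

-- Hypothetical type-level tags for one architecture and one container.
data PPC64
data ELF

-- Result of loading: one slot per kind of extra information.
data LoadedBinary arch binFmt = LoadedBinary
  { archData   :: ArchData arch binFmt    -- architecture-specific
  , formatData :: FormatData arch binFmt  -- container-format-specific
  }

-- One instance per architecture/container pair.
class BinaryLoader arch binFmt where
  type ArchData arch binFmt
  type FormatData arch binFmt
  loadBinary :: [Word8] -> Either String (LoadedBinary arch binFmt)

instance BinaryLoader PPC64 ELF where
  -- Stand-in for a TOC: (function address, TOC base) pairs.
  type ArchData PPC64 ELF = [(Word64, Word64)]
  -- Stand-in for parsed ELF data: just the image size in bytes.
  type FormatData PPC64 ELF = Int
  loadBinary bytes
    | take 4 bytes == [0x7f, 0x45, 0x4c, 0x46] =
        Right (LoadedBinary [] (length bytes))
    | otherwise = Left "not an ELF image"
```

The result type selects the instance, e.g. loadBinary bytes :: Either String (LoadedBinary PPC64 ELF). Operations that need format-specific data would then pattern match on formatData for their own pair.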
On the PowerPC side, the Table of Contents (TOC) is now architecture-specific
information constructed by the loader (currently from ELF binaries). The new
TOC data type is in place to support this more easily (the old format was just a
function).
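To illustrate why a data type is easier to work with than a bare function, here is a small sketch. The TOC representation below (a map from function entry address to TOC base value) and the helper names are assumptions made for illustration; the real type may differ. It uses Data.Map from the containers package.

```haskell
import qualified Data.Map.Strict as Map
import Data.Word (Word64)

-- Old style (as described): the TOC was only a lookup function, so it
-- could be queried but not inspected or built incrementally.
type TOCFunction = Word64 -> Maybe Word64

-- Hypothetical new style: a concrete table from function entry address
-- to the TOC base value for that function.
newtype TOC = TOC (Map.Map Word64 Word64)

-- A loader can build the table from whatever entries it finds.
tocFromList :: [(Word64, Word64)] -> TOC
tocFromList = TOC . Map.fromList

-- Lookup still works the way the old function did.
lookupTOC :: TOC -> Word64 -> Maybe Word64
lookupTOC (TOC m) addr = Map.lookup addr m

-- Something a bare function could not offer: enumerating known entries.
tocFunctions :: TOC -> [Word64]
tocFunctions (TOC m) = Map.keys m

-- The old interface is recoverable from the new one.
asFunction :: TOC -> TOCFunction
asFunction = lookupTOC
```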
This change is in the core generator monad and applied in the PowerPC backend.
This change includes some macaw updates (which required a new elf-edit version).
Now test to ensure that no blocks end in a classification failure (or a
disassembly failure). Before, many blocks were not classified, which caused
problems downstream. This required changes in macaw core in two places:
1. The simplifier needed some additional rules to remove some redundant
constructions that threw off the abstract interpretation of values. This was
particularly an issue while reading return values off of the stack in
PowerPC.
2. The abstract interpretation needed to be extended to handle more
operations (shiftl).
We need special treatment of the return, as the low two bits are cleared on
PowerPC, so we can't just rely on pattern matching against the ReturnAddr in the
IP register.
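A small sketch of the check this implies, over a toy expression type. The Value type and isReturnIP are hypothetical simplifications, not macaw's value representation: the point is only that the IP at a return looks like the return address masked with ~3 rather than the bare ReturnAddr.

```haskell
import Data.Bits (complement)
import Data.Word (Word64)

-- Toy stand-in for symbolic register values.
data Value
  = ReturnAddr          -- the address the function was called from
  | Const Word64        -- a literal bitvector
  | BVAnd Value Value   -- bitwise AND of two values
  deriving (Show, Eq)

-- Mask that clears the low two bits of a 64-bit address.
lowTwoBitsCleared :: Word64
lowTwoBitsCleared = complement 3

-- Accept the bare return address, or the return address ANDed with the
-- mask (in either operand order).
isReturnIP :: Value -> Bool
isReturnIP v = case v of
  ReturnAddr -> True
  BVAnd ReturnAddr (Const m) -> m == lowTwoBitsCleared
  BVAnd (Const m) ReturnAddr -> m == lowTwoBitsCleared
  _ -> False
```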
identifyReturn was previously unused because Macaw Discovery performed this
test inline, but some architectures have different return semantics, so the
Discovery process now calls identifyReturn.
This implements return discovery that should be sufficient for PPC.
Recent changes in macaw(-base) mean that we split blocks more aggressively. The
old expected outputs were conservative - these new values are much more in line
with intuitive expectation (with more aggressive splitting of blocks and less
code duplication between blocks).
Pass operand and architecture types. Instead of

    case opcode of
      ADD -> case operands of
               Just (GPR gpr0 :< Nil) ->
                 SSA-semantics

generate:

    let opc_ADD operands = case operands of
                             Just (GPR gpr0 :< Nil) ->
                               SSA-semantics
    in case opcode of
         ADD -> opc_ADD operands

This provides better encapsulation for the individual operands and
more specific control over the types (at the cost of a pair of
additional type specifications in the call). This also seems to
reduce memory consumption by about half.
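A self-contained sketch of how per-opcode functions can carry precise operand types. The OperandType, Operand, Operands, and Opcode definitions and the opcADD/opcLI bodies are hypothetical; they only mimic the `:<`/`Nil` operand-list style used in the pseudocode above and return a string in place of real SSA semantics.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators #-}

-- Hypothetical operand kinds.
data OperandType = GPRTy | ImmTy

data Operand (t :: OperandType) where
  GPR :: Int -> Operand 'GPRTy
  Imm :: Integer -> Operand 'ImmTy

-- Operand list indexed by the types of its elements.
data Operands (ts :: [OperandType]) where
  Nil  :: Operands '[]
  (:<) :: Operand t -> Operands ts -> Operands (t ': ts)
infixr 5 :<

-- Each opcode fixes the shape of its operand list.
data Opcode (ts :: [OperandType]) where
  ADD :: Opcode '[ 'GPRTy, 'GPRTy, 'GPRTy ]
  LI  :: Opcode '[ 'GPRTy, 'ImmTy ]

-- One function per opcode, with an explicit, specific operand type.
opcADD :: Operands '[ 'GPRTy, 'GPRTy, 'GPRTy ] -> String
opcADD (GPR d :< GPR a :< GPR b :< Nil) =
  "r" ++ show d ++ " := r" ++ show a ++ " + r" ++ show b

opcLI :: Operands '[ 'GPRTy, 'ImmTy ] -> String
opcLI (GPR d :< Imm i :< Nil) =
  "r" ++ show d ++ " := " ++ show i

-- Matching on the opcode refines ts, so each call type checks.
dispatch :: Opcode ts -> Operands ts -> String
dispatch opcode operands = case opcode of
  ADD -> opcADD operands
  LI  -> opcLI operands
```

Here the type signatures on opcADD and opcLI play the role of the extra type specifications mentioned above: they cost a few lines but make each body's operand shape explicit.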
The system call instructions TRAP and SC were updating the IP twice, which led
to skipping instructions. The IP increment for these instructions was already
handled in the abstract interpretation of arch-specific terminators.