This code was mostly architecture independent already, so this commit moves it
to the macaw-semmc module so that it can be shared with the ARM backend. I
still plan to move the main TH module containing the SimpleBuilder-to-macaw
translation, but that requires a few other changes first.
The code pointer discovery in macaw can't handle this case because we never
write the code pointers into memory; we only read them. We really need a way
to tell macaw about code pointers.
The easy workaround is to pull all of the function entry points out of the TOC
and just seed the macaw search with them, but it would be nice to be able to
identify them from first principles.
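
A minimal sketch of the workaround, under assumptions the text above does not
confirm: the table is treated as a flat array of ELFv1-style function
descriptors (three big-endian 64-bit words each, entry address first), and the
raw bytes are already in hand. The names entryPoints, descriptorSize, be64 and
chunksOf are hypothetical helpers, not macaw API.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word8, Word64)

-- | Decode a big-endian word from a list of bytes (eight bytes for a Word64).
be64 :: [Word8] -> Word64
be64 = foldl (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- | Split a list into consecutive chunks of n elements.
-- A short trailing chunk is dropped.
chunksOf :: Int -> [a] -> [[a]]
chunksOf n xs
  | length chunk < n = []
  | otherwise        = chunk : chunksOf n rest
  where
    (chunk, rest) = splitAt n xs

-- | ASSUMPTION: each descriptor is 24 bytes (entry address, TOC base,
-- environment pointer), as in the 64-bit PowerPC ELFv1 ABI.
descriptorSize :: Int
descriptorSize = 24

-- | Collect the entry address (first doubleword) of every descriptor.
-- The result is the list of addresses one would seed the search with.
entryPoints :: [Word8] -> [Word64]
entryPoints = map (be64 . take 8) . chunksOf descriptorSize
```

The resulting addresses would then be handed to the discovery pass as initial
function entry points; how that hand-off looks depends on the macaw API in use
and is not shown here.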
This change memoizes translations of SimpleBuilder expression fragments, which
restores the sharing in semantics formulas: the generator now re-uses shared
sub-expressions automatically. This generates less Haskell code, yielding
better code density and fewer terms constructed at run time. It also reduces
compile times.
It seems to cut the size of the generated TH code roughly in half. The
generated Haskell is also less deeply nested, which makes the resulting TH
splices human-readable.
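
The memoization idea can be sketched on a toy expression type. This is an
illustration only: Expr, St, translate and render are hypothetical names, the
integer node identifiers are an assumed stand-in for however SimpleBuilder
distinguishes expression nodes, and the output is plain text rather than TH.

```haskell
import qualified Data.Map.Strict as Map

-- | Toy expression DAG. Every node carries a unique identifier (assumed).
data Expr
  = Var Int String
  | Add Int Expr Expr
  | Mul Int Expr Expr

exprId :: Expr -> Int
exprId (Var i _)   = i
exprId (Add i _ _) = i
exprId (Mul i _ _) = i

-- | Translation state: a cache from node id to the name bound for it,
-- plus the bindings emitted so far (newest first).
data St = St
  { cache    :: Map.Map Int String
  , bindings :: [(String, String)]
  }

-- | Translate a node and return the name that holds its value.
-- A node that was already translated is looked up instead of re-emitted.
translate :: Expr -> St -> (String, St)
translate e st0 =
  case Map.lookup (exprId e) (cache st0) of
    Just name -> (name, st0)
    Nothing ->
      case e of
        Var _ v   -> (v, st0)
        Add _ a b -> binary "+" a b
        Mul _ a b -> binary "*" a b
  where
    binary op a b =
      let (na, st1) = translate a st0
          (nb, st2) = translate b st1
          name      = "t" ++ show (length (bindings st2))
          rhs       = unwords [na, op, nb]
      in ( name
         , st2 { cache    = Map.insert (exprId e) name (cache st2)
               , bindings = (name, rhs) : bindings st2
               }
         )

-- | Render the whole translation as a flat list of bindings.
render :: Expr -> [String]
render e =
  let (result, st) = translate e (St Map.empty [])
  in [n ++ " = " ++ r | (n, r) <- reverse (bindings st)] ++ ["result = " ++ result]
```

For a term that uses the same sum twice, e.g.
`let s = Add 1 (Var 2 "x") (Var 3 "y") in render (Mul 4 s s)`, the sum is bound
once and referenced twice, so the output is flat instead of nested.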
The test runs code discovery over a large-ish binary to check coverage. It
currently fails due to unsupported instructions (expected), and will guide
priorities for implementing new semantics.
The main function is 'extractValue', which takes an operand and returns a macaw
bitvector for it (in the PPCGenerator monad).
There are still some missing cases for the memory operands.
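
A rough sketch of the shape of such a function, with missing cases reported
explicitly instead of left as incomplete patterns. Everything here is a
hypothetical stand-in: Operand, Value and Gen are toy types, not the real
operand, macaw value, or PPCGenerator types.

```haskell
import Data.Word (Word64)

-- | Toy operand type (hypothetical; not the real PPC operand type).
data Operand
  = Gpr Int              -- general-purpose register number
  | Imm Integer          -- immediate
  | MemDisp Int Integer  -- base register plus displacement
  deriving Show

-- | Toy value type standing in for a macaw bitvector.
data Value
  = RegRead Int
  | Literal Word64
  deriving (Eq, Show)

-- | Toy generator: either an error message or a result.
type Gen a = Either String a

-- | Map an operand to a value. Memory operands are not handled yet, so
-- they produce an explicit error rather than a pattern-match failure.
extractValue :: Operand -> Gen Value
extractValue (Gpr n)
  | n >= 0 && n < 32 = Right (RegRead n)
  | otherwise        = Left ("invalid GPR: " ++ show n)
extractValue (Imm i) = Right (Literal (fromInteger i))
extractValue op@(MemDisp _ _) = Left ("unsupported operand: " ++ show op)
```

Keeping the function total this way makes the remaining gaps show up as
readable errors during translation.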
These lists come from semmc and contain the bytestrings of the semantics files
for each opcode.
NOTE: The lists are currently empty (presumably due to bugs), but the logic for
moving data around and setting up a SimpleBuilder instance is at least right.