aarch32: Support mixed ARM/Thumb1 binaries
This updates the aarch32 backend to decode Thumb instructions and generate the Thumb semantics. The major implementation change is to use the `ArchBlockPrecond` feature of macaw to track the Thumb state (`PSTATE_T`) across block boundaries.
The ARM code discovery decides whether or not a function entry point should be decoded as Thumb by examining the low bit of the function address. If the low bit is set, it is a Thumb entry point. This has the slightly odd effect of causing macaw to say that the function is at the address with the low bit set, which is not technically true. This is documented in the README, but not obvious on inspection. Most use cases should not care, and can in any case account for it. In the future, it should be possible to fix this (though it will require some changes to the core of macaw).
Improve the TH codegen for macaw-semmc
This change lazily translates as much as possible. It also generates somewhat more compact code. This change also finishes implementing primitives for the aarch32 backend. Complementing the aarch32 changes, the macaw-semmc interface has been modified to allow macaw-aarch32 to avoid a redundant serialize-deserialize round.
Co-authored-by: Kevin Quick <kquick@galois.com>
These packages replace the old macaw-arm (which has been removed). The only
change to the core macaw is to introduce a `Lift` instance for the Endianness
data type, which is used in macaw-semmc.
The macaw-aarch32 package uses the official ARM semantics (via the
asl-translator package). In its current state, macaw-aarch32 seems to handle
the common idioms of simple ARM binaries. Position independent executables have
not been tested yet. The semantics and disassemblers for Thumb are present, but
not integrated into code discovery at this time. There are some tests in
macaw-aarch32. Compile times are longer than necessarily desired.
macaw-aarch32 can be compiled in two modes: lite mode (cabal flag -fasl-lite),
which uses a restricted set of instructions for testing, and takes less time to
compile. The full instruction set is the default, though there are a few
undefined functions that are not yet handled for the full set, mostly relating
to floating point operations.
The macaw-aarch32-symbolic package is currently a stub, but is implemented to
provide a few necessary instances.