This patch adds initial support for relocations in Macaw code
discovery, along with several refactorings.
* It introduces a SymbolValue constructor to represent references to
symbols within Macaw.
* The various cases for x86 mov are made explicit after the flexdis refactor
broke the previous code. We should now support segment register movs and
give better error messages when seeing mov with control or debug registers.
* The generic exception operation is replaced with x86-specific Hlt and UD2
terminal statements.
* CodeAddrReason is split into FunctionExploreReason and BlockExploreReason to
clarify whether a function or block was discovered.
* The Macaw pretty printer is changed to use write_mem in place of pointer syntax.
* Various other refactorings are made to clarify the code.
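
As an illustration of the CodeAddrReason split described above, here is a
minimal sketch. The constructor names, the `Addr` alias, and the two
`describe*` helpers are hypothetical and chosen for this example; they are not
taken from the Macaw sources.

```haskell
import Data.Word (Word64)

-- Hypothetical address type; the real Macaw address types are
-- parameterized by architecture.
type Addr = Word64

-- Why a *function* entry point was explored (hypothetical constructors).
data FunctionExploreReason
  = CallTarget Addr   -- target of a call instruction located at Addr
  | InitAddr          -- initial entry point of the binary
  | UserRequest       -- explicitly requested by the caller
  deriving (Eq, Show)

-- Why a *block* inside a function was explored (hypothetical constructors).
data BlockExploreReason
  = FunctionEntryPoint  -- the block is the entry of its function
  | NextIP Addr         -- reached as successor of the block at Addr
  deriving (Eq, Show)

-- With two types, a consumer can no longer confuse the two kinds of reason.
describeFunction :: FunctionExploreReason -> String
describeFunction (CallTarget a) = "function: call target from " ++ show a
describeFunction InitAddr       = "function: initial entry point"
describeFunction UserRequest    = "function: user request"

describeBlock :: BlockExploreReason -> String
describeBlock FunctionEntryPoint = "block: function entry"
describeBlock (NextIP a)         = "block: successor of " ++ show a
```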
ArchMemAddr is easier to use than ArchAddrWord in downstream clients, and is
probably more faithful in the case where we want to support shared libraries
and/or object files.
The new statement is called `ArchState`, and has two fields: an address and a
map. The address is the address of the instruction it is standing in for. The
map contains a mapping from the *machine registers* that the instruction updated
to the *macaw values* that were assigned to those locations.
This is useful metadata for debugging, but is also required to do some types of
architecture-independent analysis (where we can still reason about machine
register contents).
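
A minimal sketch of the shape described above, assuming simplified stand-in
types: `Reg`, `Value`, `Stmt`, and the `regsAt` helper are hypothetical and
exist only for this example. The real Macaw statement type is parameterized
by architecture.

```haskell
import qualified Data.Map.Strict as Map
import Data.Word (Word64)

-- Hypothetical machine registers for one architecture.
data Reg = RAX | RBX | RSP
  deriving (Eq, Ord, Show)

-- Hypothetical, heavily simplified macaw values.
data Value
  = Literal Word64      -- a constant
  | AssignedValue Int   -- reference to an earlier assignment by id
  deriving (Eq, Show)

-- ArchState carries the instruction address and the registers that
-- instruction updated, mapped to the values assigned to them.
data Stmt
  = Comment String
  | ArchState Word64 (Map.Map Reg Value)
  deriving (Eq, Show)

-- Collect the register updates recorded for one instruction address.
-- If several ArchState statements mention the same register, the one
-- latest in the statement list wins (Map.unions is left-biased, so the
-- list is reversed first).
regsAt :: Word64 -> [Stmt] -> Map.Map Reg Value
regsAt addr stmts =
  Map.unions [ m | ArchState a m <- reverse stmts, a == addr ]
```

An architecture-independent analysis could use something like `regsAt` to ask
which machine registers changed at a given instruction without knowing the
instruction semantics.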
Instead of inline analysis of whether the instruction pointer has been
updated to contain the ReturnAddr symbolic value, defer the
determination of the call return to the (previously defined but
unused) architecture-specific handling. This allows architectures
like ARM that perform modifications on the values loaded to the
instruction pointer (e.g. clearing lower bits) to provide their own
recognition of a return operation.
Also modifies the signature of identifyReturn to return a Sequence of
statements to match the identifyCall type signature.
The previously unused identifyX86Return now performs the IP == ReturnAddr
check that was formerly done inline.
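
The following sketch shows why return detection benefits from being
architecture-specific. All types and function names here
(`identifyReturnX86`, `identifyReturnARM`, `isLowBitMask`, and the simplified
`Value` and `Stmt`) are hypothetical stand-ins, not the actual Macaw
signatures; the only detail taken from the description above is that the
result is a sequence of statements.

```haskell
import Data.Bits (complement)
import Data.Sequence (Seq)
import qualified Data.Sequence as Seq
import Data.Word (Word32)

-- Hypothetical, simplified value language.
data Value
  = ReturnAddr          -- symbolic return address at function entry
  | Const Word32
  | BVAnd Value Value   -- bitwise and
  deriving (Eq, Show)

-- Hypothetical statement type; only its presence in the result matters here.
data Stmt = Comment String
  deriving (Eq, Show)

-- x86-style check: a return is recognized only when the final
-- instruction pointer is exactly the symbolic return address.
identifyReturnX86 :: Seq Stmt -> Value -> Maybe (Seq Stmt)
identifyReturnX86 stmts ip
  | ip == ReturnAddr = Just stmts
  | otherwise        = Nothing

-- ARM-style check: also accept the return address with its low bit(s)
-- cleared, as some ARM return sequences do.
identifyReturnARM :: Seq Stmt -> Value -> Maybe (Seq Stmt)
identifyReturnARM stmts ip = case ip of
  ReturnAddr                                 -> Just stmts
  BVAnd ReturnAddr (Const m) | isLowBitMask m -> Just stmts
  BVAnd (Const m) ReturnAddr | isLowBitMask m -> Just stmts
  _                                          -> Nothing

-- Masks that clear only the lowest one or two bits of a 32-bit word.
isLowBitMask :: Word32 -> Bool
isLowBitMask m = m == complement 1 || m == complement 3
```

With a single inline IP == ReturnAddr test, the masked form would not be
recognized as a return; moving the decision into a per-architecture function
lets each backend accept the patterns it actually produces.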