223 Commits

Author SHA1 Message Date
Kevin Quick
16a867efd2
Haddock and README fixes. 2019-01-08 16:38:38 -08:00
Luke Maurer
b93302a536 Cache map with arch registers as keys
The use of `Data.Parameterized.Map.fromList` in `mkRegStateM` was
showing up in profiling as a huge time sink.  We don't actually need to
build the map from scratch there, though, since the keys are known ahead
of time.  Adding an `archRegSet` variable to the `RegisterInfo` class
(with the obvious default implementation) ensures that a `MapF` with the
right keys will be built once and then reused.
2018-12-27 11:32:56 -08:00
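The caching idea in commit b93302a536 can be sketched roughly as follows. This is a minimal illustration using the ordinary `Data.Map` from `containers` rather than `Data.Parameterized.Map`; the `Reg` type, `regKeys`, `mkRegState`, and `mkRegStateSlow` are hypothetical names chosen for the example and are not macaw's API.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for an architecture's register type.
data Reg = RAX | RBX | RCX | RDX
  deriving (Eq, Ord, Show, Enum, Bounded)

-- Built once as a top-level constant, analogous in spirit to the
-- `archRegSet` value described in the commit message.
regKeys :: Map.Map Reg ()
regKeys = Map.fromList [ (r, ()) | r <- [minBound .. maxBound] ]

-- Slow path: rebuild the map from a list on every call.
mkRegStateSlow :: (Reg -> a) -> Map.Map Reg a
mkRegStateSlow f = Map.fromList [ (r, f r) | r <- [minBound .. maxBound] ]

-- Fast path: reuse the prebuilt key structure and only fill in values.
mkRegState :: (Reg -> a) -> Map.Map Reg a
mkRegState f = Map.mapWithKey (\r () -> f r) regKeys

main :: IO ()
main = print (mkRegState fromEnum == mkRegStateSlow fromEnum)
```

Both functions produce the same map; the second avoids re-sorting and re-inserting the keys each time, which is the cost the commit message attributes to `fromList`.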
Luke Maurer
64a1c01a7b Use RULE to optimize uses of boundValue as getter
GHC was leaving `boundValue` in its higher-order form, which was causing
slowdowns accounting for ~3% of runtime in Brittle.
2018-12-27 11:32:46 -08:00
Andrei Stefanescu
76ac547995 Merge branch 'master' of github.com:GaloisInc/macaw into fix/keep-return-address-stack-write 2018-12-18 14:31:08 -08:00
Tristan Ravitch
96129be6de Keep the write of the return address to the stack (x86)
This mostly affects x86.  Previously, we threw away the write of the return
address to the stack when identifying calls for macaw-x86.  This was partly for
hygiene and partly to support the "addresses written to memory are function
pointers" heuristic.  Treating the return address as a potential function
pointer breaks function identification, so excluding it from that heuristic is
important.

The problem comes in the translation of macaw into crucible - we never write the
return address to the stack, but returns still read the return address from the
stack.  If it wasn't written in the first place, this leads to a read
from (potentially) uninitialized memory, which causes errors in the symbolic
simulator.  There are two solutions:

1. Make returns not read from the stack
2. Keep the write of the return address to the stack

Solution 1 is a problem, as we have a data dependency on the read.  Eliding it
breaks Crucible generation later and produces an invalid CFG.

Solution 2 works well.  The implementation is actually simple.  We can keep
identifyCall the same for x86 and just construct the basic block not from the
return value but from the original list of statements (unaltered).  We do need
to have identifyCall still give us the reduced statement list, which we use for
identifying possible function pointers written onto the stack (but not the
return address, which we do not want to treat as a function pointer).
2018-12-07 15:11:39 -08:00
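Solution 2 from commit 96129be6de can be illustrated with a toy model. The `Stmt` and `Block` types and the `candidatePointers` and `mkCallBlock` helpers below are hypothetical simplifications; only the name `identifyCall` comes from the commit message, and its real signature in macaw differs.

```haskell
-- Hypothetical, heavily simplified statement type (not macaw's own).
data Stmt
  = WriteMem String Integer   -- destination description, value written
  | WriteReturnAddr Integer   -- push of the return address at a call
  | Other String
  deriving (Eq, Show)

-- Toy analogue of identifyCall: returns the statement list with the
-- return-address write removed, or Nothing if no such write is found.
identifyCall :: [Stmt] -> Maybe [Stmt]
identifyCall stmts
  | any isRet stmts = Just (filter (not . isRet) stmts)
  | otherwise = Nothing
  where
    isRet (WriteReturnAddr _) = True
    isRet _ = False

-- Values written to memory in the reduced list: candidates for the
-- "addresses written to memory are function pointers" heuristic.
candidatePointers :: [Stmt] -> [Integer]
candidatePointers reduced = [ v | WriteMem _ v <- reduced ]

data Block = Block { blockStmts :: [Stmt], blockCandidates :: [Integer] }
  deriving (Eq, Show)

-- Build the block from the ORIGINAL statements, but feed only the
-- reduced list to the function-pointer heuristic.
mkCallBlock :: [Stmt] -> Maybe Block
mkCallBlock stmts = do
  reduced <- identifyCall stmts
  pure (Block stmts (candidatePointers reduced))

main :: IO ()
main = print (mkCallBlock [WriteMem "slot" 0x401000, WriteReturnAddr 0x400123])
```

The block keeps the return-address write, so a later return has something to read, while the heuristic never sees that address.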
Brian Huffman
3fc657782d Add Semigroup instance to make GHC 8.4 happy. 2018-12-07 13:48:38 -08:00
Joe Hendrix
3dd2f15dd6
Add mapsRegsWith; 8.6 compatibility. 2018-12-04 13:41:07 -08:00
Joe Hendrix
25e922ef83
Fix previous commit 2018-12-04 09:02:27 -08:00
Joe Hendrix
ebc5d9575e
Merge remote-tracking branch 'public/master' into jhx/plt-support 2018-12-04 08:04:32 -08:00
Joe Hendrix
f03941d607
Add test-plt test case, and fix discovery to use trust symbols. 2018-12-04 00:04:23 -08:00
Joe Hendrix
a0a89083e8
Support X86 Relative; other minor changes. 2018-12-03 20:52:44 -08:00
Kevin Quick
7a64cb614f
Explicit NoStarIsType with Data.Kind.Type and increasing do indentation (for GHC 8.6) 2018-11-20 09:43:48 +00:00
Joe Hendrix
c4b7252c77
Add specialized terminal statement for PLT stubs. 2018-11-16 13:40:40 -05:00
Joe Hendrix
bb63f9f859
This fixes tail call detection, and allows architecture-specific checks. 2018-11-12 11:56:44 -05:00
Kevin Quick
8e55f1644f
[base] Remove obsolete/unused GaloisDwarf.hs file. 2018-11-02 15:56:20 -07:00
Joe Hendrix
8ce9d06d27
Fix handling for non-position independent, but dynamically linked executables. 2018-10-25 13:49:21 -07:00
Joe Hendrix
2e93d42893
Merge remote-tracking branch 'public/master' 2018-10-22 13:04:30 -07:00
Kevin Quick
46a2c1c72a
Minor spelling fix in Types haddock docs. 2018-10-18 22:31:13 -07:00
Joe Hendrix
3948314813
Export AddrSymMap 2018-10-18 12:59:53 -07:00
Joe Hendrix
c886c19b03
Rename Memory exports.
This update renames many of the declarations exported by
Data.Macaw.Memory so that we have more consistent names.

The majority of the existing names are now exported with DEPRECATION
warnings.  Some of the symbol declarations that were not used by the
Memory datatype have been moved to other modules.

The minor version of macaw-base has been incremented.
2018-10-18 10:07:20 -07:00
Luke Maurer
8b0c58c661 Make architecture type families injective
This should cut down on the number of proxies/explicit type arguments
needed when dealing with these types.

Awkwardly, ArchTermStmt isn't injective, because PPC32 and PPC64 happen
to use exactly the same type. We could add an argument to that type and
then all the families could be injective.
2018-10-12 15:23:13 -07:00
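As a rough illustration of what commit 8b0c58c661 buys, the sketch below uses GHC's `TypeFamilyDependencies` extension. The `X86` and `ARM` tags, the `Reg` family, and the `Arch` class are hypothetical and far simpler than macaw's architecture families.

```haskell
{-# LANGUAGE TypeFamilyDependencies #-}

-- Hypothetical architecture tags and register types.
data X86
data ARM

newtype X86Reg = X86Reg Int
newtype ARMReg = ARMReg Int

-- The annotation `r | r -> arch` declares that the result determines the
-- argument, so GHC can recover `arch` from a `Reg arch`.
type family Reg arch = r | r -> arch where
  Reg X86 = X86Reg
  Reg ARM = ARMReg

class Arch arch where
  -- Without the injectivity annotation this signature would be ambiguous,
  -- because `arch` appears only under the type family.
  regName :: Reg arch -> String

instance Arch X86 where
  regName (X86Reg n) = "x86:r" ++ show n

instance Arch ARM where
  regName (ARMReg n) = "arm:r" ++ show n

main :: IO ()
main = do
  putStrLn (regName (X86Reg 3))  -- no proxy or type application needed
  putStrLn (regName (ARMReg 7))
```

If two equations mapped different architectures to the same result type, GHC would reject the injectivity annotation, which is the situation the commit message describes for ArchTermStmt on PPC32 and PPC64.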
Andrei Stefanescu
9c64a192d2 Evaluate PopCount and Bsr with concrete arguments. 2018-09-27 23:23:25 -07:00
Joey Dodds
1d3ce2ce77 add back import 2018-09-27 16:48:05 -07:00
Joey Dodds
d681046ddd removed an extra space 2018-09-27 16:15:18 -07:00
Joey Dodds
4f1f8656dd Merge branch 'master' of https://github.com/GaloisInc/macaw into HEAD 2018-09-27 15:39:01 -07:00
Joey Dodds
82b60bc315 add memory command to return all symbols as opposed to just function symbols 2018-09-27 15:18:15 -07:00
Daniel Wagner
d3f19048ce don't double-offset in pretty-printer
The pretty-printer for Stmts takes a pretty-printer function as an
argument. This is used when a Stmt stores an offset from the beginning of
a block, but we don't have information about that block internally in
the Stmt.

An ArchState Stmt stores an ArchMemAddr, which is independent of the
block it's in. Previously we were treating the ArchMemAddr as an offset
and passing it to the pretty-printer function for offsets; in practice
this means most of them were printed as values about twice as big as
they were supposed to be.
2018-09-26 16:29:26 -04:00
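The bug fixed in commit d3f19048ce can be reproduced in miniature. The `Offset` and `Addr` newtypes and the three printer functions are hypothetical; the sketch only assumes that the offset printer adds the block's base address to whatever it is given.

```haskell
import Numeric (showHex)

-- Hypothetical types: a block-relative offset and an absolute address.
newtype Offset = Offset Integer
newtype Addr = Addr Integer

-- Offset printer supplied by the caller; it knows the block's base address.
ppOffset :: Addr -> Offset -> String
ppOffset (Addr base) (Offset off) = "0x" ++ showHex (base + off) ""

-- Correct: print an absolute address directly.
ppAddr :: Addr -> String
ppAddr (Addr a) = "0x" ++ showHex a ""

-- Buggy: treat the absolute address as an offset, adding the base again.
ppAddrBuggy :: Addr -> Addr -> String
ppAddrBuggy base (Addr a) = ppOffset base (Offset a)

main :: IO ()
main = do
  let base = Addr 0x400000
      insn = Addr 0x400010
  putStrLn (ppAddr insn)            -- prints 0x400010
  putStrLn (ppAddrBuggy base insn)  -- prints 0x800010
```

Because an instruction address is usually close to its block's base, adding the base a second time yields a value roughly twice the correct one, matching the symptom described in the commit message.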
Joe Hendrix
96bd9bee1a
Fix off-by-one bug in applying relocations. 2018-09-17 16:20:51 -07:00
Joe Hendrix
73d24d42f9
Bump elf-edit compat 2018-09-17 15:32:18 -07:00
Joe Hendrix
c3dcfd7e3f
Update ElfLoader to apply PLT relocations. 2018-09-17 15:28:13 -07:00
Joe Hendrix
491f40caf1
Add support for R_ARM_GLOB_DAT. 2018-09-11 00:26:23 -07:00
Joe Hendrix
e180b996ce
Merge branch 'master' of github.com:GaloisInc/macaw 2018-09-10 15:34:14 -07:00
Joe Hendrix
4104191b54
Fix relocation handling; update for elf-edit/binary-symbols changes. 2018-09-10 15:33:35 -07:00
Andrei Stefanescu
0314c12908 Add float type to Macaw. 2018-08-27 11:37:01 -07:00
Kevin Quick
cc23c9b4e1
Fix miscellaneous haddock documentation misspellings. 2018-08-19 21:30:11 -07:00
Joe Hendrix
64d71737af
Bump submodules 2018-08-15 00:17:03 -07:00
Joe Hendrix
230b318dcf
Updates to discovery 2018-08-14 23:29:02 -07:00
Joe Hendrix
4c21eb9a97
Merge remote-tracking branch 'arm-reloc/master' 2018-08-12 23:30:23 -07:00
Daniel Wagner
da991102e7 comment IPAlignment more 2018-08-09 13:20:56 -04:00
Ledah Casburn
af826d6c16 Add naive bytestring matching utility.
Used to determine where bytes from execution trace match memory from ELF file.
2018-08-07 17:03:16 -07:00
Joe Hendrix
7f00e036b2
Remove error with ARM relocations. 2018-08-07 11:20:03 -07:00
Joe Hendrix
b49036ebdf
Fix broken array bounds check in jump table discovery. 2018-07-30 13:58:52 -07:00
Joe Hendrix
aa742d148d
Fix missing negation in function executable check. 2018-07-30 13:32:54 -07:00
Joe Hendrix
036f39cbb4
Bug fixes to code discovery; introduce JumpTableLayout.
This fixes bugs in scanning addresses in memory, and a failure to check
the executable status of function entry points.
2018-07-30 13:28:46 -07:00
Joe Hendrix
e4a27d7bbc
Merge branch 'master' of github.com:GaloisInc/macaw 2018-07-27 00:28:50 -07:00
Joe Hendrix
c6a1ecba6c
Rename MemSet to RepStos to reflect underlying x86 function. 2018-07-27 00:24:24 -07:00
Tristan Ravitch
4d1299a6d2 Merge branch 'master' into breaking-change/symbolic-global-map 2018-07-24 16:53:51 -07:00
Joe Hendrix
43e81ab795
Bump parameterized-utils min version 2018-07-23 13:47:24 -07:00
Joe Hendrix
901446bda5
Add test case for object jump table. 2018-07-20 18:16:52 -07:00
Joe Hendrix
0d0898c644
Add support for parsing jump tables with relocations in entries.
This also adds simplification rules and refactors some existing
interfaces.
2018-07-20 09:57:06 -07:00
Joe Hendrix
f1c5b10fd5
Extend relocation support and 1-1 x86 block association. 2018-07-18 16:57:17 -07:00
Brian Huffman
2330c81ab4 Fix haddock parse errors. 2018-07-17 13:23:48 -07:00
Joe Hendrix
b24649db35
Remove redundant function from Memory. 2018-07-12 13:46:02 -07:00
Joe Hendrix
bca405562a
Drop automatic parsing of NO_TYPE symbols in ElfLoader. 2018-07-03 16:35:41 -07:00
Andrei Stefanescu
313b2a738a Add parameters and return type to subprogram datatype 2018-06-26 10:58:43 -07:00
Joe Hendrix
0fc925f989
Update for elf-edit compat 2018-06-15 08:24:52 -07:00
Joe Hendrix
6391a87db1
Merge branch 'master' of github.com:GaloisInc/macaw 2018-06-12 16:20:55 -07:00
Daniel Wagner
f4d4e381b7 have a way to align potentially misaligned IPs 2018-06-11 10:30:32 -04:00
Daniel Wagner
7251bb6b03 MOAR REWRITES 2018-06-11 10:30:32 -04:00
Daniel Wagner
a9d49a96ed don't turn sext into uext 2018-06-11 10:30:32 -04:00
Daniel Wagner
b5c143418d rewrite trunc.sext and trunc.uext 2018-06-11 10:30:32 -04:00
Joe Hendrix
494f6c176d
Updates to Macaw. 2018-06-06 11:48:45 -07:00
Joe Hendrix
77627c391d
Remove redundant IPAlignment constraint. 2018-06-06 11:28:26 -07:00
Daniel Wagner
e3d7c26b8c minor improvement to jump bounds abstract interpretation 2018-05-30 15:50:16 -04:00
Daniel Wagner
024e393e8e more rewrite rules for <= and < 2018-05-30 15:50:16 -04:00
Daniel Wagner
38aeecba21 add/improve rewrite rules for testing bits of shifted values 2018-05-30 15:50:16 -04:00
Daniel Wagner
917f921301 make JumpBounds abstract interpretation more precise 2018-05-30 15:50:16 -04:00
Daniel Wagner
d0566fe03b lay some groundwork for jump table detection on PPC 2018-05-30 15:50:16 -04:00
Daniel Wagner
3814d9c649 documentation fix 2018-05-30 15:50:16 -04:00
Daniel Wagner
5c9707508c rewrite away saturated shifts 2018-05-30 15:50:16 -04:00
Daniel Wagner
fa96a062e1 adding two finsets, get a finset 2018-05-30 15:50:16 -04:00
Ben Selfridge
0dddfcacea fixed haddock parse errors 2018-05-10 16:48:18 -07:00
Tristan Ravitch
0eb0bd14f7 Merge branch 'master' of github.com:GaloisInc/macaw into HEAD 2018-04-25 08:41:41 -07:00
Daniel Wagner
6453486013 delete some debugging print statements 2018-04-24 17:07:07 -04:00
Daniel Wagner
0565805c4f more principled error reporting in readMemReprDyn 2018-04-24 14:52:38 -04:00
Daniel Wagner
3b3bcecc4a handle jump tables again, including PIC tables 2018-04-24 14:52:38 -04:00
Tristan Ravitch
bd686c3c2e Merge branch 'master' of github.com:GaloisInc/macaw into HEAD 2018-04-24 09:07:39 -07:00
Joe Hendrix
9047cb41fb
Fix warnings in macaw-base; Fix errors in macaw-symbolic.
This also makes some changes to eliminate a couple of redundant
type-class constraints in CrucGen.hs, which propagated to other changes.
2018-04-24 01:17:03 -07:00
Tristan Ravitch
ee96681d8d Merge branch 'master' of github.com:GaloisInc/macaw into HEAD 2018-04-23 18:51:19 -07:00
Tristan Ravitch
8c20e0e156 Export another utility and type from Macaw.Memory
This type is needed to write some type signatures, and we needed an accessor to
extract segment ranges from a SegmentContents.
2018-04-23 18:50:39 -07:00
Joe Hendrix
052506f202
Remove PhaseHolderStmt. 2018-04-23 11:35:31 -07:00
Joe Hendrix
097edda1ef
Relocation support; various cleanups.
This patch adds initial support for relocations in Macaw code
discovery, and adds other refactoring.

* It introduces a SymbolValue constructor to represent references to
  symbols within Macaw.
* The various cases for x86 mov are made explicit after the flexdis refactor
  broke the previous code.  We should now support segment register movs and
  give better error messages when seeing mov with control or debug registers.
* The generic exception operation is replaced with Hlt and UD2 terminal
  x86-specific statements.
* CodeAddrReason is split into FunctionExploreReason and BlockExploreReason to
  clarify whether a function or block was discovered.
* The Macaw pretty printer is changed to use write_mem in place of pointer syntax.
* Various other refactoring is made to clarify code.
2018-04-23 11:24:21 -07:00
Joe Hendrix
0b8e95b0b0
Merge branch 'master' of github.com:GaloisInc/macaw 2018-04-17 16:02:28 -07:00
Brian Huffman
1e3fad7d77 Fix typo in module header description. 2018-04-16 10:07:23 -07:00
Joe Hendrix
2feebceddc
Refactor relocation support; support .rel and some object symbols. 2018-04-05 09:06:12 -07:00
Tristan Ravitch
2524b77cb5 base: Change the type of the address in the ArchState statement
ArchMemAddr is easier to use than ArchAddrWord in downstream clients, and is
probably more faithful in the case where we want to support shared libraries
and/or object files.
2018-03-30 10:33:49 -07:00
Tristan Ravitch
ce96c55896 Merge branch 'master' of github.com:GaloisInc/macaw into HEAD 2018-03-29 17:09:05 -07:00
Tristan Ravitch
51b8dae802 Change the pretty printing of the 'ArchState' macaw statement 2018-03-29 17:08:40 -07:00
Joe Hendrix
265f61e206
Merge branch 'master' of github.com:GaloisInc/macaw 2018-03-29 16:30:29 -07:00
Joe Hendrix
007405db1d
Improve robustness of elf loader, and start trying to parse relocations in objects. 2018-03-29 15:21:31 -07:00
Iavor Diatchki
8ac1a914ae Merge branch 'master' of github.com:GaloisInc/macaw 2018-03-29 12:42:24 -07:00
Iavor Diatchki
81f327e037 Add a function to find all symbols, not just functions.
Joe is working on making this more generic in some way,
so this is just a quick (probably temporary) fix to expose
the needed functionality.
2018-03-29 12:42:18 -07:00
Jason Dagit
372d7d7208 Add a new macaw statement to record updates to machine registers
The new statement is called `ArchState`, and has two fields: an address and a
map.  The address is the address of the instruction it is standing in for.  The
map contains a mapping from the *machine registers* that the instruction updated
to the *macaw values* that were assigned to those locations.

This is useful metadata for debugging, but is also required to do some types of
architecture-independent analysis (where we can still reason about machine
register contents).
2018-03-29 09:53:08 -07:00
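The shape of the statement introduced in commit 372d7d7208 might look roughly like the sketch below. Only the name `ArchState` and its two fields (an address and a register-to-value map) come from the commit message; the `Reg` and `Value` types, the `Comment` constructor, and `updatedRegs` are hypothetical, and macaw's real types are parameterized by architecture.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical simplified types.
type Addr = Integer
data Reg = R0 | R1 | PC deriving (Eq, Ord, Show)
data Value = Const Integer | AssignedVar Int deriving (Eq, Show)

data Stmt
  = ArchState Addr (Map.Map Reg Value)
    -- ^ instruction address, plus the machine registers that the
    --   instruction updated and the values assigned to them
  | Comment String
  deriving (Eq, Show)

-- Collect the registers updated by the instruction at a given address.
updatedRegs :: Addr -> [Stmt] -> [Reg]
updatedRegs a stmts = concat [ Map.keys m | ArchState a' m <- stmts, a' == a ]

main :: IO ()
main = do
  let stmts =
        [ Comment "add r0, r1"
        , ArchState 0x1000 (Map.fromList [(R0, AssignedVar 1), (PC, Const 0x1004)])
        ]
  print (updatedRegs 0x1000 stmts)
```

An architecture-independent analysis can walk the statement list this way and still recover which machine registers changed at each instruction.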
Tristan Ravitch
8d5e39c87f base: Add additional simplification rules to the rewriter 2018-03-27 18:13:46 -07:00
Tristan Ravitch
c2c5835b10 base: Add another case to the abstract interpretation
Now handle shifts of constants
2018-03-27 18:13:23 -07:00
Kevin Quick
594e9e025d
Restrict Semigroup imports to avoid collisions on unused definitions. 2018-03-27 10:43:04 -07:00
Kevin Quick
818f7a7767
Remove unused import in Macaw CFG Core. 2018-03-27 10:42:28 -07:00
Kevin Quick
377c3d1a2b
Use architecture-specific identifyReturn in Discovery process.
Instead of inline analysis of whether the instruction pointer has been
updated to contain the ReturnAddr symbolic value, defer the
determination of the call return to the (previously defined but
unused) architecture-specific handling.  This allows architectures
like ARM that perform modifications on the values loaded to the
instruction pointer (e.g. clearing lower bits) to provide their own
recognition of a return operation.

Also modifies the signature of identifyReturn to return a Sequence of
statements to match the identifyCall type signature.

Replaces the previously unused identifyX86Return with the inline
detection of IP == ReturnAddr.
2018-03-27 10:35:55 -07:00
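The change in commit 377c3d1a2b, where discovery defers return detection to an architecture-specific hook, can be sketched as follows. The `Value` type, the `ArchInfo` record, and `isReturn` are hypothetical; the real `identifyReturn` returns a sequence of statements rather than a `Bool`, and the ARM case here only assumes the low-bit clearing the commit message mentions.

```haskell
-- Hypothetical symbolic value held by the instruction pointer at block end.
data Value
  = ReturnAddr              -- the symbolic return address
  | Const Integer
  | ClearLowBits Int Value  -- a value with its n lowest bits cleared
  deriving (Eq, Show)

-- Per-architecture hook, analogous in spirit to identifyReturn.
newtype ArchInfo = ArchInfo { identifyReturn :: Value -> Bool }

-- x86-like: a return is exactly IP == ReturnAddr.
x86Info :: ArchInfo
x86Info = ArchInfo (== ReturnAddr)

-- ARM-like: the loaded value may have low bits cleared before the jump.
armInfo :: ArchInfo
armInfo = ArchInfo go
  where
    go ReturnAddr = True
    go (ClearLowBits _ v) = go v
    go _ = False

-- Generic discovery code no longer inspects the value itself.
isReturn :: ArchInfo -> Value -> Bool
isReturn = identifyReturn

main :: IO ()
main = do
  let ip = ClearLowBits 1 ReturnAddr
  print (isReturn x86Info ip)  -- prints False
  print (isReturn armInfo ip)  -- prints True
```

With the check hard-coded as `IP == ReturnAddr`, the ARM-style value would never be recognized as a return; moving the decision behind the hook lets each architecture supply its own rule.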
Joe Hendrix
ceefa7ae75
Update memory to use explicit BSS region and disable includeBSS option. 2018-03-23 16:26:07 -07:00
Joe Hendrix
557408132c
Merge branch 'master' of github.com:GaloisInc/macaw 2018-03-23 14:13:09 -07:00