This makes the Racket compiler check for new versions of modules it has
already compiled. It should avoid some confusing errors when developing
the Racket libraries.
If a user arbitrarily permutes the links associated with code, it's
possible to end up with a load request that appears to contain mutually
recursive definitions without a common hash. So, this case has been made
a catchable exception instead of a Haskell error.
The initial hashing in the runtime still just calls error when this
happens, since there it would mean that internal code generation
produced bad SCCs, not that any kind of user error occurred.
Solves one observed problem and potentially some others.
The observed problem was that the process of unhashing and rehashing did
not replace any term links in the original terms. This is because term
link literals can't be turned into variables and subsequently replaced
with new hashes. So, instead we use the available variable mappings to
replace the literals manually.
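The manual replacement amounts to a traversal over the term that rewrites link literals in place. Below is a minimal, hypothetical sketch; the types and names (Term, TermLink, remapLinks) are invented for illustration and do not match the actual compiler code.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- An invented reference type: a content hash plus a component index.
data Reference = Ref String Int deriving (Eq, Ord, Show)

-- An invented, simplified term representation with link literals.
data Term
  = Var String
  | TermLink Reference      -- a term link literal
  | App Term Term
  | Lam String Term
  deriving (Eq, Show)

-- Rewrite every term link literal according to the old->new mapping,
-- leaving unmapped references untouched.
remapLinks :: Map Reference Reference -> Term -> Term
remapLinks m = go
  where
    go t = case t of
      TermLink r -> TermLink (Map.findWithDefault r r m)
      App f x    -> App (go f) (go x)
      Lam v b    -> Lam v (go b)
      Var v      -> Var v
```

Since link literals never pass through the variable-substitution machinery, a direct traversal like this is the only place they get updated.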
A superior methodology might be to replicate the SCC behavior already in
hashTermComponents, and incrementally remap individual components.
However, that is a considerable amount of additional work, and the
post-floating references are just used as a translation layer between
the codebase and verifiable hashes, so making them completely consistent
doesn't seem necessary.
I've also added some codebase->verifiable replacements from the context
in some places. It's possible that not doing this would have caused
problems during UCM sessions where some terms are loaded incrementally
across multiple calls into the runtime. I didn't observe any failing
tests due to this, though.
When the interpreter is called for things in a scratch file, the code
path is slightly different, because all the definitions are pre-combined
into a letrec. If the definitions are subsequently added to the
codebase, we will have already loaded and compiled their code. However,
previously the remapping from base to floated references would not
exist, because that was only being generated for loaded dependencies in
the other path.
So, this code adds a similar remapping for a top-level letrec in this
code path. This fixes a problem with compiling a definition that had
just been added from a scratch file. The remap was expected to be there,
but it wasn't, so the compiler could not find the code to emit.
The pair marshaling code was mistakenly using only a single layer of
data nesting, but Unison pairs are encoded like 2-element cons lists.
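Concretely, a pair (a, b) decodes as a cons cell whose tail is another cell ending in unit, not as a single constructor holding both elements. This sketch uses an invented value representation purely to contrast the two shapes; it is not the actual marshaling code.

```haskell
-- Invented value representation, for illustrating shapes only.
data Val = VInt Int | VUnit | VData String [Val]
  deriving (Eq, Show)

-- Wrong: a single layer of nesting, one constructor holding both fields.
badPair :: Val -> Val -> Val
badPair a b = VData "Pair" [a, b]

-- Right: like a 2-element cons list, the second element sits in a
-- nested cell whose tail is unit.
goodPair :: Val -> Val -> Val
goodPair a b = VData "Pair" [a, VData "Pair" [b, VUnit]]
```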
The rehashing code was not sorting SCCs into a canonical order, so the
exact input order of components with more than one binding could
influence the hash. Sorting by input reference order fixes this, since
all references in an SCC are required to share the same hash, differing
only by index.
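Because every reference in a component shares one hash and differs only by its index, sorting bindings on that index yields an order independent of how the input happened to be arranged. A small sketch with invented types:

```haskell
import Data.List (sortOn)

-- Invented reference type: all references in one SCC share refHash and
-- differ only in refIndex.
data Ref = Ref { refHash :: String, refIndex :: Int }
  deriving (Eq, Show)

-- Canonicalize an SCC's bindings by sorting on the component index.
canonicalize :: [(Ref, t)] -> [(Ref, t)]
canonicalize = sortOn (refIndex . fst)
```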
I've checked the transcripts, and I believe these are outputs that
_should_ have changed, because the output depends on the specific values
of term links. fix2027 is just a different hash in a stack trace, which
is not surprising because it's the hash of a floated function.
I had mistakenly been putting floated terms in the decompile info, and
that was causing some mismatches with previous decompiling outputs.
Switching back to remembering the unfloated definitions seems to have
cleared up the discrepancies.
Previously, we took already-hashed terms as input to the various
intermediate compiler stages. We would then do things like lambda
lifting, and hash the resulting definitions afterwards. However, this
led to problems such as mutually recursive definitions that did not
share a common hash.
To rectify this, we instead do the intermediate operations on an
_unhashed_ version of code looked up from the codebase. Essentially, we
turn relevant definitions back into a letrec, do the floating, then hash
the results of that processing. This gives proper hashes to the
processed terms so that the compiled terms can be rehashed with relative
ease.
The system for this is unfortunately quite intricate. To get good
decompiled output, we need to keep track of associations between
original and floated terms _and_ floated and rehashed terms, then
mediate between them in various places.
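A rough sketch of that bookkeeping (with hypothetical names, not the actual data structures): keep the original-to-floated and floated-to-rehashed associations as separate maps, and compose them on demand when a translation all the way across is needed.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- Compose the two association maps: an original reference translates
-- all the way to a rehashed one only if both hops are present.
origToRehashed :: Ord r => Map r r -> Map r r -> Map r r
origToRehashed origToFloated floatedToRehashed =
  Map.mapMaybe (\f -> Map.lookup f floatedToRehashed) origToFloated
```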