We eliminated unneeded symbols at the end of the algorithm, but that
didn't save much space. In fact, I didn't observe any difference
after removing that last step from the algorithm.
- Rename data types so that we have LittleEndian and BigEndian types.
- Do not assume that "base" integral types are aligned to 2-byte
boundaries, because that doesn't make much difference in terms of
performance.
- Remove `pu8` type
The core of this change is cleaning up UBSan alignment errors in mold.
The approach, inspired by LLVM's ELFTypes, is as follows:
1. Introduce a generic Packed class that acts as a proxy for an unaligned aggregate value.
Every access to such a value through the proxy goes through a memcpy call.
2. Replace plain struct members with Packed members wherever it matters.
3. Use explicit casts through the Packed type for misaligned loads.
For more background, see: https://github.com/rui314/mold/discussions/477
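As a rough illustration of the three steps above, here is a minimal sketch of such a proxy. It is not the code from this change: the member layout, the `Record` struct and the `load_u32` helper are hypothetical names made up for the example, and only the idea (byte-array storage, memcpy on every access, casts through the proxy type) comes from the description.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical proxy for a value stored at a possibly misaligned address.
// The storage is a byte array, so the class itself has alignment 1.
template <typename T>
class Packed {
  static_assert(std::is_trivially_copyable_v<T>);

public:
  Packed() = default;
  Packed(T v) { *this = v; }

  // Load: copy the bytes out instead of dereferencing a T*.
  operator T() const {
    T v;
    std::memcpy(&v, buf, sizeof(T));
    return v;
  }

  // Store: copy the bytes in.
  Packed &operator=(T v) {
    std::memcpy(buf, &v, sizeof(T));
    return *this;
  }

private:
  unsigned char buf[sizeof(T)];
};

// Step 2: a struct whose 32-bit field sits at offset 1 (illustrative only).
struct Record {
  uint8_t tag;
  Packed<uint32_t> value;
};

static_assert(alignof(Packed<uint32_t>) == 1);
static_assert(sizeof(Record) == 5);

// Step 3: a misaligned load from a raw buffer via a cast through Packed.
inline uint32_t load_u32(const unsigned char *p) {
  assert(p != nullptr);
  return *reinterpret_cast<const Packed<uint32_t> *>(p);
}
```

Because the proxy only ever touches its bytes through memcpy, the compiler is free to emit a single unaligned load on targets that allow it, while UBSan no longer sees a misaligned pointer dereference.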
Signed-off-by: Dawid Jurczak <dawid_jurek@vp.pl>
Previously, Symbol::sym_idx overflowed if there were more than 524,288
symbols in a single file. That's a very large number but is not impossible.
Fixes https://github.com/rui314/mold/issues/405
And there is probably little need to do so, since it's DWARF4-only
and requires explicit -fdebug-types-section, but at least detect
and abort in this case.
Previously, our output for .gdb_index depended on the iteration order of
ConcurrentMap. Since the iteration order varies on every run of the linker
due to thread scheduling randomness, our .gdb_index was different on
every linker invocation.
In mold, we guarantee that our output is the same as long as the same
command line options and input files are given to the same version of mold.
So the situation clearly violated the rule.
This commit fixes that issue.
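One common way to get reproducible output from a container with unspecified iteration order is to copy the entries out and sort them before emitting. The sketch below shows only that general technique; it is not necessarily how this commit fixes the issue. `NameMap` is a hypothetical stand-in (a plain std::unordered_map) for the concurrent map, and `stable_entries` is a made-up helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the concurrent map: its iteration order is
// unspecified and must not leak into the output.
using NameMap = std::unordered_map<std::string, unsigned>;

// Collect the entries and sort them by key (then value) so that the
// emitted order depends only on the map's contents.
inline std::vector<std::pair<std::string, unsigned>>
stable_entries(const NameMap &map) {
  std::vector<std::pair<std::string, unsigned>> vec(map.begin(), map.end());
  std::sort(vec.begin(), vec.end());
  return vec;
}
```

Two maps with the same contents then yield the same sequence regardless of insertion order or thread scheduling.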
The value is an index (offset by DW_AT_rnglists_base) into the first
part of .debug_rnglists, which is a list of offsets into the second
part of .debug_rnglists, which lists ranges, each entry starting
with a value describing how to interpret the data.
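The two-level lookup described above can be sketched as follows. This is an illustration under stated assumptions, not the code from this change: it assumes the 32-bit DWARF format (4-byte offset-table entries), reads values in host byte order, and uses made-up helper names (`read32`, `rnglist_offset`).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Read a 4-byte value in host byte order (assumption: host and target
// endianness match).
inline uint32_t read32(const std::vector<uint8_t> &sec, uint64_t off) {
  assert(off + 4 <= sec.size());
  uint32_t v;
  std::memcpy(&v, sec.data() + off, 4);
  return v;
}

// Resolve a range-list index to the section offset of its range list.
// `rnglists_base` is the value of DW_AT_rnglists_base, which points at
// the offset table (the first part of .debug_rnglists). Each table entry
// is an offset relative to that base into the second part.
inline uint64_t rnglist_offset(const std::vector<uint8_t> &debug_rnglists,
                               uint64_t rnglists_base, uint64_t index) {
  uint32_t rel = read32(debug_rnglists, rnglists_base + index * 4);
  return rnglists_base + rel;
}
```

The byte at the returned offset is the entry kind that says how to interpret the data that follows.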
Signed-off-by: Luboš Luňák <l.lunak@centrum.cz>
The value is not the address, it's an index into .debug_addr
that's additionally offset by DW_AT_addr_base.
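A minimal sketch of that lookup, assuming 8-byte addresses and host byte order; `read_addrx` is a hypothetical helper name, not a function from mold.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Resolve an address index to the actual address.
// `addr_base` is the value of DW_AT_addr_base; the index selects an
// entry in the address table that starts there in .debug_addr.
// Assumption: 8-byte addresses stored in host byte order.
inline uint64_t read_addrx(const std::vector<uint8_t> &debug_addr,
                           uint64_t addr_base, uint64_t index) {
  uint64_t off = addr_base + index * 8;
  assert(off + 8 <= debug_addr.size());
  uint64_t addr;
  std::memcpy(&addr, debug_addr.data() + off, 8);
  return addr;
}
```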
Signed-off-by: Luboš Luňák <l.lunak@centrum.cz>
.gdb_index contains two maps: a map from identifiers (type names,
function names or variable names) to compunits, and a map from
function address ranges to compunits. The latter is harder to create
because we need to parse DWARF debug records to read function
address ranges. We generally don't want to do that.
This commit stops emitting address ranges. Now, .gdb_index sections
created by our linker contain a zero-length map as the address ranges.
Though I'm not 100% sure if this is actually OK, it looks like gdb
works fine with that. It's at least worth a try.
https://github.com/rui314/mold/issues/439
https://github.com/rui314/mold/issues/396
mold is usually built for all supported targets, namely, x86-64, i386,
ARM32, ARM64 and RISCV64. This is a good thing because it makes cross
compilation easy. That is, as long as you have a copy of the mold linker,
it is guaranteed to work as a cross linker.
However, during a quick debug session in which you build mold many
times, you may want to build mold only for your native target. That
greatly reduces build time because it reduces the amount of code
after template instantiation.
Therefore, in this commit, I introduced new macros,
MOLD_DEBUG_X86_64_ONLY and MOLD_DEBUG_ARM64_ONLY, to build mold for
x86-64 and ARM64 only, respectively.
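To illustrate why such macros cut build time, here is a hedged sketch of how a preprocessor switch can limit which targets a template is instantiated for. Only the two macro names come from this commit; the target tag structs, `link_for` and `built_targets` are hypothetical and do not reflect mold's actual source layout.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical target tags; the real target types may look different.
struct X86_64  { static constexpr const char *name = "x86_64"; };
struct I386    { static constexpr const char *name = "i386"; };
struct ARM32   { static constexpr const char *name = "arm32"; };
struct ARM64   { static constexpr const char *name = "arm64"; };
struct RISCV64 { static constexpr const char *name = "riscv64"; };

// Stand-in for the templated linker body. Each instantiation costs
// compile time, so fewer instantiations mean a faster build.
template <typename E>
std::string link_for() {
  return E::name;
}

// Instantiate the template only for the targets selected at build time.
inline std::vector<std::string> built_targets() {
  std::vector<std::string> v;
#if defined(MOLD_DEBUG_X86_64_ONLY)
  v.push_back(link_for<X86_64>());
#elif defined(MOLD_DEBUG_ARM64_ONLY)
  v.push_back(link_for<ARM64>());
#else
  v.push_back(link_for<X86_64>());
  v.push_back(link_for<I386>());
  v.push_back(link_for<ARM32>());
  v.push_back(link_for<ARM64>());
  v.push_back(link_for<RISCV64>());
#endif
  assert(!v.empty());
  return v;
}
```

Under these assumptions, defining one of the macros on the compiler command line (for example with -D) leaves a single instantiation to compile.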
These flags should never be used for production. They are solely for
debugging purposes.
Previously, --gdb-index tried to read bogus compressed data from
input sections if the input debug sections were compressed.
Fixes https://github.com/rui314/mold/issues/431