While an iterator may point beyond the last element of a `std::span`, a
pointer may not.
Fixes #298.
Signed-off-by: Christoph Erhardt <github@sicherha.de>
Whether to build mold as a position-independent executable is up to the
packager. To do so, pass `CXXFLAGS=-fPIE` and `LDFLAGS=-pie` to `make`.
Our Makefile didn't allow overriding CFLAGS, CXXFLAGS or LDFLAGS
because we added mandatory options to these variables. That is contrary
to the Makefile conventions explained here:
https://www.gnu.org/prep/standards/html_node/Command-Variables.html#Command-Variables
In this patch, I moved the mandatory options into MOLD_CXXFLAGS and
MOLD_LDFLAGS so that users can freely override CFLAGS, CXXFLAGS and
LDFLAGS.
I also removed `DEBUG`, `ASAN` and `TSAN` variables from the Makefile
because we can now simply pass `CXXFLAGS="-g -O0"` or
`CXXFLAGS=-fsanitize=...` instead.
The SerenityOS LibC doesn't leak the definitions of select(2) or struct
timeval into other POSIX headers. For better compatibility, explicitly
include the headers that define them in elf/subprocess.cc, where they
are used.
Signed-off-by: Andrew Kaster <akaster@serenityos.org>
Before this change, all bitmap entries ended up containing the value
0x3 (as `pos[i] - base` is guaranteed to be a multiple of the word
size). This expression needs to produce the offset in words from the
base address, so we need to *divide* by the word size.
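The difference between the two expressions can be sketched as follows. This is an illustration rather than mold's actual code: the function names are hypothetical, and it assumes a 64-bit RELR bitmap entry whose low bit is the bitmap marker and whose remaining 63 bits each cover one word starting at `base`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Builds one bitmap entry for the word-aligned addresses in `pos`.
// Each address sets the bit for its offset *in words* from `base`.
uint64_t relr_bitmap(uint64_t base, const std::vector<uint64_t> &pos,
                     uint64_t word_size = 8) {
  uint64_t bits = 0;
  for (uint64_t p : pos) {
    assert(p >= base && (p - base) % word_size == 0);
    uint64_t idx = (p - base) / word_size;  // divide: offset in words
    assert(idx < 63);                       // must fit in one entry
    bits |= uint64_t{1} << idx;
  }
  return (bits << 1) | 1;  // low bit 1 marks the entry as a bitmap
}

// Same loop with a modulo instead of the division, for contrast.
// The remainder is always 0, so only bit 0 is ever set.
uint64_t relr_bitmap_with_modulo(uint64_t base,
                                 const std::vector<uint64_t> &pos,
                                 uint64_t word_size = 8) {
  uint64_t bits = 0;
  for (uint64_t p : pos)
    bits |= uint64_t{1} << ((p - base) % word_size);
  return (bits << 1) | 1;
}
```

With `base = 0x1000` and addresses `0x1000` and `0x1010`, the first function sets the bits for words 0 and 2, while the modulo variant yields 0x3 for any non-empty input.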
Signed-off-by: Daniel Bertalan <dani@danielbertalan.dev>
If the corresponding DT_RELR, DT_RELRSZ and DT_RELRENT dynamic entries
are not present, dynamic linkers won't pick up the relocations encoded
in .relr.dyn.
Signed-off-by: Daniel Bertalan <dani@danielbertalan.dev>
To process version scripts, we have to match glob patterns against
symbol strings. Sometimes, we have hundreds or thousands of glob
patterns and have to match them against millions of mangled long
C++ symbol names. This step can be very slow.
In this patch, I implemented the Aho-Corasick algorithm to match glob
patterns to symbol strings as quickly as possible. For the details
of the algorithm, see
https://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_algorithm.
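For readers unfamiliar with the technique, here is a minimal Aho-Corasick sketch. It is not mold's implementation: it handles literal substrings only (no glob metacharacters), and the class name `AhoCorasick` and method `find_first` are hypothetical. It reports the lowest-numbered pattern that occurs anywhere in the text, scanning the text once regardless of how many patterns there are.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <string_view>
#include <vector>

class AhoCorasick {
public:
  explicit AhoCorasick(const std::vector<std::string> &patterns) {
    nodes.emplace_back();  // root
    // Step 1: build a trie of all patterns.
    for (std::size_t i = 0; i < patterns.size(); i++) {
      int cur = 0;
      for (unsigned char c : patterns[i]) {
        if (nodes[cur].next[c] == -1) {
          int id = static_cast<int>(nodes.size());
          nodes.emplace_back();
          nodes[cur].next[c] = id;
        }
        cur = nodes[cur].next[c];
      }
      if (nodes[cur].match == -1)
        nodes[cur].match = static_cast<int>(i);  // keep the earliest pattern
    }
    // Step 2: breadth-first pass that sets failure links and turns the
    // trie into a full automaton (every state has all 256 transitions).
    std::queue<int> q;
    for (int c = 0; c < 256; c++) {
      int v = nodes[0].next[c];
      if (v == -1)
        nodes[0].next[c] = 0;
      else
        q.push(v);  // depth-1 states fail to the root (the default)
    }
    while (!q.empty()) {
      int u = q.front();
      q.pop();
      int f = nodes[u].fail;
      // A match at the failure state is also a match here.
      if (nodes[f].match != -1 &&
          (nodes[u].match == -1 || nodes[f].match < nodes[u].match))
        nodes[u].match = nodes[f].match;
      for (int c = 0; c < 256; c++) {
        int v = nodes[u].next[c];
        if (v == -1) {
          nodes[u].next[c] = nodes[f].next[c];
        } else {
          nodes[v].fail = nodes[f].next[c];
          q.push(v);
        }
      }
    }
  }

  // Returns the lowest index of a pattern contained in `text`, or -1.
  int find_first(std::string_view text) const {
    assert(!nodes.empty());
    int best = nodes[0].match;
    int cur = 0;
    for (unsigned char c : text) {
      cur = nodes[cur].next[c];
      int m = nodes[cur].match;
      if (m != -1 && (best == -1 || m < best))
        best = m;
    }
    return best;
  }

private:
  struct Node {
    std::array<int, 256> next;
    int fail = 0;
    int match = -1;
    Node() { next.fill(-1); }
  };
  std::vector<Node> nodes;
};
```

The dense 256-entry transition table keeps the sketch short at the cost of memory; a real implementation would likely use a more compact representation.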
This patch improves mold's performance for programs that use large
version scripts. For example, the link time for libQt6Gui.so.6.3.0
dropped from 1.10s to 0.05s with this patch.
This patch also changes how symbol versions are applied if two or more
version patterns match a single symbol string. Previously, the last
one in a script file took precedence. Now, the first one takes
precedence. I believe the new behavior is compatible with GNU ld.
Fixes https://github.com/rui314/mold/issues/156
Fixes https://github.com/rui314/mold/issues/287
If you try to link ICC-generated object files with GCC-generated object
files, you'll get an error message saying that `__gxx_personality_v0` is
duplicated. This is because ICC puts this symbol into a GNU linkonce
section, while GCC puts it into a COMDAT section.
ICC should stop putting that symbol into GNU linkonce because linkonce
was superseded by COMDAT long ago.
That being said, we need to do something to make it possible to mix
ICC and GCC object files. In this patch, we simply discard the
ICC-generated section. Since `__gxx_personality_v0` is always provided
by libstdc++ (that function handles C++ exceptions), we can simply
ignore ICC's output.
Fixes https://github.com/rui314/mold/issues/271
libtbb.so contains undefined symbols on Gentoo with musl libc.
Because we are not using such symbols at runtime, and symbols are
resolved lazily, mold would work fine as long as we can build it.
By passing `-allow-shlib-undefined`, we can build mold.
Fixes https://github.com/rui314/mold/issues/281
ARM64 branch instructions have only a 26-bit displacement field.
Instructions are always aligned to 4-byte boundaries, so with the
implicit two trailing zeros considered, they can represent only a 28-bit
signed displacement. That means a branch instruction can jump to a
location only if it is within a ±128 MiB range.
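The range arithmetic can be written out as a small check. This is a sketch: `kBranchRange` and `is_reachable` are hypothetical names, and it relies on the A64 `B`/`BL` encoding, which holds a signed 26-bit immediate counted in 4-byte units.

```cpp
#include <cassert>
#include <cstdint>

// A signed 26-bit immediate scaled by 4 gives a signed 28-bit byte
// displacement, i.e. the half-open interval [-2^27, 2^27).
constexpr int64_t kBranchRange = int64_t{1} << 27;  // 128 MiB

// Returns true if a B/BL at `branch_addr` can jump directly to
// `target_addr` without a thunk.
bool is_reachable(uint64_t branch_addr, uint64_t target_addr) {
  assert(branch_addr % 4 == 0 && target_addr % 4 == 0);
  int64_t disp = static_cast<int64_t>(target_addr - branch_addr);
  return -kBranchRange <= disp && disp < kBranchRange;
}
```

The farthest forward target is 128 MiB minus 4 bytes away, while a backward target may be exactly 128 MiB away, because the immediate is a two's complement value.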
If a branch destination is farther than that, the linker has to emit a
machine code sequence that constructs a full 64-bit address in a register
and jumps to the final destination, and redirect the branch to that code
sequence. Such a code sequence is called a "range extension thunk" or just
a "thunk".
Previously, mold didn't support range extension thunks, so it couldn't link
large programs. Linking would fail with a "relocation out of range" error.
Now, mold can create thunks and link large programs.
Thunk creation is an interesting algorithmic problem. We need to insert
a thunk at least every 128 MiB, because otherwise branch instructions
wouldn't be able to reach one. But adding an entry to a thunk enlarges
it, which can slightly increase the distance between a branch instruction
and its destination if the thunk lies between them. That could make a
previously reachable branch unreachable.
Usually, this problem is solved by an iterative algorithm. With the
iterative algorithm, a linker checks the reachability of all relocations,
creates new thunks if necessary, and repeats until no new thunks are
created.
In this patch, I implemented a different algorithm. It is guaranteed to
run in O(n) time, where n is the number of relocations. This algorithm
might be novel.
The algorithm is also quite fast in practice. It creates thunks in
80 milliseconds on a 16-core Amazon Graviton 2 machine for clang-14,
which has a ~100MB .text section.
Previously, we didn't handle version scripts like this correctly:

  ver {
    global: *;
    local: foo*;
  };

We didn't handle the `local:` part correctly unless its pattern was `*`.
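To make concrete which symbol names a pattern such as `foo*` covers, here is a minimal glob matcher that supports only the `*` wildcard. It is a sketch, not mold's matcher: the names `glob_match` and `glob_match_examples` are hypothetical, and other glob syntax (`?`, `[...]`) is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Returns true if `str` matches `pat`, where '*' in `pat` matches any
// (possibly empty) run of characters and all other characters are literal.
bool glob_match(std::string_view pat, std::string_view str) {
  constexpr std::size_t npos = std::string_view::npos;
  std::size_t p = 0, s = 0;
  std::size_t star = npos;  // position of the most recent '*' in pat
  std::size_t mark = 0;     // position in str where that '*' started
  while (s < str.size()) {
    if (p < pat.size() && pat[p] == '*') {
      star = p++;
      mark = s;
    } else if (p < pat.size() && pat[p] == str[s]) {
      p++;
      s++;
    } else if (star != npos) {
      // Mismatch: let the last '*' absorb one more character and retry.
      p = star + 1;
      s = ++mark;
    } else {
      return false;
    }
  }
  while (p < pat.size() && pat[p] == '*')
    p++;
  return p == pat.size();
}

// Usage examples for the patterns in the version script above.
inline void glob_match_examples() {
  assert(glob_match("foo*", "foobar"));
  assert(!glob_match("foo*", "bar"));
  assert(glob_match("*", "bar"));
}
```

Under these assumptions, `foobar` is covered by both `*` and `foo*`, which is why a linker needs a rule for choosing between the `global:` and `local:` entries.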
Fixes https://github.com/rui314/mold/issues/277