We only need the RMW to be atomic; we don't need the ordering guarantee wrt
other memory locations.
Might make ARM a little faster (no machine code change for x86).
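
As a minimal sketch of the idea (the `Node` type and `mark` helper are
hypothetical names for illustration, not the actual code touched here), a
visited flag can be flipped with a relaxed exchange:

```cpp
#include <atomic>

// Hypothetical node type; `visited` stands in for whatever flag is flipped.
struct Node {
  std::atomic<bool> visited{false};
};

// Returns true only for the single caller that flips the flag from false to
// true. memory_order_relaxed keeps the exchange itself atomic but imposes no
// ordering on accesses to other memory locations.
inline bool mark(Node &n) {
  return !n.visited.exchange(true, std::memory_order_relaxed);
}
```

This is only safe when no other data is published through the flag, which is
the assumption stated above.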
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Atomic RMW operations can take hundreds of cycles and tend to be massively
slower than a fence-free alternative.
Add a fast_mark function that optimistically tests if the node is already
visited first, which allows avoiding the fence penalty if we can early exit.
When testing on Chromium, this improved the associated pass's runtime by
roughly 20-30%.
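
The test-then-RMW pattern can be sketched as follows. The signature and the
return convention (true means "already visited") are assumptions made for
illustration and may differ from the real fast_mark:

```cpp
#include <atomic>

// Returns true if the node was already visited, false if this call marked it.
inline bool fast_mark(std::atomic<bool> &visited) {
  // Optimistic check: a plain relaxed load is cheap. If another thread has
  // already marked the node, exit early without issuing an atomic RMW.
  if (visited.load(std::memory_order_relaxed))
    return true;
  // Slow path: the exchange decides the winner if several threads race here.
  return visited.exchange(true, std::memory_order_relaxed);
}
```

The load alone is not enough for correctness; the exchange is still needed so
that exactly one thread sees "not yet visited" when several arrive together.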
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
We've long had a hack to de-prioritize GNU_UNIQUE symbols, due to an edge case
where these symbols could get higher resolution priority yet be eliminated in
the later COMDAT elimination stage. But since GNU_UNIQUE symbols were supposed
to have a resolution as strong as GLOBAL symbols, the hack created unsoundness
in the resolution and could have caused false duplicate symbol errors in later
stages.
The root cause is that COMDAT elimination, unlike ICF and SHF_MERGE, removes
symbols from the candidate pool and can actually affect the resolution result.
Hence we ought to do the elimination before the final resolution.
Doing so requires fully wiping the resolution instead of just the ones from
eliminated archive members. This has a performance hit that we need to keep an
eye on.
On another note, this wiping of resolution is actually justifiable. The old
algorithm did not handle the case of an archive member overriding a DSO symbol.
So wiping is the correct way even in the absence of COMDAT elimination.
One caveat is that currently the symbol resolution stage doesn't actually care
about the is_alive flag, which is how COMDAT symbols are killed. There are two
other users of this flag: mergeable sections and eh_frame. The eh_frame section
is already handled with special logic, and we can treat it as if it had no
symbols to resolve. Mergeable sections need a bit more careful handling, but a
good way is to just leave them unmerged and update the references in the same
later pass where the merge happens. This has been done in a prior commit.
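
To illustrate why the ordering matters, here is a toy model. Every name in it
(`Candidate`, `in_dead_comdat`, `resolve_live`) is hypothetical and the data
structures are far simpler than the linker's. It resolves from scratch after
elimination, so a definition in a discarded COMDAT group can never win:

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical candidate definition of a symbol.
struct Candidate {
  std::string name;
  int priority;         // lower value wins resolution
  bool in_dead_comdat;  // true if COMDAT elimination discarded its group
};

// Resolve from a wiped state: for each name, pick the lowest-priority-value
// candidate among those that survived COMDAT elimination.
inline std::unordered_map<std::string, std::size_t>
resolve_live(const std::vector<Candidate> &cands) {
  std::unordered_map<std::string, std::size_t> winner;  // name -> index
  for (std::size_t i = 0; i < cands.size(); i++) {
    const Candidate &c = cands[i];
    if (c.in_dead_comdat)
      continue;  // already removed from the candidate pool
    auto it = winner.find(c.name);
    if (it == winner.end() || c.priority < cands[it->second].priority)
      winner[c.name] = i;
  }
  return winner;
}
```

If resolution ran first and elimination second, the higher-priority candidate
could be chosen and then killed, leaving the symbol pointing at dead content.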
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
This pass is rather different in that it runs the resolve pass with lower
priority for archive symbols in order to determine extraction. Split it out to
better convey its difference.
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
We can skip the is_alive condition by doing the compaction beforehand.
This also allows cleanly splitting this function into two phases, which will
happen later.
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Make sure clear_symbols() is called on IR object files too, so that we'll never
have references to symbols inside IR object files after LTO. I can't think of
an example that causes a problem, but this is for sanity.
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
When the section-header index has the escape value `SHN_XINDEX`, the
actual index must be looked up in the separate `SHT_SYMTAB_SHNDX` table.
Trying to use `SHN_XINDEX` (= 0xffff) as an index results in an
out-of-bounds read. The error can be observed when running the
`x86_64_many-sections.sh` test on RHEL 8 or 9 (but not on Fedora,
because there the assembler doesn't emit section symbols).
Instead of using `st_shndx` directly, call the pre-existing helper
method `get_shndx()` to get the correct behaviour.
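
A simplified sketch of the lookup is shown below. The `Sym` struct and the
`lookup_shndx` function are illustrative stand-ins, not the real `get_shndx()`
helper, and the symbol record is reduced to the one field that matters:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint16_t SHN_XINDEX = 0xffff;  // escape value from the ELF spec

// Reduced symbol record; a real ELF symbol has more fields.
struct Sym {
  std::uint16_t st_shndx;
};

// `symtab_shndx` models the contents of the SHT_SYMTAB_SHNDX section: one
// 32-bit entry per symbol, parallel to the symbol table.
inline std::uint32_t lookup_shndx(const std::vector<Sym> &syms,
                                  const std::vector<std::uint32_t> &symtab_shndx,
                                  std::size_t i) {
  if (syms[i].st_shndx == SHN_XINDEX)
    return symtab_shndx[i];  // real index lives in the side table
  return syms[i].st_shndx;   // ordinary case: the field is the index
}
```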
Signed-off-by: Christoph Erhardt <github@sicherha.de>
Even if we always enable assert(), we want code to be able to compile
with assert() being expanded to a null statement. I.e., we should not
write code that has side effects in assert().
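
For example (a made-up snippet, not code from this repository), the mutation
belongs outside the assertion so that it still runs when NDEBUG is defined:

```cpp
#include <cassert>
#include <vector>

// Removes and returns the last element (the side effect).
inline int take_last(std::vector<int> &v) {
  int x = v.back();
  v.pop_back();
  return x;
}

inline int consume(std::vector<int> &v) {
  // Bad: `assert(take_last(v) >= 0);` would skip the pop under NDEBUG.
  int x = take_last(v);  // side effect happens unconditionally
  assert(x >= 0);        // the assertion only inspects the result
  return x;
}
```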
In order to add a test case for it, I added a cmake option to forcibly
disable assert().
Our assertions are cheap, and the cost of branch instructions that
are never taken is almost free thanks to hardware branch prediction.
So I don't think it's worth disabling them for release builds.
We set a value to .ARM.exidx's sh_link to appease GNU binutils'
strip command (this value is not used at runtime), but even strip
doesn't care what value is set! We set a wrong value by accident.
Fixes https://github.com/rui314/mold/issues/804
We've been treating STB_GNU_UNIQUE symbols as if they were weak.
The rationale for doing so was this:
1. GCC sometimes creates GNU-unique symbols instead of weak ones
for comdats, and
2. after COMDAT de-duplication, we used to report a duplicate symbol
error for GNU-unique symbols.
However, (2) is no longer the case because we do not report a symbol
duplication error if its section is dead.
I'm not sure if this new logic will work for all programs, but I want
to give it a shot.
https://github.com/rui314/mold/issues/524