mirror of https://github.com/rui314/mold.git synced 2024-11-09 16:05:58 +03:00
Commit Graph

5410 Commits

Author SHA1 Message Date
Rui Ueyama
d0210b85b9 Run tests before install 2022-10-24 16:19:01 +08:00
Rui Ueyama
73cdbb320d
Merge pull request #818 from ishitatsuyuki/opt
Some trivial optimizations
2022-10-24 00:37:47 -07:00
Tatsuyuki Ishi
6088d954f6 [ELF] Remove redundant lock in mark_live_objects.
There are no writes to `sym` apart from the fast_mark which is already atomic.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-10-24 14:54:57 +08:00
Tatsuyuki Ishi
20ca4e38f4 Use relaxed for update_{minimum,maximum}.
We only need the RMW to be atomic; we don't need any ordering guarantee with
respect to other memory locations.
This might make ARM a little faster (no machine code change for x86).

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-10-24 14:54:13 +08:00
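The relaxed-ordering idea in the commit above can be sketched as follows. This is a minimal illustration, not mold's actual code: the function names follow the commit subject, but the signatures and loop structure are assumptions.

```cpp
#include <atomic>
#include <cassert>

// Sketch only: signatures are assumed, not copied from mold.
// Raise `atom` to at least `val`. Only the read-modify-write itself has to
// be atomic; nothing else is published through this variable, so relaxed
// ordering is enough.
template <typename T>
void update_maximum(std::atomic<T> &atom, T val) {
  T cur = atom.load(std::memory_order_relaxed);
  // On failure compare_exchange_weak stores the latest value in `cur`,
  // so the loop condition is re-checked against fresh data.
  while (cur < val &&
         !atom.compare_exchange_weak(cur, val, std::memory_order_relaxed)) {
  }
}

// Lower `atom` to at most `val`, with the same relaxed RMW loop.
template <typename T>
void update_minimum(std::atomic<T> &atom, T val) {
  T cur = atom.load(std::memory_order_relaxed);
  while (val < cur &&
         !atom.compare_exchange_weak(cur, val, std::memory_order_relaxed)) {
  }
}
```

On x86 the relaxed and sequentially consistent versions of this loop compile to the same locked instruction, which matches the commit's note that only weaker-ordered targets such as ARM may benefit.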
Rui Ueyama
616f8b2205 Update comments 2022-10-24 13:54:39 +08:00
Rui Ueyama
4379dc4afa Prepare to distribute binary packages for arm, ppc64le and s390x 2022-10-24 13:54:39 +08:00
Rui Ueyama
c5106da010 Remove unused file 2022-10-24 11:46:27 +08:00
Rui Ueyama
0d6e373ffc Refactor 2022-10-24 11:33:58 +08:00
Rui Ueyama
6361875a15 Revert "[ELF] Add --start-stop"
This reverts commit 1cfa4a93ae because
it was committed by accident.
2022-10-24 07:29:17 +08:00
Rui Ueyama
9c7970f0af Add a test 2022-10-24 07:16:27 +08:00
Rui Ueyama
59ea52933d [ELF] Honor section alignment when creating copy relocations 2022-10-24 06:58:03 +08:00
Rui Ueyama
4fa41e1ad4 Fix -Wdeprecated-declarations
Fixes https://github.com/rui314/mold/issues/816
2022-10-24 06:28:13 +08:00
Rui Ueyama
ce35e19c6d
Merge pull request #815 from polluks/patch-1 2022-10-23 12:39:57 -07:00
Stefan
8643520464
Fixed typos 2022-10-23 20:31:29 +02:00
Rui Ueyama
d01deafff0
Merge pull request #810 from ishitatsuyuki/comdat-early 2022-10-23 05:24:48 -07:00
Rui Ueyama
8f74e48dfb
Merge pull request #814 from ishitatsuyuki/fast-mark
Add an optimized mark function for faster GC passes.
2022-10-23 04:18:35 -07:00
Tatsuyuki Ishi
59a713fb4c Add an optimized mark function for faster GC passes.
Atomic RMW operations can take hundreds of cycles and tend to be massively
slower than a fence-free alternative.

Add a fast_mark function that optimistically tests if the node is already
visited first, which allows avoiding the fence penalty if we can early exit.

When testing on Chromium, this roughly improved the associated pass's runtime
by 20--30%.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-10-23 18:19:32 +08:00
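The optimistic test described in the commit above is the classic "test, then test-and-set" pattern. A minimal sketch follows; the name `fast_mark` comes from the commit message, while the signature and the use of `std::atomic<bool>` are assumptions.

```cpp
#include <atomic>
#include <cassert>

// Sketch only: the signature is assumed, not copied from mold.
// Returns true if this call is the one that marked the node as visited.
inline bool fast_mark(std::atomic<bool> &visited) {
  // Cheap path: a relaxed load. If another thread already marked the node,
  // we exit without paying for an atomic read-modify-write.
  if (visited.load(std::memory_order_relaxed))
    return false;
  // Slow path: the exchange decides which of the racing threads wins.
  return !visited.exchange(true);
}
```

A GC worker would push a node onto its work list only when `fast_mark` returns true, so each node is processed exactly once even when several threads reach it concurrently.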
Rui Ueyama
626619a8b0 Enable transparent huge pages for output file if THP is available 2022-10-23 18:17:47 +08:00
Rui Ueyama
585eae2030 Fix CI 2022-10-23 17:29:57 +08:00
Tatsuyuki Ishi
730e970e79 [ELF] Run COMDAT elimination inside resolve_symbols.
We've long had a hack to de-prioritize GNU_UNIQUE symbols, due to an edge case
where these symbols could get higher resolution priority yet be eliminated in
the later COMDAT elimination stage. But since GNU_UNIQUE symbols were supposed
to have a resolution as strong as GLOBAL symbols, it created unsoundness in the
resolution and could have caused false duplicate symbol errors in later stages.

The root cause is that COMDAT elimination, unlike ICF and SHF_MERGE, removes
symbols from the candidate pool and can actually affect the resolution result.
Hence we ought to do the elimination before the final resolution.

Doing so requires fully wiping the resolution instead of just the ones from
eliminated archive members. This has a performance hit that we need to keep an
eye on.

On another note, wiping the resolution is actually justifiable. The old
algorithm did not handle the case of an archive member overriding a DSO symbol.
So wiping is the correct way even in the absence of COMDAT elimination.

One caveat is that currently the symbol resolution stage doesn't actually care
about the is_alive flag which is how COMDAT symbols are killed. There are two
other users of this flag: mergeable sections, and eh_frame. The eh_frame section
is already handled with special logic, and we can treat it as if it had no
symbols to resolve. Mergeable sections need a bit more careful handling, but a
good way is to just leave them unmerged and update the references in the same
later pass where merge happens. This has been done in a prior commit.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-10-23 14:27:26 +08:00
Tatsuyuki Ishi
26c503ad30 [ELF] Defer initialize_mergeable_sections.
To allow symbol resolution to skip !is_alive sections.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-10-23 14:18:28 +08:00
Tatsuyuki Ishi
1a99724eca [ELF] Split out 1st pass of symbol resolution as finalize_archive_extraction.
This pass is rather different in that it runs the resolve pass with lower
priority for archive symbols in order to determine extraction. Split it out to
better convey its difference.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-10-23 13:53:03 +08:00
Tatsuyuki Ishi
03dad9ba14 [ELF] Simplify do_resolve_symbols.
We can skip the is_alive condition by doing the compaction beforehand.

This also allows splitting this function into two phases clearly which will
happen later.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-10-23 13:05:38 +08:00
Tatsuyuki Ishi
e7a7f9e39a [ELF] Ensure all symbols from LTO objects are cleared.
Make sure clear_symbols() is called on IR object files too, so that we never
have references to symbols inside IR object files after LTO. I can't think of
an example where this causes a problem, but it is a useful sanity measure.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-10-23 13:00:44 +08:00
Rui Ueyama
e228abfd5a [ELF] Do not include non-memory-mapped sections in build-id computation 2022-10-23 12:41:59 +08:00
Rui Ueyama
61f28d270c Fix CI 2022-10-23 12:24:58 +08:00
Rui Ueyama
9535e080e9 Simplify 2022-10-23 09:31:19 +08:00
Rui Ueyama
f9750d163d
Merge pull request #809 from sicherha/fix_SHN_XINDEX_out_of_bounds
Fix name lookup for section symbols when `st_shndx == SHN_XINDEX`
2022-10-22 18:15:55 -07:00
Christoph Erhardt
f8cb32e59a Fix name lookup for section symbols when st_shndx == SHN_XINDEX
When the section-header index has the escape value `SHN_XINDEX`, the
actual index must be looked up in the separate `SHT_SYMTAB_SHNDX` table.
Trying to use `SHN_XINDEX` (= 0xffff) as an index results in an
out-of-bounds read. The error can be observed when running the
`x86_64_many-sections.sh` test on RHEL 8 or 9 (but not on Fedora,
because there the assembler doesn't emit section symbols).

Instead of using `st_shndx` directly, call the pre-existing helper
method `get_shndx()` to get the correct behaviour.

Signed-off-by: Christoph Erhardt <github@sicherha.de>
2022-10-22 14:41:42 +02:00
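The lookup rule described in the commit above can be sketched as follows. The types and the helper name `section_index` are hypothetical stand-ins for illustration; mold's real `get_shndx()` has its own signature, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Escape value meaning "the real index lives in SHT_SYMTAB_SHNDX".
// Mirrors the ELF constant SHN_XINDEX (0xffff).
constexpr uint16_t kShnXindex = 0xffff;

// Hypothetical, trimmed-down symbol record: only the field we need.
struct Sym {
  uint16_t st_shndx;
};

// Hypothetical helper. `symtab_shndx` stands for the contents of the
// SHT_SYMTAB_SHNDX section, which holds one 32-bit entry per symbol.
uint32_t section_index(const Sym &sym, std::size_t sym_idx,
                       const std::vector<uint32_t> &symtab_shndx) {
  // Never use the escape value itself as an index: that is the
  // out-of-bounds read the commit fixes.
  if (sym.st_shndx == kShnXindex)
    return symtab_shndx[sym_idx];
  return sym.st_shndx;
}
```

The 16-bit `st_shndx` field cannot address more than about 65,000 sections, which is why a test with many sections is what exposes the bug.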
Rui Ueyama
4508d93d08
Merge pull request #808 from ishitatsuyuki/refactor 2022-10-22 05:21:16 -07:00
Tatsuyuki Ishi
fb4b67ae18 [ELF] Move ElfSym memset to inside of loop.
Should eliminate redundant assignments and improve memory locality.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-10-22 20:16:54 +09:00
Rui Ueyama
1e98ec0b38
Merge pull request #807 from ishitatsuyuki/symtab-garbage
[ELF] Fix uninitialized ElfSym paddings
2022-10-22 03:58:07 -07:00
Tatsuyuki Ishi
0b50ebda47 [ELF] Fix uninitialized ElfSym paddings
Designated initializers don't initialize the padding; make sure we memset
it ourselves.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-10-22 19:37:35 +09:00
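A minimal sketch of the padding problem described above, using a hypothetical `Rec` struct rather than mold's `ElfSym`: zeroing the destination first and then copying each field into place makes every output byte deterministic, including the padding between fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical record with padding after `tag` on typical ABIs.
struct Rec {
  uint8_t tag;
  uint32_t value;
};

// Serialize a record into `buf`, which must hold sizeof(Rec) bytes.
void write_rec(uint8_t *buf, uint8_t tag, uint32_t value) {
  // Zero the whole record first so the padding bytes are well defined.
  std::memset(buf, 0, sizeof(Rec));
  // Then copy each field to its offset; padding bytes stay zero.
  std::memcpy(buf + offsetof(Rec, tag), &tag, sizeof(tag));
  std::memcpy(buf + offsetof(Rec, value), &value, sizeof(value));
}
```

Without the `memset`, whatever bytes were already in the buffer would leak into the output file, which makes builds non-reproducible.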
Rui Ueyama
5469f5ac1b Revert "Always enable assert()"
It turned out that what I wrote as the commit message for
800fcdc5d4 is wrong. mold is 5% slower
for linking clang with assertions enabled.
2022-10-22 16:24:22 +08:00
Rui Ueyama
068ce4654f Add a CI for a non-assert build
Even if we always enable assert(), we want code to be able to compile
with assert() being expanded to a null statement. I.e., we should not
write code that has side effects in assert().

In order to add a test case for it, I added a cmake option to force
disable assert().
2022-10-22 16:03:45 +08:00
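The rule about side effects can be illustrated with a short sketch. The function `add_unique` is hypothetical and not part of mold; it only shows the safe pattern of performing the operation outside `assert()` and checking its result inside.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper. The insertion happens unconditionally; only the
// check is inside assert(), so a build with NDEBUG still inserts.
void add_unique(std::set<int> &s, int x) {
  // Wrong: assert(s.insert(x).second);
  // With NDEBUG defined, the whole expression, insertion included, vanishes.
  bool inserted = s.insert(x).second;
  assert(inserted);
  (void)inserted;  // silence the unused-variable warning in NDEBUG builds
}
```

Compiling the same code with and without `-DNDEBUG`, as the CI job described above does, is what catches the wrong variant.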
Rui Ueyama
800fcdc5d4 Always enable assert()
Our assertions are cheap, and the cost of branch instructions that
are never taken is almost zero thanks to hardware branch prediction.
So I don't think it's worth disabling them for release builds.
2022-10-22 15:21:08 +08:00
Rui Ueyama
0d7a5a7284 Create config.h 2022-10-22 14:40:47 +08:00
Rui Ueyama
7db72a871c Remove Makefile
It's been deprecated for some time.
2022-10-22 14:40:47 +08:00
Rui Ueyama
9842388cd0 Do not halt even if there's a big input text section 2022-10-22 13:24:49 +08:00
Rui Ueyama
29c4403ec4 Refactor 2022-10-22 13:03:25 +08:00
Rui Ueyama
74124636dc [ELF] Create fewer range extension thunks 2022-10-22 12:32:32 +08:00
Rui Ueyama
193c245781 Refactor 2022-10-22 12:22:57 +08:00
Rui Ueyama
5cdc761746 Print out the number of bytes occupied by range extension thunks 2022-10-22 11:01:27 +08:00
Rui Ueyama
d35d3922f0 Fix typo
We set a value to .ARM.exidx's sh_link to appease GNU binutils'
strip command (this value is not used at runtime), but even strip
doesn't care what value is set! We set a wrong value by accident.

Fixes https://github.com/rui314/mold/issues/804
2022-10-22 08:24:00 +08:00
Rui Ueyama
c4850273c7 [ELF] Handle GNU-unique symbols as if they were weak
https://github.com/rui314/mold/issues/524
2022-10-21 20:58:17 +08:00
Rui Ueyama
86f3130e44 [ELF] Remove the special logic for STB_GNU_UNIQUE
We've been treating STB_GNU_UNIQUE symbols as if they were weak.
The rationale for doing so is this:

1. GCC sometimes creates GNU-unique symbols instead of weak ones
   for comdats, and

2. after COMDAT de-duplication, we used to report a duplicate symbol
   error for GNU-unique symbols

However, (2) is no longer the case because we do not report a symbol
duplication error if its section is dead.

I'm not sure if this new logic will work for all programs, but I want
to give it a shot.

https://github.com/rui314/mold/issues/524
2022-10-21 17:15:51 +08:00
Rui Ueyama
2dc1fecb02 Update comments 2022-10-21 17:15:51 +08:00
Rui Ueyama
33e3a75a79
Merge pull request #802 from ZhongRuoyu/disable-subprocess-cc-on-macos 2022-10-20 23:14:45 -07:00
Zhong Ruoyu
3fe46175f3
Do not build subprocess.cc on macOS
Signed-off-by: Zhong Ruoyu <zhongruoyu@outlook.com>
2022-10-21 12:44:27 +08:00
Rui Ueyama
861bccce72 [ELF] Do not emit non-memory-allocated sections if --oformat=binary 2022-10-21 08:24:15 +08:00