Almost all functions and data types in mold are templates that are
parameterized on the target type (e.g. X86_64 or ARM32). I'm generally
happy with that design, but there's one drawback: as we add more
targets, compilation gets slower.
We now support more than 10 targets, so the compiler has to do 10x
more work than it originally did. As a result, output-chunks.cc (one
of our largest files) took more than 30 seconds to compile on a
Threadripper machine. I hesitated to add support for a new target
because of this problem. We needed to fix it.
This commit tweaks the build system configs so that the build system
generates a bunch of intermediate .cc files. Each generated .cc file
instantiates an original .cc file for one target.
This hack doesn't change the total amount of computation required,
but now we can parallelize the compilation step. That works well on a
modern multi-core machine.
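The per-target instantiation described above can be sketched roughly
as follows. This is a minimal single-file illustration, not the actual
generated code: `header_size`, the `word_size` members and the
`INSTANTIATE` macro are hypothetical names. In the real setup, each
explicit instantiation would sit in its own generated .cc file, so that
the build system can compile the targets in parallel.

```cpp
#include <cassert>

// Hypothetical stand-ins for mold's target types.
struct X86_64 { static constexpr int word_size = 8; };
struct ARM32  { static constexpr int word_size = 4; };

// What a shared source file might contain: a function template
// parameterized on the target type E.
template <typename E>
int header_size(int num_entries) {
  assert(num_entries >= 0);
  return num_entries * E::word_size;
}

// Explicit instantiation for exactly one target. A generated per-target
// .cc file would contain only one of these, after including the shared
// source. Here the generated files are simulated in a single file.
#define INSTANTIATE(TARGET) template int header_size<TARGET>(int);

INSTANTIATE(X86_64)  // would live in the generated file for x86-64
INSTANTIATE(ARM32)   // would live in the generated file for ARM32
```

Because each translation unit instantiates the templates for a single
target, no one compiler invocation has to do the work for every target.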
I generally don't like to tweak the build system to work around a
tool's issue (that's why I'm creating mold in the first place), but it
looks like there's no other way to speed up compilation. Or maybe we
can write a faster C++ compiler someday, but that's another topic.
Previously, we detected whether a given linker plugin supported
get_symbols_v3 and restarted the process if only get_symbols_v2 was
supported. Now, mold restarts itself when get_symbols_v2 is called for
the first time, eliminating the need for the feature detection.
https://github.com/rui314/mold/issues/454
Previously, if a symbol in a non-IR object was weak or common and
was resolved to an IR symbol, mold internalized that symbol. That
resulted in two separate definitions of the same symbol and caused a
mysterious error.
Fixed https://github.com/rui314/mold/issues/479
mold is usually built for all supported targets, namely, x86-64, i386,
ARM32, ARM64 and RISCV64. This is a good thing because it makes cross
compilation easy. That is, as long as you have a copy of the mold linker,
it is guaranteed to work as a cross linker.
However, during a quick debug session in which you build mold many
times, you may want to build mold only for your native target. That
greatly reduces build time because it reduces the amount of code
after template instantiation.
Therefore, in this commit, I introduced new macros,
MOLD_DEBUG_X86_64_ONLY and MOLD_DEBUG_ARM64_ONLY, to build mold for
x86-64 and ARM64 only, respectively.
These flags should never be used for production. They are solely for
debugging purposes.
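As a rough illustration of how such macros can narrow the set of
instantiated targets, here is a minimal sketch. Only
MOLD_DEBUG_X86_64_ONLY and MOLD_DEBUG_ARM64_ONLY come from this commit;
`ENABLED_TARGETS`, `names_of` and `enabled_targets` are hypothetical
names made up for the example, and the real code may be organized
differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for mold's target types.
struct X86_64  { static constexpr const char *name = "x86_64"; };
struct I386    { static constexpr const char *name = "i386"; };
struct ARM32   { static constexpr const char *name = "arm32"; };
struct ARM64   { static constexpr const char *name = "arm64"; };
struct RISCV64 { static constexpr const char *name = "riscv64"; };

// Pick the list of targets to instantiate. ENABLED_TARGETS is a
// hypothetical helper macro, not a name taken from the mold source.
#if defined(MOLD_DEBUG_X86_64_ONLY)
#  define ENABLED_TARGETS X86_64
#elif defined(MOLD_DEBUG_ARM64_ONLY)
#  define ENABLED_TARGETS ARM64
#else
#  define ENABLED_TARGETS X86_64, I386, ARM32, ARM64, RISCV64
#endif

// Collects the names of the given target types.
template <typename... Targets>
std::vector<std::string> names_of() {
  return std::vector<std::string>{std::string(Targets::name)...};
}

// Returns the names of all targets this build was compiled for.
std::vector<std::string> enabled_targets() {
  std::vector<std::string> names = names_of<ENABLED_TARGETS>();
  assert(!names.empty());
  return names;
}
```

Compiling this with `-DMOLD_DEBUG_X86_64_ONLY` leaves a single target
in the list, so the templates are instantiated once instead of once per
supported target.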
This commit ensures that all symbol resolution results are cleared
after LTO, so that the following second symbol resolution will not
be affected by the previous results.
ELF object files returned by do_lto() may contain unforeseen undefined
symbols, so we can't call that function after the symbol resolution pass
because it can cause undefined symbol errors.
This reverts commit ddcb7b4197. `is_dso`
is used by hot functions such as `Symbol::get_addr()`, so we want to
eliminate the cost of virtual function dispatch.
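The difference can be sketched as follows, assuming the reverted commit
had made `is_dso` a virtual function (an assumption; the commit itself
is not shown here). The classes below are simplified, hypothetical
stand-ins and not mold's real definitions. With a plain data member,
reading `is_dso` is an ordinary load instead of an indirect call
through the vtable.

```cpp
#include <cassert>

// Simplified, hypothetical stand-in for an input file base class.
struct InputFile {
  explicit InputFile(bool dso) : is_dso(dso) {}
  virtual ~InputFile() = default;

  // Plain data member set once at construction. Reading it needs no
  // virtual dispatch, unlike `virtual bool is_dso() const`.
  const bool is_dso;
};

struct ObjectFile : InputFile {
  ObjectFile() : InputFile(false) {}
};

struct SharedFile : InputFile {
  SharedFile() : InputFile(true) {}
};

// Stand-in for a hot path that checks is_dso for many files.
int count_dsos(InputFile *const *files, int n) {
  assert(n >= 0);
  int count = 0;
  for (int i = 0; i < n; i++)
    if (files[i]->is_dso)
      count++;
  return count;
}
```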
read_lto_object is called on each archive member, so if an archive file
contains lots of members, it can exhaust the pool of file descriptors.
This patch fixes the issue by opening an archive only once.
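One way to open each archive only once is to cache the file descriptor
by path, as in the minimal sketch below. This is not the actual patch:
`FdCache` and its injected `opener` callback are hypothetical, and the
callback stands in for whatever really opens the file.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cache that hands out one file descriptor per path.
class FdCache {
public:
  // `opener` is an injected stand-in for the real open call.
  explicit FdCache(std::function<int(const std::string &)> opener)
      : opener(std::move(opener)) {}

  // Returns the cached descriptor for `path`, opening the file only
  // on the first request.
  int get(const std::string &path) {
    auto it = fds.find(path);
    if (it != fds.end())
      return it->second;
    int fd = opener(path);
    assert(fd >= 0);
    fds.emplace(path, fd);
    return fd;
  }

  // Number of distinct files opened so far.
  std::size_t size() const { return fds.size(); }

private:
  std::function<int(const std::string &)> opener;
  std::unordered_map<std::string, int> fds;
};
```

With a cache like this, the number of open descriptors grows with the
number of archives rather than with the number of archive members.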
Since compiled IR objects may contain mergeable string sections, we
don't know the estimated number of items in output mergeable sections.
Previously, we created the concurrent hash tables for output mergeable
sections before compiling IR objects, so there was a chance to see
the "concurrent hash table full" error.
This commit moves the code to add section pieces to output mergeable
sections after `do_lto`.
The LTO plugin API support is still in progress, but with this change,
mold can link itself with `-flto` with both GCC and Clang.
Since mold now supports LTO natively, I removed the fallback mechanism
to ld.bfd or ld.lld that I implemented in a5029d19a8.
Fixes https://github.com/rui314/mold/issues/181