1
1
mirror of https://github.com/rui314/mold.git synced 2024-12-28 19:04:27 +03:00
mold/docs/mold.1
Rui Ueyama 60fee59742 Ignore --no-undefined-version
Gentoo's sys-boot/efibootmgr package uses this flag.
2021-05-29 17:07:54 +09:00

617 lines
21 KiB
Groff

.TH MOLD 1
.SH NAME
mold \- A Modern Linker
.SH SYNOPSIS
mold [\fBoptions\fR] \fIobjfile\fR \fB...\fR
.SH DESCRIPTION
\fBmold\fR is a multi-threaded, high-performance linker that is
several times faster than the industry-standard ones, namely, GNU ld,
GNU gold or LLVM lld. It is developed as a drop-in replacement for
these linkers and command-line compatible with them with a few
exceptions.
.PP
Note that even though \fBmold\fR can build many userland programs,
including large ones such as Chrome and Firefox, it is still very
experimental. Do not use \fBmold\fR for production unless you know
what you are doing. \fBmold\fR supports Linux/x86-64 and Linux/i386.
.SS "How to use mold"
On Unix, \fBcc\fR (or \fBgcc\fR or \fBclang\fR, depending on what
compiler you use) is not just a C compiler but is actually a compiler
driver for various kinds of input files. \fBcc\fR dispatches based
on input file extensions. If you pass object files (\fB.o\fR files) to
a compiler driver, the linker command, which is usually installed as
\fB/usr/bin/ld\fR, is invoked as a backend.
.PP
To use an alternative linker, you have to modify a build config file
such as \fBMakefile\fR or \fBcargo.toml\fR to pass an appropriate
option to a compiler driver which invokes a linker. The problem is
that it is not always clear how to do that. In a complicated build
system, it can be very hard to figure out how to convince a system to
use a non-standard linker.
.PP
To deal with the situation, \fBmold\fR provides a convenient option to
force any build system to use \fBmold\fR. To use the feature, invoke
\fBmake\fR or a build command of your choice (such as \fBcargo
build\fR) as follows:
.PP
.Vb 1
\& mold \-run make <make arguments if any>
.Ve
.PP
If you run a build command like above, the command runs under the
influence of \fBmold\fR so that when the command tries to run
\fB/usr/bin/ld\fR, \fB/usr/bin/ld.gold\fR or \fB/usr/bin/ld.lld\fR,
\fBmold\fR is silently invoked instead.
.PP
Internally, \fBmold\fR invokes a given command with the
\fBLD_PRELOAD\fR environment variable set to its companion shared
object file. The shared object file intercepts all \fBexec()\fR-family
functions to run \fBmold\fR if other linker is attempted to be run.
.PP
If you don't want to use the \fB\-run\fR option, and if you are using
\fBclang\fR, you can pass \fB\-fuse\-ld\fR=\fI/absolute/path/to/mold\fR to
\fBclang\fR to used \fBmold\fR. If you are using \fBgcc\fR, it looks
like there's unfortunately no easy way (other than \fB\-run\fR) to
force it to use \fBmold\fR, as \fBgcc\fR doesn't take an arbitrary
pathname as an argument for \fB\-fuse\-ld\fR.
.PP
I do not recommend installing \fBmold\fR as \fB/usr/bin/ld\fR
unless you know what you are doing.
.SS Compatibility
\fBmold\fR is designed to be a drop-in replacement for the GNU linkers
for linking user-land programs. If your user-land program cannot be
built due to missing command-line options, please file a bug at
\fBhttps://github.com/rui314/mold/issues\fR.
.PP
\fBmold\fR supports a very limited set of linker script features,
which is just sufficient to read
\fB/usr/lib/x86_64-linux-gnu/libc.so\fR on Linux systems (on Linux,
that file is despite its name not a shared library but an ASCII linker
script that loads a real \fBlibc.so\fR file!)
Beyond that, I have no plan to support any linker script features.
The linker script is an ad-hoc, over-designed, complex language which
I believe needs to be disrupted by a simpler mechanism. I have a plan
to add a replacement for the linker script to \fBmold\fR instead.
.SS Archive symbol resolution
Traditionally, Unix linkers are sensitive to the order in which input
files appear on command line. They process input files from the first
(leftmost) file to the last (rightmost) file one by one. While reading
input files, they maintain sets of defined and undefined symbols.
When visiting an archive file (\fB.a\fR files), they pull out object
files to resolve as many undefined symbols as possible and go on to
the next input file. Object files that weren't pulled out will never
have a chance for a second look.
.PP
Due to this semantics, you usually have to add archive files at the end
of a command line, so that when a linker reaches archive files, it
knows what symbols are remain undefined. If you put archive files at
the beginning of a command line, a linker doesn't have any undefined
symbol, and thus no object files will be pulled out from archives.
.PP
You can change the processing order by \fB\-\-start\-group\fR and
\fB\-\-end\-group\fR options, though they make a linker slower.
.PP
\fBmold\fR as well as LLVM \fBlld\fR linker take a different
approach. They memorize what symbols can be resolved from archive
files instead of forgetting it after processing each
archive. Therefore, \fBmold\fR and \fBlld\fR can "go back" in a
command line to pull out object files from archives, if they are
needed to resolve remaining undefined symbols. They are not sensitive
to the input file order.
.PP
\fB\-\-start\-group\fR and \fB\-\-end\-group\fR are still accepted
by \fBmold\fR and \fBlld\fR for compatibility with traditional linkers,
but they are silently ignored.
.SS Dynamic symbol resolution
Some of Unix linker's features are unable to be understood without
understanding the semantics of dynamic symbol resolution. Therefore,
even though that's not specific to \fBmold\fR, we'll explain it here.
.PP
We use "ELF module" or just "module" as a collective term to refer an
executable or a shared library file in the ELF format.
.PP
An ELF module may have lists of imported symbols and exported symbols,
as well as a list of shared library names from which imported symbols
should be imported. The point is that imported symbols are not bound
to any specific shared library until runtime. This is contrary to
systems such as Windows that have the two-level namespace for dynamic
symbols. On Windows, for example, dynamic symbols are represented as a
tuple of (symbol name, shared library name), so that each dynamic
symbol is guaranteed to be resolved from its corresponding library.
.PP
Here is how the Unix dynamic linker resolves dynamic symbols. Upon the
start of an ELF program, the dynamic linker construct a list of ELF
modules which as a whole consists of a complete program. The
executable file is always at the beginning of the list, which is
followed by its depending shared libraries. An imported symbol is
searched from the beginning of the list to the end. If two or more
modules define the same symbol, the one that appears first in the list
takes precedence over the others.
.PP
Typically, a module that exports a symbol also imports the same
symbol. Such symbol is usually resolved to itself, but that's not the
case if a module that appears before in the symbol search list
provides another definition of the same symbol.
.PP
Let me take \fBmalloc\fR as an example. Assume that you define your
version of \fBmalloc\fR in your main executable file. Then, all
\fBmalloc\fR calls from any module are resolved to your function
instead of the one in libc, because the executable is at the beginning
of the dynamic symbol search list. Note that even \fBmalloc\fR calls
within libc are resolved to your definition since libc exports and
imports \fBmalloc\fR. Therefore, by defining \fBmalloc\fR yourself,
You can overwrite a library function, and the \fBmalloc\fR in libc
becomes dead code.
.PP
This Unix's semantics is tricky and sometimes considered harmful. For
example, assume that you accidentally define \fBatoi\fR as a global
function in your executable that behaves completely different from the
one in the C standard. Then, all \fBatoi\fR function calls from any
modules (even function calls within libc) are redirected to your
function instead of the one in libc which obviously causes a problem.
That is a somewhat surprising consequence for an accidental name
conflict. On the other hand, this semantics is sometimes considered
useful because it allows users to overwrite library functions without
recompiling modules containing them. Whether good or bad, you should
this semantics in your mind to understand Unix linkers behaviors.
.SS Build reproducibility
\fBmold\fR's output is deterministic. That is, if you pass the same
object files and the same command line options to \fBmold\fR, it is
guaranteed that the output is always bit-wise the same. The linker's
internal randomness, such as the timing of thread scheduling or
iteration orders of hash tables, doesn't affect the output.
.PP
\fBmold\fR does not have any host-specific default settings. This is
contrary to the GNU linkers to which some configurable values, such as
system-dependent library search paths, are hard-coded. Therefore,
\fBmold\fR depends only on its command-line arguments.
.SH OPTIONS
.IP "\fB\-\-help\fR"
Report usage information to stdout and exit.
.IP "\fB\-v\fR"
.PD 0
.IP "\fB\-\-version\fR"
.PD
Report version information to stdout.
.IP "\fB\-V\fR"
Report version and target information to stdout.
.IP "\fB\-C\fR \fIdir\fR"
.PD 0
.IP "\fB\-\-directory\fR \fIdir\fR"
.PD
Change to \fIdir\fR before doing anything.
.IP "\fB\-E\fR"
.PD 0
.IP "\fB\-\-export\-dynamic\fR"
.IP "\fB\-\-no\-export\-dynamic\fR"
.PD
When creating an executable, using \fB\-E\fI option causes all global
symbols to be put into the dynamic symbol table, so that the symbols
are visible from other ELF modules at runtime.
.PP
By default, or if \fB\-\-no\-export\-dynamic\fR is given, only symbols
that are referenced by DSOs at link-time are exported from an executable.
.IP "\fB\-F\fR \fIlibname\fR"
.PD 0
.IP "\fB\-\-filter\fR=\fIlibname\fR"
.PD
Set the \fBDT_FILTER\fR dynamic section field to \fIlibname\fR.
.IP "\fB\-I\fR\fIfile\fR"
.PD 0
.IP "\fB\-\-dynamic\-linker\fR=\fIfile\fR"
.PD 0
.IP "\fB\-\-no\-dynamic\-linker\fR"
.PD
Set the dynamic linker path to \fIfile\fR. If no \fB-I\fR option is
given, or if \fB\-\-no\-dynamic\-linker\fR is given, no dynamic linker
path is set to an output file. This is contrary to the GNU linkers
which sets a default dynamic linker path in that case.
However, this difference doesn't usually make any difference because
the compiler driver always passes \fB-I\fR to a linker.
.IP "\fB\-L\fR\fIdir\fR"
.PD 0
.IP "\fB\-\-library\-path\fR=\fIdir\fR"
.PD
Add \fIdir\fR to the list of library search paths from which
\fBmold\fR searches libraries for the \fB-l\fR option.
Unlike the GNU linkers, \fBmold\fR does not have the default search
paths. This difference doesn't usually make any difference because the
compiler driver always passes all necessary search paths to a linker.
.IP "\fB\-M\fR"
.PD 0
.IP "\fB\-\-print\-map\fR"
.PD
Write a map file to stdout.
.IP "\fB\-N\fR"
.PD 0
.IP "\fB\-\-omagic\fR"
.IP "\fB\-\-no\-omagic\fR"
.PD
Force \fBmold\fR to emit an output file with an old-fashioned memory
layout. First, it makes the first data segment to not be aligned to a
page boundary. Second, text segments are marked as writable if the
option is given.
.RE
.IP "\fB\-S\fR"
.PD 0
.IP "\fB\-\-strip\-debug\fR"
.PD
Omit \fB.debug_*\fR sections from the output file.
.IP "\fB\-T\fR \fIfile\fR"
.PD 0
.IP "\fB\-\-script\fR=\fIfile\fR"
.PD
Read linker script from \fIfile\fR.
.IP "\fB\-X\fR"
.PD 0
.IP "\fB\-\-discard\-locals\fR"
.PD
Discard temporary local symbols to reduce the sizes of the symbol
table and the string table. Temporary local symbols are local symbols
starting with \fB.L\fR. Compilers usually generate such symbols for
unnamed program elements such as string literals or floating-point
literals.
.IP "\fB\-e \fR\fIsymbol\fR"
.PD 0
.IP "\fB\-\-entry\fR=\fIsymbol\fR"
.PD
Use \fIsymbol\fR as the entry point symbol instead of the default
entry point symbol \fB_start\fR.
.IP "\fB\-f\fR \fIshlib\fR"
.PD 0
.IP "\fB\-\-auxiliary\fR=\fIshlib\fR"
.PD
Set the \fBDT_AUXILIARY\fR dynamic section field to \fIshlib\fR.
.IP "\fB\-h\fR \fIlibname\fR"
.PD 0
.IP "\fB\-\-soname\fR\fIlibname\fR"
.PD
Set the \fBDT_SONAME\fR dynamic section field to \fIlibname\fR. This
option is used when creating a shared object file. Typically, when you
create lib\fIfoo\fR.so, you want to pass \fB\-\-soname\fR=\fIfoo\fR to
a linker.
.IP "\fB\-l\fR\fIlibname\fR"
Search for lib\fIlibname\fR.so or lib\fIlibname\fR.a from library
search paths.
.IP "\fB\-m\fR \fI[\fIelf_x86_64\fR,\fIelf_i386\fR\fR]"
.PD 0
.PD
Choose a target.
.IP "\fB\-o\fR \fIfile\fR"
.PD 0
.IP "\fB\-\-output\fR=\fIfile\fR"
.PD
Use \fIfile\fR as the output file name instead of the default name
\fBa.out\fR.
.IP "\fB\-s\fR"
.PD 0
.IP "\fB\-\-strip\-all\fR"
.PD
Omit \fB.symtab\fR section from the output file.
.IP "\fB\-u\fR \fIsymbol\fR"
.PD 0
.IP "\fB\-\-undefined\fR=\fIsymbol\fR"
.PD
If \fIsymbol\fR remains as an undefined symbol after reading all
object files, and if there is an static archive that contains an
object file defining \fIsymbol\fR, pull out the object file and link
it so that the output file contains a definition of \fIsymbol\fR.
.IP "\fB\-\-Bdynamic\fR"
Link against shared libraries.
.IP "\fB\-\-Bstatic\fR"
Do not link against shared libraries.
.IP "\fB\-\-Bsymbolic\fR"
When creating a shared library, make global symbols export-only
(i.e. do not import the same symbol). As a result, references within a
shared library is always resolved locally, negating symbol override at
runtime. See \fB\s-1Dynamic symbol resolution\s0\fR for more
information about symbol imports and exports.
.IP "\fB\-\-Bsymbolic\-functions\fR"
Have the same effect as \fB\-\-Bsymbolic\fR but works only for
function symbols. Data symbols remains both imported and exported.
.IP "\fB\-\-Map\fR=\fIfile\fR"
Write map file to \fIfile\fR.
.IP "\fB\-\-as\-needed\fR"
.PD 0
.IP "\fB\-\-no\-as\-needed\fR"
.PD
By default, shared libraries given to a linker are unconditionally
added to the list of required libraries in an output file. However,
shared libraries after \fB\-\-as\-needed\fR are added to the list only
when at least one symbol is actually used by an object file. In other
words, shared libraries after \fB\-\-as\-needed\fR are not added to the
list if they are not needed by a program.
The \fB\-\-no\-as\-needed\fR option restores the default behavior
for subsequent files.
.IP "\fB\-\-build\-id\fR"
.PD 0
.IP "\fB\-\-build\-id\fR=[\fInone\fR,\fImd5\fR,\fIsha1\fR,\fIsha256\fR,\fIuuid\fR,0x\fIhexstring\fR]"
.IP "\fB\-\-no\-build\-id\fR"
.PD
Create a \fB.note.gnu.build-id\fR section containing a byte string to
uniquely identify an output file. \fB\-\-build\-id\fR and
\fB\-\-build\-id=sha256\fR compute a 256 bits cryptographic hash of an
output file and set it to build-id. \fBmd5\fR and \fBsha1\fR compute
the same hash but truncate it to 128 and 160 bits, respectively,
before setting it to build-id. \fBuuid\fR sets a random 128 bits UUID.
0x\fIhexstring\fR sets \fIhexstring\fR.
.IP "\fB\-\-chroot\fR=\fIdir\fR"
Set \fIdir\fR to root directory.
.IP "\fB\-\-compress\-debug\-sections\fR=[\fInone\fR,\fIzlib\fR,\fIzlib\-gabi\fR]"
Compress DWARF debug info (\fB.debug_*\fR sections) using the zlib
compression algorithm.
.IP "\fB\-\-demangle\fR"
.PD 0
.IP "\fB\-\-no\-demangle\fR"
.PD
Demangle C++ symbols in log messages.
.IP "\fB\-\-dynamic\-list\fR=\fIfile\fR"
Read a list of dynamic symbols from \fIfile\fR.
.IP "\fB\-\-eh\-frame\-hdr\fR"
.PD 0
.IP "\fB\-\-no\-eh\-frame\-hdr\fR"
.PD
Create \fB.eh_frame_hdr\fR section.
.IP "\fB\-\-exclude\-libs\fR=\fIlib,lib,..\fR"
Mark all symbols in given libraries hidden.
.IP "\fB\-\-fini\fR=\fIsymbol\fR"
Call \fIsymbol\fR at unload-time.
.IP "\fB\-\-fork\fR"
.PD 0
.IP "\fB\-\-no\-fork\fR"
.PD
Spawn a child process and let it do the actual linking. When linking a
large program, the OS kernel can take a few hundred milliseconds to
terminate a mold process. \fB\-\-fork\fR hides that latency.
.IP "\fB\-\-gc\-sections\fR"
.PD 0
.IP "\fB\-\-no\-gc\-sections\fR"
.PD
Remove unreferenced sections
.IP "\fB\-\-hash\-style\fR=[\fIsysv\fR,\fIgnu\fR,\fIboth\fR]"
Set hash style
.IP "\fB\-\-icf\fR"
.PD 0
.IP "\fB\-\-no\-icf\fR"
.PD
Fold identical code
.IP "\fB\-\-image\-base\fR=\fIaddr\fR"
Set the base address to \fIaddr\fR
.IP "\fB\-\-init\fR=\fIsymbol\fR"
Call \fIsymbol\fR at load-time
.IP "\fB\-\-no\-undefined\fR"
Report undefined symbols (even with \fB\-\-shared\fR)
.IP "\fB\-\-perf\fR"
Print performance statistics
.IP "\fB\-\-pie\fR"
.PD 0
.IP "\fB\-\-pic\-executable\fR"
.IP "\fB\-\-no\-pie\fR"
.IP "\fB\-\-no\-pic\-executable\fR"
.PD
Create a position independent executable
.IP "\fB\-\-pop\-state\fR"
Pop state of flags governing input file handling
.IP "\fB\-\-preload\fR"
Preload object files
.IP "\fB\-\-print\-gc\-sections\fR"
.PD 0
.IP "\fB\-\-no\-print\-gc\-sections\fR"
.PD
Print removed unreferenced sections
.IP "\fB\-\-print\-icf\-sections\fR"
.PD 0
.IP "\fB\-\-no\-print\-icf\-sections\fR"
.PD
Print folded identical sections
.IP "\fB\-\-push\-state\fR"
Pop state of flags governing input file handling
.IP "\fB\-\-quick\-exit\fR"
.PD 0
.IP "\fB\-\-no\-quick\-exit\fR"
.PD
Use quick_exit to exit.
.IP "\fB\-\-relax\fR"
.PD 0
.IP "\fB\-\-no\-relax\fR"
.PD
Rewrite machine instructions with more efficient ones for some
relocations. The feature is enabled by default.
.IP "\fB\-\-repro\fR"
Embed input files to .repro section
.IP "\fB\-\-retain\-symbols\-file\fR=\fIfile\fR"
Keep only symbols listed in \fIfile\fR. \fIfile\fR is a text file
containing a symbol name on each line. \fBmold\fR discards all local
symbols as well as global sybmol that are not in \fIfile\fR. Note that
this option removes symbols only from \fB.symtab\fR section and does
not affect \fB.dynsym\fR section, which is used for dynamic linking.
.IP "\fB\-\-rpath\fR=\fIdir\fR"
Add \fIdir\fR to runtime search path
.IP "\fB\-\-run\fR \fIcommand arg ...\fR"
Run \fIcommand\fR with mold as \fB/usr/bin/ld\fR
.IP "\fB\-\-shared\fR"
.PD 0
.IP "\fB\-\-Bshareable\fR"
.PD
Create a share library
.IP "\fB\-\-spare\-dynamic\-tags\fR=\fInumber\fR"
Reserve give number of tags in .dynamic section
.IP "\fB\-\-static\fR"
Do not link against shared libraries
.IP "\fB\-\-stats\fR"
Print input statistics.
.IP "\fB\-\-sysroot\fR=\fIdir\fR"
Set target system root directory
.IP "\fB\-\-thread\-count=\fIcount\fR\fR"
Use \fIcount\fR number of threads.
.IP "\fB\-\-threads\fR"
.PD 0
.IP "\fB\-\-no\-threads\fR"
.PD
Use multiple threads. By default, \fBmold\fR uses as many threads as
the number of cores or 32, whichever is the smallest. The reason why
it is capped to 32 is because \fBmold\fR doesn't scale well beyond
that point. To use only one thread, pass \fB\-\-no\-threads\fR or
\fB\-\-thread-count=1\fR.
.IP "\fB\-\-trace\fR"
Print name of each input file.
.IP "\fB\-\-version\-script\fR=\fIfile\fR"
Read version script from \fIfile\fR.
.IP "\fB\-\-warn\-common\fR"
.PD 0
.IP "\fB\-\-no\-warn\-common\fR"
.PD
Warn about common symbols
.IP "\fB\-\-whole\-archive\fR"
.PD 0
.IP "\fB\-\-no\-whole\-archive\fR"
.PD
When archive files (\fB.a\fR files) are given to a linker, only object
files that are needed to resolve undefined symbols are extracted from
them and linked to an output file. \fB\-\-whole\-archive\fR changes
that behavior for subsequent archives so that a linker extracts all
object files and link them to an output. For example, if you are
creating a shared object file and you want to include all archive
members to the output, you should pass \fB\-\-whole\-archive\fR.
\fB\-\-no\-whole\-archive\fR restores the default behavior for
subsequent archives.
.IP "\fB\-\-wrap\fR=\fIsymbol\fR"
Make \fIsymbol\fR to be resolved to \fB__wrap_\fIsymbol\fR. The
original symbol can be resolved as \fB__real_\fIsymbol\fR. This
option is typically used for wrapping an existing function.
.IP "\fB\-z now\fR"
Disable lazy function resolution
.IP "\fB\-z lazy\fR"
Enable lazy function resolution (default)
.IP "\fB\-z execstack\fR"
Require executable stack
.IP "\fB\-z noexecstack\fR"
Do not require executable stack (default)
.IP "\fB\-z relro\fR"
Make some sections read-only after relocation (default)
.IP "\fB\-z norelro\fR"
Do not use relro
.IP "\fB\-z defs\fR"
Report undefined symbols (even with \fI\-\-shared\fR)
.IP "\fB\-z nodefs\fR"
Do not report undefined symbols
.IP "\fB\-z nodlopen\fR"
Mark DSO not available to dlopen
.IP "\fB\-z nodelete\fR"
Mark DSO non-deletable at runtime
.IP "\fB\-z nocopyreloc\fR"
Do not create copy relocations
.IP "\fB\-z initfirst\fR"
Mark DSO to be initialized first at runtime
.IP "\fB\-z interpose\fR"
Mark object to interpose all DSOs but executable
.IP "\fB\-(\fR"
.PD 0
.IP "\fB\-)\fR"
.IP "\fB\-O\fR\fInumber\fR"
.IP "\fB\-\-allow\-multiple\-definition\fR"
.IP "\fB\-\-allow\-shlib\-undefined\fR"
.IP "\fB\-\-color\-diagnostics\fR"
.IP "\fB\-\-disable\-new\-dtags\fR"
.IP "\fB\-\-enable\-new\-dtags\fR"
.IP "\fB\-\-end\-group\fR"
.IP "\fB\-\-fatal\-warnings\fR"
.IP "\fB\-\-gdb\-index\fR"
.IP "\fB\-\-no\-allow\-shlib\-undefined\fR"
.IP "\fB\-\-no\-copy\-dt\-needed\-entries\fR"
.IP "\fB\-\-no\-fatal\-warnings\fR"
.IP "\fB\-\-no\-undefined\-version\fR"
.IP "\fB\-\-nostdlib\fR"
.IP "\fB\-\-plugin\-opt\fR"
.IP "\fB\-\-plugin\fR"
.IP "\fB\-\-rpath\-link\fR=\fIdir\fR"
.IP "\fB\-\-sort\-common\fR"
.IP "\fB\-\-sort\-section\fR"
.IP "\fB\-\-start\-group\fR"
.PD
Ignored
.SH BUGS
Report bugs at \fBhttps://github.com/rui314/mold/issues\fR.
.SH AUTHOR
Rui Ueyama <\fBruiu@cs\&.stanford\&.edu\fR>
.SH "SEE ALSO"
.BR ld (1),
.BR gold (1),
.BR ld.so (8)