mirror of https://github.com/rui314/mold.git synced 2024-12-27 10:23:41 +03:00

temporary

Rui Ueyama 2020-11-05 18:57:00 +09:00
parent f554ce96e7
commit 94c6917f95

BUGS.md

@ -3,44 +3,47 @@ development of the mold linker.
## GNU IFUNC

Problem: A statically-linked "hello world" program mysteriously
crashed in the `__libc_start_main` function, which is called just
after `_start`.

Investigation: I opened up gdb and found that the program read a
bogus value from a TLS block. It looked like `memcpy` had failed to
copy the proper data there. After some investigation, I noticed that
`memcpy` didn't copy data at all but instead returned the address of
the `__memcpy_avx_unaligned` function, which is a real `memcpy`
function optimized for machines with AVX registers.

This odd issue was caused by the GNU IFUNC mechanism. That is, if a
function symbol has type `STT_GNU_IFUNC`, the function does not do
what its name suggests but instead returns a pointer to a function
that does the actual job. In this case, `memcpy` is an IFUNC
function, and it returns the address of `__memcpy_avx_unaligned`,
which is a real `memcpy` function.

IFUNC function addresses are stored in the `.got` section of an ELF
executable. The dynamic loader executes all IFUNC functions at
startup and replaces their GOT entries with their return values. This
mechanism allows programs to choose the best implementation among
variants of the same function at runtime based on the machine info.

If a program is statically linked, there's no dynamic loader to
rewrite the GOT entries, so the libc's startup routine does that on
behalf of the dynamic loader. Concretely, the startup routine
interprets all dynamic relocations between the `__rela_iplt_start`
and `__rela_iplt_end` symbols. It is the linker's responsibility to
emit dynamic relocations for IFUNC symbols even when it is linking a
statically-linked program, and to mark the beginning and the end of
the `.rela.dyn` section with these symbols, so that the startup
routine can find the relocations.

The bug was that my linker didn't define the `__rela_iplt_start` and
`__rela_iplt_end` symbols. Since these symbols are weak, they resolve
to zero when left undefined. From the point of view of the
initializer function, there are no dynamic relocations between
`__rela_iplt_start` and `__rela_iplt_end`. That left the GOT entries
for IFUNC symbols untouched.

The proper fix was to emit dynamic relocations for IFUNC symbols and
define the linker-synthesized symbols. I did that, and the bug was
fixed.