rui314/mold - mold - gitea: Gitea Service

mirror of https://github.com/rui314/mold.git synced 2024-10-05 00:57:08 +03:00

Author	SHA1	Message	Date
Rui Ueyama	31ca2cc2ae	[ELF] Compile an Aho-Corasick tree into a DFA	2022-01-22 12:35:49 +09:00
Rui Ueyama	e9a5d1cee3	[ELF] Handle glob patterns directly instead of converting them to regex Converting a glob pattern to a regex is fragile, and std::regex is slow. So we should handle glob patterns ourselves.	2022-01-21 23:32:27 +09:00
Rui Ueyama	3f4e570320	Simplify	2022-01-21 20:59:19 +09:00
Rui Ueyama	d0c1c4db19	[ELF] Use Aho-Corasick algo to handle version script patterns To process version scripts, we have to match glob patterns against symbol strings. Sometimes, we have hundreds or thousands of glob patterns and have to match them against millions of long mangled C++ symbol names. This step can be very slow. In this patch, I implemented the Aho-Corasick algorithm to match glob patterns to symbol strings as quickly as possible. For the details of the algorithm, see https://en.wikipedia.org/wiki/Glob_(programming). This patch improves mold's performance for programs that use large version scripts. For example, the link time of libQt6Gui.so.6.3.0 dropped from 1.10s to 0.05s with this patch. This patch also changes how symbol versions are applied if two or more version patterns match a single symbol string. Previously, the last one in a script file took precedence. Now, the first one takes precedence. I believe the new behavior is compatible with GNU ld. Fixes https://github.com/rui314/mold/issues/156 Fixed https://github.com/rui314/mold/issues/287	2022-01-21 20:30:05 +09:00
Rui Ueyama	92876820cb	[ELF/ARM64] Support range extension thunks ARM64 branch instructions have only a 26-bit displacement. Non-thumb instructions are always aligned to 4 byte boundaries, but even with the implicit two trailing zeros considered, they can represent only a 28-bit displacement. That means branch instructions can jump to a location only if it is within a ±128 MiB range. If a branch destination is farther than that, a linker has to emit a machine code sequence that constructs a full 32-bit address in a register to jump to the final destination, and redirect the branch to that code sequence. Such a code sequence is called a "range extension thunk" or just "thunk". Previously, mold didn't support range extension thunks, so it couldn't link large programs. That would fail with a "relocation out of range" error. Now, mold gained a feature to create thunks and can link large programs. Thunk creation is an interesting algorithmic problem. We need to insert at least one thunk in every 128 MiB chunk, because otherwise branch instructions wouldn't be able to jump to a thunk. Adding an entry to a thunk could slightly enlarge the distance between a branch instruction location and its destination if the thunk is in between them. That could make a branch that was previously reachable unreachable. Usually, this problem is solved by an iterative algorithm. With the iterative algorithm, a linker checks the reachability of all relocations, creates new thunks if necessary, and repeats until no new thunks are created. I implemented a different algorithm than that in this patch. The algorithm implemented in this patch is guaranteed to work in O(n) where n is the number of relocations. This algorithm might be novel. And the algorithm implemented in this patch is quite fast. It can create thunks in 80 milliseconds on a 16-core Amazon Graviton 2 machine for clang-14 that has an ~100MB .text section.	2022-01-20 03:16:50 +00:00
Rui Ueyama	be41fdbc21	[ELF] Rename variables `ctx.arg` should contain only variables that can directly be mapped to command line arguments.	2022-01-20 08:35:52 +09:00
Rui Ueyama	43b2a1e423	[ELF] Do not scan .text for .relr.dyn	2022-01-17 20:40:22 +09:00
Rui Ueyama	04ba8f4cbd	[ELF] Rename a function	2022-01-17 10:26:23 +09:00
Rui Ueyama	1152b6eb38	[ELF] Remove redundant assignment	2022-01-16 08:41:26 +09:00
Rui Ueyama	bd6afa1b23	[ELF] Add `-pack-dyn-relocs=relr` .relr.dyn is a new section that has been implemented in other linkers recently. That section contains only the RELATIVE-type dynamic relocations (i.e. base relocations). Compared to the regular .rela.dyn, a .relr.dyn's size is typically less than 1/10 because the section is compressed. Since PIEs (position-independent executables) tend to contain lots of RELATIVE-type relocations and PIEs are now the default on many Linux distributions for security reasons, .relr.dyn is more effective than it was. It can reduce binary size by a few percent or more. Note that the runtime support is catching up, so binaries built with `-pack-dyn-relocs=relr` may not work on your system unless you are running a very recent version of Linux.	2022-01-14 20:54:37 +09:00
Rui Ueyama	2ee2243c63	Refactor	2022-01-13 23:53:53 +09:00
Rui Ueyama	d8dc8a6787	[ELF] Optimize relocation processing for non-alloc sections In mold, a relocation refers to either a symbol or a section piece. A section piece is a segment of a mergeable section such as .rodata.str.1 or .debug_str. Previously, we preprocessed all relocations referring to mergeable sections to find their corresponding section pieces. We did this because we want to find a relocation target as quickly as possible for `gc_sections` and `scan_rels`. However, we didn't actually have to do that for non-alloc sections, as non-alloc sections are subject to neither `gc_sections` nor `scan_rels`. So, we could skip relocation preprocessing for non-alloc sections. This commit implements that optimization. It looks like this is an effective optimization for programs that have large debug info because debug sections tend to contain a lot of relocations referring to .debug_str, which is a mergeable section. Here are a few notable examples (output size: link time before, after). clang-14 (2.2 GiB): 1.455s, 1.396s (4% faster). mongodb (4.8 GiB): 2.341s, 1.925s (17% faster). These numbers were measured on a simulated 16-core 32-thread machine.	2022-01-13 23:22:03 +09:00
Rui Ueyama	d3cd2e780f	Remove a redundant data structure	2022-01-13 20:38:43 +09:00
Rui Ueyama	d2363f1819	Fix warning for the case if `-DDEBUG` is given	2022-01-13 16:51:12 +09:00
Rui Ueyama	9d27ee0839	[ELF] Skip incompatible files specified by linker script GROUP command Fixes https://github.com/rui314/mold/issues/260	2022-01-13 10:42:13 +09:00
Rui Ueyama	e78608a37a	[ELF] Rename a file We collectively call .o and .so files as input files, so we should name this file input-files.cc.	2022-01-12 20:41:49 +09:00
Rui Ueyama	d2c57e99ca	Refactor	2022-01-12 20:35:41 +09:00
Rui Ueyama	205138f16e	Refactor SharedFile class This change factors out `elf_syms` and `first_global` from SharedFile and move them into the base class.	2022-01-12 15:58:33 +09:00
Rui Ueyama	cafc57fa65	Simplify	2022-01-12 11:16:21 +09:00
Rui Ueyama	ce5749ce2f	[ELF] Make symbol resolution deterministic Previously, our parallel symbol resolution algorithm was not deterministic in edge cases. As an example, consider the following two source files: foo.c: inline void fn1() { ... } bar.c: inline void fn1() { ... } void fn2() { ... } Let's say you compile these files and put them into an archive file. If mold decided to pull out `foo.o` first for `fn1` and then `bar.o` for `fn2`, then both `foo.o` and `bar.o` are included in the result. However, if mold pulled out `bar.o` first, then there's no chance for `foo.o` to be pulled out, so only `bar.o` would be included in the result. The algorithm implemented in this commit should be deterministic. We do not override symbols when we mark live objects.	2022-01-12 11:16:21 +09:00
Rui Ueyama	9ca6a9dc5e	[ELF] Add `-z ibt` Fixes https://github.com/rui314/mold/issues/229	2022-01-09 12:38:38 +09:00
Rui Ueyama	31a43a7ba6	[ELF] Add `-z cet-report` Fixes https://github.com/rui314/mold/issues/229	2022-01-09 12:34:32 +09:00
Rui Ueyama	778f8629eb	Refactor	2022-01-09 12:34:32 +09:00
Rui Ueyama	e29bd8f42b	[ELF] Add `-z shstk` Fixes https://github.com/rui314/mold/issues/229	2022-01-09 12:34:24 +09:00
Rui Ueyama	fbfa01dcd1	[ELF] Implement `-z ibtplt` https://github.com/rui314/mold/issues/229	2022-01-08 14:09:12 +09:00
Rui Ueyama	18367e69a8	Refactor	2022-01-08 13:24:29 +09:00
Rui Ueyama	644fdc8a2f	Simplify	2022-01-07 20:42:25 +09:00
Rui Ueyama	0e17dbeda8	[ELF] Make --defsym'ed symbols absolute If a symbol is defined in the form of --defsym=foo=0x<hexvalue>, it should be defined as an absolute symbol with the given value.	2022-01-07 16:28:14 +09:00
Rui Ueyama	79e397ea03	Add a missing #include	2022-01-06 20:54:23 +09:00
Rui Ueyama	daa88f2f06	[ELF] Handle '[]' in glob patterns Previously, mold crashed due to an invalid regex pattern exception when `[...]` was given as a version script pattern. Fixes https://github.com/rui314/mold/issues/258	2022-01-06 20:45:46 +09:00
Rui Ueyama	43fa021d48	[ELF] Do not place non-exported symbols into .gnu.hash Previously, mold put all global symbols into .gnu.hash. Although I believe it was not an error, it bloated the size of .gnu.hash because .gnu.hash needs only exported symbols. https://github.com/rui314/mold/issues/255	2022-01-06 17:47:19 +09:00
Rui Ueyama	907f713e51	[ELF] Remove NEEDS_DYNSYM flag from symbol I wanted to make sure that a symbol X is in .dynsym if and only if (X.is_imported \|\| X.is_exported).	2022-01-06 15:27:39 +09:00
Rui Ueyama	be49d673a5	[ELF] Deduce emulation from input files if -m is not given	2022-01-05 19:55:32 +09:00
Rui Ueyama	a5029d19a8	[ELF] Automatically fall back to ld.bfd or ld.lld if LTO is in use This is very hacky but highly practical, so I couldn't resist implementing this. We should support LTO natively in the future. In the meantime, this feature should work as a poor man's replacement. Fixes https://github.com/rui314/mold/issues/242	2022-01-04 20:50:16 +09:00
Rui Ueyama	a2839f60dd	[ELF] Improve an error message Fixes https://github.com/rui314/mold/issues/233	2022-01-04 13:49:48 +09:00
Rui Ueyama	9894b3173b	[ELF] Add --default-symver Fixes https://github.com/rui314/mold/issues/228	2022-01-03 20:29:33 +09:00
Rui Ueyama	d100a735a0	Refactor: rename a function	2021-12-31 21:28:16 +09:00
Rui Ueyama	902f23c456	Refactor	2021-12-31 21:04:07 +09:00
Rui Ueyama	c92e192e56	[ELF] Remove redundant `inline` keywords	2021-12-31 21:02:06 +09:00
Rui Ueyama	440ff27e58	[ELF] Refactor	2021-12-31 21:01:59 +09:00
Rui Ueyama	085e3c3abe	Simplify	2021-12-31 18:04:39 +09:00
Rui Ueyama	58e43633fc	Revert "Revert "Use C++17 filesystem API"" This reverts commit `f29a85f20a` with a fix for a build failure.	2021-12-31 16:39:54 +09:00
Rui Ueyama	f29a85f20a	Revert "Use C++17 filesystem API" This reverts commit `f6e91df440` because it causes build breakages for a lot of Gentoo packages.	2021-12-30 22:15:52 +09:00
Rui Ueyama	f6e91df440	Use C++17 filesystem API	2021-12-30 21:32:01 +09:00
Rui Ueyama	d49c50ddaf	[ELF] Add --defsym Fixes https://github.com/rui314/mold/issues/208	2021-12-30 14:48:39 +09:00
Rui Ueyama	f3766cda81	[ELF] Add -z {max,common}-page-size Fixes https://github.com/rui314/mold/issues/203	2021-12-29 17:14:08 +09:00
Rui Ueyama	6e290aab3e	[ELF] Implement --color-diagnostics	2021-12-25 16:55:51 +09:00
Rui Ueyama	5601cf4236	[ELF] Add `-z separate-code`, `-z noseparate-code` and `-z separate-loadable-segments` Fixes https://github.com/rui314/mold/issues/172	2021-12-24 20:28:45 +09:00
Rui Ueyama	8c86c28496	Add `-z nodefaultlib` Fixes https://github.com/rui314/mold/issues/184	2021-12-23 15:01:57 +09:00
Rui Ueyama	530568e662	[ELF] Use a better type	2021-12-20 20:08:25 +09:00
Rui Ueyama	455e729393	[ELF] Make version script application faster Instead of calling regex match multiple times, create a single regex and call regex match function only once. https://github.com/rui314/mold/issues/156	2021-12-17 19:09:26 +09:00
Rui Ueyama	04d5abc9eb	Attempt to fix an ODR violation	2021-12-16 20:34:13 +09:00
Rui Ueyama	733bb6354f	Rename variables	2021-12-11 21:40:57 +09:00
Rui Ueyama	54d75153d2	Rename variables	2021-12-07 21:55:04 +09:00
Rui Ueyama	84cd674ff1	Revert "[ELF] Rename section piece -> subsection" This reverts commit `3fc9c6c3ed`.	2021-12-07 14:44:32 +09:00
Rui Ueyama	58b74bf961	Revert "[ELF] s/SectionFragment/Subsection/g" This reverts commit `aa46df7735`.	2021-12-07 14:44:11 +09:00
Rui Ueyama	22116629f1	[ELF] Add --start-lib and --end-lib Fixes https://github.com/rui314/mold/issues/133	2021-12-06 20:09:46 +09:00
Rui Ueyama	986516545a	[ELF] Do not share OutputFile between ELF and Mach-O	2021-10-26 14:06:11 +09:00
Rui Ueyama	ca9a6d0843	Move code from `elf/output-file.cc` to `output-file.h`	2021-10-05 11:37:30 +09:00
Rui Ueyama	30b45d2263	[ELF] Rename `Symbol<E>::intern` -> `intern`	2021-10-05 08:31:28 +09:00
Rui Ueyama	3fc9c6c3ed	[ELF] Rename section piece -> subsection	2021-10-05 08:22:38 +09:00
Rui Ueyama	77069156cb	[ELF] Rename OutputChunk -> Chunk	2021-10-04 10:16:24 +09:00
Rui Ueyama	928c39937a	Refactor	2021-10-03 16:26:31 +09:00
Rui Ueyama	001cf042d9	Refactor	2021-10-03 16:01:59 +09:00
Rui Ueyama	fcd10254b7	Rename variables	2021-10-03 16:01:59 +09:00
Rui Ueyama	ebc8a68cb3	[ELF] Rename variables	2021-10-02 14:12:42 +09:00
Rui Ueyama	52c7793326	Move archive-file.cc out of `elf` directory	2021-09-30 23:13:27 +09:00
Rui Ueyama	7c9205f68d	Move memory-mapped-file.cc out of `elf` directory So that we can use the class from mold/mach-o.	2021-09-30 23:13:22 +09:00
Rui Ueyama	a2c9d0ad4d	[ELF] Reduce the size of Subsection struct	2021-09-29 16:30:04 +09:00
Rui Ueyama	aa46df7735	[ELF] s/SectionFragment/Subsection/g	2021-09-29 16:18:41 +09:00
Rui Ueyama	2844f1f573	[ELF] Refactor	2021-09-28 13:23:58 +09:00
Rui Ueyama	bc2bd0e26a	[ELF] Refactor	2021-09-27 23:01:51 +09:00
Rui Ueyama	c40b9aea5f	[Mach-O] Generalize `perf.cc` so that we can use the feature in mold/mach-o	2021-09-27 18:14:56 +09:00
Rui Ueyama	541399911d	[ELF] Refactor	2021-09-26 19:55:46 +09:00
Rui Ueyama	a5d02b18c7	Refactor	2021-09-25 21:47:09 +09:00
Rui Ueyama	7232178513	[Mach-O] wip	2021-09-16 14:20:38 +09:00
Rui Ueyama	150565601b	Move error handlers from `mold::elf` to `mold`	2021-09-15 15:32:42 +09:00
Rui Ueyama	6ab3ddaf8e	Move cleanup handlers from `mold::elf` to `mold`	2021-09-15 15:25:14 +09:00
Rui Ueyama	7ffc3a4545	[Mach-O] Add a feature to dump an executable This is not a linker feature, but in order to learn how Mach-O executables are constructed, I'll implement a dump feature. I'll remove the feature once I understand the structure of Mach-O binaries.	2021-09-13 18:16:13 +09:00
Rui Ueyama	e051ad2a9a	Do not define _GNU_SOURCE We should not depend on glibc-specific features.	2021-09-13 17:29:49 +09:00
Rui Ueyama	08b61f29d2	[ELF] Add --require-defined	2021-09-12 18:34:42 +09:00
Rui Ueyama	0a4305f82a	Inline a few functions	2021-09-11 14:54:55 +09:00
Rui Ueyama	2e5f480a97	Simplify input file preloading logic	2021-09-10 20:41:20 +09:00
Rui Ueyama	8558a98bf6	Simplify	2021-09-10 16:05:45 +09:00
Rui Ueyama	68ff4a8408	Rename functions	2021-09-08 20:15:03 +09:00
Rui Ueyama	dc7e0ca4bf	Remove a useless macro	2021-09-08 19:51:02 +09:00
Rui Ueyama	78bddacd5b	Move target-independent files to the top directory	2021-09-08 19:49:51 +09:00
Rui Ueyama	cab0ccf0bd	[Mach-O] Add a stub for Mach-O	2021-09-08 19:03:07 +09:00
Rui Ueyama	c21302e28a	Do not use strerror strerror is not guaranteed to be thread-safe	2021-09-05 17:15:15 +09:00
Rui Ueyama	459b5973bb	Move code to `elf` sub-directory	2021-09-02 23:16:49 +09:00
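
Commit e9a5d1cee3 above describes handling glob patterns directly rather than converting them to regexes. The following is a minimal sketch of what direct glob matching can look like, not mold's actual code: `glob_match` is a hypothetical helper, and it assumes only `*` (any run of characters) and `?` (any single character) are supported, leaving out `[...]` classes and the Aho-Corasick multi-pattern machinery.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper for illustration only; not taken from mold.
// Returns true if `str` matches `pat`, where `*` matches any run of
// characters (including none) and `?` matches exactly one character.
bool glob_match(std::string_view pat, std::string_view str) {
  constexpr std::size_t npos = std::string_view::npos;
  std::size_t p = 0;        // current position in pat
  std::size_t s = 0;        // current position in str
  std::size_t star = npos;  // position of the most recent '*' in pat
  std::size_t mark = 0;     // position in str where that '*' began matching

  while (s < str.size()) {
    if (p < pat.size() && pat[p] == '*') {
      // Remember the star and first try matching it against nothing.
      star = p++;
      mark = s;
    } else if (p < pat.size() && (pat[p] == '?' || pat[p] == str[s])) {
      p++;
      s++;
    } else if (star != npos) {
      // Mismatch: backtrack and let the last '*' absorb one more character.
      p = star + 1;
      s = ++mark;
    } else {
      return false;
    }
  }

  // Any trailing stars can match the empty string.
  while (p < pat.size() && pat[p] == '*')
    p++;
  return p == pat.size();
}
```

For example, `glob_match("_ZN3foo*", "_ZN3foo3barEv")` returns true. Matching one pattern at a time like this scales poorly with thousands of patterns, which is the case the Aho-Corasick commits address.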


240 Commits
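
The range extension thunk commit (92876820cb) rests on a reachability test: a branch can reach its target directly only within a ±128 MiB range. As a rough illustration of that check, and not mold's implementation, the sketch below uses a hypothetical `is_reachable` function and assumes both addresses are 4-byte aligned.

```cpp
#include <cassert>
#include <cstdint>

// Assumed direct branch range: 128 MiB in either direction.
constexpr std::int64_t kBranchRange = std::int64_t{1} << 27;

// Hypothetical helper for illustration only; not taken from mold.
// Returns true if a branch at `branch_addr` can jump straight to
// `target_addr` without going through a thunk.
bool is_reachable(std::uint64_t branch_addr, std::uint64_t target_addr) {
  // Instructions are 4-byte aligned, so the displacement is a multiple of 4.
  assert(branch_addr % 4 == 0 && target_addr % 4 == 0);
  std::int64_t disp = static_cast<std::int64_t>(target_addr - branch_addr);
  // Valid displacements lie in [-2^27, 2^27).
  return -kBranchRange <= disp && disp < kBranchRange;
}
```

A linker would redirect a branch to a nearby thunk whenever a check like this fails for the final destination.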