Commit Graph

221 Commits

Author SHA1 Message Date
Tim Schumacher
e2c55ee0a8 LibC: Move dlfcn_integration.h to the bits directory 2022-09-05 10:12:02 +01:00
Tim Schumacher
27bfb81702 Everywhere: Refer to dlfcn*.h by its non-prefixed name 2022-09-05 10:12:02 +01:00
Itamar
db11cfa2c5 Utilities+LibELF: Temporary promises for dynamic linker in "pledge"
This adds a "temporary promises for the dynamic-linker" flag ('-d')
to the "pledge" utility.

Example usage:
pledge -d -p "stdio rpath" id

Without the '-d' flag, id would crash because the dynamic linker
requires 'prot_exec'.

When this flag is used and the program to be run is dynamically linked,
"pledge" adds promises that are required by the dynamic linker
to the promise set provided by the user.

The dynamic linker will later "give up" the pledge promises it no
longer requires.
2022-07-21 16:40:11 +02:00
Tim Schumacher
3f59cb5e70 LibELF: Copy the entire TLS segment instead of each symbol one-by-one
This automatically fixes an issue where we were accidentally copying
garbage data from beyond the TLS segment as uninitialized data isn't
actually stored inside the image.
2022-07-20 18:24:13 +02:00
Tim Schumacher
6799b271bf LibELF: Remove outdated TLS handling in generic program header code 2022-07-20 18:24:13 +02:00
Tim Schumacher
224ac1a307 LibC: Remove a bunch of weak pthread_* symbols 2022-07-19 20:58:51 -07:00
sin-ack
fbc771efe9 Everywhere: Use default StringView constructor over nullptr
While null StringViews are just as bad, these prevent the removal of
StringView(char const*) as that constructor accepts a nullptr.

No functional changes.
2022-07-12 23:11:35 +02:00
sin-ack
3f3f45580a Everywhere: Add sv suffix to strings relying on StringView(char const*)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).

No functional changes.
2022-07-12 23:11:35 +02:00
sin-ack
c70f45ff44 Everywhere: Explicitly specify the size in StringView constructors
This commit moves the length calculations out to be directly on the
StringView users. This is an important step towards the goal of removing
StringView(char const*), as it moves the responsibility of calculating
the size of the string to the user of the StringView (which will prevent
naive uses causing OOB access).
2022-07-12 23:11:35 +02:00
Idan Horowitz
fbeef409c6 DynamicLoader: Stop performing relative relocations on non-pie objects
Co-authored-by: Daniel Bertalan <dani@danielbertalan.dev>
2022-07-10 14:24:34 +02:00
Tim Schumacher
b9f7966e00 LibC: Move stack canary initialization before the global constructors
Once again, QEMU creates threads while running its constructors, which
is a recipe for disaster if we switch out the stack guard while that is
already running in the background.

To solve that, move initialization to our LibC initialization stage,
which is before any actual external initialization code runs.
2022-07-08 22:27:38 +00:00
DexesTTP
7ceeb74535 AK: Use an enum instead of a bool for String::replace(all_occurences)
This commit has no behavior changes.

In particular, this does not fix any of the wrong uses of the previous
default parameter (which used to be 'false', meaning "only replace the
first occurence in the string"). It simply replaces the default uses by
String::replace(..., ReplaceMode::FirstOnly), leaving them incorrect.
2022-07-06 11:12:45 +02:00
Idan Horowitz
753844ec96 LibELF: Take TLS segment alignment into account in DynamicLoader
Previously we would just tightly pack the different libraries' TLS
segments together, but that is incorrect, as they might require some
kind of minimum alignment for their TLS base address.

We now plumb the required TLS segment alignment down to the TLS block
linear allocator and align the base address down to the appropriate
alignment.
2022-07-05 11:26:10 +02:00
Tim Schumacher
e2036ca2ca LibELF: Store the full file path in DynamicObject
Otherwise, our `dirname` call on the parent object will always be empty
when trying to resolve dependencies.
2022-06-30 11:57:10 +02:00
Tim Schumacher
6732fec8b8 LibELF: Warn on self-dlopening libraries while initializing 2022-06-24 11:28:05 +01:00
Tim Schumacher
082a7baa3b LibELF: Check if initializers ran instead of trusting s_global_objects
The original heuristic of "a library being in `s_global_objects` means
that it was fully initialized already" doesn't hold up anymore since we
changed the loading order. This was causing us to skip parts of the
initialization of dependency libraries when running dlopen (since it was
the only user of that setting).

Instead, set a flag after we run stage 4 (which is the "run the global
initializers" stage) and check that flag when determining unfinished
dependencies. This entirely replaces the `skip_global_objects` logic.
2022-06-24 11:28:05 +01:00
Tim Schumacher
d2b87419ac LibELF: Only collect region sizes before reserving memory
This keeps us from needlessly allocating storage via `malloc` as part
of the `Vector`s that early, which we might conflict on while reserving
memory for the main executable.
2022-06-21 22:38:15 +01:00
Tim Schumacher
c0b31796a8 LibELF: Unmap the source file temporarily while reserving space
This further reduces the chance that we will conflict with data that is
already present at the target location.
2022-06-21 22:38:15 +01:00
Tim Schumacher
c1d8612eb5 LibELF: Store DynamicLoader ELF images using an OwnPtr
This is preparation work for the next commit, where we will replace the
stored ELF image mid-load.
2022-06-21 22:38:15 +01:00
Tim Schumacher
07208feae7 LibELF: Actually do the library mapping as early as possible
We previously trusted the `map` part in `map_library` too much, and
assumed that this would already lock in the binary at its final place.

However, the `map()` function of the loader was only called in
`load_main_library`, which ran only right before jumping to the
entrypoint.

Make our binary loading a bit more stable by actually mapping the binary
right after we read its information, and only do the linking right
before jumping to the entrypoint.
2022-06-21 22:38:15 +01:00
Andrew Kaster
72066880c6 LibELF: Always use parent object basename for $ORIGIN processing
Using the main executable basename produces the wrong $ORIGIN processing
for libraries that are secondary dependencies of the main executable,
or dependencies of an object loaded via dlopen.
2022-06-12 00:28:26 +01:00
Tim Schumacher
89da0f2da5 LibELF: Name library maps with the full file path 2022-05-07 20:02:00 +02:00
Tim Schumacher
2b7b7f2816 LibELF: Separate library resolving into a new function 2022-05-07 20:02:00 +02:00
Daniel Bertalan
7aca408993 LibELF: Fail gracefully when IFUNC resolver's object has textrels
.text sections of objects that contain textrels have to be writable
during the relocation procedure. Because of this, we would segfault if
we tried to execute IFUNC resolvers defined in them. Let's print a
meaningful error message instead.

Additionally, a warning is now printed when we load objects with
textrels, as in the future, additional security mitigations might
interfere with them being loaded.
2022-05-01 12:42:01 +02:00
Daniel Bertalan
08c459e495 LibELF: Add support for IFUNCs
IFUNC is a GNU extension to the ELF standard that allows a function to
have multiple implementations. A resolver function has to be called at
load time to choose the right one to use. The PLT will contain the entry
to the resolved function, so branching and more indirect jumps can be
avoided at run-time.

This mechanism is usually used when a routine can be made faster using
CPU features that are available in only some models, and a fallback
implementation has to exist for others.

We will use this feature to have two separate memset implementations for
CPUs with and without ERMS (Enhanced REP MOVSB/STOSB) support.
2022-05-01 12:42:01 +02:00
Daniel Bertalan
4d5965bd2c LibELF: Keep track of whether the PLT contains REL or RELA relocations 2022-05-01 12:42:01 +02:00
Daniel Bertalan
ed5f110b40 LibELF: Perform .relr.dyn relocations before .rel.dyn
IFUNC resolvers depend on the resolved function's address having been
relocated by the time they are called. This means that relative
relocations have to be done first.

The linker is kind enough to put R_*_RELATIVE before R_*_IRELATIVE in
.rel.dyn, but .relr.dyn contains relative relocations too.
2022-05-01 12:42:01 +02:00
Andrew Kaster
435a263998 LibELF: Relax restriction on allowed values of EI_OSABI to allow GNU
This check is here to make sure we only try to load serenity binaries.
However, with -fprofile-instr-generate -fcoverage-mapping, clang
sets the EI_OSABI field to 3, for GNU. The instrumentation uses a lot of
retained COMDAT sections for coverage instrumentation that get the
SHF_GNU_RETAINED section header flag set on them, forcing llvm to set
the ABI to GNU.
2022-05-01 12:42:01 +02:00
Timur Sultanov
33d19a562f LibELF: Look up symbols in all global modules
dlsym() called with RTLD_DEFAULT (nullptr) should look up
symbol in all global modules instead of only looking into the
executable file
2022-04-03 23:25:39 +01:00
Idan Horowitz
086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Brian Gianforcaro
7d667b9f69 LibELF: Remove unused m_program_interpreter member from DynamicLoader
While profiling I realized that this member is unused, so the
StringBuilder and String allocation are completely un-necessary.
2022-03-31 10:18:07 +02:00
Brian Gianforcaro
39f924a731 LibELF: Skip DynamicObject::dump() if logging isn't enabled
I noticed that we were populating this StringBuilder and then throwing
away the result while profiling `true` with UserSpace emulator.

Before:

    courage:~ $ time -n 1000 true
    Timing report: 3454 ms
    ==============
    Command:         true
    Average time:    3.45 ms (median: 3, stddev: 3.42, min: 0, max:11)
    Excluding first: 3.45 ms (median: 3, stddev: 3.42, min: 0, max:11)

After:

    courage:~ $ time -n 1000 true
    Timing report: 3308 ms
    ==============
    Command:         true
    Average time:    3.30 ms (median: 3, stddev: 3.28, min: 0, max:12)
    Excluding first: 3.30 ms (median: 3, stddev: 3.29, min: 0, max:12)
2022-03-31 10:18:07 +02:00
Lenny Maiorani
271d82e23f Libraries: Use default constructors/destructors in LibELF
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#cother-other-default-operation-rules

"The compiler is more likely to get the default semantics right and
you cannot implement these functions better than the compiler."
2022-03-10 18:04:26 -08:00
Tim Schumacher
35e5024b7d DynamicLinker: Replace $ORIGIN with the executable path 2022-03-08 23:21:35 +01:00
Tim Schumacher
e7f861f34c DynamicLinker: Implement support for RPATH and RUNPATH 2022-03-08 23:21:35 +01:00
Tim Schumacher
7bd0a3e9ba DynamicLoader: Make the cached DynamicObject publicly accessible 2022-03-08 23:21:35 +01:00
Idan Horowitz
c73ef87fc7 Kernel+LibELF+LibVT: Remove unused AK::String header includes 2022-02-16 22:21:37 +01:00
Idan Horowitz
d296001f3f LibELF: Exclude sorted symbols APIs from the Kernel
These are only used by userland, and are implemented using infallible
Strings, so let's just ifdef them out of the Kernel.
2022-02-16 22:21:37 +01:00
Idan Horowitz
1616460312 LibELF: Exclude MemoryRegionInfo::object_name() from the Kernel
This API is only used by userland, and it uses infallible Strings, so
let's just ifdef it out of the Kernel.
2022-02-16 22:21:37 +01:00
Idan Horowitz
7bd409dbdf LibELF: Use StringBuilder::string_view() to avoid String allocation 2022-02-16 22:21:37 +01:00
Daniel Bertalan
3974cac148 LibELF: Implement support for DT_RELR relative relocations
The DT_RELR relocation is a relatively new relocation encoding designed
to achieve space-efficient relative relocations in PIE programs.

The description of the format is available here:
https://groups.google.com/g/generic-abi/c/bX460iggiKg/m/Pi9aSwwABgAJ

It works by using a bitmap to store the offsets which need to be
relocated. Even entries are *address* entries: they contain an address
(relative to the base of the executable) which needs to be relocated.
Subsequent even entries are *bitmap* entries: "1" bits encode offsets
(in word size increments) relative to the last address entry which need
to be relocated.

This is in contrast to the REL/RELA format, where each entry takes up
2/3 machine words. Certain kinds of relocations store useful data in
that space (like the name of the referenced symbol), so not everything
can be encoded in this format. But as position-independent executables
and shared libraries tend to have a lot of relative relocations, a
specialized encoding for them absolutely makes sense.

The authors of the format suggest an overall 5-20% reduction in the file
size of various programs. Due to our extensive use of dynamic linking
and us not stripping debug info, relative relocations don't make up such
a large portion of the binary's size, so the measurements will tend to
skew to the lower side of the spectrum.

The following measurements were made with the x86-64 Clang toolchain:

- The kernel contains 290989 relocations. Enabling RELR decreased its
  size from 30 MiB to 23 MiB.
- LibUnicodeData contains 190262 relocations, almost all of them
  relative. Its file size changed from 17 MiB to 13 MiB.
- /bin/WebContent contains 1300 relocations, 66% of which are relative
  relocations. With RELR, its size changed from 832 KiB to 812 KiB.

This change was inspired by the following blog post:
https://maskray.me/blog/2021-10-31-relative-relocations-and-relr
2022-02-11 18:07:53 +01:00
kleines Filmröllchen
145eeb57ab Userland: Remove a bunch of unnecessary Vector imports
How silly :^)
2022-01-28 23:40:25 +01:00
Sam Atkins
45cf40653a Everywhere: Convert ByteBuffer factory methods from Optional -> ErrorOr
Apologies for the enormous commit, but I don't see a way to split this
up nicely. In the vast majority of cases it's a simple change. A few
extra places can use TRY instead of manual error checking though. :^)
2022-01-24 22:36:09 +01:00
Andreas Kling
c482508aa1 LibELF: Use shared memory mapping when loading ELF objects
There's no reason to make a private read-only mapping just for reading
(and validating) the ELF headers, and copying out the data segments.
2022-01-15 19:51:15 +01:00
Idan Horowitz
cfb9f889ac LibELF: Accept Span instead of Pointer+Size in validate_program_headers 2022-01-13 22:40:25 +01:00
Idan Horowitz
3e959618c3 LibELF: Use StringBuilders instead of Strings for the interpreter path
This is required for the Kernel's usage of LibELF, since Strings do not
expose allocation failure.
2022-01-13 22:40:25 +01:00
Jesse Buhagiar
48c9350036 LibELF: Add LD_LIBRARY_PATH envvar support :^)
The dynamic linker now supports having custom library paths
as specified by the user.
2022-01-05 15:01:14 +02:00
Brian Gianforcaro
bad6d50b86 Kernel: Use Process::require_promise() instead of REQUIRE_PROMISE()
This change lays the foundation for making the require_promise return
an error hand handling the process abort outside of the syscall
implementations, to avoid cases where we would leak resources.

It also has the advantage that it makes removes a gs pointer read
to look up the current thread, then process for every syscall. We
can instead go through the Process this pointer in most cases.
2021-12-29 18:08:15 +01:00
Daniel Bertalan
d1ef8e63f7 LibELF: Use MAP_FIXED_NOREPLACE for address space reservation
This ensures that we don't corrupt our address space if a non-PIE
program's requested address space happens to coincide with memory we
already use.
2021-12-23 23:08:10 +01:00
Idan Horowitz
6c8f1e62db LibELF: Ignore unknown dynamic tags instead of asserting
Some programs (like wine) add custom dynamic tags to their binaries
that are used internally by them, so we can just ignore them.
2021-12-22 00:02:36 -08:00