To add a new per-CPU data structure, add an ID for it to the
ProcessorSpecificDataID enum.
Then call ProcessorSpecific<T>::initialize() when you are ready to
construct the per-CPU data structure on the current CPU. It can then
be accessed via ProcessorSpecific<T>::get().
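For illustration, the intended usage might look something like this (the
SchedulerData struct and its enum entry are hypothetical examples, not
actual kernel names):

    // Hypothetical per-CPU structure:
    enum class ProcessorSpecificDataID {
        SchedulerData,
    };

    struct SchedulerData {
        u64 idle_ticks { 0 };
    };

    // During per-CPU initialization:
    ProcessorSpecific<SchedulerData>::initialize();

    // Anywhere afterwards, on the same CPU:
    ProcessorSpecific<SchedulerData>::get().idle_ticks++;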
This patch replaces the existing hard-coded mechanisms for the
Scheduler and MemoryManager per-CPU data structures.
This enables further work on implementing KASLR by adding relocation
support to the pre-kernel and updating the kernel to be less dependent
on specific virtual memory layouts.
This allows us to specify, ahead of time, the virtual addresses at
which the kernel will later access certain things. By doing this we
can make the kernel independent of specific physical addresses.
Previously the kernel relied on a fixed offset between virtual and
physical addresses based on the kernel's load address. This allows us
to specify an independent offset.
Instead of doing a reset via triple-fault, let's just shut down the
QEMU virtual machine, since this code path is already QEMU-specific
handling for Self-Test CI mode.
In preparation for modifying the Kernel IOCTL API to return KResult
instead of int, we need to change this ioctl to use an argument for
receiving its return value, instead of using the actual function
return value.
It's easy to forget the responsibility of validating and safely copying
kernel parameters in code that is far away from syscalls. ioctls are
one such example, and bugs there are just as dangerous as at the root
syscall level.
To avoid this case, utilize the AK::Userspace<T> template in the ioctl
kernel interface so that implementors have no choice but to properly
validate and copy ioctl pointer arguments.
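As a sketch, an ioctl implementation under this interface might look
like the following (the request name, device member and exact helper
signatures are illustrative):

    KResult MyDevice::ioctl(FileDescription&, unsigned request, Userspace<void*> arg)
    {
        switch (request) {
        case MYDEVICE_GET_SIZE: {
            // The raw user pointer can only be used via the copy helpers:
            auto user_size = static_ptr_cast<size_t*>(arg);
            size_t size = m_size;
            if (!copy_to_user(user_size, &size))
                return EFAULT;
            return KSuccess;
        }
        default:
            return EINVAL;
        }
    }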
GCC and Clang allow us to inject a call to a function named
__sanitizer_cov_trace_pc on every edge. This function has to be defined
by us. By noting down the caller in that function we can trace the code
we have encountered during execution. Such information is used by
coverage-guided fuzzers like AFL and libFuzzer to determine if a new
input resulted in a new code path. This makes fuzzing much more
effective.
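A minimal sketch of the hook we have to provide (the recording logic
is elided to a comment):

    // Called by compiler-instrumented code on every edge when building
    // with -fsanitize-coverage=trace-pc.
    extern "C" void __sanitizer_cov_trace_pc(void)
    {
        auto* pc = __builtin_return_address(0);
        // Note down 'pc' (the caller's program counter) in the current
        // thread's coverage buffer, if collection is enabled.
    }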
Additionally this adds a basic KCOV implementation. KCOV is an API that
allows user space to request the kernel to start collecting coverage
information for a given user space thread. Furthermore KCOV then exposes
the collected program counters to user space via a BlockDevice which can
be mmap()ed from user space.
This work is required to add effective support for fuzzing SerenityOS to
the Syzkaller syscall fuzzer. :^)
When we get a COW fault and discover that whoever we were COW'ing
together with has either COW'ed that page on their end, or has
unmapped it / exited, we simplify life for ourselves by clearing the
COW bit and keeping the page we already have. (No need to COW if the
page is not shared!)
The act of doing this does not return a committed page to the pool.
In fact, that committed page we had reserved for this purpose was used
up (allocated) by our COW buddy when they COW'ed the page.
This fixes a kernel panic when running TestLibCMkTemp. :^)
This makes a kernel panic immediately fail the on-target CI job.
Otherwise the failed job looks like a test timeout unless one digs into
the details of the job.
We don't need an entirely separate VMObject subclass to influence the
location of the physical pages.
Instead, we simply allocate enough physically contiguous memory first,
and then pass it to the AnonymousVMObject constructor that takes a span
of physical pages.
VMObject already has an IntrusiveList of all the Regions that map it.
We were keeping a counter in addition to this, and only using it in
a single place to avoid iterating over the list in case it only had
1 entry.
Simplify VMObject by removing this counter and always iterating the
list even if there's only 1 entry. :^)
This was previously used for a single debug logging statement during
memory purging. There are no remaining users of this weak pointer,
so let's get rid of it.
If a purgeable VM object is in the "volatile" state when we're asked
to make a COW clone of it, make life simpler by simply "purging"
the cloned object right away.
This effectively means that a fork()'ed child process will discover
its purgeable+volatile regions to be empty if/when it tries making
them non-volatile.
This patch changes the semantics of purgeable memory.
- AnonymousVMObject now has a "purgeable" flag. It can only be set when
constructing the object. (Previously, all anonymous memory was
effectively purgeable.)
- AnonymousVMObject now has a "volatile" flag. It covers the entire
range of physical pages. (Previously, we tracked ranges of volatile
pages, effectively making it a page-level concept.)
- Non-volatile objects maintain a physical page reservation via the
committed pages mechanism, to ensure full coverage for page faults.
- When an object is made volatile, it relinquishes any unused committed
pages immediately. If later made non-volatile again, we then attempt
to make a new committed pages reservation. If this fails, we return
ENOMEM to userspace.
mmap() now creates purgeable objects if passed the MAP_PURGEABLE option
together with MAP_ANONYMOUS. anon_create() memory is always purgeable.
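A userspace sketch of the intended flow (the madvise() advice
constants and return conventions here are written from memory and may
differ from the actual implementation):

    #include <errno.h>
    #include <sys/mman.h>

    size_t size = 1024 * 1024;
    void* buffer = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                        MAP_ANONYMOUS | MAP_PRIVATE | MAP_PURGEABLE, 0, 0);

    // Mark the region volatile; the kernel may now purge its pages:
    madvise(buffer, size, MADV_SET_VOLATILE);

    // Make it non-volatile again; ENOMEM here means no committed page
    // reservation could be established:
    if (madvise(buffer, size, MADV_SET_NONVOLATILE) < 0) {
        // handle ENOMEM
    }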
Right now, NE2000 NICs don't work because the link is down by default
and this will never change. In all the NE2000 documentation I looked
at, I could not find a link status indicator, so just assume the link
is up.
next_packet_page points to a page, but was being compared to a byte
offset rather than a page offset when adjusting the BOUNDARY register
when the ring buffer wraps around.
Fixes #8327.
This bug manifests itself when the caller to sys$pledge() passes valid
promises, but invalid execpromises. The code would apply the promises
and then return an error for the execpromises. This leaves the user in
a confusing state, as the promises were silently applied, but we return
an error suggesting the operation has failed.
Avoid this situation by tweaking the implementation to only apply the
promises / execpromises after all validation has occurred.
GCC 11 added a new option `-fzero-call-used-regs` which causes the
compiler to zero call-used (caller-saved) registers before returning
from a function. The goal is to reduce the possible attack surface by
disarming ROP gadgets that might be potentially useful to attackers,
and reducing the risk of information leaks via stale register data.
You can find the GCC commit below[0].
This is a mitigation I noticed on the Linux KSPP issue tracker[1] and
thought it would be useful for the SerenityOS Kernel as well.
The reduction in ROP gadgets is observable using the ropgadget utility:
$ ROPgadget --nosys --nojop --binary Kernel | tail -n1
Unique gadgets found: 42754
$ ROPgadget --nosys --nojop --binary Kernel.RegZeroing | tail -n1
Unique gadgets found: 41238
The size difference for the i686 Kernel binary is negligible:
$ size Kernel Kernel.RegZeroing
text data bss dec hex filename
13253648 7729637 6302360 27285645 1a0588d Kernel
13277504 7729637 6302360 27309501 1a0b5bd Kernel.RegZeroing
We don't have any great workloads to measure regressions in Kernel
performance, but Kees Cook mentioned he measured only around 1%
performance regression with this enabled on his Linux kernel build.[2]
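For reference, the flag can be tried out on a single translation unit
like so (used-gpr is one of several modes GCC accepts):

$ g++ -fzero-call-used-regs=used-gpr -c file.cpp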
References:
[0] d10f3e900b
[1] https://github.com/KSPP/linux/issues/84
[2] https://lore.kernel.org/lkml/20210714220129.844345-1-keescook@chromium.org/
This patch greatly simplifies VMObject locking by doing two things:
1. Giving VMObject an IntrusiveList of all its mapping Region objects.
2. Removing VMObject::m_paging_lock in favor of VMObject::m_lock
Before (1), VMObject::for_each_region() was forced to acquire the
global MM lock (since it worked by walking MemoryManager's list of
all regions and checking which regions pointed back to the VMObject.)
With each VMObject having its own list of Regions, VMObject's own
m_lock is all we need.
Before (2), page fault handlers used a separate mutex for preventing
overlapping work. This design required multiple temporary unlocks
and was generally extremely hard to reason about.
Instead, page fault handlers now use VMObject's own m_lock as well.
Despite what the declaration would have us believe, these are not
"u8*". If they were, we wouldn't have to use the & operator to take
their address and then cast them to "u8*"/FlatPtr afterwards.
Since we're taking from the committed set of pages, there should never
be a reason for this call to fail.
Also add a Badge to disallow taking committed pages from anywhere but
the Region class.
We don't need to have a dedicated API for creating a VMObject with a
single page; the multi-page API works in all cases.
Also make the API take a Span<NonnullRefPtr<PhysicalPage>> instead of
a NonnullRefPtrVector<PhysicalPage>.
Depending on the values it might be difficult to figure out whether a
value is decimal or hexadecimal. So let's make this more obvious. Also
this allows copying and pasting those numbers into GNOME calculator and
probably also other apps which auto-detect the base.
KBufferBuilder is always allowed to expand if it wants to. This
restriction was added a long time ago when it was unsafe to allocate
VM while generating ProcFS contents.
Use a Mutex instead of a SpinLock to protect the per-FileDescription
generated data cache. This allows processes to go to sleep while
waiting their turn.
Also don't try to be clever by reusing existing cache buffers.
Just allocate KBuffers as needed (and make sure to surface failures.)
KBufferBuilder exists for code that wants to build a KBuffer instead
of a String. KBuffer is backed by anonymous VM, while String is backed
by a kernel heap allocation.
Advisory locks don't actually prevent other processes from writing to
the file, but they do prevent other processes from acquiring an
advisory lock on the file.
This implementation currently only adds non-blocking locks, which are
all I need for now.
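Assuming this landed as the usual BSD-style flock() API, non-blocking
usage from userspace would look like:

    #include <fcntl.h>
    #include <sys/file.h> // flock(); header location varies by system

    int fd = open("/tmp/foo.lock", O_RDWR | O_CREAT, 0644);
    if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
        // A conflicting advisory lock is held by another process.
    }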
Before we start disabling acquisition of the big process lock for
specific syscalls, make sure to document and assert that the lock is
held during all syscalls.
There are cases where we want to conditionally take a lock, but still
would like to use an RAII type to make sure we don't leak the lock.
This was previously impossible to do with `MutexLocker` due to its
design. This commit tweaks the design to allow the object to be
initialized to an "empty" state without a lock associated, so it does
nothing, and then later a lock can be "attached" to the locker.
I realized that the get_lock() APIs were also unused, and would no
longer make sense for empty locks, so they were removed.
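A sketch of the resulting pattern (the name of the attach method is
hypothetical):

    MutexLocker locker; // "empty" state: no lock attached, does nothing
    if (need_exclusive_access)
        locker.attach_and_lock(m_lock); // hypothetical attach API
    // On scope exit, the locker only unlocks if a lock was attached.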
Move these to MM to simplify the flow of the syscall handler.
While here, also make sure we hold the process space lock for
the duration of the validation to avoid potential issues where
another thread attempts to modify the process space during the
validation. This will allow us to move the validation out of the
big process lock scope in a future change.
Additionally utilize the new no_lock variants of functions to avoid
unnecessary recursive process space spinlock acquisitions.
Currently all syscalls run under Process::m_big_lock, which is an
obvious bottleneck. Long term we would like to remove the big lock and
replace it with the required fine-grained locking.
To facilitate this goal we need a way of gradually decomposing the big
lock into all of the required fine-grained locks. This commit
introduces instrumentation to the syscall table, allowing the big lock
requirement to be toggled on/off per syscall.
Eventually, when we are finished, no syscall will require the big
lock, and we'll be able to remove all of this instrumentation.
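The instrumentation might take roughly this shape (a simplified sketch
of the per-syscall annotation):

    enum class NeedsBigProcessLock { Yes, No };

    #define ENUMERATE_SYSCALLS(S)              \
        S(getpid, NeedsBigProcessLock::No)     \
        S(fork, NeedsBigProcessLock::Yes)

    // The syscall dispatcher then only acquires Process::m_big_lock
    // for entries marked Yes.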
The entire process is not needed, just require the user to pass in the
Space. Also provide no_lock variant to use when you already have the
VM/Space lock acquired, to avoid unnecessary recursive spinlock
acquisitions.
Depending on the exact layout of the .ksyms section the kernel would
fail to boot because the kernel_load_end variable didn't account for the
section's size.
The non-CPU-specific code of the kernel shouldn't need to deal with
architecture-specific registers, and should instead deal with an
abstract view of the machine. This allows us to remove a variety of
architecture-specific ifdefs and helps keep the code slightly more
portable.
We do this by exposing the abstract representation of instruction
pointer, stack pointer, base pointer, return register, etc on the
RegisterState struct.
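Sketched out, the abstraction looks something like this (member and
accessor names are illustrative):

    struct RegisterState {
    #if ARCH(I386)
        FlatPtr eip, esp, ebp, eax;
        FlatPtr instruction_pointer() const { return eip; }
        FlatPtr stack_pointer() const { return esp; }
    #else
        FlatPtr rip, rsp, rbp, rax;
        FlatPtr instruction_pointer() const { return rip; }
        FlatPtr stack_pointer() const { return rsp; }
    #endif
    };

Non-architecture-specific code can then ask for
regs.instruction_pointer() without caring whether that means eip or
rip.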
Allocate all the RX buffers in one big memory region (and same for TX.)
This removes 38 lines from every crash dump (and just seems like a
reasonable idea in general.)
We need to cast physical addresses to PhysicalPtr instead of FlatPtr,
which is currently always 64 bits. However, if one day we were to
support 32-bit non-PAE mode then it would also truncate appropriately.
This switches tracking CPU usage to more accurately measure time in
user and kernel land using either the TSC or another time source.
This will also come in handy when implementing a tickless kernel mode.
As threads come and go, we can't simply account for how many time
slices the threads at any given point may have been using. We need to
also account for threads that have since disappeared. This means we
also need to track how many time slices we have expired globally.
However, because this doesn't account for context switches outside of
the system timer tick, values may still be under-reported. To solve
this we will need to track more accurate time information on each
context switch.
This also fixes top's cpu usage calculation which was still based on
the number of context switches.
Fixes #6473
This adds a ".profile" extension to perfcore files written by the
Kernel. Also, the process name is now visible in the perfcore filename.
Furthermore, this patch adds error handling for the case where the
filename generated by the Kernel is already taken. In that case, a digit
will be added to the filename (before the extension).
This also adds some more error logging to dump_perfcore().
This implements a simple bootloader that is capable of loading ELF64
kernel images. It does this by using QEMU/GRUB to load the kernel image
from disk and pass it to our bootloader as a Multiboot module.
The bootloader then parses the ELF image and sets it up appropriately.
The kernel's entry point is a C++ function with architecture-native
code.
Co-authored-by: Liav A <liavalb@gmail.com>
When a Thread is being unblocked and we need to re-lock the process
big_lock, and that re-locking blocks again, we may end up in
Thread::block again while still servicing the original lock's
Thread::block. So permit recursion as long as it's only the big_lock
that we block on again.
Fixes #8822
We often get queried for the root inode, and it will always be cached
in memory anyway, so let's make Ext2FS::root_inode() fast by keeping
the root inode in a dedicated member variable.
To count the remaining children, we simply need to traverse the
directory and increment a counter. No need for a custom virtual that
all file systems have to implement. :^)
Unless we're accessing mutex-guarded metadata, there's no need to
acquire the inode lock.
The file system ID or inode index of a constructed inode will never
change, for example.
We should never request the removal of a region that we don't
currently own; all callers already assert this elsewhere.
Instead, let's just push the assert down into the RedBlackTree removal
and assume that we will always successfully remove the region.
This is a much more ergonomic option than getting a
`VERIFY_NOT_REACHED()` failure at run-time. I encountered this issue
with Clang, where sized deallocation is not the default due to ABI
breakage concerns.
Note that we can't simply just not declare these functions, because the
C++ standard states:
> If this function with size parameter is defined, the program shall
> also define the version without the size parameter.
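In other words, the two forms have to be declared as a pair; a minimal
sketch of what that looks like in the kernel (here both simply forward
to kfree):

    void operator delete(void* ptr) noexcept { kfree(ptr); }
    void operator delete(void* ptr, size_t) noexcept { kfree(ptr); }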
The compiler will use these to allocate objects that have alignment
requirements greater than that of our normal `operator new` (4/8 byte
aligned).
This means we can now use smart pointers for over-aligned types.
Fixes a FIXME.
By default, the compiler will assume that `operator new` returns
pointers that are aligned correctly for every built-in type. This is not
the case in the kernel on x64, since the assumed alignment is 16
(because of long double), but the kmalloc blocks are only
`alignas(void*)`.
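A sketch of the operators the compiler expects for such types (the
aligned allocation helpers are hypothetical names):

    void* operator new(size_t size, std::align_val_t align)
    {
        return kmalloc_aligned(size, static_cast<size_t>(align)); // hypothetical helper
    }

    void operator delete(void* ptr, size_t, std::align_val_t) noexcept
    {
        kfree_aligned(ptr); // hypothetical helper
    }

With these in place, smart pointers to over-aligned types pick up
correctly aligned storage automatically.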
Thread::yield_and_release_relock_big_lock releases the big lock, yields
and then relocks the big lock.
Thread::yield_assuming_not_holding_big_lock yields assuming the big
lock is not being held.
When blocking on a Lock other than the big lock and we're holding the
big lock, we need to release the big lock first. This fixes some
deadlocks where a thread blocks while holding the big lock, preventing
other threads from getting the big lock in order to unblock the waiting
thread.
When a Lock blocks (e.g. due to a mode mismatch or because someone
else holds it) the lock mode will be updated to what was requested.
There were also some cases where restoring locks may not have worked
as intended, as the lock may have already been held by the same thread.
Fixes #8787
The kernel doesn't currently boot when using an address other than
0xc0000000 because the page tables aren't set up properly for that,
but this at least lets us build the kernel.
The 32-bit boot code jumps to 0xc0000000 + entry address once page
tables are set up. This is unnecessary for 64-bit mode because we'll
do another far jump just moments later.
I botched this in 859e5741ff; the check was supposed to use
Process::is_kernel_process().
This fixes an issue with zombie processes hanging around forever.
Thanks tomuta for spotting it! :^)
Reimplement directory traversal in terms of read_bytes() instead of
doing direct block access. This lets us avoid taking the inode lock
while iterating over the directory contents.
Once we've finalized all the file system metadata in flush_writes(),
we no longer need to hold the file system lock during the call to
BlockBasedFileSystem::flush_writes().
Ext2FS::get_inode() will remember unknown inode indices that it has
been asked about and put them into the inode cache as null inodes.
flush_writes() was not null-checking these while iterating, which
was a bug I finally managed to hit.
Flushing also seemed like a good time to drop unknown inodes from
the cache, since there's no good reason to hold on to them indefinitely.
The file system lock is meant to protect the file system metadata
(super blocks, bitmaps, etc.), not to protect processes from reading
independent parts of the disk at once.
This patch introduces a new lock to protect the *block cache* instead,
which is the real thing that needs synchronization.
Forcing users of a FileDescription to seek before they can read/write
makes it inherently racy. This patch adds variants of read/write that
simply ignore the "current offset" of the description in favor of a
caller-supplied offset.
This data structure is a much better fit for what is essentially a
sorted list of non-overlapping ranges.
Not using Vector means we no longer have to worry about Vector buffers
getting huge. Only nice & small allocations from now on.
This adds a new section .ksyms at the end of the linker map, reserves
5MiB for it (which are after end_of_kernel_image so they get re-used
once MemoryManager is initialized) and then embeds the symbol map into
the kernel binary with objcopy. This also shrinks the .ksyms section to
the real size of the symbol file (around 900KiB at the moment).
By doing this we can make the symbol map available much earlier in the
boot process, i.e. even before VFS is available.
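The embedding step boils down to an objcopy invocation along these
lines (file names are illustrative):

$ objcopy --update-section .ksyms=kernel.map Kernel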
We leak a ref() onto every user process when constructing them,
either via Process::create_user_process(), or via Process::sys$fork().
This ref() is balanced by a corresponding unref() in
Thread::WaitBlockCondition::finalize().
Since kernel processes don't have a leaked ref() on them, this led to
an extra Process::unref() on kernel processes during finalization.
This happened during every boot, with the `init_stage2` process.
Found by turning off kfree() scrubbing. :^)
In case we are about to delete the PID directory, we clear the Process
pointer. If someone still holds a reference to the PID directory (by
having opened it), we still need to delete the process, but we can't
delete the directory, so we keep it alive. Any operation on it will
then fail by propagating an error to userspace indicating that the
Process was deleted, so there's no point in trying to operate on the
directory.
Fixes #8576.
The C++ standard says that it's legal to call the `delete` operator with
a null pointer argument, in which case it should be a no-op. I
encountered this issue when running a kernel that's compiled with Clang.
I assume this fact was used for some kind of optimization.
It's possible that one thread might try to exit the process at about
the same time as another thread does the same, or a crash happens.
Also, we may not be able to kill all other threads instantly as they
may be blocked in the kernel (though in this case they would get
killed before ever returning back to user mode). So keep track of
whether Process::die was already called and ignore it on subsequent
calls.
Fixes #8485
We can have multiple PhysicalRegions (often the case when there is a
huge amount of RAM) so we really shouldn't print a debug message any
time someone tries to allocate from one. They will move on to another
region anyway.
We were incorrectly using sizeof(PhysicalPageEntry) for some address
calculations instead of sizeof(PageTableEntry).
It still worked correctly because they happen to be the same size.
We now keep all the PhysicalZones on one of two intrusive lists within
the PhysicalRegion.
The "usable" list contains all zones that can be allocated from,
and the "full" list contains all zones with no free pages.
Instead of creating a PhysicalRegion and then expanding it over and
over as we traverse the memory map on boot, we now compute the final
size of the contiguous physical range up front, and *then* create a
PhysicalRegion object.
Nobody was using this API to request anything above `PAGE_SIZE`
alignment, so let's get rid of it for now. We can reimplement it if
we end up needing it.
Also note that it wasn't actually used anywhere.
The previous allocator was very naive and kept the state of all pages
in one big bitmap. When allocating, we had to scan through the bitmap
until we found an unset bit.
This patch introduces a new binary buddy allocator that manages the
physical memory pages.
Each PhysicalRegion is divided into zones (PhysicalZone) of 16MB each.
Any extra pages at the end of physical RAM that don't fit into a 16MB
zone are turned into 15 or fewer 1MB zones.
Each zone starts out with one full-sized block, which is then
recursively subdivided into halves upon allocation, until a block of
the requested size can be returned.
There are more opportunities for improvement here: the way zone objects
are allocated and stored is non-optimal. Same goes for the allocation
of buddy block state bitmaps.
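A minimal sketch of the two core buddy operations, assuming free lists
indexed by order (order 0 = 1 page, order k = 2^k pages; the free_list
container is hypothetical):

    // Buddies differ in exactly bit 'order' of their page index:
    static constexpr size_t buddy_of(size_t block, size_t order)
    {
        return block ^ (size_t(1) << order);
    }

    size_t allocate_block(size_t wanted_order)
    {
        size_t order = wanted_order;
        while (free_list[order].is_empty()) // smallest free block that fits
            ++order;
        size_t block = free_list[order].take_first();
        while (order > wanted_order) { // split down to the wanted size,
            --order;                   // returning one half to the free list
            free_list[order].append(buddy_of(block, order));
        }
        return block;
    }

Freeing walks the same ladder in reverse: while a block's buddy is
also free, the two merge into a single block one order higher.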
Threads that don't make syscalls still need to be killed, and we can
do that at any time we want, so long as the thread is in user mode and
not somehow blocked (e.g. on a page fault).
This reverts commit 3c3a1726df.
We cannot blindly kill threads just because they're not executing in a
system call. Being blocked (including in a page fault) needs proper
unblocking and potentially kernel stack cleanup before we can mark a
thread as Dying.
Fixes #8691
This enables the Lock class to block a thread even while the thread is
working on a BlockCondition. A thread can still only be either blocked
by a Lock or a BlockCondition.
This also establishes a linked list of threads that are blocked by a
Lock and unblocking directly unlocks threads and wakes them directly.
It's possible that a timer may have been queued to be executed by
the timer irq handler, but if we're in a critical section on the
same processor and are trying to cancel that timer, we would spin
forever waiting for it to be executed.
This re-arranges the order of how things are initialized so that we
try to initialize process and thread management earlier. This is
necessary because a lot of the code uses the Lock class, which really
needs to have a running scheduler in place so that we can properly
preempt.
This also enables us to potentially initialize some things in parallel.
Instead of each PhysicalPage knowing whether it comes from the
supervisor pages or from the user pages, we can just check in both
sets when freeing a page.
It's just a handful of pointer range checks, nothing expensive.
There appears to be no reason why the process registration needs
to happen under the space spin lock. As the first thread is not started
yet it should be completely uncontested, but it's still bad practice.
If no other thread is ready to be run we don't need to switch to the
idle thread and wait for the next timer interrupt. We can just give
the thread another timeslice and keep it running.
We need some overflow checks due to the implementation of TmpFS.
When size_t is 32 bits and off_t is 64 bits, we might overflow our
KBuffer max size and confuse the KBuffer set_size code, causing a VERIFY
failure. Make sure that the resulting offset + size will fit in a
size_t.
As another constraint, we make sure that the resulting offset + size
will be less than half of the maximum value of a size_t, because we
double the KBuffer size each time we resize it.
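A sketch of the checks involved, using AK's Checked type (the exact
error code and placement in the inode code are illustrative):

    if (offset < 0 || static_cast<u64>(offset) > NumericLimits<size_t>::max())
        return EOVERFLOW;

    Checked<size_t> end = static_cast<size_t>(offset);
    end += size;
    // Stay under half of size_t's range, since the KBuffer doubles on resize:
    if (end.has_overflow() || end.value() > NumericLimits<size_t>::max() / 2)
        return EOVERFLOW;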
We had an inconsistency in valid user addresses. is_user_range() was
checking against the kernel base address, but previous changes caused
the maximum valid user addressable range to be 32 MiB below that.
This patch stops mmap(MAP_FIXED) of a range between these two bounds
from panicking the kernel in RangeAllocator::allocate_specific.
Previously we would simply assume that Region allocation always
succeeded. There is still one such assumption when splitting user
regions inside a Space. That will be dealt with in a separate commit.
It is not legal to resize a VMObject after it has been created.
As far as I can tell, this code would never actually run since the
object was already populated with physical pages due to using
AllocationStrategy::AllocateNow.
Previously, VirtualFileSystem::mkdir() would always return ENOENT if
no parent custody was returned by resolve_path(). This is incorrect when
e.g. the user has no search permission on a component of the path
prefix (=> EACCES), or when one component of the path prefix is a file
(=> ENOTDIR). This patch fixes that behavior.
This was only used by a single class (AK::ByteBuffer) in the kernel
and not in an OOM-safe way.
Now that ByteBuffer no longer uses it, there's no need for the kernel
heap to burden itself with supporting this.
C++14 gave us sized operator delete, but we haven't been taking
advantage of it. Let's get to a point where it can help us by
adding kfree_sized(void*, size_t).
This removes some assertions from KLexicalPath::basename() by supporting
paths with trailing slashes, empty paths, paths consisting of only
slashes and paths with ending "." and ".." segments.
The spec requires a flush after setting the new buffer resource id,
which is required by QEMU's SDL backend but not the GTK backend. This
brings us in line with the spec and makes it work for the SDL backend.
The System V ABI for both x86 and x86_64 requires that the stack pointer
is 16-byte aligned on entry. Previously we did not align the stack
pointer properly.
As far as "main" was concerned the stack alignment was correct even
without this patch due to how the C++ _start function and the kernel
interacted, i.e. the kernel misaligned the stack as far as the ABI
was concerned but that misalignment (read: it was properly aligned for
a regular function call - but misaligned in terms of what the ABI
dictates) was actually expected by our _start function.
This involves refactoring VirtIOConsole into VirtIOConsole and
VirtIOConsolePort. VirtIOConsole is the VirtIODevice, it owns multiple
VirtIOConsolePorts as well as two control queues. Each
VirtIOConsolePort is a CharacterDevice.
By making sure the PhysicalPage instance is fully destructed the
allocators will have a chance to reclaim the PhysicalPageEntry for
free-list purposes. Just pass them the physical address of the page
that was freed, which is enough to lookup the PhysicalPageEntry later.
By moving the PhysicalPage classes out of the kernel heap into a static
array, one for each physical page, we can avoid the added overhead and
easily find them by indexing into an array.
This also wraps the PhysicalPage into a PhysicalPageEntry, which allows
us to re-use each slot with information where to find the next free
page.
We already use PAE for the NX bit, but this changes the PhysicalAddress
structure to be able to hold 64 bit physical addresses. This allows us
to use all the available physical memory.
If a non-const lvalue reference is passed to these constructors, the
converting constructor will be selected instead of the desired copy/move
constructor.
Since I needed to touch `KResultOr` anyway, I made the forwarding
converting constructor use `forward<U>` instead of `move`. This meant
that previously, if an lvalue was passed to it, a move operation took
place even if no `move()` was called on it. Member initializers and
if-else statements have been changed to match our current coding style.
These functions are only used from within `dbgln_if` calls, so in
certain build configurations, they go unused. Similarly to variables, we
now signal to the compiler that we understand that these are not always
in use.
`.text` segments with non-aligned offsets had their lengths applied to
the first page's base address. This meant that in some cases the last
PAGE_SIZE - 1 bytes weren't mapped. Previously, this did not cause any
problems, as GNU ld insists on aligning everything; but that's not
the case with the LLVM toolchain.
The ProtectedDataMutationScope cannot blindly assume that there is only
exactly one thread at a time that may want to unprotect the Process.
Most of the time the big lock guaranteed this, but there are some cases
such as finalization (among others) where this is not necessarily
guaranteed.
This fixes random panics due to access violations when the
ProtectedDataMutationScope protects the Process instance while another
is still modifying it.
Fixes #8512
This class acts as a combined ref-count and spin-lock (the lock is
only taken when adding the first or removing the last reference),
allowing a specific action to run atomically when the first reference
is added or the last one is dropped.
This adds a formatter function for OwnPtr<KString>. This is added mainly
because lots of dbgln() statements generate Strings (such as absolute
paths) which are only used for debugging. Instead of catching possible
OOM situations at all the dbgln() callsites, this makes it possible to
let the formatter code handle those situations by outputting "[out of
memory]" if the OwnPtr is null.
This adds KLexicalPath::try_join(). As this cannot be done without
allocation, it uses KString and can fail. This patch also uses it at one
place. All the other cases of String::formatted("{}/{}", ...) currently
rely on the return value being a String, which means they cannot easily
be converted to use the new API.
This replaces all uses of LexicalPath in the Kernel with the functions
from KLexicalPath. This also allows the Kernel to stop including
AK::LexicalPath.
This adds KLexicalPath, a small set of static functions which aim to
mostly emulate AK::LexicalPath. They are however constrained to work
with absolute paths only, containing no '.' or '..' path segments and
no consecutive slashes. This way, it is possible to use StringView for
the return values and thus avoid allocating new String objects.
As explained above, the functions are currently very strict about the
allowed input paths. This seems to not be a problem currently. Since the
functions VERIFY this, potential bugs caused by this will become
immediately obvious.
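For a feel of the shape of these functions, a simplified basename()
under those constraints might look like this (the real implementation
may differ):

    StringView basename(StringView path)
    {
        // Input is an absolute, canonical path (VERIFY'd in the real
        // code); the root path "/" would need special-casing, elided here.
        size_t last_slash = 0;
        for (size_t i = 0; i < path.length(); ++i) {
            if (path[i] == '/')
                last_slash = i;
        }
        return path.substring_view(last_slash + 1);
    }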