The only purpose of the remap() in Region::try_clone() is to ensure
non-writable page table entries for CoW regions. If a region is already
non-writable, there's no need to waste time updating the page tables.
When mapping or unmapping completely inaccessible memory regions,
we don't need to update the page tables at all. This saves a bunch of
time in some situations, most notably during dynamic linking, where we
make a large VM reservation and immediately throw it away. :^)
We were already only tracking kernel regions, this patch just makes it
more clear by having it reflected in the name of the registration
helpers.
We also stop calling them for userspace regions, avoiding some spinlock
action in such cases.
This optimization was added when region lookup was O(n), before we had
the O(log n) RedBlackTree. Let's remove it to simplify the code, as we
have no evidence that it remains valuable.
Previously we would only remove them from the map if they were attached
to an AddressSpace, even though we would always add them to the map on
construction. This results in an assertion failure on destruction if
the page directory was never attached to an AddressSpace. (for example,
on an allocation failure of said AddressSpace)
This mostly just moved the problem, as a lot of the callers are not
capable of propagating the errors themselves, but it's a step in the
right direction.
When deleting an entire AddressSpace, we don't need to do TLB flushes
at all (since the entire page directory is going away anyway).
We also don't need to deallocate VM ranges one by one, since the entire
VM range allocator will be deleted anyway.
The purpose of the PageDirectory::m_page_tables map was really just
to act as ref-counting storage for PhysicalPage objects that were
being used for the directory's page tables.
However, this was basically redundant, since we can find the physical
address of each page table from the page directory, and we can find the
PhysicalPage object from MemoryManager::get_physical_page_entry().
So if we just manually ref() and unref() the pages when they go in and
out of the directory, we no longer need PageDirectory::m_page_tables!
Not only does this remove a bunch of kmalloc() traffic, it also solves
a race condition that would occur when lazily adding a new page table
to a directory:
Previously, when MemoryManager::ensure_pte() would call HashMap::set()
to insert the new page table into m_page_tables, if the HashMap had to
grow its internal storage, it would call kmalloc(). If that kmalloc()
would need to perform heap expansion, it would end up calling
ensure_pte() again, which would clobber the page directory mapping used
by the outer invocation of ensure_pte().
The net result of the above bug would be that any invocation of
MemoryManager::ensure_pte() could erroneously return a pointer into
a kernel page table instead of the correct one!
This whole problem goes away when we remove the HashMap, as ensure_pte()
no longer does anything that allocates from the heap.
Not all drivers need the PhysicalPage output parameter while creating
a DMA buffer. This overload will avoid creating a temporary variable
for the caller
The cacheable parameter to allocate_kernel_region should be explicitly
set to No as this region is used to do physical memory transfers. Even
though most architectures ignore this even if it is set, it is better
to make this explicit.
FixedArray now doesn't expose any infallible constructors anymore.
Rather, it exposes fallible methods. Therefore, it can be used for
OOM-safe code.
This commit also converts the rest of the system to use the new API.
However, as an example, VMObject can't take advantage of this yet,
as we would have to endow VMObject with a fallible static
construction method, which would require a very fundamental change
to VMObject's whole inheritance hierarchy.
So far we only had mmap(2) functionality on the /dev/mem device, but now
we can also do read(2) on it.
The test unit was updated to check we are doing it safely.
As it was pointed by Idan Horowitz, the rest of the method doesn't
assume we have any reserved ranges to allow mmap(2) to work on them, so
the VERIFY is not needed at all.
This was a premature optimization from the early days of SerenityOS.
The eternal heap was a simple bump pointer allocator over a static
byte array. My original idea was to avoid heap fragmentation and improve
data locality, but both ideas were rooted in cargo culting, not data.
We would reserve 4 MiB at boot and only ended up using ~256 KiB, wasting
the rest.
This patch replaces all kmalloc_eternal() usage by regular kmalloc().
Previously, the heap expansion logic could end up calling kmalloc
recursively, which was quite messy and hard to reason about.
This patch redesigns heap expansion so that it's kmalloc-free:
- We make a single large virtual range allocation at startup
- When expanding, we bump allocate VM from that region
- When expanding, we populate page tables directly ourselves,
instead of going via MemoryManager.
This makes heap expansion a great deal simpler. However, do note that it
introduces two new flaws that we'll need to deal with eventually:
- The single virtual range allocation is limited to 64 MiB and once
exhausted, kmalloc() will fail. (Actually, it will PANIC for now..)
- The kmalloc heap can no longer shrink once expanded. Subheaps stay
in place once constructed.
The function to protect ksyms after initialization, is only used during
boot of the system, so it can be UNMAP_AFTER_INIT as well.
This requires we switch the order of the init sequence, so we now call
`MM.protect_ksyms_after_init()` before `MM.unmap_text_after_init()`.
As a small cleanup, this also makes `page_round_up` verify its
precondition with `page_round_up_would_wrap` (which callers are expected
to call), rather than having its own logic.
Fixes#11297.
This error only ever gets propagated to the userspace if
MAP_FIXED_NOREPLACE is requested, as MAP_FIXED unmaps intersecting
ranges beforehand, and non-fixed mmap() calls will just fall back to
allocating anywhere.
Linux specifies MAP_FIXED_NOREPLACE to return EEXIST when it can't
allocate, we now match that behavior.
The Prekernel's memory is only accessed until MemoryManager has been
initialized. Keeping them around afterwards is both unnecessary and bad,
as it prevents the userland from using the 0x100000-0x155000 virtual
address range.
Co-authored-by: Idan Horowitz <idan.horowitz@gmail.com>
In order to reduce our reliance on __builtin_{ffs, clz, ctz, popcount},
this commit removes all calls to these functions and replaces them with
the equivalent functions in AK/BuiltinWrappers.h.
We can leave the .ksyms section mapped-but-read-only and then have the
symbols index simply point into it.
Note that we manually insert null-terminators into the symbols section
while parsing it.
This gets rid of ~950 KiB of kmalloc_eternal() at startup. :^)
Since it's possible to determine where the small zones will start to
occur for each PhysicalRegion, we can use arithmetic so that the call
time for both large and small zones is identical.
It's not enough to just find the largest-address-not-above the argument,
we must also check that the found region actually contains the argument.
Regressed in a23edd42b8, thanks to Idan
for pointing this out.
Most of the time, we will be freeing physical pages within the
full-sized zones. We can do some simple math to find the right zone
immediately instead of looping through the zones, checking each one.
We still do loop through the slack/remainder zones at the end.
There's probably an even nicer way to solve this, but this is already a
nice improvement. :^)
We were already doing this for userspace memory regions (in the
Memory::AddressSpace class), so let's do it for kernel regions as well.
This gives a nice speed-up on test-js and probably basically everything
else as well. :^)
SIGSTKFLT is a signal that signifies a stack fault in a x87 coprocessor,
this signal is not POSIX and also unused by Linux and the BSDs, so let's
use SIGSEGV so programs that setup signal handlers for the common
signals could still handle them in serenity.