If we unregister from the RegionTree before unmapping, there's a race
where a new region can get inserted at the same address that we're about
to unmap. If this happens, ~Region() will then unmap the newly inserted
region, which now finds itself with cleared-out page table entries.
This had no business being in RegionTree, since RegionTree doesn't track
identity-mapped regions anyway. (We allow *any* address to be identity
mapped, not just the ones that are part of the RegionTree's range.)
This patch adds RegionTree::get_lock() which exposes the internal lock
inside RegionTree. We can then lock it from the outside when doing
lookups or traversal.
This solution is not very beautiful, we should find a way to protect
this data with SpinlockProtected or something similar. This is a stopgap
patch to try and fix the currently flaky CI.
...and remove the last remaining client of the API. It's no longer
possible to ask the RegionTree for a VM range. You can only ask it to
place your Region somewhere in available space.
This patch move AddressSpace (the per-process memory manager) to using
the new atomic "place" APIs in RegionTree as well, just like we did for
MemoryManager in the previous commit.
This required updating quite a few places where VM allocation and
actually committing a Region object to the AddressSpace were separated
by other code.
All you have to do now is call into AddressSpace once and it'll take
care of everything for you.
Instead of first allocating the VM range, and then inserting a region
with that range into the MM region tree, we now do both things in a
single atomic operation:
- RegionTree::place_anywhere(Region&, size, alignment)
- RegionTree::place_specifically(Region&, address, size)
To reduce the number of things we do while locking the region tree,
we also require callers to provide a constructed Region object.
This patch ports MemoryManager to RegionTree as well. The biggest
difference between this and the userspace code is that kernel regions
are owned by extant OwnPtr<Region> objects spread around the kernel,
while userspace regions are owned by the AddressSpace itself.
For kernelspace, there are a couple of situations where we need to make
large VM reservations that never get backed by regular VMObjects
(for example the kernel image reservation, or the big kmalloc range.)
Since we can't make a VM reservation without a Region object anymore,
this patch adds a way to create unbacked Region objects that can be
used for this exact purpose. They have no internal VMObject.)
RegionTree holds an IntrusiveRedBlackTree of Region objects and vends a
set of APIs for allocating memory ranges.
It's used by AddressSpace at the moment, and will be used by MM soon.
This patch stops using VirtualRangeAllocator in AddressSpace and instead
looks for holes in the region tree when allocating VM space.
There are many benefits:
- VirtualRangeAllocator is non-intrusive and would call kmalloc/kfree
when used. This new solution is allocation-free. This was a source
of unpleasant MM/kmalloc deadlocks.
- We consolidate authority on what the address space looks like in a
single place. Previously, we had both the range allocator *and* the
region tree both being used to determine if an address was valid.
Now there is only the region tree.
- Deallocation of VM when splitting regions is no longer complicated,
as we don't need to keep two separate trees in sync.
Now that we reclaim the memory range that is created by KASLR before
the start of the kernel image, there's no need to be conservative with
the KASLR offset.
This ensures we don't just waste the memory range between the default
base load address and the actual load address that was shifted by the
KASLR offset.
If we crashed in the middle of mapping in Regions, some of the regions
may not have a page directory yet, and will result in a crash when
Region::remap() is called.
If someone specifically wants contiguous memory in the low-physical-
address-for-DMA range ("super pages"), they can use the
allocate_dma_buffer_pages() helper.
Function-local `static constexpr` variables can be `constexpr`. This
can reduce memory consumption, binary size, and offer additional
compiler optimizations.
These changes result in a stripped x86_64 kernel binary size reduction
of 592 bytes.
As make<T> is infallible, it really should not be used anywhere in the
Kernel. Instead replace with fallible `new (nothrow)` calls, that will
eventually be error-propagated.
When a page fault led to the mapping of a new physical page, we were
updating the page tables for *every* region that shared the same
underlying VMObject.
Let's just not do that, avoiding a bunch of unnecessary page table
updates and TLB invalidations.