Commit Graph

612 Commits

Author SHA1 Message Date
Andreas Kling
f35108fc31 Kernel: Simplify PageDirectory allocation failure
This patch gets rid of the "valid" bit in PageDirectory since it was
only used to communicate an allocation failure during construction.

We now do all the work in the static factory functions instead of in the
constructor, which allows us to simply return nullptr instead of an
"invalid" PageDirectory.
2021-08-06 00:37:47 +02:00
Andreas Kling
27100126c0 Kernel: Fix logic typo in AnonymousVMObject::handle_cow_fault()
Introduced in dd58d0f650
2021-08-06 00:37:47 +02:00
Idan Horowitz
3e909c0c49 Kernel: Remove double-counting of allocated pages in AnonymousVMObject
When constructing an AnonymousVMObject with the AllocateNow allocation
strategy we accidentally allocated the committed pages directly through
MemoryManager instead of taking them from our m_unused_physical_pages
CommittedPhysicalPageSet, which meant they were counted as allocated in
MemoryManager, but were still counted as unallocated in the PageSet,
who would then try to uncommit them on destruction, resulting in a
failed assertion.

To help prevent similar issues in the future a Badge<T> was added to
MM::allocate_committed_user_physical_page to prevent allocation of
commited pages not via a CommittedPhysicalPageSet.
2021-08-05 20:26:47 +02:00
Andreas Kling
dd58d0f650 Kernel: Uncommit a shared COW page when discovering it was unshared
When we hit a COW fault and discover than no other process is sharing
the physical page, we simply remap it r/w and save ourselves the
trouble. When this happens, we can also give back (uncommit) one of our
shared committed COW pages, since we won't be needing it.

We had this optimization before, but I mistakenly removed it in
50472fd69f since I had misunderstood
it to be the reason for a panic.
2021-08-05 17:41:58 +02:00
Andreas Kling
843d0d0d15 Kernel: Detach AnonymousVMObject from shared COW pages set once emptied
We currently overcommit for COW when forking a process and cloning its
memory regions. Both the parent and child process share a set of.
committed COW pages.

If there's COW sharing across more than two processeses within a lineage
(e.g parent, child & grandchild), it's possible to exhaust these pages.
When the shared set is emptied, the next COW fault in each process must
detach from the shared set and fall back to on demand allocation.

This patch makes sure that we detach from the shared set once we
discover it to be empty (during COW fault handling). This fixes an issue
where we'd try to allocate from an exhausted shared set while building
GNU binutils inside SerenityOS.
2021-08-05 17:41:58 +02:00
Andreas Kling
89a9ae7d0c Kernel: Handle AnonymousVMObject allocation failure when forking
Thanks to all the RAII, AnonymousVMObject::try_clone() can now
gracefully handle allocation failure.
2021-08-05 17:41:58 +02:00
Andreas Kling
0672163840 Kernel: Simplify AnonymousVMObject copy constructor
It was doing a bunch of things it didn't need to do. I think we had
misunderstood the base class as having copied m_lock in its copy
constructor but it's actually default initialized.
2021-08-05 17:41:58 +02:00
Andreas Kling
fa627c1eb2 Kernel: Use RAII to manage committed physical pages
We had issues with committed physical pages getting miscounted in some
situations, and instead of figuring out what was going wrong and making
sure all the commits had matching uncommits, this patch makes the
problem go away by adding an RAII class to manage this instead. :^)

MemoryManager::commit_user_physical_pages() now returns an (optional)
CommittedPhysicalPageSet. You can then allocate pages from the page set
by calling take_one() on it. Any unallocated pages are uncommitted upon
destruction of the page set.
2021-08-05 17:41:58 +02:00
Idan Horowitz
ffee3d6c5d Kernel: Remove the always 1-sized super physical regions Vector
Since we only ever add 1 super physical region, theres no reason to
add the extra redirection and allocation that comes with a dynamically
sized Vector.
2021-08-04 21:02:40 +02:00
Andreas Kling
821a6e8b4c Kernel: Remap regions after changing purgeable VM object volatile flag
Otherwise we could end up with stale page table mappings that give us
erroneous Protection Violation faults when accessing valid addresses.
2021-08-03 10:23:36 +02:00
Andreas Kling
363a901603 Kernel: Copy the "purgeable" flag when cloning AnonymousVMObject 2021-07-31 11:22:53 +02:00
Andreas Kling
25b76462bf Revert "Kernel: Share committed COW pages between whole VMObject lineage"
This reverts commit 2c0df5e7e7.

This caused OOM on CI, so let's roll it back until we can figure it out.
2021-07-30 13:50:24 +02:00
Andreas Kling
2c0df5e7e7 Kernel: Share committed COW pages between whole VMObject lineage
When cloning an AnonymousVMObject, committed COW pages are shared
between the parent and child object. Whicever object COW's first will
take the shared committed page, and if the other object ends up doing
a COW as well, it will notice that the page is no longer shared by
two objects and simple remap it as read/write.

When a child is COW'ed again, while still having shared committed
pages with its own parent, the grandchild object will now join in the
sharing pool with its parent and grandparent. This means that the first
2 of 3 objects that COW will draw from the shared committed pages, and
3rd will remap read/write.

Previously, we would "fork" the shared committed pages when cloning,
which could lead to a situation where the grandparent held on to 1 of
the 3 needed shared committed pages. If both the child and grandchild
COW'ed, they wouldn't have enough pages, and since the grandparent
maintained an extra +1 ref count on the page, it wasn't possible to
to remap read/write.
2021-07-30 13:17:55 +02:00
Andreas Kling
bccdc08487 Kernel: Unmapping a non-mapped region with munmap() should be a no-op
Not a regression per se from 0fcb9efd86
since we were crashing before that which is obviously worse.
2021-07-30 13:16:55 +02:00
Brian Gianforcaro
0fcb9efd86 Kernel: Return an error when unmap finds no intersecting region
We currently always crash if a user attempts to unmap a range that
does not intersect with an existing region, no matter the size. This
happens because we will never explicitly check to see if the search
for intersecting regions found anything, instead loop over the results,
which might be an empty vector. We then attempt to deallocate the
requested range from the `RangeAllocator` unconditionally, which will
be invalid if the specified range is not managed by the RangeAllocator.
We will assert validating m_total_range.contains(..) the range we are
requesting to deallocate.

This fix to this is straight forward, error out if we weren't able to
find any intersections.

You can get stress-ng to attempt this pattern with the following
arguments, which will attempt to unmap 0x0 through some large offset:

```
stress-ng --vm-segv 1
```

Fixes: #8483

Co-authored-by: Federico Guerinoni <guerinoni.federico@gmail.com>
2021-07-30 11:28:55 +02:00
Andreas Kling
c69e147b8b Kernel: Improve some comments in Space
Remove bogus FIXME's and improve some comments.
2021-07-27 15:04:36 +02:00
Andreas Kling
a085168c52 Kernel: Rename Space::create => Space::try_create() 2021-07-27 14:54:35 +02:00
Andreas Kling
1e43292c3b Kernel: Introduce ProcessorSpecific<T> for per-CPU data structures
To add a new per-CPU data structure, add an ID for it to the
ProcessorSpecificDataID enum.

Then call ProcessorSpecific<T>::initialize() when you are ready to
construct the per-CPU data structure on the current CPU. It can then
be accessed via ProcessorSpecific<T>::get().

This patch replaces the existing hard-coded mechanisms for Scheduler
and MemoryManager per-CPU data structure.
2021-07-27 14:32:30 +02:00
Andreas Kling
559ab00249 Kernel: Remove unused Region::translate_vmobject_page_range() 2021-07-27 13:17:33 +02:00
Gunnar Beutner
57417a3d6e Kernel: Support loading the kernel at almost arbitrary virtual addresses
This enables further work on implementing KASLR by adding relocation
support to the pre-kernel and updating the kernel to be less dependent
on specific virtual memory layouts.
2021-07-27 13:15:16 +02:00
Gunnar Beutner
b10a86d463 Prekernel: Export some multiboot parameters in our own BootInfo struct
This allows us to specify virtual addresses for things the kernel should
access via virtual addresses later on. By doing this we can make the
kernel independent from specific physical addresses.
2021-07-27 13:15:16 +02:00
Gunnar Beutner
3c616ae00f Kernel: Make the kernel independent from specific physical addresses
Previously the kernel relied on a fixed offset between virtual and
physical addresses based on the kernel's load address. This allows us
to specify an independent offset.
2021-07-27 13:15:16 +02:00
Andreas Kling
50472fd69f Kernel: Don't try to return a committed page that we don't have
When we get a COW fault and discover that whoever we were COW'ing
together with has either COW'ed that page on their end (or they have
unmapped/exited) we simplify life for ourselves by clearing the COW
bit and keeping the page we already have. (No need to COW if the page
is not shared!)

The act of doing this does not return a committed page to the pool.
In fact, that committed page we had reserved for this purpose was used
up (allocated) by our COW buddy when they COW'ed the page.

This fixes a kernel panic when running TestLibCMkTemp. :^)
2021-07-26 00:39:10 +02:00
Andreas Kling
101486279f Kernel: Clear the COW bits when making an AnonymousVMObject volatile 2021-07-26 00:39:10 +02:00
Andreas Kling
6a537ceef1 Kernel: Remove ContiguousVMObject, let AnonymousVMObject do the job
We don't need an entirely separate VMObject subclass to influence the
location of the physical pages.

Instead, we simply allocate enough physically contiguous memory first,
and then pass it to the AnonymousVMObject constructor that takes a span
of physical pages.
2021-07-25 18:44:47 +02:00
Andreas Kling
9a701eafc4 Kernel: Run clang-format on AnonymousVMObject.cpp 2021-07-25 18:21:59 +02:00
Andreas Kling
0d963fd641 Kernel: Remove unnecessary counting of VMObject-attached Regions
VMObject already has an IntrusiveList of all the Regions that map it.
We were keeping a counter in addition to this, and only using it in
a single place to avoid iterating over the list in case it only had
1 entry.

Simplify VMObject by removing this counter and always iterating the
list even if there's only 1 entry. :^)
2021-07-25 17:28:06 +02:00
Andreas Kling
ae3778c303 Kernel: Remove unused enum Region::SetVolatileError 2021-07-25 17:28:06 +02:00
Andreas Kling
4648bcd3d4 Kernel: Remove unnecessary weak pointer from Region to owning Process
This was previously used for a single debug logging statement during
memory purging. There are no remaining users of this weak pointer,
so let's get rid of it.
2021-07-25 17:28:06 +02:00
Andreas Kling
25a5fd870c Kernel: Add missing locking when registering VMObjectDeletedHandlers 2021-07-25 17:28:06 +02:00
Andreas Kling
5fb91e2e84 Kernel: Don't COW volatile VM objects
If a purgeable VM object is in the "volatile" state when we're asked
to make a COW clone of it, make life simpler by simply "purging"
the cloned object right away.

This effectively means that a fork()'ed child process will discover
its purgeable+volatile regions to be empty if/when it tries making
them non-volatile.
2021-07-25 17:28:05 +02:00
Andreas Kling
297c0748f0 Kernel: Minor cleanup around purge() during physical page allocation 2021-07-25 17:28:05 +02:00
Andreas Kling
2d1a651e0a Kernel: Make purgeable memory a VMObject level concept (again)
This patch changes the semantics of purgeable memory.

- AnonymousVMObject now has a "purgeable" flag. It can only be set when
  constructing the object. (Previously, all anonymous memory was
  effectively purgeable.)

- AnonymousVMObject now has a "volatile" flag. It covers the entire
  range of physical pages. (Previously, we tracked ranges of volatile
  pages, effectively making it a page-level concept.)

- Non-volatile objects maintain a physical page reservation via the
  committed pages mechanism, to ensure full coverage for page faults.

- When an object is made volatile, it relinquishes any unused committed
  pages immediately. If later made non-volatile again, we then attempt
  to make a new committed pages reservation. If this fails, we return
  ENOMEM to userspace.

mmap() now creates purgeable objects if passed the MAP_PURGEABLE option
together with MAP_ANONYMOUS. anon_create() memory is always purgeable.
2021-07-25 17:28:05 +02:00
Andreas Kling
13a2e91fc5 Kernel: No need to use safe_memcpy() when handling an inode fault
We're copying the inode contents from a stack buffer into a page that
we just quick-mapped, so there's no reason for this memcpy() to fail.
2021-07-23 14:19:47 +02:00
Andreas Kling
082ed6f417 Kernel: Simplify VMObject locking & page fault handlers
This patch greatly simplifies VMObject locking by doing two things:

1. Giving VMObject an IntrusiveList of all its mapping Region objects.
2. Removing VMObject::m_paging_lock in favor of VMObject::m_lock

Before (1), VMObject::for_each_region() was forced to acquire the
global MM lock (since it worked by walking MemoryManager's list of
all regions and checking for regions that pointed to itself.)

With each VMObject having its own list of Regions, VMObject's own
m_lock is all we need.

Before (2), page fault handlers used a separate mutex for preventing
overlapping work. This design required multiple temporary unlocks
and was generally extremely hard to reason about.

Instead, page fault handlers now use VMObject's own m_lock as well.
2021-07-23 03:24:44 +02:00
Andreas Kling
64babcaa83 Kernel: Remove unused MAP_SHARED_ZERO_PAGE_LAZILY code path 2021-07-23 03:24:44 +02:00
Andreas Kling
e44a41d0bf Kernel: Convert Region to east-const style 2021-07-22 23:34:33 +02:00
Gunnar Beutner
f2be1f9326 Kernel: Fix the variable declaration for some linker script symbols
Despite what the declaration would have us believe these are not "u8*".
If they were we wouldn't have to use the & operator to get the address
of them and then cast them to "u8*"/FlatPtr afterwards.
2021-07-22 22:27:11 +02:00
Andreas Kling
0642f8f2c6 Kernel: Make committed physical page allocation return NonnullRefPtr
Since we're taking from the committed set of pages, there should never
be a reason for this call to fail.

Also add a Badge to disallow taking committed pages from anywhere but
the Region class.
2021-07-22 14:20:05 +02:00
Andreas Kling
5217875f6a Kernel: Consolidate API for creating AnonymousVMObject with given pages
We don't need to have a dedicated API for creating a VMObject with a
single page, the multi-page API option works in all cases.

Also make the API take a Span<NonnullRefPtr<PhysicalPage>> instead of
a NonnullRefPtrVector<PhysicalPage>.
2021-07-22 09:17:02 +02:00
Andreas Kling
9e15708aa0 Kernel: Convert VMObject & subclasses to east-const style 2021-07-22 09:17:02 +02:00
Gunnar Beutner
b4272d731f Kernel: Make sure crash dumps are properly aligned on x86_64 2021-07-22 08:57:01 +02:00
Gunnar Beutner
36e36507d5 Everywhere: Prefer using {:#x} over 0x{:x}
We have a dedicated format specifier which adds the "0x" prefix, so
let's use that instead of adding it manually.
2021-07-22 08:57:01 +02:00
Gunnar Beutner
31f30e732a Everywhere: Prefix hexadecimal numbers with 0x
Depending on the values it might be difficult to figure out whether a
value is decimal or hexadecimal. So let's make this more obvious. Also
this allows copying and pasting those numbers into GNOME calculator and
probably also other apps which auto-detect the base.
2021-07-22 08:57:01 +02:00
Gunnar Beutner
be795d5812 Prekernel: Use physical addresses for some of the BootInfo parameters
The kernel would just turn those virtual addresses into physical
addresses later on, so let's just use physical addresses right from the
start.
2021-07-20 15:12:19 +02:00
Gunnar Beutner
dd42093b93 Kernel: Move boot info declarations to a header file
Instead of manually redeclaring those variables in various files this
now adds a header file for them.
2021-07-20 15:12:19 +02:00
Brian Gianforcaro
85e95105c6 Kernel: Mark read only RegisterState function parameters as const 2021-07-20 03:21:14 +02:00
Brian Gianforcaro
27e1120dff Kernel: Move syscall precondition validates to MM
Move these to MM to simplify the flow of the syscall handler.

While here, also make sure we hold the process space lock for
the duration of the validation to avoid potential issues where
another thread attempts to modify the process space during the
validation. This will allow us to move the validation out of the
big process lock scope in a future change.

Additionally utilize the new no_lock variants of functions to avoid
unnecessary recursive process space spinlock acquisitions.
2021-07-20 03:21:14 +02:00
Brian Gianforcaro
308396bca1 Kernel: No lock validate_user_stack variant, switch to Space as argument
The entire process is not needed, just require the user to pass in the
Space. Also provide no_lock variant to use when you already have the
VM/Space lock acquired, to avoid unnecessary recursive spinlock
acquisitions.
2021-07-20 03:21:14 +02:00
Gunnar Beutner
a364f5c7b7 Kernel: Make sure super pages are in the first 16MiB of physical memory
This was broken by recent changes.
2021-07-19 11:29:09 +02:00