Commit Graph

224 Commits

Author SHA1 Message Date
Andreas Kling
9a157b5e81 Revert "Kernel: Move Kernel mapping to 0xc0000000"
This reverts commit bd33c66273.

This broke the network card drivers, since they depended on kmalloc
addresses being identity-mapped.
2019-11-23 17:27:09 +01:00
Jesse Buhagiar
bd33c66273 Kernel: Move Kernel mapping to 0xc0000000
The kernel is now no longer identity mapped to the bottom 8MiB of
memory, and is now mapped at the higher address of `0xc0000000`.

The lower ~1MiB of memory (from GRUB's mmap), however is still
identity mapped to provide an easy way for the kernel to get
physical pages for things such as DMA etc. These could later be
mapped to the higher address too, as I'm not too sure how to
go about doing this elegantly without a lot of address subtractions.
2019-11-22 16:23:23 +01:00
Andreas Kling
794758df3a Kernel: Implement some basic stack pointer validation
VM regions can now be marked as stack regions, which is then validated
on syscall, and on page fault.

If a thread is caught with its stack pointer pointing into anything
that's *not* a Region with its stack bit set, we'll crash the whole
process with SIGSTKFLT.

Userspace must now allocate custom stacks by using mmap() with the new
MAP_STACK flag. This mechanism was first introduced in OpenBSD, and now
we have it too, yay! :^)
2019-11-17 12:15:43 +01:00
Liav A
bce510bf6f Kernel: Fix the search method of free userspace physical pages (#742)
Now the userspace page allocator will search through physical regions,
and stop the search as it finds an available page.

Also remove an "address of" sign since we don't need that when
counting size of physical regions
2019-11-08 22:39:29 +01:00
supercomputer7
c3c905aa6c Kernel: Removing hardcoded offsets from Memory Manager
Now the kernel page directory and the page tables are located at a
safe address, to prevent from paging data colliding with garbage.
2019-11-08 17:38:23 +01:00
Andreas Kling
19398cd7d5 Kernel: Reorganize memory layout a bit
Move the kernel image to the 1 MB physical mark. This prevents it from
colliding with stuff like the VGA memory. This was causing us to end
up with the BIOS screen contents sneaking into kernel memory sometimes.

This patch also bumps the kmalloc heap size from 1 MB to 3 MB. It's not
the perfect permanent solution (obviously) but it should get the OOM
monkey off our backs for a while.
2019-11-04 12:04:35 +01:00
Andreas Kling
d67c6a92db Kernel: Move page fault handling from MemoryManager to Region
After the page fault handler has found the region in which the fault
occurred, do the rest of the work in the region itself.

This patch also makes all fault types consistently crash the process
if a new page is needed but we're all out of pages.
2019-11-04 00:47:03 +01:00
Andreas Kling
0e8f1d7cb6 Kernel: Don't expose a region's page directory to the outside world
Now that region manages its own mapping/unmapping, there's no need for
the outside world to be able to grab at its page directory.
2019-11-04 00:26:00 +01:00
Andreas Kling
9b2dc36229 Kernel: Merge MemoryManager::map_region_at_address() into Region::map() 2019-11-04 00:05:57 +01:00
Andreas Kling
98b328754e Kernel: Fix bad setup of CoW faults for offset regions
Regions with an offset into their VMObject were incorrectly adding the
page offset when indexing into the CoW bitmap.
2019-11-03 23:54:35 +01:00
Andreas Kling
5b7f8634e3 Kernel: Set the G (global) bit for kernel page tables
Since the kernel page tables are shared between all processes, there's
no need to (implicitly) flush the TLB for them on every context switch.

Setting the G bit on kernel page tables allows the CPU to keep the
translation caches around.
2019-11-03 23:51:55 +01:00
Andreas Kling
4bf1a72d21 Kernel: Teach Region how to remap itself
Now remapping (i.e flushing kernel metadata to the CPU page tables)
is done by simply calling Region::remap().
2019-11-03 21:11:08 +01:00
Andreas Kling
3dce0f23f4 Kernel: Regions should be mapped into a PageDirectory, not a Process
This patch changes the parameter to Region::map() to be a PageDirectory
since that matches how we think about the memory model:

Regions are views onto VMObjects, and are mapped into PageDirectories.
Each Process has a PageDirectory. The kernel also has a PageDirectory.
2019-11-03 21:11:08 +01:00
Andreas Kling
2cfc43c982 Kernel: Move region map/unmap operations into the Region class
The more Region can take care of itself, the better.
2019-11-03 21:11:08 +01:00
Andreas Kling
a221cddeec Kernel: Clean up a bunch of wrong-looking Region/VMObject code
Since a Region is merely a "window" onto a VMObject, it can both begin
and end at a distance from the VMObject's boundaries.
Therefore, we should always be computing indices into a VMObject's
physical page array by adding the Region's "first_page_index()".

There was a whole bunch of code that forgot to do that. This fixes
many wrong behaviors for Regions that start part-way into a VMObject.
2019-11-03 15:44:13 +01:00
Andreas Kling
fe455c5ac4 Kernel: Move page remapping into Region::remap_page(index)
Let Region deal with this, instead of everyone calling MemoryManager.
2019-11-03 15:32:11 +01:00
Andreas Kling
b0321bf290 Kernel: Zero-fill faults should not temporarily enable interrupts
We were doing a temporary STI/CLI in MemoryManager::zero_page() to be
able to acquire the VMObject's lock before zeroing out a page.

This logic was inherited from the inode fault handler, where we need
to enable interrupts anyway, since we might need to interact with the
underlying storage device.

Zero-fill faults don't actually need to lock the VMObject, since they
are already guaranteed exclusivity by interrupts being disabled when
entering the fault handler.

This is different from inode faults, where a second thread can often
get an inode fault for the same exact page in the same VMObject before
the first fault handler has received a response from the disk.
This is why the lock exists in the first place, to prevent this race.

This fixes an intermittent crash in sys$execve() that was made much
more visible after I made userspace stacks lazily allocated.
2019-11-01 17:59:47 +01:00
Tom
00a7c48d6e APIC: Enable APIC and start APs 2019-10-16 19:14:02 +02:00
Andreas Kling
35138437ef Kernel+SystemMonitor: Add fault counters
This patch adds three separate per-process fault counters:

- Inode faults

    An inode fault happens when we've memory-mapped a file from disk
    and we end up having to load 1 page (4KB) of the file into memory.

- Zero faults

    Memory returned by mmap() is lazily zeroed out. Every time we have
    to zero out 1 page, we count a zero fault.

- CoW faults

    VM objects can be shared by multiple mappings that make their own
    unique copy iff they want to modify it. The typical reason here is
    memory shared between a parent and child process.
2019-10-02 14:13:49 +02:00
Andreas Kling
d481ae95b5 Kernel: Defer creation of Region CoW bitmaps until they're needed
Instead of allocating and populating a Copy-on-Write bitmap for each
Region up front, wait until we actually clone the Region for sharing
with another process.

In most cases, we never need any CoW bits and we save ourselves a lot
of kmalloc() memory and time.
2019-10-01 19:58:41 +02:00
Conrad Pankoff
fa20a447a9 Kernel: Repair unaligned regions supplied by the boot loader
We were just blindly trusting that the bootloader would only give us
page-aligned memory regions. This is apparently not always the case,
so now we can try to repair those regions.

Fixes #601
2019-09-28 09:23:52 +02:00
Andreas Kling
2584636d19 Kernel: Fix partial munmap() deallocating still-in-use VM
We were always returning the full VM range of the partially-unmapped
Region to the range allocator. This caused us to re-use those addresses
for subsequent VM allocations.

This patch also skips creating a new VMObject in partial munmap().
Instead we just make split regions that point into the same VMObject.

This fixes the mysterious GCC ICE on large C++ programs.
2019-09-27 20:21:52 +02:00
Andreas Kling
7f9a33dba1 Kernel: Make Region single-owner instead of ref-counted
This simplifies the ownership model and makes Region easier to reason
about. Userspace Regions are now primarily kept by Process::m_regions.

Kernel Regions are kept in various OwnPtr<Regions>'s.

Regions now only ever get unmapped when they are destroyed.
2019-09-27 14:25:42 +02:00
Conrad Pankoff
9c5e3cd818 Kernel: Ignore memory the bootloader gives us above 2^32 2019-09-17 09:27:23 +02:00
Andreas Kling
a34f3a3729 Kernel: Fix some bitrot in MemoryManager debug logging code 2019-09-16 14:45:44 +02:00
Andreas Kling
a40afc4562 Kernel: Get rid of MemoryManager::allocate_page_table()
We can just use the physical page allocator directly, there's no need
for a dedicated function for page tables.
2019-09-15 20:34:03 +02:00
Andreas Kling
e25ade7579 Kernel: Rename "vmo" to "vmobject" everywhere 2019-09-04 11:27:14 +02:00
Andreas Kling
dde10f534f Revert "Kernel: Avoid a memcpy() of the whole block when paging in from inode"
This reverts commit 11896d0e26.

This caused a race where other processes using the same InodeVMObject
could end up accessing the newly-mapped physical page before we've
actually filled it with bytes from disk.

It would be nice to avoid these copies without breaking anything.
2019-08-26 13:50:55 +02:00
Andreas Kling
e29fd3cd20 Kernel: Display virtual addresses as V%p instead of L%x
The L was a leftover from when these were called linear addresses.
2019-08-26 11:31:58 +02:00
Andreas Kling
11896d0e26 Kernel: Avoid a memcpy() of the whole block when paging in from inode 2019-08-25 14:34:53 +02:00
Andreas Kling
179158bc97 Kernel: Put debug spam about already-paged-in inode pages behind #ifdef 2019-08-19 17:30:36 +02:00
Andreas Kling
9104d32341 Kernel: Use range-for with InlineLinkedList 2019-08-08 13:40:58 +02:00
Andreas Kling
07425580a8 Kernel: Put all Regions on InlineLinkedLists (separated by user/kernel)
Remove the global hash tables and replace them with InlineLinkedLists.
This significantly reduces the kernel heap pressure from doing many
small mmap()'s.
2019-08-08 11:11:22 +02:00
Andreas Kling
a96d76fd90 Kernel: Put all VMObjects in an InlineLinkedList instead of a HashTable
Using a HashTable to track "all instances of Foo" is only useful if we
actually need to look up entries by some kind of index. And since they
are HashTable (not HashMap), the pointer *is* the index.

Since we have the pointer, we can just use it directly. Duh.
This increase sizeof(VMObject) by two pointers, but removes a global
table that had an entry for every VMObject, where the cost was higher.
It also avoids all the general hash tabling business when creating or
destroying VMObjects. Generally we should do more of this. :^)
2019-08-08 11:11:22 +02:00
Andreas Kling
98ce498922 Kernel: Remove unused MemoryManager::remove_identity_mapping()
This was not actually used and just sitting there being confusing.
2019-08-07 22:13:10 +02:00
Andreas Kling
f5ff796970 Kernel: Always give back VM to the RangeAllocator when unmapping Region
We were only doing this in Process::deallocate_region(), which meant
that kernel-only Regions never gave back their VM.

With this patch, we can start reusing freed-up address space! :^)
2019-08-07 21:57:39 +02:00
Andreas Kling
6bdb81ad87 Kernel: Split VMObject into two classes: Anonymous- and InodeVMObject
InodeVMObject is a VMObject with an underlying Inode in the filesystem.
AnonymousVMObject has no Inode.

I'm happy that InodeVMObject::inode() can now return Inode& instead of
VMObject::inode() return Inode*. :^)
2019-08-07 18:09:32 +02:00
Andreas Kling
cb2d572a14 Kernel: Remove "allow CPU caching" flag on VMObject
This wasn't really thought-through, I was just trying anything to see
if it would make WindowServer faster. This doesn't seem to make much of
a difference either way, so let's just not do it for now.

It's easy to bring back if we think we need it in the future.
2019-08-07 16:34:00 +02:00
Andreas Kling
c973a51a23 Kernel: Make KBuffer lazily populated
KBuffers are now zero-filled on demand instead of up front. This means
that you can create a huge KBuffer and it will only take up VM, not
physical pages (until you access them.)
2019-08-06 15:06:31 +02:00
Andreas Kling
a0bb592b4f Kernel: Allow zero-fill page faults on kernel-only pages
We were short-circuiting the page fault handler a little too eagerly
for page-not-present faults in kernel memory.
If the current page directory already has up-to-date mapps for kernel
memory, allow it to progress to checking for zero-fill conditions.

This will enable us to have lazily populated kernel regions.
2019-08-06 15:06:31 +02:00
Andreas Kling
2ad963d261 Kernel: Add mapping from page directory base (PDB) to PageDirectory
This allows the page fault code to find the owning PageDirectory and
corresponding process for faulting regions.

The mapping is implemented as a global hash map right now, which is
definitely not optimal. We can come up with something better when it
becomes necessary.
2019-08-06 11:30:26 +02:00
Andreas Kling
8d07bce12a Kernel: Break region_from_vaddr() into {user,kernel}_region_from_vaddr
Sometimes you're only interested in either user OR kernel regions but
not both. Let's break this into two functions so the caller can choose
what he's interested in.
2019-08-06 10:33:22 +02:00
Andreas Kling
945f8eb22a Kernel: Don't treat read faults like CoW exceptions
I'm not sure why we would have a non-readable CoW region, but I suppose
we could, so let's not Copy-on-Read in those cases.
2019-08-06 09:39:39 +02:00
Andreas Kling
af4cf01560 Kernel: Clean up the page fault handling code a bit
Not using "else" after "return" unnests the code and makes it easier to
follow. Also use an enum for the two different page fault types.
2019-08-06 09:33:35 +02:00
Andreas Kling
da6c8fe3f8 Kernel: On kernel NP fault, always copy into *active* page directory
If we were using a ProcessPagingScope to temporarily go into another
process's page tables, things would fall apart when hitting a kernel
NP fault, since we'd clone the kernel page directory entry into the
*currently active process's* page directory rather than cloning it
into the *currently active* page directory.
2019-08-06 07:28:35 +02:00
Andreas Kling
b5f1a4ac07 Kernel: Flush the TLB (page only) when copying in a new kernel mapping
Not flushing the TLB here puts us in an infinite page fault loop.
2019-08-04 21:22:11 +02:00
Andreas Kling
f8beb0f665 Kernel: Share the "return to ring 0/3 from signal" trampolines globally.
Generate a special page containing the "return from signal" trampoline code
on startup and then route signalled threads to it. This avoids a page
allocation in every process that ever receives a signal.
2019-07-19 17:01:16 +02:00
Andreas Kling
fdf931cfce Kernel: Remove accidental use of removed Region::set_user_accessible(). 2019-07-19 16:22:09 +02:00
Andreas Kling
5b2447a27b Kernel: Track user accessibility per Region.
Region now has is_user_accessible(), which informs the memory manager how
to map these pages. Previously, we were just passing a "bool user_allowed"
to various functions and I'm not at all sure that any of that was correct.

All the Region constructors are now hidden, and you must go through one of
these helpers to construct a region:

- Region::create_user_accessible(...)
- Region::create_kernel_only(...)

That ensures that we don't accidentally create a Region without specifying
user accessibility. :^)
2019-07-19 16:11:52 +02:00
Andreas Kling
5254a320d8 Kernel: Remove use of copy_ref() in favor of regular RefPtr copies.
This is obviously more readable. If we ever run into a situation where
ref count churn is actually causing trouble in the future, we can deal with
it then. For now, let's keep it simple. :^)
2019-07-11 15:40:04 +02:00
Andreas Kling
27f699ef0c AK: Rename the common integer typedefs to make it obvious what they are.
These types can be picked up by including <AK/Types.h>:

* u8, u16, u32, u64 (unsigned)
* i8, i16, i32, i64 (signed)
2019-07-03 21:20:13 +02:00
Andreas Kling
601b0a8c68 Kernel: Use NonnullRefPtrVector in parts of the kernel. 2019-06-27 13:35:02 +02:00
Andreas Kling
8f3f5ac8ce Kernel: Automatically populate page tables with lazy kernel regions.
If we get an NP page fault in a process, and the fault address is in the
kernel address range (anywhere above 0xc0000000), we probably just need
to copy the page table info over from the kernel page directory.

The kernel doesn't allocate address space until it's needed, and when it
does allocate some, it only puts the info in the kernel page directory,
and any *new* page directories created from that point on. Existing page
directories need to be updated, and that's what this patch fixes.
2019-06-26 22:27:41 +02:00
Andreas Kling
183205d51c Kernel: Make the x86 paging code slightly less insane.
Instead of PDE's and PTE's being weird wrappers around dword*, just have
MemoryManager::ensure_pte() return a PageDirectoryEntry&, which in turn has
a PageTableEntry* entries().

I've been trying to understand how things ended up this way, and I suspect
it was because I inadvertently invoked the PageDirectoryEntry copy ctor in
the original work on this, which must have made me very confused..

Anyways, now things are a bit saner and we can move forward towards a better
future, etc. :^)
2019-06-26 21:45:56 +02:00
Andreas Kling
90b1354688 AK: Rename RetainPtr => RefPtr and Retained => NonnullRefPtr. 2019-06-21 18:37:47 +02:00
Andreas Kling
77b9fa89dd AK: Rename Retainable => RefCounted.
(And various related renames that go along with it.)
2019-06-21 15:30:03 +02:00
Sergey Bugaev
118cb391dd VM: Pass a PhysicalPage by rvalue reference when returning it to the freelist.
This makes no functional difference, but it makes it clear that
MemoryManager and PhysicalRegion take over the actual physical
page represented by this PhysicalPage instance.
2019-06-14 16:14:49 +02:00
Andreas Kling
1c5677032a Kernel: Replace the last "linear" with "virtual". 2019-06-13 21:42:12 +02:00
Conrad Pankoff
aee9317d86 Kernel: Refactor MemoryManager to use a Bitmap rather than a Vector
This significantly reduces the pressure on the kernel heap when
allocating a lot of pages.

Previously at about 250MB allocated, the free page list would outgrow
the kernel's heap. Given that there is no longer a page list, this does
not happen.

The next barrier will be the kernel memory used by the page records for
in-use memory. This kicks in at about 1GB.
2019-06-12 15:38:17 +02:00
Andreas Kling
9da62f52a1 Kernel: Use the Multiboot memory map info to inform our paging setup.
This makes it possible to run Serenity with more than 64 MB of RAM.
Because each physical page is represented by a PhysicalPage object, and such
objects are allocated using kmalloc_eternal(), more RAM means more pressure
on kmalloc_eternal(), so we're gonna need a better strategy for this.

But for now, let's just celebrate that we can use the 128 MB of RAM we've
been telling QEMU to run with. :^)
2019-06-09 11:48:58 +02:00
Andreas Kling
736092a087 Kernel: Move i386.{cpp,h} => Arch/i386/CPU.{cpp,h}
There's a ton of work that would need to be done before we could spin up on
another architecture, but let's at least try to separate things out a bit.
2019-06-07 20:02:01 +02:00
Andreas Kling
e42c3b4fd7 Kernel: Rename LinearAddress => VirtualAddress. 2019-06-07 12:56:50 +02:00
Andreas Kling
bc951ca565 Kernel: Run clang-format on everything. 2019-06-07 11:43:58 +02:00
Andreas Kling
49768524d4 VM: Get rid of KernelPagingScope.
Every page directory inherits the kernel page directory, so there's no need
to explicitly enter the kernel's paging scope anymore.
2019-06-01 17:51:48 +02:00
Andreas Kling
baaede1bf9 Kernel: Make the Process allocate_region* API's understand "int prot".
Instead of having to inspect 'prot' at every call site, make the Process
API's take care of that so we can just pass it through.
2019-05-30 16:14:37 +02:00
Andreas Kling
bcc6ddfb6b Kernel: Let PageDirectory own the associated RangeAllocator.
Since we transition to a new PageDirectory on exec(), we need a matching
RangeAllocator to go with the new directory. Instead of juggling this in
Process and MemoryManager, simply attach the RangeAllocator to the
PageDirectory instead.

Fixes #61.
2019-05-20 04:46:29 +02:00
Andreas Kling
87b54a82c7 Kernel: Let Region keep a Range internally. 2019-05-17 04:32:08 +02:00
Andreas Kling
4a6fcfbacf Kernel: Use a RangeAllocator for kernel-only virtual space allocation too. 2019-05-17 04:02:29 +02:00
Andreas Kling
176f683f66 Kernel: Move Inode to its own files. 2019-05-16 03:02:37 +02:00
Andreas Kling
01ffcdfa31 Kernel: Encapsulate the Region's COW map a bit better. 2019-05-14 17:31:57 +02:00
Andreas Kling
7c10a93d48 Kernel: Make allocate_kernel_region() commit the region automatically.
This means that kernel regions will eagerly get physical pages allocated.
It would be nice to zero-fill these on demand instead, but that would
require a bunch of MemoryManager changes.
2019-05-14 15:38:00 +02:00
Andreas Kling
c8a216b107 Kernel: Allocate kernel stacks for threads using the region allocator.
This patch moves away from using kmalloc memory for thread kernel stacks.
This reduces pressure on kmalloc (16 KB per thread adds up fast) and
prevents kernel stack overflow from scribbling all over random unrelated
kernel memory.
2019-05-14 11:51:00 +02:00
Andreas Kling
3f6408919f AK: Improve smart pointer ergonomics a bit. 2019-04-14 02:36:06 +02:00
Andreas Kling
b9738fa8ac Kernel: Move VM-related files into Kernel/VM/.
Also break MemoryManager.{cpp,h} into one file per class.
2019-04-03 15:13:07 +02:00