Commit Graph

45 Commits

Author SHA1 Message Date
Andreas Kling
197e73ee31 Kernel+LibELF: Enable SMAP protection during non-syscall exec()
When loading a new executable, we now map the ELF image in kernel-only
memory and parse it there. Then we use copy_to_user() when initializing
writable regions with data from the executable.

Note that the exec() syscall still disables SMAP protection and will
require additional work. This patch only affects kernel-originated
process spawns.
2020-01-10 10:57:06 +01:00
Andreas Kling
ea1911b561 Kernel: Share code between Region::map() and Region::remap_page()
These were doing mostly the same things, so let's just share the code.
2020-01-01 19:32:55 +01:00
Andreas Kling
0d5e0e4cad Kernel+SystemMonitor: Expose amount of per-process dirty private memory
Dirty private memory is all memory in non-inode-backed mappings that's
process-private, meaning it's not shared with any other process.

This patch exposes that number via SystemMonitor, giving us an idea of
how much memory each process is responsible for all on its own.
2019-12-29 12:28:32 +01:00
Andreas Kling
7a0088c4d2 Kernel: Clean up Region access bit setters a little 2019-12-25 02:58:03 +01:00
Andreas Kling
b6ee8a2c8d Kernel: Rename vmo => vmobject everywhere 2019-12-19 19:15:27 +01:00
Andreas Kling
1d4d6f16b2 Kernel: Add a specific-page variant of Region::commit() 2019-12-18 22:43:32 +01:00
Andreas Kling
931e4b7f5e Kernel+SystemMonitor: Prevent userspace access to process ELF image
Every process keeps its own ELF executable mapped in memory in case we
need to do symbol lookup (for backtraces, etc.)

Until now, it was mapped in a way that made it accessible to the
program, despite the program not having mapped it itself.
I don't really see a need for userspace to have access to this right
now, so let's lock things down a little bit.

This patch makes it inaccessible to userspace and exposes that fact
through /proc/PID/vm (per-region "user_accessible" flag.)
2019-12-15 20:11:57 +01:00
Andreas Kling
3fbc50a350 Kernel+SystemMonitor: Expose the number of set CoW bits in each Region
This number tells us how many more pages in a given region will trigger
a CoW fault if written to.
2019-12-15 16:53:00 +01:00
Andreas Kling
f41ae755ec Kernel: Crash on memory access in non-readable regions
This patch makes it possible to make memory regions non-readable.
This is enforced using the "present" bit in the page tables.
A process that hits an not-present page fault in a non-readable
region will be crashed.
2019-12-02 19:18:52 +01:00
Andreas Kling
7dc9c90f83 Kernel: Fix bug where mprotect() would ignore setting PROT_WRITE
A typo in Region::set_writable() caused us to update the readable flag
rather than the writable flag.
2019-12-02 18:15:36 +01:00
Andreas Kling
3dc87be891 Kernel: Mark mmap()-created regions with a special bit
Then only allow regions with that bit to be manipulated via munmap()
and mprotect(). This prevents messing with non-mmap()ed regions in
a process's address space (stacks, shared buffers, ...)
2019-11-24 12:26:21 +01:00
Andreas Kling
794758df3a Kernel: Implement some basic stack pointer validation
VM regions can now be marked as stack regions, which is then validated
on syscall, and on page fault.

If a thread is caught with its stack pointer pointing into anything
that's *not* a Region with its stack bit set, we'll crash the whole
process with SIGSTKFLT.

Userspace must now allocate custom stacks by using mmap() with the new
MAP_STACK flag. This mechanism was first introduced in OpenBSD, and now
we have it too, yay! :^)
2019-11-17 12:15:43 +01:00
Andreas Kling
d67c6a92db Kernel: Move page fault handling from MemoryManager to Region
After the page fault handler has found the region in which the fault
occurred, do the rest of the work in the region itself.

This patch also makes all fault types consistently crash the process
if a new page is needed but we're all out of pages.
2019-11-04 00:47:03 +01:00
Andreas Kling
0e8f1d7cb6 Kernel: Don't expose a region's page directory to the outside world
Now that region manages its own mapping/unmapping, there's no need for
the outside world to be able to grab at its page directory.
2019-11-04 00:26:00 +01:00
Andreas Kling
6ed9cc4717 Kernel: Remove Region API's for setting/unsetting the page directory
This is done implicitly by mapping or unmapping the region.
2019-11-04 00:24:20 +01:00
Andreas Kling
e3dda4e87b Kernel: Fix weird Region constructor that took nullable RefPtr<Inode>
It's never valid to construct a Region with a null Inode pointer using
this constructor, so just take a NonnullRefPtr<Inode> instead.
2019-11-04 00:21:08 +01:00
Andreas Kling
4bf1a72d21 Kernel: Teach Region how to remap itself
Now remapping (i.e flushing kernel metadata to the CPU page tables)
is done by simply calling Region::remap().
2019-11-03 21:11:08 +01:00
Andreas Kling
3dce0f23f4 Kernel: Regions should be mapped into a PageDirectory, not a Process
This patch changes the parameter to Region::map() to be a PageDirectory
since that matches how we think about the memory model:

Regions are views onto VMObjects, and are mapped into PageDirectories.
Each Process has a PageDirectory. The kernel also has a PageDirectory.
2019-11-03 21:11:08 +01:00
Andreas Kling
2cfc43c982 Kernel: Move region map/unmap operations into the Region class
The more Region can take care of itself, the better.
2019-11-03 21:11:08 +01:00
Andreas Kling
fe455c5ac4 Kernel: Move page remapping into Region::remap_page(index)
Let Region deal with this, instead of everyone calling MemoryManager.
2019-11-03 15:32:11 +01:00
Andreas Kling
d481ae95b5 Kernel: Defer creation of Region CoW bitmaps until they're needed
Instead of allocating and populating a Copy-on-Write bitmap for each
Region up front, wait until we actually clone the Region for sharing
with another process.

In most cases, we never need any CoW bits and we save ourselves a lot
of kmalloc() memory and time.
2019-10-01 19:58:41 +02:00
Andreas Kling
c58d1868cb Kernel: Fix munmap() bad splitting of already-split Regions
When splitting an Region that's already the result of an earlier split,
we have to take the Region's offset-in-VMObject into account since it
may be non-zero.
2019-10-01 11:40:40 +02:00
Andreas Kling
7f9a33dba1 Kernel: Make Region single-owner instead of ref-counted
This simplifies the ownership model and makes Region easier to reason
about. Userspace Regions are now primarily kept by Process::m_regions.

Kernel Regions are kept in various OwnPtr<Regions>'s.

Regions now only ever get unmapped when they are destroyed.
2019-09-27 14:25:42 +02:00
Andreas Kling
5d491fa1cd Kernel: Add a simple slab allocator for small allocations
This is a freelist allocator with static size classes that works as a
complement to the generic kmalloc(). It's a lot faster than kmalloc()
since allocation just means popping from the freelist.

It's also significantly more compact when there are a lot of objects
smaller than the minimum kmalloc chunk size (32 bytes.)

This patch enables it for the Region and PhysicalPage classes.
In the PhysicalPage (8 bytes) case, it's a huge improvement since we
no longer waste 75% of the storage allocated.

There are also a number of ways this can be improved, so let's keep
working on it going forward.
2019-09-16 10:33:27 +02:00
Andreas Kling
73fdbba59c AK: Rename <AK/AKString.h> to <AK/String.h>
This was a workaround to be able to build on case-insensitive file
systems where it might get confused about <string.h> vs <String.h>.

Let's just not support building that way, so String.h can have an
objectively nicer name. :^)
2019-09-06 15:36:54 +02:00
Andreas Kling
e25ade7579 Kernel: Rename "vmo" to "vmobject" everywhere 2019-09-04 11:27:14 +02:00
Andreas Kling
0e53b1d1ad Kernel: Add some convenient getters to Region
Add getters for the underlying Range, the access bits, and also add
contains(Range) which just wraps m_range.contains().
2019-08-29 20:55:40 +02:00
Andreas Kling
f5d779f47e Kernel: Never forcibly page in entire executables
We were doing this for the initial kernel-spawned userspace process(es)
to work around instability in the page fault handler. Now that the page
fault handler is more robust, we can stop worrying about this.

Specifically, the page fault handler was previous not able to handle
getting a page fault in anything but the currently executing task's
page directory.
2019-08-26 13:20:01 +02:00
Andreas Kling
07425580a8 Kernel: Put all Regions on InlineLinkedLists (separated by user/kernel)
Remove the global hash tables and replace them with InlineLinkedLists.
This significantly reduces the kernel heap pressure from doing many
small mmap()'s.
2019-08-08 11:11:22 +02:00
Andreas Kling
5b2447a27b Kernel: Track user accessibility per Region.
Region now has is_user_accessible(), which informs the memory manager how
to map these pages. Previously, we were just passing a "bool user_allowed"
to various functions and I'm not at all sure that any of that was correct.

All the Region constructors are now hidden, and you must go through one of
these helpers to construct a region:

- Region::create_user_accessible(...)
- Region::create_kernel_only(...)

That ensures that we don't accidentally create a Region without specifying
user accessibility. :^)
2019-07-19 16:11:52 +02:00
Andreas Kling
5254a320d8 Kernel: Remove use of copy_ref() in favor of regular RefPtr copies.
This is obviously more readable. If we ever run into a situation where
ref count churn is actually causing trouble in the future, we can deal with
it then. For now, let's keep it simple. :^)
2019-07-11 15:40:04 +02:00
Andreas Kling
27f699ef0c AK: Rename the common integer typedefs to make it obvious what they are.
These types can be picked up by including <AK/Types.h>:

* u8, u16, u32, u64 (unsigned)
* i8, i16, i32, i64 (signed)
2019-07-03 21:20:13 +02:00
Andreas Kling
90b1354688 AK: Rename RetainPtr => RefPtr and Retained => NonnullRefPtr. 2019-06-21 18:37:47 +02:00
Andreas Kling
77b9fa89dd AK: Rename Retainable => RefCounted.
(And various related renames that go along with it.)
2019-06-21 15:30:03 +02:00
Andreas Kling
de65c960e9 Kernel: Tweak some String&& => const String&.
String&& is just not very practical. Also return const String& when the
returned string is a member variable. The call site is free to make a copy
if he wants, but otherwise we can avoid the retain count churn.
2019-06-07 20:58:12 +02:00
Andreas Kling
39d1a9ae66 Meta: Tweak .clang-format to not wrap braces after enums. 2019-06-07 17:13:23 +02:00
Andreas Kling
e42c3b4fd7 Kernel: Rename LinearAddress => VirtualAddress. 2019-06-07 12:56:50 +02:00
Andreas Kling
bc951ca565 Kernel: Run clang-format on everything. 2019-06-07 11:43:58 +02:00
Andreas Kling
baaede1bf9 Kernel: Make the Process allocate_region* API's understand "int prot".
Instead of having to inspect 'prot' at every call site, make the Process
API's take care of that so we can just pass it through.
2019-05-30 16:14:37 +02:00
Robin Burchell
0dc9af5f7e Add clang-format file
Also run it across the whole tree to get everything using the One True Style.
We don't yet run this in an automated fashion as it's a little slow, but
there is a snippet to do so in makeall.sh.
2019-05-28 17:31:20 +02:00
Andreas Kling
87b54a82c7 Kernel: Let Region keep a Range internally. 2019-05-17 04:32:08 +02:00
Andreas Kling
01ffcdfa31 Kernel: Encapsulate the Region's COW map a bit better. 2019-05-14 17:31:57 +02:00
Andreas Kling
b8e60b6652 Kernel: Remove unused Region::is_bitmap(). 2019-05-02 23:31:11 +02:00
Andreas Kling
3f6408919f AK: Improve smart pointer ergonomics a bit. 2019-04-14 02:36:06 +02:00
Andreas Kling
b9738fa8ac Kernel: Move VM-related files into Kernel/VM/.
Also break MemoryManager.{cpp,h} into one file per class.
2019-04-03 15:13:07 +02:00