ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2024-11-10 13:00:29 +03:00

Author	SHA1	Message	Date
Andreas Kling	a819eb5016	Kernel: Skip TLB flushes while cloning regions in sys$fork() Since we know for sure that the virtual memory regions in the new process being created are not being used on any CPU, there's no need to do TLB flushes for every mapped page.	2021-03-03 22:57:45 +01:00
Andreas Kling	5d180d1f99	Everywhere: Rename ASSERT => VERIFY (...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED) Since all of these checks are done in release builds as well, let's rename them to VERIFY to prevent confusion, as everyone is used to assertions being compiled out in release. We can introduce a new ASSERT macro that is specifically for debug checks, but I'm doing this wholesale conversion first since we've accumulated thousands of these already, and it's not immediately obvious which ones are suitable for ASSERT.	2021-02-23 20:56:54 +01:00
Andreas Kling	4021264201	Kernel: Make the Region constructor private We can use adopt_own(*new T) instead of make<T>().	2021-02-14 01:39:04 +01:00
Andreas Kling	8415866c03	Kernel: Remove user/kernel flags from Region Now that we no longer need to support the signal trampolines being user-accessible inside the kernel memory range, we can get rid of the "kernel" and "user-accessible" flags on Region and simply use the address of the region to determine whether it's kernel or user. This also tightens the page table mapping code, since it can now set user-accessibility based solely on the virtual address of a page.	2021-02-14 01:34:23 +01:00
Andreas Kling	0dbb22e9e0	Kernel: Remove a handful of unused things in VM/ directory Also add some missing initializers.	2021-02-11 22:02:39 +01:00
Andreas Kling	823186031d	Kernel: Add a way to specify which memory regions can make syscalls This patch adds sys$msyscall() which is loosely based on an OpenBSD mechanism for preventing syscalls from non-blessed memory regions. It works similarly to pledge and unveil, you can call it as many times as you like, and when you're finished, you call it with a null pointer and it will stop accepting new regions from then on. If a syscall later happens and doesn't originate from one of the previously blessed regions, the kernel will simply crash the process.	2021-02-02 20:13:44 +01:00
Andreas Kling	e55ef70e5e	Kernel: Remove "has made executable exception for dynamic loader" flag As Idan pointed out, this flag is actually not needed, since we don't allow transitioning from previously-executable to writable anyway.	2021-01-30 10:06:52 +01:00
Andreas Kling	af3d3c5c4a	Kernel: Enforce W^X more strictly (like PaX MPROTECT) This patch adds enforcement of two new rules: - Memory that was previously writable cannot become executable - Memory that was previously executable cannot become writable Unfortunately we have to make an exception for text relocations in the dynamic loader. Since those necessitate writing into a private copy of library code, we allow programs to transition from RW to RX under very specific conditions. See the implementation of sys$mprotect()'s should_make_executable_exception_for_dynamic_loader() for details.	2021-01-29 14:52:27 +01:00
Tom	250a310454	Kernel: Release MM lock while yielding from inode page fault handler We need to make sure other processors can grab the MM lock while we wait, so release it when we might block. Reading the page from disk may also block, so release it during that time as well.	2021-01-27 22:48:41 +01:00
Andreas Kling	a131927c75	Kernel: sys$munmap() region splitting did not preserve "shared" flag This was exploitable since the shared flag determines whether inode permission checks are applied in sys$mprotect(). The bug was pretty hard to spot due to default arguments being used instead. This patch removes the default arguments to make explicit at each call site what's being done.	2021-01-26 18:35:04 +01:00
Tom	1d621ab172	Kernel: Some futex improvements This adds support for FUTEX_WAKE_OP, FUTEX_WAIT_BITSET, FUTEX_WAKE_BITSET, FUTEX_REQUEUE, and FUTEX_CMP_REQUEUE, as well as global and private futex and absolute/relative timeouts against the appropriate clock. This also changes the implementation so that kernel resources are only used when a thread is blocked on a futex. Global futexes are implemented as offsets in VMObjects, so that different processes can share a futex against the same VMObject despite potentially being mapped at different virtual addresses.	2021-01-17 20:30:31 +01:00
Andreas Kling	43109f9614	Kernel: Remove unused syscall sys$minherit() This is no longer used. We can bring it back the day we need it.	2021-01-16 14:52:04 +01:00
Tom	c630669304	Kernel: If a VMObject is shared, broadcast page remappings If we remap pages (e.g. lazy allocation) inside a VMObject that is shared among more than one region, broadcast it to any other region that may be mapping the same page.	2021-01-02 20:56:35 +01:00
Andreas Kling	5dae85afe7	Kernel: Pass "shared" flag to Region constructor Before this change, we would sometimes map a region into the address space with !is_shared(), and then moments later call set_shared(true). I found this very confusing while debugging, so this patch makes us pass the initial shared flag to the Region constructor, ensuring that it's in the correct state by the time we first map the region.	2021-01-02 16:57:31 +01:00
Tom	2f429bd2d5	Kernel: Pass new region owner to Region::clone	2021-01-01 23:43:44 +01:00
Tom	476f17b3f1	Kernel: Merge PurgeableVMObject into AnonymousVMObject This implements memory commitments and lazy-allocation of committed memory.	2021-01-01 23:43:44 +01:00
Tom	b2a52f6208	Kernel: Implement lazy committed page allocation By designating a committed page pool we can guarantee to have physical pages available for lazy allocation in mappings. However, when forking we will overcommit. The assumption is that worst-case it's better for the fork to die due to insufficient physical memory on COW access than the parent that created the region. If a fork wants to ensure that all memory is available (trigger a commit) then it can use madvise. This also means that fork now can gracefully fail if we don't have enough physical pages available.	2021-01-01 23:43:44 +01:00
Tom	c3451899bc	Kernel: Add MAP_NORESERVE support to mmap Rather than lazily committing regions by default, we now commit the entire region unless MAP_NORESERVE is specified. This solves random crashes in low-memory situations where e.g. the malloc heap allocated memory, but using pages that haven't been used before triggers a crash when no more physical memory is available. Use this flag to create large regions without actually committing the backing memory. madvise() can be used to commit arbitrary areas of such regions after creating them.	2021-01-01 23:43:44 +01:00
Tom	bc5d6992a4	Kernel: Memory purging improvements This adds the ability for a Region to define volatile/nonvolatile areas within mapped memory using madvise(). This also means that memory purging takes into account all views of the PurgeableVMObject and only purges memory that is not needed by all of them. When calling madvise() to change an area to nonvolatile memory, return whether memory from that area was purged. At that time also try to remap all memory that is requested to be nonvolatile, and if insufficient pages are available notify the caller of that fact.	2021-01-01 23:43:44 +01:00
Andreas Kling	30dbe9c78a	Kernel+LibC: Add a very limited sys$mremap() implementation This syscall can currently only remap a shared file-backed mapping into a private file-backed mapping.	2020-12-29 02:20:43 +01:00
Ben Wiederhake	64cc3f51d0	Meta+Kernel: Make clang-format-10 clean	2020-09-25 21:18:17 +02:00
Tom	c8d9f1b9c9	Kernel: Make copy_to/from_user safe and remove unnecessary checks Since the CPU already does almost all necessary validation steps for us, we don't really need to attempt to do this. Doing it ourselves doesn't really work very reliably, because we'd have to account for other processors modifying virtual memory, and we'd have to account for e.g. pages not being able to be allocated due to insufficient resources. So change the copy_to/from_user (and associated helper functions) to use the new safe_memcpy, which will return whether it succeeded or not. The only manual validation step needed (which the CPU can't perform for us) is making sure the pointers provided by user mode aren't pointing to kernel mappings. To make it easier to read/write from/to either kernel or user mode data add the UserOrKernelBuffer helper class, which will internally either use copy_from/to_user or directly memcpy, or pass the data through directly using a temporary buffer on the stack. Last but not least we need to keep syscall params trivial as we need to copy them from/to user mode using copy_from/to_user.	2020-09-13 21:19:15 +02:00
Tom	bf268a0185	Kernel: Handle committing pages in regions more gracefully Sometimes a physical underlying page may be there, but we may be unable to allocate a page table that may be needed to map it. Bubble up such mapping errors so that they can be handled more appropriately.	2020-09-02 00:35:56 +02:00
Andreas Kling	949aef4aef	Kernel: Move syscall implementations out of Process.cpp This is something I've been meaning to do for a long time, and here we finally go. This patch moves all sys$foo functions out of Process.cpp and into files in Kernel/Syscalls/. It's not exactly one syscall per file (although it could be, but I got a bit tired of the repetitive work here..) This makes hacking on individual syscalls a lot less painful since you don't have to rebuild nearly as much code every time. I'm also hopeful that this makes it easier to understand individual syscalls. :^)	2020-07-30 23:40:57 +02:00
Tom	06d50f64b0	Kernel: Aggregate TLB flush requests for Regions for SMP Rather than sending one TLB flush request for each page, aggregate them so that we're not spamming the other processors with FlushTLB IPIs.	2020-07-06 22:39:06 +02:00
Tom	d98edb3171	Kernel: List all CPUs in /proc/cpuinfo	2020-07-01 12:07:01 +02:00
Tom	841364b609	Kernel: Add mechanism to identity map the lowest 2MB	2020-06-04 18:15:23 +02:00
Andreas Kling	6fe83b0ac4	Kernel: Crash the current process on OOM (instead of panicking kernel) This patch adds PageFaultResponse::OutOfMemory which informs the fault handler that we were unable to allocate a necessary physical page and cannot continue. In response to this, the kernel will crash the current process. Because we are OOM, we can't symbolicate the crash like we normally would (since the ELF symbolication code needs to allocate), so we also communicate to Process::crash() that we're out of memory. Now we can survive "allocate 300 MB" (only the allocate process dies.) This is definitely not perfect and can easily end up killing a random innocent other process who happened to allocate one page at the wrong time, but it's a lot better than panicking on OOM. :^)	2020-05-06 22:28:23 +02:00
Andreas Kling	9c856811b2	Kernel: Add Region helpers for accessing underlying physical pages Since a Region is basically a view into a potentially larger VMObject, it was always necessary to include the Region starting offset when accessing its underlying physical pages. Until now, you had to do that manually, but this patch adds a simple Region::physical_page() for read-only access and a physical_page_slot() when you want a mutable reference to the RefPtr<PhysicalPage> itself. A lot of code is simplified by making use of this.	2020-04-28 17:05:14 +02:00
Itamar	b306ac9b2b	ptrace: Add PT_POKE PT_POKE writes a single word to the tracee's address space. Some caveats: - If the user requests to write to an address in a read-only region, we temporarily change the page's protections to allow it. - If the user requests to write to a region that's backed by a SharedInodeVMObject, we replace the vmobject with a PrivateInodeVMObject.	2020-04-13 00:53:22 +02:00
Andreas Kling	c19b56dc99	Kernel+LibC: Add minherit() and MAP_INHERIT_ZERO This patch adds the minherit() syscall originally invented by OpenBSD. Only the MAP_INHERIT_ZERO mode is supported for now. If set on an mmap region, that region will be zeroed out on fork().	2020-04-12 20:22:26 +02:00
Andreas Kling	88b334135b	Kernel: Remove some Region construction helpers It's now up to the caller to provide a VMObject when constructing a new Region object. This will make it easier to handle things going wrong, like allocation failures, etc.	2020-03-01 11:23:10 +01:00
Andreas Kling	aa1e209845	Kernel: Remove some unnecessary indirection in InodeFile::mmap() InodeFile now directly calls Process::allocate_region_with_vmobject() instead of taking an awkward detour via a special Region constructor.	2020-02-28 20:29:14 +01:00
Andreas Kling	30a8991dbf	Kernel: Make Region weakable and use WeakPtr<Region> instead of Region* This turns use-after-free bugs into null pointer dereferences instead.	2020-02-24 13:32:45 +01:00
Andreas Kling	f17c377a0c	Kernel: Use bitfields in Region This makes Region 4 bytes smaller and we can use bitfield initializers since they are allowed in C++20. :^)	2020-02-19 12:03:11 +01:00
Andreas Kling	1d611e4a11	Kernel: Reduce header dependencies of MemoryManager and Region	2020-02-16 01:33:41 +01:00
Andreas Kling	a356e48150	Kernel: Move all code into the Kernel namespace	2020-02-16 01:27:42 +01:00
Andreas Kling	94ca55cefd	Meta: Add license header to source files As suggested by Joshua, this commit adds the 2-clause BSD license as a comment block to the top of every source file. For the first pass, I've just added myself for simplicity. I encourage everyone to add themselves as copyright holders of any file they've added or modified in some significant way. If I've added myself in error somewhere, feel free to replace it with the appropriate copyright holder instead. Going forward, all new source files should include a license header.	2020-01-18 09:45:54 +01:00
Liav A	d2b41010c5	Kernel: Change Region allocation helpers We now can create a cacheable Region, so when map() is called, if a Region is cacheable then all the virtual memory space being allocated to it will be marked as not cache disabled. In addition to that, OS components can create a Region that will be mapped to a specific physical address by using the appropriate helper method.	2020-01-14 15:38:58 +01:00
Andreas Kling	197e73ee31	Kernel+LibELF: Enable SMAP protection during non-syscall exec() When loading a new executable, we now map the ELF image in kernel-only memory and parse it there. Then we use copy_to_user() when initializing writable regions with data from the executable. Note that the exec() syscall still disables SMAP protection and will require additional work. This patch only affects kernel-originated process spawns.	2020-01-10 10:57:06 +01:00
Andreas Kling	ea1911b561	Kernel: Share code between Region::map() and Region::remap_page() These were doing mostly the same things, so let's just share the code.	2020-01-01 19:32:55 +01:00
Andreas Kling	0d5e0e4cad	Kernel+SystemMonitor: Expose amount of per-process dirty private memory Dirty private memory is all memory in non-inode-backed mappings that's process-private, meaning it's not shared with any other process. This patch exposes that number via SystemMonitor, giving us an idea of how much memory each process is responsible for all on its own.	2019-12-29 12:28:32 +01:00
Andreas Kling	7a0088c4d2	Kernel: Clean up Region access bit setters a little	2019-12-25 02:58:03 +01:00
Andreas Kling	b6ee8a2c8d	Kernel: Rename vmo => vmobject everywhere	2019-12-19 19:15:27 +01:00
Andreas Kling	1d4d6f16b2	Kernel: Add a specific-page variant of Region::commit()	2019-12-18 22:43:32 +01:00
Andreas Kling	931e4b7f5e	Kernel+SystemMonitor: Prevent userspace access to process ELF image Every process keeps its own ELF executable mapped in memory in case we need to do symbol lookup (for backtraces, etc.) Until now, it was mapped in a way that made it accessible to the program, despite the program not having mapped it itself. I don't really see a need for userspace to have access to this right now, so let's lock things down a little bit. This patch makes it inaccessible to userspace and exposes that fact through /proc/PID/vm (per-region "user_accessible" flag.)	2019-12-15 20:11:57 +01:00
Andreas Kling	3fbc50a350	Kernel+SystemMonitor: Expose the number of set CoW bits in each Region This number tells us how many more pages in a given region will trigger a CoW fault if written to.	2019-12-15 16:53:00 +01:00
Andreas Kling	f41ae755ec	Kernel: Crash on memory access in non-readable regions This patch makes it possible to make memory regions non-readable. This is enforced using the "present" bit in the page tables. A process that hits a not-present page fault in a non-readable region will be crashed.	2019-12-02 19:18:52 +01:00
Andreas Kling	7dc9c90f83	Kernel: Fix bug where mprotect() would ignore setting PROT_WRITE A typo in Region::set_writable() caused us to update the readable flag rather than the writable flag.	2019-12-02 18:15:36 +01:00
Andreas Kling	3dc87be891	Kernel: Mark mmap()-created regions with a special bit Then only allow regions with that bit to be manipulated via munmap() and mprotect(). This prevents messing with non-mmap()ed regions in a process's address space (stacks, shared buffers, ...)	2019-11-24 12:26:21 +01:00
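
The two rules named in commit af3d3c5c4a (memory that was previously writable cannot become executable, and memory that was previously executable cannot become writable) can be illustrated with a minimal sketch. This is not the kernel's code: RegionSketch and try_set_protection() are hypothetical names invented for the illustration, the refusal of a simultaneous write+execute request is an assumed baseline W^X check, and the dynamic-loader exception mentioned in that commit is left out.

```cpp
#include <cassert>

// Hypothetical stand-in for a memory region; NOT the kernel's Region class.
struct RegionSketch {
    bool writable { false };
    bool executable { false };
    // Sticky history bits: once set, they are never cleared.
    bool has_been_writable { false };
    bool has_been_executable { false };
};

// Hypothetical helper: applies the requested protection if the rules allow it
// and returns true; otherwise leaves the region untouched and returns false.
bool try_set_protection(RegionSketch& region, bool want_write, bool want_exec)
{
    // Assumed baseline W^X check: never writable and executable at once.
    if (want_write && want_exec)
        return false;
    // Rule 1: memory that was previously writable cannot become executable.
    if (want_exec && region.has_been_writable)
        return false;
    // Rule 2: memory that was previously executable cannot become writable.
    if (want_write && region.has_been_executable)
        return false;

    region.writable = want_write;
    region.executable = want_exec;
    if (want_write)
        region.has_been_writable = true;
    if (want_exec)
        region.has_been_executable = true;
    return true;
}
```

In this sketch a region that has ever been writable is refused a later request for execute permission, and the reverse, while dropping both permissions is always allowed. Keeping the history in sticky bits rather than checking only the current flags is what makes the rule stricter than plain W^X.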


84 Commits