Anonymous VM objects should never have null entries in their physical
page list. Instead, "empty" or untouched pages should refer to the
shared zero page.
Fixes #1237.
A 16-bit refcount is just begging for trouble right now.
A 32-bit refcount will be begging for trouble later down the line,
so we'll have to revisit this eventually. :^)
This patch adds a globally shared zero-filled PhysicalPage that will
be mapped into every slot of every zero-filled AnonymousVMObject until
that page is written to, achieving CoW-like zero-filled pages.
Initial testing shows that this doesn't actually achieve any sharing yet
but it seems like a good design regardless, since it may reduce the
number of page faults taken by programs.
If you look at the refcount of MM.shared_zero_page() it will have quite
a high refcount, but that's just because everything maps it everywhere.
If you want to see the "real" refcount, you can build with the
MAP_SHARED_ZERO_PAGE_LAZILY flag, and we'll defer mapping of the shared
zero page until the first non-present (NP) read fault.
I've left this behavior behind a flag for future testing of this code.
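In sketch form, the idea looks something like this (hypothetical names,
not the actual code): reads can keep hitting the shared zero page
forever, and the first write swaps in a private zeroed page.

    // Sketch only; page_for_write() and allocate_user_physical_page()
    // are stand-ins for whatever the real interfaces end up being.
    RefPtr<PhysicalPage> AnonymousVMObject::page_for_write(size_t page_index)
    {
        auto& slot = m_physical_pages[page_index];
        if (slot == MM.shared_zero_page()) {
            // First write to this page: break the CoW-like sharing by
            // replacing the shared zero page with a private zeroed page.
            slot = MM.allocate_user_physical_page(ShouldZeroFill::Yes);
        }
        return slot;
    }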
This was only used by HashTable::dump() which I used when doing the
first HashTable implementation. Removing this allows us to also remove
most includes of <AK/kstdio.h>.
It doesn't look healthy to create raw references into an array before
a temporary unlock. In fact, that temporary unlock looks generally
unhealthy, but it's a different problem.
Previously we were only checking that each of the virtual pages in the
specified range were valid.
This made it possible to pass in negative buffer sizes to some syscalls
as long as (address) and (address+size) were on the same page.
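The fix is to treat the size itself as suspect instead of only looking
at the endpoints; a minimal sketch of the overflow check (illustrative
names, not the actual validation code):

    #include <cstdint>

    bool is_valid_user_range(uintptr_t address, size_t size)
    {
        uintptr_t end = 0;
        if (__builtin_add_overflow(address, size, &end))
            return false; // (address + size) wrapped the address space
        // ...then validate every page in [address, end), not just the
        // first and last byte.
        return true;
    }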
Previously it was not possible for this function to fail. You could
exploit this by triggering the creation of a VMObject whose physical
memory range would wrap around the 32-bit limit.
It was quite easy to map kernel memory into userspace and read/write
whatever you wanted in it.
Test: Kernel/bxvga-mmap-kernel-into-userspace.cpp
When using dbg() in the kernel, the output is automatically prefixed
with [Process(PID:TID)]. This makes it a lot easier to understand which
thread is generating the output.
This patch also cleans up some common logging messages and removes the
now-unnecessary "dbg() << *current << ..." pattern.
We don't need to have this method anymore. It was a hack that was used
in many components in the system but currently we use better methods to
create virtual memory mappings. To prevent any further use of this
method it's best to just remove it completely.
Also, the APIC code is disabled for now since it doesn't help with
booting the system, and is broken since it relies on identity mapping
to exist in the first 1MB. Any call to the APIC code will result in an
assertion failure.
In addition to that, the method responsible for creating an identity
mapping between 1MB and 2MB has been renamed to be more precise about
its purpose.
There is no real "read protection" on x86, so we have no choice but to
map write-only pages simply as "present & read/write".
If we get a read page fault in a non-readable region, that's still a
correctness issue, so we crash the process. It's by no means a complete
protection against invalid reads, since it's trivial to fool the kernel
by first causing a write fault in the same region.
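Roughly, the translation from region protection to PTE bits looks like
this sketch (hypothetical helper; the real page table code differs):

    // x86 has no separate read permission: a writable page is always
    // readable too, so write-only collapses to present & read/write.
    // Non-readable regions are expressed by clearing "present" so that
    // any access faults and the kernel can crash the process.
    void set_pte_protection(PageTableEntry& pte, bool readable, bool writable)
    {
        pte.set_present(readable || writable);
        pte.set_writable(writable);
    }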
uintptr_t is 32-bit or 64-bit depending on the target platform.
This will help us write pointer size agnostic code so that when the day
comes that we want to do a 64-bit port, we'll be in better shape.
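For example, code that round-trips a pointer through an integer should
look like this rather than hardcoding u32:

    #include <cstdint>

    void* align_down_to_page(void* ptr)
    {
        // uintptr_t is 32-bit on i686 and 64-bit on x86_64, so this
        // compiles unchanged on both targets.
        auto address = reinterpret_cast<uintptr_t>(ptr);
        return reinterpret_cast<void*>(address & ~uintptr_t(0xfff));
    }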
Instead of restoring CR3 to the current process's paging scope when a
ProcessPagingScope goes out of scope, we now restore exactly whatever
the CR3 value was when we created the ProcessPagingScope.
This fixes breakage in situations where a process ends up with nested
ProcessPagingScopes. This was making profiling very fragile, and with
this change it's now possible to profile g++! :^)
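In sketch form (simplified; read_cr3()/write_cr3() are assumed
helpers), the scope now snapshots CR3 instead of assuming it belongs
to the current process:

    class ProcessPagingScope {
    public:
        explicit ProcessPagingScope(Process& process)
            : m_previous_cr3(read_cr3()) // whatever was active, not current->cr3
        {
            write_cr3(process.page_directory_base());
        }
        ~ProcessPagingScope() { write_cr3(m_previous_cr3); } // exact restore
    private:
        u32 m_previous_cr3 { 0 };
    };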
Previously, when deallocating a range of VM, we would sort and merge
the range list. This was quite slow for large processes.
This patch optimizes VM deallocation in the following ways:
- Use binary search instead of linear scan to find the place to insert
the deallocated range.
- Insert at the right place immediately, removing the need to sort.
- Merge the inserted range with any adjacent range(s) in-line instead
of doing a separate merge pass into a list copy.
- Add Traits<Range> to inform Vector that Range objects are trivial
and can be moved using memmove().
I've also added an assertion that deallocated ranges are actually part
of the RangeAllocator's initial address range.
I've benchmarked this using g++ to compile Kernel/Process.cpp.
With these changes, compilation goes from ~41 sec to ~35 sec.
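The insert-and-merge approach, sketched in isolation with plain std
types (not the actual RangeAllocator code):

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    struct Range {
        uintptr_t base { 0 };
        size_t size { 0 };
        uintptr_t end() const { return base + size; }
    };

    void deallocate_range(std::vector<Range>& free_ranges, Range range)
    {
        // Binary search for the insertion point; the list stays sorted.
        auto it = std::lower_bound(free_ranges.begin(), free_ranges.end(), range,
            [](auto& a, auto& b) { return a.base < b.base; });
        it = free_ranges.insert(it, range);
        // Merge with the following neighbor if adjacent.
        if (std::next(it) != free_ranges.end() && it->end() == std::next(it)->base) {
            it->size += std::next(it)->size;
            free_ranges.erase(std::next(it));
        }
        // Merge with the preceding neighbor if adjacent.
        if (it != free_ranges.begin() && std::prev(it)->end() == it->base) {
            std::prev(it)->size += it->size;
            free_ranges.erase(it);
        }
    }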
This will panic the kernel immediately if these functions are misused
so we can catch it and fix the misuse.
This patch fixes a couple of misuses:
- create_signal_trampolines() writes to a user-accessible page
above the 3GB address mark. We should really get rid of this
page but that's a whole other thing.
- CoW faults need to use copy_from_user rather than copy_to_user
since it's the *source* pointer that points to user memory.
- Inode faults need to use memcpy rather than copy_to_user since
we're copying a kernel stack buffer into a quickmapped page.
This should make the copy_to/from_user() functions slightly less useful
for exploitation. Before this, they were essentially just glorified
memcpy() with SMAP disabled. :^)
Technically the bottom 2MB is still identity-mapped for the kernel and
not made available to userspace at all, but for simplicity's sake we
can just ignore that and make "address < 0xc0000000" the canonical
check for user/kernel.
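In sketch form, the misuse check boils down to asserting on that
canonical boundary (simplified; the real functions validate more than
this):

    static inline bool is_user_address(uintptr_t address)
    {
        return address < 0xc0000000u; // userspace lives below 3GB
    }

    void copy_to_user(void* dest, const void* src, size_t n)
    {
        // Destination must be user memory, source must be kernel
        // memory; anything else is a kernel bug, so panic immediately.
        ASSERT(is_user_address(reinterpret_cast<uintptr_t>(dest)));
        ASSERT(!is_user_address(reinterpret_cast<uintptr_t>(src)));
        SmapDisabler disabler; // briefly allow touching user memory
        memcpy(dest, src, n);
    }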
It's now an error to sys$mmap() a file as writable if it's currently
mapped executable by anyone else.
It's also an error to sys$execve() a file that's currently mapped
writable by anyone else.
This fixes a race condition vulnerability where one program could make
modifications to an executable while another process was in the kernel,
in the middle of exec'ing the same executable.
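A hedged sketch of the two checks (the helper names and the ETXTBSY
choice here are illustrative; the real syscalls track mappings through
the inode):

    // In sys$mmap(): no writable mapping of a file that's mapped
    // executable by anyone else.
    if ((prot & PROT_WRITE) && inode.is_mapped_executable_by_anyone())
        return -ETXTBSY;

    // In sys$execve(): no exec of a file that's mapped writable by
    // anyone else.
    if (inode.is_mapped_writable_by_anyone())
        return -ETXTBSY;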
Test: Kernel/elf-execve-mmap-race.cpp
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.
For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.
Going forward, all new source files should include a license header.
This is not ASLR, but it does de-trivialize exploiting the ELF loader
which would previously always parse executables at 0x01001000 in every
single exec(). I've taken advantage of this multiple times in my own
toy exploits and it's starting to feel cheesy. :^)
We now use the regular "user" physical pages for on-demand page table
allocations. This was by far the biggest source of super physical page
exhaustion, so that bug should be a thing of the past now. :^)
We still have super pages, but they are barely used. They remain useful
for code that requires memory with a low physical address.
Fixes #1000.
After MemoryManager initialization, we now only leave the lowest 1MB
of memory identity-mapped. The very first (null) page is not present.
All other pages are RW but not X. Supervisor only.
The kernel and its static data structures are no longer identity-mapped
in the bottom 8MB of the address space, but instead move above 3GB.
The first 8MB above 3GB are pseudo-identity-mapped to the bottom 8MB of
the physical address space. But things don't have to stay this way!
Thanks to Jesse who made an earlier attempt at this, it was really easy
to get device drivers working once the page tables were in place! :^)
Fixes #734.
We can now create a cacheable Region, so when map() is called on a
cacheable Region, all the virtual memory allocated for it is mapped
with caching enabled (i.e. not marked as cache-disabled).
In addition to that, OS components can create a Region that will be
mapped to a specific physical address by using the appropriate helper
method.
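Roughly (hypothetical names), the map path now only sets the PTE's
cache-disable bit for non-cacheable Regions:

    // Sketch: PCD (cache disable) is set only when the Region asked
    // for uncached memory, e.g. for memory-mapped device registers.
    void map_region_page(PageTableEntry& pte, PhysicalPage& page, const Region& region)
    {
        pte.set_physical_page_base(page.paddr().get());
        pte.set_present(true);
        pte.set_cache_disabled(!region.is_cacheable());
    }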
When loading a new executable, we now map the ELF image in kernel-only
memory and parse it there. Then we use copy_to_user() when initializing
writable regions with data from the executable.
Note that the exec() syscall still disables SMAP protection and will
require additional work. This patch only affects kernel-originated
process spawns.
Supervisor Mode Access Prevention (SMAP) is an x86 CPU feature that
prevents the kernel from accessing userspace memory. With SMAP enabled,
trying to read/write a userspace memory address while in the kernel
will now generate a page fault.
Since it's sometimes necessary to read/write userspace memory, there
are two new instructions that quickly switch the protection on/off:
STAC (disables protection) and CLAC (enables protection.)
These are exposed in kernel code via the stac() and clac() helpers.
There's also a SmapDisabler RAII object that can be used to ensure
that you don't forget to re-enable protection before returning to
userspace code.
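In sketch form, the helpers boil down to the two instructions plus an
RAII wrapper (simplified; e.g. this version doesn't bother saving the
previous state):

    inline void stac() { asm volatile("stac" ::: "cc"); } // protection off
    inline void clac() { asm volatile("clac" ::: "cc"); } // protection on

    class SmapDisabler {
    public:
        SmapDisabler() { stac(); }
        ~SmapDisabler() { clac(); }
    };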
This patch also adds copy_to_user(), copy_from_user() and memset_user()
which are the "correct" way of doing things. These functions allow us
to briefly disable protection for a specific purpose, and then turn it
back on immediately after it's done. Going forward, all kernel code
should be moved to using these, and all uses of SmapDisabler are to be
considered FIXMEs.
Note that we're not realizing the full potential of this feature since
I've used SmapDisabler quite liberally in this initial bring-up patch.
Inode::size() may try to take a lock, so we can't be calling it with
interrupts disabled.
This fixes a kernel hang when trying to execute a binary in a TmpFS.
We now validate the full range of userspace memory passed into syscalls
instead of just checking that the first and last byte of the memory are
in process-owned regions.
This fixes an issue where it was possible to avoid rejection of invalid
addresses that sat between two valid ones, simply by passing a valid
address and a size large enough to put the end of the range at another
valid address.
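Conceptually, the validator now does something like this (illustrative
code; has_region_containing() is a made-up helper):

    bool validate_user_range(const Process& process, uintptr_t address, size_t size)
    {
        uintptr_t page = address & ~uintptr_t(0xfff);
        uintptr_t end = address + size; // overflow rejected before this
        for (; page < end; page += 0x1000) {
            // Every page in the range must be in a process-owned
            // region, so a hole between two valid pages now fails.
            if (!process.has_region_containing(page))
                return false;
        }
        return true;
    }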
I added a little test utility that tries to provoke EFAULT in various
ways to help verify this. I'm sure we can think of more ways to test
this but it's at least a start. :^)
Thanks to mozjag for pointing out that this code was still lacking!
Incidentally this also makes backtraces work again.
Fixes #989.
We now refuse to boot on machines that don't support PAE since all
of our paging code depends on it.
Also let's only enable SSE and PGE support if the CPU advertises it.
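All three features are advertised in CPUID leaf 1's EDX output; a
hedged sketch of the boot-time check (panic() stands in for however we
actually refuse to boot):

    #include <cpuid.h>

    void verify_cpu_features()
    {
        unsigned eax, ebx, ecx, edx;
        __get_cpuid(1, &eax, &ebx, &ecx, &edx);
        if (!(edx & (1 << 6))) // EDX bit 6: PAE
            panic("CPU lacks PAE support, refusing to boot");
        bool has_pge = edx & (1 << 13); // EDX bit 13: global pages
        bool has_sse = edx & (1 << 25); // EDX bit 25: SSE
        // ...only set CR4.PGE and enable SSE when advertised.
    }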
At the moment, addresses below 8MB and above 3GB are never accessible
to userspace, so just reject them without even looking at the current
process's memory regions.
We were happily allowing syscalls with pointers into kernel-only
regions (virtual address >= 0xc0000000).
This patch fixes that by only considering user regions in the current
process, and also double-checking the Region::is_user_accessible() flag
before approving an access.
Thanks to Fire30 for finding the bug! :^)
This is memory that's loaded from an inode (file) but not modified in
memory, so still identical to what's on disk. This kind of memory can
be freed and reloaded transparently from disk if needed.
Dirty private memory is all memory in non-inode-backed mappings that's
process-private, meaning it's not shared with any other process.
This patch exposes that number via SystemMonitor, giving us an idea of
how much memory each process is responsible for all on its own.
Instead of panicking right away when we run out of physical pages,
we now try to find a PurgeableVMObject with some volatile pages in it.
If we find one, we purge that entire object and steal one of its pages.
This makes it possible for the kernel to keep going instead of dying.
Very cool. :^)
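A sketch of the fallback path (hypothetical iteration API; the real
allocator differs):

    RefPtr<PhysicalPage> MemoryManager::allocate_user_physical_page()
    {
        auto page = find_free_physical_page();
        if (!page) {
            // Out of memory: look for a purgeable object with volatile
            // pages, purge it, and try again before giving up.
            for (auto& vmobject : m_purgeable_vmobjects) {
                if (vmobject.has_volatile_pages()) {
                    vmobject.purge();
                    page = find_free_physical_page();
                    break;
                }
            }
        }
        ASSERT(page); // still nothing: now it really is fatal
        return page;
    }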
Previously we assumed all hosts would have support for IA32_EFER.NXE.
This is mostly true for newer hardware, but older hardware will crash
and burn if you try to use this feature.
Now we check for support via CPUID.80000001[20].
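That is extended leaf 0x80000001, EDX bit 20; in code the check looks
roughly like:

    #include <cpuid.h>

    bool cpu_supports_nx()
    {
        unsigned eax, ebx, ecx, edx;
        if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx))
            return false; // extended leaf unavailable on this CPU
        return edx & (1u << 20); // only then may IA32_EFER.NXE be set
    }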
Introduce one more (CPU) indirection layer in the paging code: the page
directory pointer table (PDPT). Each PageDirectory now has 4 separate
PageDirectoryEntry arrays, governing 1 GB of VM each.
A really neat side-effect of this is that we can now share the physical
page containing the >=3GB kernel-only address space metadata between
all processes, instead of lazily cloning it on page faults.
This will give us access to the NX (No eXecute) bit, allowing us to
prevent execution of memory that's not supposed to be executed.
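For reference, this is how a 32-bit virtual address decomposes under
PAE (standard x86 layout, sketched as code):

    #include <cstdint>

    // bits 31:30 -> PDPT index (4 entries, 1GB each)
    // bits 29:21 -> page directory index (512 entries)
    // bits 20:12 -> page table index (512 entries)
    // bits 11:0  -> offset within the 4KB page
    struct PaeIndices {
        uint32_t pdpt, pde, pte, offset;
    };

    PaeIndices decompose(uint32_t vaddr)
    {
        return {
            (vaddr >> 30) & 0x3,
            (vaddr >> 21) & 0x1ff,
            (vaddr >> 12) & 0x1ff,
            vaddr & 0xfff,
        };
    }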
I'm not sure how I managed to misread the location of this bit twice.
But I did! Here is finally the correct value, according to Intel:
"Page Global Enable (bit 7 of CR4)"
Jeez! :^)
Setting this bit will cause the CPU to generate a page fault when
writing to read-only memory, even if we're executing in the kernel.
Seemingly the only change needed to make this work was to have the
inode-backed page fault handler use a temporary mapping for writing
the read-from-disk data into the newly-allocated physical page.
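The bit in question is CR0.WP (bit 16); a sketch of flipping it on:

    inline void enable_write_protect()
    {
        u32 cr0;
        asm volatile("mov %%cr0, %0" : "=r"(cr0));
        cr0 |= (1u << 16); // WP: supervisor writes honor read-only too
        asm volatile("mov %0, %%cr0" ::"r"(cr0));
    }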
To enforce this, we create two separate mappings of the same underlying
physical page. A writable mapping for the kernel, and a read-only one
for userspace (the one returned by sys$get_kernel_info_page.)
Every process keeps its own ELF executable mapped in memory in case we
need to do symbol lookup (for backtraces, etc.)
Until now, it was mapped in a way that made it accessible to the
program, despite the program not having mapped it itself.
I don't really see a need for userspace to have access to this right
now, so let's lock things down a little bit.
This patch makes it inaccessible to userspace and exposes that fact
through /proc/PID/vm (per-region "user_accessible" flag.)
It's now possible to get purgeable memory by using mmap(MAP_PURGEABLE).
Purgeable memory has a "volatile" flag that can be set using madvise():
- madvise(..., MADV_SET_VOLATILE)
- madvise(..., MADV_SET_NONVOLATILE)
When in the "volatile" state, the kernel may take away the underlying
physical memory pages at any time, without notifying the owner.
This gives you a guilt discount when caching very large things. :^)
Setting a purgeable region to non-volatile will return whether or not
the memory has been taken away by the kernel while being volatile.
Basically, if madvise(..., MADV_SET_NONVOLATILE) returns 1, that means
the memory was purged while volatile, and whatever was in that piece
of memory needs to be reconstructed before use.
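Usage looks roughly like this (sketch based on the flags described
above):

    // Allocate a purgeable cache buffer.
    void* cache = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                       MAP_ANONYMOUS | MAP_PRIVATE | MAP_PURGEABLE, 0, 0);

    // Done with it for now; let the kernel reclaim it under pressure.
    madvise(cache, size, MADV_SET_VOLATILE);

    // Need it again: reclaim it and check whether it survived.
    if (madvise(cache, size, MADV_SET_NONVOLATILE) == 1) {
        // It was purged; rebuild the contents before using them.
    }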
This patch makes it possible to make memory regions non-readable.
This is enforced using the "present" bit in the page tables.
A process that hits a not-present page fault in a non-readable
region will be crashed.
The fault was happening when retrieving a current backtrace for the
SystemServer process.
To generate a backtrace, we go into the paging scope of the process,
meaning we temporarily switch to using its page directory as our own.
Because kernel VM is allocated on demand, it's possible for a process's
mappings above the 3GB mark to be out-of-date. Normally this just gets
fixed up transparently by the page fault handler (which simply copies
the PDE from the canonical MM.kernel_page_directory() into the current
process.)
However, if the current kernel *stack* is in a piece of memory that
the backtraced process lacks up-to-date PDEs for, we still get a page
fault, but are unable to handle it, since the CPU wants to push to the
stack as part of calling the page fault handler. So we're screwed and
it's a triple-fault.
Fix this by always updating the kernel VM mappings before switching
into a paging scope. In practical terms, this is a 1KB memcpy() that
happens when generating a backtrace, or doing exec().
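In sketch form (pre-PAE layout: 4-byte PDEs, with the top 1GB of
address space covered by page directory entries 768-1023, hence the
1KB):

    // Sketch: before adopting another process's page directory, copy
    // in the canonical kernel PDEs so the current kernel stack is
    // always mapped. 256 entries * 4 bytes = 1KB.
    void synchronize_kernel_mappings(PageDirectory& target)
    {
        auto& canonical = MM.kernel_page_directory();
        memcpy(&target.entries()[768], &canonical.entries()[768],
               256 * sizeof(PageDirectoryEntry));
    }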