This patch adds Space, a class representing a process's address space.
- Each Process has a Space.
- The Space owns the PageDirectory and all Regions in the Process.
This allows us to reorganize sys$execve() so that it constructs and
populates a new Space fully before committing to it.
Previously, we would construct the new address space while still
running in the old one, and encountering an error meant we had to do
tedious and error-prone rollback.
Those problems are now gone, replaced by what's hopefully a set of much
smaller problems and missing cleanups. :^)
This patch adds sys$msyscall() which is loosely based on an OpenBSD
mechanism for preventing syscalls from non-blessed memory regions.
It works similarly to pledge and unveil: you can call it as many
times as you like, and when you're finished, you call it with a null
pointer, after which it stops accepting new regions.
If a syscall later happens and doesn't originate from one of the
previously blessed regions, the kernel will simply crash the process.
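For illustration, the intended call pattern looks roughly like this
(a sketch only; the `int msyscall(void*)` prototype and the region
pointers are assumptions, not part of this patch):

#include <sys/mman.h> // assumed home of the msyscall() prototype

// Bless each region that's allowed to issue syscalls, then seal.
void bless_syscall_regions(void* program_text, void* libc_text)
{
    msyscall(program_text); // may be called any number of times
    msyscall(libc_text);
    msyscall(nullptr);      // done: no new regions accepted after this
}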
The random address proposals were not checked against the allocation
size, so it became increasingly likely that we'd try to allocate
outside the available space as sizes grew larger.
Now they will be ignored instead of triggering a Kernel assertion
failure.
This is a continuation of: c8e7baf4b8
This patch adds enforcement of two new rules:
- Memory that was previously writable cannot become executable
- Memory that was previously executable cannot become writable
Unfortunately we have to make an exception for text relocations in the
dynamic loader. Since those necessitate writing into a private copy
of library code, we allow programs to transition from RW to RX under
very specific conditions. See the implementation of sys$mprotect()'s
should_make_executable_exception_for_dynamic_loader() for details.
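Here's a hedged userspace sketch of the first rule (the exact errno
is an assumption):

#include <sys/mman.h>

// Once memory has been writable, making it executable is refused.
bool wx_transition_is_refused()
{
    void* p = mmap(nullptr, 4096, PROT_READ | PROT_WRITE,
                   MAP_ANON | MAP_PRIVATE, -1, 0);
    int rc = mprotect(p, 4096, PROT_READ | PROT_EXEC);
    return rc < 0; // expected to fail under the new enforcement
}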
This can be used to request random VM placement instead of the highly
predictable regular mmap(nullptr, ...) VM allocation strategy.
It will soon be used to implement ASLR in the dynamic loader. :^)
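Presumably it will be used along these lines (a sketch; the flag name
MAP_RANDOMIZED is an assumption based on context):

#include <cstddef>
#include <sys/mman.h>

// Ask for randomized placement instead of the predictable default.
void* allocate_randomized(size_t size)
{
    return mmap(nullptr, size, PROT_READ | PROT_WRITE,
                MAP_ANON | MAP_PRIVATE | MAP_RANDOMIZED, -1, 0);
}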
We need to make sure other processors can grab the MM lock while we
wait, so release it when we might block. Reading the page from
disk may also block, so release it during that time as well.
This eliminates the window between calling Processor::current and
the member function where a thread could be moved to another
processor. This is generally not as big of a concern as with
Processor::current_thread, but it's also slightly more lightweight.
If we find ourselves with a user-accessible, non-shared Region backed by
a SharedInodeVMObject, that's pretty bad news, so let's just panic the
kernel instead of getting abused.
There might be a better place for this kind of check, so I've added a
FIXME about putting more thought into that.
This was exploitable since the shared flag determines whether inode
permission checks are applied in sys$mprotect().
The bug was pretty hard to spot due to default arguments being used
instead. This patch removes the default arguments to make explicit
at each call site what's being done.
This was done with the help of several scripts; I dump them here to
easily find them later:
awk '/#ifdef/ { print "#cmakedefine01 "$2 }' AK/Debug.h.in
for debug_macro in $(awk '/#ifdef/ { print $2 }' AK/Debug.h.in)
do
find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/#ifdef '$debug_macro'/#if '$debug_macro'/' {} \;
done
# Remember to remove WRAPPER_GENERATOR_DEBUG from the list.
awk '/#cmake/ { print "set("$2" ON)" }' AK/Debug.h.in
The kernel ignored the first 8 MiB of RAM while parsing the memory map
because the kmalloc heaps and the super physical pages lived here. Move
all that stuff inside the .bss segment so that those memory regions are
accounted for, otherwise we risk overwriting boot modules placed next
to the kernel.
This adds support for FUTEX_WAKE_OP, FUTEX_WAIT_BITSET, FUTEX_WAKE_BITSET,
FUTEX_REQUEUE, and FUTEX_CMP_REQUEUE, as well as global and private
futexes and absolute/relative timeouts against the appropriate clock. This
also changes the implementation so that kernel resources are only used when
a thread is blocked on a futex.
Global futexes are implemented as offsets in VMObjects, so that different
processes can share a futex against the same VMObject despite potentially
being mapped at different virtual addresses.
Problem:
- Many constructors are defined as `{}` rather than using the
`= default` compiler-provided constructor.
- Some types provide an implicit conversion operator from `nullptr_t`
instead of requiring the caller to default construct. This violates
the C++ Core Guidelines suggestion to declare single-argument
constructors explicit
(https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c46-by-default-declare-single-argument-constructors-explicit).
Solution:
- Change default constructors to use the compiler-provided default
constructor.
- Remove implicit conversion operators from `nullptr_t` and change
usage to enforce type consistency without conversion.
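A minimal sketch of the kind of change involved (hypothetical type):

#include <cstddef>

// Before: hand-written empty constructor, implicit nullptr_t
// conversion.
struct Handle {
    Handle() {}
    Handle(std::nullptr_t) {}
};

// After: compiler-provided default constructor; callers
// default-construct instead of passing nullptr.
struct BetterHandle {
    BetterHandle() = default;
};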
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.
The modifications in this commit were automatically made using the
following command:
find . -name '*.cpp' -exec sed -i -E 's/dbg\(\) << ("[^"{]*");/dbgln\(\1\);/' {} \;
If a TLB flush request is broadcast to other processors and the addresses
to flush are user mode addresses, we can ignore such a request on the
target processor if the page directory currently in use doesn't match
the addresses to be flushed. We still need to broadcast to all processors
in that case because the other processors may switch to that same page
directory at any time.
If we remap pages (e.g. lazy allocation) inside a VMObject that is
shared among more than one region, broadcast it to any other region
that may be mapping the same page.
Lazily committed shared memory was not working in situations where one
process would write to the memory and another would only read from it.
Since the reading process would never cause a write fault in the shared
region, we'd never notice that the writing process had added real
physical pages to the VMObject. This happened because the lazily
committed pages were marked "present" in the page table.
This patch solves the issue by always allocating shared memory up front
and not trying to be clever about it.
Before this change, we would sometimes map a region into the address
space with !is_shared(), and then moments later call set_shared(true).
I found this very confusing while debugging, so this patch makes us pass
the initial shared flag to the Region constructor, ensuring that it's in
the correct state by the time we first map the region.
By designating a committed page pool we can guarantee to have physical
pages available for lazy allocation in mappings. However, when forking
we will overcommit. The assumption is that worst-case it's better for
the fork to die due to insufficient physical memory on COW access than
the parent that created the region. If a fork wants to ensure that all
memory is available (trigger a commit) then it can use madvise.
This also means that fork can now gracefully fail if we don't have
enough physical pages available.
Rather than lazily committing regions by default, we now commit
the entire region unless MAP_NORESERVE is specified.
This solves random crashes in low-memory situations where e.g. the
malloc heap had allocated memory, but touching previously untouched
pages would trigger a crash once no more physical memory was
available.
Use this flag to create large regions without actually committing
the backing memory. madvise() can be used to commit arbitrary areas
of such regions after creating them.
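A hedged sketch of the usage described above (the madvise() flag name
is an assumption; this message only says madvise() can commit areas):

#include <cstddef>
#include <sys/mman.h>

// Reserve a large region without committing backing memory, then
// commit just the first page before touching it.
void* reserve_big_region(size_t size)
{
    void* base = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                      MAP_ANON | MAP_PRIVATE | MAP_NORESERVE, -1, 0);
    if (base == MAP_FAILED)
        return nullptr;
    madvise(base, 4096, MADV_SET_NONVOLATILE); // flag name assumed
    return base;
}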
This adds the ability for a Region to define volatile/nonvolatile
areas within mapped memory using madvise(). This also means that
memory purging takes into account all views of the PurgeableVMObject
and only purges memory that is not needed by all of them. When calling
madvise() to change an area to nonvolatile memory, return whether
memory from that area was purged. At that time also try to remap
all memory that is requested to be nonvolatile, and if insufficient
pages are available notify the caller of that fact.
This assertion cannot be safely/reliably made in the
~SharedInodeVMObject destructor. The problem is that
Inode::is_shared_vmobject holds a weak reference to the instance
that is being destroyed (ref count 0). Checking the pointer using
WeakPtr::unsafe_ptr will produce nullptr depending on timing in
this case, and WeakPtr::safe_ref will reliably produce a nullptr
as soon as the reference count drops to 0. The only case where
this assertion could succeed is when WeakPtr::unsafe_ptr returned
the pointer because it won the race against revoking it. And
because WeakPtr::safe_ref will always return a nullptr, we cannot
reliably assert this from the ~SharedInodeVMObject destructor.
Fixes #4621
When doing the cast to u64 on the page directory physical address,
the sign bit was being extended. This only becomes an issue when
crossing the 2 GiB boundary. At >= 2 GiB, the physical address
has the sign bit set. For example, 0x80000000.
This set all the reserved bits in the PDPTE, causing a GPF
when loading the PDPT pointer into CR3. The reserved bits are
presumably there to stop you writing out a physical address that
the CPU physically cannot handle, as the size of the reserved bits
is determined by the physical address width of the CPU.
This patch fixes the issue by casting to FlatPtr instead. I believe
the sign extension only happens when casting to a bigger type. I'm
also using
FlatPtr because it's a pointer we're writing into the PDPTE.
sizeof(FlatPtr) will always be the same size as sizeof(void*).
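Here's a standalone illustration of the hazard (not the kernel's
actual code):

#include <cstdint>
#include <cstdio>

int main()
{
    uint32_t physical = 0x80000000u; // an address at the 2 GiB mark
    // Going through a signed 32-bit type sign-extends the top bit:
    uint64_t bad = (uint64_t)(int32_t)physical;  // 0xffffffff80000000
    // Casting the unsigned value zero-extends, as intended:
    uint64_t good = (uint64_t)physical;          // 0x0000000080000000
    printf("bad=%#llx good=%#llx\n",
           (unsigned long long)bad, (unsigned long long)good);
    return 0;
}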
This also now asserts that the physical address in the PDPTE is
within the max physical address the CPU supports. This is better
than getting a GPF, because CPU::handle_crash tries to do the same
operation that caused the GPF in the first place. That would cause
an infinite loop of GPFs until the stack was exhausted, causing a
triple fault.
As far as I know and tested, I believe we can now use the full 32-bit
physical range without crashing.
Fixes #4584. See that issue for the full debugging story.
Anything at or above the 2 GiB mark has the leftmost bit set
(0x8000...), which was falsely interpreted as negative due to
local_offset being signed.
This makes it unsigned by using FlatPtr. To check for underflow as
was intended, let's use Checked instead.
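Roughly along these lines (a sketch using AK::Checked; the
surrounding names are made up for illustration):

#include <AK/Checked.h>
#include <AK/Types.h>

// Compute an offset into a region, detecting underflow instead of
// silently wrapping around.
bool offset_into_region(FlatPtr vaddr, FlatPtr region_base,
                        FlatPtr& out_offset)
{
    Checked<FlatPtr> local_offset = vaddr;
    local_offset -= region_base;
    if (local_offset.has_overflow())
        return false; // would have underflowed
    out_offset = local_offset.value();
    return true;
}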
Fixes #4585
Problem:
- `(void)` simply casts the expression to void. This is understood to
indicate that it is ignored, but this is really a compiler trick to
get the compiler to not generate a warning.
Solution:
- Use the `[[maybe_unused]]` attribute to indicate the value is unused.
Note:
- Functions taking a `(void)` argument list have also been changed to
`()` because this is not needed and shows up in the same grep
command.
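A minimal before/after sketch:

// Before: a cast whose only job is to silence the warning.
void old_style(int rc)
{
    (void)rc;
}

// After: the attribute states the intent directly.
void new_style([[maybe_unused]] int rc)
{
}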
Fix some problems with join blocks where the joining thread block
condition was added twice, which led to a crash when trying to
unblock that condition a second time.
Deferred block condition evaluation by File objects was also not
properly keeping the File object alive, which led to some random
crashes and corruption problems.
Other problems were caused by the fact that the Queued state didn't
handle signals/interruptions consistently. To solve these issues we
remove this state entirely, along with Thread::wait_on and change
the WaitQueue into a BlockCondition instead.
Also, deliver signals even if there isn't going to be a context switch
to another thread.
Fixes #4336 and #4330
This changes the Thread::wait_on function to not enable interrupts
upon leaving, which caused some problems with page fault handlers
and in other situations. It may now be called from critical
sections, with interrupts enabled or disabled, and returns to the
same state.
This also requires some fixes to Lock. To aid debugging, a new
define LOCK_DEBUG is added that enables checking for Lock leaks
upon finalization of a Thread.
This makes most operations thread safe, especially so that they
can safely be used in the Kernel. This includes obtaining a strong
reference from a weak reference, which now requires an explicit
call to WeakPtr::strong_ref(). Another major change is that
Weakable::make_weak_ref() may require the explicit target type.
Previously we used reinterpret_cast in WeakPtr, assuming that it
can be properly converted. But WeakPtr does not necessarily have
the knowledge to be able to do this. Instead, we now ask the class
itself to deliver a WeakPtr to the type that we want.
Also, WeakLink is no longer specific to a target type. The reason
for this is that we want to be able to safely convert e.g. WeakPtr<T>
to WeakPtr<U>, and before this we just reinterpret_cast the internal
WeakLink<T> to WeakLink<U>, which is a bold assumption that it would
actually produce the correct code. Instead, WeakLink now operates
on just a raw pointer and we only make those constructors/operators
available if we can verify that it can be safely cast.
In order to guarantee thread safety, we now use the least significant
bit in the pointer for locking purposes. This also means that only
properly aligned pointers can be used.
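As a usage sketch (names taken from this message; exact signatures
are assumed, and Thread stands in for any Weakable class):

void try_use(WeakPtr<Thread>& weak_thread)
{
    // Obtaining a strong reference is now an explicit, safe step:
    if (auto strong_thread = weak_thread.strong_ref()) {
        // strong_thread keeps the Thread alive in this scope, even
        // if another processor drops the last other reference.
    }
}

// Creating the weak reference may need an explicit target type:
// auto weak_thread = thread->make_weak_ref<Thread>();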
Notice that we ensured that the size is a multiple of the page size
and that there is at least one page there; otherwise, this change
would be invalid.
We create an empty region and then expand it:
// First iteration.
m_user_physical_regions.append(PhysicalRegion::create(addr, addr));
// Following iterations.
region->expand(region->lower(), addr);
So if the memory region only has one page, we would end up with an empty
region. Thus we need to do one more iteration.
There are plenty of places in the kernel that aren't
checking if they actually got their allocation.
This fixes some of them, but definitely not all.
Fixes #3390 and #3391
Also, let's make find_one_free_page() return nullptr
if it doesn't get a free index. This stops the kernel
crashing when out of memory and allows memory purging
to take place again.
Fixes #3487
Since the CPU already does almost all necessary validation steps
for us, we don't really need to attempt to do this. Doing it
ourselves doesn't really work very reliably, because we'd have to
account for other processors modifying virtual memory, and we'd
have to account for e.g. pages not being able to be allocated
due to insufficient resources.
So change the copy_to/from_user (and associated helper functions)
to use the new safe_memcpy, which will return whether it succeeded
or not. The only manual validation step needed (which the CPU
can't perform for us) is making sure the pointers provided by user
mode aren't pointing to kernel mappings.
To make it easier to read/write from/to either kernel or user mode
data add the UserOrKernelBuffer helper class, which will internally
either use copy_from/to_user or directly memcpy, or pass the data
through directly using a temporary buffer on the stack.
Last but not least we need to keep syscall params trivial as we
need to copy them from/to user mode using copy_from/to_user.
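The resulting call-site pattern looks roughly like this (a sketch;
the syscall and its params struct are hypothetical):

int sys_foo(const Syscall::SC_foo_params* user_params)
{
    Syscall::SC_foo_params params; // must stay trivially copyable
    if (!copy_from_user(&params, user_params))
        return -EFAULT; // the copy faulted; no manual checks needed
    // ... act on params ...
    return 0;
}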
I decided to modify MappedROM.h because all other entries in
Forward.h are also classes, and this is visually more pleasing.
Other than that, it just doesn't make any difference which way we resolve
the conflicts.
Rather than trying to find a contiguous set of bits of size 1, just
find one single available bit using a hint.
Also, try to randomize returned physical pages a bit by placing them
into a 256 entry queue rather than making them available immediately.
Then, once the queue is filled, pick a random one, make it available
again and use that slot for the latest page to be returned.
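A standalone sketch of the recycling-queue idea (illustrative only,
not the kernel's actual code):

#include <cstddef>
#include <cstdint>
#include <cstdlib>

static constexpr size_t queue_size = 256;
static uintptr_t recently_freed[queue_size];
static size_t queued = 0;

// Returns 0 while the queue is still filling; afterwards returns a
// randomly chosen previously-freed page and queues the newest one.
uintptr_t defer_free(uintptr_t page)
{
    if (queued < queue_size) {
        recently_freed[queued++] = page;
        return 0;
    }
    size_t i = (size_t)rand() % queue_size;
    uintptr_t released = recently_freed[i];
    recently_freed[i] = page;
    return released;
}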
Sometimes a physical underlying page may be there, but we may be
unable to allocate a page table that may be needed to map it. Bubble
up such mapping errors so that they can be handled more appropriately.
If allocating a page table triggers purging memory, we need to call
quickmap_pd again to make sure the underlying physical page is
remapped to the correct one. This is needed because purging itself
may trigger calls to ensure_pte as well.
Fixes #3370
When cloning a purgeable memory region (which happens on fork),
we need to preserve the "was purged" and "volatile" state of the
original region, or they will always appear as non-volatile and
unpurged regions in the child process.
Fixes #3374.
Add an ExpandableHeap and switch kmalloc to use it, which allows
for the kmalloc heap to grow as needed.
In order to make heap expansion work, we keep around a 1 MiB backup
memory region, because creating a region would require space in the
same heap. This means the heap will grow as soon as the reported
utilization is less than 1 MiB. It will also return memory if an entire
subheap is no longer needed, although that is rarely possible.
MemoryManager cannot use the Singleton class because
MemoryManager::initialize is called before the global constructors
are run. That caused the Singleton to be re-initialized, causing
it to create another MemoryManager instance.
Fixes #3226
Rather than hardcoding where the kmalloc pool should be, place
it at the end of the kernel image instead. This avoids corrupting
global variables or other parts of the kernel as it grows.
Fixes #3257
The SI prefixes "k", "M", "G" mean "10^3", "10^6", "10^9".
The IEC prefixes "Ki", "Mi", "Gi" mean "2^10", "2^20", "2^30".
Let's use the correct name, at least in code.
Only changes the name of the constants, no other behavior change.
This is something I've been meaning to do for a long time, and here we
finally go. This patch moves all sys$foo functions out of Process.cpp
and into files in Kernel/Syscalls/.
It's not exactly one syscall per file (although it could be, but I got
a bit tired of the repetitive work here..)
This makes hacking on individual syscalls a lot less painful since you
don't have to rebuild nearly as much code every time. I'm also hopeful
that this makes it easier to understand individual syscalls. :^)
MemoryManager::quickmap_pd and MemoryManager::quickmap_pt can only
be called by one processor at a time anyway, since anything using
these must have the MM lock held. So, no need to inform the other
CPUs to flush their TLBs, we can just flush our own.
We can now properly initialize all processors without
crashing by sending SMP IPI messages to synchronize memory
between processors.
We now initialize the APs once we have the scheduler running.
This is so that we can process IPI messages from the other
cores.
Also rework interrupt handling a bit so that it's more of a
1:1 mapping. We need to allocate non-sharable interrupts for
IPIs.
This also fixes the occasional hang/crash because all
CPUs now synchronize memory with each other.
When delivering urgent signals to the current thread
we need to check if we should be unblocked, and if not
we need to yield to another process.
We also need to make sure that we suppress context switches
during Process::exec() so that we don't clobber the registers
that it sets up (eip mainly) by a context switch. To be able
to do that we add the concept of a critical section, which is
similar to Process::m_in_irq but different in that it can be
requested at any time. Calls to Scheduler::yield and
Scheduler::donate_to will return instantly without triggering
a context switch, but the processor will then asynchronously
trigger a context switch once the critical section is left.
- If rdseed is not available, fall back to rdrand.
- If rdrand is not available, block for entropy, or use an insecure
PRNG, depending on whether the user wants fast or good randomness.
Add a MappedROM::find_chunk_starting_with() helper since that's a very
common usage pattern in clients of this code.
Also convert MultiProcessorParser from a persistent singleton object
to a temporary object constructed via a failable factory function.
This patch adds a MappedROM abstraction to the Kernel VM subsystem.
It's basically the read-only byte buffer equivalent of a TypedMapping.
We use this in the ACPI and MP table parsers to scan for interesting
stuff in low memory instead of doing a bunch of address arithmetic.
This was supposed to be the foundation for some kind of pre-kernel
environment, but nobody is working on it right now, so let's move
everything back into the kernel and remove all the confusion.
Ultimately we should not panic just because we can't fully commit a VM
region (by populating it with physical pages.)
This patch handles some of the situations where commit() can fail.
This caused us to report one purged page per occurrence of the shared
zero page in a purgeable memory region, despite it being a no-op.
Thanks to Sergey for spotting the bad assertion removal that led to
this being found!
This patch adds PageFaultResponse::OutOfMemory which informs the fault
handler that we were unable to allocate a necessary physical page and
cannot continue.
In response to this, the kernel will crash the current process. Because
we are OOM, we can't symbolicate the crash like we normally would
(since the ELF symbolication code needs to allocate), so we also
communicate to Process::crash() that we're out of memory.
Now we can survive "allocate 300 MB" (only the allocate process dies.)
This is definitely not perfect and can easily end up killing a random
innocent other process who happened to allocate one page at the wrong
time, but it's a *lot* better than panicking on OOM. :^)
This function has a lot of callers that don't bother checking if it
returns successfully or not. We'll need to handle failure in a bunch
of places and then we can remove this assertion.
If we OOM during a CoW fault and fail to allocate a new page for the
writing process, just leave the original VMObject alone so everyone
else can keep using it.
Since a Region is basically a view into a potentially larger VMObject,
it was always necessary to include the Region starting offset when
accessing its underlying physical pages.
Until now, you had to do that manually, but this patch adds a simple
Region::physical_page() for read-only access and a physical_page_slot()
when you want a mutable reference to the RefPtr<PhysicalPage> itself.
A lot of code is simplified by making use of this.
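Hypothetical call sites, going by the description above:

void example(Region& region, size_t page_index)
{
    // Read-only access; the Region's starting offset within its
    // VMObject is applied internally:
    auto page = region.physical_page(page_index);

    // Mutable access to the RefPtr<PhysicalPage> slot itself:
    auto& slot = region.physical_page_slot(page_index);
}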
This memory range was set up using 2MB pages by the code in boot.S.
Because of that, the kernel image protection code didn't work, since it
assumed 4KB pages.
We now switch to 4KB pages during MemoryManager initialization. This
makes the kernel image protection code work correctly again. :^)
PT_POKE writes a single word to the tracee's address space.
Some caveats:
- If the user requests to write to an address in a read-only region, we
temporarily change the page's protections to allow it.
- If the user requests to write to a region that's backed by a
SharedInodeVMObject, we replace the vmobject with a PrivateInodeVMObject.
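A hedged sketch of a tracer using it (Serenity-style ptrace
prototype assumed):

#include <sys/ptrace.h>
#include <sys/types.h>

// Write a single word into the tracee's address space.
int poke_tracee(pid_t tracee, void* address, int value)
{
    return ptrace(PT_POKE, tracee, address, value);
}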
This patch adds the minherit() syscall originally invented by OpenBSD.
Only the MAP_INHERIT_ZERO mode is supported for now. If set on an mmap
region, that region will be zeroed out on fork().
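Presumably used like this (a sketch; OpenBSD-style prototype
assumed):

#include <cstddef>
#include <sys/mman.h>

// Mark a region so a forked child sees zeroes instead of a copy.
void* make_fork_zeroed_region(size_t size)
{
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_ANON | MAP_PRIVATE, -1, 0);
    if (p != MAP_FAILED)
        minherit(p, size, MAP_INHERIT_ZERO);
    return p;
}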
There was a frequently occurring pattern of "map this physical address
into kernel VM, then read from it, then unmap it again".
This new typed_map() encapsulates that logic by giving you back a
typed pointer to the kind of structure you're interested in accessing.
It returns a TypedMapping<T> that can be used mostly like a pointer.
When destroyed, the TypedMapping object will unmap the memory. :^)
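A hypothetical call site in the spirit of the description (the
struct and helper names are made up):

void read_firmware_table(PhysicalAddress paddr)
{
    // typed_map() hands back a TypedMapping<T> usable like a T*:
    auto table = typed_map<SomeFirmwareTable>(paddr);
    auto count = table->entry_count; // reads through the mapping
    // ... use count ...
} // the TypedMapping unmaps the memory here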
This was caught by running all crash tests with "crash -A".
Basically, non-readable pages need to not be mapped *at all* so that
a "page not present" exception is provoked on access.
Unfortunately x86 does not support write-only mappings, so this is
the best we can do.
Fixes #1336.
Also, duplicated data in dbg() and klog() calls was removed.
In addition, leaking virtual addresses into the kernel log is now
prevented, by turning the kprintf() calls that leaked them into
dbg() calls. Other kprintf() calls were replaced with klog().
This patch reduces the number of code paths that lead to the allocation
of a Region object. It's quite hard to follow the various ways in which
this can happen, so this is an effort to simplify.
It's now up to the caller to provide a VMObject when constructing a new
Region object. This will make it easier to handle things going wrong,
like allocation failures, etc.
When forking a process, we now turn all of the private inode-backed
mmap() regions into copy-on-write regions in both the parent and child.
This patch also removes an assertion that becomes irrelevant.
We don't have to log the process name/PID/TID, dbg() automatically adds
that as a prefix to every line.
Also we don't have to do .characters() on Strings passed to dbg() :^)
We now have PrivateInodeVMObject and SharedInodeVMObject, corresponding
to MAP_PRIVATE and MAP_SHARED respectively.
Note that PrivateInodeVMObject is not used yet.
The kernel sampling profiler will walk thread stacks during the timer
tick handler. Since it's not safe to trigger page faults during IRQ's,
we now avoid this by checking the page tables manually before accessing
each stack location.
We're not equipped to deal with page faults during an IRQ handler,
so add an assertion so we can immediately tell what's wrong.
This is why profiling sometimes hangs the system -- walking the stack
of the profiled thread causes a page fault and things fall apart.
Anonymous VM objects should never have null entries in their physical
page list. Instead, "empty" or untouched pages should refer to the
shared zero page.
Fixes #1237.
A 16-bit refcount is just begging for trouble right now.
A 32-bit refcount will be begging for trouble later down the line,
so we'll have to revisit this eventually. :^)
This patch adds a globally shared zero-filled PhysicalPage that will
be mapped into every slot of every zero-filled AnonymousVMObject until
that page is written to, achieving CoW-like zero-filled pages.
Initial testing shows that this doesn't actually achieve any sharing yet
but it seems like a good design regardless, since it may reduce the
number of page faults taken by programs.
If you look at the refcount of MM.shared_zero_page() it will have quite
a high refcount, but that's just because everything maps it everywhere.
If you want to see the "real" refcount, you can build with the
MAP_SHARED_ZERO_PAGE_LAZILY flag, and we'll defer mapping of the shared
zero page until the first NP read fault.
I've left this behavior behind a flag for future testing of this code.
This was only used by HashTable::dump() which I used when doing the
first HashTable implementation. Removing this allows us to also remove
most includes of <AK/kstdio.h>.
It doesn't look healthy to create raw references into an array before
a temporary unlock. In fact, that temporary unlock looks generally
unhealthy, but it's a different problem.
Previously we were only checking that each of the virtual pages in the
specified range were valid.
This made it possible to pass in negative buffer sizes to some syscalls
as long as (address) and (address+size) were on the same page.
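A concrete illustration of the hole, using 32-bit arithmetic:

#include <cstdint>

// With a "negative" size, address + size wraps back onto the very
// same (valid) page, so a page-validity-only check passes:
constexpr uint32_t address = 0x10000800u;
constexpr uint32_t size = (uint32_t)-2048; // 0xfffff800
constexpr uint32_t end = address + size;   // wraps to 0x10000000
static_assert((address & ~0xfffu) == (end & ~0xfffu),
              "both endpoints are on one page despite the bogus size");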
Previously it was not possible for this function to fail. You could
exploit this by triggering the creation of a VMObject whose physical
memory range would wrap around the 32-bit limit.
It was quite easy to map kernel memory into userspace and read/write
whatever you wanted in it.
Test: Kernel/bxvga-mmap-kernel-into-userspace.cpp
When using dbg() in the kernel, the output is automatically prefixed
with [Process(PID:TID)]. This makes it a lot easier to understand which
thread is generating the output.
This patch also cleans up some common logging messages and removes the
now-unnecessary "dbg() << *current << ..." pattern.
We don't need to have this method anymore. It was a hack that was used
in many components in the system but currently we use better methods to
create virtual memory mappings. To prevent any further use of this
method it's best to just remove it completely.
Also, the APIC code is disabled for now since it doesn't help booting
the system, and is broken since it relies on identity mapping to exist
in the first 1MB. Any call to the APIC code will result in an
assertion failure.
In addition to that, the name of the method which is responsible to
create an identity mapping between 1MB to 2MB was changed, to be more
precise about its purpose.
There is no real "read protection" on x86, so we have no choice but to
map write-only pages simply as "present & read/write".
If we get a read page fault in a non-readable region, that's still a
correctness issue, so we crash the process. It's by no means a complete
protection against invalid reads, since it's trivial to fool the kernel
by first causing a write fault in the same region.
uintptr_t is 32-bit or 64-bit depending on the target platform.
This will help us write pointer size agnostic code so that when the day
comes that we want to do a 64-bit port, we'll be in better shape.
Instead of restoring CR3 to the current process's paging scope when a
ProcessPagingScope goes out of scope, we now restore exactly whatever
the CR3 value was when we created the ProcessPagingScope.
This fixes breakage in situations where a process ends up with nested
ProcessPagingScopes. This was making profiling very fragile, and with
this change it's now possible to profile g++! :^)
Previously, when deallocating a range of VM, we would sort and merge
the range list. This was quite slow for large processes.
This patch optimizes VM deallocation in the following ways:
- Use binary search instead of linear scan to find the place to insert
the deallocated range.
- Insert at the right place immediately, removing the need to sort.
- Merge the inserted range with any adjacent range(s) in-line instead
of doing a separate merge pass into a list copy.
- Add Traits<Range> to inform Vector that Range objects are trivial
and can be moved using memmove().
I've also added an assertion that deallocated ranges are actually part
of the RangeAllocator's initial address range.
I've benchmarked this using g++ to compile Kernel/Process.cpp.
With these changes, compilation goes from ~41 sec to ~35 sec.
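A rough standalone sketch of the insert-and-merge idea (std::vector
here for brevity; the kernel uses AK's Vector and its own Range):

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Range {
    uintptr_t base { 0 };
    size_t size { 0 };
    uintptr_t end() const { return base + size; }
};

void deallocate(std::vector<Range>& free_list, Range r)
{
    // Binary search for the first free range at or after r.
    auto it = std::lower_bound(free_list.begin(), free_list.end(), r,
        [](const Range& a, const Range& b) { return a.base < b.base; });
    it = free_list.insert(it, r);
    // Merge with the following range if they touch.
    if (std::next(it) != free_list.end()
        && it->end() == std::next(it)->base) {
        it->size += std::next(it)->size;
        free_list.erase(std::next(it));
    }
    // Merge with the preceding range if they touch.
    if (it != free_list.begin() && std::prev(it)->end() == it->base) {
        std::prev(it)->size += it->size;
        free_list.erase(it);
    }
}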
This will panic the kernel immediately if these functions are misused
so we can catch it and fix the misuse.
This patch fixes a couple of misuses:
- create_signal_trampolines() writes to a user-accessible page
above the 3GB address mark. We should really get rid of this
page but that's a whole other thing.
- CoW faults need to use copy_from_user rather than copy_to_user
since it's the *source* pointer that points to user memory.
- Inode faults need to use memcpy rather than copy_to_user since
we're copying a kernel stack buffer into a quickmapped page.
This should make the copy_to/from_user() functions slightly less useful
for exploitation. Before this, they were essentially just glorified
memcpy() with SMAP disabled. :^)
Technically the bottom 2MB is still identity-mapped for the kernel and
not made available to userspace at all, but for simplicity's sake we
can just ignore that and make "address < 0xc0000000" the canonical
check for user/kernel.
It's now an error to sys$mmap() a file as writable if it's currently
mapped executable by anyone else.
It's also an error to sys$execve() a file that's currently mapped
writable by anyone else.
This fixes a race condition vulnerability where one program could make
modifications to an executable while another process was in the kernel,
in the middle of exec'ing the same executable.
Test: Kernel/elf-execve-mmap-race.cpp
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.
For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.
Going forward, all new source files should include a license header.
This is not ASLR, but it does de-trivialize exploiting the ELF loader
which would previously always parse executables at 0x01001000 in every
single exec(). I've taken advantage of this multiple times in my own
toy exploits and it's starting to feel cheesy. :^)
We now use the regular "user" physical pages for on-demand page table
allocations. This was by far the biggest source of super physical page
exhaustion, so that bug should be a thing of the past now. :^)
We still have super pages, but they are barely used. They remain useful
for code that requires memory with a low physical address.
Fixes #1000.
After MemoryManager initialization, we now only leave the lowest 1MB
of memory identity-mapped. The very first (null) page is not present.
All other pages are RW but not X. Supervisor only.
The kernel and its static data structures are no longer identity-mapped
in the bottom 8MB of the address space, but instead move above 3GB.
The first 8MB above 3GB are pseudo-identity-mapped to the bottom 8MB of
the physical address space. But things don't have to stay this way!
Thanks to Jesse who made an earlier attempt at this, it was really easy
to get device drivers working once the page tables were in place! :^)
Fixes #734.
We can now create a cacheable Region, so when map() is called, if a
Region is cacheable then all the virtual memory space being allocated
to it will be marked as not cache disabled.
In addition to that, OS components can create a Region that will be
mapped to a specific physical address by using the appropriate helper
method.
When loading a new executable, we now map the ELF image in kernel-only
memory and parse it there. Then we use copy_to_user() when initializing
writable regions with data from the executable.
Note that the exec() syscall still disables SMAP protection and will
require additional work. This patch only affects kernel-originated
process spawns.
Supervisor Mode Access Prevention (SMAP) is an x86 CPU feature that
prevents the kernel from accessing userspace memory. With SMAP enabled,
trying to read/write a userspace memory address while in the kernel
will now generate a page fault.
Since it's sometimes necessary to read/write userspace memory, there
are two new instructions that quickly switch the protection on/off:
STAC (disables protection) and CLAC (enables protection.)
These are exposed in kernel code via the stac() and clac() helpers.
There's also a SmapDisabler RAII object that can be used to ensure
that you don't forget to re-enable protection before returning to
userspace code.
This patch also adds copy_to_user(), copy_from_user() and memset_user()
which are the "correct" way of doing things. These functions allow us
to briefly disable protection for a specific purpose, and then turn it
back on immediately after it's done. Going forward all kernel code
should be moved to using these and all uses of SmapDisabler are to be
considered FIXME's.
Note that we're not realizing the full potential of this feature since
I've used SmapDisabler quite liberally in this initial bring-up patch.
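At call sites, the two styles look roughly like this (a sketch; the
exact signatures are assumed):

#include <cstddef>

void copy_result_to_userspace(void* user_dest, void const* src,
                              size_t size)
{
    // Preferred: the helper toggles protection only around the copy.
    copy_to_user(user_dest, src, size);

    // Legacy bring-up style, to be treated as a FIXME:
    {
        SmapDisabler disabler; // stac(): user memory is accessible
        // ... direct userspace access here ...
    } // ~SmapDisabler runs clac(): protection is back on
}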
Inode::size() may try to take a lock, so we can't be calling it with
interrupts disabled.
This fixes a kernel hang when trying to execute a binary in a TmpFS.