Commit Graph

428 Commits

Author SHA1 Message Date
Andreas Kling
fdf03852c9 Kernel: Slap UNMAP_AFTER_INIT on a whole bunch of functions
There's no real system here, I just added it to various functions
that I don't believe we ever want to call after initialization
has finished.

With these changes, we're able to unmap 60 KiB of kernel text
after init. :^)
2021-02-19 20:23:05 +01:00
Andreas Kling
6136faa4eb Kernel: Add .unmap_after_init section for code we don't need after init
You can now declare functions with UNMAP_AFTER_INIT and they'll get
segregated into a separate kernel section that gets completely
unmapped at the end of initialization.

This can be used for anything we don't need to call once we've booted
into userspace.

There are two nice things about this mechanism:

- It allows us to free up entire pages of memory for other use.
  (Note that this patch does not actually make use of the freed
  pages yet, but in the future we totally could!)

- It allows us to get rid of obviously dangerous gadgets like
  write-to-CR0 and write-to-CR4 which are very useful for an attacker
  trying to disable SMAP/SMEP/etc.

I've also made sure to include a helpful panic message in case you
hit a kernel crash because of this protection. :^)
2021-02-19 20:23:05 +01:00
Brian Gianforcaro
7482cb6531 Kernel: Avoid some un-necessary copies coming from range based for loops
- The irq_controller was getting add_ref/released needlessly during enumeration.

- Used ranges were also getting needlessly copied.
2021-02-15 15:25:23 +01:00
Andreas Kling
d8013c60bb Kernel: Add mechanism to make some memory read-only after init finishes
You can now use the READONLY_AFTER_INIT macro when declaring a variable
and we will put it in a special ".ro_after_init" section in the kernel.

Data in that section remains writable during the boot and init process,
and is then marked read-only just before launching the SystemServer.

This is based on an idea from the Linux kernel. :^)
2021-02-14 18:11:32 +01:00
Andreas Kling
09b1b09c19 Kernel: Assert if rounding-up-to-page-size would wrap around to 0
If we try to align a number above 0xfffff000 to the next multiple of
the page size (4 KiB), it would wrap around to 0. This is most likely
never what we want, so let's assert if that happens.
2021-02-14 10:01:50 +01:00
Andreas Kling
198d641808 Kernel: Panic on attempt to map mmap'ed page at a kernel address
If we somehow get tricked into mapping user-controlled mmap memory
at a kernel address, let's just panic the kernel.
2021-02-14 09:36:58 +01:00
Andreas Kling
4021264201 Kernel: Make the Region constructor private
We can use adopt_own(*new T) instead of make<T>().
2021-02-14 01:39:04 +01:00
Andreas Kling
8415866c03 Kernel: Remove user/kernel flags from Region
Now that we no longer need to support the signal trampolines being
user-accessible inside the kernel memory range, we can get rid of the
"kernel" and "user-accessible" flags on Region and simply use the
address of the region to determine whether it's kernel or user.

This also tightens the page table mapping code, since it can now set
user-accessibility based solely on the virtual address of a page.
2021-02-14 01:34:23 +01:00
Andreas Kling
a5def4e98c Kernel: Sanity check the VM range when constructing a Region
This should help us catch bogus VM ranges ending up in a process's
address space sooner.
2021-02-13 01:18:03 +01:00
Andreas Kling
7551090056 Kernel: Round up ranges to page size multiples in munmap and mprotect
This prevents passing bad inputs to RangeAllocator who then asserts.

Found by fuzz-syscalls. :^)
2021-02-13 01:18:03 +01:00
Andreas Kling
e050577f0a Kernel: Make MAP_RANDOMIZED honor alignment requests
Previously, we only cared about the alignment on the fallback path.
2021-02-12 19:15:59 +01:00
Andreas Kling
4e2802bf91 Kernel: Move region dumps from dmesg to debug log
Also fix a broken format string caught by the new format string checks.
2021-02-12 16:33:58 +01:00
Andreas Kling
c4db224c94 Kernel: Convert klog() => dmesgln() / dbgln() in MemoryManager 2021-02-12 16:24:40 +01:00
Andreas Kling
5af69d6e93 Kernel: Convert klog() to dmesgln() in RangeAllocator 2021-02-12 16:24:40 +01:00
Andreas Kling
54986228bf Kernel: Oops, add missing #include to fix ENABLE_ALL_THE_DEBUG_MACROS 2021-02-11 22:15:55 +01:00
Andreas Kling
0dbb22e9e0 Kernel: Remove a handful of unused things in VM/ directory
Also add some missing initializers.
2021-02-11 22:02:39 +01:00
Andreas Kling
4cd2c475a8 Kernel: Make the space lock a RecursiveSpinLock 2021-02-08 22:28:48 +01:00
Andreas Kling
9ca42c4c0e Kernel: Always hold space lock while calculating memory statistics
And put the locker at the top of the functions for clarity.
2021-02-08 22:23:29 +01:00
Andreas Kling
8bda30edd2 Kernel: Move memory statistics helpers from Process to Space 2021-02-08 22:23:29 +01:00
Andreas Kling
f1b5def8fd Kernel: Factor address space management out of the Process class
This patch adds Space, a class representing a process's address space.

- Each Process has a Space.
- The Space owns the PageDirectory and all Regions in the Process.

This allows us to reorganize sys$execve() so that it constructs and
populates a new Space fully before committing to it.

Previously, we would construct the new address space while still
running in the old one, and encountering an error meant we had to do
tedious and error-prone rollback.

Those problems are now gone, replaced by what's hopefully a set of much
smaller problems and missing cleanups. :^)
2021-02-08 18:27:28 +01:00
Andreas Kling
b2cba3036e Kernel: Remove unused MemoryManager::validate_range()
This is no longer used since we've switched to using the MMU to
generate EFAULT errors.
2021-02-08 18:27:28 +01:00
AnotherTest
09a43969ba Everywhere: Replace dbgln<flag>(...) with dbgln_if(flag, ...)
Replacement made by `find Kernel Userland -name '*.h' -o -name '*.cpp' | sed -i -Ee 's/dbgln\b<(\w+)>\(/dbgln_if(\1, /g'`
2021-02-08 18:08:55 +01:00
Andreas Kling
9c77980965 Everywhere: Remove some bitrotted "#if 0" blocks 2021-02-03 11:17:47 +01:00
Andreas Kling
823186031d Kernel: Add a way to specify which memory regions can make syscalls
This patch adds sys$msyscall() which is loosely based on an OpenBSD
mechanism for preventing syscalls from non-blessed memory regions.

It works similarly to pledge and unveil, you can call it as many
times as you like, and when you're finished, you call it with a null
pointer and it will stop accepting new regions from then on.

If a syscall later happens and doesn't originate from one of the
previously blessed regions, the kernel will simply crash the process.
2021-02-02 20:13:44 +01:00
Liav A
5ab1864497 Kernel: Introduce the MemoryDevice
This is a character device that is being used by the dmidecode utility.
We only allow to map the BIOS ROM area to userspace with this device.
2021-02-01 17:13:23 +01:00
Andreas Kling
1320b9351e Revert "Kernel: Don't clone kernel mappings for bottom 2 MiB VM into processes"
This reverts commit da7b21dc06.

This broke SMP boot, oops! :^)
2021-01-31 19:00:53 +01:00
Andreas Kling
da7b21dc06 Kernel: Don't clone kernel mappings for bottom 2 MiB VM into processes
I can't think of anything that needs these mappings anymore, so let's
get rid of them.
2021-01-31 15:20:18 +01:00
Andreas Kling
e55ef70e5e Kernel: Remove "has made executable exception for dynamic loader" flag
As Idan pointed out, this flag is actually not needed, since we don't
allow transitioning from previously-executable to writable anyway.
2021-01-30 10:06:52 +01:00
Jorropo
df30b3e54c
Kernel: RangeAllocator randomized correctly check if size is in bound. (#5164)
The random address proposals were not checked with the size so it was
increasely likely to try to allocate outside of available space with
larger and larger sizes.

Now they will be ignored instead of triggering a Kernel assertion
failure.

This is a continuation of: c8e7baf4b8
2021-01-29 17:18:23 +01:00
Andreas Kling
af3d3c5c4a Kernel: Enforce W^X more strictly (like PaX MPROTECT)
This patch adds enforcement of two new rules:

- Memory that was previously writable cannot become executable
- Memory that was previously executable cannot become writable

Unfortunately we have to make an exception for text relocations in the
dynamic loader. Since those necessitate writing into a private copy
of library code, we allow programs to transition from RW to RX under
very specific conditions. See the implementation of sys$mprotect()'s
should_make_executable_exception_for_dynamic_loader() for details.
2021-01-29 14:52:27 +01:00
Andreas Kling
c8e7baf4b8 Kernel: Check for alignment size overflow when allocating VM ranges
Also add some sanity check assertions that we're generating and
returning ranges contained within the RangeAllocator's total range.

Fixes #5162.
2021-01-29 12:11:42 +01:00
Tom
affb4ef01b Kernel: Allow specifying a physical alignment when allocating
Some drivers may require allocating contiguous physical pages with
a specific alignment for the physical address.
2021-01-28 18:52:59 +01:00
Andreas Kling
80837d43a2 Kernel: Remove outdated debug logging from RangeAllocator
If someone wants to debug this code, it's better that they rewrite the
logging code to take randomization and guard pages into account.
2021-01-28 16:23:38 +01:00
Andreas Kling
b6937e2560 Kernel+LibC: Add MAP_RANDOMIZED flag for sys$mmap()
This can be used to request random VM placement instead of the highly
predictable regular mmap(nullptr, ...) VM allocation strategy.

It will soon be used to implement ASLR in the dynamic loader. :^)
2021-01-28 16:23:38 +01:00
Andreas Kling
d3de138d64 Kernel: Add sanity check assertion in RangeAllocator::allocate_specific
The specific virtual address should always be page aligned.
2021-01-28 16:23:38 +01:00
Andreas Kling
27d07796b4 Kernel: Add sanity check assertion in RangeAllocator::allocate_anywhere
The requested alignment should always be a multiple of the page size.
2021-01-28 16:23:38 +01:00
Tom
250a310454 Kernel: Release MM lock while yielding from inode page fault handler
We need to make sure other processors can grab the MM lock while we
wait, so release it when we might block. Reading the page from
disk may also block, so release it during that time as well.
2021-01-27 22:48:41 +01:00
Andreas Kling
e67402c702 Kernel: Remove Range "valid" state and use Optional<Range> instead
It's easier to understand VM ranges if they are always valid. We can
simply use an empty Optional<Range> to encode absence when needed.
2021-01-27 21:14:42 +01:00
Tom
e2f9e557d3 Kernel: Make Processor::id a static function
This eliminates the window between calling Processor::current and
the member function where a thread could be moved to another
processor. This is generally not as big of a concern as with
Processor::current_thread, but also slightly more light weight.
2021-01-27 21:12:24 +01:00
Andreas Kling
76a69be217 Kernel: Assert in RangeAllocator that sizes are multiple of PAGE_SIZE 2021-01-27 19:45:53 +01:00
asynts
7cf0c7cc0d Meta: Split debug defines into multiple headers.
The following script was used to make these changes:

    #!/bin/bash
    set -e

    tmp=$(mktemp -d)

    echo "tmp=$tmp"

    find Kernel \( -name '*.cpp' -o -name '*.h' \) | sort > $tmp/Kernel.files
    find . \( -path ./Toolchain -prune -o -path ./Build -prune -o -path ./Kernel -prune \) -o \( -name '*.cpp' -o -name '*.h' \) -print | sort > $tmp/EverythingExceptKernel.files

    cat $tmp/Kernel.files | xargs grep -Eho '[A-Z0-9_]+_DEBUG' | sort | uniq > $tmp/Kernel.macros
    cat $tmp/EverythingExceptKernel.files | xargs grep -Eho '[A-Z0-9_]+_DEBUG' | sort | uniq > $tmp/EverythingExceptKernel.macros

    comm -23 $tmp/Kernel.macros $tmp/EverythingExceptKernel.macros > $tmp/Kernel.unique
    comm -1 $tmp/Kernel.macros $tmp/EverythingExceptKernel.macros > $tmp/EverythingExceptKernel.unique

    cat $tmp/Kernel.unique | awk '{ print "#cmakedefine01 "$1 }' > $tmp/Kernel.header
    cat $tmp/EverythingExceptKernel.unique | awk '{ print "#cmakedefine01 "$1 }' > $tmp/EverythingExceptKernel.header

    for macro in $(cat $tmp/Kernel.unique)
    do
        cat $tmp/Kernel.files | xargs grep -l $macro >> $tmp/Kernel.new-includes ||:
    done
    cat $tmp/Kernel.new-includes | sort > $tmp/Kernel.new-includes.sorted

    for macro in $(cat $tmp/EverythingExceptKernel.unique)
    do
        cat $tmp/Kernel.files | xargs grep -l $macro >> $tmp/Kernel.old-includes ||:
    done
    cat $tmp/Kernel.old-includes | sort > $tmp/Kernel.old-includes.sorted

    comm -23 $tmp/Kernel.new-includes.sorted $tmp/Kernel.old-includes.sorted > $tmp/Kernel.includes.new
    comm -13 $tmp/Kernel.new-includes.sorted $tmp/Kernel.old-includes.sorted > $tmp/Kernel.includes.old
    comm -12 $tmp/Kernel.new-includes.sorted $tmp/Kernel.old-includes.sorted > $tmp/Kernel.includes.mixed

    for file in $(cat $tmp/Kernel.includes.new)
    do
        sed -i -E 's/#include <AK\/Debug\.h>/#include <Kernel\/Debug\.h>/' $file
    done

    for file in $(cat $tmp/Kernel.includes.mixed)
    do
        echo "mixed include in $file, requires manual editing."
    done
2021-01-26 21:20:00 +01:00
Andreas Kling
3ff88a1d77 Kernel: Assert on attempt to map private region backed by shared inode
If we find ourselves with a user-accessible, non-shared Region backed by
a SharedInodeVMObject, that's pretty bad news, so let's just panic the
kernel instead of getting abused.

There might be a better place for this kind of check, so I've added a
FIXME about putting more thought into that.
2021-01-26 18:35:10 +01:00
Andreas Kling
a131927c75 Kernel: sys$munmap() region splitting did not preserve "shared" flag
This was exploitable since the shared flag determines whether inode
permission checks are applied in sys$mprotect().

The bug was pretty hard to spot due to default arguments being used
instead. This patch removes the default arguments to make explicit
at each call site what's being done.
2021-01-26 18:35:04 +01:00
asynts
eea72b9b5c Everywhere: Hook up remaining debug macros to Debug.h. 2021-01-25 09:47:36 +01:00
asynts
8465683dcf Everywhere: Debug macros instead of constexpr.
This was done with the following script:

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/dbgln<debug_([a-z_]+)>/dbgln<\U\1_DEBUG>/' {} \;

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/if constexpr \(debug_([a-z0-9_]+)/if constexpr \(\U\1_DEBUG/' {} \;
2021-01-25 09:47:36 +01:00
asynts
acdcf59a33 Everywhere: Remove unnecessary debug comments.
It would be tempting to uncomment these statements, but that won't work
with the new changes.

This was done with the following commands:

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec awk -i inplace '$0 !~ /\/\/#define/ { if (!toggle) { print; } else { toggle = !toggle } } ; $0 ~/\/\/#define/ { toggle = 1 }' {} \;

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec awk -i inplace '$0 !~ /\/\/ #define/ { if (!toggle) { print; } else { toggle = !toggle } } ; $0 ~/\/\/ #define/ { toggle = 1 }' {} \;
2021-01-25 09:47:36 +01:00
asynts
1a3a0836c0 Everywhere: Use CMake to generate AK/Debug.h.
This was done with the help of several scripts, I dump them here to
easily find them later:

    awk '/#ifdef/ { print "#cmakedefine01 "$2 }' AK/Debug.h.in

    for debug_macro in $(awk '/#ifdef/ { print $2 }' AK/Debug.h.in)
    do
        find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/#ifdef '$debug_macro'/#if '$debug_macro'/' {} \;
    done

    # Remember to remove WRAPPER_GERNERATOR_DEBUG from the list.
    awk '/#cmake/ { print "set("$2" ON)" }' AK/Debug.h.in
2021-01-25 09:47:36 +01:00
Jean-Baptiste Boric
ec056f3bd1 Kernel: Parse boot modules from Multiboot specification 2021-01-22 22:17:39 +01:00
Jean-Baptiste Boric
3cbe805486 Kernel: Move kmalloc heaps and super pages inside .bss segment
The kernel ignored the first 8 MiB of RAM while parsing the memory map
because the kmalloc heaps and the super physical pages lived here. Move
all that stuff inside the .bss segment so that those memory regions are
accounted for, otherwise we risk overwriting boot modules placed next
to the kernel.
2021-01-22 22:17:39 +01:00
Jean-Baptiste Boric
5cd1217b6e Kernel: Remove trace log in MemoryManager::deallocate_user_physical_page() 2021-01-22 22:17:39 +01:00