Commit Graph

580 Commits

Author SHA1 Message Date
Andreas Kling
b7b7a48c66 Kernel: Move process signal trampoline address into protected data 2021-03-11 14:21:49 +01:00
Andreas Kling
08e0e2eb41 Kernel: Move process umask into protected data :^) 2021-03-11 14:21:49 +01:00
Andreas Kling
90c0f9664e Kernel: Don't keep protected Process data in a separate allocation
The previous architecture had a huge flaw: the pointer to the protected
data was itself unprotected, allowing you to overwrite it at any time.

This patch reorganizes the protected data so it's part of the Process
class itself. (Actually, it's a new ProcessBase helper class.)

We use the first 4 KB of Process objects themselves as the new storage
location for protected data. Then we make Process objects page-aligned
using MAKE_ALIGNED_ALLOCATED.

This allows us to easily turn on/off write-protection for everything in
the ProcessBase portion of Process. :^)

Thanks to @bugaevc for pointing out the flaw! This is still not perfect
but it's an improvement.
2021-03-11 14:21:49 +01:00
Andreas Kling
de6c5128fd Kernel: Move process pledge promises into protected data 2021-03-10 22:50:00 +01:00
Andreas Kling
37ad880660 Kernel: Move process "dumpable" flag into protected data 2021-03-10 22:42:07 +01:00
Andreas Kling
3d27269f13 Kernel: Move process parent PID into protected data :^) 2021-03-10 22:30:02 +01:00
Andreas Kling
d677a73b0e Kernel: Move process extra_gids into protected data :^) 2021-03-10 22:30:02 +01:00
Andreas Kling
cbcf891040 Kernel: Move select Process members into protected memory
Process member variable like m_euid are very valuable targets for
kernel exploits and until now they have been writable at all times.

This patch moves m_euid along with a whole bunch of other members
into a new Process::ProtectedData struct. This struct is remapped
as read-only memory whenever we don't need to write to it.

This means that a kernel write primitive is no longer enough to
overwrite a process's effective UID, you must first unprotect the
protected data where the UID is stored. :^)
2021-03-10 22:30:02 +01:00
Andreas Kling
84725ef3a5 Kernel+UserspaceEmulator: Add sys$emuctl() system call
This returns ENOSYS if you are running in the real kernel, and some
other result if you are running in UserspaceEmulator.

There are other ways we could check if we're inside an emulator, but
it seemed easier to just ask. :^)
2021-03-09 08:58:26 +01:00
Andreas Kling
b425c2602c Kernel: Better handling of allocation failure in profiling
If we can't allocate a PerformanceEventBuffer to store the profiling
events, we now fail sys$profiling_enable() and sys$perf_event()
with ENOMEM instead of carrying on with a broken buffer.
2021-03-02 22:38:06 +01:00
Ben Wiederhake
336303bda4 Kernel: Make kgettimeofday use AK::Time 2021-03-02 08:36:08 +01:00
Ben Wiederhake
05d5e3fad9 Kernel: Remove duplicative kgettimeofday(timeval&) function 2021-03-02 08:36:08 +01:00
Ben Wiederhake
c040e64b7d Kernel: Make TimeManagement use AK::Time internally
I don't dare touch the multi-threading logic and locking mechanism, so it stays
timespec for now. However, this could and should be changed to AK::Time, and I
bet it will simplify the "increment_time_since_boot()" code.
2021-03-02 08:36:08 +01:00
Andreas Kling
272c2e6ec5 Kernel: Use Userspace<T> in sys${munmap,mprotect,madvise,msyscall}() 2021-03-01 15:53:33 +01:00
Andreas Kling
bebceaa32c Kernel: Use Userspace<T> in sys$select() 2021-03-01 15:07:01 +01:00
Andreas Kling
a1a82c1d95 Kernel: Use Userspace<T> in sys$get_dir_entries() 2021-03-01 15:04:31 +01:00
Andreas Kling
b5f32be577 Kernel: Use Userspace<T> in sys$get_stack_bounds() 2021-03-01 14:50:36 +01:00
Andreas Kling
122c7b6cbb Kernel: Use Userspace<T> in sys$write() 2021-03-01 14:35:06 +01:00
Andreas Kling
6a6eb8844a Kernel: Use Userspace<T> in sys$sigaction()
fuzz-syscalls found a bunch of unaligned accesses into struct sigaction
via this syscall. This patch fixes that issue by porting the syscall
to Userspace<T> which we should have done anyway. :^)

Fixes #5500.
2021-03-01 14:06:20 +01:00
Andreas Kling
ac71775de5 Kernel: Make all syscall functions return KResultOr<T>
This makes it a lot easier to return errors since we no longer have to
worry about negating EFOO errors and can just return them flat.
2021-03-01 13:54:32 +01:00
Linus Groh
e265054c12 Everywhere: Remove a bunch of redundant 'AK::' namespace prefixes
This is basically just for consistency, it's quite strange to see
multiple AK container types next to each other, some with and some
without the namespace prefix - we're 'using AK::Foo;' a lot and should
leverage that. :^)
2021-02-26 16:59:56 +01:00
Andreas Kling
8eeb8db2ed Kernel: Don't disable interrupts while dealing with a process crash
This was necessary in the past when crash handling would modify
various global things, but all that stuff is long gone so we can
simplify crashes by leaving the interrupt flag alone.
2021-02-25 19:36:36 +01:00
Andreas Kling
8f70528f30 Kernel: Take some baby steps towards x86_64
Make more of the kernel compile in 64-bit mode, and make some things
pointer-size-agnostic (by using FlatPtr.)

There's a lot of work to do here before the kernel will even compile.
2021-02-25 16:27:12 +01:00
Andreas Kling
5d180d1f99 Everywhere: Rename ASSERT => VERIFY
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)

Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.

We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
2021-02-23 20:56:54 +01:00
Andreas Kling
84b2d4c475 Kernel: Add "map_fixed" pledge promise
This is a new promise that guards access to mmap() with MAP_FIXED.

Fixed-address mappings are rarely used, but can be useful if you are
trying to groom the process address space for malicious purposes.

None of our programs need this at the moment, as the only user of
MAP_FIXED is DynamicLoader, but the fixed mappings are constructed
before the process has had a chance to pledge anything.
2021-02-21 01:08:48 +01:00
Andreas Kling
55a9a4f57a Kernel: Use KResult a bit more in sys$execve() 2021-02-18 09:37:33 +01:00
AnotherTest
a3a7ab83c4 Kernel+LibC: Implement readv
We already had writev, so let's just add readv too.
2021-02-15 17:32:56 +01:00
Andreas Kling
781d29a337 Kernel+Userland: Give sys$recvfd() an options argument for O_CLOEXEC
@bugaevc pointed out that we shouldn't be setting this flag in
userspace, and he's right of course.
2021-02-14 10:39:48 +01:00
Andreas Kling
1593219a41 Kernel: Map signal trampoline into each process's address space
The signal trampoline was previously in kernelspace memory, but with
a special exception to make it user-accessible.

This patch moves it into each process's regular address space so we
can stop supporting user-allowed memory above 0xc0000000.
2021-02-14 01:33:17 +01:00
Andreas Kling
1ef43ec89a Kernel: Move get_interpreter_load_offset() out of Process class
This is only used inside the sys$execve() implementation so just make
it a execve.cpp local function.
2021-02-12 16:30:29 +01:00
Andreas Kling
4ff0f971f7 Kernel: Prevent execve/ptrace race
Add a per-process ptrace lock and use it to prevent ptrace access to a
process after it decides to commit to a new executable in sys$execve().

Fixes #5230.
2021-02-08 23:05:41 +01:00
Andreas Kling
0d7af498d7 Kernel: Move ShouldAllocateTls enum from Process to execve.cpp 2021-02-08 22:24:37 +01:00
Andreas Kling
8bda30edd2 Kernel: Move memory statistics helpers from Process to Space 2021-02-08 22:23:29 +01:00
Andreas Kling
f1b5def8fd Kernel: Factor address space management out of the Process class
This patch adds Space, a class representing a process's address space.

- Each Process has a Space.
- The Space owns the PageDirectory and all Regions in the Process.

This allows us to reorganize sys$execve() so that it constructs and
populates a new Space fully before committing to it.

Previously, we would construct the new address space while still
running in the old one, and encountering an error meant we had to do
tedious and error-prone rollback.

Those problems are now gone, replaced by what's hopefully a set of much
smaller problems and missing cleanups. :^)
2021-02-08 18:27:28 +01:00
Andreas Kling
cf5ab665e0 Kernel: Remove unused Process::for_each_thread_in_coredump() 2021-02-08 18:27:28 +01:00
Ben Wiederhake
0a2304ba05 Everywhere: Fix weird includes 2021-02-08 18:03:57 +01:00
Andreas Kling
5c1c82cd33 Kernel: Remove unused function Process::backtrace() 2021-02-07 19:27:00 +01:00
Andreas Kling
b1813e5dae Kernel: Remove some unused declarations from Process 2021-02-07 19:27:00 +01:00
Andreas Kling
823186031d Kernel: Add a way to specify which memory regions can make syscalls
This patch adds sys$msyscall() which is loosely based on an OpenBSD
mechanism for preventing syscalls from non-blessed memory regions.

It works similarly to pledge and unveil, you can call it as many
times as you like, and when you're finished, you call it with a null
pointer and it will stop accepting new regions from then on.

If a syscall later happens and doesn't originate from one of the
previously blessed regions, the kernel will simply crash the process.
2021-02-02 20:13:44 +01:00
Ben Wiederhake
cbee0c26e1 Kernel+keymap+KeyboardMapper: New pledge for getkeymap 2021-02-01 09:54:32 +01:00
Ben Wiederhake
a2c21a55e1 Kernel+LibKeyboard: Enable querying the current keymap 2021-02-01 09:54:32 +01:00
Andreas Kling
d0c5979d96 Kernel: Add "prot_exec" pledge promise and require it for PROT_EXEC
This prevents sys$mmap() and sys$mprotect() from creating executable
memory mappings in pledged programs that don't have this promise.

Note that the dynamic loader runs before pledging happens, so it's
unaffected by this.
2021-01-29 18:56:34 +01:00
Andreas Kling
5ff355c0cd Kernel: Generate coredump backtraces from "threads for coredump" list
This broke with the change that gave each process a list of its own
threads. Since threads are removed slightly earlier from that list
during process teardown, we're not able to use it for generating
coredump backtraces. Fortunately we have the "threads for coredump"
list for just this purpose. :^)
2021-01-28 08:41:18 +01:00
Tom
8104abf640 Kernel: Remove colonel special-case from Process::for_each_thread
Since each Process now has its own list of threads, we don't need
to treat colonel any different anymore. This also means that it
reports all kernel threads, not just the idle threads.
2021-01-28 08:14:12 +01:00
Tom
ac3927086f Kernel: Keep a list of threads per Process
This allow us to iterate only the threads of the process.
2021-01-27 22:48:41 +01:00
Tom
03a9ee79fa Kernel: Implement thread priority queues
Rather than walking all Thread instances and putting them into
a vector to be sorted by priority, queue them into priority sorted
linked lists as soon as they become ready to be executed.
2021-01-27 22:48:41 +01:00
Andreas Kling
e67402c702 Kernel: Remove Range "valid" state and use Optional<Range> instead
It's easier to understand VM ranges if they are always valid. We can
simply use an empty Optional<Range> to encode absence when needed.
2021-01-27 21:14:42 +01:00
Tom
21d288a10e Kernel: Make Thread::current smp-safe
Change Thread::current to be a static function and read using the fs
register, which eliminates a window between Processor::current()
returning and calling a function on it, which can trigger preemption
and a move to a different processor, which then causes operating
on the wrong object.
2021-01-27 21:12:24 +01:00
Andreas Kling
c7858622ec Kernel: Update process promise states on execve() and fork()
We now move the execpromises state into the regular promises, and clear
the execpromises state.

Also make sure to duplicate the promise state on fork.

This fixes an issue where "su" would launch a shell which immediately
crashed due to not having pledged "stdio".
2021-01-26 15:26:37 +01:00
Andreas Kling
1e25d2b734 Kernel: Remove allocate_region() functions that don't take a Range
Let's force callers to provide a VM range when allocating a region.
This makes ENOMEM error handling more visible and removes implicit
VM allocation which felt a bit magical.
2021-01-26 14:13:57 +01:00