Commit Graph

609 Commits

Author SHA1 Message Date
Andreas Kling
de69f84868 Kernel: Remove SmapDisablers in fchmod() and fchown() 2020-01-10 14:20:14 +01:00
Andreas Kling
952bb95baa Kernel: Enable SMAP protection during the execve() syscall
The userspace execve() wrapper now measures all the strings and puts
them in a neat and tidy structure on the stack.

This way we know exactly how much to copy in the kernel, and we don't
have to use the SMAP-violating validate_read_str(). :^)
2020-01-10 12:20:36 +01:00
Andreas Kling
197e73ee31 Kernel+LibELF: Enable SMAP protection during non-syscall exec()
When loading a new executable, we now map the ELF image in kernel-only
memory and parse it there. Then we use copy_to_user() when initializing
writable regions with data from the executable.

Note that the exec() syscall still disables SMAP protection and will
require additional work. This patch only affects kernel-originated
process spawns.
2020-01-10 10:57:06 +01:00
Andreas Kling
ff16298b44 Kernel: Removed an unused global variable 2020-01-09 18:02:37 +01:00
Andreas Kling
17ef5bc0ac Kernel: Rename {ss,esp}_if_crossRing to userspace_{ss,esp}
These were always so awkwardly named.
2020-01-09 18:02:01 +01:00
Andreas Kling
4b4d369c5d Kernel: Take path+length in the unlink() and umount() syscalls 2020-01-09 16:23:41 +01:00
Andrew Kaster
e594724b01 Kernel: mmap(..., MAP_PRIVATE, fd, offset) is not supported
Make mmap return -ENOTSUP in this case to make sure users don't get
confused and think they're using a private mapping when it's actually
shared. It's currently not possible to open a file and mmap it
MAP_PRIVATE, and change the perms of the private mapping to ones that
don't match the permissions of the underlying file.
2020-01-09 09:29:36 +01:00
Andreas Kling
e1d4b19461 Kernel: open() and openat() should ignore non-permission bits in mode 2020-01-08 15:21:06 +01:00
Andreas Kling
532f240f24 Kernel: Remove unused syscall for setting the signal mask 2020-01-08 15:21:06 +01:00
Andreas Kling
200459d644 Kernel: Fix SMAP violation in join_thread() 2020-01-08 15:21:05 +01:00
Andreas Kling
50056d1d84 Kernel: mmap() should fail with ENODEV for directories 2020-01-08 12:47:37 +01:00
Andreas Kling
fe9680f0a4 Kernel: Validate PROT_READ and PROT_WRITE against underlying file
This patch fixes some issues with the mmap() and mprotect() syscalls,
neither of which was checking the permission bits of the underlying
files when mapping an inode MAP_SHARED.

This made it possible to subvert execution of any running program
by simply memory-mapping its executable and replacing some of the code.

Test: Kernel/mmap-write-into-running-programs-executable-file.cpp
2020-01-07 19:32:32 +01:00
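A minimal sketch of the kind of check fe9680f0a4 describes: a shared mapping may only request protections that the open file allows. The names `Protection`, `OpenFileSketch` and `validate_shared_mapping()` are hypothetical and not taken from the kernel. POSIX specifies EACCES for this case; whether the kernel returns exactly that value is an assumption.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical protection flags for this sketch only (not the real PROT_* values).
enum Protection : unsigned {
    ProtRead = 1u << 0,
    ProtWrite = 1u << 1,
};

// Hypothetical stand-in for an open file: records how it was opened.
struct OpenFileSketch {
    bool readable;
    bool writable;
};

// Returns 0 if a MAP_SHARED mapping with `prot` is allowed, or -EACCES otherwise.
int validate_shared_mapping(const OpenFileSketch& file, unsigned prot)
{
    // Reading through the mapping requires a file opened for reading.
    if ((prot & ProtRead) && !file.readable)
        return -EACCES;
    // Writing through a shared mapping modifies the file, so it must be writable.
    if ((prot & ProtWrite) && !file.writable)
        return -EACCES;
    return 0;
}
```

The same check would apply when mprotect() later tries to add PROT_WRITE to an existing shared mapping, since the commit says both syscalls were affected.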
Andreas Kling
5387a19268 Kernel: Make Process::file_description() vend a RefPtr<FileDescription>
This encourages callers to strongly reference file descriptions while
working with them.

This fixes a use-after-free issue where one thread would close() an
open fd while another thread was blocked on it becoming readable.

Test: Kernel/uaf-close-while-blocked-in-read.cpp
2020-01-07 15:53:42 +01:00
Andreas Kling
6a4b376021 Kernel: Validate ftruncate(fd, length) syscall inputs
- EINVAL if 'length' is negative
- EBADF if 'fd' is not open for writing
2020-01-07 14:48:43 +01:00
Andreas Kling
78a63930cc Kernel+LibELF: Validate PT_LOAD and PT_TLS offsets before memcpy()'ing
Before this, you could make the kernel copy memory from anywhere by
setting up an ELF executable with a program header specifying file
offsets outside the file.

Since ELFImage didn't even know how large it was, we had no clue that
we were copying things from outside the ELF.

Fix this by adding a size field to ELFImage and validating program
header ranges before memcpy()'ing to them.

The ELF code is definitely going to need more validation and checking.
2020-01-06 21:04:57 +01:00
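A minimal sketch of the range validation 78a63930cc describes, assuming a simplified program header that carries only a file offset and a size. `ProgramHeaderRange` and `range_is_inside_image()` are hypothetical names, not the actual LibELF API. The check is written so that an attacker-controlled offset or size cannot cause integer overflow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified view of a PT_LOAD or PT_TLS program header.
struct ProgramHeaderRange {
    uint64_t offset;        // Where the segment starts in the file.
    uint64_t size_in_image; // How many bytes the segment occupies in the file.
};

// Returns true only if [offset, offset + size_in_image) lies inside the image.
bool range_is_inside_image(const ProgramHeaderRange& header, uint64_t image_size)
{
    // Reject offsets past the end first, so the subtraction below cannot wrap.
    if (header.offset > image_size)
        return false;
    // Compare against the remaining bytes instead of computing offset + size,
    // which could overflow for hostile values.
    return header.size_in_image <= image_size - header.offset;
}
```

A loader would run this check for every program header before copying from the image and refuse to load the executable if any range fails.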
Andreas Kling
8088fa0556 Kernel: Process::send_signal() should prefer main thread
The main/first thread in a process always has the same TID as the PID.
2020-01-06 14:37:26 +01:00
Andreas Kling
a803312eb4 Kernel: Send SIGCHLD to the thread with same PID as my PPID
Instead of delivering SIGCHLD to "any thread" in the process with PPID,
send it to the thread with the same TID as my PPID.
2020-01-06 14:35:42 +01:00
Andreas Kling
cd42ccd686 Kernel: The waitpid() syscall was not storing to "wstatus" in all cases 2020-01-06 14:34:04 +01:00
Andreas Kling
47cc3e68c6 Kernel: Remove bogus kernel image access validation checks
This code had been misinterpreting the Multiboot ELF section headers
since the beginning. Furthermore QEMU wasn't even passing us any
headers at all, so this wasn't checking anything.
2020-01-06 13:27:14 +01:00
Andreas Kling
53bda09d15 Kernel: Make utime() take path+length, remove SmapDisabler 2020-01-06 12:23:30 +01:00
Andreas Kling
1226fec19e Kernel: Remove SmapDisablers in stat() and lstat() 2020-01-06 12:13:48 +01:00
Andreas Kling
a47f0c93de Kernel: Pass name+length to mmap() and remove SmapDisabler 2020-01-06 12:04:55 +01:00
Andreas Kling
33025a8049 Kernel: Pass name+length to set_mmap_name() and remove SmapDisabler 2020-01-06 11:56:59 +01:00
Andreas Kling
6af8392cf8 Kernel: Remove SmapDisabler in futex() 2020-01-06 11:44:15 +01:00
Andreas Kling
a30fb5c5c1 Kernel: SMAP fixes for module_load() and module_unload()
Remove SmapDisabler in module_load() + use get_syscall_path_argument().
Also fix a SMAP violation in module_unload().
2020-01-06 11:36:16 +01:00
Andreas Kling
7c916b9fe9 Kernel: Make realpath() take path+length, get rid of SmapDisabler 2020-01-06 11:32:25 +01:00
Andreas Kling
d6b06fd5a3 Kernel: Make watch_file() syscall take path length as a size_t
We don't care to handle negative path lengths anyway.
2020-01-06 11:15:49 +01:00
Andreas Kling
cf7df95ffe Kernel: Use get_syscall_path_argument() for syscalls that take paths 2020-01-06 11:15:49 +01:00
Andreas Kling
0df72d4712 Kernel: Pass path+length to mkdir(), rmdir() and chmod() 2020-01-06 11:15:49 +01:00
Andreas Kling
642137f014 Kernel: Make access() take path+length
Also, let's return EFAULT for nullptr at the LibC layer. We can't do
all bad addresses this way, but we can at least do null. :^)
2020-01-06 11:15:48 +01:00
Andreas Kling
2c3a6c37ac Kernel: Paper over SMAP violations in clock_{gettime,nanosleep}()
Just put some SmapDisablers here to unbreak the nesalizer port.
2020-01-05 23:20:33 +01:00
Andreas Kling
c5890afc8b Kernel: Make chdir() take path+length 2020-01-05 22:06:25 +01:00
Andreas Kling
f231e9ea76 Kernel: Pass path+length to the stat() and lstat() syscalls
It's not pleasant having to deal with null-terminated strings as input
to syscalls, so let's get rid of them one by one.
2020-01-05 22:02:54 +01:00
Andreas Kling
152a83fac5 Kernel: Remove SmapDisabler in watch_file() 2020-01-05 21:55:20 +01:00
Andreas Kling
80cbb72f2f Kernel: Remove SmapDisablers in open(), openat() and set_thread_name()
This patch introduces a helpful copy_string_from_user() function
that takes a bounded null-terminated string from userspace memory
and copies it into a String object.
2020-01-05 21:51:06 +01:00
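A userland sketch of a bounded string copy in the spirit of the copy_string_from_user() helper introduced by 80cbb72f2f. The name `copy_bounded_string()`, the use of std::string and std::optional (C++17), and the choice to fail when no terminator is found within the bound are all assumptions; the kernel helper returns its own String type and has to access userspace memory safely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <optional>
#include <string>

// Copies a null-terminated string from `source`, looking at no more than
// `max_length` bytes. Returns an empty optional if `source` is null or if no
// terminator is found within the bound (an assumption of this sketch).
std::optional<std::string> copy_bounded_string(const char* source, size_t max_length)
{
    if (!source)
        return std::nullopt;
    // memchr() searches at most max_length bytes for the terminator.
    const void* terminator = std::memchr(source, '\0', max_length);
    if (!terminator)
        return std::nullopt;
    size_t length = static_cast<size_t>(static_cast<const char*>(terminator) - source);
    return std::string(source, length);
}
```

The bound is what keeps a missing terminator from turning into an unbounded read.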
Andreas Kling
c4a1ea34c2 Kernel: Fix SMAP violation in writev() syscall 2020-01-05 19:20:08 +01:00
Andreas Kling
9eef39d68a Kernel: Start implementing x86 SMAP support
Supervisor Mode Access Prevention (SMAP) is an x86 CPU feature that
prevents the kernel from accessing userspace memory. With SMAP enabled,
trying to read/write a userspace memory address while in the kernel
will now generate a page fault.

Since it's sometimes necessary to read/write userspace memory, there
are two new instructions that quickly switch the protection on/off:
STAC (disables protection) and CLAC (enables protection).
These are exposed in kernel code via the stac() and clac() helpers.

There's also a SmapDisabler RAII object that can be used to ensure
that you don't forget to re-enable protection before returning to
userspace code.

This patch also adds copy_to_user(), copy_from_user() and memset_user()
which are the "correct" way of doing things. These functions allow us
to briefly disable protection for a specific purpose, and then turn it
back on immediately after it's done. Going forward all kernel code
should be moved to using these and all uses of SmapDisabler are to be
considered FIXMEs.

Note that we're not realizing the full potential of this feature since
I've used SmapDisabler quite liberally in this initial bring-up patch.
2020-01-05 18:14:51 +01:00
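A simulated sketch of the RAII idea behind SmapDisabler from 9eef39d68a. STAC and CLAC are privileged instructions, so this version toggles a plain flag instead. `ScopedSmapDisabler`, `simulated_stac()` and `simulated_clac()` are hypothetical names, and restoring the previous state on destruction (so that nesting works) is an assumption about the design rather than something the commit message states.

```cpp
#include <cassert>

// Simulated protection state; a real kernel would toggle EFLAGS.AC instead.
static bool g_smap_protection_enabled = true;

static void simulated_stac() { g_smap_protection_enabled = false; } // Disables protection.
static void simulated_clac() { g_smap_protection_enabled = true; }  // Enables protection.

bool protection_enabled() { return g_smap_protection_enabled; }

// Disables protection for the lifetime of the object, then restores the
// state that was in effect when it was constructed.
class ScopedSmapDisabler {
public:
    ScopedSmapDisabler()
        : m_was_enabled(g_smap_protection_enabled)
    {
        simulated_stac();
    }

    ~ScopedSmapDisabler()
    {
        // Only re-enable if protection was on before; this keeps nested
        // disablers from turning it back on too early.
        if (m_was_enabled)
            simulated_clac();
    }

    ScopedSmapDisabler(const ScopedSmapDisabler&) = delete;
    ScopedSmapDisabler& operator=(const ScopedSmapDisabler&) = delete;

private:
    bool m_was_enabled;
};

// Returns true if protection was off while the disabler was alive.
bool touch_user_memory_briefly()
{
    ScopedSmapDisabler disabler;
    return !protection_enabled();
}
```

Because the destructor runs on every exit path, an early return cannot leave protection disabled. This is the property the commit message relies on.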
Andreas Kling
1525c11928 Kernel: Add missing iovec base validation for writev() syscall
We were forgetting to validate the base pointers of iovecs passed into
the writev() syscall.

Thanks to braindead for finding this bug! :^)
2020-01-05 10:38:02 +01:00
Andreas Kling
c89fe8a6a3 Kernel: Fix bad TOCTOU pattern in syscalls that take a parameter struct
Our syscall calling convention only allows passing up to 3 arguments in
registers. For syscalls that take more arguments, we bake them into a
struct and pass a pointer to that struct instead.

When doing pointer validation, this is what we would do:

    1) Validate the "params" struct
    2) Validate "params->some_pointer"
    3) ... other stuff ...
    4) Use "params->some_pointer"

Since the parameter struct is stored in userspace, it can be modified
by userspace after validation has completed.

This was a recurring pattern in many syscalls that was further hidden
by me using structured binding declarations to give convenient local
names to things in the parameter struct:

    auto& [some_pointer, ...] = *params;
    memcpy(some_pointer, ...);

This devilishly makes "some_pointer" look like a local variable but
it's actually more like an alias for "params->some_pointer" and will
expand to a dereference when accessed!

This patch fixes the issues by explicitly copying out each member from
the parameter structs before validating them, and then never using
the "param" pointers beyond that.

Thanks to braindead for finding this bug! :^)
2020-01-05 10:37:57 +01:00
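A userland sketch of the copy-then-validate approach c89fe8a6a3 describes. `WriteParams`, `is_valid_user_range()` and `handle_write()` are hypothetical names, and the validation rule is a placeholder. In the kernel, the copy would go through a helper such as copy_from_user() rather than a plain memcpy().

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical parameter struct that lives in memory the caller can modify at any time.
struct WriteParams {
    const char* buffer;
    size_t size;
};

// Hypothetical stand-in for the kernel's pointer validation.
static bool is_valid_user_range(const char* pointer, size_t size)
{
    return pointer != nullptr && size <= 4096;
}

// Returns the number of bytes copied, or -1 if validation fails.
long handle_write(const WriteParams* user_params, char* destination, size_t destination_size)
{
    // 1) Copy the parameters out exactly once into storage the caller cannot touch.
    WriteParams params;
    std::memcpy(&params, user_params, sizeof(params));

    // 2) Validate the private copy, not the caller-visible struct.
    if (!is_valid_user_range(params.buffer, params.size) || params.size > destination_size)
        return -1;

    // 3) Use only the validated copy; user_params is never read again.
    std::memcpy(destination, params.buffer, params.size);
    return static_cast<long>(params.size);
}
```

Binding references to the members of `*user_params` would reintroduce the bug, because every use would read the caller-visible struct again after validation.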
Andreas Kling
3a27790fa7 Kernel: Use Thread::from_tid() in more places 2020-01-04 18:56:04 +01:00
Andreas Kling
95ba0d5a02 Kernel: Remove unused "putch" syscall 2020-01-04 16:00:25 +01:00
Andreas Kling
5abc30e057 Kernel: Allow setgroups() to drop all groups with nullptr
Previously we'd EFAULT for setgroups(0, nullptr), but we can just as
well tolerate it if someone wants to drop groups without a pointer.
2020-01-04 13:47:54 +01:00
Andreas Kling
d84299c7be Kernel: Allow fchmod() and fchown() on pre-bind() local sockets
In order to ensure a specific owner and mode when the local socket
filesystem endpoint is instantiated, we need to be able to call
fchmod() and fchown() on a socket fd between socket() and bind().

This is because until we call bind(), there is no filesystem inode
for the socket yet.
2020-01-03 20:14:56 +01:00
Andreas Kling
1dc64ec064 Kernel: Remove unnecessary logic in kill() and killpg() syscalls
As Sergey pointed out, do_killpg() already interprets PID 0 as the
PGID of the calling process.
2020-01-03 12:58:59 +01:00
Andreas Kling
9026598999 Kernel: Add a more expressive API for getting random bytes
We now have these APIs in <Kernel/Random.h>:

    - get_fast_random_bytes(u8* buffer, size_t buffer_size)
    - get_good_random_bytes(u8* buffer, size_t buffer_size)
    - get_fast_random<T>()
    - get_good_random<T>()

Internally they both use x86 RDRAND if available, otherwise they fall
back to the same LCG we had in RandomDevice all along.

The main purpose of this patch is to give kernel code a way to better
express its needs for random data.

Randomness is something that will require a lot more work, but this is
hopefully a step in the right direction.
2020-01-03 12:43:07 +01:00
Andreas Kling
24cc67d199 Kernel: Remove read_tsc() syscall
Since nothing is using this, let's just remove it. That's one less
thing to worry about.
2020-01-03 09:27:09 +01:00
Andreas Kling
8cc5fa5598 Kernel: Unbreak module loading (broke with NX bit changes)
Modules are now mapped fully RWX. This can definitely be improved,
but at least it unbreaks the feature for now.
2020-01-03 03:44:55 +01:00
Andreas Kling
0a1865ebc6 Kernel: read() and write() should fail with EBADF for wrong mode fd's
It was previously possible to write to read-only file descriptors,
and read from write-only file descriptors.

All FileDescription objects now start out non-readable + non-writable,
and whoever is creating them has to "manually" enable reading/writing
by calling set_readable() and/or set_writable() on them.
2020-01-03 03:29:59 +01:00
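A minimal sketch of the opt-in access model 0a1865ebc6 describes, where a description starts out neither readable nor writable. `FileDescriptionSketch`, `check_read()` and `check_write()` are hypothetical; only set_readable() and set_writable() are named in the commit message, and their exact signatures are assumed here.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical, stripped-down stand-in for the kernel's FileDescription.
class FileDescriptionSketch {
public:
    void set_readable(bool readable) { m_readable = readable; }
    void set_writable(bool writable) { m_writable = writable; }

    // Each check returns 0 if the operation is allowed, or -EBADF otherwise.
    int check_read() const { return m_readable ? 0 : -EBADF; }
    int check_write() const { return m_writable ? 0 : -EBADF; }

private:
    // Both default to false, so forgetting to opt in fails closed.
    bool m_readable { false };
    bool m_writable { false };
};
```

Defaulting to no access means a code path that forgets to set the mode produces a description that can do nothing, rather than one that can do everything.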
Andreas Kling
15f3abc849 Kernel: Handle O_DIRECTORY in VFS::open() instead of in each syscall
Just taking care of some FIXMEs.
2020-01-03 03:16:29 +01:00
Andreas Kling
05653a9189 Kernel: killpg() with pgrp=0 should signal every process in the group
In the same group as the calling process, that is.
2020-01-03 03:16:29 +01:00