Commit Graph

180 Commits

Author SHA1 Message Date
Andreas Kling
48f7c28a5c Kernel: Replace "current" with Thread::current and Process::current
Suggested by Sergey. The currently running Thread and Process are now
Thread::current and Process::current respectively. :^)
2020-02-17 15:04:27 +01:00
Andreas Kling
a356e48150 Kernel: Move all code into the Kernel namespace 2020-02-16 01:27:42 +01:00
Andreas Kling
0341ddc5eb Kernel: Rename RegisterDump => RegisterState 2020-02-16 00:15:37 +01:00
Andreas Kling
e576c9e952 Kernel: Clear ESI and EDI on syscall entry
Since these are not part of the system call convention, we don't care
what userspace had in there. Might as well scrub it before entering
the kernel.

I would scrub EBP too, but that breaks the comfy kernel-thru-userspace
stack traces we currently get. It can be done with some effort.
2020-01-25 10:34:32 +01:00
Andreas Kling
5ce1cc89a0 Kernel: Add fast-path for sys$gettid()
The userspace locks are very aggressively calling sys$gettid() to find
out which thread ID they have.

Since syscalls are quite heavy, this can get very expensive for some
programs. This patch adds a fast-path for sys$gettid(), which makes it
skip all of the usual syscall validation and just return the thread ID
right away.

This cuts Kernel/Process.cpp compile time by ~18%, from ~29 to ~24 sec.
2020-01-19 17:32:05 +01:00
Andreas Kling
94ca55cefd Meta: Add license header to source files
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.

For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.

Going forward, all new source files should include a license header.
2020-01-18 09:45:54 +01:00
Andreas Kling
8b54ba0d61 Kernel: Dispatch pending signals when returning from a syscall
It was quite easy to put the system into a heavy churn state by doing
e.g "cat /dev/zero".

It was then basically impossible to kill the "cat" process, even with
"kill -9", since signals are only delivered in two conditions:

a) The target thread is blocked in the kernel
b) The target thread is running in userspace

However, since "cat /dev/zero" command spends most of its time actively
running in the kernel, not blocked, the signal dispatch code just kept
postponing actually handling the signal indefinitely.

To fix this, we now check before returning from a syscall if there are
any pending unmasked signals, and if so, we take a dramatic pause by
blocking the current thread, knowing it will immediately be unblocked
by signal dispatch anyway. :^)
2020-01-12 15:04:33 +01:00
Andreas Kling
17ef5bc0ac Kernel: Rename {ss,esp}_if_crossRing to userspace_{ss,esp}
These were always so awkwardly named.
2020-01-09 18:02:01 +01:00
Andreas Kling
9eef39d68a Kernel: Start implementing x86 SMAP support
Supervisor Mode Access Prevention (SMAP) is an x86 CPU feature that
prevents the kernel from accessing userspace memory. With SMAP enabled,
trying to read/write a userspace memory address while in the kernel
will now generate a page fault.

Since it's sometimes necessary to read/write userspace memory, there
are two new instructions that quickly switch the protection on/off:
STAC (disables protection) and CLAC (enables protection.)
These are exposed in kernel code via the stac() and clac() helpers.

There's also a SmapDisabler RAII object that can be used to ensure
that you don't forget to re-enable protection before returning to
userspace code.

THis patch also adds copy_to_user(), copy_from_user() and memset_user()
which are the "correct" way of doing things. These functions allow us
to briefly disable protection for a specific purpose, and then turn it
back on immediately after it's done. Going forward all kernel code
should be moved to using these and all uses of SmapDisabler are to be
considered FIXME's.

Note that we're not realizing the full potential of this feature since
I've used SmapDisabler quite liberally in this initial bring-up patch.
2020-01-05 18:14:51 +01:00
Andreas Kling
f081990717 Kernel: Use get_fast_random() for the random syscall stack offset 2020-01-03 12:48:28 +01:00
Andreas Kling
1d94b5eb04 Kernel: Add a random offset to kernel stacks upon syscall entry
When entering the kernel from a syscall, we now insert a small bit of
stack padding after the RegisterDump. This makes kernel stacks less
deterministic across syscalls and may make some bugs harder to exploit.

Inspired by Elena Reshetova's talk on kernel stack exploitation.
2020-01-01 23:21:24 +01:00
Andreas Kling
f01fd54d1b Kernel: Make separate kernel entry points for each PIC IRQ
Instead of having a common entry point and looking at the PIC ISR to
figure out which IRQ we're servicing, just make a separate entryway
for each IRQ that pushes the IRQ number and jumps to a common routine.

This fixes a weird issue where incoming network packets would sometimes
cause the mouse to stop working. I didn't track it down further than
realizing we were sometimes EOI'ing the wrong IRQ.
2019-12-15 12:47:53 +01:00
Andreas Kling
e49d6cc7e9 Kernel: Tidy up kernel entry points a little bit
Now that we can see the kernel entry points all the time in profiles,
let's tweak the names a little bit and switch to named exceptions.
2019-12-14 16:16:57 +01:00
Andreas Kling
e56daf547c Kernel: Disallow syscalls from writeable memory
Processes will now crash with SIGSEGV if they attempt making a syscall
from PROT_WRITE memory.

This neat idea comes from OpenBSD. :^)
2019-11-29 16:30:05 +01:00
Andreas Kling
5b8cf2ee23 Kernel: Make syscall counters and page fault counters per-thread
Now that we show individual threads in SystemMonitor and "top",
it's also very nice to have individual counters for the threads. :^)
2019-11-26 21:37:38 +01:00
Andreas Kling
c23addd1fb Kernel: When userspaces calls a removed syscall, fail with ENOSYS
This is a bit gentler than jumping to 0x0, which always crashes the
whole process. Also log a debug message about what happened, and let
the user know that it's probably time to rebuild the program.
2019-11-18 12:35:14 +01:00
Andreas Kling
3da6d89d1f Kernel+LibC: Remove the isatty() syscall
This can be implemented entirely in userspace by calling tcgetattr().
To avoid screwing up the syscall indexes, this patch also adds a
mechanism for removing a syscall without shifting the index of other
syscalls.

Note that ports will still have to be rebuilt after this change,
as their LibC code will try to make the isatty() syscall on startup.
2019-11-17 20:03:42 +01:00
Andreas Kling
794758df3a Kernel: Implement some basic stack pointer validation
VM regions can now be marked as stack regions, which is then validated
on syscall, and on page fault.

If a thread is caught with its stack pointer pointing into anything
that's *not* a Region with its stack bit set, we'll crash the whole
process with SIGSTKFLT.

Userspace must now allocate custom stacks by using mmap() with the new
MAP_STACK flag. This mechanism was first introduced in OpenBSD, and now
we have it too, yay! :^)
2019-11-17 12:15:43 +01:00
Sergey Bugaev
1e1ddce9d8 Kernel: Unwind kernel stacks before dying
While executing in the kernel, a thread can acquire various resources
that need cleanup, such as locks and references to RefCounted objects.
This cleanup normally happens on the exit path, such as in destructors
for various RAII guards. But we weren't calling those exit paths when
killing threads that have been executing in the kernel, such as threads
blocked on reading or sleeping, thus causing leaks.

This commit changes how killing threads works. Now, instead of killing
a thread directly, one is supposed to call thread->set_should_die(),
which will unblock it and make it unwind the stack if it is blocked
in the kernel. Then, just before returning to the userspace, the thread
will automatically die.
2019-11-14 20:05:58 +01:00
Andreas Kling
69ca9cfd78 LibPthread: Start working on a POSIX threading library
This patch adds pthread_create() and pthread_exit(), which currently
simply wrap our existing create_thread() and exit_thread() syscalls.

LibThread is also ported to using LibPthread.
2019-11-13 21:49:24 +01:00
Andreas Kling
b285a1944e Kernel: Clear the x86 DF flag when entering the kernel
The SysV ABI says that the DF flag should be clear on function entry.
That means we have to clear it when jumping into the kernel from some
random userspace context.
2019-11-09 22:42:19 +01:00
Andreas Kling
fbeb1ab15b Kernel: Use a lookup table for syscalls
Instead of the big ugly switch statement, build a lookup table using
the syscall enumeration macro.

This greatly simplifies the syscall implementation. :^)
2019-11-09 22:42:19 +01:00
Andreas Kling
9a4b117f48 Kernel: Simplify kernel entry points slightly
It was silly to push the address of the stack pointer when we can also
just change the callee argument to be a value type.
2019-11-06 13:15:55 +01:00
Andreas Kling
1c6f8d3cbd Kernel: Don't build with -mregparm=3
It was really confusing to have different calling conventions in kernel
and userspace. Also this has prevented us from linking with libgcc.
2019-11-06 13:04:47 +01:00
Andreas Kling
cc68654a44 Kernel+LibC: Implement clock_gettime() and clock_nanosleep()
Only the CLOCK_MONOTONIC clock is supported at the moment, and it only
has millisecond precision. :^)
2019-11-02 19:34:06 +01:00
Calvin Buckley
7e4e092653 Kernel: Add a Linux-style getrandom syscall
The way it gets the entropy and blasts it to the buffer is pretty
ugly IMHO, but it does work for now. (It should be replaced, by
not truncating a u32.)

It implements an (unused for now) flags argument, like Linux but
instead of OpenBSD's. This is in case we want to distinguish
between entropy sources or any other reason and have to implement
a new syscall later. Of course, learn from Linux's struggles with
entropy sourcing too.
2019-10-13 18:03:21 +02:00
Drew Stratford
7fc903b97a Kernel: Add exception_code to RegisterDump.
Added the exception_code field to RegisterDump, removing the need
for RegisterDumpWithExceptionCode. To accomplish this, I had to
push a dummy exception code during some interrupt entries to properly
pad out the RegisterDump. Note that we also needed to change some code
in sys$sigreturn to deal with the new RegisterDump layout.
2019-10-07 16:39:47 +02:00
Mauri de Souza Nunes
7d85fc00e4 Kernel: Implement fchdir syscall
The fchdir() function is equivalent to chdir() except that the
directory that is to be the new current working directory is
specified by a file descriptor.
2019-09-13 14:04:38 +02:00
Drew Stratford
81d0f96f20 Kernel: Use user stack for signal handlers.
This commit drastically changes how signals are handled.

In the case that an unblocked thread is signaled it works much
in the same way as previously. However, when a blocking syscall
is interrupted, we set up the signal trampoline on the user
stack, complete the blocking syscall, return down the kernel
stack and then jump to the handler. This means that from the
kernel stack's perspective, we only ever get one system call deep.

The signal trampoline has also been changed in order to properly
store the return value from system calls. This is necessary due
to the new way we exit from signaled system calls.
2019-09-05 16:37:09 +02:00
Rok Povsic
18fbe4ac83 Kernel: Add realpath syscall 2019-08-25 19:47:37 +02:00
Sergey Bugaev
425c356288 Kernel+LibC+Userland: Support mounting other kinds of filesystems 2019-08-17 12:07:55 +02:00
Jesse Buhagiar
bc22456f89 Kernel: Added unmount ability to VFS
It is now possible to unmount file systems from the VFS via `umount`.
It works via looking up the `fsid` of the filesystem from the `Inode`'s
metatdata so I'm not sure how fragile it is. It seems to work for now
though as something to get us going.
2019-08-17 09:29:54 +02:00
Andreas Kling
6ad3efe067 Kernel+LibC: Add get_process_name() syscall
It does exactly what it sounds like:

    int get_process_name(char* buffer, int buffer_size);
2019-08-15 20:55:10 +02:00
Andreas Kling
7d6689055f Kernel+LibC+crash: Add mprotect() syscall
This patch adds the mprotect() syscall to allow changing the protection
flags for memory regions. We don't do any region splitting/merging yet,
so this only works on whole mmap() regions.

Added a "crash -r" flag to verify that we crash when you attempt to
write to read-only memory. :^)
2019-08-12 19:33:24 +02:00
Sergey Bugaev
9c3b1ca0c6 Kernel+LibC: Support passing O_CLOEXEC to pipe()
In the userspace, this mimics the Linux pipe2() syscall;
in the kernel, the Process::sys$pipe() now always accepts
a flags argument, the no-argument pipe() syscall is now a
userspace wrapper over pipe2().
2019-08-05 16:04:31 +02:00
Jesse
401c87a0cc Kernel: mount system call (#396)
It is now possible to mount ext2 `DiskDevice` devices under Serenity on
any folder in the root filesystem. Currently any user can do this with
any permissions. There's a fair amount of assumptions made here too,
that might not be too good, but can be worked on in the future. This is
a good start to allow more dynamic operation under the OS itself.

It is also currently impossible to unmount and such, and devices will
fail to mount in Linux as the FS 'needs to be cleaned'. I'll work on
getting `umount` done ASAP to rectify this (as well as working on less
assumption-making in the mount syscall. We don't want to just be able
to mount DiskDevices!). This could probably be fixed with some `-t`
flag or something similar.
2019-08-02 15:18:47 +02:00
Andreas Kling
5ded77df39 Kernel+ProcessManager: Let processes have an icon and show it in the table.
Processes can now have an icon assigned, which is essentially a 16x16 RGBA32
bitmap exposed as a shared buffer ID.

You set the icon ID by calling set_process_icon(int) and the icon ID will be
exposed through /proc/all.

To make this work, I added a mechanism for making shared buffers globally
accessible. For safety reasons, each app seals the icon buffer before making
it global.

Right now the first call to GWindow::set_icon() is what determines the
process icon. We'll probably change this in the future. :^)
2019-07-29 07:26:01 +02:00
Andreas Kling
c8e2bb5605 Kernel: Add a mechanism for listening for changes to an inode.
The syscall is quite simple:

    int watch_file(const char* path, int path_length);

It returns a file descriptor referring to a "InodeWatcher" object in the
kernel. It becomes readable whenever something changes about the inode.

Currently this is implemented by hooking the "metadata dirty bit" in
Inode which isn't perfect, but it's a start. :^)
2019-07-22 20:01:11 +02:00
Andreas Kling
af81645a2a Kernel+LibC: Add a dbgputstr() syscall for sending strings to debug output.
This is very handy for the DebugLogStream implementation, among others. :^)
2019-07-21 21:43:37 +02:00
Andreas Kling
3fce2fb205 Kernel+LibC: Add a dbgputch() syscall and use it for userspace dbgprintf().
The "stddbg" stream was a cute idea but we never ended up using it in
practice, so let's simplify this and implement userspace dbgprintf() on top
of a simple dbgputch() syscall instead.

This makes debugging LibC startup a little bit easier. :^)
2019-07-21 19:45:31 +02:00
Andreas Kling
d2b521f0ab Kernel+LibC: Add a dump_backtrace() syscall.
This is very simple but already very useful. Now you're able to call to
dump_backtrace() from anywhere userspace to get a nice symbolicated
backtrace in the debugger output. :^)
2019-07-21 09:59:17 +02:00
Jesse
a5d80f7e3b Kernel: Only allow superuser to halt() the system (#342)
Following the discussion in #334, shutdown must also have root-only
run permissions.
2019-07-19 13:08:26 +02:00
Jesse
a27c9e3e01 Kernel+Userland: Addd reboot syscall (#334)
Rolling with the theme of adding a dialog to shutdown the machine, it is
probably nice to have a way to reboot the machine without performing a full
system powerdown.

A reboot program has been added to `/bin/` as well as a corresponding
`syscall` (SC_reboot). This syscall works by attempting to pulse the 8042
keyboard controller. Note that this is NOT supported on  new machines, and
should only be a fallback until we have proper ACPI support.

The implementation causes a triple fault in QEMU, which then restarts the
system. The filesystems are locked and synchronized before this occurs,
so there shouldn't be any corruption etctera.
2019-07-19 09:58:12 +02:00
Robin Burchell
b907608e46 SharedBuffer: Split the creation and share steps
This allows us to seal a buffer *before* anyone else has access to it
(well, ok, the creating process still does, but you can't win them all).

It also means that a SharedBuffer can be shared with multiple clients:
all you need is to have access to it to share it on again.
2019-07-18 10:06:20 +02:00
Andreas Kling
c110cf193d Kernel: Have the open() syscall take an explicit path length parameter.
Instead of computing the path length inside the syscall handler, let the
caller do that work. This allows us to implement to new variants of open()
and creat(), called open_with_path_length() and creat_with_path_length().
These are suitable for use with e.g StringView.
2019-07-08 20:01:49 +02:00
Andreas Kling
27f699ef0c AK: Rename the common integer typedefs to make it obvious what they are.
These types can be picked up by including <AK/Types.h>:

* u8, u16, u32, u64 (unsigned)
* i8, i16, i32, i64 (signed)
2019-07-03 21:20:13 +02:00
Robin Burchell
952382b413 Kernel/Userland: Add a halt syscall, and a shutdown binary to invoke it 2019-06-16 12:25:30 +02:00
Andreas Kling
736092a087 Kernel: Move i386.{cpp,h} => Arch/i386/CPU.{cpp,h}
There's a ton of work that would need to be done before we could spin up on
another architecture, but let's at least try to separate things out a bit.
2019-06-07 20:02:01 +02:00
Andreas Kling
bc951ca565 Kernel: Run clang-format on everything. 2019-06-07 11:43:58 +02:00
Andreas Kling
93d3d1ede1 Kernel: Add fchown() syscall. 2019-06-01 20:31:36 +02:00