Commit Graph

262 Commits

Author SHA1 Message Date
Tom
fb41d89384 Kernel: Implement software context switching and Processor structure
Moving certain globals into a new Processor structure for
each CPU allows us to eventually run an instance of the
scheduler on each CPU.
2020-07-01 12:07:01 +02:00
Sergey Bugaev
d2b500fbcb AK+Kernel: Help the compiler inline a bunch of trivial methods
If these methods get inlined, the compiler is able to statically eliminate most
of the assertions. Alas, it doesn't realize this, and believes inlining them to
be too expensive. So give it a strong hint that it's not the case.

This *decreases* the kernel binary size.
2020-05-20 14:11:13 +02:00
Brian Gianforcaro
eeb5318c25 Kernel: Expose timers via a TimerId type
The public consumers of the timer API shouldn't need to know
how timer IDs are tracked internally. Expose a typedef
instead to allow the internal implementation to be protected
from potential churn in the future.

It's also just good API design.
2020-04-27 11:14:41 +02:00
Brian Gianforcaro
faf15e3721 Kernel: Add timeout support to Thread::wait_on
This change plumbs a new optional timeout option to wait_on.
The timeout is enabled by enqueuing a timer on the timer queue
while we are waiting. We can then see if we were woken up or
timed out by checking if we are still on the wait queue or not.
2020-04-26 21:31:52 +02:00
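The "still on the wait queue?" check described in faf15e3721 can be sketched as follows. This is a simplified, single-threaded model, not the kernel's code: WaitQueueSketch, classify_wakeup and the use of plain integer thread IDs are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <vector>

enum class WaitResult { WokenUp, TimedOut };

// Hypothetical stand-in for a wait queue; it stores thread IDs in FIFO order.
class WaitQueueSketch {
public:
    void enqueue(int tid) { m_waiters.push_back(tid); }

    bool contains(int tid) const
    {
        return std::find(m_waiters.begin(), m_waiters.end(), tid) != m_waiters.end();
    }

    void remove(int tid)
    {
        m_waiters.erase(std::remove(m_waiters.begin(), m_waiters.end(), tid), m_waiters.end());
    }

    // Normal wake-up path: dequeue the first waiter.
    // Returns its tid, or -1 if nobody is waiting.
    int wake_one()
    {
        if (m_waiters.empty())
            return -1;
        int tid = m_waiters.front();
        m_waiters.erase(m_waiters.begin());
        return tid;
    }

private:
    std::vector<int> m_waiters;
};

// Called once the waiting thread runs again, whatever the reason.
inline WaitResult classify_wakeup(WaitQueueSketch& queue, int tid)
{
    if (queue.contains(tid)) {
        // Nobody dequeued us, so the timer must have fired first.
        queue.remove(tid);
        return WaitResult::TimedOut;
    }
    return WaitResult::WokenUp;
}
```

A thread that was dequeued by wake_one() classifies as WokenUp; one that is still queued when it resumes classifies as TimedOut and removes itself.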
Itamar
9e51e295cf ptrace: Add PT_SETREGS
PT_SETREGS sets the registers of the traced thread. It can only be
used when the tracee is stopped.

Also, refactor ptrace.
The implementation was getting long and cluttered the already large
Process.cpp file.

This commit moves the bulk of the implementation to Kernel/Ptrace.cpp,
and factors out peek & poke to separate methods of the Process class.
2020-04-13 00:53:22 +02:00
Itamar
4568a628f9 Thread: Set m_blocker to null in Thread::unblock()
Before this commit, m_blocker was only set to null in Thread::block,
after the thread has been unblocked.

Starting with this commit, m_blocker is also set to null in
Thread::unblock.

This change will allow us to implement a missing feature of the PT_TRACE
command of the ptrace syscall - stopping the traced thread when it
exits the execve syscall.

That feature will be implemented by sending a blocking SIGSTOP to the
traced thread after it has executed the execve logic and before it
starts executing the new program in userspace.

However, since Process::exec arranges the tss to return to userspace
(the so-called "yield-teleport"), the code in Thread::block that should
be run after the thread unblocks, and sets m_blocker to null, never
actually runs.

Setting m_blocker to null in Thread::unblock allows us to avoid an
incorrect state where the thread is in a Running state but contains a
pointer to a Blocker.
2020-04-13 00:53:22 +02:00
Peter Nelson
eff27f39d5 Kernel: Store previous thread state upon all transitions to Stopped (#1753)
We now store the previous thread state in m_stop_state for all
transitions to the Stopped state via Thread::set_state.

Fixes #1752, where upon resuming a thread that was stopped with SIGTSTP,
the previous state of the thread was not remembered correctly, resulting
in m_stop_state == State::Invalid and a failed assertion.
2020-04-11 23:39:46 +02:00
Andrew Kaster
21b5909dc6 LibELF: Move ELF classes into namespace ELF
This is for consistency with other namespace changes that were made
a while back to the other libraries :)
2020-04-11 22:41:05 +02:00
Andreas Kling
b7ff3b5ad1 Kernel: Include the current instruction pointer in profile samples
We were missing the innermost instruction pointer when sampling.
This makes the instruction-level profile info a lot cooler! :^)
2020-04-11 21:04:45 +02:00
Andreas Kling
dc7340332d Kernel: Update cryptically-named functions related to symbolication 2020-04-08 17:19:46 +02:00
Itamar
6b74d38aab Kernel: Add 'ptrace' syscall
This commit adds a basic implementation of
the ptrace syscall, which allows one process
(the tracer) to control another process (the tracee).

While a process is being traced, it is stopped whenever a signal is
received (other than SIGCONT).

The tracer can start tracing another thread with PT_ATTACH,
which causes the tracee to stop.

From there, the tracer can use PT_CONTINUE
to continue the execution of the tracee,
or use other request codes (which haven't been implemented yet)
to modify the state of the tracee.

Additional request codes are PT_SYSCALL, which causes the tracee to
continue execution but stop at the next entry or exit from a syscall,
and PT_GETREGS which fetches the last saved register set of the tracee
(can be used to inspect syscall arguments and return value).

A special request code is PT_TRACE_ME, which is issued by the tracee
and causes it to stop when it calls execve and wait for the
tracer to attach.
2020-03-28 18:27:18 +01:00
Andreas Kling
b1058b33fb AK: Add global FlatPtr typedef. It's u32 or u64, based on sizeof(void*)
Use this instead of uintptr_t throughout the codebase. This makes it
possible to pass a FlatPtr to something that has u32 and u64 overloads.
2020-03-08 13:06:51 +01:00
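A minimal sketch of how a typedef like the one in b1058b33fb can be selected at compile time. It uses std::conditional_t from the standard library as a stand-in for whatever AK uses internally; the u32/u64 aliases and the describe() and to_flat_ptr() helpers are declared here only for illustration.

```cpp
#include <cstdint>
#include <type_traits>

// Local stand-ins for the fixed-width integer aliases (assumed here).
using u32 = std::uint32_t;
using u64 = std::uint64_t;

// Pick u64 on targets with 8-byte pointers, u32 otherwise.
using FlatPtr = std::conditional_t<sizeof(void*) == 8, u64, u32>;

static_assert(sizeof(FlatPtr) == sizeof(void*), "FlatPtr must be pointer-sized");

// A pair of overloads of the kind the commit message mentions. FlatPtr is
// always exactly one of these two types, so the call is never ambiguous.
inline const char* describe(u32) { return "u32"; }
inline const char* describe(u64) { return "u64"; }

// Hypothetical helper: convert a pointer to its integer representation.
inline FlatPtr to_flat_ptr(const void* p) { return reinterpret_cast<FlatPtr>(p); }
```

Calling describe(to_flat_ptr(&x)) resolves to exactly one overload on both 32-bit and 64-bit targets.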
Liav A
0fc60e41dd Kernel: Use klog() instead of kprintf()
Also, duplicate data in dbg() and klog() calls was removed.
In addition, leakage of virtual addresses to the kernel log is prevented
by replacing the kprintf() calls that printed them with dbg() calls.
All other kprintf() calls were replaced with klog().
2020-03-02 22:23:39 +01:00
Andreas Kling
678c87087d Kernel: Load executables on demand when symbolicating
Previously we would map the entire executable of a program in its own
address space (but make it unavailable to userspace code.)

This patch removes that and changes the symbolication code to remap
the executable on demand (and into the kernel's own address space
instead of the process address space.)

This opens up a couple of further simplifications that will follow.
2020-03-02 11:20:34 +01:00
Andreas Kling
5e0c4d689f Kernel: Move ProcessPagingScope to its own files 2020-03-01 15:38:09 +01:00
Andreas Kling
2839bb0be1 Kernel: Restore the previous thread state on SIGCONT after SIGSTOP
When stopping a thread with the SIGSTOP signal, we now store the thread
state in Thread::m_stop_state. That state is then restored on SIGCONT.
This fixes an issue where previously-blocked threads would unblock
upon resume. Now they simply resume in the Blocked state, and it's up
to the regular unblocking mechanism to unblock them.

Fixes #1326.
2020-03-01 15:14:17 +01:00
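The save-and-restore behavior described in 2839bb0be1 can be modeled with a tiny state machine. This is a sketch under assumptions: StopSketchThread and its reduced State enum are hypothetical, and the early return for an already-stopped thread is a choice made for this example, not something the commit message states.

```cpp
#include <cassert>

// Reduced stand-in for the kernel's thread states (assumed subset).
enum class State { Invalid, Runnable, Running, Blocked, Stopped };

struct StopSketchThread {
    State state { State::Runnable };
    State stop_state { State::Invalid }; // plays the role of Thread::m_stop_state

    // SIGSTOP: remember the current state, then stop.
    void stop()
    {
        if (state == State::Stopped)
            return; // already stopped; keep the originally saved state
        stop_state = state;
        state = State::Stopped;
    }

    // SIGCONT: return to the remembered state instead of unblocking.
    void resume()
    {
        if (state != State::Stopped)
            return;
        assert(stop_state != State::Invalid);
        state = stop_state;
        stop_state = State::Invalid;
    }
};
```

A thread that was Blocked before stop() is Blocked again after resume(), so unblocking is left to the regular mechanism.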
Andreas Kling
8b6d548b55 Kernel: Disable interrupts throughout Thread::raw_backtrace()
Otherwise we may hit an assertion when validating stack addresses.
2020-02-29 22:06:56 +01:00
Andreas Kling
7cd1bdfd81 Kernel: Simplify some dbg() logging
We don't have to log the process name/PID/TID, dbg() automatically adds
that as a prefix to every line.

Also we don't have to do .characters() on Strings passed to dbg() :^)
2020-02-29 13:39:06 +01:00
Liav A
a506b2a48e Thread: Use dbg() instead of dbgprintf() 2020-02-27 13:05:12 +01:00
Cristian-Bogdan SIRB
05ce8586ea Kernel: Fix ASSERTION failed in join_thread syscall
set_interrupted_by_death was never called whenever a thread that had
a joiner died, so the joiner remained with the joinee pointer there,
resulting in an assertion fail in JoinBlocker: m_joinee pointed to
a freed task, filled with garbage.

Thread::current->m_joinee may not be valid after the unblock

Properly return the joinee exit value to the joiner thread.
2020-02-27 10:09:44 +01:00
Cristian-Bogdan SIRB
717cd5015e Kernel: Allow process with multiple threads to call exec and exit
This allows a process which has more than 1 thread to call exec, even
from a thread. This kills all the other threads, but it won't wait for
them to finish, just makes sure that they are not in a running/runnable
state.

In the case where a thread does exec, the new program PID will be the
thread TID, to keep the PID == TID in the new process.

This introduces a new function inside the Process class,
kill_threads_except_self which is called on exit() too (exit with
multiple threads wasn't properly working either).

Inside the Lock class, there is the need for a new function,
clear_waiters, which removes all the waiters from the
Process::big_lock. This is needed since after an exit/exec, there should
be no other threads waiting for this lock, the threads should be simply
killed. Only queued threads should wait for this lock at this point,
since blocked threads are handled in set_should_die.
2020-02-26 13:06:40 +01:00
Andreas Kling
ceec1a7d38 AK: Make Vector use size_t for its size and capacity 2020-02-25 14:52:35 +01:00
Andreas Kling
94652fd2fb Kernel: Fully validate pointers when walking stack during profiling
It's not enough to just check that things wouldn't page fault, we also
need to verify that addresses are accessible to the profiled thread.
2020-02-22 10:09:54 +01:00
Andreas Kling
59b9e49bcd Kernel: Don't trigger page faults during profiling stack walk
The kernel sampling profiler will walk thread stacks during the timer
tick handler. Since it's not safe to trigger page faults during IRQ's,
we now avoid this by checking the page tables manually before accessing
each stack location.
2020-02-21 15:49:39 +01:00
Andreas Kling
9aa234cc47 Kernel: Reset FPU state on exec() 2020-02-18 13:44:27 +01:00
Andreas Kling
48f7c28a5c Kernel: Replace "current" with Thread::current and Process::current
Suggested by Sergey. The currently running Thread and Process are now
Thread::current and Process::current respectively. :^)
2020-02-17 15:04:27 +01:00
Andreas Kling
1d611e4a11 Kernel: Reduce header dependencies of MemoryManager and Region 2020-02-16 01:33:41 +01:00
Andreas Kling
a356e48150 Kernel: Move all code into the Kernel namespace 2020-02-16 01:27:42 +01:00
Andreas Kling
0341ddc5eb Kernel: Rename RegisterDump => RegisterState 2020-02-16 00:15:37 +01:00
Andreas Kling
934b1d8a9b Kernel: Finalizer should not go back to sleep if there's more to do
Before putting itself back on the wait queue, the finalizer task will
now check if there's more work to do, and if so, do it first. :^)

This patch also puts a bunch of process/thread debug logging behind
PROCESS_DEBUG and THREAD_DEBUG since it was unbearable to debug this
stuff with all the spam.
2020-02-01 10:56:17 +01:00
Andreas Kling
5163c5cc63 Kernel: Expose the signal that stopped a thread via sys$waitpid() 2020-01-27 20:47:10 +01:00
Andreas Kling
17210a39e4 Kernel: Remove ancient hack that put the current PID in TSS.SS2
While I was bringing up multitasking, I put the current PID in the SS2
(ring 2 stack segment) slot of the TSS. This was so I could see which
PID was currently running when just inspecting the CPU state.
2020-01-27 13:10:24 +01:00
Andreas Kling
ae0f92a0a1 Kernel: Simplify kernel thread stack allocation
We had two identical code paths doing this for some reason.
2020-01-27 12:52:45 +01:00
Andreas Kling
f38cfb3562 Kernel: Tidy up debug logging a little bit
When using dbg() in the kernel, the output is automatically prefixed
with [Process(PID:TID)]. This makes it a lot easier to understand which
thread is generating the output.

This patch also cleans up some common logging messages and removes the
now-unnecessary "dbg() << *current << ..." pattern.
2020-01-21 16:16:20 +01:00
Andreas Kling
e901a3695a Kernel: Use the templated copy_to/from_user() in more places
These ensure that the "to" and "from" pointers have the same type,
and also that we copy the correct number of bytes.
2020-01-20 13:41:21 +01:00
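The idea behind the templated helpers in e901a3695a can be sketched as a typed wrapper around an untyped copy. The names copy_to_user_raw and copy_to_user_typed are hypothetical, and the raw primitive is a plain memcpy here so the example runs in userspace; a real kernel helper would also validate the user address range.

```cpp
#include <cstddef>
#include <cstring>

// Untyped primitive. Nothing stops a caller from passing mismatched
// pointers or the wrong size.
inline void copy_to_user_raw(void* dest, const void* src, std::size_t size)
{
    std::memcpy(dest, src, size);
}

// Typed wrapper: both pointers must refer to the same T, and the byte
// count is derived from T instead of being written out by hand.
template<typename T>
inline void copy_to_user_typed(T* dest, const T* src)
{
    copy_to_user_raw(dest, src, sizeof(T));
}
```

With the wrapper, passing an int* destination and a long* source fails to compile, because T cannot be deduced consistently.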
Andreas Kling
4b7a89911c Kernel: Remove some unnecessary casts to uintptr_t
VirtualAddress is constructible from uintptr_t and const void*.
PhysicalAddress is constructible from uintptr_t but not const void*.
2020-01-20 13:13:03 +01:00
Andreas Kling
a246e9cd7e Use uintptr_t instead of u32 when storing pointers as integers
uintptr_t is 32-bit or 64-bit depending on the target platform.
This will help us write pointer size agnostic code so that when the day
comes that we want to do a 64-bit port, we'll be in better shape.
2020-01-20 13:13:03 +01:00
Andreas Kling
1d02ac35fc Kernel: Limit Thread::raw_backtrace() to the max profiler stack size
Let's avoid walking overly long stacks here, since kmalloc() is finite.
2020-01-19 13:54:09 +01:00
Andreas Kling
87583aea9c Kernel: Use copy_from_user() when appropriate during thread backtracing 2020-01-19 10:33:26 +01:00
Andreas Kling
94ca55cefd Meta: Add license header to source files
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.

For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.

Going forward, all new source files should include a license header.
2020-01-18 09:45:54 +01:00
Andreas Kling
65cb406327 Kernel: Allow unlocking a held Lock with interrupts disabled
This is needed to eliminate a race in Thread::wait_on() where we'd
otherwise have to wait until after unlocking the process lock before
we can disable interrupts.
2020-01-13 18:56:46 +01:00
Andreas Kling
41376d4662 Kernel: Fix Lock racing to the WaitQueue
There was a time window between releasing Lock::m_lock and calling into
the lock's WaitQueue where someone else could take m_lock and bring two
threads into a deadlock situation.

Fix this issue by holding Lock::m_lock until interrupts are disabled by
either Thread::wait_on() or WaitQueue::wake_one().
2020-01-12 19:04:16 +01:00
Andreas Kling
a885719af5 Kernel: Keep SMAP protection enabled in Thread::backtrace_impl() 2020-01-12 10:47:01 +01:00
Andreas Kling
f6c0fccc01 Kernel: Fix busted backtraces when a thread backtraces itself
When the current thread is backtracing itself, we now start walking the
stack from the current EBP register value, instead of the TSS one.

Now SystemMonitor always appears to be running Thread::backtrace() when
sampled, which makes perfect sense. :^)
2020-01-12 10:19:37 +01:00
Andreas Kling
8c5cd97b45 Kernel: Fix kernel null deref on process crash during join_thread()
The join_thread() syscall is not supposed to be interruptible by
signals, but it was. And since the process death mechanism piggybacked
on signal interrupts, it was possible to interrupt a pthread_join() by
killing the process that was doing it, leading to confusion due to some
assumptions being made by Thread::finalize() for threads that have a
pending joiner.

This patch fixes the issue by making "interrupted by death" a distinct
block result separate from "interrupted by signal". Then we handle that
state in join_thread() and tidy things up so that thread finalization
doesn't get confused by the pending joiner being gone.

Test: Tests/Kernel/null-deref-crash-during-pthread_join.cpp
2020-01-10 19:23:45 +01:00
Andreas Kling
17ef5bc0ac Kernel: Rename {ss,esp}_if_crossRing to userspace_{ss,esp}
These were always so awkwardly named.
2020-01-09 18:02:01 +01:00
Andreas Kling
e23f05a157 Kernel: Remove unused variable Thread::m_userspace_stack_region 2020-01-09 12:31:18 +01:00
Andreas Kling
f6691ad26e Kernel: Fix SMAP violation in thread signal dispatch 2020-01-05 18:19:26 +01:00
Andreas Kling
9eef39d68a Kernel: Start implementing x86 SMAP support
Supervisor Mode Access Prevention (SMAP) is an x86 CPU feature that
prevents the kernel from accessing userspace memory. With SMAP enabled,
trying to read/write a userspace memory address while in the kernel
will now generate a page fault.

Since it's sometimes necessary to read/write userspace memory, there
are two new instructions that quickly switch the protection on/off:
STAC (disables protection) and CLAC (enables protection.)
These are exposed in kernel code via the stac() and clac() helpers.

There's also a SmapDisabler RAII object that can be used to ensure
that you don't forget to re-enable protection before returning to
userspace code.

This patch also adds copy_to_user(), copy_from_user() and memset_user()
which are the "correct" way of doing things. These functions allow us
to briefly disable protection for a specific purpose, and then turn it
back on immediately after it's done. Going forward all kernel code
should be moved to using these and all uses of SmapDisabler are to be
considered FIXME's.

Note that we're not realizing the full potential of this feature since
I've used SmapDisabler quite liberally in this initial bring-up patch.
2020-01-05 18:14:51 +01:00
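The two styles contrasted in 9eef39d68a, a scoped RAII guard versus a helper that opens the window only around one copy, can be sketched in userspace. STAC and CLAC are privileged instructions, so this model replaces them with a boolean flag; SmapDisablerSketch, copy_from_user_sketch and g_smap_protection_enabled are hypothetical names, and the guard here does not handle nesting.

```cpp
#include <cstddef>
#include <cstring>

// Simulated protection flag; real kernel code would execute STAC/CLAC.
static bool g_smap_protection_enabled = true;

inline void stac() { g_smap_protection_enabled = false; } // disables protection
inline void clac() { g_smap_protection_enabled = true; }  // enables protection

// RAII guard: protection is off for the guard's lifetime and back on
// as soon as the guard goes out of scope.
class SmapDisablerSketch {
public:
    SmapDisablerSketch() { stac(); }
    ~SmapDisablerSketch() { clac(); }
    SmapDisablerSketch(const SmapDisablerSketch&) = delete;
    SmapDisablerSketch& operator=(const SmapDisablerSketch&) = delete;
};

// Preferred style: protection is disabled only around the copy itself.
inline void copy_from_user_sketch(void* dest, const void* src, std::size_t size)
{
    stac();
    std::memcpy(dest, src, size);
    clac();
}
```

The guard keeps protection off for a whole scope, which is why the commit treats its uses as FIXMEs; the copy helper keeps the unprotected window as short as possible.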
Andreas Kling
3a27790fa7 Kernel: Use Thread::from_tid() in more places 2020-01-04 18:56:04 +01:00
Andreas Kling
32ec1e5aed Kernel: Mask kernel addresses in backtraces and profiles
Addresses outside the userspace virtual range will now show up as
0xdeadc0de in backtraces and profiles generated by unprivileged users.
2020-01-02 20:51:31 +01:00
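A minimal sketch of the masking rule from 32ec1e5aed. It assumes that the userspace range ends where the kernel-only mappings begin at 3 GB (0xc0000000); the function name and the boolean privilege parameter are hypothetical.

```cpp
#include <cstdint>

// Assumed split: kernel-only mappings start at 3 GB.
constexpr std::uint32_t kernel_base = 0xc0000000u;
constexpr std::uint32_t masked_address = 0xdeadc0deu;

// Returns the address as it should appear in a backtrace or profile.
inline std::uint32_t address_for_display(std::uint32_t address, bool viewer_is_privileged)
{
    if (viewer_is_privileged || address < kernel_base)
        return address;
    return masked_address;
}
```

An unprivileged viewer sees userspace addresses unchanged and 0xdeadc0de in place of any kernel address.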
Andreas Kling
f598bbbb1d Kernel: Prevent executing I/O instructions in userspace
All threads were running with iomapbase=0 in their TSS, which the CPU
interprets as "there's an I/O permission bitmap starting at offset 0
into my TSS".

Because of that, any bits that were 1 inside the TSS would allow the
thread to execute I/O instructions on the port with that bit index.

Fix this by always setting the iomapbase to sizeof(TSS32), and also
setting the TSS descriptor's limit to sizeof(TSS32), effectively making
the I/O permissions bitmap zero-length.

This should make it no longer possible to do I/O from userspace. :^)
2020-01-01 17:31:41 +01:00
Andreas Kling
fd740829d1 Kernel: Switch to eagerly restoring x86 FPU state on context switch
Lazy FPU restore is well known to be vulnerable to timing attacks,
and eager restore is a lot simpler anyway, so let's just do it eagerly.
2020-01-01 16:54:21 +01:00
Andreas Kling
54d182f553 Kernel: Remove some unnecessary leaking of kernel pointers into dmesg
There's a lot more of this and we need to stop printing kernel pointers
anywhere but the debug console.
2019-12-31 01:22:00 +01:00
Andreas Kling
610f3ad12f Kernel: Add a basic thread boosting mechanism
This patch introduces a syscall:

    int set_thread_boost(int tid, int amount)

You can use this to add a permanent boost value to the effective thread
priority of any thread with your UID (or any thread in the system if
you are the superuser.)

This is quite crude, but opens up some interesting opportunities. :^)
2019-12-30 19:23:13 +01:00
Andreas Kling
50677bf806 Kernel: Refactor scheduler to use dynamic thread priorities
Threads now have numeric priorities with a base priority in the 1-99
range.

Whenever a runnable thread is *not* scheduled, its effective priority
is incremented by 1. This is tracked in Thread::m_extra_priority.
The effective priority of a thread is m_priority + m_extra_priority.

When a runnable thread *is* scheduled, its m_extra_priority is reset to
zero and the effective priority returns to base.

This means that a lower-priority thread will always eventually get
scheduled to run, once its effective priority becomes high enough to
exceed the base priority of threads "above" it.

The previous values for ThreadPriority (Low, Normal and High) are now
replaced as follows:

    Low -> 10
    Normal -> 30
    High -> 50

In other words, it will take 20 ticks for a "Low" priority thread to
get to "Normal" effective priority, and another 20 to reach "High".

This is not perfect, and I've used some quite naive data structures,
but I think the mechanism will allow us to build various new and
interesting optimizations, and we can figure out better data structures
later on. :^)
2019-12-30 18:46:17 +01:00
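The aging rule from 50677bf806 can be sketched with a linear scan over runnable threads. PrioThread and pick_next are hypothetical names, the vector is a deliberately naive data structure, and breaking ties in favor of the earlier entry is a choice made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct PrioThread {
    int priority;       // base priority
    int extra_priority; // aging bonus, like Thread::m_extra_priority

    int effective_priority() const { return priority + extra_priority; }
};

// Picks the thread with the highest effective priority, resets its bonus
// to zero, and adds 1 to the bonus of every thread that was passed over.
// Returns the index of the chosen thread.
inline std::size_t pick_next(std::vector<PrioThread>& threads)
{
    assert(!threads.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < threads.size(); ++i) {
        if (threads[i].effective_priority() > threads[best].effective_priority())
            best = i;
    }
    for (std::size_t i = 0; i < threads.size(); ++i) {
        if (i == best)
            threads[i].extra_priority = 0;
        else
            ++threads[i].extra_priority;
    }
    return best;
}
```

With one thread at 10 ("Low") and one at 30 ("Normal"), the Normal thread wins the first 20 picks. By then the Low thread has reached an effective priority of 30, and it is chosen on the next pick because of the tie-break.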
Andreas Kling
9e55bcb7da Kernel: Make kernel memory regions be non-executable by default
From now on, you'll have to request executable memory specifically
if you want some.
2019-12-25 22:41:34 +01:00
Andreas Kling
52deb09382 Kernel: Enable PAE (Physical Address Extension)
Introduce one more (CPU) indirection layer in the paging code: the page
directory pointer table (PDPT). Each PageDirectory now has 4 separate
PageDirectoryEntry arrays, governing 1 GB of VM each.

A really neat side-effect of this is that we can now share the physical
page containing the >=3GB kernel-only address space metadata between
all processes, instead of lazily cloning it on page faults.

This will give us access to the NX (No eXecute) bit, allowing us to
prevent execution of memory that's not supposed to be executed.
2019-12-25 13:35:57 +01:00
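The extra level of indirection from 52deb09382 shows up in how a 32-bit virtual address is split under PAE with 4 KB pages: 2 bits select one of the 4 page directories, 9 bits select a page directory entry, 9 bits select a page table entry, and 12 bits are the offset within the page. The struct and function names below are hypothetical.

```cpp
#include <cstdint>

struct PaeIndices {
    std::uint32_t pdpt_index;  // bits 31..30: which of the 4 page directories
    std::uint32_t pd_index;    // bits 29..21: entry within that directory
    std::uint32_t pt_index;    // bits 20..12: entry within the page table
    std::uint32_t page_offset; // bits 11..0: byte within the 4 KB page
};

inline PaeIndices split_pae_address(std::uint32_t vaddr)
{
    return PaeIndices {
        (vaddr >> 30) & 0x3u,
        (vaddr >> 21) & 0x1ffu,
        (vaddr >> 12) & 0x1ffu,
        vaddr & 0xfffu,
    };
}
```

Every address at or above 3 GB has pdpt_index 3, which is why one page directory can cover the kernel-only range described above.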
Conrad Pankoff
0fdbe08637 Kernel: Fix debug message and kernel stack region names in thread setup 2019-12-24 01:28:38 +01:00
Conrad Pankoff
0cb89f5927 Kernel: Mark kernel stack regions as... stack regions 2019-12-24 01:28:38 +01:00
Conrad Pankoff
b557aab884 Kernel: Move ring0 stacks out of kmalloc_eternal
This allows us to use all the same fun memory protection features as the
rest of the system for ring0 processes. Previously a ring0 process could
over- or underrun its stack and nobody cared, since kmalloc_eternal is the
wild west of memory.
2019-12-24 01:28:38 +01:00
Conrad Pankoff
3aaeff483b Kernel: Add a size argument to validate_read_from_kernel 2019-12-24 01:28:38 +01:00
Andreas Kling
523fd6533e Kernel: Unlock the Process when exit()ing
If there are more threads in a process when exit()ing, we need to give
them a chance to unwind any kernel stacks. This means we have to unlock
the process lock before giving control to the scheduler.

Fixes #891 (together with all of the other "no more main thread" work.)
2019-12-22 12:38:01 +01:00
Andreas Kling
f4978b2be1 Kernel: Use IntrusiveList to make WaitQueue allocation-free :^) 2019-12-22 12:38:01 +01:00
Andreas Kling
4b8851bd01 Kernel: Make TID's be unique PID's
This is a little strange, but it's how I understand things should work.

The first thread in a new process now has TID == PID.
Additional threads subsequently spawned in that process all have unique
TID's generated by the PID allocator. TIDs are now globally unique.
2019-12-22 12:38:01 +01:00
Andreas Kling
16812f0f98 Kernel: Get rid of "main thread" concept
The idea of all processes reliably having a main thread was nice in
some ways, but cumbersome in others. More importantly, it didn't match
up with POSIX thread semantics, so let's move away from it.

This patch gets rid of Process::main_thread() and now we just have
a bunch of Thread objects floating around each Process.

When the finalizer nukes the last Thread in a Process, it will also
tear down the Process.

There's a bunch of more things to fix around this, but this is where we
get started :^)
2019-12-22 12:37:58 +01:00
Andreas Kling
3012b224f0 Kernel: Fix intermittent assertion failure in sys$exec()
While setting up the main thread stack for a new process, we'd incur
some zero-fill page faults. This was to be expected, since we allocate
a huge stack but lazily populate it with physical pages.

The problem is that page fault handlers may enable interrupts in order
to grab a VMObject lock (or to page in from an inode.)

During exec(), a process is reorganizing itself and will be in a very
unrunnable state if the scheduler should interrupt it and then later
ask it to run again. Which is exactly what happens if the process gets
pre-empted while the new stack's zero-fill page fault grabs the lock.

This patch fixes the issue by creating new main thread stacks before
disabling interrupts and going into the critical part of exec().
2019-12-18 23:03:23 +01:00
Andreas Kling
7a64f55c0f Kernel: Fix get_register_dump_from_stack() after IRQ entry changes
I had to change the layout of RegisterDump a little bit to make the new
IRQ entry points work. This broke get_register_dump_from_stack() which
was expecting the RegisterDump to be badly aligned due to a goofy extra
16 bits which are no longer there.
2019-12-15 17:58:53 +01:00
Andreas Kling
b32e961a84 Kernel: Implement a simple process time profiler
The kernel now supports basic profiling of all the threads in a process
by calling profiling_enable(pid_t). You finish the profiling by calling
profiling_disable(pid_t).

This all works by recording thread stacks when the timer interrupt
fires and the current thread is in a process being profiled.
Note that symbolication is deferred until profiling_disable() to avoid
adding more noise than necessary to the profile.

A simple "/bin/profile" command is included here that can be used to
start/stop profiling like so:

    $ profile 10 on
    ... wait ...
    $ profile 10 off

After a profile has been recorded, it can be fetched in /proc/profile

There are various limits (or "bugs") on this mechanism at the moment:

- Only one process can be profiled at a time.
- We allocate 8MB for the samples, if you use more space, things will
  not work, and probably break a bit.
- Things will probably fall apart if the profiled process dies during
  profiling, or while extracting /proc/profile
2019-12-11 20:36:56 +01:00
Andrew Kaster
9058962712 Kernel: Allow setting thread names
The main thread of each kernel/user process will take the name of
the process. Extra threads will get a fancy new name
"ProcessName[<tid>]".

Thread backtraces now list the thread name in addition to tid.

Add the thread name to /proc/all (should it get its own proc
file?).

Add two new syscalls, set_thread_name and get_thread_name.
2019-12-08 14:09:29 +01:00
Andreas Kling
8bb98aa31b Kernel: Use a WaitQueue to implement finalizer wakeup
This gets rid of the special "Lurking" thread state and replaces it
with a generic WaitQueue :^)
2019-12-01 19:17:17 +01:00
Andreas Kling
5859e16e53 Kernel: Use a dedicated thread state for wait-queued threads
Instead of using the generic block mechanism, wait-queued threads now
go into the special Queued state.

This fixes an issue where signal dispatch would unblock a wait-queued
thread (because signal dispatch unblocks blocked threads) and cause
confusion since the thread only expected to be awoken by the queue.
2019-12-01 16:02:58 +01:00
Andreas Kling
f067730f6b Kernel: Add a WaitQueue for Thread queueing/waking and use it for Lock
The kernel's Lock class now uses a proper wait queue internally instead
of just having everyone wake up regularly to try to acquire the lock.

We also keep the donation mechanism, so that whenever someone tries to
take the lock and fails, that thread donates the remainder of its
timeslice to the current lock holder.

After unlocking a Lock, the unlocking thread calls WaitQueue::wake_one,
which unblocks the next thread in queue.
2019-12-01 12:07:43 +01:00
Andreas Kling
f75a6b9daa Kernel: Demangle kernel C++ symbols correctly again
I broke this while implementing module linking. Also move the actual
demangling work to AK, in AK::demangle(const char*)
2019-11-29 14:59:15 +01:00
Andreas Kling
e34ed04d1e Kernel+LibPthread+LibC: Create secondary thread stacks in userspace
Have pthread_create() allocate a stack and pass it to the kernel
instead of this work happening in the kernel. The more of this we can
do in userspace, the better.

This patch also unexposes the raw create_thread() and exit_thread()
syscalls since they are now only used by LibPthread anyway.
2019-11-17 17:29:20 +01:00
Andreas Kling
794758df3a Kernel: Implement some basic stack pointer validation
VM regions can now be marked as stack regions, which is then validated
on syscall, and on page fault.

If a thread is caught with its stack pointer pointing into anything
that's *not* a Region with its stack bit set, we'll crash the whole
process with SIGSTKFLT.

Userspace must now allocate custom stacks by using mmap() with the new
MAP_STACK flag. This mechanism was first introduced in OpenBSD, and now
we have it too, yay! :^)
2019-11-17 12:15:43 +01:00
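The check described in 794758df3a amounts to finding the region that contains the stack pointer and testing its stack bit. The sketch below assumes non-overlapping regions and uses hypothetical names (RegionSketch, stack_pointer_is_valid); how the kernel actually looks up regions is not shown in the commit message.

```cpp
#include <cstdint>
#include <vector>

struct RegionSketch {
    std::uint32_t base;
    std::uint32_t size;
    bool is_stack; // set for regions mapped with MAP_STACK

    bool contains(std::uint32_t address) const
    {
        return address >= base && address - base < size;
    }
};

// True only if the stack pointer lies inside a region marked as a stack.
// A pointer outside every region is also rejected.
inline bool stack_pointer_is_valid(const std::vector<RegionSketch>& regions, std::uint32_t esp)
{
    for (const auto& region : regions) {
        if (region.contains(esp))
            return region.is_stack;
    }
    return false;
}
```

In the kernel, a failed check on syscall entry or page fault is what leads to the process being crashed with SIGSTKFLT.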
Andreas Kling
73d6a69b3f Kernel: Release the big process lock while yielding in sys$yield()
Otherwise, a thread calling sched_yield() will prevent other threads
in that process from entering the kernel.
2019-11-16 12:18:59 +01:00
Andreas Kling
cb5021419e Kernel: Move Thread::m_joinee_exit_value into the JoinBlocker
There's no need for this to be a permanent Thread member. Just use a
reference in the JoinBlocker instead.
2019-11-14 21:04:34 +01:00
Andreas Kling
69efa3f630 Kernel+LibPthread: Implement pthread_join()
It's now possible to block until another thread in the same process has
exited. We can also retrieve its exit value, which is whatever value it
passed to pthread_exit(). :^)
2019-11-14 20:58:23 +01:00
Sergey Bugaev
1e1ddce9d8 Kernel: Unwind kernel stacks before dying
While executing in the kernel, a thread can acquire various resources
that need cleanup, such as locks and references to RefCounted objects.
This cleanup normally happens on the exit path, such as in destructors
for various RAII guards. But we weren't calling those exit paths when
killing threads that have been executing in the kernel, such as threads
blocked on reading or sleeping, thus causing leaks.

This commit changes how killing threads works. Now, instead of killing
a thread directly, one is supposed to call thread->set_should_die(),
which will unblock it and make it unwind the stack if it is blocked
in the kernel. Then, just before returning to the userspace, the thread
will automatically die.
2019-11-14 20:05:58 +01:00
Andreas Kling
083c5f8b89 Kernel: Rework Process::Priority into ThreadPriority
Scheduling priority is now set at the thread level instead of at the
process level.

This is a step towards allowing processes to set different priorities
for threads. There's no userspace API for that yet, since only the main
thread's priority is affected by sched_setparam().
2019-11-06 16:30:06 +01:00
Andreas Kling
49635e62fa LibELF: Move AK/ELF/ into Libraries/LibELF/
Let's arrange things like this instead. It didn't feel right for all of
the ELF handling code to live in AK.
2019-11-06 13:42:38 +01:00
Drew Stratford
5efbb4ae95 Kernel: Fix bug in Thread::dispatch_signal().
dispatch_signal() expected a RegisterDump on the kernel stack. However
in certain cases, like just after a clone, this was not the case and
dispatch_signal() would instead write to an incorrect user stack pointer.

We now use the thread's TSS in situations where the RegisterDump may not
be valid, fixing the issue.
2019-11-04 10:12:59 +01:00
Drew Stratford
44f22c99ef Thread.cpp: add method get_RegisterDump_from_stack().
This refactors some of the RegisterDump code from dispatch_signal
into a stand-alone function, allowing for better reuse.
2019-11-04 10:12:59 +01:00
Andreas Kling
cc68654a44 Kernel+LibC: Implement clock_gettime() and clock_nanosleep()
Only the CLOCK_MONOTONIC clock is supported at the moment, and it only
has millisecond precision. :^)
2019-11-02 19:34:06 +01:00
Andreas Kling
904c871727 Kernel: Allow userspace stacks to grow up to 4 MB by default
Make userspace stacks lazily allocated and allow them to grow up to
4 megabytes. This avoids a lot of silly crashes we were running into
with software expecting much larger stacks. :^)
2019-10-31 13:57:07 +01:00
Andrew Kaster
98c86e5109 Kernel: Move E2BIG calculation from Thread to Process
Thread::make_userspace_stack_for_main_thread is only ever called from
Process::do_exec, after all the fun ELF loading and TSS setup has
occurred.

The calculations in there that check if the combined argv + envp
size will exceed the default stack size are not used in the rest of
the stack setup. So, it should be safe to move this to the beginning
of do_exec and bail early with -E2BIG, just like the man pages say.

Additionally, advertise this limit in limits.h to be a good POSIX.1
citizen. :)
2019-10-23 07:45:41 +02:00
Andreas Kling
40beb4c5c0 Kernel: Don't leak an FPU state buffer for every spawned thread
We were leaking 512 bytes of kmalloc memory for every new thread.
This patch fixes that, and also makes sure to zero out the FPU state
buffer after allocating it, and finally also makes the LogStream
operator<< for Thread look a little bit nicer. :^)
2019-10-13 14:36:55 +02:00
Drew Stratford
c136fd3fe2 Kernel: Send SIGSEGV on seg-fault
Now programs can catch the SIGSEGV signal when they segfault.

This commit also introduced the send_urgent_signal_to_self method,
which is needed to send signals to a thread when handling exceptions
caused by the same thread.
2019-10-07 16:39:47 +02:00
Andreas Kling
d5f3972012 Kernel: No need to manually deallocate kernel stack Region in ~Thread()
Since we're keeping this Region in an OwnPtr, it will be torn down when
we get to ~OwnPtr anyway.
2019-09-27 19:10:52 +02:00
Drew Stratford
b65bedd610 Kernel: Change m_blockers to m_blocker.
Because of the way signals now work there should
not be more than one blocker per thread. This
changes the blocker and thread class to reflect
that.
2019-09-09 08:35:43 +02:00
Drew Stratford
e529042895 Kernel: Remove redundant kernel/user signal stacks.
Due to the changes in signal handling m_kernel_stack_for_signal_handler_region
and m_signal_stack_user_region are no longer necessary, and so, have been
removed. I've also removed the similarly redundant m_tss_to_resume_kernel.
2019-09-09 08:35:43 +02:00
Andreas Kling
e386579436 Kernel: Fix bitrotted code behind #ifdef SIGNAL_DEBUG 2019-09-08 14:29:59 +02:00
Andreas Kling
899233a925 Kernel: Handle running programs that don't have a TLS image
Programs without a PT_TLS header won't have a master TLS image for us
to copy, so we shouldn't try to copy the m_master_tls_region then.
2019-09-07 17:06:25 +02:00
Andreas Kling
ec6bceaa08 Kernel: Support thread-local storage
This patch adds support for TLS according to the x86 System V ABI.
Each thread gets a thread-specific memory region, and the GS segment
register always points _to a pointer_ to the thread-specific memory.

In other words, to access thread-local variables, userspace programs
start by dereferencing the pointer at [gs:0].

The Process keeps a master copy of the TLS segment that new threads
should use, and when a new thread is created, they get a copy of it.
It's basically whatever the PT_TLS program header in the ELF says.
2019-09-07 15:55:36 +02:00
Drew Stratford
95fe775d81 Kernel: Add SysV stack alignment to signal trampoline
In both dispatch_signal and asm_signal_trampoline we
now ensure that the stack is 16-byte aligned, as per
the System V ABI.
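
As an illustration of the alignment step only (not the kernel's actual code), a stack pointer can be rounded down to a 16-byte boundary by clearing its low four bits. The helper name align_down_16 is hypothetical; rounding down is used because the x86 stack grows toward lower addresses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round a stack pointer down to a 16-byte boundary.
constexpr std::uintptr_t align_down_16(std::uintptr_t sp)
{
    return sp & ~static_cast<std::uintptr_t>(15);
}

static_assert(align_down_16(0x1007) == 0x1000, "low bits are cleared");
static_assert(align_down_16(0x1010) == 0x1010, "aligned values are unchanged");
```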
2019-09-05 16:37:09 +02:00
Drew Stratford
81d0f96f20 Kernel: Use user stack for signal handlers.
This commit drastically changes how signals are handled.

In the case that an unblocked thread is signaled it works much
in the same way as previously. However, when a blocking syscall
is interrupted, we set up the signal trampoline on the user
stack, complete the blocking syscall, return down the kernel
stack and then jump to the handler. This means that from the
kernel stack's perspective, we only ever get one system call deep.

The signal trampoline has also been changed in order to properly
store the return value from system calls. This is necessary due
to the new way we exit from signaled system calls.
2019-09-05 16:37:09 +02:00
Drew Stratford
259a1d56b0 Thread: added member m_kernel_stack_top.
This value stores the top of a thread's kernel stack.
2019-09-05 16:37:09 +02:00
Andreas Kling
77737be7b3 Kernel: Stop eagerly loading entire executables
We were forced to do this because the page fault code would fall apart
when trying to generate a backtrace for a non-current thread.

This issue has been fixed for a while now, so let's go back to lazily
loading executable pages which should make everything a little better.
2019-08-15 10:29:44 +02:00
Andreas Kling
83fdad25ed Kernel: For signal-killed threads, dump backtrace from finalizer thread
Instead of dumping the dying thread's backtrace in the signal handling
code, wait until we're finalizing the thread. Since signalling happens
during scheduling, the less work we do there the better.

Basically the less that happens during a scheduler pass the better. :^)
2019-08-06 19:45:08 +02:00
Andreas Kling
5e01ebfc56 Kernel: Clean up thread stacks when a thread dies
We were forgetting where we put the userspace thread stacks, so added a
member called Thread::m_userspace_thread_stack to keep track of it.

Then, in ~Thread(), we now deallocate the userspace, kernel and signal
stacks (if present.)

Out of curiosity, the "init_stage2" process doesn't have a kernel stack
which I found surprising. :^)
2019-08-01 20:17:12 +02:00
Andreas Kling
3ad6ae1842 Kernel: Delete non-main threads immediately after finalizing them
Previously we would wait until the whole process died before actually
deleting its threads.
2019-08-01 20:01:23 +02:00
Andreas Kling
be4d33fb2c Kernel+LibC: A lot of the signal handling code was off-by-one.
There is no signal 0. The valid ones are 1 (SIGHUP) through 31 (SIGSYS).
Found by PVS-Studio.
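
The range check can be sketched as below. The function names are hypothetical, and mapping signal N to bit N-1 of a 32-bit mask is an assumption made for illustration; the commit only states the valid range.

```cpp
#include <cassert>
#include <cstdint>

constexpr int first_signal = 1;  // SIGHUP
constexpr int last_signal = 31;  // SIGSYS, per the commit message

// Signal 0 is not a real signal, so the valid range starts at 1.
constexpr bool is_valid_signal(int signal)
{
    return signal >= first_signal && signal <= last_signal;
}

// Assumed layout: signal N occupies bit N-1 of a 32-bit mask.
// Only call this with a signal that passed is_valid_signal().
constexpr std::uint32_t signal_bit(int signal)
{
    return std::uint32_t(1) << (signal - 1);
}
```

Treating the signal number itself as the bit index, or looping from 0, is the kind of off-by-one this class of bug comes from.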
2019-08-01 11:03:48 +02:00
Andreas Kling
a79d8d8ae5 Kernel: Add (expensive) but valuable userspace symbols to stacks.
This is expensive because we have to page in the entire executable for every
process up front for this to work. This is due to the page fault code not
being strong enough to run while another process is active.

Note that we already had userspace symbols in *crash* stacks. This patch
adds them generally, so they show up in /proc, Process Manager, etc.

There's room for improvement here, but the debugging benefits way overshadow
the performance penalty right now. :^)
2019-07-27 12:02:56 +02:00
Andreas Kling
4316fa8123 Kernel: Dump backtrace to debugger for DefaultSignalAction::DumpCore.
This makes assertion failures generate backtraces again. Sorry to everyone
who suffered from the lack of backtraces lately. :^)

We share code with the /proc/PID/stack implementation. You can now get the
current backtrace for a Thread via Thread::backtrace(), and all the traces
for a Process via Process::backtrace().
2019-07-25 21:02:19 +02:00
Robin Burchell
342f7a6b0f Move runnable/non-runnable list control entirely over to Scheduler
This way, we can change how the scheduler works without having to change Thread too.
2019-07-22 09:42:39 +02:00
Robin Burchell
dea7f937bf Scheduler: Allow reentry into block()
With the presence of signal handlers, it is possible that a thread might
be blocked multiple times. Picture for instance a signal handler using
read(), or wait() while the thread is already blocked elsewhere before
the handler is invoked.

To fix this, we turn m_blocker into a chain of handlers. Each block()
call now prepends to the list, and unblocking will only consider the
most recent (first) blocker in the chain.

Fixes #309
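
The chain described above can be modelled roughly as follows. This is a simplified sketch, not the kernel's code: Blocker, BlockingThread, the ready flag, and try_unblock() are hypothetical stand-ins.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Blocker {
    bool ready = false;            // stand-in for "the blocking condition is met"
    std::unique_ptr<Blocker> next; // the older blocker this one was stacked on
};

struct BlockingThread {
    std::unique_ptr<Blocker> blocker_chain;

    // Each block() call prepends a new blocker to the chain.
    Blocker& block()
    {
        auto blocker = std::make_unique<Blocker>();
        blocker->next = std::move(blocker_chain);
        blocker_chain = std::move(blocker);
        return *blocker_chain;
    }

    // Only the most recent (first) blocker is considered for unblocking.
    bool try_unblock()
    {
        if (!blocker_chain || !blocker_chain->ready)
            return false;
        blocker_chain = std::move(blocker_chain->next);
        return true;
    }
};
```

In this model an older blocker that becomes ready has to wait until every newer one has been popped, which matches the nested "signal handler blocks while the thread was already blocked" case.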
2019-07-21 12:42:22 +02:00
Robin Burchell
d48c73b10a Thread: Cleanup m_blocker handling
The only two places we set m_blocker now are Thread::set_state(), and
Thread::block(). set_state is mostly just an issue of clarity: we don't
want to end up with state() != Blocked with an m_blocker, because that's
weird. It's also possible: if we yield, someone else may set_state() us.

We also now set_state() and set m_blocker under lock in block(), rather
than unlocking which might allow someone else to mess with our internals
while we're in the process of trying to block.

This seems to fix sending STOP & CONT causing a panic.

My guess as to what was happening is this:

    thread A blocks in select(): Blocking & m_blocker != nullptr
    thread B sends SIGSTOP: Stopped & m_blocker != nullptr
    thread B sends SIGCONT: we continue execution. Runnable & m_blocker != nullptr
    thread A tries to block in select() again:
        * sets m_blocker
        * unlocks (in block_helper)
        * someone else tries to unblock us? maybe from the old m_blocker? unclear -- clears m_blocker
        * sets Blocked (while unlocked!)

So, thread A is left with state Blocked & m_blocker == nullptr, leading
to the scheduler assert (m_blocker != nullptr) failing.

Long story short, let's do all our data management with the lock _held_.
2019-07-20 19:31:52 +02:00
Robin Burchell
96de90ceef Net: Merge Thread::wait_for_connect into LocalSocket (as the only place that uses it)
Also do this more like other blockers, don't call yield ourselves, as
block will do that for us.
2019-07-20 12:15:24 +02:00
Robin Burchell
833d444cd8 Thread: Return a result from block() indicating why the block terminated
And use this to return EINTR in various places; some of which we were
not handling properly before.

This might expose a few bugs in userspace, but should be more compatible
with other POSIX systems, and is certainly a little cleaner.
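
A rough sketch of how a syscall path might consume such a result is shown below. The enum, its enumerator names, and the helper function are hypothetical; only EINTR comes from the commit message.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical result type: why did the block end?
enum class BlockResult {
    WokeNormally,
    InterruptedBySignal,
};

// Hypothetical syscall tail: translate the block result into a return value.
inline int syscall_result_after_block(BlockResult result, int success_value)
{
    if (result == BlockResult::InterruptedBySignal)
        return -EINTR;
    return success_value;
}
```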
2019-07-20 12:15:24 +02:00
Andreas Kling
f8beb0f665 Kernel: Share the "return to ring 0/3 from signal" trampolines globally.
Generate a special page containing the "return from signal" trampoline code
on startup and then route signalled threads to it. This avoids a page
allocation in every process that ever receives a signal.
2019-07-19 17:01:16 +02:00
Robin Burchell
53262cd08b AK: Introduce IntrusiveList
And use it in the scheduler.

IntrusiveList is similar to InlineLinkedList, except that rather than
making assertions about the type (and requiring inheritance), it
provides an IntrusiveListNode type that can be used to put an instance
into many different lists at once.

As a proof of concept, port the scheduler over to use it. The only
downside here is that the "list" global needs to know the position of
the IntrusiveListNode member, so we have to position things a little
awkwardly to make that happen. We also move the runnable lists to
Thread, to avoid having to publicize the node.
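
The idea can be illustrated with the simplified sketch below. It is not AK's implementation: ListNode, SimpleIntrusiveList, and Task are hypothetical names, and the node stores an owner pointer instead of computing the owner from the member's position, which keeps the example short.

```cpp
#include <cassert>
#include <cstddef>

// A node embedded in the object itself, so insertion allocates nothing.
struct ListNode {
    ListNode* next = nullptr;
    void* owner = nullptr; // simplification: remember the containing object
};

// The list is told which node member to use via a pointer-to-member.
template<typename T, ListNode T::*member>
class SimpleIntrusiveList {
public:
    void append(T& item)
    {
        ListNode& node = item.*member;
        node.owner = &item;
        node.next = nullptr;
        if (m_tail)
            m_tail->next = &node;
        else
            m_head = &node;
        m_tail = &node;
    }

    T* first() const { return m_head ? static_cast<T*>(m_head->owner) : nullptr; }

    std::size_t size() const
    {
        std::size_t count = 0;
        for (ListNode* node = m_head; node; node = node->next)
            ++count;
        return count;
    }

private:
    ListNode* m_head = nullptr;
    ListNode* m_tail = nullptr;
};

// One node member per list lets the same object sit in several lists at once.
struct Task {
    int id = 0;
    ListNode runnable_node;
    ListNode all_tasks_node;
};

using RunnableList = SimpleIntrusiveList<Task, &Task::runnable_node>;
using AllTasksList = SimpleIntrusiveList<Task, &Task::all_tasks_node>;
```

No inheritance from a list base class is needed, which is the contrast with InlineLinkedList drawn above.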
2019-07-19 15:42:30 +02:00
Andreas Kling
705cd2491c Kernel: Some small refinements to the thread blockers.
Committing some things my hands did while browsing through this code.

- Mark all leaf classes "final".
- FileDescriptionBlocker now stores a NonnullRefPtr<FileDescription>.
- FileDescriptionBlocker::blocked_description() now returns a reference.
- ConditionBlocker takes a Function&&.
2019-07-19 13:19:47 +02:00
Robin Burchell
e74dce65e6 Thread: Normalize all for_each constructs to use IterationDecision
This way a caller can abort the for_each early if they want.
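
The pattern looks roughly like this. The free function for_each_item and the use of std::vector are assumptions for illustration; the kernel's for_each helpers iterate over threads rather than integers.

```cpp
#include <cassert>
#include <vector>

enum class IterationDecision {
    Continue,
    Break,
};

// Calls `callback` for each item until it asks to stop.
template<typename Callback>
IterationDecision for_each_item(const std::vector<int>& items, Callback callback)
{
    for (int item : items) {
        if (callback(item) == IterationDecision::Break)
            return IterationDecision::Break;
    }
    return IterationDecision::Continue;
}
```

Returning the decision from the outer function also lets nested for_each helpers propagate an early exit to their own callers.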
2019-07-19 13:19:02 +02:00
Robin Burchell
cd76b691fb Kernel: Remove memory allocations from the new Blocker API 2019-07-19 11:03:22 +02:00
Robin Burchell
99c5377653 Kernel: Remove old block(State) API
New API should be used always :)
2019-07-19 11:03:22 +02:00
Robin Burchell
762333ba95 Kernel: Restore state strings for block states
"Blocking" is not terribly informative, but now that everything is
ported over, we can force the blocker to provide us with a reason.

This does mean that to_string(State) needed to become a member, but
that's OK.
2019-07-19 11:03:22 +02:00
Robin Burchell
b13f1699fc Kernel: Rename Condition state to Blocked now we only have one blocking mechanism :) 2019-07-19 11:03:22 +02:00
Robin Burchell
d2ca91c024 Kernel: Convert BlockedSignal and BlockedLurking to the new Blocker mechanism
The last two of the old block states gone :)
2019-07-19 11:03:22 +02:00
Robin Burchell
52743f9eec Kernel: Rename ThreadBlocker classes to avoid stutter
Thread::ThreadBlockerFoo is a lot less nice to read than Thread::FooBlocker
2019-07-19 11:03:22 +02:00
Robin Burchell
782e4ee6e1 Kernel: Port wait to ThreadBlocker 2019-07-19 11:03:22 +02:00
Robin Burchell
4f9ae9b970 Kernel: Port select to ThreadBlocker 2019-07-19 11:03:22 +02:00
Robin Burchell
32fcfb79e9 Kernel: Port sleep to ThreadBlocker 2019-07-19 11:03:22 +02:00
Robin Burchell
0c8813e6d9 Kernel: Introduce ThreadBlocker as a way to make unblocking neater :)
And port all the descriptor-based blocks over to it as a proof of concept.
2019-07-19 11:03:22 +02:00
Robin Burchell
f2fdac789c Kernel: Add a new block state for accept() on a blocking socket
Rather than asserting, which really ruins everyone's day.
2019-07-18 10:56:49 +02:00
Robin Burchell
4f94fbc9e1 Kernel: Split SCHEDULER_DEBUG into a new SCHEDULER_RUNNABLE_DEBUG
And use dbgprintf() consistently on a few of the pieces of logging here.

This is useful when trying to track thread switching when you don't
really care about what it's switching _to_.
2019-07-17 14:23:15 +02:00
Andreas Kling
b2e502e533 Kernel: Add Thread::block_until(Condition).
Replace the class-based snooze alarm mechanism with a per-thread callback.
This makes it easy to block the current thread on an arbitrary condition:

    void SomeDevice::wait_for_irq() {
        m_interrupted = false;
        current->block_until([this] { return m_interrupted; });
    }
    void SomeDevice::handle_irq() {
        m_interrupted = true;
    }

Use this in the SB16 driver, and in NetworkTask :^)
2019-07-14 14:54:54 +02:00
Andreas Kling
54e79a4640 Kernel: Make it easier to add Thread block states in the future. 2019-07-13 20:14:39 +02:00
Andreas Kling
4d904340b4 Kernel: Don't interrupt blocked syscalls to dispatch ignored signals.
This was just causing syscalls to return EINTR for no reason.
2019-07-08 18:59:48 +02:00
Andreas Kling
27f699ef0c AK: Rename the common integer typedefs to make it obvious what they are.
These types can be picked up by including <AK/Types.h>:

* u8, u16, u32, u64 (unsigned)
* i8, i16, i32, i64 (signed)
2019-07-03 21:20:13 +02:00
Andreas Kling
e7ce4514ec Kernel: Disable interrupts in Thread::set_state().
We don't want to get interrupted while we're manipulating the thread lists.
2019-06-30 11:42:27 +02:00
Andreas Kling
c1bbd40b9e Kernel: Rename "descriptor" to "description" where appropriate.
Now that FileDescription is called that, variables of that type should not
be called "descriptor". This is kinda wordy but we'll get used to it.
2019-06-13 22:03:04 +02:00
Andreas Kling
39d1a9ae66 Meta: Tweak .clang-format to not wrap braces after enums. 2019-06-07 17:13:23 +02:00
Andreas Kling
e42c3b4fd7 Kernel: Rename LinearAddress => VirtualAddress. 2019-06-07 12:56:50 +02:00
Andreas Kling
bc951ca565 Kernel: Run clang-format on everything. 2019-06-07 11:43:58 +02:00
Andreas Kling
08cd75ac4b Kernel: Rename FileDescriptor to FileDescription.
After reading a bunch of POSIX specs, I've learned that a file descriptor
is the number that refers to a file description, not the description itself.
So this patch renames FileDescriptor to FileDescription, and Process now has
FileDescription* file_description(int fd).
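
The distinction can be modelled with the toy sketch below. ProcessModel, open_file(), and duplicate() are hypothetical, and std::shared_ptr stands in for the kernel's own reference counting; only the file_description(int fd) accessor mirrors the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// The description holds the open-file state, such as the current offset.
struct FileDescription {
    std::size_t offset = 0;
};

struct ProcessModel {
    // The descriptor is just an index into this table.
    std::vector<std::shared_ptr<FileDescription>> fds;

    int open_file()
    {
        fds.push_back(std::make_shared<FileDescription>());
        return static_cast<int>(fds.size()) - 1;
    }

    // A duplicated descriptor is a new number for the same description.
    int duplicate(int fd)
    {
        std::shared_ptr<FileDescription> description = fds.at(static_cast<std::size_t>(fd));
        fds.push_back(std::move(description));
        return static_cast<int>(fds.size()) - 1;
    }

    FileDescription* file_description(int fd)
    {
        if (fd < 0 || static_cast<std::size_t>(fd) >= fds.size())
            return nullptr;
        return fds[static_cast<std::size_t>(fd)].get();
    }
};
```

Two descriptors that refer to one description share its offset, which is why the two terms are worth keeping apart.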
2019-06-07 09:36:51 +02:00
Andreas Kling
8098d2e337 Kernel: If a signal is ignored, make sure we unset BlockedSignal state. 2019-05-22 13:23:41 +02:00
Andreas Kling
c9a9ca0dfe Kernel: Bump kernel stacks to 64 KB.
This makes the ELF symbolication crash go away while I work out a smart fix.
2019-05-21 16:15:52 +02:00
Andreas Kling
7900da9667 Kernel: Make sure we never put the colonel thread in the runnable list.
This would cause it to get scheduled unnecessarily.
2019-05-18 20:28:04 +02:00
Andreas Kling
64a4f3df69 Kernel: Add a Thread::set_thread_list() helper to keep logic in one place. 2019-05-18 20:28:04 +02:00
Andreas Kling
8c7d5abdc4 Kernel: Refactor thread scheduling a bit, breaking it into multiple lists.
There are now two thread lists, one for runnable threads and one for non-
runnable threads. Thread::set_state() is responsible for moving threads
between the lists.

Each thread also has a back-pointer to the list it's currently in.
2019-05-18 20:28:04 +02:00
Andreas Kling
45ff3a7e6a Kernel: Make Thread::kernel_stack_base() work for kernel processes. 2019-05-17 03:43:51 +02:00
Andreas Kling
7c10a93d48 Kernel: Make allocate_kernel_region() commit the region automatically.
This means that kernel regions will eagerly get physical pages allocated.
It would be nice to zero-fill these on demand instead, but that would
require a bunch of MemoryManager changes.
2019-05-14 15:38:00 +02:00
Andreas Kling
486c675850 Kernel: Allocate kernel signal stacks using the region allocator as well. 2019-05-14 12:06:09 +02:00
Andreas Kling
c8a216b107 Kernel: Allocate kernel stacks for threads using the region allocator.
This patch moves away from using kmalloc memory for thread kernel stacks.
This reduces pressure on kmalloc (16 KB per thread adds up fast) and
prevents kernel stack overflow from scribbling all over random unrelated
kernel memory.
2019-05-14 11:51:00 +02:00
Andreas Kling
03da7046bd Kernel: Prepare Socket for becoming a File.
Make the Socket functions take a FileDescriptor& rather than a socket role
throughout the code. Also change threads to block on a FileDescriptor,
rather than either an fd index or a Socket.
2019-05-03 20:15:54 +02:00
Andreas Kling
0a0d739e98 Kernel: Make FIFO inherit from File. 2019-04-29 04:55:54 +02:00
Andreas Kling
c5c4e54a67 Kernel: Process destruction should destroy all child threads.
We were only destroying the main thread when a process died, leaving any
secondary threads around. They couldn't run, but because they were still
in the global thread list, strange things could happen since they had some
now-stale pointers to their old process.
2019-04-23 22:17:01 +02:00
Andreas Kling
5562ab3f5a Kernel: Remove some more unnecessary Thread members. 2019-04-20 19:29:48 +02:00
Andreas Kling
b2ebf6c798 Kernel: Shrink Thread by making kernel resume TSS heap-allocated. 2019-04-20 19:23:45 +02:00