Rolling with the theme of adding a dialog to shutdown the machine, it is
probably nice to have a way to reboot the machine without performing a full
system powerdown.
A reboot program has been added to `/bin/` as well as a corresponding
`syscall` (SC_reboot). This syscall works by attempting to pulse the 8042
keyboard controller. Note that this is NOT supported on new machines, and
should only be a fallback until we have proper ACPI support.
The implementation causes a triple fault in QEMU, which then restarts the
system. The filesystems are locked and synchronized before this occurs,
so there shouldn't be any corruption etctera.
This allows us to seal a buffer *before* anyone else has access to it
(well, ok, the creating process still does, but you can't win them all).
It also means that a SharedBuffer can be shared with multiple clients:
all you need is to have access to it to share it on again.
Exec doesn't leave through the syscall handler, so it didn't unlock the
big_lock. This means that reentering can lock it again, and then another
thread could endlessly yield waiting to acquire the lock (futilely).
This fixes AudioServer using 100% CPU.
We were locking the list of references, and then destroying the
reference, which made things go a little crazy.
It's more straightforward to just remove the per-reference lock: the
syscalls all have to lock the full list anyway, so let's just do that
and avoid the hassle.
While I'm at it, also move the SharedBuffer code out to its own file as it's
getting a little long and unwieldly, and Process.cpp is already huge.
Rather than limiting it to two shared processes, store a Vector of
references, so we can add more if we want. Makes the code a little
more generic.
No actual change to the syscall interface yet, so nothing takes
advantage of this yet.
This makes waitpid() return when a child process is stopped via a signal.
Use this in Shell to catch stopped children and return control to the
command line. :^)
Fixes#298.
This is obviously more readable. If we ever run into a situation where
ref count churn is actually causing trouble in the future, we can deal with
it then. For now, let's keep it simple. :^)
Instead of computing the path length inside the syscall handler, let the
caller do that work. This allows us to implement to new variants of open()
and creat(), called open_with_path_length() and creat_with_path_length().
These are suitable for use with e.g StringView.
String&& is just not very practical. Also return const String& when the
returned string is a member variable. The call site is free to make a copy
if he wants, but otherwise we can avoid the retain count churn.
After reading a bunch of POSIX specs, I've learned that a file descriptor
is the number that refers to a file description, not the description itself.
So this patch renames FileDescriptor to FileDescription, and Process now has
FileDescription* file_description(int fd).
Define the multiboot info struct properly so we don't have to grab at byte
offsets in the memory access checker code. Also print kernel command line
in init().
We definitely need to replace m_executable before clearing interrupts, since
otherwise we might call ~Custody() which would make it assert in locking.
Also avoid calling FileDescriptor::metadata() repeatedly and just cache the
result from the first call.
I also added a comment at the point where we've decided to commit to the new
executable and follow through with the swap.
The current working directory is now stored as a custody. Likewise for a
process executable file. This unbreaks /proc/PID/fd which has not been
working since we made the filesystem bigger.
This still needs a bunch of work, for instance when renaming or removing
a file somewhere, we have to update the relevant custody links.
Right now, we allow anything inside a user to raise or lower any other process's
priority. This feels simple enough to me. Linux disallows raising, but
that's annoying in practice.
By moving the sending (and receiving) thread into the BlockedSignal state,
we ensure that the thread doesn't continue executing until the signal
has been dispatched.
This is in preparation for eventually using it in userspace.
LinearAddress.h has not been moved for the time being (as it seems to be
only used by a very small part of the code).
Since we transition to a new PageDirectory on exec(), we need a matching
RangeAllocator to go with the new directory. Instead of juggling this in
Process and MemoryManager, simply attach the RangeAllocator to the
PageDirectory instead.
Fixes#61.
Passing this flag to recv() temporarily puts the file descriptor into
non-blocking mode.
Also implement LocalSocket::recv() as a simple forwarding to read().
select essentially has 3 modes (which is presumably why we're finding it
so hard to get this right in a reliable way :)).
1. NULL timeout -- no timeout on blocking
2. non-NULL timeout that is not zero'd -- timeout on blocking
3. non-NULL but zero timeout -- no blocking at all (immediate poll)
For cases 1 and 2, we want to block the thread. We have a timeout set
only for case 2, though.
Case 3 should not block the thread, and does not have a timeout set.
The scheduler expects m_select_timeout to act as a deadline. That is, it
should contain the time that a task should wake at -- but we were
directly copying the time from userspace, which meant that it always
returned virtually immediately.
At the same time, fix CEventLoop to not rely on the broken select behavior
by subtracting the current time from the time of the nearest timer.
Based on the description I read, this syscall doesn't seem completely
reasonable, but let's at least return a number that is likely to change
between invocations in case somebody depends on that happening.
These functions were doing exactly the same thing for range allocation, so
share that code in an allocate_range() helper.
Region allocation will now also fail if range allocation fails, which means
that mmap() can actually fail without falling apart. Exciting times!
This replaces the previous virtual address allocator which was basically
just "m_next_address += size;"
With this in place, virtual addresses can get reused, which cuts down on
the number of page tables created. When we implement ASLR some day, we'll
probably have to do page table deallocation, but for now page tables are
only deallocated once the process dies.
Stash away the ELFLoader used to load an executable in Process so we can use
it for symbolicating userspace addresses later on. This will make debugging
userspace programs a lot nicer. :^)
This patch moves away from using kmalloc memory for thread kernel stacks.
This reduces pressure on kmalloc (16 KB per thread adds up fast) and
prevents kernel stack overflow from scribbling all over random unrelated
kernel memory.
Make the Socket functions take a FileDescriptor& rather than a socket role
throughout the code. Also change threads to block on a FileDescriptor,
rather than either an fd index or a Socket.
We were only destroying the main thread when a process died, leaving any
secondary threads around. They couldn't run, but because they were still
in the global thread list, strange things could happen since they had some
now-stale pointers to their old process.
Calling systrace(pid) gives you a file descriptor with a stream of the
syscalls made by a peer process. The process must be owned by the same
UID who calls systrace(). :^)
I was originally implementing signals by looking at some man page about
sigaction() to see how it works. It seems like the restorer thingy is
system-specific and not required by POSIX, so let's get rid of it.
If connect() is called on a non-blocking socket, it will "fail" immediately
with -EINPROGRESS. After that, you select() on the socket and wait for it to
become writable.
We can't have multiple threads in the same process running in the kernel
at the same time, so let's have a per-process lock that threads have to
acquire on syscall entry/exit (and yield while blocked.)
This currently only works for "normal" processes created by fork().
It does not work for create_user_process() processes spawned by the
kernel, as those are a bit special during construction.
It's basically a userspace port of the kernel's Lock class.
Added gettid() and donate() syscalls to support the timeslice donation
feature we already enjoyed in the kernel.
This introduces a tiny amount of timer drift which I will have to fix
somehow eventually, but it's a huge improvement in timing consistency
as we no longer suddenly jump from e.g 10:45:49.123 to 10:45:50.000.
The scheduler now operates on threads, rather than on processes.
Each process has a main thread, and can have any number of additional
threads. The process exits when the main thread exits.
This patch doesn't actually spawn any additional threads, it merely
does all the plumbing needed to make it possible. :^)
Now that everything is nice and mature, the WindowServer can just use the
client PID it receives in the Greeting message, and we can get rid of this
hacky ioctl. :^)
This is accomplished using a new Alarm class and a BlockedSnoozing state.
Basically, you call Process::snooze_until(some_alarm) and then the scheduler
won't wake up the process until some_alarm.is_ringing() returns true.
We had a bug where an accepted socket would appear to be EOF/disconnected
on the accepting side until the connecting side had called attach_fd().
Fix this by adding a new SocketRole::Connecting state.
Only the receive timeout is hooked up yet. You can change the timeout by
calling setsockopt(..., SOL_SOCKET, SO_RCVTIMEO, ...).
Use this mechanism to make /bin/ping report timeouts.
Finally fixed the weird flaky crashing when resizing Terminal windows.
It was because we were dispatching a signal to "current" from the scheduler.
Yet another thing I dislike about even having a "current" process while
we're in the scheduler. Not sure yet how to fix this.
Let the signal handler's kernel stack be a kmalloc() allocation for now.
Once we can do allocation of consecutive physical pages in the supervisor
memory region, we can use that for all types of kernel stacks.
It's now possible to create symbolic links! :^)
This exposed an issue in Ext2FS where we'd write uninitialized data past
the end of an inode's content. Fix this by zeroing out the tail end of
the last block in a file.
The idea here is to combine a potential syscall error code with an arbitrary
type in the case of success. I feel like this will end up much less error
prone than returning some arbitrary type that kinda sorta has bool semantics
(but sometimes not really) and passing the error through an out-param.
This patch only converts a few syscalls to using it. More to come.
It was very confusing that you had to open a FileDescriptor in order to stat
a file. This patch gives VFS a separate stat() function and uses it to
implement the stat() and lstat() syscalls.
I set it up so that TIOCSWINSZ on a master PTY gets forwarded to the slave.
This feels intuitively right. Terminal can then use that to inform the shell
or whoever is inside the slave that the window size has changed.
TIOCSWINSZ also triggers the generation of a SIGWINCH signal. :^)
Turns out FD_CLOEXEC and O_CLOEXEC are different values. Silly mistake.
I noticed that Terminal's shell process still had the Terminal's window
server connection open, albeit in a broken state.
When the kernel performs a successful exec(), whatever was on the kernel
stack for that process before goes away. For this reason, we need to make
sure we don't have any stack objects holding onto kmalloc memory.
The mismatch between the two was causing some trouble if you'd mmap e.g 1KB
and then try to munmap() it. The kernel would whine that it couldn't find
any such mapping (because mmap() actually rounded the 1KB to a 4KB page.)
This is a monster patch that required changing a whole bunch of things.
There are performance and stability issues all over the place, but it works.
Pretty cool, I have to admit :^)