Commit Graph

458 Commits

Author SHA1 Message Date
Sergey Bugaev
9a41dda029 Kernel: Expose blocking and cloexec fd flags in ProcFS 2019-09-28 22:27:45 +02:00
Andreas Kling
2584636d19 Kernel: Fix partial munmap() deallocating still-in-use VM
We were always returning the full VM range of the partially-unmapped
Region to the range allocator. This caused us to re-use those addresses
for subsequent VM allocations.

This patch also skips creating a new VMObject in partial munmap().
Instead we just make split regions that point into the same VMObject.

This fixes the mysterious GCC ICE on large C++ programs.
2019-09-27 20:21:52 +02:00
Andreas Kling
7f9a33dba1 Kernel: Make Region single-owner instead of ref-counted
This simplifies the ownership model and makes Region easier to reason
about. Userspace Regions are now primarily kept by Process::m_regions.

Kernel Regions are kept in various OwnPtr<Regions>'s.

Regions now only ever get unmapped when they are destroyed.
2019-09-27 14:25:42 +02:00
Andreas Kling
7a7f6a24e9 Kernel: Fix bitrotted FORK_DEBUG logging code 2019-09-27 14:25:39 +02:00
Andreas Kling
bba24b09f7 Kernel: Avoid creating a temporary String("mmap") for every mmap() call 2019-09-22 19:47:00 +02:00
Drew Stratford
6e51ebad8c Kernel: Stop hardcoding syscall in signal trampoline.
We now no longer hardcode the sigreturn syscall in
the signal trampoline. Because of the way inline asm inputs
work, I've had to enclose the trampoline in the function
signal_trampoline_dummy.
2019-09-17 16:00:37 +02:00
Andreas Kling
1c692e87a6 Kernel: Move kmalloc() into a Kernel/Heap/ directory 2019-09-16 09:01:44 +02:00
Andreas Kling
85d629103d Kernel: Implement shebang executables ("#!/bin/sh")
This patch makes it possible to *run* text files that start with the
characters "#!" followed by an interpreter.

I've tested this with both the Serenity built-in shell and the Bash
shell, and it works as expected. :^)
2019-09-15 11:47:21 +02:00
Andreas Kling
2d1f3ec749 Kernel: fchdir() should fail for non-searchable directories
So sayeth POSIX.
2019-09-13 14:35:18 +02:00
Mauri de Souza Nunes
7d85fc00e4 Kernel: Implement fchdir syscall
The fchdir() function is equivalent to chdir() except that the
directory that is to be the new current working directory is
specified by a file descriptor.
2019-09-13 14:04:38 +02:00
Drew Stratford
e529042895 Kernel: Remove reduntant kernel/user signal stacks.
Due to the changes in signal handling m_kernel_stack_for_signal_handler_region
and m_signal_stack_user_region are no longer necessary, and so, have been
removed. I've also removed the similarly reduntant m_tss_to_resume_kernel.
2019-09-09 08:35:43 +02:00
Andreas Kling
23eafdb8d6 Kernel: waitpid() should unblock and -ECHILD if SIG_IGN reaps child 2019-09-08 14:01:00 +02:00
Andreas Kling
ec6bceaa08 Kernel: Support thread-local storage
This patch adds support for TLS according to the x86 System V ABI.
Each thread gets a thread-specific memory region, and the GS segment
register always points _to a pointer_ to the thread-specific memory.

In other words, to access thread-local variables, userspace programs
start by dereferencing the pointer at [gs:0].

The Process keeps a master copy of the TLS segment that new threads
should use, and when a new thread is created, they get a copy of it.
It's basically whatever the PT_TLS program header in the ELF says.
2019-09-07 15:55:36 +02:00
Drew Stratford
95fe775d81 Kernel: Add SysV stack alignment to signal trampoline
In both dispatch signal and asm_signal_trampoline we
now ensure that the stack is 16 byte aligned, as per
the System V ABI.
2019-09-05 16:37:09 +02:00
Drew Stratford
81d0f96f20 Kernel: Use user stack for signal handlers.
This commit drastically changes how signals are handled.

In the case that an unblocked thread is signaled it works much
in the same way as previously. However, when a blocking syscall
is interrupted, we set up the signal trampoline on the user
stack, complete the blocking syscall, return down the kernel
stack and then jump to the handler. This means that from the
kernel stack's perspective, we only ever get one system call deep.

The signal trampoline has also been changed in order to properly
store the return value from system calls. This is necessary due
to the new way we exit from signaled system calls.
2019-09-05 16:37:09 +02:00
Andreas Kling
e25ade7579 Kernel: Rename "vmo" to "vmobject" everywhere 2019-09-04 11:27:14 +02:00
Andreas Kling
6fe0fa30f2 Kernel: Fix broken passing of String as printf() argument in realpath() 2019-08-29 21:01:33 +02:00
Andreas Kling
d720388acf Kernel: Support partial munmap()
You can now munmap() a part of a region. The kernel will then create
one or two new regions around the "hole" and re-map them using the same
physical pages as before.

This goes towards fixing #175, but not all the way since we don't yet
do munmap() across multiple mappings.
2019-08-29 20:57:02 +02:00
Andreas Kling
f5d779f47e Kernel: Never forcibly page in entire executables
We were doing this for the initial kernel-spawned userspace process(es)
to work around instability in the page fault handler. Now that the page
fault handler is more robust, we can stop worrying about this.

Specifically, the page fault handler was previous not able to handle
getting a page fault in anything but the currently executing task's
page directory.
2019-08-26 13:20:01 +02:00
Andreas Kling
e29fd3cd20 Kernel: Display virtual addresses as V%p instead of L%x
The L was a leftover from when these were called linear addresses.
2019-08-26 11:31:58 +02:00
Rok Povsic
18fbe4ac83 Kernel: Add realpath syscall 2019-08-25 19:47:37 +02:00
Andreas Kling
32810a920f Kernel: Implement kill(0, signal)
This sends the signal to everyone in the same process group as the
calling process.
2019-08-23 18:28:59 +02:00
Andreas Kling
06de0e670c Kernel: Use IteratorDecision in Process::for_each_in_pgrp() 2019-08-23 18:28:59 +02:00
Sergey Bugaev
acccf9ccda Kernel: Move device lookup to Device class itself
Previously, VFS stored a list of all devices, and devices had to
register and unregister themselves with it. This cleans up things
a bit.
2019-08-18 15:59:59 +02:00
Andreas Kling
272bd1d3ef Kernel: Make crash dumps look aligned once again
This broke with the recent changes to make printf hex fields behave
a bit more correctly.
2019-08-17 21:29:46 +02:00
Andreas Kling
5f6b6c1665 Kernel: Do the umount() by the guest's root inode identifier
It was previously possible to unmount a filesystem mounted on /mnt by
doing e.g "umount /mnt/some/path".
2019-08-17 14:28:13 +02:00
Sergey Bugaev
425c356288 Kernel+LibC+Userland: Support mounting other kinds of filesystems 2019-08-17 12:07:55 +02:00
Jesse Buhagiar
bc22456f89 Kernel: Added unmount ability to VFS
It is now possible to unmount file systems from the VFS via `umount`.
It works via looking up the `fsid` of the filesystem from the `Inode`'s
metatdata so I'm not sure how fragile it is. It seems to work for now
though as something to get us going.
2019-08-17 09:29:54 +02:00
Andreas Kling
6ad3efe067 Kernel+LibC: Add get_process_name() syscall
It does exactly what it sounds like:

    int get_process_name(char* buffer, int buffer_size);
2019-08-15 20:55:10 +02:00
Andreas Kling
77737be7b3 Kernel: Stop eagerly loading entire executables
We were forced to do this because the page fault code would fall apart
when trying to generate a backtrace for a non-current thread.

This issue has been fixed for a while now, so let's go back to lazily
loading executable pages which should make everything a little better.
2019-08-15 10:29:44 +02:00
Andreas Kling
e8eadd19a5 Kernel: Show region access bits (R/W/X) in crash dump region lists
It's pretty helpful to be able to see the various access bits for each
region in a crash dump. :^)
2019-08-12 19:37:28 +02:00
Andreas Kling
7d6689055f Kernel+LibC+crash: Add mprotect() syscall
This patch adds the mprotect() syscall to allow changing the protection
flags for memory regions. We don't do any region splitting/merging yet,
so this only works on whole mmap() regions.

Added a "crash -r" flag to verify that we crash when you attempt to
write to read-only memory. :^)
2019-08-12 19:33:24 +02:00
Sergey Bugaev
43ce6c5474 Kernel: Move socket role tracking to the Socket class itself
This is more logical and allows us to solve the problem of
non-blocking TCP sockets getting stuck in SocketRole::None.

The only complication is that a single LocalSocket may be shared
between two file descriptions (on the connect and accept sides),
and should have two different roles depending from which side
you look at it. To deal with it, Socket::role() is made a
virtual method that accepts a file description, and LocalSocket
internally tracks which FileDescription is the which one and
returns a correct role.
2019-08-11 16:30:43 +02:00
Sergey Bugaev
1606261c58 Kernel: Fix cloning file descriptions on fork
After a fork, the parent and the child are supposed to share
the same file description. For example, modifying the current
offset of a file description is visible in both of them.
2019-08-11 16:30:43 +02:00
Andreas Kling
752de9cd27 FileDescription: Disallow construction with a null File
It's not valid for a FileDescription to not have a file, so let's
disallow it by taking a File& (or FIFO&) in the constructor.
2019-08-11 09:33:31 +02:00
Conrad Pankoff
bd6d2c0819 Kernel: Use a more detailed state machine for socket setup 2019-08-10 09:07:11 +02:00
Conrad Pankoff
5308e310a0 Kernel: Make accept() fill address with peer name rather than local name 2019-08-10 08:50:44 +02:00
Andreas Kling
a720061423 Kernel: Zero-length dbgputstr() should just return 0 2019-08-09 19:44:14 +02:00
Andreas Kling
6d32b8fc36 Kernel: Use some more InlineLinkedList range-for iteration 2019-08-08 14:40:30 +02:00
Andreas Kling
f5ff796970 Kernel: Always give back VM to the RangeAllocator when unmapping Region
We were only doing this in Process::deallocate_region(), which meant
that kernel-only Regions never gave back their VM.

With this patch, we can start reusing freed-up address space! :^)
2019-08-07 21:57:39 +02:00
Andreas Kling
37ba2a7b65 Kernel: Use KBufferBuilder to build ProcFS files and backtraces
This is not perfect as it uses a lot of VM, but since the buffers are
supposed to be temporary it's not super terrible.

This could be improved by giving back the unused VM to the kernel's
RangeAllocator after finishing the buffer building.
2019-08-07 21:52:43 +02:00
Andreas Kling
6bdb81ad87 Kernel: Split VMObject into two classes: Anonymous- and InodeVMObject
InodeVMObject is a VMObject with an underlying Inode in the filesystem.
AnonymousVMObject has no Inode.

I'm happy that InodeVMObject::inode() can now return Inode& instead of
VMObject::inode() return Inode*. :^)
2019-08-07 18:09:32 +02:00
Andreas Kling
3364da388f Kernel: Remove VMObject names
The VMObject name was always either the owning region's name, or the
absolute path of the underlying inode.

We can reconstitute this information if wanted, no need to keep copies
of these strings around.
2019-08-07 16:14:08 +02:00
Andreas Kling
2ad963d261 Kernel: Add mapping from page directory base (PDB) to PageDirectory
This allows the page fault code to find the owning PageDirectory and
corresponding process for faulting regions.

The mapping is implemented as a global hash map right now, which is
definitely not optimal. We can come up with something better when it
becomes necessary.
2019-08-06 11:30:26 +02:00
Sergey Bugaev
9c3b1ca0c6 Kernel+LibC: Support passing O_CLOEXEC to pipe()
In the userspace, this mimics the Linux pipe2() syscall;
in the kernel, the Process::sys$pipe() now always accepts
a flags argument, the no-argument pipe() syscall is now a
userspace wrapper over pipe2().
2019-08-05 16:04:31 +02:00
Andreas Kling
7c7343de98 Kernel: mount() should fail if the provided device is not a disk device
In the future, we should allow mounting any block device. At the moment
there is too much filesystem code that depends on the underlying device
being a DiskDevice.
2019-08-02 19:31:59 +02:00
Andreas Kling
a6fb055028 Kernel: Generalize VFS metadata lookup and use it in mount() and stat()
Refactored VFS::stat() into VFS::lookup_metadata(), which can now be
used for general VFS metadata lookup by path.
2019-08-02 19:28:18 +02:00
Andreas Kling
31de5dee26 Kernel: Some improvements to the mount syscall
- You must now have superuser privileges to use mount().
- We now verify that the mount point is a valid path first, before
  trying to find a filesystem on the specified device.
- Convert some dbgprintf() to dbg().
2019-08-02 19:03:50 +02:00
Jesse
401c87a0cc Kernel: mount system call (#396)
It is now possible to mount ext2 `DiskDevice` devices under Serenity on
any folder in the root filesystem. Currently any user can do this with
any permissions. There's a fair amount of assumptions made here too,
that might not be too good, but can be worked on in the future. This is
a good start to allow more dynamic operation under the OS itself.

It is also currently impossible to unmount and such, and devices will
fail to mount in Linux as the FS 'needs to be cleaned'. I'll work on
getting `umount` done ASAP to rectify this (as well as working on less
assumption-making in the mount syscall. We don't want to just be able
to mount DiskDevices!). This could probably be fixed with some `-t`
flag or something similar.
2019-08-02 15:18:47 +02:00
Andreas Kling
1a13145cb3 Kernel: Remove unnecessary null check in Process::fork()
Found by PVS-Studio.
2019-08-01 11:15:48 +02:00