Commit Graph

2197 Commits

Author SHA1 Message Date
Andreas Kling
03d73cbaae Kernel: Allow Socket subclasses to fail construction
For example, socket(AF_INET) should only succeed for valid SOCK_TYPEs.
2020-01-23 21:33:15 +01:00
Andreas Kling
3de5439579 AK: Let's call decrementing reference counts "unref" instead of "deref"
It always bothered me that we're using the overloaded "dereference"
term for this. Let's call it "unreference" instead. :^)
2020-01-23 15:14:21 +01:00
Andreas Kling
15aac1f9e9 Build: Fix silly mistake in makeall.sh 2020-01-23 10:41:07 +01:00
Andreas Kling
e64c335e5a Revert "Kernel: Replace IRQHandler with the new InterruptHandler class"
This reverts commit 6c72736b26.

I am unable to boot on my home machine with this change in the tree.
2020-01-22 22:27:06 +01:00
Oliver Kraitschy
8e21e31b3a Build: use absolute path for /sbin/mke2fs
Distros like Debian and Ubuntu don't have /sbin in PATH, thus mke2fs is
not found.
2020-01-22 22:04:29 +01:00
Jesse Buhagiar
9cbce68b1d Meta: Change copyright holder of `FloppyDiskDevice.* 2020-01-22 14:37:44 +01:00
Liav A
6c72736b26 Kernel: Replace IRQHandler with the new InterruptHandler class
System components that need an IRQ handling are now inheriting the
InterruptHandler class.

In addition to that, the initialization process of PATAChannel was
changed to fit the changes.
PATAChannel, E1000NetworkAdapter and RTL8139NetworkAdapter are now
inheriting from PCI::Device instead of InterruptHandler directly.
2020-01-22 12:22:09 +01:00
Liav A
1ee37245cd Kernel: Introduce IRQ sharing support
The support is very basic - Each component that needs to handle IRQs
inherits from InterruptHandler class. When the InterruptHandler
constructor is called it registers itself in a SharedInterruptHandler.
When an IRQ is fired, the SharedInterruptHandler is invoked, then it
iterates through a list of the registered InterruptHandlers.

Also, InterruptEnabler class was created to provide a way to enable IRQ
handling temporarily, similar to InterruptDisabler (in CPU.h, which does
the opposite).

In addition to that a PCI::Device class has been added, that inherits
from InterruptHandler.
2020-01-22 12:22:09 +01:00
Liav A
2a160faf98 Kernel: Run clang-format on KeyboardDevice.cpp 2020-01-22 12:22:09 +01:00
Andreas Kling
30ad7953ca Kernel: Rename UnveilState to VeilState 2020-01-21 19:28:59 +01:00
Andreas Kling
66598f60fe SystemMonitor: Show process unveil() state as "Veil"
A process has one of three veil states:

- None: unveil() has never been called.
- Dropped: unveil() has been called, and can be called again.
- Locked: unveil() has been called, and cannot be called again.
2020-01-21 18:56:23 +01:00
Andreas Kling
f38cfb3562 Kernel: Tidy up debug logging a little bit
When using dbg() in the kernel, the output is automatically prefixed
with [Process(PID:TID)]. This makes it a lot easier to understand which
thread is generating the output.

This patch also cleans up some common logging messages and removes the
now-unnecessary "dbg() << *current << ..." pattern.
2020-01-21 16:16:20 +01:00
Andreas Kling
07075cd001 Kernel+LibC: Clean up open() flag (O_*) definitions
These were using a mix of decimal, octal and hexadecimal for no reason.
2020-01-21 13:34:39 +01:00
Andreas Kling
6081c76515 Kernel: Make O_RDONLY non-zero
Sergey suggested that having a non-zero O_RDONLY would make some things
less confusing, and it seems like he's right about that.

We can now easily check read/write permissions separately instead of
dancing around with the bits.

This patch also fixes unveil() validation for O_RDWR which previously
forgot to check for "r" permission.
2020-01-21 13:27:08 +01:00
Andreas Kling
1b3cac2f42 Kernel: Don't forget about unveiled paths with zero permissions
We need to keep these around, otherwise the calling process can remove
and re-add a path to increase its permissions.
2020-01-21 11:42:28 +01:00
Liav A
200a5b0649 Kernel: Remove map_for_kernel() in MemoryManager
We don't need to have this method anymore. It was a hack that was used
in many components in the system but currently we use better methods to
create virtual memory mappings. To prevent any further use of this
method it's best to just remove it completely.

Also, the APIC code is disabled for now since it doesn't help booting
the system, and is broken since it relies on identity mapping to exist
in the first 1MB. Any call to the APIC code will result in assertion
failed.

In addition to that, the name of the method which is responsible to
create an identity mapping between 1MB to 2MB was changed, to be more
precise about its purpose.
2020-01-21 11:29:58 +01:00
Liav A
60c32f44dd Kernel: ACPI code doesn't rely on identity mapping anymore
The problem was mostly in the initialization code, since in that stage
the parser assumed that there is an identity mapping in the first 1MB of
the address space. Now during initialization the parser will create the
correct mappings to locate the required data.
2020-01-21 11:29:58 +01:00
Liav A
325022cbd7 Kernel: DMIDecoder no longer depends on identity-mapping
DMIDecoder creates the mappings using the standard helpers, thus no
need to rely on the identity mapping in the first 1MB in memory.
2020-01-21 11:29:58 +01:00
Liav A
aca317d889 Kernel: PCI MMIO no longer uses map_for_kernel()
PCI MMIO access is done by modifying the related PhysicalPage directly,
then we request to remap the region to create the mapping.
2020-01-21 11:29:58 +01:00
Andreas Kling
22cfb1f3bd Kernel: Clear unveiled state on exec() 2020-01-21 10:46:31 +01:00
Andreas Kling
cf48c20170 Kernel: Forked children should inherit unveil()'ed paths 2020-01-21 09:44:32 +01:00
Andreas Kling
02406b7305 ProcFS: Add /proc/PID/unveil
This file exposes a JSON array of all the unveiled paths in a process.
2020-01-20 22:19:02 +01:00
Andreas Kling
0569123ad7 Kernel: Add a basic implementation of unveil()
This syscall is a complement to pledge() and adds the same sort of
incremental relinquishing of capabilities for filesystem access.

The first call to unveil() will "drop a veil" on the process, and from
now on, only unveiled parts of the filesystem are visible to it.

Each call to unveil() specifies a path to either a directory or a file
along with permissions for that path. The permissions are a combination
of the following:

- r: Read access (like the "rpath" promise)
- w: Write access (like the "wpath" promise)
- x: Execute access
- c: Create/remove access (like the "cpath" promise)

Attempts to open a path that has not been unveiled with fail with
ENOENT. If the unveiled path lacks sufficient permissions, it will fail
with EACCES.

Like pledge(), subsequent calls to unveil() with the same path can only
remove permissions, not add them.

Once you call unveil(nullptr, nullptr), the veil is locked, and it's no
longer possible to unveil any more paths for the process, ever.

This concept comes from OpenBSD, and their implementation does various
things differently, I'm sure. This is just a first implementation for
SerenityOS, and we'll keep improving on it as we go. :^)
2020-01-20 22:12:04 +01:00
Andreas Kling
f4f958f99f Kernel: Make DoubleBuffer use a KBuffer instead of kmalloc()ing
Background: DoubleBuffer is a handy buffer class in the kernel that
allows you to keep writing to it from the "outside" while the "inside"
reads from it. It's used for things like LocalSocket and TTY's.
Internally, it has a read buffer and a write buffer, but the two will
swap places when the read buffer is exhausted (by reading from it.)

Before this patch, it was internally implemented as two Vector<u8>
that we would swap between when the reader side had exhausted the data
in the read buffer. Now instead we preallocate a large KBuffer (64KB*2)
on DoubleBuffer construction and use that throughout its lifetime.

This removes all the kmalloc heap traffic caused by DoubleBuffers :^)
2020-01-20 16:08:49 +01:00
Andreas Kling
0a282e0a02 Kernel: Allow naming KBuffers 2020-01-20 14:00:11 +01:00
Andreas Kling
e901a3695a Kernel: Use the templated copy_to/from_user() in more places
These ensure that the "to" and "from" pointers have the same type,
and also that we copy the correct number of bytes.
2020-01-20 13:41:21 +01:00
Sergey Bugaev
d5426fcc88 Kernel: Misc tweaks 2020-01-20 13:26:06 +01:00
Sergey Bugaev
9bc6157998 Kernel: Return new fd from sys$fcntl(F_DUPFD)
This fixes GNU Bash getting confused after performing a redirection.
2020-01-20 13:26:06 +01:00
Andreas Kling
b52d0afecf SB16: Map the DMA buffer in kernelspace so we can write to it
This broke with the >3GB paging overhaul. It's no longer possible to
write directly to physical addresses below the 8MB mark. Physical pages
need to be mapped into kernel VM by using a Region.

Fixes #1099.
2020-01-20 13:13:03 +01:00
Andreas Kling
a0b716cfc5 Add AnonymousVMObject::create_with_physical_page()
This can be used to create a VMObject for a single PhysicalPage.
2020-01-20 13:13:03 +01:00
Andreas Kling
4ebff10bde Kernel: Write-only regions should still be mapped as present
There is no real "read protection" on x86, so we have no choice but to
map write-only pages simply as "present & read/write".

If we get a read page fault in a non-readable region, that's still a
correctness issue, so we crash the process. It's by no means a complete
protection against invalid reads, since it's trivial to fool the kernel
by first causing a write fault in the same region.
2020-01-20 13:13:03 +01:00
Andreas Kling
4b7a89911c Kernel: Remove some unnecessary casts to uintptr_t
VirtualAddress is constructible from uintptr_t and const void*.
PhysicalAddress is constructible from uintptr_t but not const void*.
2020-01-20 13:13:03 +01:00
Andreas Kling
a246e9cd7e Use uintptr_t instead of u32 when storing pointers as integers
uintptr_t is 32-bit or 64-bit depending on the target platform.
This will help us write pointer size agnostic code so that when the day
comes that we want to do a 64-bit port, we'll be in better shape.
2020-01-20 13:13:03 +01:00
Andreas Kling
5ce1cc89a0 Kernel: Add fast-path for sys$gettid()
The userspace locks are very aggressively calling sys$gettid() to find
out which thread ID they have.

Since syscalls are quite heavy, this can get very expensive for some
programs. This patch adds a fast-path for sys$gettid(), which makes it
skip all of the usual syscall validation and just return the thread ID
right away.

This cuts Kernel/Process.cpp compile time by ~18%, from ~29 to ~24 sec.
2020-01-19 17:32:05 +01:00
Andreas Kling
8d9dd1b04b Kernel: Add a 1-deep cache to Process::region_from_range()
This simple cache gets hit over 70% of the time on "g++ Process.cpp"
and shaves ~3% off the runtime.
2020-01-19 16:44:37 +01:00
Andreas Kling
ae0c435e68 Kernel: Add a Process::add_region() helper
This is a private helper for adding a Region to Process::m_regions.
It's just for convenience since it's a bit cumbersome to do this.
2020-01-19 16:26:42 +01:00
Andreas Kling
1dc9fa9506 Kernel: Simplify PageDirectory swapping in sys$execve()
Swap out both the PageDirectory and the Region list at the same time,
instead of doing the Region list slightly later.
2020-01-19 16:05:42 +01:00
Andreas Kling
05836757c6 Kernel: Oops, fix bad sort order of available VM ranges
This made the allocator perform worse, so here's another second off of
the Kernel/Process.cpp compile time from a simple bugfix! (31s to 30s)
2020-01-19 15:53:43 +01:00
Andreas Kling
167b57a6b7 TmpFS: Grow the underlying inode buffer with 2x factor when written to
Before this, we would end up in memcpy() churn hell when a program was
doing repeated write() calls to a file in /tmp.

An even better solution will be to only grow the VM allocation of the
underlying buffer and keep using the same physical pages. This would
eliminate all the memcpy() work.

I've benchmarked this using g++ to compile Kernel/Process.cpp.
With these changes, compilation goes from ~35 sec to ~31 sec. :^)
2020-01-19 14:01:32 +01:00
Andreas Kling
1d02ac35fc Kernel: Limit Thread::raw_backtrace() to the max profiler stack size
Let's avoid walking overly long stacks here, since kmalloc() is finite.
2020-01-19 13:54:09 +01:00
Andreas Kling
6eab7b398d Kernel: Make ProcessPagingScope restore CR3 properly
Instead of restoring CR3 to the current process's paging scope when a
ProcessPagingScope goes out of scope, we now restore exactly whatever
the CR3 value was when we created the ProcessPagingScope.

This fixes breakage in situations where a process ends up with nested
ProcessPagingScopes. This was making profiling very fragile, and with
this change it's now possible to profile g++! :^)
2020-01-19 13:44:53 +01:00
Andreas Kling
ad3f931707 Kernel: Optimize VM range deallocation a bit
Previously, when deallocating a range of VM, we would sort and merge
the range list. This was quite slow for large processes.

This patch optimizes VM deallocation in the following ways:

- Use binary search instead of linear scan to find the place to insert
  the deallocated range.

- Insert at the right place immediately, removing the need to sort.

- Merge the inserted range with any adjacent range(s) in-line instead
  of doing a separate merge pass into a list copy.

- Add Traits<Range> to inform Vector that Range objects are trivial
  and can be moved using memmove().

I've also added an assertion that deallocated ranges are actually part
of the RangeAllocator's initial address range.

I've benchmarked this using g++ to compile Kernel/Process.cpp.
With these changes, compilation goes from ~41 sec to ~35 sec.
2020-01-19 13:29:59 +01:00
Andreas Kling
87583aea9c Kernel: Use copy_from_user() when appropriate during thread backtracing 2020-01-19 10:33:26 +01:00
Andreas Kling
38fc31ff11 Kernel: Always switch to own page tables when crashing/asserting
I noticed this while debugging a crash in backtrace generation.
If a process would crash while temporarily inspecting another process's
address space, the crashing thread would still use the other process's
page tables while handling the crash, causing all kinds of confusion
when trying to walk the stack of the crashing thread.
2020-01-19 10:33:17 +01:00
Andreas Kling
f7b394e9a1 Kernel: Assert that copy_to/from_user() are called with user addresses
This will panic the kernel immediately if these functions are misused
so we can catch it and fix the misuse.

This patch fixes a couple of misuses:

    - create_signal_trampolines() writes to a user-accessible page
      above the 3GB address mark. We should really get rid of this
      page but that's a whole other thing.

    - CoW faults need to use copy_from_user rather than copy_to_user
      since it's the *source* pointer that points to user memory.

    - Inode faults need to use memcpy rather than copy_to_user since
      we're copying a kernel stack buffer into a quickmapped page.

This should make the copy_to/from_user() functions slightly less useful
for exploitation. Before this, they were essentially just glorified
memcpy() with SMAP disabled. :^)
2020-01-19 09:18:55 +01:00
Andreas Kling
2cd212e5df Kernel: Let's say that everything < 3GB is user virtual memory
Technically the bottom 2MB is still identity-mapped for the kernel and
not made available to userspace at all, but for simplicity's sake we
can just ignore that and make "address < 0xc0000000" the canonical
check for user/kernel.
2020-01-19 08:58:33 +01:00
Andreas Kling
5ce9382e98 Kernel: Only require "stdio" pledge for sending signals to self
This should match what OpenBSD does. Sending a signal to yourself seems
basically harmless.
2020-01-19 08:50:55 +01:00
Sergey Bugaev
3e1ed38d4b Kernel: Do not return ENOENT for unresolved symbols
ENOENT means "no such file or directory", not "no such symbol". Return EINVAL
instead, as we already do in other cases.
2020-01-18 23:51:22 +01:00
Sergey Bugaev
d0d13e2bf5 Kernel: Move setting file flags and r/w mode to VFS::open()
Previously, VFS::open() would only use the passed flags for permission checking
purposes, and Process::sys$open() would set them on the created FileDescription
explicitly. Now, they should be set by VFS::open() on any files being opened,
including files that the kernel opens internally.

This also lets us get rid of the explicit check for whether or not the returned
FileDescription was a preopen fd, and in fact, fixes a bug where a read-only
preopen fd without any other flags would be considered freshly opened (due to
O_RDONLY being indistinguishable from 0) and granted a new set of flags.
2020-01-18 23:51:22 +01:00
Sergey Bugaev
544b8286da Kernel: Do not open stdio fds for kernel processes
Kernel processes just do not need them.

This also avoids touching the file (sub)system early in the boot process when
initializing the colonel process.
2020-01-18 23:51:22 +01:00