Commit Graph

1901 Commits

Author SHA1 Message Date
Andreas Kling
36f1de3c89 Kernel: Pointer range validation should fail on wraparound
Let's reject address ranges that wrap around the 2^32 mark.
2019-12-31 18:23:17 +01:00
Andreas Kling
903b159856 Kernel: Write address validation was only checking end of write range
Thanks to yyyyyyy for finding the bug! :^)
2019-12-31 18:18:54 +01:00
Andreas Kling
d8ef13a426 ProcFS: Supervisor-only inodes should be owned by UID 0, GID 0 2019-12-31 13:22:43 +01:00
Andreas Kling
c9ec415e2f Kernel: Always reject never-userspace addresses before checking regions
At the moment, addresses below 8MB and above 3GB are never accessible
to userspace, so just reject them without even looking at the current
process's memory regions.
2019-12-31 03:45:54 +01:00
Andreas Kling
3f254bfbc8 Kernel+ping: Only allow superuser to create SOCK_RAW sockets
/bin/ping is now setuid-root, and will drop privileges immediately
after opening a raw socket.
2019-12-31 01:42:34 +01:00
Andreas Kling
9af054af9e ProcFS: Reduce the amount of info accessible to non-superusers
This patch hardens /proc a bit by making many things only accessible
to UID 0, and also disallowing access to /proc/PID/ for anyone other
than the UID of that process (and superuser, obviously.)
2019-12-31 01:32:27 +01:00
Andreas Kling
54d182f553 Kernel: Remove some unnecessary leaking of kernel pointers into dmesg
There's a lot more of this and we need to stop printing kernel pointers
anywhere but the debug console.
2019-12-31 01:22:00 +01:00
Andreas Kling
66d5ebafa6 Kernel: Let's also not consider kernel regions to be valid user stacks
This one is less obviously exploitable than the previous one, but still
a bug nonetheless.
2019-12-31 00:28:14 +01:00
Andreas Kling
0fc24fe256 Kernel: User pointer validation should reject kernel-only addresses
We were happily allowing syscalls with pointers into kernel-only
regions (virtual address >= 0xc0000000).

This patch fixes that by only considering user regions in the current
process, and also double-checking the Region::is_user_accessible() flag
before approving an access.

Thanks to Fire30 for finding the bug! :^)
2019-12-31 00:24:35 +01:00
Andreas Kling
a69734bf2e Kernel: Also add a process boosting mechanism
Let's also have set_process_boost() for giving all threads in a process
the same boost.
2019-12-30 20:10:00 +01:00
Andreas Kling
610f3ad12f Kernel: Add a basic thread boosting mechanism
This patch introduces a syscall:

    int set_thread_boost(int tid, int amount)

You can use this to add a permanent boost value to the effective thread
priority of any thread with your UID (or any thread in the system if
you are the superuser.)

This is quite crude, but opens up some interesting opportunities. :^)
2019-12-30 19:23:13 +01:00
Andreas Kling
50677bf806 Kernel: Refactor scheduler to use dynamic thread priorities
Threads now have numeric priorities with a base priority in the 1-99
range.

Whenever a runnable thread is *not* scheduled, its effective priority
is incremented by 1. This is tracked in Thread::m_extra_priority.
The effective priority of a thread is m_priority + m_extra_priority.

When a runnable thread *is* scheduled, its m_extra_priority is reset to
zero and the effective priority returns to base.

This means that lower-priority threads will always eventually get
scheduled to run, once its effective priority becomes high enough to
exceed the base priority of threads "above" it.

The previous values for ThreadPriority (Low, Normal and High) are now
replaced as follows:

    Low -> 10
    Normal -> 30
    High -> 50

In other words, it will take 20 ticks for a "Low" priority thread to
get to "Normal" effective priority, and another 20 to reach "High".

This is not perfect, and I've used some quite naive data structures,
but I think the mechanism will allow us to build various new and
interesting optimizations, and we can figure out better data structures
later on. :^)
2019-12-30 18:46:17 +01:00
Andrew Kaster
cdcab7e5f4 Kernel: Retry mmap if MAP_FIXED is not in flags and addr is not 0
If an mmap fails to allocate a region, but the addr passed in was
non-zero, non-fixed mmaps should attempt to allocate at any available
virtual address.
2019-12-29 23:01:27 +01:00
Andrew Kaster
bae8e21a8b Kernel: Add move assign operator to KResultOr 2019-12-29 23:01:27 +01:00
Andreas Kling
fed3416bd2 Kernel: Embrace the SerenityOS name 2019-12-29 19:08:02 +01:00
Andreas Kling
1f31156173 Kernel: Add a mode flag to sys$purge and allow purging clean inodes 2019-12-29 13:16:53 +01:00
Andreas Kling
c74cde918a Kernel+SystemMonitor: Expose amount of per-process clean inode memory
This is memory that's loaded from an inode (file) but not modified in
memory, so still identical to what's on disk. This kind of memory can
be freed and reloaded transparently from disk if needed.
2019-12-29 12:45:58 +01:00
Andreas Kling
0d5e0e4cad Kernel+SystemMonitor: Expose amount of per-process dirty private memory
Dirty private memory is all memory in non-inode-backed mappings that's
process-private, meaning it's not shared with any other process.

This patch exposes that number via SystemMonitor, giving us an idea of
how much memory each process is responsible for all on its own.
2019-12-29 12:28:32 +01:00
Conrad Pankoff
bbb536ebed Kernel: Fix code locked behind NETWORK_TASK_DEBUG 2019-12-28 02:03:49 +01:00
Conrad Pankoff
5ca7ae4585 Kernel: Route all loopback traffic through the loopback adapter 2019-12-28 02:03:38 +01:00
Conrad Pankoff
876323fd7a Kernel: Move incoming packet buffer off the NetworkTask stack 2019-12-28 00:24:43 +01:00
Hüseyin ASLITÜRK
3d4dddd111 MenuApplets: Add Clock applet, move code from WindowServer to the applet. 2019-12-27 22:47:31 +01:00
Stefano Cristiano
49a789ad04 Build: Allow building serenityOS ext2 root filesystem on macOS host 2019-12-27 02:19:55 +01:00
Conrad Pankoff
115b315375 Kernel: Add kernel-level timer queue (heavily based on @juliusf's work)
PR #591 defines the rationale for kernel-level timers. They're most
immediately useful for TCP retransmission, but will most likely see use
in many other areas as well.
2019-12-27 02:15:45 +01:00
Andreas Kling
abdd5aa08a Kernel: Separate runnable thread queues by priority
This patch introduces three separate thread queues, one for each thread
priority available to userspace (Low, Normal and High.)

Each queue operates in a round-robin fashion, but we now always prefer
to schedule the highest priority thread that currently wants to run.

There are tons of tweaks and improvements that we can and should make
to this mechanism, but I think this is a step in the right direction.

This makes WindowServer significantly more responsive while one of its
clients is burning CPU. :^)
2019-12-27 00:52:30 +01:00
Conrad Pankoff
0b3a868729 Kernel: Simplify force_pio logic in PATA driver (#923) 2019-12-26 22:57:58 +01:00
Andreas Kling
95034fdfbd Kernel: Move PC speaker beep timing logic from scheduler to the syscall
I don't know why I put this in the scheduler to begin with.. the caller
can just block until the beeping is finished.
2019-12-26 22:31:26 +01:00
Andreas Kling
154d10e4e9 Kernel: Process::for_each_in_pgrp() should not include dead processes
We don't care about dead processes that were once members of a specific
process group.

This was causing us to try and send SIGINT to already-dead processes
when pressing Ctrl+C in a terminal whose pgrp they were once in.

Fixes #922.
2019-12-26 22:20:39 +01:00
Andreas Kling
c1f8291ce4 Kernel: When physical page allocation fails, try to purge something
Instead of panicking right away when we run out of physical pages,
we now try to find a PurgeableVMObject with some volatile pages in it.
If we find one, we purge that entire object and steal one of its pages.

This makes it possible for the kernel to keep going instead of dying.
Very cool. :^)
2019-12-26 11:45:36 +01:00
Andreas Kling
dafd715743 ProcFS: Fix inconsistent numbers in /proc/memstat
We were listing the total number of user/super pages as the number of
"available" pages in the system. This was then misinterpreted in the
SystemMonitor program and displayed wrong in the GUI.
2019-12-26 11:43:42 +01:00
Andreas Kling
6fd655102e Kernel: Add Lock::is_locked() 2019-12-26 11:43:23 +01:00
Conrad Pankoff
17aef7dc99 Kernel: Detect support for no-execute (NX) CPU features
Previously we assumed all hosts would have support for IA32_EFER.NXE.
This is mostly true for newer hardware, but older hardware will crash
and burn if you try to use this feature.

Now we check for support via CPUID.80000001[20].
2019-12-26 10:05:51 +01:00
Andreas Kling
4a8683ea68 Kernel+LibPthread+LibC: Add a naive futex and use it for pthread_cond_t
This patch implements a simple version of the futex (fast userspace
mutex) API in the kernel and uses it to make the pthread_cond_t API's
block instead of busily sched_yield().

An arbitrary userspace address is passed to the kernel as a "token"
that identifies the futex and you can then FUTEX_WAIT and FUTEX_WAKE
that specific userspace address.

FUTEX_WAIT corresponds to pthread_cond_wait() and FUTEX_WAKE is used
for pthread_cond_signal() and pthread_cond_broadcast().

I'm pretty sure I'm missing something in this implementation, but it's
hopefully okay for a start. :^)
2019-12-25 23:54:06 +01:00
Andreas Kling
9e55bcb7da Kernel: Make kernel memory regions be non-executable by default
From now on, you'll have to request executable memory specifically
if you want some.
2019-12-25 22:41:34 +01:00
Andreas Kling
0b7a2e0a5a Kernel: Set NX bit for virtual addresses 0-1MB and 2-8MB
This removes the ability to jump into kmalloc memory, etc.
Only the kernel image itself is allowed to exec, located between 1-2MB.
2019-12-25 22:24:28 +01:00
Andreas Kling
56a28890eb Kernel: Clarify the various input validity checks in mmap()
Also share some validation logic between mmap() and mprotect().
2019-12-25 21:50:13 +01:00
Andreas Kling
aeefcd6ddb run: Run QEMU with "-cpu max"
This should give us access to the largest set of CPU features available
on the host machine.
2019-12-25 15:19:13 +01:00
Andreas Kling
419e0ced27 Kernel: Don't allow mmap()/mprotect() to set up PROT_WRITE|PROT_EXEC
..but also allow mprotect() to set PROT_EXEC on a region, something
we were just ignoring before.
2019-12-25 13:35:57 +01:00
Andreas Kling
ce5f7f6c07 Kernel: Use the CPU's NX bit to enforce PROT_EXEC on memory mappings
Now that we have PAE support, we can ask the CPU to crash processes for
trying to execute non-executable memory. This is pretty cool! :^)
2019-12-25 13:35:57 +01:00
Andreas Kling
c22a4301ed Kernel: Interpret "reserved bit violation" page faults correctly
We don't actually react to these in any meaningful way other than
crashing, but let's at least print the correct information. :^)
2019-12-25 13:35:57 +01:00
Andreas Kling
52deb09382 Kernel: Enable PAE (Physical Address Extension)
Introduce one more (CPU) indirection layer in the paging code: the page
directory pointer table (PDPT). Each PageDirectory now has 4 separate
PageDirectoryEntry arrays, governing 1 GB of VM each.

A really neat side-effect of this is that we can now share the physical
page containing the >=3GB kernel-only address space metadata between
all processes, instead of lazily cloning it on page faults.

This will give us access to the NX (No eXecute) bit, allowing us to
prevent execution of memory that's not supposed to be executed.
2019-12-25 13:35:57 +01:00
Andreas Kling
c087abc48d Kernel: Rename PageDirectory::find_by_pdb() => find_by_cr3()
I caught myself wondering what "pdb" stood for, so let's rename this
to something more obvious.
2019-12-25 02:58:03 +01:00
Andreas Kling
7a0088c4d2 Kernel: Clean up Region access bit setters a little 2019-12-25 02:58:03 +01:00
Andreas Kling
336ac9e8e7 Kernel: Clean up CPU fault register dumps
These were looking a bit messy after we started using 32-bit fields
to store segment registers in RegisterDumps.
2019-12-25 02:58:03 +01:00
Andreas Kling
c9a5253ac2 Kernel: Uh, actually *actually* turn on CR4.PGE
I'm not sure how I managed to misread the location of this bit twice.
But I did! Here is finally the correct value, according to Intel:

    "Page Global Enable (bit 7 of CR4)"

Jeez! :^)
2019-12-25 02:58:03 +01:00
Shannon Booth
0e45b9423b Kernel: Implement recursion limit on path resolution
Cautiously use 5 as a limit for now so that we don't blow the stack.
This can be increased in the future if we are sure that we won't be
blowing the stack, or if the implementation is changed to not use
recursion :^)
2019-12-24 23:14:14 +01:00
Andreas Kling
3623e35978 Kernel: Oops, actually enable CR4.PGE (page table global bit)
Turns out we were setting the wrong bit here. Now we will actually keep
kernel memory mappings in the TLB across context switches.
2019-12-24 22:45:27 +01:00
Conrad Pankoff
efa7141d14 Kernel: Fail module loading if any symbols can not be resolved 2019-12-24 11:52:01 +01:00
Conrad Pankoff
9a8032b479 Kernel: Disallow loading a module twice without explicitly unloading it
This ensures that a module has the chance to run its cleanup functions
before it's taken out of service.
2019-12-24 02:20:37 +01:00
Shannon Booth
26c9c27407 Build: Fix shellcheck warnings in makeall.sh
"SC2115: Declare and assign separately to avoid masking return values."
2019-12-24 02:19:59 +01:00