Commit Graph

541 Commits

Author SHA1 Message Date
Brian Gianforcaro
ed996fcced Kernel: Remove unused header includes 2021-08-01 08:10:16 +02:00
Andreas Kling
b807f1c3fc Kernel: Fail madvise() volatile change with EINVAL for non-purgeable mem
AnonymousVMObject::set_volatile() assumes that nobody ever calls it on
non-purgeable objects, so let's make sure we don't do that.

Also return EINVAL instead of EPERM for non-anonymous VM objects so the
error codes match.
2021-07-28 20:42:49 +02:00
Brian Gianforcaro
ddc950ce42 Kernel: Avoid file descriptor leak in Process::sys$socketpair on error
Previously it was possible to leak the file descriptor if we error out
after allocating the first descriptor. Now we perform both fd
allocations back to back so we can handle the potential error when
processing the second fd allocation.
2021-07-28 19:07:00 +02:00
Brian Gianforcaro
4b2651ddab Kernel: Track allocated FileDescriptionAndFlag elements in each Process
The way the Process::FileDescriptions::allocate() API works today means
that two callers who allocate back to back without associating a
FileDescription with the allocated FD, will receive the same FD and thus
one will stomp over the other.

Naively tracking which FileDescriptions are allocated and moving onto
the next would introduce other bugs however, as now if you "allocate"
a fd and then return early further down the control flow of the syscall
you would leak that fd.

This change modifies this behavior by tracking which descriptions are
allocated and then having an RAII type to "deallocate" the fd if the
association is not setup the end of it's scope.
2021-07-28 19:07:00 +02:00
Brian Gianforcaro
ba03b6ad02 Kernel: Make Process::FileDescriptions::allocate return KResultOr<int>
Modernize more error checking by utilizing KResultOr.
2021-07-28 19:07:00 +02:00
Brian Gianforcaro
d2cee9cbf6 Kernel: Remove unused fd allocation from Process::sys$connect(..) 2021-07-28 19:07:00 +02:00
Andreas Kling
a085168c52 Kernel: Rename Space::create => Space::try_create() 2021-07-27 14:54:35 +02:00
Andreas Kling
4648bcd3d4 Kernel: Remove unnecessary weak pointer from Region to owning Process
This was previously used for a single debug logging statement during
memory purging. There are no remaining users of this weak pointer,
so let's get rid of it.
2021-07-25 17:28:06 +02:00
Andreas Kling
09bc4cee15 Kernel: Remove unused madvise(MADV_GET_VOLATILE)
This was used to query the volatile state of a memory region, however
nothing ever actually used it.
2021-07-25 17:28:06 +02:00
Andreas Kling
2d1a651e0a Kernel: Make purgeable memory a VMObject level concept (again)
This patch changes the semantics of purgeable memory.

- AnonymousVMObject now has a "purgeable" flag. It can only be set when
  constructing the object. (Previously, all anonymous memory was
  effectively purgeable.)

- AnonymousVMObject now has a "volatile" flag. It covers the entire
  range of physical pages. (Previously, we tracked ranges of volatile
  pages, effectively making it a page-level concept.)

- Non-volatile objects maintain a physical page reservation via the
  committed pages mechanism, to ensure full coverage for page faults.

- When an object is made volatile, it relinquishes any unused committed
  pages immediately. If later made non-volatile again, we then attempt
  to make a new committed pages reservation. If this fails, we return
  ENOMEM to userspace.

mmap() now creates purgeable objects if passed the MAP_PURGEABLE option
together with MAP_ANONYMOUS. anon_create() memory is always purgeable.
2021-07-25 17:28:05 +02:00
Brian Gianforcaro
9d8482c3e8 Kernel: Use StringView when parsing pledges in sys$pledge(..)
This ensures no potential allocation as in some cases the pledge char*
could be promoted to AK::String by the compiler to execute the
comparison.
2021-07-23 19:02:25 +02:00
Brian Gianforcaro
e4b86aa5d8 Kernel: Fix bug where we half apply pledges in sys$pledge(..)
This bug manifests it self when the caller to sys$pledge() passes valid
promises, but invalid execpromises. The code would apply the promises
and then return an error for the execpromises. This leaves the user in
a confusing state, as the promises were silently applied, but we return
an error suggesting the operation has failed.

Avoid this situation by tweaking the implementation to only apply the
promises / execpromises after all validation has occurred.
2021-07-23 19:02:25 +02:00
Brian Gianforcaro
36ff717c54 Kernel: Migrate sys$pledge to use the KString API
This avoids potential unhandled OOM that's possible with the old
copy_string_from_user API.
2021-07-23 19:02:25 +02:00
Brian Gianforcaro
baec9e2d2d Kernel: Migrate sys$unveil to use the KString API
This avoids potential unhandled OOM that's possible with the old
copy_string_from_user API.
2021-07-23 19:02:25 +02:00
Brian Gianforcaro
2e7728bb05 Kernel: Use StringView literals for fs_type match in sys$mount(..) 2021-07-23 19:02:25 +02:00
Andreas Kling
082ed6f417 Kernel: Simplify VMObject locking & page fault handlers
This patch greatly simplifies VMObject locking by doing two things:

1. Giving VMObject an IntrusiveList of all its mapping Region objects.
2. Removing VMObject::m_paging_lock in favor of VMObject::m_lock

Before (1), VMObject::for_each_region() was forced to acquire the
global MM lock (since it worked by walking MemoryManager's list of
all regions and checking for regions that pointed to itself.)

With each VMObject having its own list of Regions, VMObject's own
m_lock is all we need.

Before (2), page fault handlers used a separate mutex for preventing
overlapping work. This design required multiple temporary unlocks
and was generally extremely hard to reason about.

Instead, page fault handlers now use VMObject's own m_lock as well.
2021-07-23 03:24:44 +02:00
Gunnar Beutner
31f30e732a Everywhere: Prefix hexadecimal numbers with 0x
Depending on the values it might be difficult to figure out whether a
value is decimal or hexadecimal. So let's make this more obvious. Also
this allows copying and pasting those numbers into GNOME calculator and
probably also other apps which auto-detect the base.
2021-07-22 08:57:01 +02:00
Peter Elliott
3fa2816642 Kernel+LibC: Implement fcntl(2) advisory locks
Advisory locks don't actually prevent other processes from writing to
the file, but they do prevent other processes looking to acquire and
advisory lock on the file.

This implementation currently only adds non-blocking locks, which are
all I need for now.
2021-07-20 17:44:30 +04:30
Brian Gianforcaro
8f01a8b741 Kernel: Disable big process lock for sys$yield() 2021-07-20 03:21:14 +02:00
Brian Gianforcaro
5c10fb4007 Kernel: Disable big process lock for sys$gettid()
This syscall reads a read only value from the current thread, and hence
has no need for the big process lock.
2021-07-20 03:21:14 +02:00
Brian Gianforcaro
638598b15d Kernel: Disable big process lock for sys$getpid() 2021-07-20 03:21:14 +02:00
Brian Gianforcaro
bfd4635274 Kernel: Disable big process lock for sys$uname() 2021-07-20 03:21:14 +02:00
Brian Gianforcaro
10ce896d4f Kernel: Disable big process lock in sys$gethostname() sys$sethostname() 2021-07-20 03:21:14 +02:00
Brian Gianforcaro
9201a06027 Kernel: Annotate all syscalls with VERIFY_PROCESS_BIG_LOCK_ACQUIRED
Before we start disabling acquisition of the big process lock for
specific syscalls, make sure to document and assert that all the
lock is held during all syscalls.
2021-07-20 03:21:14 +02:00
Brian Gianforcaro
308396bca1 Kernel: No lock validate_user_stack variant, switch to Space as argument
The entire process is not needed, just require the user to pass in the
Space. Also provide no_lock variant to use when you already have the
VM/Space lock acquired, to avoid unnecessary recursive spinlock
acquisitions.
2021-07-20 03:21:14 +02:00
Brian Gianforcaro
1cffecbe8d Kernel: Push ARCH specific ifdef's down into RegisterState functions
The non CPU specific code of the kernel shouldn't need to deal with
architecture specific registers, and should instead deal with an
abstract view of the machine. This allows us to remove a variety of
architecture specific ifdefs and helps keep the code slightly more
portable.

We do this by exposing the abstract representation of instruction
pointer, stack pointer, base pointer, return register, etc on the
RegisterState struct.
2021-07-19 08:46:55 +02:00
Andreas Kling
9457d83986 Kernel: Rename Locker => MutexLocker 2021-07-18 01:53:04 +02:00
Andreas Kling
cee9528168 Kernel: Rename Lock to Mutex
Let's be explicit about what kind of lock this is meant to be.
2021-07-17 21:10:32 +02:00
Brian Gianforcaro
c0987453e6 Kernel: Remove double RedBlackTree lookup in VM/Space region removal
We should never request a regions removal that we don't currently
own. We currently assert this everywhere else by all callers.

Instead lets just push the assert down into the RedBlackTree removal
and assume that we will always successfully remove the region.
2021-07-17 16:22:59 +02:00
Tom
704e1c2e3d Kernel: Rename functions to be less confusing
Thread::yield_and_release_relock_big_lock releases the big lock, yields
and then relocks the big lock.

Thread::yield_assuming_not_holding_big_lock yields assuming the big
lock is not being held.
2021-07-16 20:30:04 +02:00
Timothy
9715311837 AK+Kernel: Implement and use EnumBits has_any_flag()
This duplicates the old functionality of has_flag and will return true
when any flags present in the mask are also in the value.
2021-07-16 11:49:50 +02:00
Idan Horowitz
be475cd6a8 Kernel: Handle OOM when adding memory regions to Spaces :^) 2021-07-15 00:49:41 +02:00
Andreas Kling
859e5741ff Kernel: Fix Process use-after-free in Thread finalization
We leak a ref() onto every user process when constructing them,
either via Process::create_user_process(), or via Process::sys$fork().

This ref() is balanced by a corresponding unref() in
Thread::WaitBlockCondition::finalize().

Since kernel processes don't have a leaked ref() on them, this led to
an extra Process::unref() on kernel processes during finalization.
This happened during every boot, with the `init_stage2` process.

Found by turning off kfree() scrubbing. :^)
2021-07-14 22:36:29 +02:00
Brian Gianforcaro
84b4b9447d Kernel: Move new process registration out of Space spinlock scope
There appears to be no reason why the process registration needs
to happen under the space spin lock. As the first thread is not started
yet it should be completely uncontested, but it's still bad practice.
2021-07-12 10:20:21 +02:00
Andreas Kling
cd7a49b90d Kernel: Make Region splitting OOM-safe
Region allocation failures during splitting are now propagated all the
way out to where we can return ENOMEM for them.
2021-07-11 18:52:27 +02:00
Andreas Kling
88d490566f Kernel: Rename various *VMObject::create*() => try_create()
try_*() implies that it can fail (and they all return RefPtr with
nullptr signalling failure.)
2021-07-11 17:55:29 +02:00
Andreas Kling
af8c74a328 Kernel: Make SharedInodeVMObject allocation OOM-safe 2021-07-11 17:52:07 +02:00
Andreas Kling
07c4c89297 Kernel: Make VirtualFileSystem::sync() static 2021-07-11 00:26:17 +02:00
Andreas Kling
0d39bd04d3 Kernel: Rename VFS => VirtualFileSystem 2021-07-11 00:25:24 +02:00
Andreas Kling
d53d9d3677 Kernel: Rename FS => FileSystem
This matches our common naming style better.
2021-07-11 00:20:38 +02:00
Ralf Donau
8ee3a5e09e Kernel: Logic fix in the pledge syscall
Pledge should check m_has_promises. Calling pledge("", nullptr)
does not fail on an already pledged process anymore.
2021-07-10 21:59:29 +02:00
Gunnar Beutner
06883ed8a3 Kernel+Userland: Make the stack alignment comply with the System V ABI
The System V ABI for both x86 and x86_64 requires that the stack pointer
is 16-byte aligned on entry. Previously we did not align the stack
pointer properly.

As far as "main" was concerned the stack alignment was correct even
without this patch due to how the C++ _start function and the kernel
interacted, i.e. the kernel misaligned the stack as far as the ABI
was concerned but that misalignment (read: it was properly aligned for
a regular function call - but misaligned in terms of what the ABI
dictates) was actually expected by our _start function.
2021-07-10 01:41:57 +02:00
Ali Mohammad Pur
e37f9fa7db LibPthread+Kernel: Add pthread_kill() and the thread_kill syscall 2021-07-09 15:36:50 +02:00
Daniel Bertalan
d30dbf47f5 Kernel: Map non-page-aligned text segments correctly
`.text` segments with non-aligned offsets had their lengths applied to
the first page's base address. This meant that in some cases the last
PAGE_SIZE - 1 bytes weren't mapped. Previously, it did not cause any
problems as the GNU ld insists on aligning everything; but that's not
the case with the LLVM toolchain.
2021-07-07 22:26:53 +02:00
Max Wipfli
d5722eab36 Kernel: Custody::absolute_path() => try_create_absolute_path()
This converts most users of Custody::absolute_path() to use the new
try_create_absolute_path() API, and return ENOMEM if the KString
allocation fails.
2021-07-07 15:32:17 +02:00
Max Wipfli
ee342f5ec3 Kernel: Replace usage of LexicalPath with KLexicalPath
This replaces all uses of LexicalPath in the Kernel with the functions
from KLexicalPath. This also allows the Kernel to stop including
AK::LexicalPath.
2021-07-07 15:32:17 +02:00
Edwin Hoksberg
99328e1038 Kernel+KeyboardSettings: Remove numlock syscall and implement ioctl 2021-07-07 10:44:20 +02:00
Tom
864b50b5c2 Kernel: Do not hold spinlock while touching user mode futex values
The user_atomic_* functions are subject to the same rules as
copy_from/to/user, which may require preemption.
2021-07-07 10:05:55 +02:00
Tom
7593418b77 Kernel: Fix futex race that could lead to thread waiting forever
There is a race condition where we would remove a FutexQueue from
our futex map and in the meanwhile another thread started to queue
itself into that very same futex, leading to that thread to wait
forever as no other wake operation could discover that removed
FutexQueue.

This fixes the problem by:
* Tracking imminent waits, which prevents a new FutexQueue from being
  deleted that a thread will wait on momentarily
* Atomically marking a FutexQueue as removed, which prevents a thread
  from waiting on it before it is actually removed from the futex map.
2021-07-07 10:05:55 +02:00
Andreas Kling
565796ae4e Kernel+LibC: Remove sys$donate()
This was an old SerenityOS-specific syscall for donating the remainder
of the calling thread's time-slice to another thread within the same
process.

Now that Threading::Lock uses a pthread_mutex_t internally, we no
longer need this syscall, which allows us to get rid of a surprising
amount of unnecessary scheduler logic. :^)
2021-07-05 23:30:15 +02:00