Commit Graph

774 Commits

Author SHA1 Message Date
Liav A
0fc60e41dd Kernel: Use klog() instead of kprintf()
Also, duplicate data in dbg() and klog() calls were removed.
In addition, leakage of virtual address to kernel log is prevented.
This is done by replacing kprintf() calls to dbg() calls with the
leaked data instead.
Also, other kprintf() calls were replaced with klog().
2020-03-02 22:23:39 +01:00
Andreas Kling
47beab926d Kernel: Remove ability to create kernel-only regions at user addresses
This was only used by the mechanism for mapping executables into each
process's own address space. Now that we remap executables on demand
when needed for symbolication, this can go away.
2020-03-02 11:20:34 +01:00
Andreas Kling
e56f8706ce Kernel: Map executables at a kernel address during ELF load
This is both simpler and more robust than mapping them in the process
address space.
2020-03-02 11:20:34 +01:00
Andreas Kling
678c87087d Kernel: Load executables on demand when symbolicating
Previously we would map the entire executable of a program in its own
address space (but make it unavailable to userspace code.)

This patch removes that and changes the symbolication code to remap
the executable on demand (and into the kernel's own address space
instead of the process address space.)

This opens up a couple of further simplifications that will follow.
2020-03-02 11:20:34 +01:00
Andreas Kling
0acac186fb Kernel: Make the "entire executable" region shared
This makes Region::clone() do the right thing with it on fork().
2020-03-02 06:13:29 +01:00
Andreas Kling
5c2a296a49 Kernel: Mark read-only PT_LOAD mappings as shared regions
This makes Region::clone() do the right thing for these now that we
differentiate based on Region::is_shared().
2020-03-01 21:26:36 +01:00
Andreas Kling
ecfde5997b Kernel: Use SharedInodeVMObject for executables after all
I had the wrong idea about this. Thanks to Sergey for pointing it out!

Here's what he says (reproduced for posterity):

> Private mappings protect the underlying file from the changes made by
> you, not the other way around. To quote POSIX, "If MAP_PRIVATE is
> specified, modifications to the mapped data by the calling process
> shall be visible only to the calling process and shall not change the
> underlying object. It is unspecified whether modifications to the
> underlying object done after the MAP_PRIVATE mapping is established
> are visible through the MAP_PRIVATE mapping." In practice that means
> that the pages that were already paged in don't get updated when the
> underlying file changes, and the pages that weren't paged in yet will
> load the latest data at that moment.
> The only thing MAP_FILE | MAP_PRIVATE is really useful for is mapping
> a library and performing relocations; it's definitely useless (and
> actively harmful for the system memory usage) if you only read from
> the file.

This effectively reverts e2697c2ddd.
2020-03-01 21:16:27 +01:00
Andreas Kling
bb7dd63f74 Kernel: Run clang-format on Process.cpp 2020-03-01 21:16:27 +01:00
Andreas Kling
687b52ceb5 Kernel: Name perfcore files "perfcore.PID"
This way we can trace many things and we get one perfcore file per
process instead of everyone trying to write to "perfcore"
2020-03-01 20:59:02 +01:00
Andreas Kling
fee20bd8de Kernel: Remove some more harmless InodeVMObject miscasts 2020-03-01 12:27:03 +01:00
Andreas Kling
95e3aec719 Kernel: Fix harmless type miscast in Process::amount_clean_inode() 2020-03-01 11:23:23 +01:00
Andreas Kling
e2697c2ddd Kernel: Use PrivateInodeVMObject for loading program executables
This will be a memory usage pessimization until we actually implement
CoW sharing of the memory pages with SharedInodeVMObject.

However, it's a huge architectural improvement, so let's take it and
improve on this incrementally.

fork() should still be neutral, since all private mappings are CoW'ed.
2020-03-01 11:23:10 +01:00
Andreas Kling
88b334135b Kernel: Remove some Region construction helpers
It's now up to the caller to provide a VMObject when constructing a new
Region object. This will make it easier to handle things going wrong,
like allocation failures, etc.
2020-03-01 11:23:10 +01:00
Andreas Kling
4badef8137 Kernel: Return bytes written if sys$write() fails after writing some
If we wrote anything we should just inform userspace that we did,
and not worry about the error code. Userspace can call us again if
it wants, and we'll give them the error then.
2020-02-29 18:42:35 +01:00
Andreas Kling
7cd1bdfd81 Kernel: Simplify some dbg() logging
We don't have to log the process name/PID/TID, dbg() automatically adds
that as a prefix to every line.

Also we don't have to do .characters() on Strings passed to dbg() :^)
2020-02-29 13:39:06 +01:00
Andreas Kling
8fbdda5a2d Kernel: Implement basic support for sys$mmap() with MAP_PRIVATE
You can now mmap a file as private and writable, and the changes you
make will only be visible to you.

This works because internally a MAP_PRIVATE region is backed by a
unique PrivateInodeVMObject instead of using the globally shared
SharedInodeVMObject like we always did before. :^)

Fixes #1045.
2020-02-28 23:25:00 +01:00
Andreas Kling
aa1e209845 Kernel: Remove some unnecessary indirection in InodeFile::mmap()
InodeFile now directly calls Process::allocate_region_with_vmobject()
instead of taking an awkward detour via a special Region constructor.
2020-02-28 20:29:14 +01:00
Andreas Kling
651417a085 Kernel: Split InodeVMObject into two subclasses
We now have PrivateInodeVMObject and SharedInodeVMObject, corresponding
to MAP_PRIVATE and MAP_SHARED respectively.

Note that PrivateInodeVMObject is not used yet.
2020-02-28 20:20:35 +01:00
Andreas Kling
07a26aece3 Kernel: Rename InodeVMObject => SharedInodeVMObject 2020-02-28 20:07:51 +01:00
Andreas Kling
5af95139fa Kernel: Make Process::m_master_tls_region a WeakPtr
Let's not keep raw Region* variables around like that when it's so easy
to avoid it.
2020-02-28 14:05:30 +01:00
Andreas Kling
b0623a0c58 Kernel: Remove SmapDisabler in sys$connect() 2020-02-28 13:20:26 +01:00
Andreas Kling
dcd619bd46 Kernel: Merge the shbuf_get_size() syscall into shbuf_get()
Add an extra out-parameter to shbuf_get() that receives the size of the
shared buffer. That way we don't need to make a separate syscall to
get the size, which we always did immediately after.
2020-02-28 12:55:58 +01:00
Andreas Kling
f72e5bbb17 Kernel+LibC: Rename shared buffer syscalls to use a prefix
This feels a lot more consistent and Unixy:

    create_shared_buffer()   => shbuf_create()
    share_buffer_with()      => shbuf_allow_pid()
    share_buffer_globally()  => shbuf_allow_all()
    get_shared_buffer()      => shbuf_get()
    release_shared_buffer()  => shbuf_release()
    seal_shared_buffer()     => shbuf_seal()
    get_shared_buffer_size() => shbuf_get_size()

Also, "shared_buffer_id" is shortened to "shbuf_id" all around.
2020-02-28 12:55:58 +01:00
Liav A
db23703570 Process: Use dbg() instead of dbgprintf()
Also, fix a bad derefernce in sys$create_shared_buffer() method.
2020-02-27 13:05:12 +01:00
Andreas Kling
4997dcde06 Kernel: Always disable interrupts in do_killpg()
Will caught an assertion when running "kill 9999999999999" :^)
2020-02-27 11:05:16 +01:00
Andreas Kling
4a293e8a21 Kernel: Ignore signals sent to threadless (zombie) processes
If a process doesn't have any threads left, it's in a zombie state and
we can't meaningfully send signals to it. So just ignore them.

Fixes #1313.
2020-02-27 11:04:15 +01:00
Andreas Kling
0c1497846e Kernel: Don't allow profiling a dead process
Work towards #1313.
2020-02-27 10:42:31 +01:00
Cristian-Bogdan SIRB
05ce8586ea Kernel: Fix ASSERTION failed in join_thread syscall
set_interrupted_by_death was never called whenever a thread that had
a joiner died, so the joiner remained with the joinee pointer there,
resulting in an assertion fail in JoinBlocker: m_joinee pointed to
a freed task, filled with garbage.

Thread::current->m_joinee may not be valid after the unblock

Properly return the joinee exit value to the joiner thread.
2020-02-27 10:09:44 +01:00
Andreas Kling
d28fa89346 Kernel: Don't assert on sys$kill() with pid=INT32_MIN
On 32-bit platforms, INT32_MIN == -INT32_MIN, so we can't expect this
to always work:

    if (pid < 0)
        positive_pid = -pid; // may still be negative!

This happens because the -INT32_MIN expression becomes a long and is
then truncated back to an int.

Fixes #1312.
2020-02-27 10:02:04 +01:00
Cristian-Bogdan SIRB
717cd5015e Kernel: Allow process with multiple threads to call exec and exit
This allows a process wich has more than 1 thread to call exec, even
from a thread. This kills all the other threads, but it won't wait for
them to finish, just makes sure that they are not in a running/runable
state.

In the case where a thread does exec, the new program PID will be the
thread TID, to keep the PID == TID in the new process.

This introduces a new function inside the Process class,
kill_threads_except_self which is called on exit() too (exit with
multiple threads wasn't properly working either).

Inside the Lock class, there is the need for a new function,
clear_waiters, which removes all the waiters from the
Process::big_lock. This is needed since after a exit/exec, there should
be no other threads waiting for this lock, the threads should be simply
killed. Only queued threads should wait for this lock at this point,
since blocked threads are handled in set_should_die.
2020-02-26 13:06:40 +01:00
Andreas Kling
ceec1a7d38 AK: Make Vector use size_t for its size and capacity 2020-02-25 14:52:35 +01:00
Andreas Kling
d0f5b43c2e Kernel: Use Vector::unstable_remove() when deallocating a region
Process::m_regions is not sorted, so we can use unstable_remove()
to avoid shifting the vector contents. :^)
2020-02-24 18:34:49 +01:00
Andreas Kling
30a8991dbf Kernel: Make Region weakable and use WeakPtr<Region> instead of Region*
This turns use-after-free bugs into null pointer dereferences instead.
2020-02-24 13:32:45 +01:00
Andreas Kling
79576f9280 Kernel: Clear the region lookup cache on exec()
Each process has a 1-level lookup cache for fast repeated lookups of
the same VM region (which tends to be the majority of lookups.)
The cache is used by the following syscalls: munmap, madvise, mprotect
and set_mmap_name.

After a succesful exec(), there could be a stale Region* in the lookup
cache, and the new executable was able to manipulate it using a number
of use-after-free code paths.
2020-02-24 12:37:27 +01:00
Liav A
895e874eb4 Kernel: Include the new PIT class in system components 2020-02-24 11:27:03 +01:00
Andreas Kling
fc5ebe2a50 Kernel: Disown shared buffers on sys$execve()
When committing to a new executable, disown any shared buffers that the
process was previously co-owning.

Otherwise accessing the same shared buffer ID from the new program
would cause the kernel to find a cached (and stale!) reference to the
previous program's VM region corresponding to that shared buffer,
leading to a Region* use-after-free.

Fixes #1270.
2020-02-22 12:29:38 +01:00
Andreas Kling
ece2971112 Kernel: Disable profiling during the critical section of sys$execve()
Since we're gonna throw away these stacks at the end of exec anyway,
we might as well disable profiling before starting to mess with the
process page tables. One less weird situation to worry about in the
sampling code.
2020-02-22 11:09:03 +01:00
Andreas Kling
d7a13dbaa7 Kernel: Reset profiling state on exec() (but keep it going)
We now log the new executable on exec() and throw away all the samples
we've accumulated so far. But profiling keeps going.
2020-02-22 10:54:50 +01:00
Andreas Kling
2a679f228e Kernel: Fix bitrotted DEBUG_IO logging 2020-02-21 15:49:30 +01:00
Andreas Kling
bead20c40f Kernel: Remove SmapDisabler in sys$create_shared_buffer() 2020-02-18 14:12:39 +01:00
Andreas Kling
9aa234cc47 Kernel: Reset FPU state on exec() 2020-02-18 13:44:27 +01:00
Andreas Kling
a7dbb3cf96 Kernel: Use a FixedArray for a process's extra GIDs
There's not really enough of these to justify using a HashTable.
2020-02-18 11:35:47 +01:00
Andreas Kling
48f7c28a5c Kernel: Replace "current" with Thread::current and Process::current
Suggested by Sergey. The currently running Thread and Process are now
Thread::current and Process::current respectively. :^)
2020-02-17 15:04:27 +01:00
Andreas Kling
4f4af24b9d Kernel: Tear down process address space during finalization
Process teardown is divided into two main stages: finalize and reap.

Finalization happens in the "Finalizer" kernel and runs with interrupts
enabled, allowing destructors to take locks, etc.

Reaping happens either in sys$waitid() or in the scheduler for orphans.

The more work we can do in finalization, the better, since it's fully
pre-emptible and reduces the amount of time the system runs without
interrupts enabled.
2020-02-17 14:33:06 +01:00
Andreas Kling
31e1af732f Kernel+LibC: Allow sys$mmap() callers to specify address alignment
This is exposed via the non-standard serenity_mmap() call in userspace.
2020-02-16 12:55:56 +01:00
Andreas Kling
7a8be7f777 Kernel: Remove SmapDisabler in sys$accept() 2020-02-16 08:20:54 +01:00
Andreas Kling
7717084ac7 Kernel: Remove SmapDisabler in sys$clock_gettime() 2020-02-16 08:13:11 +01:00
Andreas Kling
16818322c5 Kernel: Reduce header dependencies of Process and Thread 2020-02-16 02:01:42 +01:00
Andreas Kling
e28809a996 Kernel: Add forward declaration header 2020-02-16 01:50:32 +01:00
Andreas Kling
1d611e4a11 Kernel: Reduce header dependencies of MemoryManager and Region 2020-02-16 01:33:41 +01:00