Also, duplicate data in dbg() and klog() calls were removed.
In addition, leakage of virtual address to kernel log is prevented.
This is done by replacing kprintf() calls to dbg() calls with the
leaked data instead.
Also, other kprintf() calls were replaced with klog().
This was only used by the mechanism for mapping executables into each
process's own address space. Now that we remap executables on demand
when needed for symbolication, this can go away.
Previously we would map the entire executable of a program in its own
address space (but make it unavailable to userspace code.)
This patch removes that and changes the symbolication code to remap
the executable on demand (and into the kernel's own address space
instead of the process address space.)
This opens up a couple of further simplifications that will follow.
I had the wrong idea about this. Thanks to Sergey for pointing it out!
Here's what he says (reproduced for posterity):
> Private mappings protect the underlying file from the changes made by
> you, not the other way around. To quote POSIX, "If MAP_PRIVATE is
> specified, modifications to the mapped data by the calling process
> shall be visible only to the calling process and shall not change the
> underlying object. It is unspecified whether modifications to the
> underlying object done after the MAP_PRIVATE mapping is established
> are visible through the MAP_PRIVATE mapping." In practice that means
> that the pages that were already paged in don't get updated when the
> underlying file changes, and the pages that weren't paged in yet will
> load the latest data at that moment.
> The only thing MAP_FILE | MAP_PRIVATE is really useful for is mapping
> a library and performing relocations; it's definitely useless (and
> actively harmful for the system memory usage) if you only read from
> the file.
This effectively reverts e2697c2ddd.
This patch reduces the number of code paths that lead to the allocation
of a Region object. It's quite hard to follow the various ways in which
this can happen, so this is an effort to simplify.
When stopping a thread with the SIGSTOP signal, we now store the thread
state in Thread::m_stop_state. That state is then restored on SIGCONT.
This fixes an issue where previously-blocked threads would unblock
upon resume. Now they simply resume in the Blocked state, and it's up
to the regular unblocking mechanism to unblock them.
Fixes#1326.
This will be a memory usage pessimization until we actually implement
CoW sharing of the memory pages with SharedInodeVMObject.
However, it's a huge architectural improvement, so let's take it and
improve on this incrementally.
fork() should still be neutral, since all private mappings are CoW'ed.
It's now up to the caller to provide a VMObject when constructing a new
Region object. This will make it easier to handle things going wrong,
like allocation failures, etc.
When forking a process, we now turn all of the private inode-backed
mmap() regions into copy-on-write regions in both the parent and child.
This patch also removes an assertion that becomes irrelevant.
If we wrote anything we should just inform userspace that we did,
and not worry about the error code. Userspace can call us again if
it wants, and we'll give them the error then.
We don't have to log the process name/PID/TID, dbg() automatically adds
that as a prefix to every line.
Also we don't have to do .characters() on Strings passed to dbg() :^)
The IRQController object is RefCounted, and is shared between the
InterruptManagement class & IRQ handlers' classes.
IRQHandler, SharedIRQHandler & SpuriousInterruptHandler classes
use a responsible IRQ controller directly instead of calling
InterruptManagement for disable(), enable() or eoi().
Also, the initialization process of InterruptManagement is
simplified, so it doesn't rely on an ACPI parser to be initialized.
More namespaces have been added to organize the declarations
in a more sensible way.
Also, a namespace StaticParsing has been added to allow early
access to ACPI tables.
You can now mmap a file as private and writable, and the changes you
make will only be visible to you.
This works because internally a MAP_PRIVATE region is backed by a
unique PrivateInodeVMObject instead of using the globally shared
SharedInodeVMObject like we always did before. :^)
Fixes#1045.
We now have PrivateInodeVMObject and SharedInodeVMObject, corresponding
to MAP_PRIVATE and MAP_SHARED respectively.
Note that PrivateInodeVMObject is not used yet.
Add an extra out-parameter to shbuf_get() that receives the size of the
shared buffer. That way we don't need to make a separate syscall to
get the size, which we always did immediately after.
This feels a lot more consistent and Unixy:
create_shared_buffer() => shbuf_create()
share_buffer_with() => shbuf_allow_pid()
share_buffer_globally() => shbuf_allow_all()
get_shared_buffer() => shbuf_get()
release_shared_buffer() => shbuf_release()
seal_shared_buffer() => shbuf_seal()
get_shared_buffer_size() => shbuf_get_size()
Also, "shared_buffer_id" is shortened to "shbuf_id" all around.
Now we check before we set a FBResolution if the BXVGA device is capable
of setting the requested resolution.
If not, we revert the resolution to the previous one and return an error
to userspace.
Fixes#451.
Now the DMIDecoder code is more safer, because we don't use raw pointers
or references to objects or data that are located in the physical
address space, so an accidental dereference cannon happen easily.
Instead, we use the PhysicalAddress class to represent those addresses.
Also, the initializer_parser() method is simplified.
syscall_handler was not actually updating the value in regs->eax, so the
gettid() was always returning 85: the value of regs->eax was not
actually updated, and it remained the one from Userland (the value of
SC_gettid).
The syscall_handler was modified to actually get a pointer to
RegisterState, so any changes to it will actually be saved.
NOTE: This was actually more of a compiler optimization:
On the SC_gettid flow, we saved in regs.eax the return value of
sys$gettid(), but the compiler discarded it, since it followed a return.
On a normal flow, the value of regs.eax was reused in
tracer->did_syscall, so the compiler actually updated the value.
set_interrupted_by_death was never called whenever a thread that had
a joiner died, so the joiner remained with the joinee pointer there,
resulting in an assertion fail in JoinBlocker: m_joinee pointed to
a freed task, filled with garbage.
Thread::current->m_joinee may not be valid after the unblock
Properly return the joinee exit value to the joiner thread.
On 32-bit platforms, INT32_MIN == -INT32_MIN, so we can't expect this
to always work:
if (pid < 0)
positive_pid = -pid; // may still be negative!
This happens because the -INT32_MIN expression becomes a long and is
then truncated back to an int.
Fixes#1312.
It was possible to send signals to processes that you were normally not
allowed to send signals to, by calling ioctl(tty, TIOCSPGRP, targetpid)
and then generating one of the TTY-related signals on the calling
process's TTY (e.g by pressing ^C, ^Z, etc.)
This allows a process wich has more than 1 thread to call exec, even
from a thread. This kills all the other threads, but it won't wait for
them to finish, just makes sure that they are not in a running/runable
state.
In the case where a thread does exec, the new program PID will be the
thread TID, to keep the PID == TID in the new process.
This introduces a new function inside the Process class,
kill_threads_except_self which is called on exit() too (exit with
multiple threads wasn't properly working either).
Inside the Lock class, there is the need for a new function,
clear_waiters, which removes all the waiters from the
Process::big_lock. This is needed since after a exit/exec, there should
be no other threads waiting for this lock, the threads should be simply
killed. Only queued threads should wait for this lock at this point,
since blocked threads are handled in set_should_die.
Each process has a 1-level lookup cache for fast repeated lookups of
the same VM region (which tends to be the majority of lookups.)
The cache is used by the following syscalls: munmap, madvise, mprotect
and set_mmap_name.
After a succesful exec(), there could be a stale Region* in the lookup
cache, and the new executable was able to manipulate it using a number
of use-after-free code paths.
Now the ACPI & PCI code is more safer, because we don't use raw pointers
or references to objects or data that are located in the physical
address space, so an accidental dereference cannot happen easily.
Instead, we use the PhysicalAddress class to represent those addresses.
Now we use the GenericInterruptHandler class instead of IRQHandler in
the CPU functions.
This commit adds an include to the ISR stub macros header file.
Also, this commit adds support for IRQ sharing, so when an IRQHandler
will try to register to already-assigned IRQ number, a SharedIRQHandler
will be created to register both IRQHandlers.
Also, the enable() function is now correct and will use the right
registers and values. In addition to that, write_register() and
read_registers() are not relying on identity mapping anymore.
Also, update the class implementation to use PCI::Device class
accordingly.
The create() helper will now search for an IDE controller in the
PCI bus, allowing to simplify the initialize() method.
This class represents a shared interrupt handler. This class will not be
created automatically but only if two IRQ Handlers are sharing the same
IRQ number.
The GenericInterruptHandler class will be used to represent
an abstract interrupt handler. The InterruptManagement class will
represent a centralized component to manage interrupts.
I probably would've done INI config removal in another commit, but it
fit well here because I didn't want to pledge wpath for SystemMenu if I
didn't need to.
Frankly, that's something that I think should be done: allow ConfigFile
to be used read-only.
Each allocation header was tracking its index into the chunk bitmap,
but that index can be computed from the allocation address anyway.
Removing this means that each allocation gets 4 more bytes of memory
and this avoids allocating an extra chunk in many cases. :^)