We need to use the volatile keyword when mapping the device registers,
or the compiler may optimize access, which lead to this QEMU error:
pci_nvme_ub_mmiord_toosmall in nvme_mmio_read: MMIO read smaller than
32-bits, offset=0x0
This modifies sys$chown to allow specifying whether or not to follow
symlinks and in which directory.
This was then used to implement lchown and fchownat in LibC and LibCore.
Add a basic NVMe driver support to serenity
based on NVMe spec 1.4.
The driver can support multiple NVMe drives (subsystems).
But in a NVMe drive, the driver can support one controller
with multiple namespaces.
Each core will get a separate NVMe Queue.
As the system lacks MSI support, PIN based interrupts are
used for IO.
Tested the NVMe support by replacing IDE driver
with the NVMe driver :^)
Calls to link_up() in the E1000 driver would read the link state
directly from the hardware on every call. This had negative
performance impact in high throughput situations since link_up()
is called every time an IP packet's route is resolved.
This patch takes inspiration from the RTL8139 network adapter where
the link state is stored in a bool and only updated when the hardware
generates an interrupt related to link state change.
After this change I measured a ~9% increase in TCP Tx throughput
using:
cat /dev/zero | nc <host_IP> <host_port> from the Serenity VM to my
host machine
By using the binary from our build of binutils, we can be sure that `nm`
supports demangling symbols, so we can avoid spawning a separate
`c++filt` process.
This change adds a thread member variable to track if we have a pending
promise violation on a kernel thread. This ensures that all code
properly propagates promise violations up to the syscall handler.
Suggested-by: Andreas Kling <kling@serenityos.org>
Previously we would crash the process immediately when a promise
violation was found during a syscall. This is error prone, as we
don't unwind the stack. This means that in certain cases we can
leak resources, like an OwnPtr / RefPtr tracked on the stack. Or
even leak a lock acquired in a ScopeLockLocker.
To remedy this situation we move the promise violation handling to
the syscall handler, right before we return to user space. This
allows the code to follow the normal unwind path, and grantees
there is no longer any cleanup that needs to occur.
The Process::require_promise() and Process::require_no_promises()
functions were modified to return ErrorOr<void> so we enforce that
the errors are always propagated by the caller.
This change lays the foundation for making the require_promise return
an error hand handling the process abort outside of the syscall
implementations, to avoid cases where we would leak resources.
It also has the advantage that it makes removes a gs pointer read
to look up the current thread, then process for every syscall. We
can instead go through the Process this pointer in most cases.
This change lays the foundation for making the require_promise return
an error hand handling the process abort outside of the syscall
implementations, to avoid cases where we would leak resources.
It also has the advantage that it makes removes a gs pointer read
to look up the current thread, then process for every syscall. We
can instead go through the Process this pointer in most cases.
I fell into this trap and tried to switch the syscalls to pass by
the `off_t` by register. I think it makes sense to add a clarifying
comment for future readers of the code, so they don't fall into the
same trap. :^)
This is required for SlavePTY's custom unref handler to function
correctly, as otherwise a SlavePTY held in a File RefPtr would call
the base's (RefCounted<>) unref method instead of SlavePTY's version.
It looks like type types are small enough that there is no padding.
So there didn't happen to be an info leak here, but lets zero initialize
just to be on the safe side, and make auditing easier.
In `sys$accept4()` and `get_sock_or_peer_name()` we were not
initializing the padding of the `sockaddr_un` struct, leading to
an kernel information leak if the
caller looked back at it's contents.
Before Fix:
37.766 Clipboard(11:11): accept4 Bytes:
2f746d702f706f7274616c2f636c6970626f61726440eac130e7fbc1e8abbfc
19c10ffc18440eac15485bcc130e7fbc1549feaca6c9deaca549feaca1bb0bc
03efdf62c0e056eac1b402d7acd010ffc14602000001b0bc030100000050bf0
5c24602000001e7fbc1b402d7ac6bdc
After Fix:
0.603 Clipboard(11:11): accept4 Bytes:
2f746d702f706f7274616c2f636c6970626f617264000000000000000000000
000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000
In FB_IOCTL_GET_PROPERTIES we were not initializing the padding of the
struct, leading to the potential of an kernel information leak if the
caller looked back at it's contents.
Lets just be extra paranoid and zero initialize all these structs
in we store on the stack while handling ioctls(..).
... instead of returning the maximum number of Processor objects that we
can allocate.
Some ports (e.g. gdb) rely on this information to determine the number
of worker threads to spawn. When gdb spawned 64 threads, the kernel
could not cope with generating backtraces for it, which prevented us
from debugging it properly.
This commit also removes the confusingly named
`Processor::processor_count` function so that this mistake can't happen
again.
Since RefCounted automatically calls a method named `will_be_destoyed`
on classes that have one, so there's no need to have a custom
implementation of unref in File.
This will allow File and it's descendants to use RefCounted instead of
having a custom implementation of unref. (Since RefCounted calls
will_be_destroyed automatically)
This commit also removes an erroneous call to `before_removing` in
AHCIPort, this is a duplicate call, as the only reference to the device
is immediately dropped following the call, which in turns calls
`before_removing` via File::unref.
Custody's unref is one of many implementions of ListedRefCounted's
behaviour in the Kernel, which results in avoidable bugs caused by
the fragmentation of the implementations. This commit starts the work
of replacing all custom implementations with ListedRefCounted by
porting Custody to it.
Add a kernel data segment and make the user code segment come after
the data segment. We need the GDT to be in a certain order to support
the syscall and sysret instruction pair.
We've finally gotten kmalloc to a point where it feels decent enough
to drop this comment.
There's still a lot of room for improvement, and we'll continue working
on it.
This was a premature optimization from the early days of SerenityOS.
The eternal heap was a simple bump pointer allocator over a static
byte array. My original idea was to avoid heap fragmentation and improve
data locality, but both ideas were rooted in cargo culting, not data.
We would reserve 4 MiB at boot and only ended up using ~256 KiB, wasting
the rest.
This patch replaces all kmalloc_eternal() usage by regular kmalloc().
Since a socket can be accessed by multiple threads concurrently, we need
to protect shared data behind the socket mutex.
There's very likely more places where we need to fix this, the purpose
of this patch is to fix a VERIFY() failure in getsockopt() seen on CI.
We were doing this dance in notify_watchers():
set_metadata_dirty(true);
set_metadata_dirty(false);
This was done in order to force out inode watcher events immediately.
Unfortunately, this was racy, as if SyncTask got scheduled at the wrong
moment, it would try to flush metadata for a clean inode. This then got
trapped by the VERIFY() statement in Inode::sync_all():
VERIFY(inode.is_metadata_dirty());
This patch fixes the issue by replacing notify_watchers() with lazy
metadata notifications like all other filesystems.
Much like the existing in6addr_any global and the IN6ADDR_ANY_INIT
macro, our LibC is also expected to export the in6addr_loopback global
and the IN6ADDR_LOOPBACK_INIT constant.
These were found by the stress-ng port.