Instead of signalling allocation failure with a bool return value
(false), we now use ErrorOr<void> and return ENOMEM as appropriate.
This allows us to use TRY() and MUST() with Vector. :^)
In particular, fstatvfs used to assume that a file that was earlier
opened using some path will forever be at that path. This is wrong, and
in the meantime new mounts and new filesystems could take up the
filename or directories, leading to a completely inaccurate result.
This commit improves the situation:
- All filesystem information is now always accurate.
- The mount flags *might* be erroneously zero, if the custody for the
open file is not available. I don't know when that might happen, but
it is definitely not the typical case.
We now use AK::Error and AK::ErrorOr<T> in both kernel and userspace!
This was a slightly tedious refactoring that took a long time, so it's
not unlikely that some bugs crept in.
Nevertheless, it does pass basic functionality testing, and it's just
real nice to finally see the same pattern in all contexts. :^)
The Qemu I8042 controller does not send one IRQ per event, it sends
over four since it will not stop trying to emulate the PS/2 mouse.
If the VMWare backdoor is active, a fake I8042 mouse event will be sent
that we can then use to check if there are VMWare mouse events present.
However, we were only processing one mouse event at a time, even though
multiple events could have been queued up. Luckily this does not often
lead to issues, since after the first IRQ we would still get three
additional interrupts that would then empty the queue.
This change makes sure we always empty the event queue immediately,
instead of waiting on the next interrupt to happen. Functionally this
changes nothing - it could merely improve latency by not waiting for
new interrupts to come in.
Coincidently, this brings our implementation closer to how Linux deals
with the VMMouse.
This reverts commit 4131b35851.
We're swallowing way too many mouse events from QEMU with this code
enabled. Something is not right, so let's revert it for now.
This removes some code dupe from the constructors.
By removing this duplicate constructor we can utilize the main
VirtualConsole::create factory implementation and call that from the
VirtualConsole::create_with_preset_log factory method.
These are constants, they don't need to be dynamically allocated.
Another minor step towards removing `AK::String` from the Kernel
and improving OOM safety.
The goal was to reduce common setup of messages. Changes:
* MailBox turned into singleton to follow existing patterns
* Removed device specific messages from MailBox requiring
clients to know the details instead
* Created base Message class which clients should deriver from
It really simplify the usage for more complicated message queues
like framebuffer setup - see followup commits.
This FIXME does not seem to apply anymore. Yes, symbolic links in all
filesystems appear to be slightly broken, but that has nothing to do
with File::absolute_path. Let's remove the wrong FIXME instead of adding
to the confusion.
Found due to smelly code in InodeFile::absolute_path.
In particular, this replaces the following misleading methods:
File::absolute_path
This method *never* returns an actual path, and if called on an
InodeFile (which is impossible), it would VERIFY_NOT_REACHED().
OpenFileDescription::try_serialize_absolute_path
OpenFileDescription::absolute_path
These methods do not guarantee to return an actual path (just like the
other method), and just like Custody::absolute_path they do not
guarantee accuracy. In particular, just renaming the method made a
TOCTOU bug obvious.
The new method signatures use KResultOr, just like
try_serialize_absolute_path() already did.
We were accidentally casting the pointer to m_ttl from an u8* to an int*
which resulted in copying of 3 extra unrelated bytes (which turned out
to be padding in this case).
We create a base class called GenericFramebufferDevice, which defines
all the virtual functions that must be implemented by a
FramebufferDevice. Then, we make the VirtIO FramebufferDevice and other
FramebufferDevice implementations inherit from it.
The most important consequence of rearranging the classes is that we now
have one IOCTL method, so all drivers should be committed to not
override the IOCTL method or make their own IOCTLs of FramebufferDevice.
All graphical IOCTLs are known to all FramebufferDevices, and it's up to
the specific implementation whether to support them or discard them (so
we require extensive usage of KResult and KResultOr, together with
virtual characteristic functions).
As a result, the interface is much cleaner and understandable to read.
As suggested by @ccapitalK, it makes the interface more neat and clean.
The proper order is to get ScanoutID first, then ResourceID and after it
everything else that is needed to complete the operation successfully.
A VirtIO graphics adapter is really the VirtIO GPU, so the virtualized
hardware has no distinction between both components so there's no
need to put such distinction in software.
We might need to split things in the future, but when we do so, we must
take proper care to ensure that the interface between the components
is correct and use the usual codebase patterns.
We never used that type method except in initialization in
GraphicsManagement, and we used it there to query whether the device is
VGA compatible or not.
'bootmode' now only controls which set of services are started by
SystemServer, so it is more appropriate to rename it to system_mode, and
no longer validate it in the Kernel.
Bootmode used to control framebuffers, panic behavior, and SystemServer.
This patch factors framebuffer control into a separate flag.
Note that the combination 'bootmode=self-test fbdev=on' leads to
unexpected behavior, which can only be fixed in a later commit.
Some ports (like `bc` with history enabled) sensibly set the termios
character size to 8 bits.
Previously, we left the character size value (given by the bitmask
CSIZE) as zero by default (meaning 5 bits per character), and returned
ENOTIMPL whenever someone modified it. This was dumb.
Bit 3 is set here:
c5b2f55981/hw/input/ps2.c (L736)
Spurious mouse packets can be received without this bit set, for
example when double-clicking and keeping the mouse button depressed
instead of releasing it the second time (i.e. mousedown > mouseup >
mousedown). We should not process such packets.
This makes interaction with our buttons much smoother!
Fixes#5881.
Instead of detecting which flag was set in the status register, we can
use the instrument type passed to us. This works because the mouse and
keyboard use different IRQs.
The System V ABI requires that the stack is 16-byte aligned on function
call. Confusingly, however, they mean that the stack must be aligned
this way **before** the `CALL` instruction is executed. That instruction
pushes the return value onto the stack, so the callee will actually see
the stack pointer as a value `sizeof(FlatPtr)` smaller.
The signal trampoline was written with this in mind, but `setup_stack`
aligned the entire stack, *including the return address* to a 16-byte
boundary. Because of this, the trampoline subtracted too much from the
stack pointer, thus misaligning it.
This was not a problem on i686 because we didn't execute any
instructions from signal handlers that would require memory operands to
be aligned to more than 4 bytes. This is not the case, however, on
x86_64, where SSE instructions are enabled by default and they require
16-byte aligned operands. Running such instructions raised a GP fault,
immediately killing the offending program with a SIGSEGV signal.
This issue caused TestKernelAlarm to fail in LibC when ran locally, and
at one point, the zsh port was affected too.
Fixes#9291
Instead, just ensure we pick the m_access_lock and then m_scan_lock when
doing a scan/re-scan of the PCI configuration space so we know nobody
can actually access the PCI configuration space during the scan.
The m_scan_lock is now a Spinlock, to ensure we cannot yield to other
process while we do the PCI configuration space scanning.
This small change simplifies the function a bit but also fixes a problem
with it.
Let's take an example to see this:
Let's say we have a reserved range between 0xe0000 to 0xfffff (EBDA),
then we want to map from the memory device (/dev/mem) the entire
EBDA to a program. If a program tries to map more than 131072 bytes,
the current logic will work - the start address is 0xe0000, and ofcourse
it's below the limit, hence it passes the first two restrictions.
Then, the third if statement will fail if we try to mmap more than
the said allowed bytes.
However, let's take another scenario, where we try to mmap from
0xf0000 - but we try to mmap less than 131072 - but more than 65536.
In such case, we again pass the first two if statements, but the third
one is passed two, because it doesn't take into account the offseted
address from the start of the reserved range (0xe0000). In such case,
a user can easily mmap 65535 bytes above 0x100000. This might
seem negligible. However, it's still a severe bug that can theoretically
be exploited into a info leak or tampering with important kernel
structures.
Storing assigning a string literal to a String object just to pass it to
a function expecting a StringView is wasteful. Let's just not do that.
For consistency's sake, this commit changes all of the other invocations
to use StringView literals, too.
Forcing the formatting to go through `Formatter<FormatString>` is
completely unnecessary, increases code size, performs a String
allocation and prevents us from using the formatting options available
on that type.
This commit also removes explicit formatters from
`BlockBasedFileSystem::BlockIndex` and `Kernel::InodeIndex`, as those
are already covered by the blanket implementation for all
`DistinctNumeric` types.
This change allows the Kernel to actually construct other interfaces
besides the E1000 type.
This solves a breakage that was introduced recently because of move
semantics.
A couple of points on this patch:
1. In current situation, we can waste time to create a KString and throw
it for nothing. This patch ensures we only create it near construction
point so we know we actually need it.
2. It's very likely to assume that non-x86 machines will expose network
device with a device tree (or with ACPI). The raspberry pi machine is a
good example of that. Therefore, each driver should explicitly ask the
correct interface name generation method, and this patch simplifies this
pattern greatly, especially in a case where the same network device can
appear as a PCI device or as device in another bus type on the same
platform target. For example, the (in)famous ne2000 device can be used
either as a PCI device or as an ISA device, depending on the model.
3. In my opinion, it seems much more readable to construct the name near
calling point of the object constructor than to just pass it with move
semantics.
By enabling LTO for the kernel_heap object too, we open the door for
optimization opportunities that come from (partially) inlining `::new`
or kmalloc. Every software spends a non-trivial amount of its run time
on allocating memory, so hopefully this change will make LTO builds even
faster.
This commit updates the Clang toolchain's version to 13.0.0, which comes
with better C++20 support and improved handling of new features by
clang-format. Due to the newly enabled `-Bsymbolic-functions` flag, our
Clang binaries will only be 2-4% slower than if we dynamically linked
them, but we save hundreds of megabytes of disk space.
The `BuildClang.sh` script has been reworked to build the entire
toolchain in just three steps: one for the compiler, one for GNU
binutils, and one for the runtime libraries. This reduces the complexity
of the build script, and will allow us to modify the CI configuration to
only rebuild the libraries when our libc headers change.
Most of the compile flags have been moved out to a separate CMake cache
file, similarly to how the Android and Fuchsia toolchains are
implemented within the LLVM repo. This provides a nicer interface than
the heaps of command-line arguments.
We no longer build separate toolchains for each architecture, as the
same Clang binary can compile code for multiple targets.
The horrible mess that `SERENITY_CLANG_ARCH` was, has been removed in
this commit. Clang happily accepts an `i686-pc-serenity` target triple,
which matches what our GCC toolchain accepts.
LLD fails to define the _GLOBAL_OFFSET_TABLE_ symbol if all inputs to it
are LLVM bitcode files (i.e. those used for LTO). To allow the kernel to
be built with ThinLTO, the workaround suggested in the original LLVM bug
report (<https://bugs.llvm.org/show_bug.cgi?id=39634>) is added in this
commit.
ProcFSGlobalInode now calls `write_bytes()`, `truncate()` and
`set_mtime()` on its associated component. This allows us to write 0 or
1 to a ProcFSSystemBoolean component to toggle a boolean value.
Spinlocks are tied to the platform they are built for, this is why they
have been moved into the Arch folder. They are still available via
"Locking/Spinlock.h"
An Aarch64 stub has been created
A new RegisterState header includes the platform specific RegisterState
header based on the platform being compiled.
The Aarch64 RegisterState header contains stubs for Debug
When booting on RPI3 firmware puts CPU in EL2 mode which is
different from QEMU's default EL3.
I've added logic to discover initial mode at boot
and then act accordingly. This results in Serenity corectly
switching to EL1 on target hardware now.
The platform independent Processor.h file includes the shared processor
code and includes the specific platform header file.
All references to the Arch/x86/Processor.h file have been replaced with
a reference to Arch/Processor.h.
The OpenFileDescription class already offers the necessary functionlity,
so implementing this was only a matter of following the structure for
`read` while handling the additional `offset` argument.
Having these bits of code factored out not only prevents duplication
now, but will also allow us to implement pread without repeating
ourselves (too much).
Values in `ioctl` are given through a pointer, but ioctl's FIONBIO
implementation was interpreting this pointer as an integer directly.
This meant that programs using `ioctl` to set a file descriptor in
blocking mode met with incorrect behavior: they passed a non-null
pointer pointing to a value of 0, but the kernel interpreted the pointer
as a non-zero integer, thus making the file non-blocking.
This commit fixes this behavior by reading the value from the userspace
pointer and using that to set the non-blocking flag on the file
descriptor.
This bug was found while trying to run the openssl tool on serenity,
which used `ioctl` to ensure newly-created sockets are in blocking mode.
Normally, trying to truncate a SysFSInode should result in EPERM error.
However, as suggested by Ali (@alimpfard), we can allow the PowerState
node to be "truncated" so one can open that file with O_TRUNC option.
Likewise, we also need to provide a way to set modified time on SysFS
inodes. For most inodes, we should return ENOTIMPL error, but for the
power state switch, we ignore the modified time setting and just return
KSuccess.
These fixes allow to do "echo -n 1 > /sys/firmware/power_state" in Shell
after gaining root permissions, to switch the power state.
There's basically no real difference in software between a SATA harddisk
and IDE harddisk. The difference in the implementation is for the host
bus adapter protocol and registers layout.
Therefore, there's no point in putting a distinction in software to
these devices.
This change also greatly simplifies and removes stale APIs and removes
unnecessary parameters in constructor calls, which tighten things
further everywhere.
While I was working on LibWeb, I got a page fault at 0xe0e0e0e4.
This indicates a destroyed RefPtr if compiled with SANITIZE_PTRS
defined. However, the page fault handler didn't print out this
indication.
This makes the page fault handler print out a note if the faulting
address looks like a recently destroyed RefPtr, OwnPtr, NonnullRefPtr,
NonnullOwnPtr, ThreadSafeRefPtr or ThreadSafeNonnullRefPtr. It will
only do this if SANITIZE_PTRS is defined, as smart pointers don't get
scrubbed without it being defined.
This exposes a small subset of the information exposed by the Linux
equivalent, and will be used to optimize applications that would like
to know the current CPU usage statistics, but don't want to read all of
the unrelated information in /proc/all
Some time ago, automatic locking was added to the AK smart pointers to
paper over various race conditions in the kernel. Until we've actually
solved the issues in the kernel, we're stuck with the locking.
However, we don't need to punish single-threaded userspace programs with
the high cost of locking. This patch moves the thread-safe variants of
RefPtr, NonnullRefPtr, WeakPtr and RefCounted into Kernel/Library/.
This change is another minor step towards removing `AK::String` from
the Kernel. Instead of dynamically allocating the storage_name we can
instead allocate it via a KString in the factory for each device, and
then push the device name down into the StorageDevice base class.
We don't have a way of doing `AK::String::formatted(..)` with a KString
at the moment, so cleaning that up will be left for a later day.
Previously there was a mix of returning plain strings and returning
explicit string views using `operator ""sv`. This change switches them
all to standardized on `operator ""sv` as it avoids a call to strlen.
Previously there was a mix of returning plain strings and returning
explicit string views using `operator ""sv`. This change switches them
all to standardized on `operator ""sv` as it avoids a call to strlen.
For now, this can only query microseconds since boot.
Use this to print a timestamp every second. This busy-loops
until a second has passed. This might be a good first use of
interrupts soon.
qemu used to not implement this timer at some point, but
it seems to work fine even in qemu now (qemu v 5.2.0).
SonarCloud flagged this "Code Smell", where we are accessing these
static methods as if they are instance methods. While it is technically
possible, it is very confusing to read when you realize they are static
functions.
SonarCloud flagged this "Code Smell", where we are accessing these
static methods as if they are instance methods. While it is technically
possible, it is very confusing to read when you realize they are static
functions.
When a process with a large heap crashes (e.g WebContent), it gets very
cumbersome to dump out a huge amount of memory.
In the vast majority of cases, we're only interested in generating a
nice backtrace from the coredump, so let's have the kernel skip over
userspace heap regions when dumping memory for now.
This is not ideal, and almost a little bit ugly, but it does make
investigating 500 MiB WebContent crashes significantly easier for now.
After building and running
objcopy -O binary Build/aarch64/Kernel/Prekernel/Prekernel \
/media/sdcard/kernel8.img
things start booting on an actual RPi4 :^)
(Assuming the sdcard contains RPi firmware, an empty config.txt,
and no other kernel*.img files).
This allows us to remove the PCI::get_interrupt_line API function. As a
result, this removes a bunch of not so great patterns that we used to
cache PCI interrupt line in many IRQHandler derived classes instead of
just using interrupt_number method of IRQHandler class.
Instead of blindly forcing BGR format on the bochs-display device, let's
ensure we do that only on QEMU bochs-display and not on VirtualBox
graphics adapter too.
- .text now starts at 0x80000, where an actual (non-qemu) RPi expects
- use magic section name ".text.first" to make sure the linker script
puts the kernel entry point at the start of the .text section
- remove a few things from the x86 linker script that aren't needed
for aarch64 (yet?)
This moves Kernel/Prekernel/linker.ld unchanged to
Kernel/Prekernel/Arch/aarch64 and Kernel/Prekernel/Arch/x86.
The aarch64 will change in a future commit.
No behavior change.
Looking at how these two constants are commonly used in other systems,
we should be able to mimic their behavior using our PT_PEEK constant.
For example, see:
https://man.netbsd.org/NetBSD-6.0.1/i386/ptrace.2
This ensures we dont try to hold the PCI Access mutex under IRQ when
printing VirtIO debug logs (which is not allowed and results in an
assertion). This is also relatively free, as it requires no allocations
(we're just storing a pointer to the rodata section).
As a demo, query the firmware version. `Meta/serenity.sh gdb aarch64`
can be used to observe that qemu puts 0x548E1 in x0 in response
to this mailbox message.
This fixes a Kernel Panic where the lazy allocation triggers inside an
ISR and grabs a mutex, which isn't allowed when interrupts are
disabled. This also fixes a bug where the mapping for VirtIO device
BARs is never allocated. #9876
Currently, writing anything to `/dev/mouse0` or `/dev/keyboard0` causes
the Kernel to panic. The reason for this is that
`[Mouse,Keyboard]Device::write` always returns 0, which is explicitly
prohibited by `VERIFY` macro in `Process::sys$write`. The fix seems
trivial; `write` should return EINVAL instead (as is the case with, for
example, `KCOVDevice`).
When testing the RTL8168 driver, it seems we can't allocate super pages
anymore. Either we expand the super pages range, or find a solution to
dynamically expand the range (or let drivers utilize other ranges).
This singleton simplifies many aspects that we struggled with before:
1. There's no need to make derived classes of Device expose the
constructor as public anymore. The singleton is a friend of them, so he
can call the constructor. This solves the issue with try_create_device
helper neatly, hopefully for good.
2. Getting a reference of the NullDevice is now being done from this
singleton, which means that NullDevice no longer needs to use its own
singleton, and we can apply the try_create_device helper on it too :)
3. We can now defer registration completely after the Device constructor
which means the Device constructor is merely assigning the major and
minor numbers of the Device, and the try_create_device helper ensures it
calls the after_inserting method immediately after construction. This
creates a great opportunity to make registration more OOM-safe.
Previously, attempting to call sys$waitid on non-child processes
returned ECHILD.
That prevented debugging non-child processes by attaching to them during
runtime (as opposed to forking and debugging the child, which is what
was previously supported).
We now allow calling sys$waitid on a any process that is being traced
by us, even if it's not our child.
Because we were holding a strong ref to the OpenFileDescription in
LocalSocket and a strong ref to the LocalSocket in Inode, we were
creating a reference cycle in the event of the socket being cleaned up
after the file description did (i.e. unlinking the file before closing
the socket), because the file description never got destructed.
The TimeWait state is intended to prevent another socket from taking the
address tuple in case any packets are still in transit after the final
close. Since this state never delivers packets to userspace, it doesn't
make sense to keep the receive buffer around.