Commit Graph

1376 Commits

Author SHA1 Message Date
Undefine
ab298ca106 Kernel: Don't crash if power state gets set to an invalid value 2023-02-18 23:52:20 +01:00
Ollrogge
361df6eff8 AK: Add conversion functions for packed DOS time format
This also adjusts the FATFS code to use the new functions and removes
the now redundant old conversion functions.
2023-02-12 13:13:15 -07:00
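The packed DOS format named in this commit stores a date and a time in two 16-bit fields. The following is a minimal sketch of that bit layout, assuming the standard FAT encoding (years counted from 1980, seconds stored in 2-second units). The struct and function names are hypothetical and are not the names this commit added to AK.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; not the actual AK API.
struct DosDateTime {
    unsigned year, month, day, hour, minute, second;
};

// Date field: bits 15-9 = years since 1980, bits 8-5 = month, bits 4-0 = day.
inline std::uint16_t pack_dos_date(unsigned year, unsigned month, unsigned day)
{
    assert(year >= 1980 && year <= 2107); // only 7 bits hold the year offset
    return static_cast<std::uint16_t>(((year - 1980u) << 9) | (month << 5) | day);
}

// Time field: bits 15-11 = hour, bits 10-5 = minute, bits 4-0 = seconds / 2.
inline std::uint16_t pack_dos_time(unsigned hour, unsigned minute, unsigned second)
{
    return static_cast<std::uint16_t>((hour << 11) | (minute << 5) | (second / 2u));
}

inline DosDateTime unpack_dos_date_time(std::uint16_t date, std::uint16_t time)
{
    DosDateTime result {};
    result.year = 1980u + ((date >> 9) & 0x7Fu);
    result.month = (date >> 5) & 0x0Fu;
    result.day = date & 0x1Fu;
    result.hour = (time >> 11) & 0x1Fu;
    result.minute = (time >> 5) & 0x3Fu;
    result.second = (time & 0x1Fu) * 2u; // resolution is two seconds
    return result;
}
```

Because seconds are stored halved, an odd second count rounds down to the previous even value on a round trip.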
Timothy Flynn
52687814ea Kernel: Explicitly copy Plan9FS read errors to registered delegates 2023-02-10 09:08:52 +00:00
MacDue
63b11030f0 Everywhere: Use ReadonlySpan<T> instead of Span<T const> 2023-02-08 19:15:45 +00:00
Tim Schumacher
81863eaf57 Kernel: Use AK::Stream to write packed binary data 2023-02-08 18:50:31 +00:00
Tim Schumacher
23e10a30ad Kernel: Modernize Error handling when serializing directory entries 2023-02-08 18:50:31 +00:00
Sam Atkins
1014aefe64 Kernel: Protect Thread::m_name with a spinlock
This replaces manually grabbing the thread's main lock.

This lets us remove the `get_thread_name` and `set_thread_name` syscalls
from the big lock. :^)
2023-02-06 20:36:53 +01:00
Sam Atkins
fe7b08dad7 Kernel: Protect Process::m_name with a spinlock
This also lets us remove the `get_process_name` and `set_process_name`
syscalls from the big lock. :^)
2023-02-06 20:36:53 +01:00
MacDue
83a59396c8 Kernel: Fix CPUInfo error propagation fixme
We can now propagate the errors directly from for_each_split_view(),
which I think counts as "Make this nicer" :^)
2023-02-05 19:31:21 +01:00
Liav A
ed67a877a3 Kernel+SystemServer+Base: Introduce the RAMFS filesystem
This filesystem is based on the code of the long-lived TmpFS. It differs
from that filesystem in one key point - its root inode doesn't have a
sticky bit on it.

Therefore, we mount it on /dev, to ensure only root can modify files on
that directory. In addition to that, /tmp is mounted directly in the
SystemServer main (start) code, so it's no longer specified in the fstab
file. We ensure that /tmp has a sticky bit and has the value 0777 for
root directory permissions, which is certainly a special case when using
RAM-backed (and in general other) filesystems.

Because of these 2 changes, it's no longer needed to maintain the TmpFS
filesystem, hence it's removed (renamed to RAMFS), because the RAMFS
represents the purpose of this filesystem in a much better way - it
relies on being backed by RAM "storage", and therefore it's easy to
conclude it's temporary and volatile, so its content is gone on either
system shutdown or unmounting of the filesystem.
2023-02-04 15:32:45 -07:00
Tim Schumacher
ae64b68717 AK: Deprecate the old AK::Stream
This also removes a few cases where the respective header wasn't
actually required to be included.
2023-01-29 19:16:44 -07:00
Liav A
722ae35329 Kernel/FileSystem: Simplify the ProcFS inode code
This is done by merging all scattered pieces of derived classes from the
ProcFSInode class into that one class, so we don't use inheritance but
rather simplistic checks to determine the proper code for each ProcFS
inode with its specific characteristics.
2023-01-29 12:59:30 +01:00
Sam Atkins
3cbc0fdbb0 Kernel: Remove declarations for non-existent methods 2023-01-27 20:33:18 +00:00
Liav A
a7677f1d9b Kernel/PCI: Expose PCI option ROM data from the sysfs interface
For each exposed PCI device in sysfs, there's a new node called "rom"
and by reading it, it exposes the raw data of a PCI option ROM blob to
a user for examining the blob.
2023-01-26 23:04:26 +01:00
Liav A
1f9d3a3523 Kernel/PCI: Hold a reference to DeviceIdentifier in the Device class
There are now 2 separate classes for almost the same object type:
- EnumerableDeviceIdentifier, which is used in the enumeration code for
  all PCI host controller classes. This is allowed to be moved and
  copied, as it doesn't support ref-counting.
- DeviceIdentifier, which inherits from EnumerableDeviceIdentifier. This
  class uses ref-counting, and is not allowed to be copied. It has a
  spinlock member in its structure to allow safely executing complicated
  IO sequences on a PCI device and its space configuration.
  There's a static method that allows a quick conversion from
  EnumerableDeviceIdentifier to DeviceIdentifier while creating a
  NonnullRefPtr out of it.

The reason for doing this is for the sake of integrity and reliability of
the system in 2 places:
- Ensure that "complicated" tasks that rely on manipulating PCI device
  registers are done in a safe manner. For example, determining a PCI
  BAR space size requires multiple read and writes to the same register,
  and if another CPU tries to do something else with our selected
  register, then the result will be a catastrophe.
- Allow the PCI API to have a united form around a shared object which
  actually holds much more data than the PCI::Address structure. This is
  fundamental if we want to do certain types of optimizations, and be
  able to support more features of the PCI bus in the foreseeable
  future.

This patch already has several implications:
- All PCI::Device(s) hold a reference to a DeviceIdentifier structure
  being given originally from the PCI::Access singleton. This means that
  all instances of DeviceIdentifier structures are located in one place,
  and all references are pointing to that location. This ensures that
  locking the operation spinlock will take effect in all the appropriate
  places.
- We no longer support adding PCI host controllers and then immediately
  allow for enumerating it with a lambda function. It was found that
  this method is extremely broken and too complicated to work
  reliably with the new paradigm being introduced in this patch. This
  means that for Volume Management Devices (Intel VMD devices), we
  simply first enumerate the PCI bus for such devices in the storage
  code, and if we find a device, we attach it in the PCI::Access method
  which will scan for devices behind that bridge and will add new
  DeviceIdentifier(s) objects to its internal Vector. Afterwards, we
  just continue as usual with scanning for actual storage controllers,
  so we will find the corresponding NVMe controllers if there were any
  behind that VMD bridge.
2023-01-26 23:04:26 +01:00
Karol Kosek
8cfd445c23 Kernel: Allow to remove files from sticky directory if user owns it
It's what the Linux chmod(1) manpage says (in the 'Restricted Deletion
Flag or Sticky Bit' section), and it just makes sense to me. :^)
2023-01-24 20:13:30 +00:00
Andrew Kaster
7ab37ee22c Everywhere: Remove string.h include from AK/Traits.h and resolve fallout
A lot of places were relying on AK/Traits.h to give it strnlen, memcmp,
memcpy and other related declarations.

In the quest to remove inclusion of LibC headers from Kernel files, deal
with all the fallout of this included-everywhere header including less
things.
2023-01-21 10:43:59 -07:00
Andrew Kaster
100fb38c3e Kernel+Userland: Move LibC/sys/ioctl_numbers to Kernel/API/Ioctl.h
This header has always been fundamentally a Kernel API file. Move it
where it belongs. Include it directly in Kernel files, and make
Userland applications include it via sys/ioctl.h rather than directly.
2023-01-21 10:43:59 -07:00
Brian Gianforcaro
bfa890251c Kernel: Fix uninitialized member variable in FATFS Filesystem
Reported-by: PVS Studio
2023-01-16 09:45:46 +01:00
Taj Morton
20991a6a3c Kernel/FileSystem: Fix kernel panic during FS init or mount failure
Resolves issue where a panic would occur if the file system failed to
initialize or mount, due to how the FileSystem was already added to
VFS's list. The newly-created FileSystem destructor would fail as a
result of the object still remaining in the IntrusiveList.
2023-01-09 19:26:01 -07:00
Liav A
04221a7533 Kernel: Mark Process::jail() method as const
We really don't want callers of this function to accidentally change
the jail, or even worse - remove the Process from an attached jail.
To ensure this never happens, we can just declare this method as const
so nobody can mutate it this way.
2023-01-07 03:44:59 +03:30
Liav A
d8ebcaede8 Kernel: Add helper function to check if a Process is in jail
Use this helper function in various places to replace the old code of
acquiring the SpinlockProtected<RefPtr<Jail>> of a Process to do that
validation.
2023-01-06 17:29:47 +01:00
Liav A
a9839d7ac5 Kernel/SysFS: Don't refresh/set-values inside the Jail spinlock scope
Only do so after a brief check if we are in a Jail or not. This fixes
SMP, because apparently it is crashing when calling try_generate()
from the SysFSGlobalInformation::refresh_data method, so the fix for
this is to simply not do that inside the Process' Jail spinlock scope,
because otherwise we will simply have a possible flow of taking
multiple conflicting Spinlocks (in the wrong order multiple times), for
the SysFSOverallProcesses generation code:
Process::current().jail(), and then Process::for_each_in_same_jail being
called, we take Process::all_instances(), and Process::current().jail()
again.
Therefore, we should at the very least eliminate the first taking of the
Process::current().jail() spinlock, in the refresh_data method of the
SysFSGlobalInformation class.
2023-01-05 23:58:13 +01:00
Taj Morton
31eeea08ba Kernel/FileSystem: Fix handling of FAT names that don't fill an entry
* Fix bug where last character of a filename or extension would be
   truncated (HELLO.TXT -> HELL.TX).
 * Fix bug where additional NULL characters would be added to long
   filenames that did not completely fill one of the Long Filename Entry
   character fields.
2023-01-04 09:02:13 +00:00
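For context on the first bug above, a FAT short name is stored as an 8-byte base name and a 3-byte extension, each padded with spaces. The sketch below shows one way to rebuild "NAME.EXT" so that only padding is trimmed and a field that fills all its bytes keeps its last character. It assumes the usual 8.3 layout; the function names are hypothetical and this is not the kernel's FATFS code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the field contents without the trailing space padding.
static std::string trim_padded_field(char const* field, std::size_t width)
{
    std::size_t length = width;
    while (length > 0 && field[length - 1] == ' ')
        --length;
    return std::string(field, length);
}

// `entry` must point at the 11 name bytes of a short directory entry:
// 8 bytes of base name followed by 3 bytes of extension.
std::string short_name_from_entry(char const* entry)
{
    assert(entry != nullptr);
    std::string name = trim_padded_field(entry, 8);
    std::string extension = trim_padded_field(entry + 8, 3);
    if (extension.empty())
        return name; // no dot when there is no extension
    return name + "." + extension;
}
```

The length is derived from the padding alone, so a name such as ABCDEFGH.TXT, which fills both fields, comes back whole.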
Taj Morton
a91fc697bb Kernel/FileSystem: Remove FIXME about old/new path being the same
Added comment after confirming that Linux and OpenBSD implement the
same behavior.
2023-01-04 09:02:13 +00:00
Ben Wiederhake
65b420f996 Everywhere: Remove unused includes of AK/Memory.h
These instances were detected by searching for files that include
AK/Memory.h, but don't match the regex:

\b(fast_u32_copy|fast_u32_fill|secure_zero|timing_safe_compare)\b

This regex is pessimistic, so there might be more files that don't
actually use any memory function.

In theory, one might use LibCPP to detect things like this
automatically, but let's do this one step after another.
2023-01-02 20:27:20 -05:00
Ben Wiederhake
143a64f9a2 Kernel: Remove unused includes of Kernel/Debug.h
These instances were detected by searching for files that include
Kernel/Debug.h, but don't match the regex:
\bdbgln_if\(|_DEBUG\b
This regex is pessimistic, so there might be more files that don't check
for any real *_DEBUG macro. There seem to be no corner cases anyway.

In theory, one might use LibCPP to detect things like this
automatically, but let's do this one step after another.
2023-01-02 20:27:20 -05:00
kleines Filmröllchen
a6a439243f Kernel: Turn lock ranks into template parameters
This step would ideally not have been necessary (increases amount of
refactoring and templates necessary, which in turn increases build
times), but it gives us a couple of nice properties:
- SpinlockProtected inside Singleton (a very common combination) can now
  obtain any lock rank just via the template parameter. It was not
  previously possible to do this with SingletonInstanceCreator magic.
- SpinlockProtected's lock rank is now mandatory; this is the majority
  of cases and allows us to see where we're still missing proper ranks.
- The type already informs us what lock rank a lock has, which aids code
  readability and (possibly, if gdb cooperates) lock mismatch debugging.
- The rank of a lock can no longer be dynamic, which is not something we
  wanted in the first place (or made use of). Locks randomly changing
  their rank sounds like a disaster waiting to happen.
- In some places, we might be able to statically check that locks are
  taken in the right order (with the right lock rank checking
  implementation) as rank information is fully statically known.

This refactoring even more exposes the fact that Mutex has no lock rank
capabilities, which is not fixed here.
2023-01-02 18:15:27 -05:00
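The idea of carrying a lock rank in the type can be sketched in standard C++ as follows. This is only an illustration: it uses std::mutex instead of the kernel's Spinlock, the class name RankedProtected and the enumerator values are made up, and the ordering rule in may_nest() is an assumption of this sketch, not necessarily the kernel's convention.

```cpp
#include <cassert>
#include <mutex>

// Illustrative ranks; the kernel's real enumerators and values may differ.
enum class LockRank : int {
    None = 0,
    FileSystem = 1,
    Process = 2,
};

// A value that can only be reached while its lock is held. The rank is part
// of the type, so it is fixed at compile time and visible in declarations.
template<typename T, LockRank Rank>
class RankedProtected {
public:
    static constexpr LockRank rank = Rank;

    template<typename Callback>
    decltype(auto) with(Callback callback)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        return callback(m_value);
    }

private:
    std::mutex m_lock;
    T m_value {};
};

// Assumed ordering rule for this sketch: an inner lock must have a strictly
// higher rank value than the lock that is already held.
template<typename Outer, typename Inner>
constexpr bool may_nest()
{
    return static_cast<int>(Outer::rank) < static_cast<int>(Inner::rank);
}

using MountTable = RankedProtected<int, LockRank::FileSystem>;
using ProcessName = RankedProtected<int, LockRank::Process>;

static_assert(may_nest<MountTable, ProcessName>(), "allowed nesting order");
static_assert(!may_nest<ProcessName, MountTable>(), "rejected nesting order");
```

Since the rank is a template parameter, the two static_assert lines are evaluated by the compiler, which is the kind of static ordering check the last bullet above hints at.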
Andreas Kling
16f934474f Kernel+Tests: Allow deleting someone else's file in my sticky directory
This should be allowed according to Dr. POSIX. :^)
2023-01-01 10:09:02 +01:00
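The restricted-deletion rule for sticky directories can be summarized in a few lines. The sketch below follows the commonly documented rule (superuser, owner of the file, or owner of the directory may delete); the struct and function names are hypothetical, and the ordinary write-permission check on the directory is assumed to have passed already.

```cpp
#include <cassert>

constexpr unsigned sticky_bit = 01000; // value of S_ISVTX on Linux

// Hypothetical, reduced stand-in for inode metadata.
struct Metadata {
    unsigned uid;
    unsigned mode;
};

// Only models the sticky-bit restriction on removing `file` from `directory`.
bool sticky_allows_unlink(unsigned user, Metadata const& directory, Metadata const& file)
{
    if (!(directory.mode & sticky_bit))
        return true; // no restricted deletion on this directory
    if (user == 0)
        return true; // superuser
    return user == file.uid || user == directory.uid;
}
```

With a sticky directory owned by user 100 that holds a file owned by user 200, users 0, 100 and 200 may remove the file, while any other user may not.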
Andreas Kling
47b9e8e651 Kernel: Annotate VirtualFileSystem::rmdir() errors with spec comments 2023-01-01 10:09:02 +01:00
Andreas Kling
8619f2c6f3 Kernel+Tests: Remove inaccurate FIXME in sys$rmdir()
We were already handling the rmdir("..") case by refusing to remove
directories that were not empty.

This patch removes a FIXME from January 2019 and adds a test. :^)
2023-01-01 10:09:02 +01:00
Andreas Kling
8d781d0216 Kernel+Tests: Make sys$rmdir() fail with EINVAL if basename is "."
Dr. POSIX says that we should reject attempts to rmdir() the file named
"." so this patch does exactly that. We also add a test.

This solves a FIXME from January 2019. :^)
2023-01-01 10:09:02 +01:00
Liav A
91db482ad3 Kernel: Reorganize Arch/x86 directory to Arch/x86_64 after i686 removal
No functional change.
2022-12-28 11:53:41 +01:00
Liav A
5ff318cf3a Kernel: Remove i686 support 2022-12-28 11:53:41 +01:00
Liav A
2e710de2f4 Kernel/FileSystem: Prevent symlink creation in veiled directory paths
Also, try to resolve the target path and check if it is allowed to be
accessed under the unveil rules.
2022-12-21 09:17:09 +00:00
Freakness109
1f1e58ed75 Kernel/Plan9FS: Propagate errors in Plan9FSMessage::append_data 2022-12-17 09:37:04 +00:00
sin-ack
3275015786 Kernel: Implement flock downgrading
This commit makes it possible for a process to downgrade a file lock it
holds from a write (exclusive) lock to a read (shared) lock. For this,
the process must point to the exact range of the flock, and must be the
owner of the lock.
2022-12-11 19:55:37 -07:00
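The two conditions named above (exact range, same owner) can be expressed as a small check. This is a hedged sketch with a hypothetical FileLock record and function name; the kernel's actual lock bookkeeping is not shown in the commit message and may look different.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record of one held byte-range lock.
struct FileLock {
    long start;
    long length;
    int owner_pid;
    bool exclusive; // true = write lock, false = read (shared) lock
};

// Returns true if a matching exclusive lock was found and downgraded.
bool try_downgrade(std::vector<FileLock>& locks, long start, long length, int pid)
{
    for (auto& lock : locks) {
        bool same_range = lock.start == start && lock.length == length;
        if (same_range && lock.owner_pid == pid && lock.exclusive) {
            lock.exclusive = false; // keep the lock, but make it shared
            return true;
        }
    }
    return false; // wrong range, wrong owner, or nothing to downgrade
}
```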
sin-ack
2a502fe232 Kernel+LibC+LibCore+UserspaceEmulator: Implement faccessat(2)
Co-Authored-By: Daniel Bertalan <dani@danielbertalan.dev>
2022-12-11 19:55:37 -07:00
sin-ack
fa692e13f9 Kernel: Use real UID/GID when checking for file access
This aligns the rest of the system with POSIX, which says that access(2)
must check against the real UID and GID, not effective ones.
2022-12-11 19:55:37 -07:00
sin-ack
3472c84d14 Kernel: Remove InodeMetadata::may_{read,write,execute}(Process const&)
These have no definition and are never used.
2022-12-11 19:55:37 -07:00
sin-ack
d5fbdf1866 Kernel+LibC+LibCore: Implement renameat(2)
Now with the ability to specify different bases for the old and new
paths.
2022-12-11 19:55:37 -07:00
Liav A
aa9fab9c3a Kernel/FileSystem: Convert the mount table from Vector to IntrusiveList
The fact that we used a Vector meant that even if creating a Mount
object succeeded, we were still at a risk that appending to the actual
mounts Vector could fail due to OOM condition. To guard against this,
the mount table is now an IntrusiveList, which always means that when
allocation of a Mount object succeeded, then inserting that object to
the list will succeed, which allows us to fail early in case of OOM
condition.
2022-12-09 23:29:33 -07:00
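The OOM argument above rests on the fact that an intrusive list keeps its link inside the element, so linking an already allocated object needs no further allocation. A minimal sketch of that property follows; Mount and MountList here are simplified stand-ins, not the kernel's Mount class or AK::IntrusiveList.

```cpp
#include <cassert>
#include <cstddef>

struct Mount {
    explicit Mount(int id) : id(id) { }
    int id;
    Mount* next { nullptr }; // link storage lives inside the element itself
};

class MountList {
public:
    // Linking never allocates, so it cannot fail once the Mount exists.
    void append(Mount& mount) noexcept
    {
        assert(mount.next == nullptr && &mount != m_tail); // not already linked
        if (m_tail)
            m_tail->next = &mount;
        else
            m_head = &mount;
        m_tail = &mount;
        ++m_size;
    }

    Mount* head() const { return m_head; }
    std::size_t size() const { return m_size; }

private:
    Mount* m_head { nullptr };
    Mount* m_tail { nullptr };
    std::size_t m_size { 0 };
};
```

By contrast, appending to a growable array may have to reallocate its backing storage, which is the failure the commit wanted to rule out after the Mount object had already been created.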
Liav A
6a555af1f1 Kernel: Add callback on ".." directory entry for a TmpFS root directory 2022-12-09 22:59:08 -07:00
Liav A
69f41eb062 Kernel: Reject create links on paths that were not unveiled as writable
This solves one of the security issues being mentioned in issue #15996.
We simply don't allow creating hardlinks on paths that were not unveiled
as writable to prevent possible bypass on a certain path that was
unveiled as non-writable.
2022-12-03 11:00:34 -07:00
Liav A
0bb7c8f4c4 Kernel+SystemServer: Don't hardcode coredump directory path
Instead, allow userspace to decide on the coredump directory path. By
default, SystemServer sets it to the /tmp/coredump directory, but users
can now change this by writing a new path to the sysfs node at
/sys/kernel/variables/coredump_directory, and also to read this node to
check where coredumps are currently generated at.
2022-12-03 05:56:59 -07:00
Liav A
7dcf8f971b Kernel: Rename SysFSSystemBoolean => SysFSSystemBooleanVariable 2022-12-03 05:56:59 -07:00
Liav A
95d8aa2982 Kernel: Allow read access sparingly to some /sys/kernel directory nodes
Those nodes are not exposing any sensitive information so there's no
harm in exposing them.
2022-12-03 05:47:58 -07:00
Liav A
1ca0ac5207 Kernel: Disallow jailed processes to read files in /sys/kernel directory
By default, disallow reading of values in that directory. Later on, we
will enable sparingly read access to specific files.

The idea that led to this mechanism was suggested by Jean-Baptiste
Boric (also known as boricj in GitHub), to prevent access to sensitive
information in the SysFS if someone adds a new file in the /sys/kernel
directory.
2022-12-03 05:47:58 -07:00
Liav A
2e55956784 Kernel: Forbid access to /sys/kernel/power_state for Jailed processes
There's simply no benefit in allowing sandboxed programs to change the
power state of the machine, so disallow writes to the mentioned node to
prevent malicious programs to request that.
2022-12-03 05:47:58 -07:00
Andreas Kling
4dd148f07c Kernel: Add File::is_regular_file()
This makes it easy and expressive to check if a File is a regular file.
2022-11-29 11:09:19 +01:00
Liav A
718ae68621 Kernel+LibCore+LibC: Implement support for forcing unveil on exec
To accomplish this, we add another VeilState which is called
LockedInherited. The idea is to apply exec unveil data, similar to
execpromises of the pledge syscall, on the current exec'ed program
during the execve sequence. When applying the forced unveil data, the
veil state is set to be locked but the special state of LockedInherited
ensures that if the new program tries to unveil paths, the request will
silently be ignored, so the program will continue running without
receiving an error, but it can still only use the paths that were
unveiled before the exec syscall. This in turn, allows us to use the
unveil syscall with a special utility to sandbox other userland programs
in terms of what is visible to them on the filesystem, and is usable on
both programs that use or don't use the unveil syscall in their code.
2022-11-26 12:42:15 -07:00
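The behavior described above can be sketched as a small state machine. The LockedInherited name comes from the commit message; the other enumerators, the UnveilData struct, the try_unveil() function and the EPERM result for a plainly locked veil are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <vector>

// Only LockedInherited is named in the commit; the rest is assumed.
enum class VeilState { None, Dropped, Locked, LockedInherited };

struct UnveilData {
    VeilState state { VeilState::None };
    std::vector<std::string> paths;
};

// Returns 0 on success or a negative errno value.
int try_unveil(UnveilData& data, std::string const& path)
{
    switch (data.state) {
    case VeilState::Locked:
        return -EPERM; // assumption: a plain locked veil rejects new requests
    case VeilState::LockedInherited:
        return 0; // silently ignored: no error, and the path list is unchanged
    case VeilState::None:
    case VeilState::Dropped:
        data.paths.push_back(path);
        data.state = VeilState::Dropped;
        return 0;
    }
    return -EINVAL; // not reached for valid enumerators
}
```

A program started under LockedInherited can therefore keep its own unveil calls in place without failing, while still seeing only the paths chosen before the exec.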
sin-ack
3b03077abb Kernel: Update the ".." inode for directories after a rename
Because the ".." entry in a directory is a separate inode, if a
directory is renamed to a new location, then we should update this entry
to point to the new parent directory as well.

Co-authored-by: Liav A <liavalb@gmail.com>
2022-11-25 17:33:05 +01:00
Andreas Kling
a9d55ddf57 Kernel/TmpFS: Update mtime instead of ctime when asked to update mtime 2022-11-24 16:56:27 +01:00
Andreas Kling
10fa72d451 Kernel: Use AK::Time for InodeMetadata timestamps instead of time_t
Before this change, we were truncating the nanosecond part of file
timestamps in many different places.
2022-11-24 16:56:27 +01:00
Andreas Kling
fb00d3ed25 Kernel+lsirq: Track per-CPU IRQ handler call counts
Each GenericInterruptHandler now tracks the number of calls that each
CPU has serviced.

This takes care of a FIXME in the /sys/kernel/interrupts generator.

Also, the lsirq command line tool now displays per-CPU call counts.
2022-11-19 15:39:30 +01:00
Andreas Kling
9b3db63e14 Kernel: Rename GenericInterruptHandler "invoking count" to "call count" 2022-11-19 15:39:30 +01:00
Liav A
31d4c07dee Kernel: Add missing includes for Mount.h file 2022-11-11 10:25:54 +01:00
Liav A
3cc0d60141 Kernel: Split the Ext2FileSystem.{cpp,h} files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
1c91881a1d Kernel: Split the ISO9660FileSystem.{cpp,h} files to smaller components 2022-11-08 02:54:48 -07:00
Liav A
fca3b7f1f9 Kernel: Split the DevPtsFS files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
3fc52a6d1c Kernel: Split the Plan9FileSystem.{cpp,h} file into smaller components 2022-11-08 02:54:48 -07:00
Liav A
3906dd3aa3 Kernel: Split the ProcFS core file into smaller components 2022-11-08 02:54:48 -07:00
Liav A
e882b2ed05 Kernel: Split the FATFileSystem.{cpp,h} files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
5e6101dd3e Kernel: Split the TmpFS core files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
f53149d5f6 Kernel: Split the SysFS core files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
5e062414c1 Kernel: Add support for jails
Our implementation for Jails resembles much of how FreeBSD jails are
working - it's essentially only a matter of using a RefPtr in the
Process class to a Jail object. Then, when we iterate over all processes
in various cases, we could ensure if either the current process is in
jail and therefore should be restricted what is visible in terms of
PID isolation, and also to be able to expose metadata about Jails in
/sys/kernel/jails node (which does not reveal anything to a process
which is in jail).

A lifetime model for the Jail object is currently plain simple - there's
simply no way to manually delete a Jail object once it was created. Such
feature should be carefully designed to allow safe destruction of a Jail
without the possibility of releasing a process which is in Jail from the
actual jail. Each process which is attached into a Jail cannot leave it
until the end of a Process (i.e. when finalizing a Process). All jails
are kept being referenced in the JailManagement. When a last attached
process is finalized, the Jail is automatically destroyed.
2022-11-05 18:00:58 -06:00
Timon Kruiper
0475407f9f Kernel: Remove bunch of unused includes in SysFS/Processes.cpp 2022-10-26 20:01:45 +02:00
Timon Kruiper
97f1fa7d8f Kernel: Include missing headers for various files
With these missing header files, we can now build these files for
aarch64.
2022-10-26 20:01:45 +02:00
Timon Kruiper
fcbb6b79ac Kernel: Don't expose processor information for aarch64 in sysfs
We do not (yet) acquire this information for the aarch64 processors.
2022-10-26 20:01:45 +02:00
Liav A
75f01692b4 Kernel+Userland: Move /sys/firmware/power_state to /sys/kernel directory
Let's put the power_state global node into the /sys/kernel directory,
because that directory represents all global nodes and variables being
related to the Kernel. It's also a mutable node, that is more acceptable
being in the mentioned directory due to the fact that all other files in
the /sys/firmware directory are just firmware blobs and are not mutable
at all.
2022-10-25 15:33:34 -06:00
Liav A
a91589c09b Kernel: Introduce global variables and stats in /sys/kernel directory
The ProcFS is an utter mess currently, so let's start move things that
are not related to processes-info. To ensure it's done in a sane manner,
we start by duplicating all /proc/ global nodes to the /sys/kernel/
directory, then we will move Userland to use the new directory so the
old directory nodes can be removed from the /proc directory.
2022-10-25 15:33:34 -06:00
Liav A
03ae9f94cf Kernel/FileSystem: Remove hardcoded unveil path of /usr/lib/Loader.so
If a program needs to execute a dynamic executable program, then it
should unveil /usr/lib/Loader.so by itself and not rely on the Kernel to
allow using this binary without any sense of respect to unveil promises
being made by the running parent program.
2022-10-24 19:41:32 -06:00
Gunnar Beutner
ce4b66e908 Kernel: Add support for MSG_NOSIGNAL and properly send SIGPIPE
Previously we didn't send the SIGPIPE signal to processes when
sendto()/sendmsg()/etc. returned EPIPE. And now we do.

This also adds support for MSG_NOSIGNAL to suppress the signal.
2022-10-24 15:49:39 +02:00
Liav A
fea3cb5ff9 Kernel/FileSystem: Discard safely filesystems when unmounted last time
This commit reached that goal of "safely discarding" a filesystem by
doing the following:
1. Stop using the s_file_system_map HashMap as it was an unsafe measure
to access pointers of FileSystems. Instead, make sure to register all
FileSystems at the VFS layer, with an IntrusiveList, to avoid problems
related to OOM conditions.
2. Make sure to cleanly remove the DiskCache object from a BlockBased
filesystem, so the destructor of such object will not need to do that in
the destruction point.
3. For ext2 filesystems, don't cache the root inode at m_inode_cache
HashMap. The reason for this is that when unmounting an ext2 filesystem,
we lookup at the cache to see if there's a reference to a cached inode
and if that's the case, we fail with EBUSY. If we keep the m_root_inode
also being referenced at the m_inode_cache map, we have 2 references to
that object, which will lead to fail with EBUSY. Also, it's much simpler
to always ask for a root inode and get it immediately from m_root_inode,
instead of looking up the cache for that inode.
2022-10-22 16:57:52 -04:00
Liav A
24977996a6 Kernel: Append root filesystem to the VFS FileBackedFileSystem list 2022-10-22 16:57:52 -04:00
Liav A
0fd7b688af Kernel: Introduce support for using FileSystem object in multiple mounts
The idea is to enable mounting FileSystem objects across multiple mounts
in contrast to what happened until now - each mount has its own unique
FileSystem object being attached to it.

Considering a situation of mounting a block device at 2 different mount
points in the system, there were a couple of critical flaws due to how
the previous "design" worked:
1. BlockBasedFileSystem(s) that pointed to the same actual device had a
separate DiskCache object being attached to them. Because both instances
were not synchronized by any means, corruption of the filesystem is most
likely achievable by a simple cache flush of either of the instances.
2. For superblock-oriented filesystems (such as the ext2 filesystem),
lack of synchronization between both instances can lead to severe
corruption in the superblock, which could render the entire filesystem
unusable.
3. Flags of a specific filesystem implementation (for example, with xfs
on Linux, one can instruct to mount it with the discard option) must be
honored across multiple mounts, to ensure expected behavior against a
particular filesystem.

This patch lays the foundations to start fixing the issues mentioned above.
However, there are still major issues to solve, so this is only a start.
2022-10-22 16:57:52 -04:00
Liav A
965afba320 Kernel/FileSystem: Add a few missing includes
In preparation to future commits, we need to ensure that
OpenFileDescription.h doesn't include the VirtualFileSystem.h file to
avoid include loops.
2022-10-22 16:57:52 -04:00
Liav A
07387ec19a Kernel+Base: Introduce MS_NOREGULAR mount flag
This flag doesn't conform to any POSIX standard nor is found in any OS
out there. The idea behind this mount flag is to ensure that only
non-regular files will be placed in a filesystem, which includes device
nodes, symbolic links, directories, FIFOs and sockets. Currently, the
only valid case for using this mount flag is for TmpFS instances, where
we want to mount a TmpFS but disallow any kind of regular file and only
allow other types of files on the filesystem.
2022-10-22 19:18:15 +02:00
Liav A
97f8927da6 Kernel: Remove the DevTmpFS class
Although this code worked quite well, it is considered to be a code
duplication with the TmpFS code which is more tested and works quite
well for a variety of cases. The only valid reason to keep this
filesystem was that it enforces that no regular files will be created at
all in the filesystem. Later on, we will re-introduce this feature in a
sane manner. Therefore, this can be safely removed after SystemServer no
longer uses this filesystem type anymore.
2022-10-22 19:18:15 +02:00
Liav A
c2b5c5bac5 Kernel: Add support for device nodes in TmpFS
Later on we will remove the DevTmpFS code, so in order to support
mounting TmpFS instead, we need to be able to create device nodes on
the filesystem.
2022-10-22 19:18:15 +02:00
Timon Kruiper
9827c11d8b Kernel: Move InterruptDisabler out of Arch directory
The code in this file is not architecture specific, so it can be moved
to the base Kernel directory.
2022-10-17 20:11:31 +02:00
Liav A
b9dca3300e Kernel: Use more fine-grained content data block granularity in TmpFS
Instead of just having a giant KBuffer that is not resizeable easily, we
use multiple AnonymousVMObjects in one Vector to store them.
The idea is to not have to do giant memcpy or memset each time we need
to allocate or de-allocate memory for TmpFS inodes, but instead, we can
allocate only the desired block range when trying to write to it.
Therefore, it is also possible to have data holes in the inode content
in case of skipping an entire set of one data block or more when writing
to the inode content, thus, making memory usage much more efficient.

To ensure we don't run out of virtual memory range, don't allocate a
Region in advance to each TmpFSInode, but instead try to allocate a
Region on IO operation, and then use that Region to map the VMObjects
in IO loop.
2022-10-16 17:46:40 +02:00
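The block-granular storage described above can be illustrated with ordinary heap blocks. This is a sketch only: the kernel uses AnonymousVMObjects and maps them through a Region during I/O, whereas the hypothetical SparseContent class below uses std::unique_ptr blocks and an assumed block size of 4096 bytes.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

class SparseContent {
public:
    static constexpr std::size_t block_size = 4096; // assumed for this sketch

    void write(std::size_t offset, std::uint8_t const* data, std::size_t length)
    {
        assert(data != nullptr || length == 0);
        std::size_t written = 0;
        while (written < length) {
            std::size_t position = offset + written;
            std::size_t index = position / block_size;
            std::size_t offset_in_block = position % block_size;
            std::size_t chunk = std::min(length - written, block_size - offset_in_block);
            if (index >= m_blocks.size())
                m_blocks.resize(index + 1); // skipped slots stay null (holes)
            if (!m_blocks[index])
                m_blocks[index] = std::make_unique<Block>(); // zero-filled
            std::memcpy(m_blocks[index]->data() + offset_in_block, data + written, chunk);
            written += chunk;
        }
        m_size = std::max(m_size, offset + length);
    }

    // Holes read back as zero without ever being allocated.
    std::uint8_t read_byte(std::size_t offset) const
    {
        std::size_t index = offset / block_size;
        if (index >= m_blocks.size() || !m_blocks[index])
            return 0;
        return (*m_blocks[index])[offset % block_size];
    }

    std::size_t allocated_blocks() const
    {
        std::size_t count = 0;
        for (auto const& block : m_blocks)
            if (block)
                ++count;
        return count;
    }

    std::size_t size() const { return m_size; }

private:
    using Block = std::array<std::uint8_t, block_size>;
    std::vector<std::unique_ptr<Block>> m_blocks;
    std::size_t m_size { 0 };
};
```

Writing three bytes at block index 10 allocates a single block while the logical size covers all eleven, which is the memory saving the commit message describes.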
Undefine
135ca3fa1b Kernel: Add support for the FAT32 filesystem
This commit adds read-only support for the FAT32 filesystem. It also
includes support for long file names.
2022-10-14 18:36:40 -06:00
Liav A
3cf6ac1b3f Kernel: Fix typo in comment in Ext2FileSystem::read_bytes_locked method 2022-09-26 20:13:13 +01:00
Liav A
0a793a7fa3 Kernel/FileSystem: Remove the locking of a Inode mutex in InodeVMObjects
We no longer require to lock the m_inode_lock in the SharedInodeVMObject
code as the methods write_bytes and read_bytes of the Inode class do
this for us now.
2022-09-26 22:06:10 +03:00
Liav A
9252a892bb Kernel: Abstracts x86 reboot and shutdown specific methods
We move QEMU and VirtualBox shutdown sequences to a separate file, as
well as moving the i8042 reboot code sequence too to another file.

This allows us to abstract specific methods from the power state node
code of the SysFS filesystem, to allow other architectures to put their
methods there too in the future.
2022-09-20 18:43:05 +01:00
Liav A
3ad0e1a1d5 Kernel: Handle mmap requests on zero-length data file inodes safely 2022-09-16 14:55:45 +03:00
Liav A
c88cc8557f Kernel/FileSystem: Make Inode::{write,read}_bytes methods non-virtual
We make these methods non-virtual because we want to ensure we properly
enforce locking of the m_inode_lock mutex. Also, for write operations,
we want to call prepare_to_write_data before the actual write. The
previous design required us to ensure the callers do that at various
places which led to hard-to-find bugs. By moving everything to a place
where we call prepare_to_write_data only once, we eliminate the possibility
of forgetting to call it on some code path in the kernel.
2022-09-16 14:55:45 +03:00
Liav A
4f4717e351 Kernel/FileSystem: Mark ext2 inode block list non-const
The block list required a bit of work, and now the only method being
declared const to bypass its const-iness is the read_bytes method that
calls a new method called compute_block_list_with_exclusive_locking that
takes care of proper locking before trying to update the block list data
of the ext2 inode.
2022-09-16 14:55:45 +03:00
Liav A
843bd43c5b Kernel/FileSystem: Mark ext2 inode lookup cache non-const
For the lookup cache, no method being declared const tried to modify it,
so it was easy to drop the mutable declaration on the HashMap member.
2022-09-16 14:55:45 +03:00
Andreas Kling
2cc947ede4 Kernel: Use correct timestamp in sys$utimens()
We were mixing up the nanosecond and second parts of the timestamps.

Regressed in 280694bb46.
2022-09-13 17:03:31 +02:00
Andreas Kling
30861daa93 Kernel: Simplify the File memory-mapping API
Before this change, we had File::mmap() which did all the work of
setting up a VMObject, and then creating a Region in the current
process's address space.

This patch simplifies the interface by removing the region part.
Files now only have to return a suitable VMObject from
vmobject_for_mmap(), and then sys$mmap() itself will take care of
actually mapping it into the address space.

This fixes an issue where we'd try to block on I/O (for inode metadata
lookup) while holding the address space spinlock. It also reduces time
spent holding the address space lock.
2022-08-24 14:57:51 +02:00
Andreas Kling
cf16b2c8e6 Kernel: Wrap process address spaces in SpinlockProtected
This forces anyone who wants to look into and/or manipulate an address
space to lock it. And this replaces the previous, more flimsy, manual
spinlock use.

Note that pointers *into* the address space are not safe to use after
you unlock the space. We've got many issues like this, and we'll have
to track those down as well.
2022-08-24 14:57:51 +02:00
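A rough sketch of the "wrap the data in its lock" idea from the message above, written in standard C++. LockProtected and with() are hypothetical names, and std::mutex stands in for the kernel spinlock; the real SpinlockProtected may differ in its interface.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

// Hypothetical wrapper: the protected value is private, so the only way to
// reach it is through with(), which holds the lock for the whole callback.
template<typename T>
class LockProtected {
public:
    explicit LockProtected(T value)
        : m_value(std::move(value))
    {
    }

    template<typename Callback>
    decltype(auto) with(Callback callback)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        return callback(m_value);
    }

private:
    std::mutex m_lock;
    T m_value;
};
```

As the message warns, a pointer or reference that escapes from the callback is no longer protected once with() returns, so callbacks should return copies.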
Andreas Kling
434d77cd43 Kernel/ProcFS: Silently ignore attempts to update ProcFS timestamps
We have to override Inode::update_timestamps() for ProcFS inodes,
otherwise we'll get the default behavior of erroring with ENOTIMPL.
2022-08-23 01:00:40 +02:00
Andreas Kling
5307e1bf01 Kernel/SysFS: Silently ignore attempts to update SysFS timestamps
We have to override Inode::update_timestamps() for SysFS inodes,
otherwise we'll get the default behavior of erroring with ENOTIMPL.
2022-08-23 00:55:41 +02:00
Andreas Kling
280694bb46 Kernel: Update atime/ctime/mtime timestamps atomically
Instead of having three separate APIs (one for each timestamp),
there's now only Inode::update_timestamps() and it takes 3x optional
timestamps. The non-empty timestamps are updated while holding the inode
mutex, and the outside world no longer has to look at intermediate
timestamp states.
2022-08-22 17:56:03 +02:00
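The single-call update described above might look roughly like this in standard C++. TimestampedInode, the Timestamps struct and the integer timestamps are assumptions made for the sketch; only the update_timestamps name and its three optional arguments come from the message.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <optional>

// Hypothetical timestamp triple; plain integers stand in for real time values.
struct Timestamps {
    std::int64_t atime { 0 };
    std::int64_t ctime { 0 };
    std::int64_t mtime { 0 };
};

class TimestampedInode {
public:
    // All non-empty timestamps are applied while the lock is held, so a
    // reader never sees a state where only some of them have changed.
    void update_timestamps(std::optional<std::int64_t> new_atime,
        std::optional<std::int64_t> new_ctime,
        std::optional<std::int64_t> new_mtime)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        if (new_atime.has_value())
            m_times.atime = *new_atime;
        if (new_ctime.has_value())
            m_times.ctime = *new_ctime;
        if (new_mtime.has_value())
            m_times.mtime = *new_mtime;
    }

    // Returns a consistent snapshot taken under the same lock.
    Timestamps timestamps()
    {
        std::lock_guard<std::mutex> guard(m_lock);
        return m_times;
    }

private:
    std::mutex m_lock;
    Timestamps m_times;
};
```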
Anthony Iacono
f86b671de2 Kernel: Use Process::credentials() and remove user ID/group ID helpers
Move away from using the group ID/user ID helpers in the process to
allow for us to take advantage of the immutable credentials instead.
2022-08-22 12:46:32 +02:00
Andreas Kling
dbe182f1c6 Kernel: Make Inode::resolve_as_link() take credentials as input 2022-08-21 16:17:13 +02:00
Andreas Kling
006f753647 Kernel: Make File::{chown,chmod} take credentials as input
...instead of getting them from Process::current(). :^)
2022-08-21 16:15:29 +02:00
Andreas Kling
c3351d4b9f Kernel: Make VirtualFileSystem functions take credentials as input
Instead of getting credentials from Process::current(), we now require
that they be provided as input to the various VFS functions.

This ensures that an atomic set of credentials is used throughout an
entire VFS operation.
2022-08-21 16:02:24 +02:00
James Bellamy
386642ffcf Kernel: Use credentials object in VirtualFileSystem
Use credentials object in mknod, create, mkdir, and symlink
2022-08-21 14:55:01 +02:00
Andreas Kling
728c3fbd14 Kernel: Use RefPtr instead of LockRefPtr for Custody
By protecting all the RefPtr<Custody> objects that may be accessed from
multiple threads at the same time (with spinlocks), we remove the need
for using LockRefPtr<Custody> (which is basically a RefPtr with a
built-in spinlock.)
2022-08-21 12:25:14 +02:00
Andreas Kling
122d7d9533 Kernel: Add Credentials to hold a set of user and group IDs
This patch adds a new object to hold a Process's user credentials:

- UID, EUID, SUID
- GID, EGID, SGID, extra GIDs

Credentials are immutable and child processes initially inherit the
Credentials object from their parent.

Whenever a process changes one or more of its user/group IDs, a new
Credentials object is constructed.

Any code that wants to inspect and act on a set of credentials can now
do so without worrying about data races.
2022-08-20 18:32:50 +02:00
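The immutable-credentials scheme described above can be illustrated with a small standard C++ sketch. The Credentials fields follow the list in the message, but the Process class, set_euid and the use of std::shared_ptr are assumptions for the example rather than the kernel's actual types.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical credentials object; never modified once it is shared.
struct Credentials {
    unsigned uid, euid, suid;
    unsigned gid, egid, sgid;
    std::vector<unsigned> extra_gids;
};

class Process {
public:
    explicit Process(std::shared_ptr<Credentials const> credentials)
        : m_credentials(std::move(credentials))
    {
    }

    // Callers get a snapshot that stays valid and unchanged for as long as
    // they hold it, even if the process changes its IDs meanwhile.
    std::shared_ptr<Credentials const> credentials() const
    {
        std::lock_guard<std::mutex> guard(m_lock);
        return m_credentials;
    }

    // Changing an ID builds a new object instead of mutating the shared one.
    void set_euid(unsigned euid)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        auto updated = std::make_shared<Credentials>(*m_credentials);
        updated->euid = euid;
        m_credentials = std::move(updated);
    }

private:
    mutable std::mutex m_lock;
    std::shared_ptr<Credentials const> m_credentials;
};
```

A child can be modelled by constructing a second Process from parent.credentials(), which shares the same object until either side changes an ID.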
Andreas Kling
bec314611d Kernel: Move InodeMetadata methods out of line 2022-08-20 17:20:44 +02:00
Andreas Kling
11eee67b85 Kernel: Make self-contained locking smart pointers their own classes
Until now, our kernel has reimplemented a number of AK classes to
provide automatic internal locking:

- RefPtr
- NonnullRefPtr
- WeakPtr
- Weakable

This patch renames the Kernel classes so that they can coexist with
the original AK classes:

- RefPtr => LockRefPtr
- NonnullRefPtr => NonnullLockRefPtr
- WeakPtr => LockWeakPtr
- Weakable => LockWeakable

The goal here is to eventually get rid of the Lock* classes in favor of
using external locking.
2022-08-20 17:20:43 +02:00
Andreas Kling
e475263113 AK+Kernel: Add AK::AtomicRefCounted and use everywhere in the kernel
Instead of having two separate implementations of AK::RefCounted, one
for userspace and one for kernelspace, there is now RefCounted and
AtomicRefCounted.
2022-08-20 17:15:52 +02:00
kleines Filmröllchen
4314c25cf2 Kernel: Require lock rank for Spinlock construction
All users which relied on the default constructor use a None lock rank
for now. This will make it easier to remove LockRank in the future and
to actually annotate the ranks by searching for None.
2022-08-19 20:26:47 -07:00
Andreas Kling
cb04caa18e Kernel: Protect the Custody cache with a spinlock
Protecting it with a mutex meant that anyone unref()'ing a Custody
might need to block on said mutex.
2022-08-18 00:58:34 +02:00
Andreas Kling
17de393253 Kernel: Remove outdated FIXME in Custody.h 2022-08-18 00:58:34 +02:00
Mike Akers
de980de0e4 Kernel: Lock the inode before writing in SharedInodeVMObject::sync
We ensure that SharedInodeVMObject::sync takes the inode lock before
calling the Inode's virtual write_bytes method directly. This avoids an
assertion on the unlocked inode lock, which regressed recently. This is
not a complete fix: having to lock on each path before calling the
write_bytes method should be avoided because it can lead to
hard-to-find bugs, so this commit only fixes the problem temporarily.
2022-08-16 16:54:03 +02:00
Liav A
c3eaa73113 Kernel/Storage: Remove InterfaceType enum
This enum was created to distinguish between the command set and the
interface type, as ATAPI devices are simply ATA devices utilizing the
SCSI command set. Because we don't support ATAPI, making such a
distinction is pointless, so let's remove this for now.
2022-08-14 01:09:03 +01:00
Kristiyan Stoimenov
9e1bea50ee Kernel/VFS: Check that mount-point is not in use
Before committing the mount, verify that this host i-node is not already
a mount-point.
2022-08-12 19:57:18 -07:00
Liav A
cf33d0b5f7 Kernel/FileSystem: Use a new debug flag for SysFS debug messages 2022-08-08 02:33:25 +00:00
Liav A
fcc0e4d538 Kernel/FileSystem: Funnel calls to Inode::prepare_to_write_data method
Instead of requiring each FileSystem implementation to call this method
when trying to write data, do the calls at 2 points to avoid further
calls (or lack of them due to not remembering to use it) at other files
and locations in the codebase.
2022-07-30 23:31:08 +02:00
b14ckcat
b8cfec7b1f Kernel: Move SysFS USB create function 2022-07-27 05:52:35 +00:00
Liav A
60f7d61ad2 Kernel/SysFS: Fix parent directory hierarchy with symbolic links
We should actually start counting from the parent directory and not from
the symbolic link, as otherwise it represents a wrong count of hops from
the actual mountpoint.

The symlinks in /sys/dev/block and /sys/dev/char worked only by luck,
because I had set them to the wrong parent directory, which is the
/sys/dev directory, so with the symlink it was 3 hops to /sys, together
with the root directory. Therefore, everything seemed to work.

Now that the device symlinks in /sys/dev/block and /sys/dev/char are set
to the right parent directory and we start measuring hops from the root
directory with the parent directory of a symlink, everything seems to
work correctly.
2022-07-24 13:38:24 +01:00
Linus Groh
8150d71821 Everywhere: Prefix 'TYPEDEF_DISTINCT_ORDERED_ID' with 'AK_' 2022-07-22 23:09:43 +01:00
Idan Horowitz
3a80b25ed6 Kernel: Support F_SETLKW in fcntl 2022-07-21 16:39:22 +02:00
Liav A
3af70cb0fc Kernel/Devices: Abstract SysFS Device add/remove methods more properly
It is starting to get a little messy with how each device can try to add
or remove itself to either /sys/dev/block or /sys/dev/char directories.

To better do this, we introduce 4 virtual methods to take care of that,
so until we ensure all nodes in /sys/dev/block and /sys/dev/char are
actual symlinks, we allow the Device base class to call virtual methods
upon insertion or before being destroyed, so it adds itself elegantly to
either of these directories or removes itself when needed.

For special cases where we need to create symlinks, we have two virtual
methods to be called otherwise to do almost the same thing mentioned
before, but to use symlinks instead.
2022-07-19 11:02:37 +01:00
Liav A
da8d18b263 Kernel/SysFS: Add exposing interface for DisplayConnectors
Under normal conditions (when mounting SysFS in /sys), there will be a
new directory in the /sys/devices directory called "graphics".
For now, under that directory there will be only a sub-directory called
"connectors" which will contain all DisplayConnectors' details, each in
its own sub-directory too, distinguished in naming with its minor
number.

Therefore, /sys/devices/graphics/connectors/MINOR_NUMBER/ will contain:
- General device attributes such as mutable_mode_setting_capable,
  double_buffering_capable, flush_support, partial_flush_support and
  refresh_rate_support. These values are exposed in the ioctl interface
  of the DisplayConnector class too, but these can be useful later on
  for command line utilities that want/need to expose these basic
  settings.
- The EDID blob, simply named "edid". This will help userspace to fetch
  the edid without the need of using the ioctl interface later on.
2022-07-19 11:02:37 +01:00
Hendiadyoin1
c3e57bfccb Kernel: Try to set [cm]time in Inode::did_modify_contents
This indirectly resolves a fixme in sys$msync
2022-07-15 12:42:43 +02:00
Liav A
1dbd32488f Kernel/SysFS: Add /sys/devices/storage directory
This change in fact does the following:
1. Use support for symlinks between /sys/dev/block/ storage device
identifier nodes and devices in /sys/devices/storage/{LUN}.
2. Add basic nodes in a /sys/devices/storage/{LUN} directory, to let
userspace know about the device and its details.
2022-07-15 12:29:23 +02:00
Liav A
cdab213750 Kernel/SysFS: Adapt USB plug code to work with SysFS patterns 2022-07-15 12:29:23 +02:00
Liav A
70afa0b171 Kernel/SysFS: Mark SysFSDirectory traverse and lookup methods as final
This forces us to remove duplicated code across the SysFS code. This
results in great simplification of how the SysFS works now, because we
enforce one way to treat SysFSDirectory objects.
2022-07-15 12:29:23 +02:00
Liav A
6733f19b3c Kernel/SysFS: Reduce the responsibilities of the Registry object
Instead, let the /sys/dev/block and /sys/dev/char directories handle
the registering part of SysFSDeviceComponents by themselves.
2022-07-15 12:29:23 +02:00
Liav A
ecc29bb52e Kernel/SysFS: Add Symbolic link functionality to the filesystem
This will be used later on to help connect a node at /sys/dev/block/
that represents a Storage device to a directory in /sys/devices/storage/
with details on that device in that directory.
2022-07-15 12:29:23 +02:00
Liav A
7e88bbe550 Kernel/SysFS: Add two methods related to relative paths for components
These methods will be used later on to introduce symbolic links support
in the SysFS, so the kernel will be able to resolve relative paths of
components in filesystem based on using the m_parent_directory pointer
in each SysFSComponent object.
2022-07-15 12:29:23 +02:00
Liav A
6ff1aeb64d Kernel/SysFS: Rename Devices code folder => DeviceIdentifiers
This folder in the SysFS code represents everything related to /sys/dev,
which is a directory meant to be a convenient interface to track all IDs
of all block and character devices (ID = major:minor numbers).
2022-07-15 12:29:23 +02:00
sin-ack
3f3f45580a Everywhere: Add sv suffix to strings relying on StringView(char const*)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).

No functional changes.
2022-07-12 23:11:35 +02:00
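The difference between a char const* constructor and a literal suffix can be shown with the standard library's std::string_view, which has an analogous ""sv literal. This is only an analogy: AK's StringView is a separate class, and the two helper functions below are hypothetical names for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

using namespace std::literals::string_view_literals;

// Constructing from a char const* has to scan for the terminating NUL,
// so the length stops at the first embedded '\0'.
constexpr std::size_t length_via_pointer()
{
    return std::string_view("ab\0cd").size();
}

// The literal suffix receives the length from the compiler, so no scan is
// needed and the embedded '\0' is kept as part of the view.
constexpr std::size_t length_via_literal()
{
    return "ab\0cd"sv.size();
}
```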
Tim Schumacher
3b3af58cf6 Kernel: Annotate all KBuffer and DoubleBuffer with a custom name 2022-07-12 00:55:31 +01:00
Tim Schumacher
cd189999d1 Kernel: Don't let locks of the same owner conflict with each other
Documentation on POSIX locks seems sparse, but this is how the Linux
kernel implementation handles it.
2022-07-08 22:27:38 +00:00
Tim Schumacher
dc6016cd18 Kernel: Don't fail on unlocking nonexistent file locks
I haven't found any POSIX specification on this, but the Linux kernel
appears to handle it like that.

This is required by QEMU, as it just bulk-unlocks all its file locking
bytes without checking first if they are held.
2022-07-08 22:27:38 +00:00
Andrew Kaster
940be19259 Kernel: Create /proc/pid/cmdline to expose process arguments in procfs
In typical serenity style, they are just a JSON array
2022-06-19 09:05:35 +02:00
Andreas Kling
0132e494d6 Kernel: Add missing #include in SysFS.cpp 2022-06-17 12:22:07 +02:00
Liav A
30b58cd06c Kernel/SysFS: Remove derived BIOSSysFSComponent classes
These are not needed, because both do exactly the same thing, so we can
move the code to the BIOSSysFSComponent class.
2022-06-17 11:01:27 +02:00
Liav A
23c1c40e86 Kernel/SysFS: Migrate components code from SysFS.cpp to the SysFS folder 2022-06-17 11:01:27 +02:00
Liav A
4d05a41b30 Kernel/SysFS: Split the bulky BIOS.h file into multiple files 2022-06-17 11:01:27 +02:00
Liav A
9c6834698f Kernel/Firmware: Add map_ebda and map_bios methods in the original place
In a previous commit I moved everything into the new subdirectories in
FileSystem/SysFS directory without trying to actually make changes in
the code itself too much. Now it's time to split the code to make it
more readable and understandable, hence this change occurs now.
2022-06-17 11:01:27 +02:00
Liav A
99bac4f34f Kernel/SysFS: Split bulky SysFSPCI file into separate files 2022-06-17 11:01:27 +02:00
Liav A
e488245234 Kernel/SysFS: Split bulky SysFSUSB file into two separate class files 2022-06-17 11:01:27 +02:00
Liav A
290eb53cb5 Kernel/SysFS: Stop cluttering the codebase with pieces of SysFS parts
Instead, start to put everything in one place to resemble the directory
structure of the SysFS when actually using it.
2022-06-17 11:01:27 +02:00
Andreas Kling
4e4a930b13 Kernel: Use the system boot time as default timestamp in /sys and /dev 2022-06-15 17:15:04 +02:00
Timon Kruiper
a4534678f9 Kernel: Implement InterruptDisabler using generic Processor functions
Now that the code does not use architectural specific code, it is moved
to the generic Arch directory and the paths are modified accordingly.
2022-06-02 13:14:12 +01:00
Liav A
58acdce41f Kernel/FileSystem: Simplify even more the mount syscall
As with the previous commit, we put a distinction between filesystems
that require a file description and those which don't, but now in a much
more readable mechanism - all initialization properties as well as the
create static method are grouped to create the FileSystemInitializer
structure. Then when we need to initialize an instance, we iterate over
a table of these structures, checking for matching structure and then
validating the given arguments from userspace against the requirements
to ensure we can create a valid instance of the requested filesystem.
2022-05-29 19:31:02 +01:00
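The table-driven lookup described above could be sketched like this in standard C++. Only the FileSystemInitializer name comes from the message; the field names, the MountResult type, mount_sketch, the two table entries and the error strings are assumptions chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical result type; the kernel would return a filesystem or an error.
struct MountResult {
    bool ok;
    std::string_view detail;
};

// One entry per filesystem: its name, its requirements and its factory.
struct FileSystemInitializer {
    std::string_view short_name;
    bool requires_open_file_description;
    MountResult (*create)(std::optional<int> fd);
};

MountResult create_ext2(std::optional<int>) { return { true, "ext2 instance" }; }
MountResult create_proc(std::optional<int>) { return { true, "proc instance" }; }

// Illustrative table; the entries are examples, not the kernel's real list.
constexpr std::array<FileSystemInitializer, 2> initializers { {
    { "ext2", true, create_ext2 },
    { "proc", false, create_proc },
} };

// Find the matching entry, validate the userspace arguments against its
// requirements, and only then create the instance.
MountResult mount_sketch(std::string_view fs_type, std::optional<int> fd)
{
    for (auto const& initializer : initializers) {
        if (initializer.short_name != fs_type)
            continue;
        if (initializer.requires_open_file_description && !fd.has_value())
            return { false, "file description required" };
        return initializer.create(fd);
    }
    return { false, "unknown filesystem" };
}
```

Adding a filesystem then means adding one table row instead of another branch in the syscall.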
Ariel Don
9a6bd85924 Kernel+LibC+VFS: Implement utimensat(3)
Create POSIX utimensat() library call and corresponding system call to
update file access and modification times.
2022-05-21 18:15:00 +02:00
MacDue
d951e2ca97 Kernel: Add /proc/{pid}/children to ProcFS
This exposes the child processes for a process as a directory
of symlinks to the respective /proc entries for each child.

This makes for an easier and possibly more efficient way
to find and count a process's children. Previously the only
method was to parse the entire /proc/all JSON file.
2022-05-06 02:12:51 +04:30
Andrew Kaster
f08e91f67e Kernel: Don't check pledges or veil against code coverage data files
Coverage tools like LLVM's source-based coverage or GNU's --coverage
need to be able to write out coverage files from any binary, regardless
of its security posture. Not ignoring these pledges and veils means we
can't get our coverage data out without playing some serious tricks.

However, this is pretty terrible for normal execution, so only skip these
checks when we explicitly configured userspace for coverage.
2022-05-02 01:46:18 +02:00
kleines Filmröllchen
b0a2572577 Kernel: Don't require AnonymousFiles to be mmap'd completely
AnonymousFile always allocates in multiples of a page size when created
with anon_create. This is especially an issue if we use AnonymousFile
shared memory to store a shared data structure that isn't exactly a
multiple of a page in size. Therefore, we can just allow mmaps of
AnonymousFile to map only an initial part of the shared memory.

This makes SharedSingleProducerCircularQueue work when it's introduced
later.
2022-04-21 13:55:00 +02:00
Idan Horowitz
086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Liav A
ae2ec45e78 Kernel: Allow SysFS components to have non-zero size
This is important for dmidecode because it does an fstat on the DMI
blobs, trying to figure out their size. Because we already know the size
of the blobs when creating the SysFS components, there's no performance
penalty whatsoever, and this allows dmidecode to not use the /dev/mem
device as a fallback.
2022-04-01 11:27:19 +02:00