Commit Graph

97 Commits

Author SHA1 Message Date
Idan Horowitz
785c9d5c2b Kernel: Add support for TCP window size scaling
This should allow us to eventually properly saturate high-bandwidth
network links when using TCP, once other nonoptimal parts of our
network stack are improved.
2023-12-26 21:36:49 +01:00
Idan Horowitz
2c51ff763b Kernel: Properly report receive window size in sent TCP packets
Instead of lying and claiming we always have space left in our receive
buffer, actually report the available space.

While this doesn't really affect network-bound workloads, it makes a
world of difference in cpu/disk-bound ones, like git clones. Resulting
in a considerable speed-up, and in some cases making them work at all.
(instead of the sender side hanging up the connection due to timeouts)
2023-12-26 21:36:49 +01:00
Sergey Bugaev
95bcffd713 Kernel/Net: Rework ephemeral port allocation
Currently, ephemeral port allocation is handled by the
allocate_local_port_if_needed() and protocol_allocate_local_port()
methods. Actually binding the socket to an address (which means
inserting the socket/address pair into a global map) is performed either
in protocol_allocate_local_port() (for ephemeral ports) or in
protocol_listen() (for non-ephemeral ports); the latter will fail with
EADDRINUSE if the address is already used by an existing pair present in
the map.

There used to be a bug where for listen() without an explicit bind(),
the port allocation would conflict with itself: first an ephemeral port
would get allocated and inserted into the map, and then
protocol_listen() would check again for the port being free, find the
just-created map entry, and error out. This was fixed in commit
01e5af487f by passing an additional flag
did_allocate_port into protocol_listen() which specifies whether the
port was just allocated, and skipping the check in protocol_listen() if
the flag is set.

However, this only helps if the socket is bound to an ephemeral port
inside of this very listen() call. But calling bind(sin_port = 0) from
userspace should succeed and bind to an allocated ephemeral port, in the
same was as using an unbound socket for connect() does. The port number
can then be retrieved from userspace by calling getsockname (), and it
should be possible to either connect() or listen() on this socket,
keeping the allocated port number. Also, calling bind() when already
bound (either explicitly or implicitly) should always result in EINVAL.

To untangle this, introduce an explicit m_bound state in IPv4Socket,
just like LocalSocket has already. Once a socket is bound, further
attempt to bind it fail. Some operations cause the socket to implicitly
get bound to an (ephemeral) address; this is implemented by the new
ensure_bound() method. The protocol_allocate_local_port() method is
gone; it is now up to a protocol to assign a port to the socket inside
protocol_bind() if it finds that the socket has local_port() == 0.

protocol_bind() is now called in more cases, such as inside listen() if
the socket wasn't bound before that.
2023-07-29 16:51:58 -06:00
Liav A
7c0540a229 Everywhere: Move global Kernel pattern code to Kernel/Library directory
This has KString, KBuffer, DoubleBuffer, KBufferBuilder, IOWindow,
UserOrKernelBuffer and ScopedCritical classes being moved to the
Kernel/Library subdirectory.

Also, move the panic and assertions handling code to that directory.
2023-06-04 21:32:34 +02:00
kleines Filmröllchen
939600d2d4 Kernel: Use UnixDateTime wherever applicable
"Wherever applicable" = most places, actually :^), especially for
networking and filesystem timestamps.

This includes changes to unzip, which uses DOSPackedTime, since that is
changed for the FAT file systems.
2023-05-24 23:18:07 +02:00
kleines Filmröllchen
213025f210 AK: Rename Time to Duration
That's what this class really is; in fact that's what the first line of
the comment says it is.

This commit does not rename the main files, since those will contain
other time-related classes in a little bit.
2023-05-24 23:18:07 +02:00
Andreas Kling
03cc45e5a2 Kernel: Use RefPtr instead of LockRefPtr for File and subclasses
This was mostly straightforward, as all the storage locations are
guarded by some related mutex.

The use of old-school associated mutexes instead of MutexProtected
is unfortunate, but the process to modernize such code is ongoing.
2023-03-10 13:15:44 +01:00
Lenny Maiorani
e0ab7763da AK: Combine SinglyLinkedList and SinglyLinkedListWithCount
Using policy based design `SinglyLinkedList` and
`SinglyLinkedListWithCount` can be combined into one class which takes
a policy to determine how to keep track of the size of the list. The
default policy is to use list iteration to count the items in the list
each time. The `WithCount` form is a different policy which tracks the
size, but comes with the overhead of storing the count and
incrementing/decrementing on each modification.

This model is extensible to have other forms of counting by
implementing only a new policy instead of implementing a totally new
type.
2023-01-02 20:13:24 +00:00
Andreas Kling
42435ce5e4 Kernel: Make sys$recvfrom() with MSG_DONTWAIT not so racy
Instead of temporary changing the open file description's "blocking"
flag while doing a non-waiting recvfrom, we instead plumb the currently
wanted blocking behavior all the way through to the underlying socket.
2022-08-21 16:45:42 +02:00
Andreas Kling
8997c6a4d1 Kernel: Make Socket::connect() take credentials as input 2022-08-21 16:35:03 +02:00
Andreas Kling
51318d51a4 Kernel: Make Socket::bind() take credentials as input 2022-08-21 16:33:09 +02:00
Andreas Kling
11eee67b85 Kernel: Make self-contained locking smart pointers their own classes
Until now, our kernel has reimplemented a number of AK classes to
provide automatic internal locking:

- RefPtr
- NonnullRefPtr
- WeakPtr
- Weakable

This patch renames the Kernel classes so that they can coexist with
the original AK classes:

- RefPtr => LockRefPtr
- NonnullRefPtr => NonnullLockRefPtr
- WeakPtr => LockWeakPtr
- Weakable => LockWeakable

The goal here is to eventually get rid of the Lock* classes in favor of
using external locking.
2022-08-20 17:20:43 +02:00
Idan Horowitz
364f6a9bf0 Kernel: Remove the Socket::{protocol,}connect ShouldBlock argument
This argument is always set to description.is_blocking(), but
description is also given as a separate argument, so there's no point
to piping it through separately.
2022-07-21 16:39:22 +02:00
Idan Horowitz
086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Idan Horowitz
664ca58746 Kernel: Use u64 instead of size_t for File::can_write offset
This ensures offsets will not be truncated on large files on i686.
2022-01-25 22:41:17 +02:00
Idan Horowitz
9ce537d703 Kernel: Use u64 instead of size_t for File::can_read offset
This ensures offsets will not be truncated on large files on i686.
2022-01-25 22:41:17 +02:00
sin-ack
3da0c072f4 Kernel: Return the correct result for FIONREAD on datagram sockets
Before this commit, we only checked the receive buffer on the socket,
which is unused on datagram streams. Now we return the actual size of
the datagram without the protocol headers, which required the protocol
to tell us what the size of the payload is.
2021-12-16 22:21:35 +03:30
Andreas Kling
79fa9765ca Kernel: Replace KResult and KResultOr<T> with Error and ErrorOr<T>
We now use AK::Error and AK::ErrorOr<T> in both kernel and userspace!
This was a slightly tedious refactoring that took a long time, so it's
not unlikely that some bugs crept in.

Nevertheless, it does pass basic functionality testing, and it's just
real nice to finally see the same pattern in all contexts. :^)
2021-11-08 01:10:53 +01:00
Ben Wiederhake
c05c5a7ff4 Kernel: Clarify ambiguous {File,Description}::absolute_path
Found due to smelly code in InodeFile::absolute_path.

In particular, this replaces the following misleading methods:

File::absolute_path
This method *never* returns an actual path, and if called on an
InodeFile (which is impossible), it would VERIFY_NOT_REACHED().

OpenFileDescription::try_serialize_absolute_path
OpenFileDescription::absolute_path
These methods do not guarantee to return an actual path (just like the
other method), and just like Custody::absolute_path they do not
guarantee accuracy. In particular, just renaming the method made a
TOCTOU bug obvious.

The new method signatures use KResultOr, just like
try_serialize_absolute_path() already did.
2021-10-31 12:06:28 +01:00
Idan Horowitz
adc9939a7b Kernel+LibC: Add support for the IPv4 TOS field via the IP_TOS sockopt 2021-10-28 11:24:36 +02:00
Brian Gianforcaro
5f1c98e576 Kernel: Use operator ""sv in all class_name() implementations
Previously there was a mix of returning plain strings and returning
explicit string views using `operator ""sv`. This change switches them
all to standardized on `operator ""sv` as it avoids a call to strlen.
2021-10-03 13:36:10 +02:00
sin-ack
0ccef94a49 Kernel: Drop the receive buffer when socket enters the TimeWait state
The TimeWait state is intended to prevent another socket from taking the
address tuple in case any packets are still in transit after the final
close. Since this state never delivers packets to userspace, it doesn't
make sense to keep the receive buffer around.
2021-09-16 16:50:23 +02:00
Ali Mohammad Pur
5a0cdb15b0 AK+Everywhere: Reduce the number of template parameters of IntrusiveList
This makes the user-facing type only take the node member pointer, and
lets the compiler figure out the other needed types from that.
2021-09-10 18:05:46 +03:00
Andreas Kling
b300f9aa2f Kernel: Convert KBuffer::copy() => KBuffer::try_copy()
This was a weird KBuffer API that assumed failure was impossible.
This patch converts it to a modern KResultOr<NonnullOwnPtr<KBuffer>> API
and updates the two clients to the new style.
2021-09-07 15:36:39 +02:00
Andreas Kling
01993d0af3 Kernel: Make DoubleBuffer::try() return KResultOr
This tidies up error propagation in a number of places.
2021-09-07 13:53:14 +02:00
Andreas Kling
4a9c18afb9 Kernel: Rename FileDescription => OpenFileDescription
Dr. POSIX really calls these "open file description", not just
"file description", so let's call them exactly that. :^)
2021-09-07 13:53:14 +02:00
Andreas Kling
c2fc33becd Kernel: Rename ProtectedValue<T> => MutexProtected<T>
Let's make it obvious what we're protecting it with.
2021-08-22 03:34:09 +02:00
Andreas Kling
7063303022 Kernel: Convert IPv4 socket list from HashTable to IntrusiveList
There was no reason whatsoever to use a HashTable here. IntrusiveList
removes all the heap allocations and does everything more efficiently.
2021-08-15 16:53:03 +02:00
Jean-Baptiste Boric
583abc27d8 Kernel: Migrate IPv4 socket table locking to ProtectedValue 2021-08-07 11:48:00 +02:00
Jean-Baptiste Boric
aea98a85d1 Kernel: Move Lockable into its own header 2021-08-07 11:48:00 +02:00
Jean-Baptiste Boric
f7f794e74a Kernel: Move Mutex into Locking/ 2021-08-07 11:48:00 +02:00
Brian Gianforcaro
c1a0e379e6 Kernel: Handle OOM when allocating IPv4Socket optional scratch buffer 2021-08-03 18:54:23 +02:00
Brian Gianforcaro
ca94a83337 Kernel: Handle OOM from DoubleBuffer usage in IPv4Socket
The IPv4Socket requires a DoubleBuffer for storage of any data it
received on the socket. However it was previously using the default
constructor which can not observe allocation failure. Address this by
plumbing the receive buffer through the various derived classes.
2021-08-03 18:54:23 +02:00
Brian Gianforcaro
de9ff0af50 Kernel: Modify the IOCTL API to return KResult
The kernel has been gradually moving towards KResult from just bare
int's, this change migrates the IOCTL paths.
2021-07-27 01:23:37 +04:30
Brian Gianforcaro
9a04f53a0f Kernel: Utilize AK::Userspace<T> in the ioctl interface
It's easy to forget the responsibility of validating and safely copying
kernel parameters in code that is far away from syscalls. ioctl's are
one such example, and bugs there are just as dangerous as at the root
syscall level.

To avoid this case, utilize the AK::Userspace<T> template in the ioctl
kernel interface so that implementors have no choice but to properly
validate and copy ioctl pointer arguments.
2021-07-27 01:23:37 +04:30
Andreas Kling
cee9528168 Kernel: Rename Lock to Mutex
Let's be explicit about what kind of lock this is meant to be.
2021-07-17 21:10:32 +02:00
Andreas Kling
c9f6786e8b Kernel: Make various T::class_name() and similar return StringView
Instead of returning char const*, we can also give you a StringView.
2021-07-11 01:46:59 +02:00
stelar7
01e5af487f Kernel: Dont try to register ephemeral TCP ports twice 2021-06-01 23:32:27 +04:30
Gunnar Beutner
5feeb62843 Kernel: Avoid allocating KBuffers for TCP packets
This avoids allocating a KBuffer for each incoming TCP packet.
2021-05-12 13:47:07 +02:00
Gunnar Beutner
b83a110174 Kernel: Increase IPv4 buffer size to 256kB
This increases the buffer size for connection-oriented sockets
to 256kB. In combination with the other patches in this series
I was able to receive TCP packets at a rate of about 120Mbps.
2021-05-12 13:47:07 +02:00
Sergey Bugaev
78459b92d5 Kernel: Implement IP multicast support
An IP socket can now join a multicast group by using the
IP_ADD_MEMBERSHIP sockopt, which will cause it to start receiving
packets sent to the multicast address, even though this address does
not belong to this host.
2021-05-05 21:16:17 +02:00
Andreas Kling
a5f385f052 Kernel: Fix bogus error codes from raw socket protocol_{send,receive}
Since these return KResultOr, we should not negate the error code.
2021-04-30 15:27:41 +02:00
Andreas Kling
71a10eb8e7 Kernel/IPv4: Propagate errors from local port allocation
Remove hacks and assumptions and make the EADDRINUSE propagate all
the way from the point of failure to the syscall layer.
2021-04-30 15:27:41 +02:00
Brian Gianforcaro
1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

 ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00
Ben Wiederhake
5c15ca7b84 Kernel: Make sockets use AK::Time 2021-03-02 08:36:08 +01:00
Tom
f98ca35b83 Kernel: Improve ProcFS behavior in low memory conditions
When ProcFS could no longer allocate KBuffer objects to serve calls to
read, it would just return 0, indicating EOF. This then triggered
parsing errors because code assumed it read the file.

Because read isn't supposed to return ENOMEM, change ProcFS to populate
the file data upon file open or seek to the beginning. This also means
that calls to open can now return ENOMEM if needed. This allows the
caller to either be able to successfully open the file and read it, or
fail to open it in the first place.
2021-01-03 22:12:19 +01:00
Andreas Kling
8cc81c2953 Kernel/Net: Make IPv4Socket::protocol_receive() take a ReadonlyBytes
The overrides of this function don't need to know how the original
packet was stored, so let's just give them a ReadonlyBytes view of
the raw packet data.
2020-12-18 19:22:26 +01:00
Tom
046d6855f5 Kernel: Move block condition evaluation out of the Scheduler
This makes the Scheduler a lot leaner by not having to evaluate
block conditions every time it is invoked. Instead evaluate them as
the states change, and unblock threads at that point.

This also implements some more waitid/waitpid/wait features and
behavior. For example, WUNTRACED and WNOWAIT are now supported. And
wait will now not return EINTR when SIGCHLD is delivered at the
same time.
2020-11-30 13:17:02 +01:00
Nico Weber
416d470d07 Kernel: Plumb packet receive timestamp from NetworkAdapter to Socket::recvfrom
Since the receiving socket isn't yet known at packet receive time,
keep timestamps for all packets.

This is useful for keeping statistics about in-kernel queue latencies
in the future, and it can be used to implement SO_TIMESTAMP.
2020-09-17 17:23:01 +02:00
Tom
c8d9f1b9c9 Kernel: Make copy_to/from_user safe and remove unnecessary checks
Since the CPU already does almost all necessary validation steps
for us, we don't really need to attempt to do this. Doing it
ourselves doesn't really work very reliably, because we'd have to
account for other processors modifying virtual memory, and we'd
have to account for e.g. pages not being able to be allocated
due to insufficient resources.

So change the copy_to/from_user (and associated helper functions)
to use the new safe_memcpy, which will return whether it succeeded
or not. The only manual validation step needed (which the CPU
can't perform for us) is making sure the pointers provided by user
mode aren't pointing to kernel mappings.

To make it easier to read/write from/to either kernel or user mode
data add the UserOrKernelBuffer helper class, which will internally
either use copy_from/to_user or directly memcpy, or pass the data
through directly using a temporary buffer on the stack.

Last but not least we need to keep syscall params trivial as we
need to copy them from/to user mode using copy_from/to_user.
2020-09-13 21:19:15 +02:00