Weakable objects ended up with differing memory layouts in some ports
since they don't build with the DEBUG macro defined.
Instead of forcing ports to define DEBUG, just put this behind a custom
WEAKABLE_DEBUG macro and leave it always-on for now.
You can now #include <AK/Forward.h> to get most of the AK types as
forward declarations.
Header dependency explosion is one of the main contributors to compile
times at the moment, so this is a step towards smaller include graphs.
This was only used by HashTable::dump() which I used when doing the
first HashTable implementation. Removing this allows us to also remove
most includes of <AK/kstdio.h>.
Since BufferStream is about creating specific binary stream formats,
let's not have a flaky type like size_t in there. Instead, clients of
BufferStream can cast their size_t to the binary size they want to use.
Account for this in IPCCompiler by making String lengths always 32-bit.
Now that we're trying to be more portable, we can't only rely on using
i32/u32 and i64/u64 since different systems have different combinations
of int/long/long long and unsigned/unsigned long/unsigned long long.
This implementation uses the new helper method of Bitmap called
find_longest_range_of_unset_bits. This method looks for the biggest
range of contiguous bits unset in the bitmap and returns the start of
the range back to the caller.
Trying to make_weak_ptr() on something that has begun destruction is
very unlikely to be what you want. Let's assert if that scenario comes
up so we can catch it immediately.
This changes copyright holder to myself for the source code files that I've
created or have (almost) completely rewritten. Not included are the files
that were significantly changed by others even though it was me who originally
created them (think HtmlView), or the many other files I've contributed code to.
When using dbg() in the kernel, the output is automatically prefixed
with [Process(PID:TID)]. This makes it a lot easier to understand which
thread is generating the output.
This patch also cleans up some common logging messages and removes the
now-unnecessary "dbg() << *current << ..." pattern.
Previously, when deallocating a range of VM, we would sort and merge
the range list. This was quite slow for large processes.
This patch optimizes VM deallocation in the following ways:
- Use binary search instead of linear scan to find the place to insert
the deallocated range.
- Insert at the right place immediately, removing the need to sort.
- Merge the inserted range with any adjacent range(s) in-line instead
of doing a separate merge pass into a list copy.
- Add Traits<Range> to inform Vector that Range objects are trivial
and can be moved using memmove().
I've also added an assertion that deallocated ranges are actually part
of the RangeAllocator's initial address range.
I've benchmarked this using g++ to compile Kernel/Process.cpp.
With these changes, compilation goes from ~41 sec to ~35 sec.
The generic swap() is not able to swap a NonnullRefPtr with itself,
due to its use of a temporary and NonnullRefPtr asserting when trying
to move() from an already move()'d instance.
Given the following situation:
struct Object : public RefCounted<Object> {
RefPtr<Object> parent;
}
NonnullRefPtr<Object> object = get_some_object();
object = *object->parent;
We would previously crash if 'object' was the only strongly referencing
pointer to 'parent'. This happened because NonnullRefPtr would unref
the outgoing pointee before reffing the incoming pointee.
This patch fixes that by implementing NonnullRefPtr assignments using
pointer swaps, just like RefPtr already did.
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.
For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.
Going forward, all new source files should include a license header.
It was possible to craft a custom ELF executable that when symbolicated
would cause the kernel to read from user-controlled addresses anywhere
in memory. You could then fetch this memory via /proc/PID/stack
We fix this by making ELFImage hand out StringView rather than raw
const char* for symbol names. In case a symbol offset is outside the
ELF image, you get a null StringView. :^)
Test: Kernel/elf-symbolication-kernel-read-exploit.cpp
This removes an item at an index without preserving the sort order of
the Vector.
This enables constant-time removal from unsorted Vectors, as it avoids
shifting all of the entries following the removed one.
If the last character was the separator and keep_empty is true, the
previous if statement would have already appended the last empty part,
so no need to do this again.
This was even more problematic, because the result of split_view() is
expected to consist of true substrings that are usable with the
StringView::substring_view_starting_*_substring() methods, not of
equal strings located elsewhere.
Fixes https://github.com/SerenityOS/serenity/issues/970
See https://github.com/SerenityOS/serenity/pull/938
This was tripping up CObject which interprets timer ID 0 as "no timer".
Once we got ID 0 assigned, it was impossible to turn it off and it
would fire on every event loop iteration, causing CPU churn.
This variant of get() returns a const JsonValue* instead of a JsonValue
and can be used when you want to peek into a JsonObject's member fields
without making copies.
Lock each directory before entering it so when using -j, the same
dependency isn't built more than once at a time.
This doesn't get full -j parallelism though, since one make child
will be sitting idle waiting for flock to receive its lock and
continue making (which should then do nothing since it will have
been built already). Unfortunately there's not much that can be
done to fix that since it can't proceed until its dependency is
built by another make process.
Allow everything to be built from the top level directory with just
'make', cleaned with 'make clean', and installed with 'make
install'. Also support these in any particular subdirectory.
Specifying 'make VERBOSE=1' will print each ld/g++/etc. command as
it runs.
Kernel and early host tools (IPCCompiler, etc.) are built as
object.host.o so that they don't conflict with other things built
with the cross-compiler.
Using int was a mistake. This patch changes String, StringImpl,
StringView and StringBuilder to use size_t instead of int for lengths.
Obviously a lot of code needs to change as a result of this.
This patch reduces the O(n) tab completion to something like O(log(n)).
The cache is just a sorted vector of strings and we binary search it to
get a string matching our input, and then check the surrounding strings
to see if we need to remove any characters. Also we no longer stat each
file every time.
Also added an #include in BinarySearch since it was using size_t. Oops.
If `export` is called, we recache. Need to implement the `hash` builtin
for when an executable has been added to a directory in PATH.
This is a special specifier that does not output anything to the stream,
but saves the number of already output chars to the provided pointer.
This is apparently used by GNU Nano.
binary_search takes a haystack, a size, a needle and a compare function.
The compare function should return negative if a < b, positive if a > b
and 0 if a == b. The "sane default" compare function is integral_compare
which implements this with subtraction a - b.
binary_search returns a pointer to a matching element, NOT necessarily
the FIRST matching element. It returns a nullptr if the element was not
found.
This patch includes tests for binary_search.
It's missing query string parsing from new URLs, but you can set the
query string programmatically, and it will be part of the URL when
serialized through to_string().
We were forgetting to adopt the WeakLink, causing a reference leak.
This ended up costing us one allocation per exec(), with this stack:
kmalloc_impl()
Inode::set_vmo()
InodeVMObject::create_with_inode()
Process::do_exec()
Process::exec()
Process::sys$execve()
This was a pain to track down, in the end I caught it by dumping out
every live kmalloc pointer between runs and diffing the sets. Then it
was just a matter of matching the pointer to a call stack and looking
at what went wrong. :^)
I'll be reconstructing parts of the VisualBuilder application here and
then we can retire VisualBuilder entirely once all the functionality
is available in HackStudio.
Since NonnullRefPtr and NonnullOwnPtr cannot be null, it is pointless
to convert them to a bool, since it would always be true.
This patch makes it an error to null-check one of these pointers.
It's too dang frustrating that we actually crash whenever we hit some
unimplemented printf specifier. Let's just log the whole format string
and carry on as best we can.
Add dedicated internal types for Int64 and UnsignedInt64. This makes it
a bit more straightforward to work with 64-bit numbers (instead of just
implicitly storing them as doubles.)
This is just a wrapper around strstr() for now. There are many better
ways to search for a string within a string, but I'm just adding a nice
API at the moment. :^)
Previously we would not run destructors for items in a CircularQueue,
which would lead to memory leaks.
This patch fixes that, and also adds a basic unit test for the class.
ELFLoader::layout() had a "failed" variable that was never set. This
patch checks the return value of each hook (alloc/map section and tls)
and fails the load if they return null.
I also needed to patch Process so that the alloc_section_hook and
map_section_hook actually return nullptr when allocating a region fails.
Fixes#664 :)
This class inherits from CircularQueue and adds the ability dequeue
from the end of the queue using dequeue_end().
Note that I had to make some of CircularQueue's fields protected to
properly implement dequeue_end.
This kind of thing is a bit annoying. On Serenity, size_t is the same
size as u32, but not the same type. Because of "long" or whatever.
This patch makes String not complain about duplicate overloads.
The former allows you to inspect the string while it's being built.
It's an explicit method rather than `operator StringView()` because
you must remember you can only look at it in between modifications;
appending to the StringBuilder invalidates the StringView.
The latter lets you clear the state of a StringBuilder explicitly, to
start from an empty string again.
This simplifies the ownership model and makes Region easier to reason
about. Userspace Regions are now primarily kept by Process::m_regions.
Kernel Regions are kept in various OwnPtr<Regions>'s.
Regions now only ever get unmapped when they are destroyed.
`AK::String` can now be reversed via AK::String::reverse(). This makes
life a lot easier for functions like `itoa()`, where the output
ends up being backwards. Very much not like the normal STL
(which requires an `std::reverse` object) way of doing things.
A call to reverse returns a new `AK::String` so as to not upset any
of the possible references to the same `StringImpl` shared between
Strings.
The old implementation tried to move forward as long as the current
byte looks like a UTF-8 character continuation byte (has its two
most significant bits set to 10). This is correct as long as we assume
the string is actually valid UTF-8, which we do (we also have a separate
method that can check whether it is the case).
We can't, however, assume that the data after the end of our string
is also valid UTF-8 (in fact, we're not even allowed to look at data
outside out string, but it happens to a valid memory region most of
the time). If the byte after the end of our string also has its most
significant bits set to 10, we would move one byte forward, and then
fail the m_length > 0 assertion.
One way to fix this would be to add a length check inside the loop
condition. The other one, implemented in this commit, is to reimplement
the whole function in terms of decode_first_byte(), which gives us
the length as encoded in the first byte. This also brings it more
in line with the other functions around it that do UTF-8 decoding.
This patch adds support for TLS according to the x86 System V ABI.
Each thread gets a thread-specific memory region, and the GS segment
register always points _to a pointer_ to the thread-specific memory.
In other words, to access thread-local variables, userspace programs
start by dereferencing the pointer at [gs:0].
The Process keeps a master copy of the TLS segment that new threads
should use, and when a new thread is created, they get a copy of it.
It's basically whatever the PT_TLS program header in the ELF says.
This was a workaround to be able to build on case-insensitive file
systems where it might get confused about <string.h> vs <String.h>.
Let's just not support building that way, so String.h can have an
objectively nicer name. :^)
Passing these through the generic JsonValue path was causing us to
instantiate temporary JsonValues that incurred a heap allocation.
This avoids that by adding specialized overloads for string types.
Right now if we encounter an unknown character, printf (and its related
functions) fail in a really bad way, where they forget to pull things off
the stack. This usually leads to a crash somewhere else, which is hard to
debug.
This patch makes printf abort as soon as it encounters a formatting
character that it can't handle. This is not the optimal solution, but it
is an improvement for debugging.
Utf8View wraps a StringView and implements begin() and end() that
return a Utf8CodepointIterator, which parses UTF-8-encoded Unicode
codepoints and returns them as 32-bit integers.
This is the first step towards supporting emojis in Serenity ^)
https://github.com/SerenityOS/serenity/issues/490
When printing hex numbers, we were printing the wrong thing sometimes. This
was because we were dividing the digit to print by 15 instead of 16. Also,
dividing by 16 is the same as shifting four bits to the right, which is a
bit closer to our actual intention in this case, so let's use a shift
instead.
This way, primitive JsonValue serialization is still handled by
JsonValue::serialize(), but JsonArray and JsonObject serialization
always goes through serializer classes. This is no less efficient
if you have the whole JSON in memory already.
We use consumable annotations to catch bugs where you get the .value()
of an Optional before verifying that it's okay.
The bug here was that only has_value() would set the consumed state,
even though operator bool() does the same job.
Normally you want to access the T&, but sometimes you need to grab at
the NonnullPtr, for example when moving it out of the Vector before
a call to remove(). This is generally a weird pattern, but I don't have
a better solution at the moment.
Using the new get_process_name() syscall, we can automatically prefix
all userspace debug logging.
Hopefully this is more helpful than annoying. We'll find out! :^)
Add the concept of a PeekType to Traits<T>. This is the type we'll
return (wrapped in an Optional) from HashMap::get().
The PeekType for OwnPtr<T> and NonnullOwnPtr<T> is const T*,
which means that HashMap::get() will return an Optional<const T*> for
maps-of-those.
Okay, so, OwnPtr<T>::release_nonnull() returns a NonnullOwnPtr<T>.
It assumes that the OwnPtr is non-null to begin with.
Note that this removes the value from the OwnPtr, as there can only be
a single owner.
The printf formatting mini-language actually allows you
to pass a '*' character in place of the fill width specification,
in which case it eats one of the passed in arguments and uses it
as width, so implement that.
This replaces Optional<T>(U&&) which clang-tidy complained may hide the
regular copy and move constructors. That's a good point, clang-tidy,
and I appreciate you pointing that out!
This allows us to take advantage of the now-optimized (to do memmove())
Vector::append(const T*, int count) for collecting these strings.
This is a ~15% speedup on the load_4chan_catalog benchmark.
This can definitely be improved with better trivial type detection and
by using the TypedTransfer template in more places.
It's a bit annoying that we can't get <type_traits> in Vector.h since
it's included in the toolchain compilation before we have libstdc++.
This has several significant changes to the networking stack.
* Significant refactoring of the TCP state machine. Right now it's
probably more fragile than it used to be, but handles quite a lot
more of the handshake process.
* `TCPSocket` holds a `NetworkAdapter*`, assigned during `connect()` or
`bind()`, whichever comes first.
* `listen()` is now virtual in `Socket` and intended to be implemented
in its child classes
* `listen()` no longer works without `bind()` - this is a bit of a
regression, but listening sockets didn't work at all before, so it's
not possible to observe the regression.
* A file is exposed at `/proc/net_tcp`, which is a JSON document listing
the current TCP sockets with a bit of metadata.
* There's an `ETHERNET_VERY_DEBUG` flag for dumping packet's content out
to `kprintf`. It is, indeed, _very debug_.
Keep a 256-entry string cache during parse to avoid creating some new
strings when possible. This cache is far from perfect but very cheap.
Since none of the strings are transient, this only costs us a couple of
pointers and a bit of ref-count manipulation.
The cache hit rate on 4chan_catalog.json is ~33% and the speedup on
the load_4chan_catalog benchmark is ~7%.
I was able to get parsing time down to about 1/3 of the original time
by using callgrind+kcachegrind. There's definitely more improvements
that can be made here, but I'm gonna be happy with this for now. :^)
- Return more specific types from parse_array() and parse_object().
- Don't create a throwaway String in extract_while().
- Use a StringView in parse_number() to avoid a throwaway String.
This is a shameless copy-paste of String::to_int(). We should find some
way to share this code between String and StringView instead of having
two duplicate copies like this.
Use AK::exchange() to switch out the internal storage. Also mark these
functions with [[nodiscard]] to provoke an compile-time error if they
are called without using the return value.
This gives us much better error messages when you try to use them.
Without this change, it would complain about the absence of functions
named ref() and deref() on RefPtr itself. With it, we instead get a
"hey, this function is deleted" error.
Change operator=(T&) to operator=T(const T&) also, to keep assigning
a const T& to a NonnullRefPtr working.
Instead of aborting the program when we hit an assertion, just print a
message and keep going.
This allows us to write tests that provoke assertions on purpose.
There was a bug in the "prepend_vector_object" test but it was masked
by us not printing failures. (The bug was that we were adding three
elements to the "objects" vector and then checking that another
vector called "more_objects" indeed had three elements. Oops!)
This is a complement to append() that works by constructing the new
element in-place via placement new and forwarded constructor arguments.
The STL calls this emplace_back() which looks ugly, so I'm inventing
a nice word for it instead. :^)
It doesn't seem sane to try to iterate over a HashTable while it's in
the middle of being cleared. Since this might cause strange problems,
this patch adds an assertion if an iterator is constructed during
clear() or rehash() of a HashTable.
An operation often has two pieces of underlying information:
* the data returned as a result from that operation
* an error that occurred while retrieving that data
Merely returning the data is not good enough. Result<> allows exposing
both the data, and the underlying error, and forces (via clang's
consumable attribute) you to check for the error before you try to
access the data.
Put simply, Error<> is a way of forcing error handling onto an API user.
Given a function like:
bool might_work();
The following code might have been written previously:
might_work(); // but what if it didn't?
The easy way to work around this is of course to [[nodiscard]] might_work.
But this doesn't work for more complex cases like, for instance, a
hypothetical read() function which might return one of _many_ errors
(typically signalled with an int, let's say).
int might_read();
In such a case, the result is often _read_, but not properly handled. Like:
return buffer.substr(0, might_read()); // but what if might_read returned an error?
This is where Error<> comes in:
typedef Error<int, 0> ReadError;
ReadError might_read();
auto res = might_read();
if (might_read.failed()) {
switch (res.value()) {
case EBADF:
...
}
}
Error<> uses clang's consumable attributes to force failed() to be
checked on an Error instance. If it's not checked, then you get smacked.