Instead of ballooning the size of the Function object, simply place the
callable on the heap with a properly aligned address.
This brings the alignment of Function down from ridiculous sizes like 64
bytes down to a manageable 8 bytes.
…kernel"
This reverts commit d4d92184b3.
The alignemnt requirements imposed by this are overkill at best and
ridiculous at worst, a future commit will tackle this problem in a
different, more space-efficient way.
This commit also reverts db5ad0c since code outside of the web spec
expects serialized paths to be percent decoded.
Also, there are issues trying to implement the concept "opaque
path". For now, we still use the old cannot_be_a_base_url(), but its
usage needs to be removed in favor of a has_opaque_path() as the spec
has changed since then.
...requested size"
This reverts commit 13573a6c4b.
Some clients of `BufferedStream` expect a non-blocking read by
`read_some` which the commit above made impossible by potentially
performing a blocking read. For example, the following command hangs:
pro http://ipecho.net/plain
This is caused by the HTTP job expecting to read the body of the
response which is included in the active buffer, but since the buffered
data size is less than the buffer passed into `read_some`, another
blocking read is performed before anything is returned.
By reverting this commit, all tests still pass and `pro` no longer
hangs. Note that because of another bug related to `stdout` / `stderr`
mixing and the absence of a line ending, there is no output to the
command above except a progress update.
This is defined when building on GNU/Hurd, the GNU operating system with
the Hurd as its kernel (as it was designed originally, before Linux and
GNU/Linux came to be).
Also, define the corresponding part of User-Agent.
This is defined when we're compiling against the GNU C Library, whether
on Linux or on other kernels that glibc works on. More AK_LIBC_xxxx
definitions could be potentially added in the future.
Also add AK_LIBC_GLIBC_PREREQ(), which checks for a specific glibc
version.
While macOS backtrace(3) puts a space directly after the mangled symbol
name, some versions of glibc put a + directly after it. This new logic
accounts for both situations when trying to demangle.
Co-Authored-By: Andreas Kling <kling@serenityos.org>
On platforms that support it, enable using ``<execinfo.h>`` to get
backtrace(3) to dump a backtrace on assertion failure. This should make
debugging things like WebContent crashes in Lagom much easier.
This part of the spec is mostly useful for our debugging for now, but
could eventually be hooked up so that the user can see any reported
validation errors.
The main change here is that we now follow the spec by asserting the URL
is not special. Previously I could not find a way to enable this without
getting assertions when browsing the web - but that seems to no longer
be the case (probably from some other fixes which have since been made).
The main change is the simplification of the expression
`(10^precision * fraction) / 2^precision` to `5^precision * fraction`.
Those expressions overflow or not depends on the value of `precision`
and `fraction`. For the maximum value of `fraction`, the following table
shows for which value of `precision` overflow will occur.
Old New
u32 08 10
u64 15 20
u128 30 39
As of now `u64` type is used to calculate the result of the expression.
Meaning that before, only FixedPoints with `precision` less than 15
could be accurately rendered (for every value of fraction) in decimal.
Now, this limit gets increased to 20.
This refactor also fixes, broken decimal render for explicitly specified
precision width in format string, and broken hexadecimal render.
By default, `1` is of the type `int` which is 32-bits wide at max.
Because of that, if `precision` of a `FixedPoint` is greater than 32,
the expression `1 << precision` will get clamped at 32-bits and the
result will always be zero. Casting `1` to the wider underlying type
will make the expression not overflow.
Because of the off-by-one error, the second bit of the fraction was
getting ignored in differentiating between fractions equal to 0.5 or
greater than 0.5. This resulted in numbers like 2.75 being considered
as having fraction equal to 0.5 and getting rounded incorrectly (to 2).
This new Kernel StdLib function will be used to copy contents of a
FixedStringBuffer with a null character to a user process.
The first user of this new function is the prctl option of
PR_GET_PROCESS_NAME which would copy a process name including a null
character to a user provided buffer.
There was a small mishmash of argument order, as seen on the table:
| Traits<T>::equals(U, T) | Traits<T>::equals(T, U)
============= | ======================= | =======================
uses equals() | HashMap | Vector, HashTable
defines equals() | *String[^1] | ByteBuffer
[^1]: String, DeprecatedString, their Fly-type equivalents and KString.
This mostly meant that you couldn't use a StringView for finding a value
in Vector<String>.
I'm changing the order of arguments to make the trait type itself first
(`Traits<T>::equals(T, U)`), as I think it's more expected and makes us
more consistent with the rest of the functions that put the stored type
first (like StringUtils functions and binary_serach). I've also renamed
the variable name "other" in find functions to "entry" to give more
importance to the value.
With this change, each of the following lines will now compile
successfully:
Vector<String>().contains_slow("WHF!"sv);
HashTable<String>().contains("WHF!"sv);
HashMap<ByteBuffer, int>().contains("WHF!"sv.bytes());
The proper syntax for defining user-defined literals does not require a
space between the `operator""` token and the operator name:
> error: identifier 'sv' preceded by whitespace in a literal operator
> declaration is deprecated
This change introduces HeapFunction, which is intended to be used as a
replacement for SafeFunction. The new type behaves like a regular
GC-allocated object, which means it needs to be visited from
visit_edges, and unlike SafeFunction, it does not create new roots for
captured parameters.
Co-Authored-By: Andreas Kling <kling@serenityos.org>
IFF was a generic container fileformat that was popular on the Amiga
since it was the only file format supported by Deluxe Paint.
ILBM is an image format popular in the late eighties/nineties
that uses the IFF container.
This is a very first version of the decoder that only supports
(byterun) compressed files with bpp <= 8.
Only the minimal chunks are decoded: CMAP, BODY, BMHD.
I am planning to add support for the following variants:
- EHB (32 colours + lighter 32 colours)
- HAM6 / HAM8 (special mode that allowed to display the whole Amiga
4096 colours / 262 144 colours palette)
- TrueColor (24bit)
Things that could be fun to do:
- Still images could be animated using color cycle information
The web specs do not expect decoding or decoding to happen when calling
these helpers. This allows us to remove the raw_fragment helper function
from the URL class.
This encoder can handle all integer formats and sample rates, though
only two channels well. It uses fixed LPC and performs a
close-to-optimal parameter search on the LPC order and residual Rice
parameter, leading to decent compression already.
Just like with input buffered streams, we don't currently have a use
case for output buffered streams which aren't seekable, since the main
application are files.
To ensure this happens without duplicating code, we allow forcing a
StringBuilder object to only use the inline buffer, so the code in the
AK/Format.cpp file doesn't need to deal with different underlying
storage types (expandable or inline-fixed) at all.
I couldn't run the parser in a debugger like I normally would, so I
added printouts to understand where the parser is failing.
More could be added, but these are enough to get a good idea of what
the parser is doing. It's very spammy, though, so enable it by flicking
the IMAP_PARSER_DEBUG switch :^)
Instead, use the FixedCharBuffer class to ensure we always use a static
buffer storage for these names. This ensures that if a Process or a
Thread were created, there's a guarantee that setting a new name will
never fail, as only copying of strings should be done to that static
storage.
The limits which are set are 32 characters for processes' names and 64
characters for thread names - this is because threads' names could be
more verbose than processes' names.
This class encapsulates a fixed Array with compile-time size definition
for storing ASCII characters.
There are also new Kernel StdLib functions to copy user data into such
objects so this class will be useful later on.
The spec indicates we should support serializing opaque hosts, but we
were assuming the host contained a String. Opaque hosts are represented
with Empty. Return an empty string here instead to prevent crashing on
an invalid variant access.
Now that ""_string is infallible, the only benefit of explicitly
constructing a short string is the ability to do it at compile-time. But
we never do that, so let's simplify the API and remove this
implementation detail from it.
This could happen if a sequence of '0' parts was followed by a longer
sequence of '0' parts at the end of the host. The first sequence was
being used for the compress, and not the second.
For example, [1:1:0:0:1:0:0:0] was being serialized as: [1:1::1:0:0:0]
instead of [1:1:0:0:1::].
Fix this by checking at the end of the loop if we are in the middle of a
sequence of '0' parts that is longer than the current longest.
Everywhere only ever expects percent encoding to occur, so let's just
remove this flag altogether. At the same time, replace some
DeprecatedString with StringView.
Parsing 'data:' URLs took it's own route. It never set standard URL
fields like path, query or fragment (except for scheme) and instead
gave us separate methods called `data_payload()`, `data_mime_type()`,
and `data_payload_is_base64()`.
Because parsing 'data:' didn't use standard fields, running the
following JS code:
new URL('#a', 'data:text/plain,hello').toString()
not only cleared the path as URLParser doesn't check for data from
data_payload() function (making the result be 'data:#a'), but it also
crashes the program because we forbid having an empty MIME type when we
serialize to string.
With this change, 'data:' URLs will be parsed like every other URLs.
To decode the 'data:' URL contents, one needs to call process_data_url()
on a URL, which will return a struct containing MIME type with already
decoded data! :^)
By not clearing the buffer, we were leaking the path part of a URL into
the query for URLs without an authority component (no '//host').
This could be seen most noticeably in mailto: URLs with header fields
set, as the query part of `mailto:user@example.com?subject=test` was
parsed to `user@example.comsubject=test`.
data: URLs didn't have this problem, because we have a special case for
parsing them.
This is defined in the spec, but was missing in our table. Fix this, and
add a spec comment for what is missing. Also begin a basic text based
test for URL, so we can get some coverage of LibWeb's usage of URL too.
This takes the previous alternation optimisation and applies it to all
the alternation blocks instead of just the few instructions at the
start.
By generating a trie of instructions, all logically equivalent
instructions will be consolidated into a single node, allowing the
engine to avoid checking the same thing multiple times.
For instance, given the pattern /abc|ac|ab/, this optimisation would
generate the following tree:
- a
| - b
| | - c
| | | - <accept>
| | - <accept>
| - c
| | - <accept>
which will attempt to match 'a' or 'b' only once, and would also limit
the number of backtrackings performed in case alternatives fails to
match.
This optimisation is currently gated behind a simple cost model that
estimates the number of instructions generated, which is pessimistic for
small patterns, though the change in performance in such patterns is not
particularly large.
Similar to floor and ceil, we were forcing the values to be doubles here
Also adds a big FIXME about my findings trying to add a general
implementation for these
Both GCC and Clang inline this function to use bit-wise logic and/or
appropriate instructions even on -O0 and allow their use in a constexpr
context, see
https://godbolt.org/z/de1393vha
In order to follow spec text to achieve this, we need to change the
underlying representation of a host in AK::URL to deserialized format.
Before this, we were parsing the host and then immediately serializing
it again.
Making that change resulted in a whole bunch of fallout.
After this change, callers can access the serialized data through
this concept-host-serializer. The functional end result of this
change is that IPv6 hosts are now correctly serialized to be
surrounded with '[' and ']'.
This implementation will allow us to fix serialization of IPv6
addresses not being surrounded by '[' and ']'.
Nothing is calling this function yet - this will come in the next
(larger) commit where the underlying host representation inside of
AK::URL is changed from DeprecatedString to URL::Host.
This doesn't seem trivial enough to be defining in the header like this,
and should not be a performance critical function anyhow.
Also add spec comments while we are at it, and a FIXME since we do not
seem to exactly align.
And use them where applicable. This will allow us to store the host in
the deserialized format as the spec specifies.
Ideally these typdefs would instead be the existing AK interfaces, but
in the meantime, we can just use this.
These methods are slightly more convenient than storing the Bytes
separately. However, it it feels unsanitary to reach in and access this
data directly. Both of the users of these already have the
[Readonly]Bytes available in their constructors, and can easily avoid
using these methods, so let's remove them entirely.
Due to overload resolutions rules, this simple code provokes a crash:
ReadonlyBytes readonly_bytes{};
FixedMemoryStream stream{readonly_bytes};
ReadonlyBytes give_them_back{stream.bytes()};
// -> Panics on VERIFY(m_writing_enabled);
// but this is fine:
auto bytes = static_cast<FixedMemoryStream const&>(*stream).bytes()
If we need to be explicit about it, let's rename the overload instead of
adding that `static_cast`.
I misunderstood the spec step for checking whether the host 'ends with a
number'. We can't simply check for it if ends with a number, this check
is actually an algorithm which is required to avoid detecting hosts that
end with a number from an IPv4 host.
Implement this missing step, and add a test to cover this.
clamp_to_int clamps value to valid range of int values so resulting
value does not overflow.
It is going to be used to clamp float or double values to int that
represents fixed-point value of CSSPixels.
This reverts commit d48c68cf3f.
Unfortunately, this currently copies some warn() invocations that we do
*not* want in the debug console, such as test-js's use of OSC command 9
to report progress.
This is just a straight (and fairly inefficient) implementation of IPv6
parsing and serialization from the URL spec.
Note that we don't use AK::IPv6Address here because the URL spec
requires a specific serialization behavior.
The array which contains the actual parameters is always located
immediately after the base `TypeErasedFormatParams` object of
`VariadicFormatParams`. Hence, storing a pointer to it inside a `Span`
is redundant. Changing it to a zero-length array saves 8 bytes.
Secondly, we limit the number of parameters to 256, so `m_size` and
`m_next_index` can be stored in a smaller data type than `size_t`,
saving us another 8 bytes.
This decreases the size of a single-element `VariadicFormatParams` from
48 to 32 bytes, thus reducing the code size overhead of setting up
parameters for `dbgln()`.
Note that [arrays of length zero][1] are a GNU extension, but it's used
elsewhere in the codebase already and is explicitly supported by Clang
and GCC.
[1]: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
Xcode 15 betas 1-3 lack https://reviews.llvm.org/D135772, which fixes a
bug that causes trailing `requires` clauses to be evaluated twice,
failing the second time. Reported as FB12284201.
This caused compile errors when instantiating types derived from RefPtr:
> error: invalid reference to function 'NonnullRefPtr': constraints not
> satisfied
> note: because substituted constraint expression is ill-formed: value
> of type '<dependent type>' is not contextually convertible to 'bool'.
This commit works around the issue by moving the `requires` clauses
after the template parameter list.
In most cases, trailing `requires` clauses and those specified after the
template parameter list work identically, so this change should not
impact the code's behavior. The only difference is that trailing
requires clauses are evaluated *after* constrained placeholder types
(i.e. `Integral auto i` function parameter).
All elements of the vector were moved to the left, for each element to
remove. This patch makes the function move each element exactly once.
On the same test case as the previous commit, it makes the function
disappear from the profile. These two commits combined reduce the
decompression time by 12%.
As confusing as it may sound, reusing them is terrible performance wise.
When profiling the PNG decoder, the result (which is dominated by the
Zlib decompression) shows that the `cleanup_unused_chunks()` function
represented 14.26% of the profile before this patch and only 7.7%
afterward.
On a 6.5 MB PNG image, it reduces the decompression time by more than
5%.
This uses one of Sun OS's algorithms, for a comparison to other
algorithms please refer to
https://gist.github.com/Hendiadyoin1/f58346d66637deb9156ef360aa158bf9
This is used on aarch64 builds and for x86 floats and doubles
for performance gains check
https://quick-bench.com/q/_2jTykshP6cUqtgdepFaoQ53YC8
which shows approximately 2x gains
Co-Authored-By: Ben Wiederhake <BenWiederhake.GitHub@gmx.de>
Co-Authored-By: kleines Filmröllchen <filmroellchen@serenityos.org>
Co-Authored-By: Dan Klishch <danilklishch@gmail.com>
This now searches the memory in blocks, which should be slightly more
efficient. However, it doesn't make much difference (e.g. ~1% in LZMA
compression) in most real-world applications, as the non-hint function
is more expensive by orders of magnitude.
This factors out a lot of complicated math into somewhat understandable
functions.
While at it, rename `next_read_span_with_seekback` to
`next_seekback_span` to keep the naming consistent and to avoid making
function names any longer.
The "operation modes" of this function have very different focuses, and
trying to combine both in a way where we share the most amount of code
probably results in the worst performance.
Instead, split up the function into "existing distances" and "no
existing distances" so that we can optimize either case separately.
We will be adding extra logic to the CircularBuffer to optimize
searching, but this would negatively impact the performance of
CircularBuffer users that don't need that functionality.