This patch parses enough of GPOS tables to be able to support the
kerning information embedded in Inter.
Since that specific font only applies positioning offsets to the first
glyph in each pair, I was able to get away with not changing our API.
Once we start adding support for more sophisticated positioning, we'll
need to be able to communicate more than a simple "kerning offset" to
the clients of this code.
With Clang, the previous/next pointers in buckets of an
`OrderedHashTable` are not cleared when a bucket is being shifted up as
a result of a removed bucket. As a result, an unfortunate pointer mixup
could lead to an infinite loop in the `HashTable` iterator, which was
exposed in `HashMap::keys()`.
Co-authored-by: Luke Wilde <lukew@serenityos.org>
Similar to POSIX read, the basic read and write functions of AK::Stream
do not have a lower limit of how much data they read or write (apart
from "none at all").
Rename the functions to "read some [data]" and "write some [data]" (with
"data" being omitted, since everything here is reading and writing data)
to make them sufficiently distinct from the functions that make sure to
use the entire buffer (which should be the go-to functions for most
usages).
No functional changes, just a lot of new FIXMEs.
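To illustrate the difference between the two kinds of read, here is a minimal sketch. `ToyStream` and the signatures below are hypothetical and are not the actual AK::Stream interface; the only point is that a "read some" call may return fewer bytes than requested, while the "fill the buffer" variant loops until it is done.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a stream; not the real AK::Stream API.
struct ToyStream {
    std::vector<unsigned char> data;
    size_t offset { 0 };
    size_t max_chunk { 3 }; // Simulates a source that hands out data in small pieces.

    // Reads anywhere between zero and `size` bytes and returns the count read.
    size_t read_some(unsigned char* buffer, size_t size)
    {
        size_t count = std::min({ size, max_chunk, data.size() - offset });
        if (count == 0)
            return 0;
        std::memcpy(buffer, data.data() + offset, count);
        offset += count;
        return count;
    }

    // Keeps calling read_some() until the buffer is full.
    // Returns false if the source runs dry first.
    bool read_until_filled(unsigned char* buffer, size_t size)
    {
        size_t filled = 0;
        while (filled < size) {
            size_t count = read_some(buffer + filled, size - filled);
            if (count == 0)
                return false;
            filled += count;
        }
        return true;
    }
};
```

A caller that needs exactly N bytes and uses `read_some()` directly would silently get a short read here, which is the kind of mistake the rename is meant to make visible.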
We don't need to decode the entire code point to know its length. This
reduces the runtime of decoding a string containing 5 million instances
of U+10FFFF from over 4 seconds to 0.9 seconds.
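As a rough sketch of the idea (not the code from this change), the length of a UTF-8 sequence can be read off its first byte alone. The function names below are hypothetical, and the counting helper assumes the input is already known to be valid UTF-8.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Returns the sequence length implied by a UTF-8 lead byte, or 0 if the byte
// cannot start a sequence. Continuation bytes are not inspected.
constexpr size_t utf8_length_from_lead_byte(uint8_t byte)
{
    if ((byte & 0x80) == 0x00)
        return 1; // 0xxxxxxx
    if ((byte & 0xE0) == 0xC0)
        return 2; // 110xxxxx
    if ((byte & 0xF0) == 0xE0)
        return 3; // 1110xxxx
    if ((byte & 0xF8) == 0xF0)
        return 4; // 11110xxx
    return 0;
}

// Counts code points by hopping from lead byte to lead byte.
// A byte that cannot start a sequence is skipped as a single unit.
size_t count_code_points(std::string_view utf8)
{
    size_t count = 0;
    for (size_t i = 0; i < utf8.size(); ++count) {
        size_t length = utf8_length_from_lead_byte(static_cast<uint8_t>(utf8[i]));
        i += length == 0 ? 1 : length;
    }
    return count;
}
```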
Let's add FlyString::from_deprecated_fly_string() so we can use it
instead of FlyString::from_utf8(). This will make it easier to detect
potential unnecessary allocations as we transfer to FlyString.
We currently fully casefold the left- and right-hand sides to compare
two strings with case-insensitivity. Now, we casefold one code point at
a time, storing the result in a view for comparison, until we exhaust
both strings.
Indented #cmakedefine01 is supported since CMake 3.10:
https://cmake.org/cmake/help/latest/release/3.10.html#commands
We're on 3.16, and the minimum required for Serenity itself is 3.25, so
this should be fine. And it makes CLion's auto-formatter much happier!
For example, the code point U+002F could be encoded as UTF-8 with the
overlong byte sequence 0xC0 0xAF. This trick has historically been used
to bypass security checks.
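One way to detect such encodings, sketched here under the assumption that the sequence length has already been derived from the lead byte, is to decode the value and compare it against the smallest code point that actually needs that many bytes. The helper names are hypothetical and this is not necessarily how the decoder in this change is structured.

```cpp
#include <cstddef>
#include <cstdint>

// Smallest code point that legitimately needs a sequence of the given length.
constexpr uint32_t minimum_code_point_for_length(size_t length)
{
    switch (length) {
    case 1:
        return 0x0;
    case 2:
        return 0x80;
    case 3:
        return 0x800;
    case 4:
        return 0x10000;
    default:
        return 0xFFFFFFFF;
    }
}

// Decodes one multi-byte sequence and reports whether it is overlong, i.e.
// whether the same code point would have fit in a shorter sequence.
// Assumes `bytes` points at `length` bytes; continuation bytes are not validated here.
bool is_overlong(uint8_t const* bytes, size_t length)
{
    if (length < 2 || length > 4)
        return false;

    // Strip the length marker bits from the lead byte.
    uint32_t code_point = bytes[0] & (0x7F >> length);
    for (size_t i = 1; i < length; ++i)
        code_point = (code_point << 6) | (bytes[i] & 0x3F);

    return code_point < minimum_code_point_for_length(length);
}
```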
This is needed to have code for creating an in-memory sRGB profile using
the (floating-point) numbers from the sRGB spec and having the
fixed-point values in the profile match what they are in other software
(such as GIMP).
It has the side effect of making the FixedPoint ctor no longer constexpr
(which seems fine; nothing was currently relying on that).
Some of FixedPoint's member functions don't round yet, which requires
tweaking a test.
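To show why rounding matters, here is a small sketch using a 16.16 fixed-point layout. The free functions are hypothetical and are not FixedPoint's actual constructor; 0.9642 is just an example input whose scaled value lands between two representable steps.

```cpp
#include <cmath>
#include <cstdint>

// Converts to a 16.16 fixed-point raw value, truncating toward zero.
int32_t to_fixed_16_16_truncating(double value)
{
    return static_cast<int32_t>(value * 65536.0);
}

// Converts to a 16.16 fixed-point raw value, rounding to the nearest step.
int32_t to_fixed_16_16_rounding(double value)
{
    return static_cast<int32_t>(std::lround(value * 65536.0));
}
```

For 0.9642 the scaled value is 63189.8112, so truncation yields 63189 while rounding yields 63190. A one-step difference like this is enough to make a generated profile differ from one written by other software.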
`consume_until(foo)` stops before foo, and so does
`ignore_until(Predicate)`, so let's make the other `ignore_until()`
overloads consistent with that so they're less confusing.
This commit moves the implementation of getopt into AK, and converts its
API to understand and use StringView instead of char*.
Everything else is caught in the crossfire of making
Option::accept_value() take a StringView instead of a char const*.
With this, we must now pass a Span<StringView> to ArgsParser::parse().
Applications using LibMain are unaffected, but anything not using that
or taking its own argc/argv has to construct a Vector<StringView> for
this method.
The output of DeprecatedString::bijective_base_from() is now correct
for numbers larger than base^2.
This makes column names display correctly in Spreadsheet.
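For reference, a bijective base-26 conversion of the kind used for spreadsheet columns can be sketched as follows. `column_name` is a hypothetical helper with zero-based indexing chosen for illustration; it is not the signature of `bijective_base_from()`.

```cpp
#include <algorithm>
#include <string>

// Converts a zero-based index to bijective base-26 letters:
// 0 -> "A", 25 -> "Z", 26 -> "AA", 702 -> "AAA".
std::string column_name(unsigned long long index)
{
    std::string name;
    unsigned long long n = index + 1; // Bijective numeration has no zero digit.
    while (n > 0) {
        --n; // Shift into 0..25 so that 'Z' does not carry into the next digit.
        name.push_back(static_cast<char>('A' + n % 26));
        n /= 26;
    }
    std::reverse(name.begin(), name.end());
    return name;
}
```

The decrement inside the loop is the part that is easy to get wrong: without it, values past 26^2 pick up an off-by-one in the higher digits.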
We briefly discussed this when adding the new String type but couldn't
settle on a name. However, having to use String::from_utf8() on every
literal string is a bit unwieldy, so let's have these options available!
Naming-wise '_string' is not as short as 'sv' but should be relatively
clear; it also matches '_bigint' and '_ubigint' in length.
'_short_string' may be longer than the actual string itself, but it's
still an improvement over the static function :^)
Since our C++ source files are UTF-8 encoded anyway, it should be
impossible to create a string literal with invalid UTF-8, so including
that in the name is not as important as in the function that can receive
arbitrary data.
At the moment, this processes the RIFF chunk structure and extracts
the ICCP chunk, so that `icc` can now print ICC profiles embedded
in webp files. (And are image files really more than containers
of icc profiles?)
It doesn't even decode image dimensions yet.
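The chunk walk itself is simple. The sketch below is an illustration rather than the loader's code: `find_riff_chunk` and `ChunkLocation` are hypothetical names, and the RIFF file-size field is ignored in favor of the buffer size. Each chunk is a four-character code, a little-endian 32-bit payload size, the payload, and a padding byte if the size is odd.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string_view>

struct ChunkLocation {
    size_t offset; // Offset of the chunk payload within the file.
    size_t size;   // Payload size in bytes, without padding.
};

static uint32_t read_u32_le(unsigned char const* p)
{
    return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8)
        | (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Returns the payload location of the first chunk with the given
// four-character code, or nothing if the file is malformed or has no such chunk.
std::optional<ChunkLocation> find_riff_chunk(unsigned char const* data, size_t size, std::string_view fourcc)
{
    if (fourcc.size() != 4 || size < 12)
        return std::nullopt;
    if (std::memcmp(data, "RIFF", 4) != 0 || std::memcmp(data + 8, "WEBP", 4) != 0)
        return std::nullopt;

    size_t offset = 12; // Skip "RIFF", the file size, and "WEBP".
    while (size - offset >= 8) {
        uint32_t chunk_size = read_u32_le(data + offset + 4);
        size_t payload_offset = offset + 8;
        if (chunk_size > size - payload_offset)
            return std::nullopt; // Chunk claims more data than the file has.
        if (std::memcmp(data + offset, fourcc.data(), 4) == 0)
            return ChunkLocation { payload_offset, chunk_size };

        // Payloads are padded to an even number of bytes.
        offset = payload_offset + chunk_size + (chunk_size & 1);
        if (offset > size)
            break;
    }
    return std::nullopt;
}
```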
The lossy format is a VP8 video frame. Once we get to that, we
might want to move all the image decoders into a new LibImageDecoders
that depends on both LibGfx and LibVideo. (Other newer image formats
like heic and avif also use video frames for image data.)
This naming scheme matches Vector.
This also changes `take_last` to move the value it takes, and delete by
known pointer, avoiding a full lookup and potential copies.
Until now, it was possible to assign a RP<T const> or NNRP<T const>
to RP<T> or NNRP<T>. This meant that the constness of the T was lost.
We had a lot of code that relied on this sloppiness, and by the time
you see this commit, I hopefully found and fixed all of it. :^)
This stops us needing a lot of ugly `FlyString { ... }` wrappers. This
is the behavior that `DeprecatedFlyString(DeprecatedString)` has so it
should be fine.
The patch also contains modifications on several classes, functions or
files that are related to the `JPGLoader`.
Renames include:
- JPGLoader{.h, .cpp}
- JPGImageDecoderPlugin
- JPGLoadingContext
- JPG_DEBUG
- decode_jpg
- FuzzJPGLoader.cpp
- A few string literals and other text
Instead of rehashing on collisions, we use Robin Hood hashing: a simple
linear probe where we keep track of the distance between the bucket and
its ideal position. On insertion, we allow a new bucket to "steal" the
position of "rich" buckets (those near their ideal position) and move
them further down.
On removal, we shift buckets back up into the freed slot, decrementing
their distance while doing so.
This behavior automatically optimizes the number of required probes for
any value, and removes the need for periodic rehashing (except when
expanding the capacity).
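The insertion and removal rules above can be sketched with a toy fixed-capacity set of ints. This is an illustration only: `ToyRobinHoodSet` is hypothetical, uses an identity hash so the layout is easy to follow, never grows, and is not the `HashTable` implementation.

```cpp
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

class ToyRobinHoodSet {
public:
    explicit ToyRobinHoodSet(size_t capacity)
        : m_buckets(capacity)
    {
    }

    bool insert(int value)
    {
        // Refuse duplicates, and always keep one slot free so probes terminate.
        if (find(value).has_value() || m_size + 1 >= m_buckets.size())
            return false;
        Bucket incoming { value, 0, true };
        size_t index = ideal_index(value);
        for (;;) {
            Bucket& bucket = m_buckets[index];
            if (!bucket.used) {
                bucket = incoming;
                ++m_size;
                return true;
            }
            // The resident is "richer" (closer to home): steal its slot
            // and continue by re-inserting the displaced bucket.
            if (bucket.distance < incoming.distance)
                std::swap(bucket, incoming);
            index = (index + 1) % m_buckets.size();
            ++incoming.distance;
        }
    }

    bool remove(int value)
    {
        auto found = find(value);
        if (!found.has_value())
            return false;
        size_t index = *found;
        // Shift followers back into the freed slot, decrementing their distance.
        for (;;) {
            size_t next = (index + 1) % m_buckets.size();
            if (!m_buckets[next].used || m_buckets[next].distance == 0)
                break;
            m_buckets[index] = m_buckets[next];
            --m_buckets[index].distance;
            index = next;
        }
        m_buckets[index] = {};
        --m_size;
        return true;
    }

    bool contains(int value) const { return find(value).has_value(); }

    // Probe distance of a stored value from its ideal position.
    std::optional<size_t> distance_of(int value) const
    {
        auto found = find(value);
        if (!found.has_value())
            return std::nullopt;
        return m_buckets[*found].distance;
    }

private:
    struct Bucket {
        int value { 0 };
        size_t distance { 0 };
        bool used { false };
    };

    // Identity hash, chosen only to make the example predictable.
    size_t ideal_index(int value) const { return static_cast<size_t>(value) % m_buckets.size(); }

    std::optional<size_t> find(int value) const
    {
        size_t index = ideal_index(value);
        for (size_t distance = 0; distance < m_buckets.size(); ++distance) {
            Bucket const& bucket = m_buckets[index];
            // A richer resident means the value would have displaced it on insert.
            if (!bucket.used || bucket.distance < distance)
                return std::nullopt;
            if (bucket.value == value)
                return index;
            index = (index + 1) % m_buckets.size();
        }
        return std::nullopt;
    }

    std::vector<Bucket> m_buckets;
    size_t m_size { 0 };
};
```

Note that lookups can stop early as soon as they meet a bucket that is closer to its ideal position than the probe is, which is what keeps the number of probes low.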
This approximation tries to generate values within 0.1% of their actual
expected value. Microbenchmarks indicate that this iterative SIMD
version can be up to 60x faster than `AK::SIMD::exp`.
The parser is still very much a work-in-progress, but it can currently
parse most of the basic bits. The only *completely* unimplemented things
in the parser are:
- heredocs (io_here)
- alias expansion
- arithmetic expansion
There is a whole suite of bugs, and syntax highlighting is unreliable
at best.
For now, this is not attached anywhere; a future commit will enable it
for /bin/sh or a `Shell --posix` invocation.
This ensures constructors that take a span or an initializer_list
don't allocate when there's already enough inline storage.
(Previously these constructors always allocated)
This is done by providing Traits<ByteBuffer>::equals functions for
(Readonly)Bytes, as the base GenericTraits<T>::equals is unable to
convert the ByteBuffer to (Readonly)Bytes to then use Span::operator==
This allows us to check if a Vector<ByteBuffer> contains a
(Readonly)Bytes without having to make a copy of it into a ByteBuffer
first. The initial use of this is in LibWeb with CORS-preflight, where
we check the split contents of the Access-Control headers with
Fetch::Infrastructure::Request::method() and static StringViews
such as "*"sv.bytes().
It wouldn't make much sense on its own (as the Kernel only has errno
Errors), but it's an easy fix for not having to ifdef away every single
usage of `is_errno` in code that is shared between Userland and Kernel.
This code should not be used in the kernel - we should always propagate
proper errno codes in case we need to return those to userland so it
could decode it in a reasonable way.
This new method is meant to be used in both userspace and kernel code.
The idea is to allow printing of a verbose message and then returning an
errno code which is the proper mechanism for kernel code because we
should almost always assume that such an error will be propagated back to
userspace in some way, so the userspace code could reasonably decode it.
For userspace code however, this new method is meant to be a simple
wrapper for Error::from_string_view, because for most invocations, it's
much more useful to have a verbose & literal error than an errno code, so
we simply ignore that errno code completely in such context.
For example, consider cases where we want to propagate errors only in
specific instances:
    auto result = read_data(); // something like ErrorOr<ByteBuffer>
    if (result.is_error() && result.error().code() != EINTR)
        continue;
    auto bytes = TRY(result);
The TRY invocation will currently copy the byte buffer when the
expression (in this case, just a local variable) is stored into
_temporary_result.
This patch binds the expression to a reference to prevent such copies.
In less trivial invocations (such as TRY(some_function())), this will
incur only temporary lifetime extensions, i.e. no functional change.
As of now, there is a default copy constructor on Error. A future commit
will make this non-public to prevent implicit copies, so to prepare for
that, this adds a factory for the few cases where a copy is really
needed.
Just because we may compile serenity with or without NDEBUG doesn't
mean that consuming projects or Ports will share the setting.
Always define the custom assertion function so that we don't have to
keep the same debug settings between all projects.
This API is only used by Jakt to implement weak reference unwrapping.
By making it return a NonnullRefPtr, it can be assigned to anything
that accepts a NonnullRefPtr, unlike the previous T* return type (since
that can also be null).
Template arguments are checked to ensure that the type returned by the
invokee is equal or convertible to the `Out` type.
Compilation now fails on:
`Function<void()> f = []() -> int { return 0; };`
But this is allowed:
`Function<ErrorOr<int>()> f = []() -> int { return 0; };`
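The shape of such a check can be sketched with standard type traits. This is not the constraint used in AK::Function: `returns_compatible_type` is a hypothetical name, only zero-argument callables are handled, and `std::optional<int>` stands in for `ErrorOr<int>` as a type that `int` converts to implicitly.

```cpp
#include <optional>
#include <type_traits>

// Hypothetical trait: true when the result of calling `Callable` with no
// arguments can be used where `Out` is expected.
template<typename Out, typename Callable>
constexpr bool returns_compatible_type = std::is_convertible_v<std::invoke_result_t<Callable>, Out>;

// Stand-in callables for the two cases.
inline constexpr auto returns_int = []() -> int { return 0; };
inline constexpr auto returns_nothing = []() {};

// An int result cannot silently become void...
static_assert(!returns_compatible_type<void, decltype(returns_int)>);
// ...but it can become a wrapper type that is constructible from int.
static_assert(returns_compatible_type<std::optional<int>, decltype(returns_int)>);
// A void result matches a void `Out`.
static_assert(returns_compatible_type<void, decltype(returns_nothing)>);
```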
Pretty much no other read function does this, and getting rid of the
typename template parameter for the stream makes the transition to the
new AK::Stream a bit easier.
Similar to the return values earlier, a signed value doesn't really make
sense here. Relying on the much more standard `size_t` makes it easier
to use Stream in all contexts.
Quick select is an algorithm that is able to find the median
of a Vector without fully sorting it.
This replaces the old, very naive implementation
of `AK::Statistics::median()` with `AK::quickselect_inline`.
This adds the quick select algorithm that finds
the kth smallest element for any collection.
Whilst doing so it also partially sorts the collection.
I have also included the option to use different pivoting functions,
including median of medians, which gives quick select a truly linear
time complexity at the cost of enormous overhead, so it is only really
useful for really large datasets.
The name was chosen to reflect the fact that it modifies
the collection in place during the selection process.
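A minimal sketch of in-place quick select, assuming a middle-element pivot and Lomuto partitioning (the pivoting strategies in this change may differ). `quickselect_in_place` is a hypothetical name and not the signature of `AK::quickselect_inline`.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Returns the k-th smallest element (k is zero-based) and reorders `values`
// in place so that everything before index k is <= values[k] and everything
// after it is >= values[k]. Assumes k < values.size().
int quickselect_in_place(std::vector<int>& values, size_t k)
{
    size_t low = 0;
    size_t high = values.size() - 1;
    while (low < high) {
        // Move the middle element to the end and use it as the pivot.
        std::swap(values[low + (high - low) / 2], values[high]);
        int pivot = values[high];

        // Lomuto partition: smaller elements are collected at the front.
        size_t store = low;
        for (size_t i = low; i < high; ++i) {
            if (values[i] < pivot)
                std::swap(values[i], values[store++]);
        }
        std::swap(values[store], values[high]);

        if (k == store)
            return values[k];
        // Only the side that contains index k needs further work.
        if (k < store)
            high = store - 1;
        else
            low = store + 1;
    }
    return values[k];
}
```

For a median, k would be `values.size() / 2` for an odd-sized collection. Unlike a full sort, only the partition containing k is ever revisited.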
The AnyString concept is currently broken because it checks whether a
StringView is constructible from a type T. The StringView constructors,
however, only accept constant lvalue references - i.e. `T const&`.
This also adds a test to ensure this continues to work.
`Stream` will be qualified as `AK::Stream` until we remove the
`Core::Stream` namespace. `IODevice` now reuses the `SeekMode` that is
defined by `SeekableStream`, since defining its own would require us to
qualify it with `AK::SeekMode` everywhere.