As a nearby comment says, "This is a terrible approximation".
This doesn't make things less terrible, but it does make things
more correct in the given framework of terribleness.
Fixes#17156.
There were two problems:
1. They didn't handle surrogates
2. They used signed chars, leading to eg 0x00e4 being treated as 0xffe4
Also add a basic test that catches both issues.
There's some code duplication with Utf16CodePointIterator::operator*(),
but let's get things working first.
In cases where we know a string literal will fit in the short string
storage, we can do so at compile time without needing to handle error
propagation. If the provided string literal is too long, a compilation
error will be emitted due to the failed VERIFY statement being a non-
constant expression.
We should expect the sniffing method and the initialize method to fail
because this test case is testing that the ICO image decoder should not
decode random data within it.
When trying to figure out the correct implementation, we now have a very
strong distinction on plugins that are well suited for sniffing, and
plugins that need a MIME type to be chosen.
Instead of having multiple calls to non-static virtual sniff methods for
each Image decoding plugin, we have 2 static methods for each
implementation:
1. The sniff method, which in contrast to the old method, gets a
ReadonlyBytes parameter and ensures we can figure out the result
with zero heap allocations for most implementations.
2. The create method, which just creates a new instance so we don't
expose the constructor to everyone anymore.
In addition to that, we have a new virtual method called initialize,
which has a per-implementation initialization pattern to actually ensure
each implementation can construct a decoder object, and then have a
correct context being applied to it for the actual decoding.
In order to prevent this commit from having to refactor almost all of
Intl, the goal here is to update the internal parsing/canonicalization
of locales within LibLocale only. Call sites which are already equiped
to handle String and OOM errors do so, however.
Let's put test files with the tests themselves, instead of a random user
directory. (But still copy them so they appear in the user directory
for convenience.)
Putting the LibTTF tests into the LibGfx directory worked fine before,
but causes issues if we try and call this from Lagom. Also, it's tidier
to put LibTTF tests in a LibTTF directory. :^)
The test currently watches /tmp, which the OS can create/modify files
under at any time outside of our control. So just ignore events that we
aren't interested in.
Also test removing an item from the FileWatcher.
A negative return value doesn't make sense for any of those functions.
The return types were inherited from POSIX, where they also need to have
an indicator for an error (negative values).
The Unicode spec defines much more complicated caseless matching
algorithms in its Collation spec. This implements the "basic" case
folding comparison.
Case folding rules have a similar mapping style as special casing rules,
where one code point may map to zero or more case folding rules. These
will be used for case-insensitive string comparisons. To see how case
folding can differ from other casing rules, consider "ß" (U+00DF):
>>> "ß".lower()
'ß'
>>> "ß".upper()
'SS'
>>> "ß".title()
'Ss'
>>> "ß".casefold()
'ss'
This implements FileWatcher using inotify filesystem events. Serenity's
InodeWatcher is remarkably similar to inotify, so this is almost an
identical implementation.
The existing TestLibCoreFileWatcher test is added to Lagom (currently
just for Linux).
This does not implement BlockingFileWatcher as that is currently not
used anywhere but on Serenity.
The existing `is_i32()` and friends only check if `i32` is their
internal type, but a value such as `0` could be literally any integer
type internally. `is_integer<T>()` instead determines whether the
contained value is an integer and can fit inside T.
This saves us an actual seek and rereading already stored buffer data in
cases where the seek is entirely covered by the currently buffered data.
This is especially important since we implement `discard` using `seek`
for seekable streams.
Unicode declares that to titlecase a string, the first cased code point
after each word boundary should be transformed to its titlecase mapping.
All other codepoints are transformed to their lowercase mapping.
This error was introduced by 9a7accdd and had a significant impact on
`BufferedFile` behavior. Hence, we started seeing crash in test262.
By itself, the issue was a wrong calculation of the internal reading
spans when using the `read` and `until` parameters. Which can lead to
at worse crash in VERIFY and at least weird behaviors as missed needles
or detections out of bounds.
It was also accompanied by an erroneous test.
This patch fixes the bug, the test and also provides more tests.
This parameter allows to start searching after an offset. For example,
to resume a search.
It is unfortunately a breaking change in API so this patch also modifies
one user and one test.