This is preparatory work to read locale extensions. The parser currently
enforces that the entire string is consumed. But to parse extensions,
parse_unicode_locale_id() will need parse_unicode_language_id() to just
stop parsing on the first segment that does not match the language ID
grammar. It will also need to know where the parsing stopped. Both of
these needs are fulfilled by GenericLexer.
The caveat is that we can no longer simply split the parsed string on
separator characters. So parse_unicode_language_id() now operates as a
small state machine.
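
A minimal sketch of that state machine follows, written against the
standard library rather than AK's GenericLexer. The names `LanguageID`,
`parse_language_id` and `consumed` are hypothetical, and the segment
rules are a simplified reading of the language ID grammar, so treat it
as an illustration of the approach and not the LibUnicode code.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical result type; not the LibUnicode API.
struct LanguageID {
    std::string language;
    std::optional<std::string> script;
    std::optional<std::string> region;
    std::vector<std::string> variants;
    size_t consumed { 0 }; // Offset into the input where parsing stopped.
};

static bool is_alpha(char c) { return std::isalpha(static_cast<unsigned char>(c)) != 0; }
static bool is_digit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }

static bool all_alpha(std::string_view s)
{
    for (char c : s)
        if (!is_alpha(c))
            return false;
    return true;
}

static bool all_digit(std::string_view s)
{
    for (char c : s)
        if (!is_digit(c))
            return false;
    return true;
}

static bool all_alnum(std::string_view s)
{
    for (char c : s)
        if (!is_alpha(c) && !is_digit(c))
            return false;
    return true;
}

static bool is_language(std::string_view s) { return all_alpha(s) && ((s.size() >= 2 && s.size() <= 3) || (s.size() >= 5 && s.size() <= 8)); }
static bool is_script(std::string_view s) { return s.size() == 4 && all_alpha(s); }
static bool is_region(std::string_view s) { return (s.size() == 2 && all_alpha(s)) || (s.size() == 3 && all_digit(s)); }
static bool is_variant(std::string_view s) { return all_alnum(s) && ((s.size() >= 5 && s.size() <= 8) || (s.size() == 4 && is_digit(s[0]))); }

std::optional<LanguageID> parse_language_id(std::string_view input)
{
    enum class State { Language, Script, Region, Variant };

    LanguageID id;
    State state = State::Language;
    size_t position = 0;

    while (position < input.size()) {
        // Every segment after the first must be preceded by a separator.
        size_t start = position;
        if (state != State::Language) {
            if (input[position] != '-' && input[position] != '_')
                break;
            ++start;
        }

        size_t end = start;
        while (end < input.size() && input[end] != '-' && input[end] != '_')
            ++end;
        std::string_view segment = input.substr(start, end - start);

        if (state == State::Language) {
            if (!is_language(segment))
                return std::nullopt;
            id.language = std::string(segment);
            state = State::Script;
        } else if (state == State::Script && is_script(segment)) {
            id.script = std::string(segment);
            state = State::Region;
        } else if (state != State::Variant && is_region(segment)) {
            id.region = std::string(segment);
            state = State::Variant;
        } else if (is_variant(segment)) {
            id.variants.push_back(std::string(segment));
            state = State::Variant;
        } else {
            // First segment outside the grammar: stop before its separator.
            break;
        }
        position = end;
    }

    if (id.language.empty())
        return std::nullopt;
    id.consumed = position;
    return id;
}
```

A caller parsing extensions would resume at `consumed`, which points at
the separator in front of the first segment that was not accepted.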
This is needed so all headers and files exist on disk, so that
the sonar cloud analyzer can find them when executing the compilation
commands contained in compile_commands.json, without actually building.
Co-authored-by: Andrew Kaster <akaster@serenityos.org>
This commit adds a new process method to all Decoder subclasses that
does what to_utf8 used to do, and allows callers to customize the handling
of individual UTF-8 code points through a callback. Decoder::to_utf8
now uses this API to generate a string via StringBuilder, preserving the
original behavior.
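
The shape of that API can be sketched roughly as below. This uses
std::function and std::string in place of AK types, and the
`Latin1Decoder` class and `append_utf8` helper are hypothetical
stand-ins chosen for illustration; the real signatures may differ.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <string_view>

class Decoder {
public:
    virtual ~Decoder() = default;

    // Subclasses decode the input and report each code point to the callback.
    virtual void process(std::string_view input, std::function<void(char32_t)> const& on_code_point) = 0;

    // The old entry point, now expressed in terms of process().
    std::string to_utf8(std::string_view input)
    {
        std::string output;
        process(input, [&](char32_t code_point) { append_utf8(output, code_point); });
        return output;
    }

private:
    static void append_utf8(std::string& output, char32_t code_point)
    {
        assert(code_point <= 0x10FFFF);
        if (code_point < 0x80) {
            output.push_back(static_cast<char>(code_point));
        } else if (code_point < 0x800) {
            output.push_back(static_cast<char>(0xC0 | (code_point >> 6)));
            output.push_back(static_cast<char>(0x80 | (code_point & 0x3F)));
        } else if (code_point < 0x10000) {
            output.push_back(static_cast<char>(0xE0 | (code_point >> 12)));
            output.push_back(static_cast<char>(0x80 | ((code_point >> 6) & 0x3F)));
            output.push_back(static_cast<char>(0x80 | (code_point & 0x3F)));
        } else {
            output.push_back(static_cast<char>(0xF0 | (code_point >> 18)));
            output.push_back(static_cast<char>(0x80 | ((code_point >> 12) & 0x3F)));
            output.push_back(static_cast<char>(0x80 | ((code_point >> 6) & 0x3F)));
            output.push_back(static_cast<char>(0x80 | (code_point & 0x3F)));
        }
    }
};

// Hypothetical subclass: in Latin-1 every byte maps directly to a code point.
class Latin1Decoder final : public Decoder {
public:
    void process(std::string_view input, std::function<void(char32_t)> const& on_code_point) override
    {
        for (unsigned char byte : input)
            on_code_point(byte);
    }
};
```

Callers that don't want a string at all, for example to count or filter
code points, can pass their own callback to process() directly.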
This always subtracted the glyph width of a space, despite isspace
also accepting newlines and a few other characters. It now also uses
AK/CharacterTypes.h. :^)
Non-printable characters should always have a width of 0. This is not
true for some characters like tab, but those can be exempted as the need
arises. Doing this here saves us from a bunch of checks in any place
that needs to figure out glyph widths for text which can contain
non-printable characters.
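
As a rough sketch of the idea, assuming a hypothetical fixed-width font
and only treating ASCII control characters as non-printable (the local
`is_ascii` and `is_ascii_printable` helpers stand in for the ones in
AK/CharacterTypes.h):

```cpp
#include <cassert>
#include <string_view>

constexpr bool is_ascii(char32_t code_point) { return code_point < 0x80; }
constexpr bool is_ascii_printable(char32_t code_point) { return code_point >= 0x20 && code_point <= 0x7E; }

// Hypothetical font where every printable glyph has the same width.
class FixedWidthFont {
public:
    explicit FixedWidthFont(int glyph_width)
        : m_glyph_width(glyph_width)
    {
        assert(glyph_width >= 0);
    }

    int glyph_width(char32_t code_point) const
    {
        // Non-printable characters never contribute any width.
        if (is_ascii(code_point) && !is_ascii_printable(code_point))
            return 0;
        return m_glyph_width;
    }

    // Callers can sum widths without checking for non-printables themselves.
    int width(std::u32string_view text) const
    {
        int total = 0;
        for (char32_t code_point : text)
            total += glyph_width(code_point);
        return total;
    }

private:
    int m_glyph_width;
};
```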
This more clearly expresses the purpose of this flag. Since only
CSS::WhiteSpace::Nowrap sets this value to false and it does not respect
linebreaks, this made the most sense as a flag name.
This commit refactors the text chunking algorithm used in
TextNode::ChunkIterator. The m_start_of_chunk member variable has been
replaced with a local variable that's anchored to the current iterator
at the start of every next() call, and the algorithm is made a little
clearer by explicitly separating what can and cannot peek into the
next character during iteration.
We don't need transitions for either of these:
- Adding the 'name' property to a constructor object
- Adding the 'constructor' property to its prototype object
- Replace the misleading abuse of the m_transitions_enabled flag for the
fast path without lookup with a new m_initialized boolean that's set
either by Heap::allocate() after calling the Object's initialize(), or
by the GlobalObject in its special initialize_global_object(). This
makes it work regardless of the shape's uniqueness.
- When we're adding a new property past the initialization phase,
there's no need to do a second metadata lookup to retrieve the storage
value offset - it's known to always be the shape's property count
minus one. Also, instead of doing manual storage resizing and
assignment via indexing, just use Vector::append().
- When we didn't add a new property but are overwriting an existing one,
the property count and therefore the storage value offset don't change,
so we don't have to retrieve it either.
As a result, Object::set_shape() is now solely responsible for updating
the m_shape pointer and is not resizing storage anymore, so I moved it
into the header.
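
The offset reasoning can be illustrated with a heavily simplified
sketch. The `Shape` and `Object` types below are hypothetical stand-ins
(a plain map and an int vector, no transitions or unique shapes) and
only show why appending to storage is enough for a new property:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in: maps property names to storage offsets.
struct Shape {
    std::unordered_map<std::string, size_t> offsets;
    size_t property_count() const { return offsets.size(); }
};

class Object {
public:
    void put(std::string const& name, int value)
    {
        auto it = m_shape.offsets.find(name);
        if (it != m_shape.offsets.end()) {
            // Overwriting: property count and offset are unchanged.
            m_storage[it->second] = value;
            return;
        }

        // New property: its offset is the old property count, which is
        // the new count minus one, so appending puts it in the right slot.
        m_shape.offsets.emplace(name, m_shape.property_count());
        m_storage.push_back(value);
        assert(m_storage.size() == m_shape.property_count());
    }

    int get(std::string const& name) const { return m_storage.at(m_shape.offsets.at(name)); }
    size_t storage_size() const { return m_storage.size(); }

private:
    Shape m_shape;
    std::vector<int> m_storage;
};
```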
We don't need to be allocating Strings for these names during static
initialization. The C-string literals will be stored in the .rodata ELF
section, so they're not going anywhere. We can just wrap the .rodata
storage for the class names in StringViews and use those in Object
registration and lookup APIs.
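
A minimal sketch of the pattern, using std::string_view in place of
AK's StringView. The `ObjectClassRegistration` layout, the linked-list
registry and the two example registrations are assumptions made for
illustration, not the actual LibCore code:

```cpp
#include <cassert>
#include <string_view>

using namespace std::literals;

class ObjectClassRegistration {
public:
    // The view points at a string literal, so no allocation or copy
    // happens during static initialization.
    explicit ObjectClassRegistration(std::string_view class_name)
        : m_class_name(class_name)
        , m_next(s_head)
    {
        assert(!class_name.empty());
        s_head = this;
    }

    std::string_view class_name() const { return m_class_name; }

    static ObjectClassRegistration const* find(std::string_view class_name)
    {
        for (auto const* entry = s_head; entry; entry = entry->m_next) {
            if (entry->m_class_name == class_name)
                return entry;
        }
        return nullptr;
    }

private:
    std::string_view m_class_name;
    ObjectClassRegistration const* m_next;
    static inline ObjectClassRegistration const* s_head = nullptr;
};

// Hypothetical registrations; the literals live for the whole program.
static ObjectClassRegistration s_button_registration { "Button"sv };
static ObjectClassRegistration s_label_registration { "Label"sv };
```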
Optimizations:
- Make sure `DT_SYMTAB` is a string view literal, instead of a string.
- DynamicObject::HashSection::lookup_sysv_symbol should be using
raw_name() for symbol comparison to avoid needlessly calling
`strlen`, since StringView::operator== walks the C string without
calling `strlen` first.
- DynamicObject::HashSection::lookup_gnu_symbol shouldn't create a
symbol unless we know the hashes match first.
In order to test these changes I enabled the undefined behavior
sanitizer, which creates a huge number of relocations, and then ran the
browser with the help argument 100 times. The browser is a fairly big
app with a few different libraries being loaded, so it seemed like a
good target.
Command: `time -n 100 br --help`
Before:
```
Timing report:
==============
Command: br --help
Average time: 3897.679931 ms
Excluding first: 3901.242431 ms
```
After:
```
Timing report:
==============
Command: br --help
Average time: 3612.860107 ms
Excluding first: 3613.54541 ms
```
This allows us to remove all the add_subdirectory calls from the top
level CMakeLists.txt that referred to targets linking LagomCore.
Segregating the host tools and Serenity targets helps us get to a place
where the main Serenity build can simply use a CMake toolchain file
rather than swapping all the compiler/sysroot variables after building
host libraries and tools.
Gather the custom commands for each of the 6 bindings generated targets
for libjs_js_wrapper invocations into some lists so that we can foreach
over the lists instead of having 6 copy pasted commands with one or two
things modified for each one.
Additional refactoring: use the target_sources command to inform CMake
about additional source files for LibWeb, but only after it's been
declared as a library via add_library. Also avoid use of the
write_if_different script and use cmake -E copy_if_different instead.
This lets us express the actions in rules that CMake understands without
going to an external source file. It also exposes a few optimization
opportunities for the code generators to accept an output filename
instead of always going to stdout.
Moving this helper CMake file to the centralized Meta/CMake folder helps
to get a better grasp on what extra files are required for the build,
and what files are generated.
While we're at it, don't use add_compile_definitions for
ENABLE_UNICODE_DATA, which only needs to be seen by LibUnicode sources.
In general, I think `opt == x` looks much nicer than
`opt.has_value() && opt.value() == x`, so I'm updating
the remaining few instances I could find with some regex
magic in my search.
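
For illustration, the same equivalence holds for std::optional, which is
used here as an analogue of AK's Optional. The function names and the
value 42 are made up for the example:

```cpp
#include <cassert>
#include <optional>

// The long spelling this change removes.
bool is_answer_verbose(std::optional<int> const& opt)
{
    return opt.has_value() && opt.value() == 42;
}

// The short spelling: an empty optional never compares equal to a value.
bool is_answer(std::optional<int> const& opt)
{
    bool result = (opt == 42);
    assert(result == is_answer_verbose(opt));
    return result;
}
```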
All audio applications (aplay, Piano, Sound Player) respect that the
system can, in theory, run at any sample rate. Therefore, they
resample their own audio into the system sample rate.
LibAudio previously had its loaders resample their own audio, even
though they expose their sample rate. This is now changed. The loaders
output audio data in their file's sample rate, which the user has to
query and resample appropriately. Resampling code from Buffer, WavLoader
and FlacLoader is removed.
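
What a user of the loaders now has to do can be sketched as follows.
This is a simple linear-interpolation resampler over mono float samples
and is only an assumption for illustration; `resample_linear` is a
hypothetical name and the resampling the applications actually use may
work differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts samples from the file's rate (as reported by the loader)
// to the system's rate by interpolating between neighboring samples.
std::vector<float> resample_linear(std::vector<float> const& input, uint32_t source_rate, uint32_t target_rate)
{
    assert(source_rate > 0 && target_rate > 0);
    if (input.empty() || source_rate == target_rate)
        return input;

    auto output_count = static_cast<size_t>(static_cast<uint64_t>(input.size()) * target_rate / source_rate);
    double step = static_cast<double>(source_rate) / target_rate;

    std::vector<float> output;
    output.reserve(output_count);
    for (size_t i = 0; i < output_count; ++i) {
        double position = i * step;
        size_t index = std::min(static_cast<size_t>(position), input.size() - 1);
        double fraction = position - static_cast<double>(index);
        float current = input[index];
        // Past the last sample there is nothing to interpolate towards.
        float next = index + 1 < input.size() ? input[index + 1] : current;
        output.push_back(static_cast<float>(current + (next - current) * fraction));
    }
    return output;
}
```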
Note that these applications only check the sample rate at startup,
which is reasonable (the user has to restart applications when changing
the sample rate). Fully dynamic adaptation could lead to errors and
would require another IPC interface. This seems to be enough for now.
Two new ioctl requests are used to get and set the sample rate of the
sound card. The SB16 device keeps track of the sample rate separately,
because I don't want to figure out how to read the sample rate from the
device; it's easier that way.
The soundcard write doesn't set the sample rate to 44100 Hz every time
anymore, as we want to change it externally.
This AO is required for a bunch of PlainTime related methods.
As part of this change the `TemporalTime` record was renamed to
`UnregulatedTemporalTime` and a new `TemporalTime` record that matches
the other Temporal parse result records was added. This also has the
added benefit of getting rid of a would-be round-trip cast from integer
to double and back in `ParseTemporalTimeString`.