This reduces the number of tasks to schedule, and the complexity of the
build system integrations for the BindingsGenerator. As a bonus, we move
the "only write if changed" feature into the generator to reduce the
build system load on generated files for this generator.
The LocaleData generator currently stores vectors of unique instances of
CLDR data (e.g. languages, currencies, etc.). For each CLDR file that we
parse, we linearly search through those vectors to decide if the current
field being parsed is unique. Given the size of the CLDR, this adds up
to quite a bit of time.
Augment these vectors with a hash map to store the index of each unique
instance in those vectors. This allows for quickly checking if a field
is unique, and to later look up those indices.
We do not apply this technique to every bit of CLDR data here. For
example, CLDR::character_orders contains only 2 entries. In that case,
it is quicker to search the vector than it is to hash a string key.
This reduces the runtime of GenerateLocaleData from to 2.03s to 1.09s.
Similar to languages and currencies, extract the loop to collect the
unique set of date fields to a preprocessing function. This alone does
not yield any performance improvement, but combined with an upcoming
patch will make the parse_locale_date_fields() a bit faster.
We currently parse each CLDR calendar, then decide based on its primary
key whether we want to skip it. Instead, we can decide to skip it based
on its file name.
This reduces the runtime of GenerateLocaleData from 2.03s to 1.97s.
The LocaleData generator has to read a few of the CLDR files more than
once, to e.g. prepare some data up front (for reasons why, see commits
c86f7a6 and 0b69e9f). This takes non-neglible time, especially for large
JSON files such as currencies.json. So in these cases, cache the parsed
JSON in a map.
This reduces the runtime of GenerateLocaleData from 2.32s to 2.03s.
These were used to generate specialized tables. Now that those tables
have been migrated to general 2-stage lookup tables, these fields are
all unused.
Similar to commit 0652cc4, we now generate 2-stage lookup tables for
case conversion information. Only about 1500 code points are actually
cased. This means that case information is rather highly compressible,
as the blocks we break the code points into will generally all have no
casing information at all.
In total, this change:
* Does not change the size of libunicode.so (which is nice because,
generally, the 2-stage lookup tables are expected to trade a bit
of size for performance).
* Reduces the runtime of the new benchmark test case added here from
1.383s to 1.127s (about an 18.5% improvement).
There is no functional change here. This information will compose the
upcoming multistage casing tables in an upcoming patch. Extract it to
its own struct to prepare for that.
There is no functional change here. This just adjusts the changes made
in commit 0652cc4 to be a bit more generic for code point casing tables.
We currently only generate property tables, which boil down to a vector
of booleans. Casing tables will be a struct of varying types, so this
generalizes some of the generator to prepare for that ahead of time, to
make the upcoming casing patch smaller / easier to grok.
When generating code point property tables, we currently binary search
the code point range lists for each property to decide if a code point
has that property. However, we are both iterating over the code points
and through the sorted properties in order. This means we do not need
to search code point ranges that are below the current code point at
all. We can even remove the code point ranges that fall below the
current code point, as we will not see a code point in those ranges
again.
On my machine, this reduces the run time of GenerateUnicodeData from
3.4 seconds to 1.2 seconds.
The idl file lists are used for two things:
1. As inputs for `generate_window_or_worker_interfaces`
2. In a loop in `generate_idl_bindings` and the loop variable
is passed to `rebase_path`
Both these cases can handle a normal fully-qualified GN path,
so there's no need for the "abspath".
No behavior change.
Currently, the `isobmff` utility will only print the media file type
info from the FileTypeBox (major brand and compatible brands), as well
as the names and sizes of top-level boxes.
This disables running benchmark test cases on Lagom in CI. When we run
unit tests on-board Serenity, run-tests sets this environment variable.
We do not use that utility to run unit tests on Lagom, so we've been
running benchmark tests unnecessarily.
When markdown-check is built, it outputs hundreds of lines of "ignoring
this and that link because reasons". This is extremely not helpful when
trying to figure out exactly which check failed on your commit. Also
remove the timing numbers from lint-ci.sh These are just noise and also
don't help to figure out which pre-commit check failed. Ideally the
output on fail should be "[OK]: Check A" for all the passing checks and
"[FAIL] Check N" with the required context for the failed check.
We currently produce a single table for all categories of code point
properties (GeneralCategory, Script, etc.). Each row contains a field
indicating the range of code points to which that property applies. At
runtime, we then do a binary search through that table to decide if a
code point has a property.
This changes our approach to generate a 2-stage lookup table for each of
those categories. There is an in-depth explanation of these tables above
the new `create_code_point_tables` method. The end effect is that code
point property lookup is reduced from a binary search to constant-time
array lookups.
In total, this change:
* Increases the size of libunicode.so from 2.7 MB to 2.9 MB.
* Reduces the runtime of the new benchmark test case added here from
3.576s to 1.020s (a 3.5x speedup).
* In a profile of resizing a TextEditor window with a 3MB file open,
the runtime of checking if a code point has a word break property
reduces from ~81% to ~56%.
The next commit will need a type from LibUnicode/CharacterTypes.h. To
avoid conflicts between that header's CodePointRange and the one that is
defined in the code generator, just use the public definition.
We started generating this data in commit 0505e03, but it was unused.
It's still not used, so let's remove it, rather than bloating the size
of libunicode.so with unused data. If we need it in the future, it's
trivial to add back.
Note we *have* always used the block name data from that commit, and
that is still present here.
The AST interpreter is still available behind a new `--ast` flag.
We switch to testing with bytecode in the big run-tests battery on
SerenityOS. Lagom CI continues running both AST and BC.
Rather than splitting the Iterator type and its AOs into two files,
let's combine them into one file to match every other JS runtime object
that we have.
This is an editorial change in the ECMA-262 spec. See:
https://github.com/tc39/ecma262/commit/956e5af
This splits the GetIterator AO into two AOs, to remove some recursion
and to (soon) remove optional parameters.
fddbd11baa made it so the command executed
read `sh -c -- '"script" args*'`, the -- in this command is redundant as
the script name never starts with a dash and can never be interpreted as
an option or a flag.
The actually meaningful placement for -- here is after `$SUDO`, to make
sure `$SUDO` does not incorrectly treat `-c` as an option to itself, and
`$SHELL` cannot be interpreted as an option/flag in the extremely
unlikely event that it starts with a dash.
The dollar sign is a special character in POSIX shells and in the Ninja
build file format. If the file name contains a `$`, something goes wrong
in the escaping/unescaping of this symbol, and CMake/GCC/Clang generate
invalid dependency files where not all instances of `$` are escaped
properly. Because of this, Ninja fails to rebuild `$262Object.cpp` if
the headers included by it have changed.
Stale `$262Object.cpp.o` files have been the cause of mysterious crashes
multiple times which only go away after doing a clean build. Let's
prevent these from happening again by removing the `$` from the
filename.
Recreating the previous screenshot in a current build of the system will
show many, usually subtle, changes in comparison.
This patch adds a new screenshot of the SerenityOS desktop with
Terminal, File Manager, System Monitor and Ladybird visible.
This allows opening the ladybird.app app bundle on macOS, using Xcode
tools like Instruments on the applications in the app bundle, and even
installing the app bundle into /Applications :^)