There are cases where Lagom will build with GCC but not Clang.
This often goes unnoticed for a while as we don't often build with
Clang.
However, this is now important to test in CI because of the
OSS-Fuzz integration.
Note that this only tests the build, it does not run any tests.
Note that it also only builds LagomCore, Lagom and the fuzzers.
It does not build the other programs that use Lagom.
We added OSS-Fuzz integration in #4154, but documentation about it
is spread across several pull requests, IRC, and issues. Let's collect
the important bits in the ReadMe.
This commit is a mix of several commits, squashed into one because the
commits before 'Move regex to own Library and fix all the broken stuff'
were not fixable in any elegant way.
The commits are listed below for "historical" purposes:
- AK: Add options/flags and Errors for regular expressions
Flags can be provided for any possible flavour by adding a new scoped enum.
Handling of flags is done by templated Options class and the overloaded
'|' and '&' operators.
- AK: Add Lexer for regular expressions
The lexer parses the input and extracts tokens needed to parse a regular
expression.
- AK: Add regex Parser and PosixExtendedParser
This patchset adds a abstract parser class that can be derived to implement
different parsers. A parser produces bytecode to be executed within the
regex matcher.
- AK: Add regex matcher
This patchset adds an regex matcher based on the principles of the T-REX VM.
The bytecode pruduced by the respective Parser is put into the matcher and
the VM will recursively execute the bytecode according to the available OpCodes.
Possible improvement: the recursion could be replaced by multi threading capabilities.
To match a Regular expression, e.g. for the Posix standard regular expression matcher
use the following API:
```
Pattern<PosixExtendedParser> pattern("^.*$");
auto result = pattern.match("Well, hello friends!\nHello World!"); // Match whole needle
EXPECT(result.count == 1);
EXPECT(result.matches.at(0).view.starts_with("Well"));
EXPECT(result.matches.at(0).view.end() == "!");
result = pattern.match("Well, hello friends!\nHello World!", PosixFlags::Multiline); // Match line by line
EXPECT(result.count == 2);
EXPECT(result.matches.at(0).view == "Well, hello friends!");
EXPECT(result.matches.at(1).view == "Hello World!");
EXPECT(pattern.has_match("Well,....")); // Just check if match without a result, which saves some resources.
```
- AK: Rework regex to work with opcodes objects
This patchsets reworks the matcher to work on a more structured base.
For that an abstract OpCode class and derived classes for the specific
OpCodes have been added. The respective opcode logic is contained in
each respective execute() method.
- AK: Add benchmark for regex
- AK: Some optimization in regex for runtime and memory
- LibRegex: Move regex to own Library and fix all the broken stuff
Now regex works again and grep utility is also in place for testing.
This commit also fixes the use of regex.h in C by making `regex_t`
an opaque (-ish) type, which makes its behaviour consistent between
C and C++ compilers.
Previously, <regex.h> would've blown C compilers up, and even if it
didn't, would've caused a leak in C code, and not in C++ code (due to
the existence of `OwnPtr` inside the struct).
To make this whole ordeal easier to deal with (for now), this pulls the
definitions of `reg*()` into LibRegex.
pros:
- The circular dependency between LibC and LibRegex is broken
- Eaiser to test (without accidentally pulling in the host's libc!)
cons:
- Using any of the regex.h functions will require the user to link -lregex
- The symbols will be missing from libc, which will be a big surprise
down the line (especially with shared libs).
Co-Authored-By: Ali Mohammad Pur <ali.mpfard@gmail.com>
This was broken with the JS::Parser::Error position changes, but I don't
actually see a reason to do anything with the parser errors here, so
let's remove it and consider simply not crashing a success. :^)
It's a thin userland wrapper around adjtime(2). It can be used
to view current pending time adjustments, and root can use it to
smoothly adjust the system time.
As far as I can tell, other systems don't have a userland utility
for this, but it seems useful. Useful enough that I'm adding it to
the lagom build so I can use it on my linux box too :)
This broke in case of unterminated regular expressions, causing goofy location
numbers, and 'source_location_hint' to eat up all memory:
Unexpected token UnterminatedRegexLiteral. Expected statement (line: 2, column: 4294967292)
To keep track of ongoing terminal sessions, we now have a sort-of
traditional /var/run/utmp file, like other Unix systems.
Unlike other Unix systems however, ours is of course JSON. :^)
The /bin/utmpupdate program is used to update the file, which is
not writable by regular user accounts. This helper program is
set-GID "utmp".
While this _does_ add a point of failure, it'll be a pretty bad day when
google goes down.
And this is unlikely to put a (positive) dent in their incoming
requests, so let's just roll with it until we have our own TLS server.
We currently have 16 endpoints. The IDs are typed by a human at creation time.
This check will detect with we ever use an endpoint ID twice.
Since the large irrelevant directories are ignored, this should be quick enough.
The need for SERENITY_ROOT was basically eliminated in
73c953b674. The existing guess
'git rev-parse --show-toplevel' should be correct in all conceivable cases.
Most code just assumes the layout in git, or depends on SERENITY_ROOT as
set in the CMakeLists.txt. *Requiring* the user to set it doesn't make
sense anymore.
While I was in there anyway, I added exit code propagation. Also, 'find' should
be a tad faster now, because it doesn't enumerate files in the large ignored
directories Build/ and Toolchain/ anymore.
Introduces a SERENITY_QEMU_CPU environment variable that allows
overriding of the qemu -cpu command line parameter. If not specified,
the argument defaults to "max".
The primary motivation behind this is to be able to enable or disable
specific features for the vCPU in order to workaround QEMU issues with
certain hardware accelerators. For example, QEMU on Windows with WPHX
sometimes fails to start unless Virtual Machine eXtensions are
disabled. This can now be done with:
export SERENITY_QEMU_CPU="max,vmx=off"
If a buffer smaller than Elf32_Ehdr was passed to Image, header()
would do an out-of-bounds read.
Make parse() check for that. Make most Image methods assert that the image
is_valid(). For that to work, set m_valid early in Image::parse()
instead of only at its end.
Also reorder a few things so that the fuzzer doesn't hit (valid)
assertions, which were harmless from a security PoV but which still
allowed userspace to crash the kernel with an invalid ELF file.
Make dbgprintf()s configurable at run time so that the fuzzer doesn't
produce lots of logspam.
Useful for sanitizer fuzzer builds.
clang doesn't have a -fconcepts switch (I'm guessing it just enables
concepts automatically with -std=c++2a, but I haven't checked),
and at least the version on my system doesn't understand
-Wno-deprecated-move, so pass these two flags only to gcc.
In return, disable -Woverloaded-virtual which fires in many places.
The preceding commits fixed the handful of -Wunused-private-field
warnings that clang emitted.
This is only useful for build commands that update their destination in all cases
and thus sometimes confuse cmake into rebuilding everything needlessly.
This file was formerly named `Libraries/LibCore/puff.c` and it was not
checked by `check-style.sh` because it only checks .h and .cpp files.
Since this file was not written by us, we shouldn't check its style.
This file was also causing our Travis builds to fail.
LibWeb currently has no test suite or program. Let's change that :^)
test-web is mostly a copy of test-js, but modified for LibWeb.
test-web imports both LibJS/Tests/test-common.js and
LibWeb/Test/test-common.js
LibWeb's suite provides the ability to specify the page to load,
what to do before the page is loaded, and what to do after it's
loaded.
This also provides a test of document.doctype and its close sibling
document.compatMode.
Currently, this isn't added to Lagom because of CodeGenerators.
This moves most of the work from run-tests.sh to test-js.cpp. This way,
we have a lot more control over how the test suite runs, as well as how
it outputs. This should result in some cool functionality!
This commit also refactors test-common.js to mimic the jest library.
This should allow tests to be much more expressive :)
If SERENITY_BUILD is not set or empty, SERENITY_BUILD is treated as if
it was set to '.'.
`run.sh` will cd to SERENITY_BUILD before running the emulator.
Also, export SERENITY_BUILD in `Meta/CLion/run.sh` since we are using it
in `Meta/run.sh`.
Also, allow using a different bochs configuration file by setting the
SERENITY_BOCHSRC variable.
The run script is not in Kernel/ anymore, let's move `.bochsrc` in Meta/
so that it can be used with the new build system.
Also make bochs use `grub_disk_image` instead of `_disk_image`
We don't need to copy `run.sh` and modify it: we can just set
environment variables.
Now, we don't have to modify two files everytime we make a change to the
run.sh script.
Also make SERENITY_BUILD overridable with environment variables,why not?
In the GNU coreutils version of chown, ":" is a valid argument
(the command will result in a no-op), but POSIX chown does not
consider that valid.
If the user who ran build-image-*.sh was root, SUDO_UID and SUDO_GID
would not be set and, if the version of chown installed on the system
did not allow passing just a ":" as argument, the script would fail.
Let's default the value of SUDO_UID and SUDO_GID to 0 just in case.
This patch removes the setuid-root flag from the KeyboardSettings GUI
application and adds back the old "keymap" program.
It doesn't feel very safe and sound to have a GUI program runnable
as setuid-root, so in the next patch I'll be making KeyboardSettings
call out to the "keymap" program to do its bidding.
The "WebContent" service provides a very restricted instance of LibWeb
running as an unprivileged user account. This will be used to implement
process separation in Browser, among other things.
This first cut of the service only spawns a single WebContent process
when someone connects to /tmp/portal/webcontent. We will soon switch
this over to spawning a new process for each connection.
Since this feature is very immature, we'll be bringing it up inside of
Demos/WebView as a separate demo program. Eventually this will become
a reusable widget that anyone can embed and easily get out-of-process
web content in their GUI.
This is pretty, pretty cool! :^)
We were getting a little overly memey in some places, so let's scale
things back to business-casual.
Informal language is fine in comments, commits and debug logs,
but let's keep the runtime nice and presentable. :^)
- Parsing invalid JSON no longer asserts
Instead of asserting when coming across malformed JSON,
JsonParser::parse now returns an Optional<JsonValue>.
- Disallow trailing commas in JSON objects and arrays
- No longer parse 'undefined', as that is a purely JS thing
- No longer allow non-whitespace after anything consumed by the initial
parse() call. Examples of things that were valid and no longer are:
- undefineddfz
- {"foo": 1}abcd
- [1,2,3]4
- JsonObject.for_each_member now iterates in original insertion order
run.sh currently makes qemu print this as the very first output:
WARNING: Image format was not specified for '_disk_image' and probing guessed raw.
Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted.
Specify the 'raw' format explicitly to remove the restrictions.
This is scary for people who don't know that this is normal, and
write operations to block 0 being restricted could be confusing
for some operations happening from within serenity too at some point.
So specify the 'raw' format explicitly, like the warning suggests.
Requires using -drive instead of -hda.
From `man qemu-system`:
Instead of -hda, -hdb, -hdc, -hdd, you can use:
qemu-system-x86_64 -drive file=file,index=0,media=disk
So use that, and also pass format=raw.
.. and make travis run it.
I renamed check-license-headers.sh to check-style.sh and expanded it so
that it now also checks for the presence of "#pragma once" in .h files.
It also checks the presence of a (single) blank line above and below the
"#pragma once" line.
I also added "#pragma once" to all the files that need it: even the ones
we are not check.
I also added/removed blank lines in order to make the script not fail.
I also ran clang-format on the files I modified.
This commit updates CLionConfiguration.md and NotesOnWSL.md so that
they comply with new build system. In addition to that, the WSL doc
is updated to include instructions to run qemu (and serenity) natively
on Windows, without needing an X-window server.
Rather than printing them to stderr directly the parser now keeps a
Vector<Error>, which allows the "owner" of the parser to consume them
individually after parsing.
The Error struct has a message, line number, column number and a
to_string() helper function to format this information into a meaningful
error message.
The Function() constructor will now include an error message when
throwing a SyntaxError.
It didn't feel right to have a "DHCPClient" in a "Servers" directory.
Rename this to Services to better reflect the type of programs we'll
be putting in there.
This commit adds a CMakeLists.txt file that will be used by CLion to
configure the project and documentation explaining the steps to follow.
Configuring CLion this way enables important features like code
completion and file search. The configuration isn't perfect. There are
source files for which CLion cannot pick up the headers and asks to
manually include them from certain directories. But for the most part,
it works all right.
Note: clang only (see https://llvm.org/docs/LibFuzzer.html)
- add FuzzJs which will run the LibJS parser on random javascript inputs
- added a basic dictionary of javascript tokens
To use fuzzer:
CC=/usr/bin/clang CXX=/usr/bin/clang++ cmake -DENABLE_FUZZER_SANITIZER=1 ..
Fuzzers/FuzzJs -dict=../Fuzzers/FuzzJs.dict
Adding the ability to turn on Clang analyzer support in the Lagom build.
Right now the following are working warning free on the LibJS test suite:
-DENABLE_MEMORY_SANITIZER:BOOL=ON
-DENABLE_ADDRESS_SANITIZER:BOOL=ON
The following analyzer produces errors when running the LibJS test suite:
-DENABLE_UNDEFINED_SANITIZER:BOOL=ON
lint-shell-scripts searches over the repository looking for shell
scripts. On those found, shellcheck is run against them. If any linting
fails print those warnings and exit with a non-zero exit code.
Run this script automatically in Travis.
Warnings fixed:
* SC2086: Double quote to prevent globbing and word splitting.
* SC2006: Use $(...) notation instead of legacy backticked `...`
* SC2039: In POSIX sh, echo flags are undefined
* SC2209: Use var=$(command) to assign output (or quote to assign string)
* SC2164: Use 'cd ... || exit' or 'cd ... || return' in case cd fails
* SC2166: Prefer [ p ] && [ q ] as [ p -a q ] is not well defined.
* SC2034: i appears unused. Verify use (or export if used externally)
* SC2046: Quote this to prevent word splitting.
* SC2236: Use -z instead of ! -n.
There are still a lot of warnings in Kernel/run about:
- SC2086: Double quote to prevent globbing and word splitting.
However, splitting on space is intentional in this case, and not trivial to
change. Therefore ignore the warning for now - but we should fix this in
the future.
I've been wanting to do this for a long time. It's time we start being
consistent about how this stuff works.
The new convention is:
- "LibFoo" is a userspace library that provides the "Foo" namespace.
That's it :^) This was pretty tedious to convert and I didn't even
start on LibGUI yet. But it's coming up next.
The variable is not set anymore by the UseIt.sh script, so if a user doesn't
have it set in the .bashrc or .zshrc file already, it's not working properly.
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.
For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.
Going forward, all new source files should include a license header.
This is more of a meta thing, since it's not seeing active development,
but is just a way for me to build some Serenity parts and include them
in other projects. Move it out of the root to keep things tidy.
Added a script to build QEMU from source as part of the Toolchain.
The script content could be in BuildIt.sh but has been put in
a seperate file to make the build optional.
Added PATH=$PATH to sudo calls to hand over the Toolchain's PATH
setup by UseIt.sh. This enabled the script's to use the QEMU
contained in the SerenityOS toolchain.
Deleted old documentation in Meta and replaced it by a new
documentation in the Toolchain folder.
Ports/.port_include.sh, Toolchain/BuildIt.sh, Toolchain/UseIt.sh
have been left largely untouched due to use of Bash-exclusive
functions and variables such as $BASH_SOURCE, pushd and popd.