Clients now receive HTTP status codes like 200, 404, etc.
Note that a 404 with content is still considered a "successful"
download from ProtocolServer's perspective. It's up to the client
to interpret the status code.
I'm not sure if this is the best API, but it'll work for now.
- Parsing invalid JSON no longer asserts
Instead of asserting when coming across malformed JSON,
JsonParser::parse now returns an Optional<JsonValue> (see the sketch
after this list).
- Disallow trailing commas in JSON objects and arrays
- No longer parse 'undefined', as that is a purely JS thing
- No longer allow non-whitespace after anything consumed by the initial
parse() call. Examples of things that were valid and no longer are:
- undefineddfz
- {"foo": 1}abcd
- [1,2,3]4
- JsonObject.for_each_member now iterates in original insertion order
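Call sites now check the Optional before using the value; roughly like
this (the input literal and surrounding code are just illustrative):
JsonParser parser("{ \"foo\": 1 }");
auto json = parser.parse();
if (!json.has_value()) {
    // Malformed JSON: bail out instead of asserting.
    return;
}
auto& value = json.value();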
Get rid of the weird old signature:
- int StringType::to_int(bool& ok) const
And replace it with sensible new signature:
- Optional<int> StringType::to_int() const
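Call sites change accordingly; a minimal sketch (variable names made up):
// Before:
bool ok;
int number = input.to_int(ok);
if (!ok)
    return;
// After:
auto number = input.to_int();
if (!number.has_value())
    return;
// ...and use number.value() from here on.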
The interpreter now has an "underscore is last value" flag, which makes
Interpreter::get_variable() return the last value if:
- The m_underscore_is_last_value flag is enabled
- The name of the variable lookup is "_"
- The result of that lookup is an empty value
That means "_" can still be used as a regular variable and will stop
doing its magic once anything is assigned to it.
Example REPL session:
> 1
1
> _ + _
2
> _ + _
4
> _ = "foo"
"foo"
> 1
1
> _
"foo"
> delete _
true
> 1
1
> _
1
>
This adds regex parsing/lexing, as well as a relatively empty
RegExpObject. The purpose of this patch is to allow the engine to not
get hung up on parsing regexes. This will aid in finding new syntax
errors (say, from google or twitter) without having to replace all of
their regexes first!
This patchset adds a simple SignedBigInteger that is entirely defined in
terms of UnsignedBigInteger.
It also adds a NumberTheory::Power function, which is terribly
inefficient, but since the use of exponentiation is very much
discouraged for large inputs, no particular attempts were made
to make it more performant.
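Usage is along these lines (namespace and constructor shape assumed,
not guaranteed):
Crypto::UnsignedBigInteger base(7);
Crypto::UnsignedBigInteger exponent(20);
// Fine for small inputs; avoid for large exponents.
auto result = Crypto::NumberTheory::Power(base, exponent);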
This adds a `global` property to the global object so that
`global == this` when typed into the REPL.
This matches what Node.js does and helps with testing on existing
JS code that uses `global` to detect its environment.
Since we're not keeping compatibility with OpenBSD about what promises are
required for which syscalls, tighten things up so that they make more sense.
Previously, all Markdown blocks had a virtual parse method which has
been swapped out for a static parse method returning an OwnPtr of
that block's type.
The Text class also now has a static parse method that will return an
Optional<Text>.
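Call sites end up looking roughly like this (Paragraph is just an
example block type, and the arguments here are placeholders rather
than the real signatures):
OwnPtr<Paragraph> paragraph = Paragraph::parse(lines);
if (!paragraph)
    return;
Optional<Text> text = Text::parse(line);
if (!text.has_value())
    return;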
Adds the ability for a scope (either a function or the entire program)
to be in strict mode. Scopes default to non-strict mode.
There are two ways to determine the strict-ness of the JS engine:
1. In the parser, via the parser state's m_is_strict_mode boolean.
If true, the Parser is currently parsing in
strict mode. This is done so that the Parser can generate syntax
errors at parse time, which is required in some cases.
2. With Interpreter.is_strict_mode(). This allows strict mode checking
at runtime as opposed to compile time (see the sketch below).
Additionally, in order to test this, a global isStrictMode() function
has been added to the JS ReplObject under the test-mode flag.
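The checks then look roughly like this (the exact member access in the
Parser is approximated here):
// 1. At parse time, inside the Parser:
if (m_parser_state.m_is_strict_mode) {
    // generate a syntax error for constructs that are illegal in strict mode
}
// 2. At runtime:
if (interpreter.is_strict_mode()) {
    // take the strict-mode code path
}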
This patch adds an IndexedProperties object for storing indexed
properties within an Object. This accomplishes two goals: indexed
properties now have an associated descriptor, and objects now gracefully
handle sparse properties.
The IndexedProperties class is a wrapper around two other classes, one
for simple indexed properties storage, and one for general indexed
property storage. Simple indexed property storage is the common-case,
and is simply a vector of properties which all have attributes of
default_attributes (writable, enumerable, and configurable).
General indexed property storage is for a collection of indexed
properties where EITHER one or more properties have attributes other
than default_attributes OR there is a property with a large index (in
particular, large is '200' or higher).
Indexed properties are now handled in much the same way as regular
named-property storage within the various Object methods. Additionally,
there is a custom iterator
class for IndexedProperties which makes iteration easy. The iterator
skips empty values by default, but can be configured otherwise.
Likewise, it evaluates getters by default, but can be set not to.
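Iteration ends up looking roughly like this (the indexed_properties()
accessor name is an assumption):
for (auto& entry : object.indexed_properties()) {
    // By default, empty values are skipped and getters are evaluated;
    // the iterator can be configured to do neither.
}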
We still don't handle non-ASCII input correctly, but at least now we'll
convert e.g. ISO-8859-1 to UTF-8 before starting to tokenize.
This patch also makes "view source" work with the new parser. :^)
This adds a basic implementation of xargs.
The implementation is missing quite a few options, and is not entirely
POSIX-compliant, but it gets the job done :^)
Previously, the Object class had many different types of functions for
each action. For example: get_by_index, get(PropertyName),
get(FlyString). This is a bit verbose, so these methods have been
shortened to simply use the PropertyName structure. The methods then
internally call the _by_index variant if necessary. Note that the
_by_index variants have been made private to enforce this change.
Secondly, a clear distinction has been made between "putting" and
"defining" an object property. "Putting" should mean modifying a
(potentially) already existing property. This is akin to doing "a.b =
'foo'".
This implies two things about put operations:
- They will search the prototype chain for setters and call them, if
necessary.
- If no property exists with a particular key, the put operation
should create a new property with the default attributes
(configurable, writable, and enumerable).
In contrast, "defining" a property should completely overwrite any
existing value without calling setters (if that property is
configurable, of course).
Thus, all of the many JS objects have had any "put" calls changed to
"define_property" calls. Additionally, "put_native_function" and
"put_native_property" have had their "put" replaced with "define".
Finally, "put_own_property" has been made private, as all necessary
functionality should be exposed with the put and define_property
methods.
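In other words, call sites now make the intent explicit; approximately
(argument lists abridged):
// "Put": like `a.b = 'foo'`; may call a setter found on the prototype
// chain, or create a new property with default attributes.
object->put("foo", value);
// "Define": overwrite the property outright without calling setters
// (provided the property is configurable).
object->define_property("foo", value, default_attributes);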
- initializing m_line_column to 1 in the lexer results in incorrect
column values in tokens on the first line of input.
- not incrementing m_line_column when EOF is reached results in
an incorrect column value on the last token.
And move canonicalized_path() to a static method on LexicalPath.
This is to make it clear that FileSystemPath/canonicalized_path() only
perform *lexical* canonicalization.
* In some cases, we can first call sigaction()/signal(), then *not* pledge
sigaction.
* In other cases, we pledge sigaction at first, call sigaction()/signal()
second, then pledge again, this time without sigaction.
* In yet other cases, we keep the sigaction pledge. I suppose these could all be
migrated to drop it or not pledge it at all, if somebody is interested in
doing that.
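For reference, the second pattern looks roughly like this (the promise
strings are just an example):
if (pledge("stdio rpath sigaction", nullptr) < 0) {
    perror("pledge");
    return 1;
}
signal(SIGINT, SIG_IGN);
if (pledge("stdio rpath", nullptr) < 0) {
    perror("pledge");
    return 1;
}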
This fixes a bunch of FIXME's in LibLine.
Also handles the case where read() would read zero bytes in vt_dsr() and
effectively block forever by erroring out.
Fixes #2370
This patch adds a new HTMLDocumentParser class. It keeps a tokenizer
object internally and feeds itself with one token at a time from it.
The names and idioms in this class are expressed as closely to the
actual HTML parsing spec as possible, to make development as easy
and bug free as possible. :^)
This is going to become pretty large, but it's pretty cool!
In order to actually view the web as it is, we're gonna need a proper
HTML parser. So let's build one!
This patch introduces the Web::HTMLTokenizer class, which currently
operates on a StringView input stream where it fetches (ASCII only atm)
codepoints and tokenizes according to the HTML spec tokenization algorithm.
The tokenizer state machine looks a bit weird but is written in a way
that tries to mimic the spec as closely as possible, in order to make
development easier and bugs less likely.
This initial version is far from finished, but it can parse a trivial
document with a DOCTYPE and open/close tags. :^)
LibLine should ultimately not care about what a "token" means in the
context of its user, so force the user to split the buffer itself.
This also allows the users to pick up contextual clues, since they
have to lex the line themselves.
This commit patches Shell and the JS repl to better handle completions,
so certain incorrect behaviours are now fixed as well:
- JS repl can now complete "Object . getOw<tab>"
- Shell can now complete "echo | ca<tab>" and paths inside strings
Make sure that userspace is always referencing "system" headers in a way
that would build on target :). This means removing the explicit
include_directories of Libraries/LibC in favor of having it export its
headers as SYSTEM. Also remove a redundant include_directories of
Libraries in the 'serenity build' part of the build script. It's already
set at the top.
This causes issues for the Kernel, and for crt0.o. These special cases
are handled individually.
This patch is unfortunately rather large and might make some things feel
bloated, but it is necessary to fix a few flaws in LibJS, primarily
blindly coercing values to numbers without exception checks - i.e.
interpreter.argument(0).to_i32(); // can fail!!!
Some examples where the interpreter would actually crash:
var o = { toString: () => { throw Error() } };
+o;
o - 1;
"foo".charAt(o);
"bar".repeat(o);
To fix this, we now have the following...
- to_double(Interpreter&)
- to_i32()
- to_i32(Interpreter&)
- to_size_t()
- to_size_t(Interpreter&)
...and a whole lot of exception checking.
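A typical call site now looks something like this (the argument index
and variable name are made up):
auto count = interpreter.argument(0).to_size_t(interpreter);
if (interpreter.exception())
    return {};
// `count` is safe to use from here on.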
There's intentionally no to_double(), use as_double() directly instead.
This way we still can use these convenient utility functions but don't
need to check for exceptions if we are sure the value already is a
number.
Fixes #2267.
Passing a Heap& to it only to then call interpreter() on that is weird.
Let's just give it the Interpreter& directly, like some of the other
to_something() functions.
This was supposed to be the foundation for some kind of pre-kernel
environment, but nobody is working on it right now, so let's move
everything back into the kernel and remove all the confusion.
We will now actually use MIME types for the clipboard. The default type is now
"text/plain" (instead of just "text").
This also fixes some issues in copy(1) and paste(1).
There are now two API's on Value:
- Value::to_string(Interpreter&) -- may throw.
- Value::to_string_without_side_effects() -- will never throw.
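Callers pick whichever fits; roughly (error handling abridged, and
dbg() is just an example sink):
auto string = value.to_string(interpreter);
if (interpreter.exception())
    return {};
// For diagnostics where throwing is never acceptable:
dbg() << value.to_string_without_side_effects();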
These are some pretty big sweeping changes, so it's possible that I did
some part the wrong way. We'll work it out as we go. :^)
Fixes #2123.
Giving the lexer the ability to generate errors adds unnecessary
complexity - also it only calls its syntax_error() function in one place
anyway ("unterminated string literal"). But since the lexer *also* emits
tokens like Eof or UnterminatedStringLiteral, it should be up to the
consumer of these tokens to decide what to do.
Also remove the option to not print errors to stderr as that's not
relevant anymore.
The unzip command will unzip a zip file passed as an argument into the
current pwd, with the syntax:
unzip file.zip
This implementation is pretty barebones as it does not support things
like file access times, compression or even compression detection, so
if the user tries to unzip a compressed zip, they will most likely
find wrong data inside the files.
However, it's a starting point :^)
Moves DirectoryServices out of LibCore (because we need to link with
LibIPC), renames it Desktop::Launcher (because Desktop::DesktopServices
doesn't scan right) and ports it to use the LaunchServer which is now
responsible for starting programs for a file.
Adds fully functioning template literals. Because template literals
contain expressions, most of the work has to be done in the Lexer rather
than the Parser. And because of the complexity of template literals
(expressions, nesting, escapes, etc), the Lexer needs to have some
template-related state.
When entering a new template literal, a TemplateLiteralStart token is
emitted. When inside a literal, all text will be parsed up until a '${'
or '`' (or EOF, but that's a syntax error) is seen, and then a
TemplateLiteralExprStart token is emitted. At this point, the Lexer
proceeds as normal, however it keeps track of the number of opening
and closing curly braces it has seen in order to determine the close
of the expression. Once it finds a matching curly brace for the '${',
a TemplateLiteralExprEnd token is emitted and the state is updated
accordingly.
When the Lexer is inside of a template literal, but not an expression,
and sees a '`', this must be the closing grave: a TemplateLiteralEnd
token is emitted.
The state required to correctly parse template strings consists of a
vector (for nesting) of two pieces of information: whether or not we
are in a template expression (as opposed to a template string); and
the count of the number of unmatched open curly braces we have seen
(only applicable if the Lexer is currently in a template expression).
TODO: Add support for template literal newlines in the JS REPL (this will
cause a syntax error currently):
> `foo
> bar`
'foo
bar'
We're starting with a very basic decoding API and only ISO-8859-1 and
UTF-8 decoding (and UTF-8 decoding is really a no-op since String is
expected to be UTF-8.)
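The intended call shape is roughly this (the exact names are an
approximation, not a guarantee):
auto* decoder = TextCodec::decoder_for("ISO-8859-1");
if (!decoder)
    return;
auto utf8 = decoder->to_utf8(input);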
We now store the response headers in a download object on the protocol
server side and pass it to the client when finishing up a download.
Response headers are passed as an IPC::Dictionary. :^)
Only being able to complete enumerable properties is annoying,
especially since we updated everything to use the correct attributes.
Most properties of the standard built-in objects are *not* enumerable.
This commit fixes up the following:
- HMAC should not reuse a single hasher when successively updating
- AES Key should not assume its user key is valid signed char*
- Mode should have a virtual destructor
And adds an RFC5246 padding mode, which is required for TLS
There was a bug when dealing with a carry when the addition
result for the current word was UINT32_MAX.
This commit also adds a regression test for the bug.
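Conceptually, the per-word addition now handles that case explicitly
(this is a sketch of the idea, not the literal code):
u32 sum = word_a + word_b;
bool carry_out = sum < word_a;  // overflow in the first addition
u32 result = sum + carry_in;
if (result < sum)               // only possible when sum == UINT32_MAX
    carry_out = true;           // and carry_in == 1: the mishandled case
// carry_out feeds into the next word.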
You can now do:
touch a.txt b.txt c.txt d.txt
Also now you can't do:
touch --test # This created a "./--test" file
Also adds ArgsParser to the mix to better handle arguments :)
This commit introduces a way to get an object's own properties in the
correct order. The "correct order" for JS object properties is first all
array-like index properties (numeric keys) sorted by insertion order,
followed by all string properties sorted by insertion order.
Objects also now print correctly in the repl! Before this commit:
courage ~/js-tests $ js
> ({ foo: 1, bar: 2, baz: 3 })
{ bar: 2, foo: 1, baz: 3 }
After:
courage ~/js-tests $ js
> ({ foo: 1, bar: 2, baz: 3 })
{ foo: 1, bar: 2, baz: 3 }
It does not make much sense to receive an interrupt and process it
*much later*.
Also patches Userland/js to only create exceptions while some code is
actually running.
The kill system call accepts negative pids, as they
have special meaning:
pid == -1 means all processes the calling process has access to.
pid < -1 means every process whose process group ID is -pid.
I don't see any reason why the user space program should mask this.
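So the usual POSIX forms keep working from userspace:
kill(1234, SIGTERM);   // signal just process 1234
kill(-1234, SIGTERM);  // signal every process in process group 1234
kill(-1, SIGHUP);      // signal every process we have access to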
JS::Value already has the empty state ({} or Value() gives you one.)
Use this instead of wrapping Value in Optional in some places.
I've also added Value::value_or(Value) so you can easily provide a
fallback value when one is not present.
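For example (js_undefined() here is just one possible fallback):
Value maybe_result;  // empty
auto result = maybe_result.value_or(js_undefined());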
This is required for template literals - we're not quite there yet, but at
least the parser can now tell us when this token is encountered -
currently this yields "Unexpected token Invalid". Not really helpful.
The character is a "backtick", but as we already have
TokenType::{StringLiteral,RegexLiteral} this seemed like a fitting name.
This also enables syntax highlighting for template literals in the js
REPL and LibGUI's JSSyntaxHighlighter.
These strings would be applied when inserted into the buffer, but are
not shown as part of the suggestion.
This commit also patches up Userland/js and Shell to use this
functionality.
functrace traces the function calls a program makes.
It's like strace, but for userspace.
It works by using debugging functionality to insert breakpoints
at call & ret instructions.
POSIX says, "Conforming applications should not assume that the returned
contents of the symbolic link are null-terminated."
If we do include the null terminator in the returned string, Python
believes it to actually be part of the returned name, and gets unhappy
about that later. This suggests other systems Python runs on don't
include it, so let's not include it either.
Also, make our userspace support non-null-terminated realpath().
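Callers should rely on the returned length rather than on a
terminator, e.g.:
// needs <unistd.h>, <limits.h>, <stdio.h>
char buffer[PATH_MAX];
ssize_t length = readlink("/tmp/some-link", buffer, sizeof(buffer));
if (length < 0) {
    perror("readlink");
    return 1;
}
printf("-> %.*s\n", (int)length, buffer);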
The addition of assert functions to Userland/js
was done before we had load(..) implemented. Now
that it exists, it seems like the right move to shift
the test helper functions to pure JavaScript instead
of polluting js with random global functions.
The work I did to add assert as a native function in js
was a step in the wrong direction. Now that js supports
load() it makes sense to just move assert and anything
we want to add to the test harness into pure javascript.
We already have "global" as a way to access the global object in js(1)
(both REPL and script mode). This replaces it with "globalThis", which
is available in all environments, not just js.
This patch adds a new kind of JS::Value, the empty value.
It's what you get when you do JS::Value() (or most commonly, {} in C++.)
An empty Value signifies the absence of a value, and should never be
visible to JavaScript itself. As of right now, it's used for array
holes and as a return value when an exception has been thrown and we
just want to unwind.
This patch is a bit of a mess as I had to fix a whole bunch of code
that was relying on JS::Value() being undefined, etc.
Objects can have both named and indexed properties. Previously we kept
all property names as strings. This patch separates named and indexed
properties and splits them between Object::m_storage and m_elements.
This allows us to do much faster array-style access using numeric
indices. It also makes the Array class much less special, since all
Objects now have number-indexed storage. :^)
Introduce a central assert implementation so that the LibJS test suite
can take advantage of one implementation instead of copy/pasting the
same function into every test case file.
This patch adds a way for a socket to ask to be routed through a
specific interface.
Currently, this option only applies to sending; however, it should
also apply to receiving... somehow :^)
Only return whatever a "return" statement told us to return.
The last computed value is now available in Interpreter::last_value()
instead, where the REPL can pick it up.
StartDownload requests for unhandled protocols (or invalid URLs) will
now refuse to load instead of asserting. A failure code is sent back
to LibProtocol and Protocol::Client::start_download() returns nullptr.
Fixes #1604.
This allows us to construct menus in a more natural way:
auto& file_menu = menubar->add_menu("File");
file_menu.add_action(...);
Instead of the old way:
auto file_menu = GUI::Menu::construct();
file_menu->add_action(...);
menubar->add_menu(file_menu);
Calling save("file") in a repl saves all the typed lines so far
into the specified file. It currently does not handle multi-line
functions well, since those get turned into one line.
This patch adds JS::Shape, which implements a transition tree for our
Object class. Object property keys, prototypes and attributes are now
stored in a Shape, and each Object has a Shape.
When adding a property to an Object, we make a transition from the old
Shape to a new Shape. If we've made the same exact transition in the
past (with another Object), we reuse the same transition and both
objects may now share a Shape.
This will become the foundation of inline caching and other engine
optimizations in the future. :^)
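Conceptually (property names and call shapes here are illustrative
only):
first->put("x", Value(1));   // Shape {} -> Shape {x}
first->put("y", Value(2));   // Shape {x} -> Shape {x, y}
second->put("x", Value(3));  // reuses the {} -> {x} transition
second->put("y", Value(4));  // reuses {x} -> {x, y}; Shape now shared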
We now print thrown exceptions and clear the interpreter state so it
can continue running instead of refusing to do anything after an
exception has been thrown.
Fixes #1572.
Let's move towards using references over pointers in LibJS as well.
I had originally steered away from it because that's how I've seen
things done in other engines. But this is not the other engines. :^)
When we enter a repl we call initialize_global_object on our custom
ReplObject in order to expose REPL specific features. Specifically:
help(), exit(code), and load("file1.js", "file2.js", "fileEtc.js").
LibWeb now creates a WindowObject which inherits from GlobalObject.
Allocation of the global object is moved out of the Interpreter ctor
to allow for specialized construction.
The existing Window interfaces are moved to WindowObject with their
implementation code in the new Window class.
This adds:
- A global Date object (with `length` property and `now` function)
- The Date constructor (no arguments yet)
- The Date prototype (with `get*` functions)
With this patch, it's now possible to pass a Gfx::ShareableBitmap in an
IPC message. As long as the message itself is synchronous, the bitmap
will be adopted by the receiving end, and disowned by the sender nicely
without any accounting effort like we've had to do in the past.
Use this in NotificationServer to allow sending arbitrary bitmaps as
icons instead of paths-to-icons.
We currently use icon paths for this because I didn't want to deal with
implementing icon bitmap sharing right now. In the future it would be
better to post a bitmap somehow instead of a path.
If you invoke `js` without a script path, it will now enter REPL mode,
where you input commands one by one and immediately get each one
interpreted in one shared interpreter. We support multi-line commands
in case we detect you have unclosed braces or parens or brackets.
When the Heap is going down, it's our last chance to run destructors,
so add a separate collector mode where we simply skip over the marking
phase and go directly to sweeping. This causes everything to get swept
and all live cells get destroyed.
This way, valgrind reports 0 leaks on exit. :^)
Now that we have the beginnings of a parser, let's take the script to
run as a command-line argument and move all the test scripts into
/home/anon/js :^)
To run a script, simply use "js":
$ js my-script.js
To get an AST dump before execution, you can use "js -A"
This adds a basic Javascript lexer and parser. It can parse the
currently existing demo programs. More work needs to be done to
turn it into a complete parser that can parse arbitrary JS code.
The lexer outputs tokens with preceding whitespace and comments
in the trivia member. This should allow us to generate the exact
source code by concatenating the generated tokens.
The parser is written in a way that it always returns a complete
syntax tree. Error conditions are represented as nodes in the
tree. This simplifies the code and allows it to be used as an
early-stage parser, e.g. for parsing JS documents in an IDE while
editing the source code.
A new IP address or a new network mask can be specified in the command
line arguments of ifconfig to replace the old values of a given network
adapter. Additionally, more information is being printed for each adapter.
Previously, we assumed all declared variables were bound to a block
scope. Now, with the addition of declaration types, we can bind a
variable to a block scope using `let`, or to a function scope (the
scope of the innermost enclosing function of the `var` declaration)
using `var`.
The above snippet is a MemberExpression that necessitates the implicit
construction of a StringObject wrapper around a PrimitiveString.
We then do a property lookup (a "get") on the StringObject, where we
find the "length" property. This is pretty neat! :^)
It's now possible to assign expressions to variables. The variables are
put into the current scope of the interpreter.
Variable lookup follows the scope chain, ending in the global object.
I always tell people to start building things by working on the thing
that seems the most interesting right now. The most interesting thing
here was an AST + simple interpreter, so that's where we start!
There is no lexer or parser yet, we build an AST directly and then
execute it in the interpreter, producing a return value.
This seems like the start of something interesting. :^)