Commit Graph

31 Commits

Author SHA1 Message Date
MacDue
d2f971a919 LibXML: Don't emit a parser error for failing to resolve DTD URI
This prevented loading SVGs such as:
https://mdn.github.io/learning-area/html/multimedia-and-embedding/mdn-splash-page-finished/mdn.svg

Where the DTD is not in the hardcoded unified DTD in LibWeb. It also
prevented loading any SVG with a broken link for a DTD, which other
browsers just seem to ignore.
2024-03-30 07:36:50 +01:00
Ali Mohammad Pur
bc301b6f40 AK+LibXML+JSSpecCompiler: Move LineTrackingLexer to AK
This is a simple extension of GenericLexer, and is used in more than
just LibXML, so let's move it into AK.
The move also resolves a FIXME, which is removed in this commit.
2024-02-16 15:26:43 +01:00
Dan Klishch
dee4978d67 JSSpecCompiler+LibXML: Store location for tokens 2024-02-08 07:05:13 -07:00
kleines Filmröllchen
eada4f2ee8 AK: Remove ByteString from GenericLexer
A bunch of users used consume_specific with a constant ByteString
literal, which can be replaced by an allocation-free StringView literal.

The generic consume_while overload gains a requires clause so that
consume_specific("abc") causes a more understandable and actionable
error.
2024-01-12 17:03:53 -07:00
Shannon Booth
e2e7c4d574 Everywhere: Use to_number<T> instead of to_{int,uint,float,double}
In a bunch of cases, this actually ends up simplifying the code as
to_number will handle something such as:

```
Optional<I> opt;
if constexpr (IsSigned<I>)
    opt = view.to_int<I>();
else
    opt = view.to_uint<I>();
```

For us.

The main goal here however is to have a single generic number conversion
API between all of the String classes.
2023-12-23 20:41:07 +01:00
Ali Mohammad Pur
5e1499d104 Everywhere: Rename {Deprecated => Byte}String
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).

This commit is auto-generated:
  $ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
    Meta Ports Ladybird Tests Kernel)
  $ perl -pie 's/\bDeprecatedString\b/ByteString/g;
    s/deprecated_string/byte_string/g' $xs
  $ clang-format --style=file -i \
    $(git diff --name-only | grep \.cpp\|\.h)
  $ gn format $(git ls-files '*.gn' '*.gni')
2023-12-17 18:25:10 +03:30
Dan Klishch
1079f178cf LibXML: Set parents for text and comment nodes 2023-08-18 08:58:51 +03:30
Dan Klishch
6368e6ca12 LibXML: Record source location for nodes 2023-08-18 08:58:51 +03:30
Dan Klishch
b40ab55830 LibXML: Add helpers for extracting node contents if its type is known 2023-08-18 08:58:51 +03:30
Dan Klishch
2f7527b0a4 LibXML: Actually append resolved references when parsing content 2023-07-23 16:09:12 +02:00
zhiyuang
ce6138fccd LibXML: Parser handle standalone declaration
Previously here missed a equal sign parsing
2023-07-13 09:11:51 +03:30
Simon Wanner
7990f1b85a LibXML: Fix parser not leaving self-closing tags 2023-06-17 06:39:21 +02:00
Timothy Flynn
b6228507ac LibXML: Prevent entering the root node of an SVG document twice
Currently, if an SVG document is parsed, we enter the root <svg> element
twice - first when its node is appended, and then immediately after the
call to append its node. Prevent this by only ever entering nodes from
the appropriate location inside the call to append the node.
2023-06-09 01:12:48 +02:00
Ali Mohammad Pur
87e95ceb69 LibXML: Notify the listener about the root node as well
We previously did not notify the listener about entering the root node,
which caused the following snippet to produce the wrong output:
    a = new DOMParser
    a.parseFromString("<x/>", "text/xml").documentElement // != null
2023-05-05 01:33:49 +02:00
Andreas Kling
a504ac3e2a Everywhere: Rename equals_ignoring_case => equals_ignoring_ascii_case
Let's make it clear that these functions deal with ASCII case only.
2023-03-10 13:15:44 +01:00
Andreas Kling
21db2b7b90 Everywhere: Remove NonnullOwnPtr.h includes 2023-03-06 23:46:35 +01:00
Andreas Kling
359d6e7b0b Everywhere: Stop using NonnullOwnPtrVector
Same as NonnullRefPtrVector: weird semantics, questionable benefits.
2023-03-06 23:46:35 +01:00
Lucas CHOLLET
79006c03b4 AK: Check the return type in IsCallableWithArguments
Template argument are checked to ensure that the `Out` type is equal or
convertible to the type returned by the invokee.

Compilation now fails on:
`Function<void()> f = []() -> int { return 0; };`

But this is allowed:
`Function<ErrorOr<int>()> f = []() -> int { return 0; };`
2023-02-04 18:47:02 -07:00
Sam Atkins
8dff4c577f LibXML: Remove declarations for non-existent methods 2023-01-27 20:33:18 +00:00
Ali Mohammad Pur
803ff81d4a LibXML+LibWeb: Avoid implicit cast from StringView{}->DeprecatedString
This produces a (truly) null DeprecatedString, which is not expected to
occur by CharacterData (where this string ends up).
Simply pass an "empty" DeprecatedString manually instead.
2023-01-08 12:15:46 +01:00
Linus Groh
57dc179b1f Everywhere: Rename to_{string => deprecated_string}() where applicable
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.

One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
2022-12-06 08:54:33 +01:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
Timothy Flynn
b10bbac061 LibXML+LibWeb: Store the XML document's original source
Similar to how we store an HTML document's original source. This allows
the source to be inspected with "View Source" in the Browser.
2022-11-03 14:52:16 +00:00
Timothy Flynn
f8cdfb8907 LibXML: Convert some tab characters to spaces 2022-11-03 14:52:16 +00:00
Tim Schumacher
7834e26ddb Everywhere: Explicitly link all binaries against the LibC target
Even though the toolchain implicitly links against -lc, it does not know
where it should get LibC from except for the sysroot. In the case of
Clang this causes it to pick up the LibC stub instead, which might be
slightly outdated and feature missing symbols.

This is currently not an issue that manifests because we pass through
the dependency on LibC and other libraries by accident, which causes
CMake to link against the LibC target (instead of just the library),
and thus points the linker at the build output directory.

Since we are looking to fix that in the upcoming commits, let's make
sure that everything will still be able to find the proper LibC first.
2022-11-01 14:49:09 +00:00
Ben Wiederhake
a99cd09891 Libraries: Add missing includes, add namespace qualifiers
This remained undetected for a long time as HeaderCheck is disabled by
default. This commit makes the following file compile again:

    // file: compile_me.cpp
    #include <LibDNS/Question.h>
    // That's it, this was enough to cause a compilation error.

Likewise for most other files touched by this commit.
2022-09-18 13:27:24 -04:00
sin-ack
3f3f45580a Everywhere: Add sv suffix to strings relying on StringView(char const*)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).

No functional changes.
2022-07-12 23:11:35 +02:00
Idan Horowitz
18d25124bf LibXML: Fail gracefully on integer overflow in character references
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=47738
2022-07-10 22:29:11 +03:00
DexesTTP
7ceeb74535 AK: Use an enum instead of a bool for String::replace(all_occurences)
This commit has no behavior changes.

In particular, this does not fix any of the wrong uses of the previous
default parameter (which used to be 'false', meaning "only replace the
first occurence in the string"). It simply replaces the default uses by
String::replace(..., ReplaceMode::FirstOnly), leaving them incorrect.
2022-07-06 11:12:45 +02:00
Luke Wilde
adb5f7e485 LibXML+Tests: Consume > in the character data ending ]]> and test it
For example, with this input:
```xml
<C>]]>
```
After seeing `<C>`, the parser will start parsing the content of the
element. The content parser will then parse any character data it sees.

The character parser would see the first two `]]` and consume them.
Then, it would see the `>` and set the state machine to say we have
seen this, but it did _not_ consume it and would instead tell
GenericLexer that it should stop consuming characters. Therefore,
we only consumed 2 characters.

Then, it would see that we are in the state where we've seen the
full `]]>` and try to take off three characters from the end of the
consumed input when we only have 2 characters, causing an assertion
failure as we are asking to take off more characters than there really
is.
2022-05-30 00:16:17 +01:00
Ali Mohammad Pur
67357fe984 LibXML: Add a fairly basic XML parser
Currently this can parse XML and resolve external resources/references,
and read a DTD (but not apply or verify its rules).
That's good enough for _most_ XHTML documents as the HTML 5 spec
enforces its own rules about document well-formedness, and does not make
use of XML DTDs (aside from a list of predefined entities).

An accompanying `xml` utility is provided that can read and dump XML
documents, and can also run the XML conformance test suite.
2022-03-28 23:11:48 +02:00