Commit Graph

81 Commits

Author SHA1 Message Date
Timothy Flynn
5f51a11618 LibWeb: Remove OOM propagation from Fetch::Infrastructure::Responses 2024-04-27 07:08:14 +02:00
Timothy Flynn
5a4f13dcd4 LibWeb: Remove OOM propagation from Fetch::Infrastructure::Requests 2024-04-27 07:08:14 +02:00
Timothy Flynn
c79f46fe6f LibWeb: Remove OOM propagation from Fetch::Infrastructure::Headers 2024-04-27 07:08:14 +02:00
Andreas Kling
184368285c LibWeb: Fix GC leaks in Fetch::Infrastructure::Body::fully_read()
By making this function accept the success and error steps as
HeapFunction rather than SafeFunction, we break a bunch of strong
GC cycles.
2024-04-23 12:50:40 +02:00
Kenneth Myhra
291d0e5de8 LibWeb: Let queue_fetch_task() take a JS::HeapFunction
Changes the signature of queue_fetch_task() from AK::Function to
JS::HeapFunction to be more clear to the user of the function that this
is what it uses internally.
2024-04-20 18:11:01 +02:00
Shannon Booth
70e2f51674 LibWeb: Prefer GCPtr<T> over Optional<NonnullGCPtr<T>> 2024-04-07 18:01:05 +02:00
Timothy Flynn
683c08744a Userland: Avoid some conversions from rvalue strings to StringView
These are all actually fine; there is no UAF here. But once e.g.
`ByteString::view() &&` is deleted, these instances won't compile.
2024-04-04 11:23:21 +02:00
Andreas Kling
a9842ebe48 LibWeb: Use JS::HeapFunction in Fetch::Fetching::PendingResponse
This fixes a long-standing realm leak.
2024-04-03 18:14:33 +02:00
Timothy Flynn
24ecf31ff5 LibURL+LibWeb: Move data URL processing to LibWeb's fetch infrastructure
This is a fetching AO and is only used by LibWeb in the context of fetch
tasks. Move it to LibWeb with other fetch methods.

The main reason for this is that it requires the use of other LibWeb AOs
such as the forgiving Base64 decoder and MIME sniffing. These AOs aren't
available within LibURL.
2024-03-25 08:13:27 +01:00
Timothy Flynn
7b3ddd5e15 LibWeb: Track fetching-related tasks in FetchController for cancellation
The HTMLMediaElement, for example, contains spec text which states any
ongoing fetch process must be "stopped". The spec does not indicate how
to do this, so our implementation is rather ad-hoc.

Our current implementation may cause a crash in places that assume one
of the fetch algorithms that we set to null is *not* null. For example:

    if (fetch_params.process_response) {
        queue_fetch_task([]() {
            fetch_params.process_response();
        });
    }

If the fetch process is stopped after the fetch task is queued, but
before that task runs, we will crash when running this fetch
algorithm.

We now track queued fetch tasks on the fetch controller. When the fetch
process is stopped, we cancel any such pending task.

It is a little bit awkward maintaining a fetch task ID. Ideally, we
could use the underlying task ID throughout. But we do not have access
to the underlying task nor its ID when the task is running, at which
point we need some ID to remove from the pending task list.
2024-03-23 13:45:35 +01:00
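
The task tracking described in 7b3ddd5e15 can be sketched roughly as follows. This is a minimal standalone illustration, not LibWeb's code: `PendingTaskTracker`, `queue()`, `run()`, and `stop()` are hypothetical names, and `std::function` stands in for whatever callable type the real implementation uses. It only shows the idea of keeping an ID per queued task so that stopping the fetch drops tasks that have not run yet.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the bookkeeping on a fetch controller.
class PendingTaskTracker {
public:
    using TaskId = std::uint64_t;

    // Remember the task under a fresh ID until it runs or is cancelled.
    TaskId queue(std::function<void()> task)
    {
        TaskId id = m_next_id++;
        m_pending.emplace(id, std::move(task));
        return id;
    }

    // Run the task if it is still pending. Returns false if it was
    // cancelled (or already ran), in which case nothing is invoked.
    bool run(TaskId id)
    {
        auto it = m_pending.find(id);
        if (it == m_pending.end())
            return false;
        auto task = std::move(it->second);
        m_pending.erase(it); // Remove before running so the ID is no longer pending.
        task();
        return true;
    }

    // Stopping the fetch process drops every task that has not run yet.
    void stop() { m_pending.clear(); }

    std::size_t pending_count() const { return m_pending.size(); }

private:
    TaskId m_next_id { 0 };
    std::unordered_map<TaskId, std::function<void()>> m_pending;
};
```

With this shape, a task queued before `stop()` is never invoked afterwards, so it cannot call an algorithm that has since been set to null.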
Shannon Booth
e800605ad3 AK+LibURL: Move AK::URL into a new URL library
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.

This change has two main benefits:
 * Moving AK back more towards being an agnostic library that can
   be used between the kernel and userspace. URL has never really fit
   that description - and is not used in the kernel.
 * URL _should_ depend on LibUnicode, as it needs punycode support.
   However, it's not really possible to do this inside of AK as it can't
   depend on any external library. This change brings us a little closer
   to being able to do that, but unfortunately we aren't there quite
   yet, as the code generators depend on LibCore.
2024-03-18 14:06:28 -04:00
Timothy Flynn
7681772b9f LibWeb: Log failed Fetch responses when WEB_FETCH_DEBUG is enabled
We do the same for successful responses. Very useful for debugging
issues on live websites.
2024-03-14 10:10:33 +01:00
Andrew Kaster
c79bac70f4 LibWeb: Consistently use the EmptyString state of ReferrerPolicy
We previously used an empty optional to denote that a ReferrerPolicy is
in the default empty string state. However, later additions added an
explicit EmptyString state. This patch moves all users to the explicit
state, and stops using `Optional<ReferrerPolicy>` everywhere except for
when an option not being passed from JavaScript has meaning.
2024-03-06 07:19:10 +01:00
Shannon Booth
9ce8189f21 Everywhere: Use unqualified AK::URL
This is possible in LibWeb now that there is no longer a Web::URL.
2024-02-25 08:54:31 +01:00
Shannon Booth
f9e5b43b7a LibWeb: Rename URL platform object to DOMURL
Along with putting functions in the URL namespace into a DOMURL
namespace.

This is done as LibWeb is in an awkward situation where it needs
two URL classes. AK::URL is the general purpose URL class which
is all that is needed in 95% of cases. URL in the Web namespace
is needed predominantly for interfacing with the javascript
interfaces.

Because of two URLs in the same namespace, AK::URL has had to be
used throughout LibWeb. If we move AK::URL into a URL namespace,
this becomes more painful - where ::URL::URL is required to
specify the constructor (and something like
::URL::create_with_url_or_path in other places).

To fix this problem - rename the class in LibWeb implementing the
URL IDL interface to DOMURL, along with moving the other Web URL
related classes into this DOMURL folder.

One could argue that this name also makes the situation a little
more clear in LibWeb for why these two URL classes need be used
in the first place.
2024-02-25 08:54:31 +01:00
Timothy Flynn
85b8971a80 Ladybird+LibWeb+WebContent: Port the did_request_cookie IPC to String 2024-01-26 20:22:39 +01:00
Bastiaan van der Plaat
05c0640474 Ladybird+LibWeb: Add about scheme support for internal pages 2024-01-13 13:41:09 -05:00
Andreas Kling
89da988da1 LibWeb: Honor User-Agent spoofing in Fetch headers
This makes spoofing consistent between legacy ResourceLoader loads,
Fetch loads, and the JavaScript `navigator` APIs.
2023-12-27 11:43:14 +01:00
Timothy Flynn
e511a264fe LibWeb: Implement ad-hoc steps to allow LibWeb to load resource:// URLs
The resource:// scheme is used for Core::Resource files. Currently, any
users of resource:// URLs in Ladybird must manually create the Resource
and extract its data. This will allow for passing the resource:// URL
along for LibWeb to handle.
2023-12-24 14:09:23 +01:00
Ali Mohammad Pur
5e1499d104 Everywhere: Rename {Deprecated => Byte}String
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).

This commit is auto-generated:
  $ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
    Meta Ports Ladybird Tests Kernel)
  $ perl -pie 's/\bDeprecatedString\b/ByteString/g;
    s/deprecated_string/byte_string/g' $xs
  $ clang-format --style=file -i \
    $(git diff --name-only | grep \.cpp\|\.h)
  $ gn format $(git ls-files '*.gn' '*.gni')
2023-12-17 18:25:10 +03:30
Andreas Kling
9793d69d4f LibWeb: Make HTML::Window::page() return a Page& 2023-12-15 22:04:46 +01:00
Andreas Kling
7c95ebc302 LibWeb: Make Document::page() return a Page&
Now that Document always has a Page, and always keeps it alive, we can
make this return a Page&, exposing various unnecessary null checks.
2023-12-15 22:04:46 +01:00
Andreas Kling
bfd354492e LibWeb: Put most LibWeb GC objects in type-specific heap blocks
With this change, we now have ~1200 CellAllocators across both LibJS and
LibWeb in a normal WebContent instance.

This gives us a minimum heap size of 4.7 MiB in the scenario where we
only have one cell allocated per type. Of course, in practice there will
be many more of each type, so the effective overhead is quite a bit
smaller than that in practice.

I left a few types unconverted to this mechanism because I got tired of
doing this. :^)
2023-11-19 22:00:48 +01:00
Andrew Kaster
d7d84ee931 LibWeb: Ensure a Web::Page is associated with local Worker LoadRequests
This is a hack on top of a hack because Workers don't *really* need to
have a Web::Page at all, but the ResourceLoader infra that should be
going away soon ™️ is not quite ready to axe that requirement for
cookies.
2023-11-15 12:56:33 +01:00
Aliaksandr Kalenik
084cb4350e LibWeb/Fetch: Include body and headers in Response for failed requests
Fixes https://github.com/SerenityOS/serenity/issues/21290
2023-10-03 09:41:56 +02:00
Aliaksandr Kalenik
b9e0ad4358 LibWeb: Make ResourceLoader pass body and headers in error callback
Pass body and headers of a failed request to callback so caller can
process them.
2023-10-03 09:41:56 +02:00
Karol Kosek
fdb27c5851 LibWeb: Use StringView in calls to Vector<String>::contains_slow() 2023-08-23 20:21:09 +02:00
Aliaksandr Kalenik
bdd3a16b16 LibWeb: Make Fetch::Infrastructure::Body be GC allocated
Making the body GC-allocated allows us to avoid using `JS::Handle`
for `m_stream` in its members.
2023-08-19 15:12:00 +02:00
Aliaksandr Kalenik
9a07ac0b6a LibWeb/Fetch: Use JS::HeapFunction for callback in FetchAlgorithms
In FetchAlgorithms, it is common for callbacks to capture realms. This
can indirectly keep objects alive that hold FetchController with these
callbacks. This creates a cyclic dependency. However, when
JS::HeapFunction is used, this is not a problem, as values captured by
callbacks do not create new roots.
2023-08-19 05:03:17 +02:00
Shannon Booth
9d60f23abc AK: Port URL::m_fragment from DeprecatedString to String 2023-08-13 15:03:53 -06:00
Shannon Booth
c25485700a AK: Port URL scheme from DeprecatedString to String 2023-08-13 15:03:53 -06:00
Shannon Booth
55a01e72ca AK: Port URL username/password from DeprecatedString to String
And for cases that just need to check whether the password/username is
empty, add a raw_{password,username} helper to avoid any allocation.
2023-08-13 15:03:53 -06:00
Andreas Kling
72c9f56c66 LibJS: Make Heap::allocate<T>() infallible
Stop worrying about tiny OOMs. Work towards #20449.

While going through these, I also changed the function signature in many
places where returning ThrowCompletionOr<T> is no longer necessary.
2023-08-13 15:38:42 +02:00
Lucas CHOLLET
3f35ffb648 Userland: Prefer _string over _short_string
As `_string` can't fail anymore (since 3434412), there are no real
benefits to use the short variant in most cases.
2023-08-08 07:37:21 +02:00
Andreas Kling
34344120f2 AK: Make "foo"_string infallible
Stop worrying about tiny OOMs.

Work towards #20405.
2023-08-07 16:03:27 +02:00
Timothy Flynn
a14f6e42a8 LibWeb: Begin support for requesting blob URLs with Fetch infrastructure
This does not yet implement requests with a Range header.
2023-08-02 00:52:33 +01:00
Simon McMahon
2ce416a676 LibWeb: Parse header value lists for some CORS headers
This adds a simple and incomplete implementation for extracting some
specific CORS headers that are used by fetch. This unifies the existing
ad-hoc parsing that already existed for Access-Control-Allow-Headers
and Access-Control-Allow-Methods, as well as adding
Access-Control-Expose-Headers.
2023-08-02 00:52:23 +01:00
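
As a rough illustration of the kind of header value list parsing 2ce416a676 describes, the sketch below splits a header value on commas and trims optional whitespace. It is deliberately simplified and is not the LibWeb implementation: `split_header_list` is a hypothetical name, and quoted strings and the other rules of the Fetch specification's list extraction are ignored.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: split "a, b ,c" into {"a", "b", "c"}.
// Empty items (e.g. from ",,") are skipped.
std::vector<std::string> split_header_list(std::string_view value)
{
    constexpr std::string_view whitespace = " \t";
    std::vector<std::string> items;

    std::size_t start = 0;
    while (start <= value.size()) {
        // Find the end of the current item: the next comma or end of input.
        std::size_t comma = value.find(',', start);
        if (comma == std::string_view::npos)
            comma = value.size();

        std::string_view item = value.substr(start, comma - start);

        // Trim leading and trailing spaces and tabs.
        std::size_t first = item.find_first_not_of(whitespace);
        if (first != std::string_view::npos) {
            std::size_t last = item.find_last_not_of(whitespace);
            items.emplace_back(item.substr(first, last - first + 1));
        }

        start = comma + 1;
    }
    return items;
}
```

A value such as `Content-Length,  X-Custom ,,ETag` yields three names under these assumptions.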
Simon McMahon
3a1f510af0 LibWeb: Support for Access-Control-Expose-Headers in Fetch
This adds the headers named in Access-Control-Expose-Headers to the
response's CORS-exposed header-name list which allows those headers to
be accessed from JS.
2023-08-02 00:52:23 +01:00
Karol Kosek
eb41f0144b AK: Decode data URLs to separate class (and parse like every other URL)
Parsing 'data:' URLs took its own route. It never set standard URL
fields like path, query or fragment (except for scheme) and instead
gave us separate methods called `data_payload()`, `data_mime_type()`,
and `data_payload_is_base64()`.

Because parsing 'data:' didn't use standard fields, running the
following JS code:

    new URL('#a', 'data:text/plain,hello').toString()

not only cleared the path, as URLParser doesn't check for data from the
data_payload() function (making the result 'data:#a'), but also
crashed the program, because we forbid having an empty MIME type when
we serialize to string.

With this change, 'data:' URLs will be parsed like every other URL.
To decode the 'data:' URL contents, one needs to call process_data_url()
on a URL, which will return a struct containing MIME type with already
decoded data! :^)
2023-08-01 14:19:05 +02:00
Shannon Booth
3cb65645cf LibWeb: Make Web::URL::host_is_domain accept an AK::URL::Host
Which allows us to avoid serializing the host only to try and reparse it
again as either an IPv4 or IPv6 address.
2023-07-31 05:18:51 +02:00
Shannon Booth
8751be09f9 AK: Serialize URL hosts with 'concept-host-serializer'
In order to follow spec text to achieve this, we need to change the
underlying representation of a host in AK::URL to deserialized format.
Before this, we were parsing the host and then immediately serializing
it again.

Making that change resulted in a whole bunch of fallout.

After this change, callers can access the serialized data through
this concept-host-serializer. The functional end result of this
change is that IPv6 hosts are now correctly serialized to be
surrounded with '[' and ']'.
2023-07-31 05:18:51 +02:00
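
The idea in 8751be09f9 of storing the host in deserialized form and serializing it on demand might look like the sketch below. The `Host` variant, the `IPv4`/`IPv6` aliases, and `serialize_host` are assumptions made for illustration and do not mirror AK::URL's actual types. The IPv6 branch also omits the "::" compression that the URL specification's host serializer performs, so it only demonstrates the surrounding '[' and ']'.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <variant>

// Hypothetical deserialized host representation.
using IPv4 = std::uint32_t;                  // Address as a 32-bit integer.
using IPv6 = std::array<std::uint16_t, 8>;   // Eight 16-bit pieces.
using Host = std::variant<std::string, IPv4, IPv6>;

std::string serialize_host(Host const& host)
{
    // Domains are returned unchanged.
    if (auto const* domain = std::get_if<std::string>(&host))
        return *domain;

    // IPv4 addresses are written as dotted decimal.
    if (auto const* v4 = std::get_if<IPv4>(&host)) {
        return std::to_string((*v4 >> 24) & 0xff) + "."
            + std::to_string((*v4 >> 16) & 0xff) + "."
            + std::to_string((*v4 >> 8) & 0xff) + "."
            + std::to_string(*v4 & 0xff);
    }

    // IPv6 addresses are wrapped in brackets.
    // Simplification: zero runs are NOT compressed to "::" here.
    auto const& pieces = std::get<IPv6>(host);
    std::string out = "[";
    for (std::size_t i = 0; i < pieces.size(); ++i) {
        char buffer[5]; // Up to four hex digits plus the terminator.
        std::snprintf(buffer, sizeof(buffer), "%x", static_cast<unsigned>(pieces[i]));
        if (i != 0)
            out += ':';
        out += buffer;
    }
    out += ']';
    return out;
}
```

Keeping the parsed form around means a caller that needs to know whether the host is a domain or an IP address can inspect the variant directly instead of reparsing a string.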
Timothy Flynn
6406a561ef LibWeb: Handover the fetch response's internal body data upon completion
This is a normative change to the Fetch spec. See:
https://github.com/whatwg/fetch/commit/9003266
https://github.com/whatwg/fetch/commit/b5a587b
2023-05-29 17:12:46 +02:00
Andreas Kling
055dabc123 LibWeb: Fix unsafe capture of stack variables in main_fetch() 2023-05-21 16:01:19 +02:00
Sam Atkins
9c2d496dbe LibWeb: Make processBodyError take an optional exception
Changed here:
018ac19838
2023-05-15 16:28:16 +02:00
Aliaksandr Kalenik
269c25e1d2 LibWeb/Fetch: Pass recursive=false to manual navigation redirect
Implement ca10f49748

Fixes the issue I found while working on navigation:
https://github.com/whatwg/fetch/issues/1629
2023-04-24 13:38:37 +01:00
Sam Atkins
22e0603bf7 LibWeb: Implement integrity-metadata part of fetch algorithm
Specifically, this makes `<link>` elements with an `integrity` attribute
actually work. Previously, we would load their resource, and then drop
it on the floor without actually using it.

The Subresource Integrity code is in `LibWeb/SRI`, since SRI is the name
of the recommendation spec: https://www.w3.org/TR/SRI/

However, the Fetch spec links to the editor's draft, which varies
significantly from the recommendation, and so that is what the code is
based on and what the spec comments link to:
https://w3c.github.io/webappsec-subresource-integrity/

Fixes #18408
2023-04-21 20:44:47 +01:00
Sam Atkins
6d93e03211 LibWeb+Browser+Ladybird: Use JS::SafeFunction for EventLoop callbacks
This automatically protects captured objects from being GC'd before the
callback runs.
2023-04-21 20:44:47 +01:00
Sam Atkins
955528055c LibWeb: Add FIXME: for new step 6 of Fetch's "main fetch"
This step was added in this commit:
2d78995db8
2023-04-21 20:44:47 +01:00
MacDue
35612c6a7f AK+Everywhere: Change URL::path() to serialize_path()
This now defaults to serializing the path with percent decoded segments
(which is what all callers expect), but has an option not to. This fixes
`file://` URLs with spaces in their paths.

The name has been changed to serialize_path() to make it more clear
that this method will generate a new string each call (except for the
cannot_be_a_base_url() case). A few callers have then been updated to
avoid repeatedly calling this function.
2023-04-15 06:37:04 +02:00
Aliaksandr Kalenik
47f03c3a9a LibWeb/Fetch: Use a basic filtered response for redirect navigations
Match following change in the spec:
8f109835dc
2023-04-09 19:10:45 +02:00