This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.
One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
Otherwise, we end up propagating those dependencies into targets that
link against that library, which creates unnecessary link-time
dependencies.
Also included are changes to readd now missing dependencies to tools
that actually need them.
`index + 1` was not correct. For example, if the body has two bytes, we
would consume the first byte and increment the index. We then add one
to the index and see it's equal to the size, so we take this one byte
and set the body result to it. The while loop would still continue and
we consume the second byte, adding it to the temporary buffer. We see
that the index is above the size, so we don't update the body, dropping
the last byte on the floor.
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).
No functional changes.
JsonArray.h does not #include the definition of JsonValue::serialize, as
it lives in JsonObject.h. The macOS Clang target handles symbol
visibility slightly differently (I couldn't figure out how exactly), so
no visible instantiation ended up being created for the function,
causing a link failure.
A mistake I've repeatedly made is along these lines:
```c++
auto nread = TRY(source_file->read(buffer));
TRY(destination_file->write(buffer));
```
It's a little clunky to have to create a Bytes or StringView from the
buffer's data pointer and the nread, and easy to forget and just use
the buffer. So, this patch changes the read() function to return a
Bytes of the data that were just read.
The other read_foo() methods will be modified in the same way in
subsequent commits.
Fixes#13687
AK::URL stores the URL query string already encoded. We were sending
double-encoded query strings, which is why the Google cookie consent
page was not working correctly.
A change was made prior to percent encode plus signs in order to fix an
issue with the Google cookie consent page.
Unforunately, this was treating a symptom of a problem and not the root
cause and is incorrect behavior.
Adds a new optional parameter 'reserved_chars' to
AK::URL::percent_encode. This new optional parameter allows the caller
to specify custom characters to be percent encoded. This is then used
to percent encode plus signs by HttpRequest::to_raw_request.
When the server doesn't signal the Content-Length or use a chunked mode,
it may just terminate the connection after sending the data.
The TLS sockets would then get stuck in a state with no data to read and
not reach the disconnected state, making some requests hang.
We know double check the EOF status of HTTP jobs after reading the
payload to resolve requests properly and also mark the TLS sockets as
EOF after processing all the data and the underlying TCP socket reaches
EOF.
Fixes#12866.
According to rfc2616 section 6.1 the text of reason phrase is not
defined and can be replaced by server.
Some servers (for example http://linux.org.ru) leave it empty.
This change fixes parsing of HTTP responses with empty reason phrase.
When entering the InBody state LibHTTP performs a
can_read_without_blocking check, which is duplicated immediately
afterwards. This initial call is removed.
When LibHTTP encountered the blank line between the headers and the body
in a HTTP response it made a call the m_socket->can_read_line(). This
ultimately tried to find a newline in the stream. If the response body
was small and did not contain a new line then the request would hang.
The call to m_socket->can_read_line() is removed so that the code is
able to progress into the body reading loop.
Instead of using ByteBuffer::slice() to carve off the remaining part of
the payload every time we flush a part of it, we now keep a sliding
span (ReadonlyBytes) over it.
...even if the headers claim that there's some data in the form of
Content-Length.
This finally fixes loading Discord with RequestServer ConnectionCache
on :^)
This commit converts TLS::TLSv12 to a Core::Stream object, and in the
process allows TLS to now wrap other Core::Stream::Socket objects.
As a large part of LibHTTP and LibGemini depend on LibTLS's interface,
this also converts those to support Core::Stream, which leads to a
simplification of LibHTTP (as there's no need to care about the
underlying socket type anymore).
Note that RequestServer now controls the TLS socket options, which is a
better place anyway, as RS is the first receiver of the user-requested
options (though this is currently not particularly useful).
Apologies for the enormous commit, but I don't see a way to split this
up nicely. In the vast majority of cases it's a simple change. A few
extra places can use TRY instead of manual error checking though. :^)
Derivatives of Core::Object should be constructed through
ClassName::construct(), to avoid handling ref-counted objects with
refcount zero. Fixing the visibility means that misuses like this are
more difficult.
When we receive HTTP payloads, we have to ensure that the number of
bytes read is *at most* the value specified in the Content-Length
header.
However, we did not use the correct value when calculating the truncated
size of the last payload. `m_buffered_size` does not store the total
number of bytes received, but rather the number of bytes that haven't
been read from us.
This means that if some data has already been read from us,
`m_buffered_size` is smaller than `m_received_size`. Because of this, we
ended up resizing the `payload` ByteBuffer to a larger size than its
contents. This garbage data was then read by consumers, producing this
warning when executing scripts:
> Extension byte 0xdc in 1 position after first byte 0xdc doesn't make
> sense.
Used these commands to test it:
printf 'HTTP/1.0 200 OK\r\n%s\r\n\r\n%s' 'Content-Length: 4' \
'well hello friends!' | nc -lN 0.0.0.0 8000
pro http://0.0.0.0:8000
(Actually, this also needs a Content-Encoding header, as response
streaming is disabled then. It didn't fit in the title.)
We were creating too small buffer -- instead of assigning the total
received buffer size, we were using the Content-Length value.
As you can see, the m_buffered_size might now exceed the Content-Length
value, but that will be handled in next commits, regardless if
the response can be streamed or not. :^)
Here's a minimal code that caused crash before:
printf 'HTTP/1.0 200 OK\r\n%s\r\n%s\r\n\r\n%s' \
'Content-Encoding: anything' 'Content-Length: 3' \
':^)AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' | nc -lN 0.0.0.0 8000
pro http://0.0.0.0:8000
If we don't quit, the underlying socket won't get a chance to do much
other than nothing while we spin in read_while_data_available().
Fixes some possible RS spin (especially seen in Google's cookie consent
page).
Apparently discord likes to feed us headers as big as 6KiB, so clearly
there are large headers out there in the wild.
For reference, Apache's limit is 8KiB, and IIS's limit is 16KiB (this
limit is not defined by the spec, so nothing can stop a server from
sending massive headers - sadly)