Commit Graph

59898 Commits

Author SHA1 Message Date
Nico Weber
c4a45bb521 LibGfx/JBIG2: Make compute_context() a function pointer
...instead of a lambda that checks the template on every call.

Doesn't make a performance difference locally, but seems maybe nicer?

No behavior change.
2024-03-25 14:08:40 +01:00
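A minimal sketch of the idea in c4a45bb521, assuming nothing about the real LibGfx code: the template number is inspected once to pick a function, and the per-pixel loop then calls through the pointer. All names here (`ComputeContext`, `compute_context_N`, `select_compute_context`, `sum_contexts`) are hypothetical, and the function bodies are placeholders that just return the template number instead of gathering neighboring pixels.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical signature: one context value per pixel position.
using ComputeContext = uint16_t (*)(int x, int y);

// Placeholder bodies. A real decoder would read already-decoded pixels here.
uint16_t compute_context_0(int, int) { return 0; }
uint16_t compute_context_1(int, int) { return 1; }
uint16_t compute_context_2(int, int) { return 2; }
uint16_t compute_context_3(int, int) { return 3; }

// The template check happens once, outside the hot loop.
ComputeContext select_compute_context(int gb_template)
{
    switch (gb_template) {
    case 0:
        return compute_context_0;
    case 1:
        return compute_context_1;
    case 2:
        return compute_context_2;
    case 3:
        return compute_context_3;
    default:
        return nullptr;
    }
}

// Stand-in for the decoding loop: no per-pixel branching on the template.
int sum_contexts(int gb_template, int width, int height)
{
    ComputeContext compute_context = select_compute_context(gb_template);
    assert(compute_context != nullptr);
    int sum = 0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x)
            sum += compute_context(x, y);
    }
    return sum;
}
```

Calling `sum_contexts(3, 2, 2)` visits four pixels and dispatches through the same pointer each time.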
Nico Weber
828c640087 LibGfx/JBIG2: Make get_pixel static constexpr
...so it doesn't need to be captured.
2024-03-25 14:08:40 +01:00
Nico Weber
b45a4508c7 LibGfx/JBIG2: Implement support for context templates 1, 2, and 3
Template 2 is needed by some symbols in 0000372.pdf page 11 and
0000857.pdf pages 1-4. Implement the others too while here. (The
mentioned pages in those two PDFs also use the "end of stripe" segment,
so they still don't render yet.)

We still don't support EXTTEMPLATE.
2024-03-25 14:08:40 +01:00
Nico Weber
e2f02f4df7 Tests/JBIG2: Add test images for non-0 templates
Produced by this shell script:

```sh
S=Tests/LibGfx/test-inputs/bmp/bitmap.bmp
J=$HOME/Downloads/T-REC-T.88-201808-I\!\!SOFT-ZST-E/Software
J=$J/JBIG2_SampleSoftware-A20180829/source/jbig2

for t in template1 template1-tpgdon \
         template2 template2-tpgdon \
         template3 template3-tpgdon; do
  echo $t.ini
  cat $t.ini
  echo
  $J -i "${S%.bmp}" -f bmp -o bitmap-$t -F jb2 -ini $t.ini
  echo
done
```

Producing this output:

```sh
% ./make-templates.sh
template1.ini
-Gen -Seg 1
-Gen -Param -Template 1
-Gen -Param -ATX1 3
-Gen -Param -ATY1 -1

ENC Start ===>complete

template1-tpgdon.ini
-Gen -Seg 1
-Gen -Param -Template 1
-Gen -Param -ATX1 3
-Gen -Param -ATY1 -1
-Gen -Param -TpGDon 1

ENC Start ===>complete

template2.ini
-Gen -Seg 1
-Gen -Param -Template 2
-Gen -Param -TpGDon 1
-Gen -Param -ATX1 2
-Gen -Param -ATY1 -1

ENC Start ===>complete

template2-tpgdon.ini
-Gen -Seg 1
-Gen -Param -Template 2
-Gen -Param -ATX1 2
-Gen -Param -ATY1 -1
-Gen -Param -TpGDon 1

ENC Start ===>complete

template3.ini
-Gen -Seg 1
-Gen -Param -Template 3
-Gen -Param -ATX1 2
-Gen -Param -ATY1 -1

ENC Start ===>complete

template3-tpgdon.ini
-Gen -Seg 1
-Gen -Param -Template 3
-Gen -Param -ATX1 2
-Gen -Param -ATY1 -1
-Gen -Param -TpGDon 1

ENC Start ===>complete
```

`jbig2` still has the local patch mentioned in #23608 to make it write
correct TPGDON data.
2024-03-25 14:08:40 +01:00
Nico Weber
7035c2a2ff LibGfx/JBIG2: Add some debug logging to decode_page_information() 2024-03-25 14:08:40 +01:00
Andreas Kling
9af966f87d LibWeb: Avoid unnecessary Vector copying when generating line boxes
Carry the same Vector<Gfx::DrawGlyphOrEmoji> all the way from the inline
level iterator to the final line box fragment.
2024-03-25 12:39:23 +01:00
Andreas Kling
f48024c2d1 LibGfx/OpenType: Make glyph_width() only fetch the glyph advance
Instead of fetching a generic set of metrics for each glyph, only fetch
the advance when that's all we need.

This is extremely hot in LibWeb text layout, where it makes a nice dent.
2024-03-25 12:39:23 +01:00
Andreas Kling
2b8a920a7c AK: Don't blindly use SipHash as default hash function
Although it has some interesting properties, SipHash is brutally slow
compared to our previous hash function. Since its introduction, it has
been highly visible in every profile of doing anything interesting with
LibJS or LibWeb.

By switching back, we gain a 10x speedup for 32-bit hashes, and "only"
a 3x speedup for 64-bit hashes.

This comes out to roughly 1.10x faster HashTable insertion, and roughly
2.25x faster HashTable lookup. Hashing is no longer at the top of
profiles and everything runs measurably faster.

For security-sensitive hash tables with user-controlled inputs, we can
opt into SipHash selectively on a case-by-case basis. The vast majority
of our uses don't fit that description though.
2024-03-25 12:39:23 +01:00
Nico Weber
d2998c1f5e LibGfx/JBIG2: Implement generic_refinement_region_decoding_procedure()
With this, we can decode all pages of 0000425.pdf, 0000215.pdf,
0000882.pdf, and 0000057.pdf.
2024-03-25 08:15:36 +01:00
Nico Weber
0d2e91b4ea LibGfx/JBIG2: Reject things in refinement decoding
These aren't hit for my 1000 page PDF test set.
2024-03-25 08:15:36 +01:00
Nico Weber
562d8ed619 LibGfx/JBIG2: Stub out generic_refinement_region_decoding_procedure()
...and make text_region_decoding_procedure() call it.

generic_refinement_region_decoding_procedure() still just returns
"unimplemented", so no behavior change yet.
2024-03-25 08:15:36 +01:00
Nico Weber
c4c48c1d5f LibGfx/JBIG2: Sketch out text segment refinement coding a bit 2024-03-25 08:15:36 +01:00
Nico Weber
9f327833c0 LibGfx/JBIG2: Read refinement adaptive template pixels for text segments
Text segments using refinement are still rejected later, by
text_region_decoding_procedure(). But we deserialize the input data now,
and the error when this feature is used is now slightly different.
2024-03-25 08:15:36 +01:00
Aliaksandr Kalenik
0652d159cf LibWeb: Add a test for mouse{over,out,enter,leave} events 2024-03-25 08:14:13 +01:00
Aliaksandr Kalenik
4ae2eaead1 LibWeb: Dispatch mouseout and mouseover events 2024-03-25 08:14:13 +01:00
Timothy Flynn
ed24d8f2b5 Ladybird/AppKit: Implement pasting content from the clipboard 2024-03-25 08:14:00 +01:00
Timothy Flynn
0069390e1c Browser: Implement pasting content from the clipboard 2024-03-25 08:14:00 +01:00
Timothy Flynn
7e38653492 AK: Reject invalid Base64 encoded string lengths 2024-03-25 08:13:27 +01:00
Timothy Flynn
4ecf4c7617 AK: Compute the exact size of decoded Base64 strings 2024-03-25 08:13:27 +01:00
Timothy Flynn
754ff41b9c AK: Remove whitespace skipping feature from AK's Base64 decoder
This was added in commit f2663f477f as a
partial implementation of what is now LibWeb's forgiving Base64 decoder.
All use cases within LibWeb that require whitespace skipping now use
that implementation instead.

Removing this feature from AK allows us to know the exact output size of
a decoded Base64 string. We can still trim whitespace at the start and
end of the input though; for example, this is useful when reading from a
file that may have a newline at the end of the file.
2024-03-25 08:13:27 +01:00
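Why dropping interior whitespace skipping makes the output size exact (754ff41b9c) can be sketched as follows. This is an illustration under stated assumptions, not AK's implementation: it assumes padded input whose trimmed length is a multiple of 4, and it does not validate the alphabet. The names `trim_ascii_whitespace` and `decoded_base64_size` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Strip whitespace at the start and end only, e.g. a trailing newline from a file.
std::string_view trim_ascii_whitespace(std::string_view input)
{
    constexpr std::string_view whitespace = " \t\n\r\f\v";
    auto first = input.find_first_not_of(whitespace);
    if (first == std::string_view::npos)
        return {};
    auto last = input.find_last_not_of(whitespace);
    return input.substr(first, last - first + 1);
}

// With no whitespace inside the input, every 4 characters yield 3 bytes,
// minus one byte per trailing '=' (at most two).
std::optional<size_t> decoded_base64_size(std::string_view input)
{
    input = trim_ascii_whitespace(input);
    if (input.size() % 4 != 0)
        return std::nullopt; // Assumption: unpadded lengths are rejected.
    if (input.empty())
        return size_t { 0 };

    size_t padding = 0;
    if (input.back() == '=') {
        ++padding;
        if (input[input.size() - 2] == '=')
            ++padding;
    }
    return input.size() / 4 * 3 - padding;
}
```

With the size known up front, a decoder can allocate its output buffer once instead of growing or shrinking it afterwards.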
Timothy Flynn
690db10463 AK: Convert Base64 template parameters to regular function parameters
The generated function name is otherwise very long, which makes stack
traces a bit more difficult to sift through.
2024-03-25 08:13:27 +01:00
Timothy Flynn
f292746134 AK: Convert some west-consts to east-const in Base64.cpp
Caught by clang-format-17. Note that clang-format-16 is fine with this
as well (it leaves the const placement alone); it just doesn't perform
the formatting to east-const itself.
2024-03-25 08:13:27 +01:00
Timothy Flynn
74377618b1 LibWeb: Process Base64 data URLs with the forgiving Base64 algorithm 2024-03-25 08:13:27 +01:00
Timothy Flynn
24ecf31ff5 LibURL+LibWeb: Move data URL processing to LibWeb's fetch infrastructure
This is a fetching AO and is only used by LibWeb in the context of fetch
tasks. Move it to LibWeb with other fetch methods.

The main reason for this is that it requires the use of other LibWeb AOs
such as the forgiving Base64 decoder and MIME sniffing. These AOs aren't
available within LibURL.
2024-03-25 08:13:27 +01:00
Timothy Flynn
2118cdfcaa Meta: Add LibWeb unit tests to the GN build 2024-03-25 08:13:27 +01:00
Timothy Flynn
a88ee029d7 Meta: Add the base64 utility to the GN build 2024-03-25 08:13:27 +01:00
MacDue
c6899b79b6 LibWeb: Normalize the angle delta in CanvasPath::ellipse()
This fixes both the incorrect arc and ellipse from #22817.
2024-03-24 18:37:44 +01:00
Aliaksandr Kalenik
b31fb36ed3 LibWeb: Reschedule repaint for navigables with ongoing painting
Fixes delayed repainting in the following case:
1. Style or layout invalidation triggers HTML event loop processing.
2. Event loop processing does nothing because there is no rendering
   opportunity.
3. Style or layout change won't be reflected until something else
   triggers event loop processing.
2024-03-24 16:30:31 +01:00
Andreas Kling
3bdfca1119 AK: Make FlyString::from_utf8*() avoid allocation if possible
If we already have a FlyString instantiated for the given string,
look that up and return it instead of making a temporary String just to
use as a key into the FlyString table.
2024-03-24 13:28:24 +01:00
Andreas Kling
8d7a1e5654 LibWeb: Skip some redundant UTF-8 validation in CSS tokenizer
If we're just adding code points to a StringBuilder, there's no need to
revalidate the result.
2024-03-24 13:28:24 +01:00
Andreas Kling
a88799c032 AK: Remove excessive hashing caused by FlyString table
Before this change, the global FlyString table looked like this:

    HashMap<StringView, Detail::StringBase>

After this change, we have:

    HashTable<Detail::StringData const*, FlyStringTableHashTraits>

The custom hash traits are used to extract the stored hash from
StringData which avoids having to rehash the StringView repeatedly like
we did before.

This necessitated a handful of smaller changes to make it work.
2024-03-24 13:28:24 +01:00
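The stored-hash idea from a88799c032 can be illustrated with the standard library. This is a rough analogue only: `InternedData`, `StoredHashTraits`, `ContentEquals` and `InternTable` are hypothetical stand-ins, and `std::unordered_set` takes the place of AK's `HashTable`. The point is that the table's hash functor reads a hash that was computed once, when the string data was created, instead of hashing the characters on every table operation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <utility>

// Stand-in for string data that caches its own hash.
struct InternedData {
    std::string bytes;
    size_t hash;

    explicit InternedData(std::string s)
        : bytes(std::move(s))
        , hash(std::hash<std::string> {}(bytes)) // Computed exactly once.
    {
    }
};

// Custom traits: return the stored hash, never rehash the characters.
struct StoredHashTraits {
    size_t operator()(InternedData const* data) const { return data->hash; }
};

// Two entries are the same interned string if their contents match.
struct ContentEquals {
    bool operator()(InternedData const* a, InternedData const* b) const
    {
        return a->bytes == b->bytes;
    }
};

using InternTable = std::unordered_set<InternedData const*, StoredHashTraits, ContentEquals>;
```

Looking up a second `InternedData` with the same contents finds the pointer that was inserted first, which is the deduplication an interning table needs.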
Andreas Kling
8bfad24708 AK: Move AK::Detail::StringData to its own header file
This will allow us to access it from FlyString.cpp.
2024-03-24 13:28:24 +01:00
Andreas Kling
f1f7e89b68 LibJS: Lex 1/2/3-byte tokens without HashMap lookups
The 1-byte ones are now a simple array lookup, while we handle 2 and 3
bytes with a simple list of if statements.
2024-03-24 13:28:24 +01:00
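The approach described in f1f7e89b68 might look roughly like the sketch below. The token set, the names (`TokenType`, `Lexed`, `lex_punctuator`) and the longest-match-first ordering are assumptions made for illustration; the actual LibJS lexer is not shown in this log.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

enum class TokenType { Invalid, Plus, Minus, Equals, PlusPlus, EqualsEquals, Arrow, EqualsEqualsEquals, TripleDot };

struct Lexed {
    TokenType type;
    size_t length;
};

// 1-byte tokens: a 256-entry table indexed by the byte value.
static std::array<TokenType, 256> make_single_byte_table()
{
    std::array<TokenType, 256> table;
    table.fill(TokenType::Invalid);
    table[static_cast<unsigned char>('+')] = TokenType::Plus;
    table[static_cast<unsigned char>('-')] = TokenType::Minus;
    table[static_cast<unsigned char>('=')] = TokenType::Equals;
    return table;
}

Lexed lex_punctuator(std::string_view input)
{
    static auto const single_byte_tokens = make_single_byte_table();

    // 3-byte tokens: a short list of if statements, tried first (assumption).
    if (input.size() >= 3) {
        auto three = input.substr(0, 3);
        if (three == "===")
            return { TokenType::EqualsEqualsEquals, 3 };
        if (three == "...")
            return { TokenType::TripleDot, 3 };
    }
    // 2-byte tokens: same pattern.
    if (input.size() >= 2) {
        auto two = input.substr(0, 2);
        if (two == "==")
            return { TokenType::EqualsEquals, 2 };
        if (two == "=>")
            return { TokenType::Arrow, 2 };
        if (two == "++")
            return { TokenType::PlusPlus, 2 };
    }
    // 1-byte tokens: a plain array lookup, no hashing involved.
    if (!input.empty()) {
        auto type = single_byte_tokens[static_cast<unsigned char>(input[0])];
        if (type != TokenType::Invalid)
            return { type, 1 };
    }
    return { TokenType::Invalid, 0 };
}
```

For example, `lex_punctuator("=>")` matches the 2-byte branch, while `lex_punctuator("=")` falls through to the table.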
Andreas Kling
3851d3add0 LibJS: Make Token::m_message a StringView
This is only ever a string literal, so there's no need to keep creating
the same strings at runtime.
2024-03-24 13:28:24 +01:00
Kenneth Myhra
51847bbebf LibWeb: Remove ImageData's create_with_size() and use create() instead
Removes ImageData::create_with_size() and redirects previous usage to
ImageData::create().
2024-03-24 11:09:09 +01:00
Kenneth Myhra
8a1e88677f LibWeb: Add FIXME comments to ImageData.idl
Add FIXME comments for ImageData's missing constructor and attribute
colorSpace.
2024-03-24 11:09:09 +01:00
Kenneth Myhra
30a02fef91 LibWeb: Add one of the two documented constructors to ImageData
Also adds the IDL types:
- dictionary ImageDataSettings
- enum PredefinedColorSpace.
2024-03-24 11:09:09 +01:00
Ali Mohammad Pur
6adf1be06b Shell: Add support for octal escapes in strings
This adds all three common prefixes (\0, \o and \c).
2024-03-24 08:26:56 +01:00
Nico Weber
ce4396d6ff MacPDF: Fix capitalization of "Show Images" Debug menu entry 2024-03-24 08:25:31 +01:00
Ali Mohammad Pur
27a38932da LibRegex: Account for extra explicit And/Or in class parser assertion
Fixes #23691.
2024-03-24 08:24:46 +01:00
Nico Weber
259a84ddac Tests/JBIG2: Add a test for symbol and text segment decoding 2024-03-23 17:30:15 -04:00
Nico Weber
ced21d8419 LibGfx/JBIG2: Call decode_immediate_text_region for lossless text region
It seems to do the right thing already, and nothing in the spec says
not to do this as far as I can tell.

With this, we can finally decode the test input from #23659.

See f391c7822d for a similar change for generic regions and
lossless generic regions.
2024-03-23 17:30:15 -04:00
Nico Weber
b15e1d2b2a LibGfx/JBIG2: Implement initial support for text segments
Text segments conceptually store (x,y,id) triples. (x,y) is a
coordinate, and id refers to an id from a symbol segment.
A text segment has the effect of drawing some of the bitmaps stored
in a symbol segment to the output bitmap.

For example, the symbol segment might contain a small bitmap that
happens to look like the letter 'A', and the text segment might
draw that everywhere a scanned page has an 'A'. (The JBIG2 format
only treats it as an abstract bitmap. It doesn't know that this
small bitmap is an 'A'.)

This is missing support for many things:

* Huffman-coded input (not used in practice)
* Symbol refinement
* Transposed symbols
* Colors (not used in practice)

Still, we now have basic symbol/text segment support. This is enough
to decode the downloadable PDF here:
https://www.google.com/books/edition/Paradise_Lost/6qdbAAAAQAAJ

It doesn't lead to any progression on my 1000 file test PDF set.
The 7 files in there that use JBIG2 with symbol and text segments
now fail to load for other reasons (4 need symbol refinement for
text segments, one needs end-of-stripe segment support, one needs
support for symbol segments referring to other segments).

(And possibly, many other PDFs from Google Books, but that's the
only one I've tried so far.)
2024-03-23 17:30:15 -04:00
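The (x,y,id) model from b15e1d2b2a can be sketched as below. This is a conceptual illustration, not the decoder: `Bitmap`, `SymbolInstance` and `draw_instances` are hypothetical names, (x,y) is assumed to be the top-left corner of the symbol, pixels are simply OR-ed into the page, and parts that fall outside the page are skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Bitmap {
    int width { 0 };
    int height { 0 };
    std::vector<bool> bits;

    Bitmap(int w, int h)
        : width(w)
        , height(h)
        , bits(static_cast<size_t>(w) * static_cast<size_t>(h), false)
    {
    }

    bool get(int x, int y) const { return bits[static_cast<size_t>(y) * width + x]; }
    void set(int x, int y, bool value) { bits[static_cast<size_t>(y) * width + x] = value; }
};

// One entry of a text segment: where to draw, and which symbol to draw.
struct SymbolInstance {
    int x;
    int y;
    size_t id;
};

// Draw each referenced symbol bitmap onto the page.
void draw_instances(Bitmap& page, std::vector<Bitmap> const& symbols, std::vector<SymbolInstance> const& instances)
{
    for (auto const& instance : instances) {
        assert(instance.id < symbols.size());
        Bitmap const& symbol = symbols[instance.id];
        for (int sy = 0; sy < symbol.height; ++sy) {
            for (int sx = 0; sx < symbol.width; ++sx) {
                int px = instance.x + sx;
                int py = instance.y + sy;
                if (px < 0 || py < 0 || px >= page.width || py >= page.height)
                    continue; // Outside the page.
                if (symbol.get(sx, sy))
                    page.set(px, py, true);
            }
        }
    }
}
```

The same symbol bitmap can be referenced by many instances, which is what makes this representation compact for scanned text.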
Nico Weber
3454970903 LibGfx/JBIG2: Extract composite_bitbuffer() and add some features
This extracts the bitbuffer combining code we had into a new function
composite_bitbuffer() and adds the following features:

* Real support for combination operators (which also lets us allow black
  as background color again, even if that's never used in practice)
* Clipping support (not used here yet, but will be needed elsewhere
  soon)

We're going to need this for text segment handling.

No behavior change.
2024-03-23 17:30:15 -04:00
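A sketch of what compositing with combination operators and clipping (3454970903) could look like. It is not the real `composite_bitbuffer()`: `BitBuffer`, `combine` and `composite` are hypothetical, and the five operators are the ones I understand JBIG2 to define (OR, AND, XOR, XNOR, REPLACE).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

enum class CombinationOperator { Or, And, Xor, XNor, Replace };

struct BitBuffer {
    int width;
    int height;
    std::vector<bool> bits;

    BitBuffer(int w, int h, bool fill = false)
        : width(w)
        , height(h)
        , bits(static_cast<size_t>(w) * static_cast<size_t>(h), fill)
    {
    }

    bool get(int x, int y) const { return bits[static_cast<size_t>(y) * width + x]; }
    void set(int x, int y, bool value) { bits[static_cast<size_t>(y) * width + x] = value; }
};

// Combine one destination bit with one source bit.
bool combine(bool dst, bool src, CombinationOperator op)
{
    switch (op) {
    case CombinationOperator::Or:
        return dst || src;
    case CombinationOperator::And:
        return dst && src;
    case CombinationOperator::Xor:
        return dst != src;
    case CombinationOperator::XNor:
        return dst == src;
    case CombinationOperator::Replace:
        return src;
    }
    return src;
}

// Composite src onto dst at (x_offset, y_offset), clipped to dst's bounds.
void composite(BitBuffer& dst, BitBuffer const& src, int x_offset, int y_offset, CombinationOperator op)
{
    int x_begin = std::max(0, -x_offset);
    int y_begin = std::max(0, -y_offset);
    int x_end = std::min(src.width, dst.width - x_offset);
    int y_end = std::min(src.height, dst.height - y_offset);
    for (int y = y_begin; y < y_end; ++y) {
        for (int x = x_begin; x < x_end; ++x) {
            bool old_bit = dst.get(x + x_offset, y + y_offset);
            dst.set(x + x_offset, y + y_offset, combine(old_bit, src.get(x, y), op));
        }
    }
}
```

Because the operator is applied per bit, a destination filled with set bits (a black background) works with the same code path as a white one.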
Nico Weber
754e1b46fc LibGfx/JBIG2: Implement basic symbol segment processing
A symbol segment defines a bunch of small bitmaps and associates them
with numeric IDs.

This only implements reading symbols encoded with the arithmetic coder.
It does not support huffman coding. (In practice, everything seems to
use arithmetic coding.)

Support for refinement or aggregate coding isn't implemented yet.
Support for retaining bitmap coding contexts isn't implemented yet.
Support for symbol segments referring to other symbol segments isn't
implemented yet.
But all produce diagnostics if encountered, so we won't forget about
them. (I haven't seen either being used in the wild.)

No visible behavior change yet, but with JBIG2_DEBUG turned on,
it produces all kinds of debug output.
2024-03-23 17:30:15 -04:00
Nico Weber
93fcb529cf LibGfx/JBIG2: Move SegmentData down a bit
Symbol segments will store decoded symbols, and for that SegmentData
needs to come after BitBuffer.

No behavior change.
2024-03-23 17:30:15 -04:00
Nico Weber
2099ca48a1 LibGfx/JBIG2: Pass in decoder and contexts to generic region decoder
The symbol segment decoding procedure will read generic regions
that aren't at a byte boundary, and that share contexts across
several regions.

No behavior change.
2024-03-23 17:30:15 -04:00
Nico Weber
376b1a2309 LibGfx/JBIG2: Have just one CombinationOperator enum class
We already had two, and we would need another one for text segments.

No behavior change.
2024-03-23 17:30:15 -04:00
Nico Weber
c06110da87 LibGfx/JBIG2: Make AdaptiveTemplatePixel toplevel
We're going to need it for symbol segment decoding too.

No behavior change.
2024-03-23 17:30:15 -04:00
Nico Weber
8e82c2b932 LibGfx/JBIG2: Add arithmetic integer decoder
The existing ArithmeticDecoder (from Annex E) reads one bit at a
time.

ArithmeticIntegerDecoder (from Annex A) builds on top of that to
read integer values.

This will be used by both the symbol segment and the text segment
readers.

(This does not yet implement the IAID decoding procedure in A.3.
We only need that one in the text segment decoder at the moment,
and it's pretty small, so I'll put it inline there for now.)

Not used yet, so no behavior change yet.
2024-03-23 17:30:15 -04:00