ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2024-09-20 17:58:18 +03:00

Author	SHA1	Message	Date
Lucas CHOLLET	70a3f1f02b	LibCompress: Rename LZWDecoder => LzwDecompressor This is more idiomatic for LibCompress' decoders.	2024-05-14 12:33:53 -06:00
Lucas CHOLLET	11f53ba9cc	LibCompress: Rename LZWDecoder.h => Lzw.h	2024-05-14 12:33:53 -06:00
Ava Rider	3a7bea7402	LibPDF: Added empty read check to parse_hex_string	2024-05-05 06:45:42 +01:00
Sergey Bugaev	d458471e09	LibPDF: Convert byte offsets to u64 This fixes a build failure on 32-bit. Suggested-by: Nico Weber <thakis@chromium.org>	2024-05-02 07:46:53 -06:00
Dan Klishch	5ed7cd6e32	Everywhere: Use east const in more places These changes are compatible with clang-format 16 and will be mandatory when we eventually bump clang-format version. So, since there are no real downsides, let's commit them now.	2024-04-19 06:31:19 -04:00
Nico Weber	76ba374aef	LibPDF: Move CFF to use AK::Error instead of PDF::Error Similar to the previous commit. No real behavior change.	2024-04-14 17:22:00 +02:00
Nico Weber	102ac331c6	LibPDF: Move Type1FontProgram to use AK::Error instead of PDF::Error Makes some of the errors a bit less descriptive. But this is pretty stable by now and the errors fire basically never, so that seems ok. Needing the explicit `AK::` prefix is a bit awkward, but that'll go away once this class moves out of LibPDF. Move error() into the two subclasses. I'll remove it from CFF in a follow-up. No real behavior change.	2024-04-14 17:22:00 +02:00
Nico Weber	2b905cc482	LibPDF: Move rest of CFF from Reader to Stream parse_index_data() wants to take ReadonlyByte views of the stream data, so we need FixedMemoryStream::read_in_place(size_t). All other remaining code indirectly calls parse_index_data(), so that all operates on FixedMemoryStreams too. No behavior change.	2024-04-14 10:45:11 +02:00
Nico Weber	cc3d4b6adb	LibPDF: Convert CFF::parse_dict & co to Stream No behavior change.	2024-04-14 10:45:11 +02:00
Nico Weber	1f7924e14c	LibPDF: Convert CFF::parse_charset to Stream No behavior change.	2024-04-14 10:45:11 +02:00
Nico Weber	4995dfe8f1	LibPDF: Convert CFF::parse_fdselect to Stream No behavior change.	2024-04-14 10:45:11 +02:00
Nico Weber	16c22885eb	LibPDF: Convert CFF::parse_encoding to Stream No behavior change.	2024-04-14 10:45:11 +02:00
Nico Weber	f570678bf0	LibPDF: Invert image masks used as alpha too Fixes #23824, a regression from the first commit in #23781.	2024-04-04 06:55:08 -04:00
Nico Weber	40780304b8	LibPDF: Add a fastpath for 1bpp grayscale to load_image() We used to expand every bit in an 1bpp image to a 0 or 255 byte, then map that to a float that's either 0.0f or 1.0f (or whatever's in /DecodeArray), then multiply that by 255.0f to convert it to a u8 and put that in the rgb channels of a Color. Now we precompute the two possible outcomes (almost always Black and White) and do a per-bit lookup. Reduces time for Build/lagom/bin/pdf --render-bench --render-repeats 20 --page 36 \ ~/Downloads/Flatland.pdf (the "decoded data cached" case) from 3.3s to 1.1s on my system. Reduces time for Build/lagom/bin/pdf --debugging-stats ~/Downloads/0000/0000231.pdf (the "need to decode each page" case) from 52s to 43s on my machine. Also makes paging through PDFs that contain a 1700x2200 pixel CCITT or JBIG2 bitmap on each page noticeably snappier.	2024-04-02 08:07:46 +02:00
Nico Weber	c01acdd733	LibPDF: Move decode_array up a bit No behavior change.	2024-04-02 08:07:46 +02:00
Nico Weber	81ff9f45d9	LibPDF: Move image mask inversion from load_image() to show_image() No behavior change.	2024-04-02 08:07:46 +02:00
Nico Weber	0374c1eb3b	LibPDF: Handle indirect reference resolving during parsing more robustly If `Document::resolve()` was called during parsing, it'd change the reader's current position, so the parsing code that called it would then end up at an unexpected position in the file. Parser.cpp already had special-case recovery when a stream's length was stored in an indirect reference. Commit ead02da98ac70c ("/JBIG2Globals") in #23503 added another case where we could resolve indirect reference during parsing, but wasn't aware of having to save and restore the reader position for that. Put the save/restore code in `DocumentParser::parse_object_with_index` instead, right before the place that ultimately changes the reader's position during `Document::resolve`. This fixes `/JBIG2Globals` and lets us remove the special-case code for `/Length` handling. Since this is kind of subtle, include a test.	2024-03-19 19:20:01 -04:00
Nico Weber	495aaa295c	LibPDF: Add some logging behind PDF_DEBUG I've added these two lines a bunch of times by now. Let's check them in. If they turn out to be annoying, we can remove them again.	2024-03-19 19:20:01 -04:00
Lucas CHOLLET	1e023a589d	LibPDF: Plug in the CCITT3 1D decoder and pass corresponding options	2024-03-19 12:22:28 +01:00
MacDue	8057542dea	LibGfx: Simplify path storage and tidy up APIs Rather than make path segments virtual and refcounted let's store `Gfx::Path`s as a list of `FloatPoints` and a separate list of commands. This reduces the size of paths, for example, a `MoveTo` goes from 24 bytes to 9 bytes (one point + a single byte command), and removes a layer of indirection when accessing segments. A nice little bonus is transforming a path can now be done by applying the transform to all points in the path (without looking at the commands). Alongside this there's been a few minor API changes: - `path.segments()` has been removed * All current uses could be replaced by a new `path.is_empty()` API * There's also now an iterator for looping over `Gfx::Path` segments - `path.add_path(other_path)` has been removed * This was a duplicate of `path.append_path(other_path)` - `path.ensure_subpath(point)` has been removed * Had one use and is equivalent to an `is_empty()` check + `move_to()` - `path.close()` and `path.close_all_subpaths()` assume an implicit `moveto 0,0` if there's no `moveto` at the start of a path (for consistency with `path.segmentize_path()`). Only the last point could change behaviour (though in LibWeb/SVGs all paths start with a `moveto` as per the spec, it's only possible to construct a path without a starting `moveto` via LibGfx APIs).	2024-03-18 07:09:37 +01:00
Nico Weber	21917e7b1e	LibPDF+PDFViewer+MacPDF: Don't draw hidden text by default Text can be rendered in various ways in PDFs: Filled, stroked, both filled and stroked, set as clipping path, hidden, or some combinations thereof. We don't implement any of this at the moment except "filled". Hidden text is used in scanned documents: The image of the scan is drawn in the background, and then OCRd text is "drawn" as hidden on top of the scanned bitmap. That way, the (hidden) text can be selected and copied, and it looks like you're selecting text from the scanned bitmap. Find-in-page also works similarly. (We currently have neither text selection nor find-in-page, but one day we will.) Now that we have pretty good support for CCITT and are growing some support for JBIG2, we now draw both the scanned background image as well as the foreground text. They're not always perfectly aligned. This change makes it so that we don't render text that's marked as hidden. (We still do most of the coordinate math, which will probably come in handy at some point when we implement text selection.) This makes these scanned documents appear as they're supposed to appear (at least in documents where we manage to decode the background bitmap). This also adds a debug option to force rendering of hidden text.	2024-03-16 13:10:48 -04:00
Nico Weber	be9a6caa0a	LibPDF: In Filter::decode_jbig2(), invert bits See included comment.	2024-03-10 10:10:55 -04:00
Nico Weber	1eaaa8c3e9	LibPDF+LibGfx: Support JBIG2s with /JBIG2Globals set Several ramifications: * /JBIG2Globals is an indirect reference, which means we now need a Document for unfiltering. (Technically, other decode parameters can also be indirect objects and we should use the Document to resolve() those too, but in practice it only seems to be needed for /JBIG2Globals.) * Since /JBIG2Globals are so rare, we just parse once for each image that use them, and decode_embedded() now receives a Vector<ReadonlyBytes> with all sections of sequences of segments. * Internally, decode_segment_headers() is now called several times for embedded JBIG2s with multiple such sections (e.g. PDFs with /JBIG2Globals). * That means `data` is now no longer part of JBIG2LoadingContext and things get slightly reshuffled due to this. This completes the LibPDF part of JBIG2 support. Once LibGfx implements actual decoding of JBIG2s, things should start to Just Work in PDFs.	2024-03-09 16:01:22 +01:00
Nico Weber	953f6c5d9b	LibPDF+LibGfx: Pass jbig2-filtered data to JBIG2ImageDecoderPlugin Except for /JBIG2Globals, which we bail out on for now. In my 1000 files, 13 use JBIG2, and of those, 2 use JBIG2Globals (0000372.pdf e.g. page 11 and 0000857.pdf e.g. page 1), and only one (the latter) of the two uses the same JBIG2Globals stream for more than a single image. JBIG2ImageDecoderPlugin cannot decode the data yet, so no behavior change, but with `#define JBIG2_DEBUG 1` at the top of that file, it now prints segment header info for PDFs containing JBIG2 data :^)	2024-03-09 16:01:22 +01:00
Nico Weber	24951a039e	LibPDF: Clip stroke for B / B* operators Fixes pages 17-19 on https://www.iro.umontreal.ca/~feeley/papers/ChevalierBoisvertFeeleyECOOP15.pdf Calling the fill handler after painting the stroke as previously doesn't work, since we need to set up the clip before both stroke and fill, and unset it after both. The duplication is a bit unfortunate, but also minor.	2024-03-08 10:45:28 -05:00
Nico Weber	3a39939995	LibPDF: Make truetype fonts marked as symbol fonts actually work Turns out the spec didn't mean that the whole range is populated, but that one of these ranges is populated. So take the argmax. As fallout, explicitly mark the Liberation fonts as nonsymbolic when we use them for the 14 standard fonts. Else, we'd regress "PostScrõpt", since the Liberation fonts would otherwise go down the "is symbolic or doesn't have explicit encoding" codepath, since the standard fonts usually don't have an explicit encoding. As a fallout from _that_, since the 14 standard fonts now go down the regular truetype rendering path, and since we don't implement lookup by postscript name yet, glyphs not present in Liberation now cause text to stop rendering with a diag, instead of rendering a "glyph not found" symbol. That isn't super common, only an additional 4 files appear for the "'post' table not yet implemented" diag. Since we'll implement that soon, this seems fine until then.	2024-03-07 11:29:47 -05:00
Lucas CHOLLET	be5e7a360f	LibGfx/CCITT: Add support for images with an unknown number of lines	2024-03-07 11:07:20 -05:00
Kyle Lanmon	a099d0e140	PDFViewer: Hide the rendering diagnostics window by default You can enable it in the Debug menu and we will remember your choice.	2024-03-04 10:43:41 +01:00
Nico Weber	c6b484a728	LibPDF: Make image creation in Renderer::load_image() fallible	2024-03-03 11:18:37 -05:00
Nico Weber	9bff8abcc7	LibPDF: Add support for array image masks An array image mask contains a min/max range for each channel, and if each channel of a given pixel is in that channel's range, that pixel is masked out (i.e. transparent). (It's similar to having a single color or palette index be transparent, but it supports a range of transparent colors if desired.) What makes this a bit awkward is that the range is relative to the origin bits per pixel and the inputs to the image's color space. So an indexed (palettized) image with 4bpp has a 2-element mask array where both entries are between 0 and 15. We currently apply masks after converting images to a Gfx::Bitmap, that is after converting to 8bpp sRGB. And we do this by mapping everything to 8bpp very early on in load_image(). This leaves us with a bunch of options that are all a bit awkward: 1. Make load_image() store the up- (or for 16bpp inputs, down-) sampled-to-8bpp pixel data. And also return if we expanded the pixel range while resampling (for color values) or not (for palettized images). Then, when applying the image filter, resample the array bounds in exactly the same way. This requires passing around more stuff. 2. Like 1, but pass in the mask array to load_image() and apply the mask right there and then. This means we'd apply mask arrays at a different time than other masks. 3. Make the function that computes the mask from the mask array work from the original, unprocessed image data. This is the most local change, but probably also requires the largest amount of code (in return, the color mask for 16bpp images is precise, in addition that it separates concerns the most nicely). This goes with 3 for now.	2024-03-03 11:18:37 -05:00
Nico Weber	98e272ce15	LibPDF: Silently ignore BX / EX operators See the added comment for reasoning.	2024-03-02 17:43:53 -05:00
Nico Weber	9e502dcfe4	LibPDF: Honor writing mode in TJ operator as well	2024-03-02 12:25:09 +01:00
Nico Weber	c69797fda9	LibPDF: Implement support for vertical text for Type0 For Identity-V only for now.	2024-03-02 12:25:09 +01:00
Nico Weber	6348a857ea	LibPDF: Prepare for more encodings than just Identity-H in Type0 code Introduces CIDIterator, an iterator type for iterating over CIDs. Also introduces Type0CMap which can return a CIDIterator given some bytes. The existing code of treating the bytes as an identity map of big-endian u16s is now implemented in IdentityType0CMap. No behavior change.	2024-03-02 12:25:09 +01:00
Nico Weber	b9a4689af3	LibPDF: In Type0Font, read metrics /DW2 and /W2 for vertical text Not used for anything yet.	2024-03-02 12:25:09 +01:00
Nico Weber	ef5d7b685d	LibPDF: In Type0::initialize(), move variable increment next to cause No behavior change.	2024-03-02 12:25:09 +01:00
Nico Weber	fc9b2440bd	LibPDF: Add some spec comments in Type0Font::initialize()	2024-03-02 12:25:09 +01:00
Nico Weber	004e47df88	LibPDF: Remove minor duplication in Renderer::text_show_string_array() This "regressed" in #10080 (back then, the branches were smaller). No behavior change.	2024-03-02 12:25:09 +01:00
Hendiadyoin1	773a280bdf	LibPDF: Use a struct for the subsection in parse_xref_stream	2024-03-01 14:05:53 -07:00
Hendiadyoin1	fe0fde2154	Userland+Tests: Remove unused <AK/Tuple.h> includes	2024-03-01 14:05:53 -07:00
Nico Weber	c3980eda9e	LibPDF: Give Type0 CIDFontType2 a ScaledFont instead of a Font ...with the same reasoning as the previous commit. No behavior change.	2024-03-01 17:56:59 +01:00
Nico Weber	f374ad50a1	LibPDF: Give TrueTypePainter a ScaledFont instead of a Font This will allow us to get at the font's glyphs as paths, which will eventually enable us to implement glyph rotation. We'll have to do our own caching then, but we can then hopefully share the caching across the Type0 / Type1 / TrueType codepaths. It also gives us access to a font's glyphs by glyph id, which will help us implementing looking up glyph ids by postscript name. (Else we'd have to plumb through a whole Painter::draw_glyph_by_postscript_name() API just for LibPDF.) No behavior change.	2024-03-01 17:56:59 +01:00
Nico Weber	5dad8b693e	LibPDF: Make PDFFont::replacement_for() return a ScaledFont We only want to load non-bitmap fallback fonts as PDF fallback fonts, so let's make the return type represent that. No behavior change.	2024-03-01 17:56:59 +01:00
Nico Weber	2bbdfe0fba	LibPDF: Treat "Oblique" as italic indicator The standard 14 fonts include e.g. "CourierBoldOblique" and "HelveticaOblique". Let's map them to italic fonts :^)	2024-03-01 14:17:42 +01:00
Nico Weber	8e3c54f203	LibPDF: Implement ZapfDingbats clause of the adobe glyph list algorithm Liberation Sans still doesn't have the vast majority of the Zapf Dingbats glyphs, but now we map the Zapf Dingbats names to good unicode values. So we only need to use a different font and all should work. (And Liberation Sans has _some_ of the glyphs, like 13 of the 223.) And we now render empty squares instead of wrong glyphs for the ones we don't have. I haven't seen any PDFs using ZapfDingbats in the wild, but they probably exist somewhere. (Tests/LibPDF/standard-14-fonts.pdf is a synthetic PDF using it.)	2024-03-01 14:17:42 +01:00
Nico Weber	2eb099aabe	LibPDF: Implement some of the AdobeGlyphList algorithm Turns out there's a spec that goes with the table. The big change here is that we can now map `uni1234` to 0x1234 and `u123456` to 0x123456. The parts where we split a name on `_` and map each component and the part where we're supposed to allow multiple groups of 4 after `uni` aren't implemented yet. The ZapfDingbats lookup is also still missing. I haven't seen this have an effect in practice, but it's easy to construct a PDF with a custom encoding where it would make a difference.	2024-03-01 14:17:42 +01:00
Nico Weber	9aa31157d5	LibPDF: Use right encoding for standard fonts Symbol and ZapfDingbats We use Liberation Sans for the actual glyph for these, and that's missing some (Symbol) / all (ZapfDingbats) of the glyphs we need for these two standard fonts (...or at least the mapping from name to glyph, not sure). But still, better rendering squares than completely incorrect glyphs. Our code deciding what to do when a value isn't found in an encoding, or when the name doesn't map to a glyph, also needs work, but that's mostly independent of this change. I think this is a nice small standalone progression.	2024-02-27 17:42:08 -05:00
Nico Weber	76105d5d7f	LibPDF: Resize images to the larger of image and mask dimensions Makes text show up on 0000646.pdf pages 87-92, which for some reason renders all text using 2x2 images with huge masks that contain rendered text outlines.	2024-02-27 17:39:13 -05:00
Nico Weber	472bc367d3	LibPDF: Do not have redundant variables for image size This way, the size of the bitmap cannot become out of sync with these variables. No behavior change.	2024-02-27 17:39:13 -05:00
Nico Weber	83d29b3e45	LibPDF: Hack around a FIXME in TrueTypePainter::get_glyph_width() This will need further thought once we implement support for the truetype 'post' table, but for now it's correct most of the time, and better than not doing it.	2024-02-27 07:02:27 +01:00

683 Commits