LibPDF: Handle CFF fonts with charset format 0 and > 255 glyphs better

We used to use a u8 as loop counter, which would overflow
if there were more than 255 glyphs, producing hundreds of megabytes
of

    Couldn't find string for SID x, going with space

output in the process, while all data until the end of the CFF
section got interpreted as SIDs, until a try_read() would finally
fail.
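
A minimal standalone sketch of the failure mode, not LibPDF code: the
helper sids_read() and its sids_available parameter are hypothetical
stand-ins for the reader, and the break stands in for try_read()
failing at the end of the data. It assumes glyph_count >= 1.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: counts how many SIDs the charset loop would read
// before it either terminates normally or runs out of data.
// Counter is the type of the loop variable (u8 before the fix, size_t after).
template<typename Counter>
std::size_t sids_read(std::size_t glyph_count, std::size_t sids_available)
{
    std::size_t reads = 0;
    // Same loop shape as in parse_charset(). With an 8-bit counter, i wraps
    // from 255 back to 0, so the condition stays true whenever
    // glyph_count - 1 > 255.
    for (Counter i = 0; i < glyph_count - 1; i++) {
        if (reads == sids_available)
            break; // stands in for try_read() failing at end of data
        ++reads;
    }
    return reads;
}
```

With 300 glyphs and 1000 readable SIDs, sids_read<std::uint8_t>(300, 1000)
consumes all 1000 entries, while sids_read<std::size_t>(300, 1000) stops
after the expected 299.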

We now no longer fail miserably trying to render page 2 of
0000352.pdf of 0000.zip from the pdfa dataset.

Fixes just one crash of the larger 500-document test set, but
when I tweak test_pdf.py to print all stacks instead of just the
top 5, it no longer produces 260 MB of output.
Nico Weber 2023-10-22 22:10:49 -04:00 committed by Tim Flynn
parent 0869ca5615
commit 3197f0cab6
Notes: sideshowbarker 2024-07-17 21:16:31 +09:00

@@ -742,7 +742,7 @@ PDFErrorOr<Vector<DeprecatedFlyString>> CFF::parse_charset(Reader&& reader, size
     if (format == 0) {
         // CFF spec, "Table 17 Format 0"
         dbgln_if(CFF_DEBUG, "CFF charset format 0");
-        for (u8 i = 0; i < glyph_count - 1; i++) {
+        for (size_t i = 0; i < glyph_count - 1; i++) {
             SID sid = TRY(reader.try_read<BigEndian<SID>>());
             TRY(names.try_append(resolve_sid(sid, strings)));
         }