That way, we can write 0 instead of 8 bits for every alpha byte.
This reduces the size of sunset-retro.png when saved as a webp file
from 3 MiB to 2.25 MiB, without affecting encode speed.
Once we use CanonicalCodes we'll get this for free for all channels,
but opaque images are common enough that it feels worth it to do this
before then.
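As a rough sketch, the opacity scan this relies on can look like the
following (assuming a Gfx::Bitmap-style get_pixel() accessor; the
actual encoder code may differ):

    #include <LibGfx/Bitmap.h>

    static bool is_fully_opaque(Gfx::Bitmap const& bitmap)
    {
        for (int y = 0; y < bitmap.height(); ++y) {
            for (int x = 0; x < bitmap.width(); ++x) {
                if (bitmap.get_pixel(x, y).alpha() != 0xff)
                    return false;
            }
        }
        // Every alpha byte is 0xff, so the channel carries no information.
        return true;
    }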
This doesn't use any transforms yet (in particular not the predictor
transform), and doesn't do anything else that actually compresses the
data.
It also gives all 256 values code length 8 unconditionally. This means
the Huffman trees are not data-dependent at all and provide no
compression. It also means we can just write out the image data
unmodified.
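Why fixed length 8 lets us copy bytes through: canonical prefix codes
assign codes in increasing symbol order within each code length, so
with all 256 symbols at length 8, symbol i just gets code i. A sketch
of that construction (not the actual encoder code):

    #include <AK/Array.h>
    #include <AK/Types.h>

    static Array<u16, 256> canonical_codes_for_uniform_length_8()
    {
        Array<u16, 256> codes {};
        u16 next_code = 0;
        // All lengths are equal, so the canonical codes just count up.
        for (size_t symbol = 0; symbol < 256; ++symbol)
            codes[symbol] = next_code++;
        return codes; // codes[i] == i: writing the code is writing the byte
    }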
So the output is fairly large. But it _is_ a valid lossless webp file.
Possible follow-ups, to improve compression later:
1. Use actual byte distributions to create Huffman trees, to get
Huffman compression.
2. If the distribution has just 1 element, write a simple code length
code (that way, images that have alpha 0xff everywhere need to store
no data for alpha; see the sketch after this list).
3. Add backref support, to get full deflate(ish) compression of pixels.
4. Add predictor transform.
5. Consider writing different sets of prefix codes for every 16x16 tile
(by writing a meta prefix code image).
(It might be nice to make the follow-ups optional, so that this can also
be used as a webp example file generator.)
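A sketch of the check follow-up 2 needs; the helper name and histogram
type are made up for illustration:

    #include <AK/Array.h>
    #include <AK/Optional.h>
    #include <AK/Types.h>

    static Optional<u8> sole_symbol(Array<u32, 256> const& histogram)
    {
        Optional<u8> symbol;
        for (size_t i = 0; i < histogram.size(); ++i) {
            if (histogram[i] == 0)
                continue;
            if (symbol.has_value())
                return {}; // More than one distinct value; needs a real code.
            symbol = static_cast<u8>(i);
        }
        return symbol; // e.g. 0xff for a fully opaque alpha channel
    }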
...and add a test case that shows why it's incorrect.
If one dimension is 2^n + 1 and the other is just 1, then the
topmost node will have 2^n x 1 and 1 x 1 children. The first child will
have n levels of children. The 1 x 1 child could end immediately, or it
could require that it also has n levels of (all 1 x 1) children. The
spec isn't clear on which of the two alternatives should happen. We
currently have n levels of 1 x 1 blocks.
This test case shows that a VERIFY we had was incorrect, so remove it.
The alternative implementation is to keep the VERIFY and to add a
    if (x_count == 1 && y_count == 1)
        level = 0;
to the top of TagTreeNode::create(). Then we don't have multiple levels
of 1 x 1 nodes, and we need to read fewer bits.
The images in the spec suggest that all nodes should have the same
number of levels, so go with that interpretation for now. Once we can
actually decode images, we'll hopefully see which of the two
interpretations is correct.
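As a sketch, the level count under this interpretation (the function
name is made up; this is not the actual LibGfx code):

    #include <AK/Types.h>

    // Number of levels of children below a node covering x_count * y_count
    // leaves. With the interpretation above, even a 1 x 1 child of the root
    // has this many levels of 1 x 1 nodes below it.
    static u32 tag_tree_level_count(u32 x_count, u32 y_count)
    {
        u32 larger = x_count > y_count ? x_count : y_count;
        u32 levels = 0;
        while ((1u << levels) < larger)
            ++levels;
        return levels; // for (2^n + 1) x 1 this is n + 1
    }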
(The removed VERIFY() is hit when decoding
Tests/LibGfx/test-inputs/jpeg2000/buggie-gray.jpf in a local branch that
has some image decoding implemented. That file contains a packet with
1x3 code-blocks, which hits this case.)
Most JPEG2000 files put the codestream in an ISOBMFF box structure
(which is useful for including metadata that's bigger than the
~65k marker segment data limit, such as large ICC profiles), but
some files just store the codestream directly, for example
https://sembiance.com/fileFormatSamples/image/jpeg2000/balloon.j2c
See https://www.iana.org/assignments/media-types/image/j2c for the
MIME type.
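The two cases are easy to tell apart by their first bytes: a raw
codestream starts with the SOC (start of codestream) marker FF 4F,
while boxed files start with the JP2 signature box. A sketch (the
helper name is made up):

    #include <AK/Span.h>

    static bool looks_like_raw_codestream(ReadonlyBytes data)
    {
        // T.800 codestreams always begin with the SOC marker.
        return data.size() >= 2 && data[0] == 0xFF && data[1] == 0x4F;
    }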
The main motivation is to be able to use the test data in J.10 in
the spec as a test case.
These changes are compatible with clang-format 16 and will be mandatory
when we eventually bump the clang-format version. So, since there are
no real downsides, let's commit them now.
A tag tree is a data structure used for deserializing JPEG2000
packet headers.
We don't use them for anything yet, except in tests.
The implementation feels a bit awkward to me, but we can always polish
it later.
The spec thankfully includes two concrete examples. The code is
correct enough to pass those -- I added them as tests.
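For reference, here's a sketch of the decoding procedure the spec
describes, structured like other open-source decoders do it; the names
and the bit-reader interface are assumptions, not the actual LibGfx
API:

    #include <AK/NumericLimits.h>
    #include <AK/StdLibExtras.h>
    #include <AK/Types.h>
    #include <AK/Vector.h>

    struct Node {
        Node* parent { nullptr };
        u32 low { 0 };                           // smallest value not ruled out yet
        u32 value { NumericLimits<u32>::max() }; // final value, once known
    };

    // Returns true if the leaf's value is < threshold, reading bits as needed.
    template<typename ReadBit>
    static bool decode(Node& leaf, u32 threshold, ReadBit read_bit)
    {
        // Collect the path so we can walk from the root down to the leaf.
        Vector<Node*> path;
        for (auto* node = &leaf; node; node = node->parent)
            path.append(node);

        u32 low = 0;
        for (size_t i = path.size(); i-- > 0;) {
            auto& node = *path[i];
            low = max(low, node.low); // a node's value is >= its parent's
            while (low < threshold && low < node.value) {
                if (read_bit())
                    node.value = low; // a 1 bit pins the node's value to `low`
                else
                    ++low;            // a 0 bit means the value is larger
            }
            node.low = low;
        }
        return leaf.value < threshold;
    }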
For text, we always ended up with a leading \0 byte (on little-endian),
which prints as nothing. Since printing is the only thing we do with
this data, there's no actual behavior change.
We don't do anything with this (except log the contents if
JPEG2000_DEBUG is 1).
The motivation is that we now decode all marker segments that appear
in the JPEG2000 files I've seen so far, allowing us to make the
remaining unknown marker types fatal.
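A sketch of the shape this enables; the marker values are from T.800,
but the handler names and surrounding API are stand-ins, not the
actual decoder code:

    #include <AK/Error.h>
    #include <AK/Format.h>
    #include <AK/Span.h>
    #include <AK/StringView.h>
    #include <AK/Types.h>

    // Hypothetical per-marker handlers.
    static ErrorOr<void> read_siz(ReadonlyBytes);
    static ErrorOr<void> read_cod(ReadonlyBytes);
    static ErrorOr<void> read_qcd(ReadonlyBytes);

    static ErrorOr<void> handle_marker_segment(u16 marker, ReadonlyBytes data)
    {
        switch (marker) {
        case 0xFF51: return read_siz(data); // SIZ: image and tile size
        case 0xFF52: return read_cod(data); // COD: coding style default
        case 0xFF5C: return read_qcd(data); // QCD: quantization default
        case 0xFF64: // COM: comment
            dbgln_if(JPEG2000_DEBUG, "JPEG2000: Comment: {}", StringView { data });
            return {};
        // ... other marker segments seen in real files ...
        default:
            // Everything seen in real files is handled above, so unknown
            // markers can be a hard error now.
            return Error::from_string_literal("JPEG2000: Unknown marker segment");
        }
    }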