The removed code was not tested, and was probably not useful anymore.
This fix doesn’t fix the problem with soft hyphens put at the end of words.
These soft hyphens should not be replaced by real hyphens.
Fix #1878.
This commit uses an "option" dictionary to store various API options that were
used as arguments in many public and private functions. This change makes it
easy to document default values, reduces the number of arguments, and avoids
many repetitions in documentation and signatures.
The changes to the public API are minimal, and should only have an impact for
users who passed unnamed arguments.
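
A minimal sketch of the option-dictionary pattern, for illustration only. The
option names, `render` and `_write` are hypothetical and are not the project's
actual API; the point is that defaults live in one place and helpers receive a
single dictionary.

```python
# Hypothetical defaults; these are not the project's real option names.
DEFAULT_OPTIONS = {
    'zoom': 1,
    'compress': False,
    'attachments': None,
}


def render(document, **options):
    """Render a document, reading every setting from one options dict."""
    unknown = set(options) - set(DEFAULT_OPTIONS)
    if unknown:
        raise TypeError(f'unknown options: {sorted(unknown)}')
    # Later entries win, so user values override the defaults.
    merged = {**DEFAULT_OPTIONS, **options}
    return _write(document, merged)


def _write(document, options):
    # Private helpers receive the whole dictionary instead of a long
    # list of positional arguments.
    return f'{document} zoom={options["zoom"]} compress={options["compress"]}'
```

With this shape, callers that used keyword arguments keep working, while
callers that relied on argument order have to name their arguments.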
Even if it can be an important feature for some users, the fact that nobody
ever complained means that it’s not useful for the majority of users. The
option is available but disabled by default.
This feature compresses PDF streams (as was already the case) and asks pydyf
to use a compact PDF structure with compressed object streams and
cross-reference objects (for PDF version >= 1.5).
In version 57.x, we used to use two different functions to generate abstract
bookmarks (whose coordinates origin is the top-left corner) and PDF
bookmarks (whose coordinates origin is the bottom-left corner). Now that we
only use one function, Document.make_bookmark_tree, we have to give the scale
and the matrix parameters so that we can retrieve both depending on the
context.
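
As a rough sketch of why a scale and a matrix are enough to get both kinds of
coordinates, assuming an affine matrix stored as (a, b, c, d, e, f). The
helpers `apply_matrix` and `pdf_matrix` are hypothetical and are not part of
Document.make_bookmark_tree.

```python
def apply_matrix(matrix, x, y):
    """Apply an affine matrix (a, b, c, d, e, f) to a point."""
    a, b, c, d, e, f = matrix
    return a * x + c * y + e, b * x + d * y + f


def pdf_matrix(page_height, scale):
    """Scale and flip the y axis so that the origin is at the bottom-left."""
    return (scale, 0, 0, -scale, 0, page_height * scale)


IDENTITY = (1, 0, 0, 1, 0, 0)

# A bookmark 100 units from the left and 200 from the top of a page that
# is 1000 units high, with an assumed scale of 0.75.
abstract = apply_matrix(IDENTITY, 100, 200)
pdf = apply_matrix(pdf_matrix(1000, 0.75), 100, 200)
```

The same tree-walking code can then serve both contexts: the identity matrix
keeps top-left coordinates, and the flipped matrix yields bottom-left ones.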
Fix #1815.
This code is not handled correctly. This commit avoids the crash, but it
doesn’t give a correct result. We have two different problems to solve:
- floats in floats are broken;
- split out-of-flow elements in split out-of-flow elements are broken.
Fix #1807 (but doesn’t give the right result).
Setting the font size also sets the font family, and that’s useful when we
have different fonts in the same text block. This happens, for example, when
characters are missing from a font and are replaced by a fallback font’s
characters.
Fix #1748.
We were already trying to get the correct top margin of the child, but we
didn’t include subtleties introduced by unforced page breaks. The logic is
still duplicated, as it already was, but at least the copy-paste is now done
correctly.
Fix #1058.
The bug in the code was caused by the page_is_empty variable. It’s useless to
test this variable here because we know that we rendered at least the current
line. It’s not updated earlier, but if it were it would always be True.
This commit also fixes minor bugs in the tests.
Fix #1674.
Metrics and float rounding seem to be slightly different on different
platforms. Using a lower line-height value should fix the problem everywhere.
Fix #1677.
The previous detection was failing when some cells were empty (or fully
rendered) and when the current rendering position was not overflowing the page.
In this case, the line with empty cells was rendered and was visible for
example when cells had padding.
Removing placeholders is something we have to do when we must discard a part of
the layout that has already been done. It currently removes absolutely
positioned placeholders and footnotes.
The same has to be done with split floats, and that’s what this commit is
about. We have to change the structure of the broken_out_of_flow attribute, as
we need a reliable way to reach the saved data from the real element in
the tree. The new structure is thus a dictionary whose keys are the boxes in
the tree (placeholders for absolutes, partial elements for floats) and whose
values are the original_box+parent+resume_at.
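
A small sketch of that dictionary shape, under stated assumptions: `Box` is a
stand-in class, `discard_saved_data` is a hypothetical helper, and the
resume_at value is invented for the example.

```python
class Box:
    """Stand-in for a layout box; instances are hashable by identity."""

    def __init__(self, name):
        self.name = name


# Keys: boxes actually present in the rendered tree.
# Values: (original_box, parent, resume_at).
broken_out_of_flow = {}

original_float = Box('float')
partial_float = Box('float, first part')
parent = Box('parent')
resume_at = {1: None}  # invented value, only here to fill the tuple

broken_out_of_flow[partial_float] = (original_float, parent, resume_at)


def discard_saved_data(boxes, broken_out_of_flow):
    """Forget saved data for boxes found in a discarded part of the layout."""
    for box in boxes:
        # Boxes without saved data are simply ignored.
        broken_out_of_flow.pop(box, None)
```

Keying on the box that is really in the tree means the cleanup code can look
up saved data directly while it walks the discarded boxes.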
Fix #1669.
Fix #1666.
Note that the added test doesn’t test this problem, because the "hash" function
is stable when we use the same process. If someone finds a nice way to solve
this…
Before this commit, when we had to break previously rendered waiting children
in a line, we re-rendered the already-rendered children. That was bad, because
the tree of the children that were already rendered was different from the
original tree, which may include leading spaces for example. It also required
a complex mix of the original resume-at with the original skip-stack.
This new solution includes both the original child (to render it from a fresh
start) and the already-rendered child (to get the previous positions). The
skip-stack mix is now useless, removing a lot of complex code.
This change is scary and will probably break some corner cases. But tests pass,
and a new one has been added to avoid future regressions.
Fix #1638.
Previously, if a footnote triggered an overflow (that is, it was too
large for the containing space), but footnote-policy didn't make us push
down the line containing it, any subsequent footnotes in the same
linebox would not be set into a footnote area, because they would never
be passed to layout_footnote and report_footnote. (Their call sites
would be set, but their content would not be.)
This also fixes a potential infinite loop where using footnote-policy
could have forced the first line of a page to be pushed to the next
page: that would just result in an infinite loop, so instead we set the
line and move on if we are on the first line of a page. (This behavior
is not specified in GCPM, but no other behavior seems practical: the
only alternative would be to expand the page, which is almost certainly
less desirable.)
The text-decoration-* properties are not inherited anymore. Instead, they are
propagated according to specific rules.
This commit removes the inheritance, but it doesn’t follow all the rules
defined by the specification. The new behavior is not perfect, but it should be
at least better than the previous code and opens the possibility to fix this
issue correctly in the future.
Failing tests have been added.
Fix #1621.
The default "2" value for orphans and widows is really useful for real
documents, but it can be very disturbing for tests and introduce false
negatives and false positives.
The primary change here is that the column calculation algorithm now
attempts to render columns as if they were the full remaining height on
the page, rather than to render the entire content regardless of how
long the page is.
The worst-case behavior is effectively the same as that of the previous
algorithm (as at least one additional pass would be required to
determine how high balanced columns should be, but this applies in both
cases), but if content would not fit on the page, this can bail and set
the page immediately, rather than continuing significantly more
calculations.
As an associated benefit, this closes #1020, as we now handle multiple
column breaks (which would force a page break) correctly. A test for
this behavior is included.
Previously, this would throw an error because absolute_layout hadn't had
a chance to lay out the boxes. Now we do the block level layout for
footnotes *before* doing layout for absolute boxes (or fixed boxes) and
add any positioned boxes to positioned_boxes so they can be put onto the
page correctly.
When a box would break over the edge of a page, its height is extended
to the bottom of that page (per
https://www.w3.org/TR/css-break-3/#box-splitting , primarily to allow
backgrounds and borders to continue to the end of the page).
When this happened, sometimes the values that would be calculated for
the height of the extended element would be rounded *over* the
calculated height that remained on the page, forcing the entire
containing box to wrap to the next page.
Rather than trying to carefully manage the order of operations to try to
be safe in IEEE floats for directions, we apply a small "fudge factor":
if an element fits very nearly (within a thousandth of a pixel) into the
remaining space, it is still accepted.
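
The tolerance check can be sketched as follows. The `fits` helper and the
`FUDGE` constant are hypothetical names; only the "thousandth of a pixel"
value comes from the description above.

```python
# Tolerance in pixels: a thousandth of a pixel.
FUDGE = 1e-3


def fits(needed_height, remaining_height, tolerance=FUDGE):
    """Return True if the box fits, allowing a tiny rounding overshoot."""
    return needed_height <= remaining_height + tolerance
```

A strict `needed_height <= remaining_height` comparison would reject a box
whose computed height overshoots by a rounding error, which is the case that
pushed whole containers to the next page.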
This fixes a bug accidentally introduced in #1566, where we would try to
unlayout a footnote that had not yet been laid out, if the algorithm
decided that there would be insufficient space on a page before laying
out a further footnote.
If a footnote has been output but we then decide to cancel the line it
was output on, we should cancel the output of the footnote itself.
This requires a bit of extra bookkeeping, so we know which footnotes are
relevant, but otherwise largely works using the existing
remove_placeholders code.
Includes a minor refactor of footnote area management in LayoutContext,
and a new testing utility: `tree_position`.
Closes #1564.
This commit:
- fixes the trailing space detection, by handling all trailing spacing
characters that could be ignored by Pango’s line break algorithm;
- tries harder to break waiting children when a line break occurs in an
inline block that can’t be separated from the previous one.
Fix #1562.
Previously, if `overflow-wrap: anywhere` (or `break-word`) was set on an
inline element (like an `<a>` or `<span>`) whose first word occurred
towards the end of a line, it would have broken that word, even if doing
so wasn't necessary (and wouldn't have happened were it not for the
wrapping element).
This pipes through an `is_line_start` argument to `split_text_box` that
only allows the break on a word if it's the first word in a line.
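
The rule can be summarized by a small predicate. This is only a sketch: the
function `can_break_inside_word` is hypothetical and is not how
`split_text_box` is written.

```python
def can_break_inside_word(overflow_wrap, is_line_start):
    """Tell whether an overflowing word may be split in the middle.

    Splitting is only allowed when the word starts the line: otherwise
    the whole word can first be moved to the next line.
    """
    return overflow_wrap in ('anywhere', 'break-word') and is_line_start
```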
When we calculate the minimum width of an inline block, the size of the
trailing space is already removed by split_first_line. There’s no need to
remove it twice.
We should probably fix split_first_line to remove the trailing space only when
it’s been asked to. But there’s no obvious situation when we want the minimum
width to include trailing spaces, as the minimum size requires line breaks
everywhere, including after each space.
At least, this commit doesn’t remove trailing spaces twice.
Related to #1520.
This supports the `overflow-wrap` value `anywhere`: `anywhere` is like
`break-word`, but the soft breaks it allows *are* considered when
calculating min-content intrinsic sizes.
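
To illustrate the difference on min-content sizes, here is a simplified
sketch that assumes every character has the same width. The function
`min_content_width` is hypothetical and ignores real text shaping.

```python
def min_content_width(text, overflow_wrap, char_width=10):
    """Approximate the min-content width of a text, in pixels."""
    words = text.split()
    if not words:
        return 0
    if overflow_wrap == 'anywhere':
        # Soft breaks inside words count, so the widest unbreakable
        # unit is a single character.
        return char_width
    # With 'normal' and 'break-word', breaks inside words are not
    # considered for sizing, so the widest word sets the minimum.
    return max(len(word) for word in words) * char_width
```

For the text "hello wonderful", this gives a single character width with
`anywhere`, and the width of "wonderful" with `break-word`.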