At some point, Unicode regional indicators became combining chars in the
Unicode standard, which broke the handling of them in draw_codepoint().
The fix has the added advantage of improving performance in the common
case, since only the combining-char check runs for every character. The
flag check happens only if that first check matches.
Fixes #4407
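A minimal sketch of the check ordering described above, in Python rather than
kitty's C. The names classify(), is_combining() and is_regional_indicator() are
hypothetical, and is_combining() is only a stand-in for the real lookup table:
it is assumed to report regional indicators as combining, as the message says.

```python
import unicodedata

REGIONAL_INDICATOR_FIRST = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A
REGIONAL_INDICATOR_LAST = 0x1F1FF   # REGIONAL INDICATOR SYMBOL LETTER Z


def is_regional_indicator(ch: str) -> bool:
    return REGIONAL_INDICATOR_FIRST <= ord(ch) <= REGIONAL_INDICATOR_LAST


def is_combining(ch: str) -> bool:
    # Hypothetical stand-in for the terminal's combining-char table.
    # Assumption: the table also matches regional indicators.
    return unicodedata.category(ch) in ("Mn", "Me") or is_regional_indicator(ch)


def classify(ch: str) -> str:
    # Common case: a single check, then return immediately.
    if not is_combining(ch):
        return "normal"
    # The flag check runs only after the combining check has matched.
    if is_regional_indicator(ch):
        return "flag"
    return "combining"
```

With this ordering, ordinary text such as "a" pays for one lookup only, while
U+0301 and U+1F1FA take the slower path and are told apart there.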
Also roundtrip all characters in the Cf category.
Characters with the DI (Default Ignorable) property are now
preserved but not rendered, and are treated as zero-width,
as per the Unicode standard.
See https://www.unicode.org/faq/unsup_char.html
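A rough illustration of "preserved but not rendered and zero-width", as a
Python sketch. Python's unicodedata module does not expose the Default
Ignorable property, so the set below is a small hand-picked sample, not the
full list, and display_width() assumes every other character is one cell wide.
All names here are hypothetical.

```python
# A few code points that carry the Default_Ignorable_Code_Point property.
# This is a sample for illustration only, not the complete set.
DEFAULT_IGNORABLE_SAMPLE = frozenset({
    0x00AD,  # SOFT HYPHEN
    0x200B,  # ZERO WIDTH SPACE
    0x200C,  # ZERO WIDTH NON-JOINER
    0x200D,  # ZERO WIDTH JOINER
    0x2060,  # WORD JOINER
    0xFEFF,  # ZERO WIDTH NO-BREAK SPACE
})


def is_default_ignorable(ch: str) -> bool:
    return ord(ch) in DEFAULT_IGNORABLE_SAMPLE


def display_width(text: str) -> int:
    # Simplification: every non-ignorable character counts as one cell.
    return sum(0 if is_default_ignorable(ch) else 1 for ch in text)


def rendered(text: str) -> str:
    # What gets drawn: ignorable characters are skipped.
    return "".join(ch for ch in text if not is_default_ignorable(ch))
```

The stored text is left untouched, so copying it back out round-trips the
ignorable characters even though they occupy no cells on screen.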
A better solution from an ecosystem perspective is to just work with the
original protocol. I have modified kitty's escape parser to special-case
OSC 52 handling without changing its max escape code size.
Basically, it works by splitting up OSC 52 escape codes longer than the
max size into a series of partial OSC 52 escape codes. These get
dispatched to the UI layer, which accumulates them up to the 8MB limit
and then sends the result to the clipboard when the partial sequence ends.
See https://github.com/ranger/ranger/issues/1861
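The split-and-accumulate flow can be sketched as follows. This is a Python
model, not kitty's parser: split_payload(), ClipboardAccumulator and the tiny
chunk size used in the usage note are hypothetical, the payload is treated as
opaque bytes, and dropping data past the limit is an assumption about what
"up to the 8MB limit" means.

```python
CLIPBOARD_LIMIT = 8 * 1024 * 1024  # the 8MB limit from the message


def split_payload(payload: bytes, max_size: int):
    """Yield (chunk, is_last) pairs, each chunk at most max_size bytes."""
    if not payload:
        yield b"", True
        return
    for start in range(0, len(payload), max_size):
        chunk = payload[start:start + max_size]
        yield chunk, start + max_size >= len(payload)


class ClipboardAccumulator:
    """Stand-in for the UI layer that collects partial OSC 52 payloads."""

    def __init__(self, limit: int = CLIPBOARD_LIMIT) -> None:
        self.limit = limit
        self.pending = bytearray()
        self.clipboard = None  # set once a partial sequence ends

    def on_partial(self, chunk: bytes, is_last: bool) -> None:
        room = self.limit - len(self.pending)
        if room > 0:
            # Assumption: anything beyond the limit is silently dropped.
            self.pending += chunk[:room]
        if is_last:
            self.clipboard = bytes(self.pending)
            self.pending.clear()
```

Feeding every pair from split_payload(data, 16) into on_partial() leaves the
original data in clipboard, as long as it fits within the limit.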
There are two user-visible changes in here:
1. If a scroll region is set such that there is a bottom margin and no
top margin, scrolling the region forward used to discard the top
lines. Now those lines are appended to the scrollback.
```shell
# Assuming a terminal window with 24 lines.
printf '\033[H\033[J\033[3J\033[0;5r' && seq 100
```
This command used to result in an empty scrollback. Now it contains
the numbers 1-96.
2. If a scroll region is set such that there is a top margin and no bottom
margin, scrolling the region forward used to append the top lines to
the scrollback. Now these lines are discarded.
```shell
# Assuming a terminal window with 24 lines.
printf '\033[H\033[J\033[3J\033[2;24r' && seq 100
```
This command used to populate scrollback with the numbers 2-78. Now
the scrollback is empty. The numbers on the screen are the same as
before: 1 and 79-100.
Related issue: #3113.
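The two cases above can be reproduced with a small Python model of a screen
with a scroll region. It is a sketch of the new behavior as described, not
kitty's code: the Screen class and run_seq() are hypothetical, and the rule
"a scrolled-off line goes to scrollback only when the region starts at the
first screen line" is an assumption that happens to match both examples.

```python
class Screen:
    def __init__(self, rows: int, top: int, bottom: int) -> None:
        self.lines = [""] * rows
        self.top = top        # first line of the scroll region (0-based)
        self.bottom = bottom  # last line of the scroll region (0-based)
        self.row = 0          # setting the margins homes the cursor
        self.scrollback = []

    def scroll_region_forward(self) -> None:
        removed = self.lines[self.top]
        region = self.lines[self.top + 1:self.bottom + 1] + [""]
        self.lines[self.top:self.bottom + 1] = region
        # Assumed rule: keep the line only if there is no top margin.
        if self.top == 0:
            self.scrollback.append(removed)

    def print_line(self, text: str) -> None:
        self.lines[self.row] = text
        if self.row == self.bottom:
            self.scroll_region_forward()
        elif self.row < len(self.lines) - 1:
            self.row += 1


def run_seq(rows: int, top: int, bottom: int, count: int) -> Screen:
    """Model `seq count` printed into a screen with the given region."""
    screen = Screen(rows, top, bottom)
    for n in range(1, count + 1):
        screen.print_line(str(n))
    return screen
```

Here run_seq(24, 0, 4, 100) models the first command and leaves 1-96 in the
scrollback, while run_seq(24, 1, 23, 100) models the second and leaves the
scrollback empty with 1 and 79-100 on screen.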
Needed for output of hyperlinks. It is also more efficient, since it
avoids a malloc per line. Also fix pagerhist not having an SGR reset at
the start of every line.
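To illustrate the SGR reset part, here is a small Python sketch. The function
append_line() is hypothetical, and the single shared bytearray only loosely
mirrors the idea of not allocating per line; it says nothing about how the C
code manages its buffer.

```python
SGR_RESET = b"\x1b[m"  # SGR with no parameters resets all attributes


def append_line(buf: bytearray, line: bytes) -> None:
    # Start every line from a clean state so that attributes left
    # active at the end of the previous line cannot leak into this one.
    buf += SGR_RESET
    buf += line
    buf += b"\n"
```

For example, appending b"\x1b[31mred" followed by b"plain" to the same buffer
keeps the red foreground from bleeding into the second line when the history
is shown in a pager.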