I noticed while scrolling `emoji-test.txt` that some of the combined
emoji sequences rendered very poorly. This was due to the unicode
width being reported as up to 4 columns in some cases.
Digging into it, I discovered that the unicode-width crate uses a
standard calculation that doesn't take emoji combining sequences
into account (see https://github.com/unicode-rs/unicode-width/issues/4).
This commit takes a dep on the xi-unicode crate as a lightweight way
to gain access to emoji tables and test whether a given grapheme is
part of a combining sequence of emoji.
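The width calculation then becomes something like the following
sketch (assuming xi_unicode's EmojiExt trait alongside the
unicode-width crate; a minimal sketch of the idea, not necessarily
the exact code in this commit):

    use unicode_width::UnicodeWidthStr;
    use xi_unicode::EmojiExt;

    // Compute the display width of a single grapheme cluster.
    // If any char in the cluster is an emoji modifier or modifier
    // base, treat the whole cluster as double wide rather than
    // summing the per-scalar widths reported by unicode-width
    // (which is how a sequence can end up 4 columns wide).
    fn grapheme_column_width(s: &str) -> usize {
        for c in s.chars() {
            if c.is_emoji_modifier_base() || c.is_emoji_modifier() {
                return 2;
            }
        }
        UnicodeWidthStr::width(s)
    }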
I've noticed this off and on for a while, and thought it was something
fishy with my shell dotfiles.
Tracing through, I found that the final byte of the "Face with head
bandage" emoji 🤕 U+1F915 (0x95 in its UTF-8 encoding) was being
interpreted as the C1 MW control code, causing the vt parser to jump
out of the OSC state.
The fix is to hook up proper UTF-8 processing in the OSC state, in
the same way that it is already applied in the ground state.
Since we don't have enough bits to introduce new state values (we're
pretty tightly packed in the 16 bits available), I've introduced a
memory of the state to which the UTF-8 parser needs to return once
a complete sequence is detected.
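As a rough sketch of the shape of that change (hypothetical names;
the real parser drives its state from packed transition tables
rather than a plain enum):

    #[derive(Clone, Copy, PartialEq)]
    enum State {
        Ground,
        OscString,
        Utf8, // collecting the rest of a multi-byte sequence
        // ... other states elided
    }

    struct Parser {
        state: State,
        // The state to resume once the UTF-8 decoder yields a
        // complete (or invalid) sequence; remembered explicitly
        // because there are no spare bits to encode a distinct
        // "utf8 within OSC" state value.
        utf8_return_state: State,
        // ... utf8 decoder and OSC buffer elided
    }

    impl Parser {
        fn enter_utf8(&mut self, _first_byte: u8) {
            self.utf8_return_state = self.state;
            self.state = State::Utf8;
            // ... feed _first_byte into the utf8 decoder
        }

        fn utf8_complete(&mut self, _decoded: char) {
            match self.utf8_return_state {
                State::Ground => { /* emit _decoded as a printable */ }
                State::OscString => { /* append _decoded to the OSC buffer */ }
                _ => {}
            }
            self.state = self.utf8_return_state;
        }
    }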
This enables using large OSC buffers in a form that we can publish
to crates.io without blocking on an external crate. Large OSC
buffers are important both for some tunnelling use cases and for
things like iTerm2 image protocol handling.
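For a sense of scale: the iTerm2 protocol transfers an entire image
as base64 data inside a single OSC 1337 sequence, so the payload for
even a modest PNG easily runs to hundreds of KB. A sketch of what
such a sequence looks like (argument list abbreviated):

    // Wrap base64-encoded PNG bytes in an iTerm2 inline-image OSC
    // sequence; `size` is the decoded file length in bytes. The
    // whole base64 payload has to fit in the parser's OSC buffer.
    fn iterm2_inline_image(b64_png: &str, size: usize) -> String {
        format!("\x1b]1337;File=inline=1;size={}:{}\x07", size, b64_png)
    }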
It's taking a while for https://github.com/jwilm/vte/pull/20 to get
merged, so point to my branch directly while I build out some
tunneled mux protocol escape sequences.
I'll need to fork vte on crates.io if vte doesn't merge the PR
before the next termwiz crate bump.
I wanted to see how much work remains to enable iTerm2 image
display; one of the blockers was a limit on the size of the
buffer in the vte crate, which has been removed in my fork
of vte.
As part of testing our ability to absorb that data, I found
a couple of issues with applying the image cells to the display,
so this commit also fixes those.
We still don't have code to connect the cell image data to the
OpenGL render layer.
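For reference, the per-cell image data has roughly this shape
(hypothetical names; a sketch of the idea rather than the final
structures):

    use std::sync::Arc;

    // The image bytes are shared by every cell the image spans,
    // so they live behind an Arc rather than being copied per cell.
    struct ImageData {
        data: Vec<u8>, // encoded image bytes, eg: PNG
    }

    #[derive(Clone)]
    struct ImageCell {
        image: Arc<ImageData>,
        // Texture coordinates (in 0.0..=1.0 texture space) selecting
        // the region of the image that this cell displays.
        top_left: (f32, f32),
        bottom_right: (f32, f32),
    }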