Add `VTParser::is_ground`, which is true if the VTParser is in the
"ground" state, i.e. it has no stored state that will affect the
interpretation of future input characters.
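
A minimal sketch of what "ground" means, using a toy state machine rather than the real VTParser. `ToyParser`, `State` and `advance` are hypothetical names invented for illustration; only the `is_ground` name comes from this commit.

```rust
/// Toy states; a real VT parser has many more.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum State {
    Ground,
    Escape,
    CsiEntry,
}

struct ToyParser {
    state: State,
}

impl ToyParser {
    fn new() -> Self {
        Self { state: State::Ground }
    }

    /// Feed one byte; only enough transitions to illustrate the idea.
    fn advance(&mut self, byte: u8) {
        self.state = match (self.state, byte) {
            // ESC always starts a new escape sequence.
            (_, 0x1b) => State::Escape,
            (State::Escape, b'[') => State::CsiEntry,
            // A final byte (0x40..=0x7e) ends a CSI sequence.
            (State::CsiEntry, 0x40..=0x7e) => State::Ground,
            // Parameter and intermediate bytes keep us inside the sequence.
            (State::CsiEntry, _) => State::CsiEntry,
            // Simplification: any other byte completes a two-byte escape.
            (State::Escape, _) => State::Ground,
            (State::Ground, _) => State::Ground,
        };
    }

    /// True when no partial sequence is pending.
    fn is_ground(&self) -> bool {
        self.state == State::Ground
    }
}
```

A caller could use such a check to decide whether it is safe to split or inject input at the current position.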
APC sequences were previously parsed but swallowed. This commit expands the transitions
to be able to track the APC start, data and end and then adds
an `apc_dispatch` method to allow capturing APC sequences.
APC sequences are used in the kitty image protocol.
refs: #986
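
To illustrate the start/data/end tracking, here is a rough standalone sketch that collects APC payloads from a byte slice. It is not the vtparse implementation: `collect_apc` is a hypothetical helper, and it only handles the 7-bit forms `ESC _` (APC) and `ESC \` (ST).

```rust
/// Collects the payload of every complete APC sequence (ESC _ ... ESC \).
fn collect_apc(input: &[u8]) -> Vec<Vec<u8>> {
    let mut out = Vec::new();
    // Some(..) while we are inside an APC sequence.
    let mut current: Option<Vec<u8>> = None;
    let mut i = 0;
    while i < input.len() {
        let byte = input[i];
        let next = input.get(i + 1).copied();
        if byte == 0x1b && next == Some(b'_') && current.is_none() {
            // APC start: begin capturing data.
            current = Some(Vec::new());
            i += 2;
        } else if byte == 0x1b && next == Some(b'\\') && current.is_some() {
            // APC end: hand the captured data over (the "dispatch" step).
            if let Some(data) = current.take() {
                out.push(data);
            }
            i += 2;
        } else {
            // APC data is captured; anything outside an APC is ignored here.
            if let Some(data) = current.as_mut() {
                data.push(byte);
            }
            i += 1;
        }
    }
    out
}
```

An unterminated sequence yields nothing in this sketch, since there is no end marker to trigger the dispatch.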
This commit removes the intermediates parameter and collapses it
together with the parameters themselves.
This allows us to model DECSET/DECRST (eg: `CSI ? 1 l`) correctly.
Previously this would get reported as:
```
params: [1],
intermediates: ['?'],
code: 'l'
```
but since the intermediates are logically things that precede the code,
the canonical interpretation of that would be as if we'd received
`CSI 1 ? l`.
AFAICT, DECSET isn't conforming to ECMA 48 when it comes to this
sequence.
That made things a bit of a headache in the CSI parser, so what we do
now is to treat intermediates as parameters so that it is much simpler
to reason about and match in the CSI parser; we now get:
```
params: ['?', 1],
code: 'l',
```
refs: https://github.com/wez/wezterm/issues/955
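
A sketch of why the flattened form is easier to match. The `Param` enum, `parse_params` and `describe` below are illustrative stand-ins written for this note, not the crate's actual types; the point is only that a single slice pattern can cover the `?` prefix and the numeric parameter together.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Param {
    Integer(i64),
    /// Any non-numeric byte, such as '?', ';' or an intermediate.
    P(u8),
}

/// Splits the bytes between CSI and the final byte into a flat list.
fn parse_params(body: &[u8]) -> Vec<Param> {
    let mut params = Vec::new();
    let mut number: Option<i64> = None;
    for &b in body {
        if b.is_ascii_digit() {
            let digit = i64::from(b - b'0');
            number = Some(number.unwrap_or(0).saturating_mul(10).saturating_add(digit));
        } else {
            // Flush any pending number before recording the byte itself.
            if let Some(n) = number.take() {
                params.push(Param::Integer(n));
            }
            params.push(Param::P(b));
        }
    }
    if let Some(n) = number {
        params.push(Param::Integer(n));
    }
    params
}

/// Matches on the flat list plus the final byte.
fn describe(params: &[Param], code: u8) -> &'static str {
    match (params, code) {
        ([Param::P(b'?'), Param::Integer(1)], b'h') => "set private mode 1",
        ([Param::P(b'?'), Param::Integer(1)], b'l') => "reset private mode 1",
        _ => "unhandled",
    }
}
```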
The original design of the vtparse crate was inspired by the vte
crate. There were some assumptions about the shape of CSI sequences
that were lossy and that is posing a problem when it comes to
implementing DECRQM.
This commit improves the situation by adjusting CsiParam to be capable
of capturing all of the possible parameters as well as intermediates.
This commit isn't done; I just need to push it to transfer it to another
machine.
refs: https://github.com/wez/wezterm/issues/882
refs: https://github.com/wez/wezterm/issues/955
There were two bugs here:
* \u8D (the utf8 encoded representation of 0x8d, aka: RI) was not
recognized as a C1 code and was instead passed through as printable
text.
* The \u8D is a zero-width sequence which means that a subsequent
set_cell call on the new empty-by-default line wouldn't allocate
any cells in the line array, and the assignment to the line would
panic.
This commit avoids the panic for the second case, and then fixes up
the vtparser to correctly recognize the sequence as a C1 control.
refs: https://github.com/wez/wezterm/issues/768
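
Both bugs can be illustrated with a small standalone sketch. The helpers `is_c1_control` and `set_cell` are hypothetical and only show one possible shape of the fix: U+008D arrives as the two bytes 0xC2 0x8D once UTF-8 encoded, so the check has to happen on the decoded code point, and a cell write has to make sure the target index exists before assigning.

```rust
/// True for code points in the C1 control range (U+0080..=U+009F).
fn is_c1_control(c: char) -> bool {
    ('\u{80}'..='\u{9f}').contains(&c)
}

/// Writes `c` at `idx`, growing the line with blanks first so the
/// assignment can never index past the end (assumed approach).
fn set_cell(line: &mut Vec<char>, idx: usize, c: char) {
    if idx >= line.len() {
        line.resize(idx + 1, ' ');
    }
    line[idx] = c;
}

/// Returns only the printable characters, dropping C1 controls.
fn printable(text: &str) -> String {
    text.chars().filter(|c| !is_c1_control(*c)).collect()
}
```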
The latest Rust handles more types of const expressions. So we can
use const fns or match directly without using workarounds like OptionPack
and macros.
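
As a rough illustration of the style this enables, the sketch below builds a lookup table with a plain `match` inside a `const fn`. The `Action` enum and the classification rules are toy assumptions, not the crate's real transition table.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Action {
    Ignore,
    Print,
    Execute,
}

/// Classifies a byte as a ground-like state might (toy rules).
const fn classify(byte: u8) -> Action {
    match byte {
        0x00..=0x1f => Action::Execute,
        0x20..=0x7e => Action::Print,
        _ => Action::Ignore,
    }
}

/// Builds the whole table at compile time.
const fn build_table() -> [Action; 256] {
    let mut table = [Action::Ignore; 256];
    let mut i = 0;
    // `for` loops are not available in const fn, so use `while`.
    while i < 256 {
        table[i] = classify(i as u8);
        i += 1;
    }
    table
}

static TABLE: [Action; 256] = build_table();
```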
The way TRANSITIONS is generated is about to change. Add a test to make it
unlikely that its value changes unnoticed. Note that the string is 14K long,
so we only hash its values to keep the code short.
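
The idea of pinning a large table by its hash can be sketched as follows. FNV-1a is used here only because it is short and stable across compiler versions; the hash function used by the actual test is not stated in this message, and `fnv1a` is a hypothetical helper.

```rust
/// 64-bit FNV-1a over the little-endian bytes of each table entry.
fn fnv1a(values: &[u16]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for v in values {
        for b in v.to_le_bytes() {
            hash ^= u64::from(b);
            hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
        }
    }
    hash
}
```

A test would compute the hash once against the known-good table, record it as a constant, and assert that the regenerated table still produces the same value.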
This allows us to support the kitty style underline sequence,
or the : separated form of the true color escape sequences.
refs: https://github.com/wez/wezterm/issues/415
These aren't currently rendered, but the parser and model now support
recognizing expanded underline sequences:
```
CSI 24 m -> No underline
CSI 4 m -> Single underline
CSI 21 m -> Double underline
CSI 60 m -> Curly underline
CSI 61 m -> Dotted underline
CSI 62 m -> Dashed underline
CSI 58 ; 2 ; R ; G ; B m -> set underline color to specified true color RGB
CSI 58 ; 5 ; I m -> set underline color to palette index I (0-255)
CSI 59 m -> restore underline color to default
```
The Curly, Dotted and Dashed CSI codes are a wezterm assignment in the
SGR space. This is by no means official; I just picked some numbers
that were not used based on the xterm ctrl sequences.
The color assignment codes 58 and 59 are prior art from Kitty.
refs: https://github.com/wez/wezterm/issues/415
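
The table above maps naturally onto a pair of small functions. This is a sketch with hypothetical type and function names, assuming the parameters have already been split into integers; it is not the model code from this commit.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Underline {
    None,
    Single,
    Double,
    Curly,
    Dotted,
    Dashed,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UnderlineColor {
    Default,
    Palette(u8),
    Rgb(u8, u8, u8),
}

/// Maps a single SGR code to an underline style, if it is one.
fn underline_style(code: u16) -> Option<Underline> {
    match code {
        24 => Some(Underline::None),
        4 => Some(Underline::Single),
        21 => Some(Underline::Double),
        60 => Some(Underline::Curly),
        61 => Some(Underline::Dotted),
        62 => Some(Underline::Dashed),
        _ => None,
    }
}

/// Rejects values that do not fit in a color component or palette index.
fn component(value: u16) -> Option<u8> {
    if value <= 255 { Some(value as u8) } else { None }
}

/// Interprets the `58 ; ...` and `59` forms.
fn underline_color(params: &[u16]) -> Option<UnderlineColor> {
    match params {
        [58, 2, r, g, b] => Some(UnderlineColor::Rgb(
            component(*r)?,
            component(*g)?,
            component(*b)?,
        )),
        [58, 5, index] => Some(UnderlineColor::Palette(component(*index)?)),
        [59] => Some(UnderlineColor::Default),
        _ => None,
    }
}
```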
This corrects an issue where the mode byte of the DCS sequence was
discarded from the DcsHook, making it impossible to know what sequence
is being activated.
So far this hasn't come up as these sequences are relatively rare,
but in looking at sixel parsing I noticed the error.
Change build.rs codegen to const_fns. This makes vtparse more friendly for buck
build.
Note const_fn functions still have limitations on the current stable (1.41)
rustc (ex. native "match" or "if" cannot be used in const_fn). So I used some
tricks to get it to compile.
These are used in the default Fedora 31 bash profile, so it seems
worth handling even if they are a bit ambiguously defined.
Closes: https://github.com/wez/wezterm/issues/86
I've noticed this off and on for a while, and thought it was something
fishy with my shell dotfiles.
Tracing through I found that the final byte in the "Face with head
bandage" emoji 🤕 U+1F915 was being interpreted as the MW control
code and causing the vt parser to jump out of the OSC state.
The solution for this is to hook up proper UTF-8 processing in the
same way that it is applied in the ground state.
Since we don't have enough bits to introduce new state values (we're
pretty tightly packed in the 16 bits available), I've introduced a
memory of the state to which the utf8 parser needs to return once
a complete sequence is detected.
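
The return-state idea can be sketched with a toy parser. Everything here (`ToyParser`, `in_osc`, the field names) is hypothetical and heavily simplified: continuation bytes are not validated and only OSC data is recorded. U+1F915 encodes as F0 9F A4 95, and the trailing 0x95 is the byte that would otherwise be read as a C1 control.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum State {
    Ground,
    Osc,
    Utf8,
}

struct ToyParser {
    state: State,
    /// State to resume once the pending UTF-8 sequence is complete.
    utf8_return_state: State,
    /// Continuation bytes still expected.
    utf8_remaining: u8,
    osc: Vec<u8>,
}

impl ToyParser {
    /// Starts inside an OSC string, as if `ESC ]` had just been consumed.
    fn in_osc() -> Self {
        Self {
            state: State::Osc,
            utf8_return_state: State::Ground,
            utf8_remaining: 0,
            osc: Vec::new(),
        }
    }

    fn push_if_osc(&mut self, logical: State, byte: u8) {
        if logical == State::Osc {
            self.osc.push(byte);
        }
    }

    fn advance(&mut self, byte: u8) {
        match (self.state, byte) {
            // Inside a UTF-8 sequence every byte is data, never a control.
            (State::Utf8, _) => {
                let target = self.utf8_return_state;
                self.push_if_osc(target, byte);
                self.utf8_remaining -= 1;
                if self.utf8_remaining == 0 {
                    self.state = target;
                }
            }
            // UTF-8 lead byte: remember where to come back to.
            (current, 0xc2..=0xf4) => {
                self.utf8_return_state = current;
                self.utf8_remaining = match byte {
                    0xc2..=0xdf => 1,
                    0xe0..=0xef => 2,
                    _ => 3,
                };
                self.push_if_osc(current, byte);
                self.state = State::Utf8;
            }
            // BEL or a bare C1 byte ends the OSC string.
            (State::Osc, 0x07) | (State::Osc, 0x80..=0x9f) => self.state = State::Ground,
            (State::Osc, _) => self.osc.push(byte),
            (State::Ground, _) => {}
        }
    }
}
```

With this shape no extra state values are needed per originating state; a single `Utf8` state plus the remembered return state covers both the ground and OSC cases.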
This enables using large OSC buffers in a form that we can publish
to crates.io without blocking on an external crate. Large OSC
buffers are important both for some tunnelling use cases and for
eg: iTerm2 image protocol handling.