This commit is larger than it appears to due fanout from threading
through bidi parameters. The main changes are:
* When clustering cells, add an additional phase to resolve embedding
levels and further sub-divide a cluster based on the resolved bidi
runs; this is where we get the direction for a run and this needs
to be passed through to the shaper.
* When doing bidi, the forced cluster boundary hack that we use to
de-ligature when cursoring through text needs to be disabled,
otherwise the cursor appears to push/rotate the text in that
cluster when moving through it! We'll need to find a different
way to handle shading the cursor that eliminates the original
cursor/ligature/black issue.
* In the shaper, the logic for coalescing unresolved runs for font
fallback assumed LTR and needed to be adjusted to cluster RTL.
That meant also computing a little index of codepoint lengths.
* Added `experimental_bidi` boolean option that defaults to false.
When enabled, it activates the bidi processing phase in clustering
with a strong hint that the paragraph is LTR.
This implementation is incomplete and/or wrong for a number of cases:
* The config option should probably allow specifying the paragraph
direction hint to use by default.
* https://terminal-wg.pages.freedesktop.org/bidi/recommendation/paragraphs.html
recommends that bidi be applied to logical lines, not physical
lines (or really: ranges within physical lines) that we're doing
at the moment
* The paragraph direction hint should be overridden by cell attributes
and other escapes; see 85a6b178cf
and probably others.
However, as of this commit, if you `experimental_bidi=true` then
```
echo This is RTL -> عربي فارسی bidi
```
(that text was sourced from:
https://github.com/microsoft/terminal/issues/538#issuecomment-677017322)
then wezterm will display the text in the same order as the text
renders in Chrome for that github comment.
```
; ./target/debug/wezterm --config experimental_bidi=false ls-fonts --text "عربي فارسی ->"
LeftToRight
0 ع \u{639} x_adv=8 glyph=300 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
2 ر \u{631} x_adv=3.78125 glyph=273 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
4 ب \u{628} x_adv=4 glyph=244 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
6 ي \u{64a} x_adv=4 glyph=363 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
8 \u{20} x_adv=8 glyph=2 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
9 ف \u{641} x_adv=11 glyph=328 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
11 ا \u{627} x_adv=4 glyph=240 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
13 ر \u{631} x_adv=3.78125 glyph=273 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
15 س \u{633} x_adv=10 glyph=278 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
17 ی \u{6cc} x_adv=4 glyph=664 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
19 \u{20} x_adv=8 glyph=2 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
20 - \u{2d} x_adv=8 glyph=276 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
21 > \u{3e} x_adv=8 glyph=338 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
```
```
; ./target/debug/wezterm --config experimental_bidi=true ls-fonts --text "عربي فارسی ->"
RightToLeft
17 ی \u{6cc} x_adv=9 glyph=906 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
15 س \u{633} x_adv=10 glyph=277 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
13 ر \u{631} x_adv=4.78125 glyph=272 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
11 ا \u{627} x_adv=4 glyph=241 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
9 ف \u{641} x_adv=5 glyph=329 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
8 \u{20} x_adv=8 glyph=2 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
6 ي \u{64a} x_adv=9 glyph=904 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
4 ب \u{628} x_adv=4 glyph=243 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
2 ر \u{631} x_adv=5 glyph=273 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
0 ع \u{639} x_adv=6 glyph=301 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
LeftToRight
0 \u{20} x_adv=8 glyph=2 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
1 - \u{2d} x_adv=8 glyph=480 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
2 > \u{3e} x_adv=8 glyph=470 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
;
```
refs: https://github.com/wez/wezterm/issues/784
This commit decomposes the main get_semantic_zones method into two
parts:
* A per-line portion, where the line ranges are cached (invalidated on
change)
* The overall screen portion, where the line ranges are merged
This changes the overall complexity of computing zones from
O(width * scrollback-height)
To an incremental:
O((width * number of changed lines since last query) + scrollback-height)
You can see some samples of elapsed time below; those show the times for
running both the old and the new implementation on the same data. The
number of lines/zones in the scrollback increases with each call and you
can see that the new implementation is a bit faster anyway at low
volumes but is significantly faster as the number of lines/zones
increases, because the amount of work is reduced.
```
get_semantic_zones: 71.708µs
get_semantic_zones_new: 59.041µs
get_semantic_zones: 71.166µs
get_semantic_zones_new: 9.166µs
get_semantic_zones: 44.291µs
get_semantic_zones_new: 4.208µs
get_semantic_zones: 69.791µs
get_semantic_zones_new: 10.291µs
get_semantic_zones: 59.375µs
get_semantic_zones_new: 7.958µs
get_semantic_zones: 52.5µs
get_semantic_zones_new: 4.5µs
get_semantic_zones: 91.791µs
get_semantic_zones_new: 20.916µs
get_semantic_zones: 229.916µs
get_semantic_zones_new: 109.208µs
get_semantic_zones: 224.125µs
get_semantic_zones_new: 15.208µs
get_semantic_zones: 291.791µs
get_semantic_zones_new: 11.833µs
get_semantic_zones: 238.875µs
get_semantic_zones_new: 12.625µs
get_semantic_zones: 468.458µs
get_semantic_zones_new: 126.583µs
get_semantic_zones: 460.5µs
get_semantic_zones_new: 25.666µs
get_semantic_zones: 358.291µs
get_semantic_zones_new: 19.541µs
get_semantic_zones: 436.833µs
get_semantic_zones_new: 17.875µs
get_semantic_zones: 313.166µs
get_semantic_zones_new: 15.25µs
get_semantic_zones: 333.958µs
get_semantic_zones_new: 16.541µs
get_semantic_zones: 364.666µs
get_semantic_zones_new: 14.041µs
```
I generated nerdfonts_data.rs with this shell script; it uses `i_all.sh`
from the nerdfonts repo to get the base mapping:
```
source ./lib/i_all.sh
echo "//! Data mapping nerd font symbol names to their char codepoints"
echo "pub const NERD_FONT_GLYPHS: &[(&str, char)] = &["
for var in "${!i@}"; do
# trim 'i_' prefix
glyph_name=${var#*_}
glyph_char=${!var}
glyph_code=$(printf "%x" "'$glyph_char'")
echo "(\"$glyph_name\", '\u{$glyph_code}'), // $glyph_char"
done
echo "];"
```
Then intent is to use it in wezterm:
```
local wezterm = require 'wezterm'
wezterm.log_info(wezterm.nerdfonts.dev_mozilla)
```
We need 100% of the info for it to work correctly, so this commit:
* Exposes the keyboard encoding mode via the Pane trait
* Adds the scan code to the RawKeyEvent
* Has the GUI perform the encoding if the keyboard is set that way
* Removes the basic encoder from termwiz in favor of the gui level one
The net result is that we bypass the Pane::key_up/Pane::key_down methods
in almost all cases when the encoding mode is set to win32-input-mode.
There is now a config option: allow_win32_input_mode that can be
used to prevent using this mode.
refs: #1509
This commit causes the terminal to emit win32-input-mode encoded key up
and down events for a limited subset of keys When win32-input-mode is
enabled.
We limit them to keys where we know the VK key code equivalent,
and where those keys are either not representable (eg: modifier
only key events), or may generate ambiguous output (eg: CTRL-SPACE
in different keyboard layouts).
However, in my experiments, modifier only key presses confuse powershell
and cause it to emit `@`, so I've disabled that in the code for now.
refs: https://github.com/wez/wezterm/issues/318
refs: https://github.com/wez/wezterm/issues/1509
refs: https://github.com/wez/wezterm/issues/1510
This is a baby step that formalizes the different encoding schemes into
an enum, and hooks up the decset sequence for win32-input-mode.
It doesn't change any of the actual encoding at this time.
refs: #1509
When rendering the IME composing text, I noticed that for the Korean
input sequence: shift+'ㅅ' followed by 'ㅏ' we'd render the 'ㅆ' (the
shifted first character) in black and the composing 'ㅏ' in white
against the cursor color, and that was very difficult to read,
especially at the default font size.
To resolve this, this commit:
* Forces clustering to break around the cursor boundary, so that
we treat the cursor position as its own separately styled cluster
* Adjusts cursor/bg rendering so that we always consider the start of
the cluster for the colors of that run. We are guaranteed that a
ligatured sequence will fit in the background area anyway.
This has the effect of "breaking" programming ligatures such as '->'
when cursoring through them, and decomposing them into their individual
'-' and '>' glyphs, which is a reasonable price to pay for being able
to see things better on screen.
refs: https://github.com/wez/wezterm/issues/1504
refs: https://github.com/wez/wezterm/issues/478
Previously, we would implicitly set it to the special SEQ_ZERO
value, but since that value always flags the row as changed,
it causes some over-invalidation issues downstream in wezterm.
This commit makes that parameter required, so that the code that
is creating a new Line always passes down the seqno from that event.
refs: #1472
Not 100% sure why this only really manifested on Windows, but
the symptoms were:
* Run powershell in a tab
* Run `dir`
* Hit enter a couple of times to show a couple of prompts
* Try using the mouse to select across the prompt boundaries
The selection would get invalidated crossing the boundaries.
I traced this down to the lines around those regions having
SEQ_ZERO as their sequence, so this commit ensures that lines
that are created as part of scrolling the screen are correctly
tagged with the current seqno from the terminal display.
Why only windows? Not totally sure; perhaps it is related to
something funky happening in the conpty layer and sending us
unusual escapes (eg: scroll margins?)
permits iTerm2 images to be drawn anywhere on screen without
scrolling the cursor, including the bottom row.
Also included is a check in fcwrap.rs to_range_set(), without which
was causing a panic at runtime due to subtraction from unsigned
leading to overflow.
This helps us correctly set the size of the image cell
for the case where we have a partial cell at the right/bottom
edge of an image being mapped across cells.
refs: #1270
We need to force the codepage to UTF8 this to avoid having our UTF-8
bytestreams be misinterpreted, and to restore the original before we're
done.
refs: https://github.com/wez/wezterm/issues/1435
implement missing alt-screen support and fixup building the examples
while we're in here.
refs: #1244
This commit teaches `RgbColor::from_rgb_str` to support
colors in the form `hsl:235 100 50`, an HSL colorspace
color specification.
While banging my head on why my test wasn't passing, I realized
that this was producing 10 bpc color and the code to convert
those to RGB was incorrectly multiplying conversion terms!
refs: https://github.com/wez/wezterm/issues/1436
From esctest:
CUPTests.test_CUP_ColumnOnly
HVPTests.test_HVP_ColumnOnly
and the newly added at https://invent.kde.org/ninjalj/esctest.git:
DECSETTests.test_DECSET_DECLRMM_OnlyRight
DECSTBMTests.test_DECSTBM_OnlyBottom
Not all codepoints are valid when combined with a presentation
selector.
This commit ensures that we respect the valid sequences defined
by the current version of unicode (version 14).
refs: #1231
refs: #997
As promised in the previous commit, this one implements an escape
sequence to control the unicode version.
Unknown to me in the previous commit, iTerm2 already defines such
an escape sequence, so we simply implement it here with the same
semantics.
refs: #1231
refs: #997
This is a fairly far-reaching commit. The idea is:
* Introduce a unicode_version config that specifies the default level
of unicode conformance for each newly created Terminal (each Pane)
* The unicode_version is passed down to the `grapheme_column_width`
function which interprets the width based on the version
* `Cell` records the width so that later calculations don't need to
know the unicode version
In a subsequent diff, I will introduce an escape sequence that allows
setting/pushing/popping the unicode version so that it can be overridden
via eg: a shell alias prior to launching an application that uses a
different version of unicode from the default.
This approach allows output from multiple applications with differing
understanding of unicode to coexist on the same screen a little more
sanely.
Note that the default `unicode_version` is set to 9, which means that
emoji presentation selectors are now by-default ignored. This was
selected to better match the level of support in widely deployed
applications.
I expect to raise that default version in the future.
Also worth noting: there are a number of callers of
`unicode_column_width` in things like overlays and lua helper functions
that pass `None` for the unicode version: these will assume the latest
known-to-wezterm/termwiz version of unicode to be desired. If those
overlays do things with emoji presentation selectors, then there may be
some alignment artifacts. That can be tackled in a follow up commit.
refs: #1231
refs: #997
`use_fancy_tab_bar` switches to an alternate rendering of the tab
bar that uses the window_frame config to get a proportional
title font to use to render tabs, as well as rendering a few
additional elements to space out and make the tabs feel more
like tabs.
Computing the number of tabs doesn't respect the alternate font
at this time.
Formatted tab item foreground and background colors are also
not respected at this time.
refs: #1180
This was a bit of a PITA to run down; the essence of the problem
was that the shaper was returning an x_advance of 0 for U+3000,
which caused wezterm's shaping layer to elide that glyph.
I eventually tracked down the x_advance to be the result of
scaling by an x_scale of 0, which in turn is the result of
harfbuzz not knowing the font size.
The critical portion of this diff is the line that advises
harfbuzz that the font has changed after we've applied the
font size.
The rest is just stuff to make it easier to debug and verify.
This:
```
printf "x\u3000x."
```
Now correctly renders on screen as "x x".
fixes: #1161