This commit is larger than it appears to due fanout from threading
through bidi parameters. The main changes are:
* When clustering cells, add an additional phase to resolve embedding
levels and further sub-divide a cluster based on the resolved bidi
runs; this is where we get the direction for a run and this needs
to be passed through to the shaper.
* When doing bidi, the forced cluster boundary hack that we use to
de-ligature when cursoring through text needs to be disabled,
otherwise the cursor appears to push/rotate the text in that
cluster when moving through it! We'll need to find a different
way to handle shading the cursor that eliminates the original
cursor/ligature/black issue.
* In the shaper, the logic for coalescing unresolved runs for font
fallback assumed LTR and needed to be adjusted to cluster RTL.
That meant also computing a little index of codepoint lengths.
* Added `experimental_bidi` boolean option that defaults to false.
When enabled, it activates the bidi processing phase in clustering
with a strong hint that the paragraph is LTR.
This implementation is incomplete and/or wrong for a number of cases:
* The config option should probably allow specifying the paragraph
direction hint to use by default.
* https://terminal-wg.pages.freedesktop.org/bidi/recommendation/paragraphs.html
recommends that bidi be applied to logical lines, not physical
lines (or really: ranges within physical lines) that we're doing
at the moment
* The paragraph direction hint should be overridden by cell attributes
and other escapes; see 85a6b178cf
and probably others.
However, as of this commit, if you `experimental_bidi=true` then
```
echo This is RTL -> عربي فارسی bidi
```
(that text was sourced from:
https://github.com/microsoft/terminal/issues/538#issuecomment-677017322)
then wezterm will display the text in the same order as the text
renders in Chrome for that github comment.
```
; ./target/debug/wezterm --config experimental_bidi=false ls-fonts --text "عربي فارسی ->"
LeftToRight
0 ع \u{639} x_adv=8 glyph=300 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
2 ر \u{631} x_adv=3.78125 glyph=273 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
4 ب \u{628} x_adv=4 glyph=244 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
6 ي \u{64a} x_adv=4 glyph=363 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
8 \u{20} x_adv=8 glyph=2 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
9 ف \u{641} x_adv=11 glyph=328 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
11 ا \u{627} x_adv=4 glyph=240 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
13 ر \u{631} x_adv=3.78125 glyph=273 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
15 س \u{633} x_adv=10 glyph=278 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
17 ی \u{6cc} x_adv=4 glyph=664 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
19 \u{20} x_adv=8 glyph=2 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
20 - \u{2d} x_adv=8 glyph=276 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
21 > \u{3e} x_adv=8 glyph=338 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
```
```
; ./target/debug/wezterm --config experimental_bidi=true ls-fonts --text "عربي فارسی ->"
RightToLeft
17 ی \u{6cc} x_adv=9 glyph=906 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
15 س \u{633} x_adv=10 glyph=277 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
13 ر \u{631} x_adv=4.78125 glyph=272 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
11 ا \u{627} x_adv=4 glyph=241 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
9 ف \u{641} x_adv=5 glyph=329 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
8 \u{20} x_adv=8 glyph=2 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
6 ي \u{64a} x_adv=9 glyph=904 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
4 ب \u{628} x_adv=4 glyph=243 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
2 ر \u{631} x_adv=5 glyph=273 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
0 ع \u{639} x_adv=6 glyph=301 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
/System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
LeftToRight
0 \u{20} x_adv=8 glyph=2 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
1 - \u{2d} x_adv=8 glyph=480 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
2 > \u{3e} x_adv=8 glyph=470 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
/Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
;
```
refs: https://github.com/wez/wezterm/issues/784
In order to support RTL/BIDI, wezterm needs a bidi implementation. I
don't think a well-conforming rust implementation exists today; what I
found were implementations that didn't pass 100% of the conformance
tests.
So I decided to port "bidiref", the reference implementation of the UBA
described in http://unicode.org/reports/tr9/ to Rust.
This implementation focuses on conformance: no special measures have
been taken to optimize it so far, with my focus having been to ensure
that all of the approx 780,000 test cases in the unicode data for
unicode 14 pass. Having the tests passing 100% allows for making
performance improvements with confidence in the future.
The API isn't completely designed/fully baked. Until I get to hooking
it up to wezterm's shaper, I'm not 100% sure exactly what I'll need.
There's a good discussion on API in
https://github.com/open-i18n/rust-unic/issues/273 that suggests omitting
"legacy" operations such as reordering. I suspect that wezterm may
actually need that function to support monospace text layout in some
terminal scenarios, but regardless: reordering is part of the
conformance test suite so it remains a part of the API.
That said: the API does model the major operations as separate
phases, so you should be able to pay for just what you use:
* Resolving the embedding levels from a paragraph
* Returning paragraph runs of those levels (and their directions)
* Returning the whitespace-level-reset runs for a line-slice within the
paragraph
* Returning the reordered indices + levels for a line-slice within the
paragraph.
refs: https://github.com/wez/wezterm/issues/784
refs: https://github.com/kas-gui/kas-text/issues/20
This setting now allows specifying what the canonical format is;
previously it was a boolean that meant "don't change anything"
if false, and rewrite to CRLF if true.
It's now an enum with None, CR, LF, and CRLF variants that express
more control.
The config accepts true (maps to CRLF) and false (maps to None)
for backwards compatibility.
The default behavior is unchanged by this commit, however, I did
uncover a bug in the canonicalizer for inputs like `\r\r\n`.
refs: #1575
This commit centralizes the focus-loss logic to the Window so
that activating a new tab will deactivate the pane in the current
window.
Note that this cannot see overlays in the gui, but overlays shouldn't
care about focus, so it should be ok.
refs: https://github.com/wez/wezterm/discussions/796
This commit enables the following config to work for local (not mux yet!)
panes:
```lua
local wezterm = require 'wezterm'
wezterm.on("format-tab-title", function(tab, tabs, panes, config, hover, max_width)
if tab.is_active then
return {
{Background={Color="blue"}},
{Text=" " .. tab.active_pane.title .. " "},
}
end
local has_unseen_output = false
for _, pane in ipairs(tab.panes) do
if pane.has_unseen_output then
has_unseen_output = true
break;
end
end
if has_unseen_output then
return {
{Background={Color="Orange"}},
{Text=" " .. tab.active_pane.title .. " "},
}
end
return tab.active_pane.title
end)
return {
}
```
refs: https://github.com/wez/wezterm/discussions/796
This commit decomposes the main get_semantic_zones method into two
parts:
* A per-line portion, where the line ranges are cached (invalidated on
change)
* The overall screen portion, where the line ranges are merged
This changes the overall complexity of computing zones from
O(width * scrollback-height)
To an incremental:
O((width * number of changed lines since last query) + scrollback-height)
You can see some samples of elapsed time below; those show the times for
running both the old and the new implementation on the same data. The
number of lines/zones in the scrollback increases with each call and you
can see that the new implementation is a bit faster anyway at low
volumes but is significantly faster as the number of lines/zones
increases, because the amount of work is reduced.
```
get_semantic_zones: 71.708µs
get_semantic_zones_new: 59.041µs
get_semantic_zones: 71.166µs
get_semantic_zones_new: 9.166µs
get_semantic_zones: 44.291µs
get_semantic_zones_new: 4.208µs
get_semantic_zones: 69.791µs
get_semantic_zones_new: 10.291µs
get_semantic_zones: 59.375µs
get_semantic_zones_new: 7.958µs
get_semantic_zones: 52.5µs
get_semantic_zones_new: 4.5µs
get_semantic_zones: 91.791µs
get_semantic_zones_new: 20.916µs
get_semantic_zones: 229.916µs
get_semantic_zones_new: 109.208µs
get_semantic_zones: 224.125µs
get_semantic_zones_new: 15.208µs
get_semantic_zones: 291.791µs
get_semantic_zones_new: 11.833µs
get_semantic_zones: 238.875µs
get_semantic_zones_new: 12.625µs
get_semantic_zones: 468.458µs
get_semantic_zones_new: 126.583µs
get_semantic_zones: 460.5µs
get_semantic_zones_new: 25.666µs
get_semantic_zones: 358.291µs
get_semantic_zones_new: 19.541µs
get_semantic_zones: 436.833µs
get_semantic_zones_new: 17.875µs
get_semantic_zones: 313.166µs
get_semantic_zones_new: 15.25µs
get_semantic_zones: 333.958µs
get_semantic_zones_new: 16.541µs
get_semantic_zones: 364.666µs
get_semantic_zones_new: 14.041µs
```