1
1
mirror of https://github.com/wez/wezterm.git synced 2024-11-23 23:21:08 +03:00
Commit Graph

524 Commits

Author SHA1 Message Date
Wez Furlong
e1a0b63923 termwiz: track size of sgr enum 2022-09-10 07:23:11 -07:00
Wez Furlong
96c4e7e9b9 Switch to finl_unicode for grapheme clustering
According to its benchmarks, it's almost 2x faster than
unicode_segmentation.  It doesn't appear to make a visible
difference to `time cat bigfile`, but I'll take anything
that gives more headroom for such little effort of switching.
2022-09-10 07:15:49 -07:00
Wez Furlong
a0c2df2d86 Line: save 8 bytes per line 2022-09-09 21:39:31 -07:00
Wez Furlong
55fa2e7893 fmt 2022-09-09 21:22:17 -07:00
Wez Furlong
ef14f78e08 Reduce size of Action
Action is used to encode parsed terminal output and shuttle it between
the thread that does the parsing and the main gui thread that applies
it to the terminal model.

Take it down from 184 bytes to 40 bytes (on 64-bit systems).  This seems
to boost `time cat bigfile` by reducing the runtime to ~40% of its prior
duration: down from 8s -> 4.5s on an M1 macbook air.

Size reductions achieved by Box'ing relatively less frequently
used enum variants. The kitty image data variant is particularly
large, and the Window variant is also pretty heavy.
2022-09-09 21:11:11 -07:00
Wez Furlong
7a0461989e termwiz: add test to track size_of Action
no functional change
2022-09-09 21:03:19 -07:00
Wez Furlong
026b9e3577 fix hyperlink underlines
There were two problems:

* We weren't correctly invalidating when the hover state changed
  (a recent regression caused by recent caching changes)
* We'd underline every link with the same destination on hover,
  not just the one under the mouse (longstanding wart)

Recent changes allow the application layer to reference the underlying
Lines directly, so we can restore the original and expected
only-highlight-under-the-mouse by switching to those newer APIs.

Adjust the cache values so that we know to also verify the current
highlight and invalidate.

I was a little surprised to see that this also works with mux client
panes: I was expecting to need to do some follow up on those because
they return copies of Line rather than references to them. That happens
to work because the mux client updates the hyperlinks at the time where
it inserts into its cache. The effect of that is that lines in mux
client panes won't update to new hyperlink rules if they were received
prior to a change in the config.

refs: https://github.com/wez/wezterm/issues/2496
2022-09-07 07:51:28 -07:00
Wez Furlong
dbacf98b89 fix incorrect underline attribute when scrolling
Given this sequence:

ENABLE-UNDERLINE CRLF SGR-RESET

if the CRLF caused the terminal to scroll, the newly created line
at the bottom would be filled in with a "blank" cell that had
the underline attribute set.

That's because we're supposed to preserve the coloring in that
scenario, but we were also preserving other SGR attributes.

This commit explicitly clears out under, over and strikethrough
lines from these blank attributes.

refs: https://github.com/wez/wezterm/issues/2489
2022-09-06 20:06:01 -07:00
Wez Furlong
d8e43b92b8 termwiz: slim down size of clustered line storage
Move is_double_wide to a box; it is relatively rare to need
this and we're okay with it being a separate heap allocation
when it is needed if it reduces the size of Line in the common case.
2022-09-06 18:56:29 -07:00
Wez Furlong
62ca174d19 termwiz: remove assertions
I don't think these are really necessary any more; the implementation
cannot go out of bounds, so the worst that can happen is that we
don't return any changes.

refs: https://github.com/wez/wezterm/issues/2222
2022-09-04 10:19:17 -07:00
Wez Furlong
9732f9ee2b Adjust render caching; switch to LFU caches from LRU 2022-08-28 10:28:26 -07:00
Wez Furlong
d492eef700 implement new pane trait methods for copy and quickselect overlays 2022-08-27 06:19:12 -07:00
Wez Furlong
b7b1e4020a revise Pane line related funcs
Adds Pane::for_each_logical_line_in_stable_range_mut and
Pane::with_lines_mut which allow iterating mutably over lines.

The idea is that this will allow the renderer to directly cache
data in the Line via its appdata without having to build cumbersome
external caching logic and managing cache keys.

This commit just swaps the implementation around for localpane
and sanity checks that the renderer functions.

Various overlays and the mux client don't properly implement these
yet and current warn at compile time and panic at runtime.

To follow is the logic to cache the data and make sure that it
works the way that I think before converting the other Pane
implementations.
2022-08-26 08:05:06 -07:00
Wez Furlong
66761ed014 termwiz: use interior mutability for Line::set_appdata 2022-08-25 06:19:56 -07:00
Wez Furlong
5e993c581a termwiz: remove reverse video attribute from Line
It didn't really belong there; it was added as a bit of a hack
to propagate screen reverse video mode.

Move that to the RenderableDims struct and remove the related
bits from Line
2022-08-24 22:43:47 -07:00
Wez Furlong
4963187eaf termwiz: associate appdata with a Line
I plan to use this to hang cached shaper data with a line
and avoid expensive hashing to resolve it.
2022-08-24 20:11:27 -07:00
Wez Furlong
00ddfbf9b8 perf: cache quads by line
Introduces a heap-based quad allocator that we cache on a per-line
basis, so if a line is unchanged we simply need to copy the previously
computed set of quads for it into the gpu quad buffer.

The results are encouraging wrt. constructing those quads; the
`quad_buffer_apply` is the cost of the copy operation, compare with
`render_screen_line_opengl` which is the cost of computing the quads;
it's 300x better at the p50 and >100x better at p95 for a full-screen
updating program:

full 2880x1800 screen top:

```
STAT                                             p50      p75      p95
Key(quad_buffer_apply)                           2.26µs   5.22µs   9.60µs
Key(render_screen_line_opengl)                   610.30µs 905.22µs 1.33ms
Key(gui.paint.opengl)                            35.39ms  37.75ms  45.88ms
```

However, the extra buffering does increase the latency of
`gui.paint.opengl` (the overall cost of painting a frame); contrast the
above with the latency in the same scenario with the current `main`
(rather than this branch):

```
Key(gui.paint.opengl)                            19.14ms  21.10ms  28.18ms
```

Note that for an idle screen this latency is ~1.5ms but that is also true
of `main`.

While the overall latency in the histogram isn't a slam dunk,
running `time cat bigfile` is ~10% faster on my mac.

I'm sure there's something that can be shaved off to get a more
convincing win.
2022-08-23 06:37:12 -07:00
Wez Furlong
9cce9ff81b fix potential panic when computing hyperlink rules
refs: https://github.com/wez/wezterm/issues/2355
2022-08-03 21:34:12 -07:00
Wez Furlong
1df1f166ca copy mode: move by semantic zone, select by zone
Allows the following assignment actions; I was just over-using z for
no real reason, I'm not suggesting that these are good assignments.

```
      -- move the cursor backwards to the start of the current zone, or
      -- to the prior zone if already at the start
      { key = 'z', mods = 'NONE', action = act.CopyMode 'MoveBackwardSemanticZone' },
      -- move the cursor forwards to the start of the next zone
      { key = 'Z', mods = 'NONE', action = act.CopyMode 'MoveForwardSemanticZone' },
      -- start selecting by zone: both the start point and the cursor
      -- position will be expanded to the containing zone and the union
      -- of those two will be used for the selection
      {
        key = 'z',
        mods = 'CTRL',
        action = act.CopyMode { SetSelectionMode = 'SemanticZone' },
      },
      -- like MoveBackwardSemanticZone by only considers zones of the
      -- specified type
      { key = 'z', mods = 'ALT', action = act.CopyMode { MoveBackwardZoneOfType ='Output' }},
      -- like MoveForwardSemanticZone by only considers zones of the
      -- specified type
      { key = 'Z', mods = 'ALT', action = act.CopyMode { MoveForwardZoneOfType ='Output' }},
```

refs: https://github.com/wez/wezterm/issues/2346
2022-08-02 21:56:53 -07:00
Wez Furlong
33866dad70 termwiz: fix min version dep on winapi. publish 0.17.1
Need at least 0.3.5 (I think) for the type used in
https://github.com/wez/wezterm/pull/2349 as well as the impl-default
feature.

refs: https://github.com/markbt/streampager/issues/57#event-7110671310
2022-08-02 18:21:49 -07:00
Mateusz Kwapich
edbc98304b [termwiz] add impl-default feature to winapi dep to fix the build
Without the feature the build fails with:

    --> src\escape\apc.rs:377:57
     |
377  |         let mut memory_info = MEMORY_BASIC_INFORMATION::default();
     |                                                         ^^^^^^^ function or associated item not found in `MEMORY_BASIC_INFORMATION`
     |
2022-08-02 13:29:05 -07:00
Wez Furlong
673ca3ad07 color types version bump for crates.io 2022-08-01 18:41:52 -07:00
Wez Furlong
56de27c773 bidi version bump for publish to crates.io 2022-08-01 18:38:44 -07:00
Wez Furlong
edeae72b5f termwiz: bump version to 0.17 2022-08-01 18:31:43 -07:00
Mark Juggurnauth-Thomas
40df2f14d2 termwiz: ensure ST sequence is grouped with OSC
When `parse_first_as_vec` is parsing an OSC sequence (e.g.
`SetHyperlink`) that is terminated by the escaped form of ST (`ESC \`),
ensure that the ST sequence is included in the returned vector.

This is achieved by ensuring the VTParser has returned to the "ground"
state: i.e. the stored state after the `ESC` is processed is not enough
for `parse_first_as_vec` to terminate.  We must also parse the `\` to
ensure that we return a complete span to the caller.

Fixes https://github.com/markbt/streampager/issues/57
2022-08-01 18:30:04 -07:00
Wez Furlong
b1c60ba258 termwiz: add test cases for parse_first_as_vec w/ OSC+ST
refs: https://github.com/markbt/streampager/issues/57
2022-07-29 06:40:26 -07:00
Wez Furlong
718a021817 termwiz: use 6 for the rgba colorspace
`4` is actually defined as CMYK according to ITU-T Rec. T.416:

> A parameter substring for values 38 or 48 may be divided by one or more separators (03/10) into parameter elements,
> denoted as Pe. The format of such a parameter sub-string is indicated as:
>
>     Pe : P ...
>
> Each parameter element consists of zero, one or more bit combinations from 03/00 to 03/09, representing the digits
> 0 to 9. An empty parameter element represents a default value for this parameter element. Empty parameter elements at
> the end of the parameter substring need not be included.
>
> The first parameter element indicates a choice between:
>
> 0    implementation defined (only applicable for the character foreground colour)
> 1    transparent;
> 2    direct colour in RGB space;
> 3    direct colour in CMY space;
> 4    direct colour in CMYK space;
> 5    indexed colour.

refs: 6e9a22e199 (commitcomment-79669016)
2022-07-28 17:09:15 -07:00
Wez Furlong
c7c81e3161 really fix termwiz --all-features build 2022-07-26 22:35:49 -07:00
Wez Furlong
9559404c3c fix termwiz --all-features build 2022-07-26 21:26:20 -07:00
Wez Furlong
6e9a22e199 termwiz: allow setting alpha for SGR fg, bg attributes
This commit extends the sgr color parser to support a wezterm
extension that I just made up:

```
printf "\e[48:4:255:0:0:60mhello\e0m"
```

The `4` is wezterm specific and denotes 4 channel color, in this case
RGBA. red is 255, green is 0, blue is 0 and alpha is 60; the values are
interpreted as u8 values.

CSI 38 (fg), 48 (bg) and 58 (underline) support this.

refs: https://github.com/wez/wezterm/issues/2313
2022-07-26 21:05:18 -07:00
Wez Furlong
bc083ee470 termwiz: ColorSpec now allows for alpha to be tracked
This doesn't really change any behavior, but adjusts the types
such that CSIs that set colors have the potential to track the
alpha channel and that can make it through to the GUI/render layer.
2022-07-26 19:39:53 -07:00
Wez Furlong
cb89f2c36e allow setting alpha for OSC 10, 11, 12
refs: https://github.com/wez/wezterm/issues/2313
2022-07-26 19:02:47 -07:00
Wez Furlong
c32db29474 termwiz: refactor: split line into sub-modules 2022-07-25 07:20:59 -07:00
Wez Furlong
9deed0b8b4 Line::wrap now prefers cluster storage
Want to avoid inflating scrollback when the window is resized
2022-07-24 16:05:42 -07:00
Wez Furlong
dec5ca0349 term: refactor getting logical lines
This will make it easier to refactor search in a subsequent commit
2022-07-24 10:57:05 -07:00
Wez Furlong
c722db22d6 term/termwiz: microoptimize set cell
If we can avoid constructing a Cell then do so
2022-07-24 09:14:44 -07:00
Wez Furlong
e8dfb553b4 termwiz: microoptimize ClusteredLine::set_last_cell_was_wrapped
Cache the last cell width so that we can avoid iterating the whole
line cell-wise to compute it for time cat bigfile.
2022-07-24 08:36:04 -07:00
Wez Furlong
656f725623 termwiz: micro-optimize ClusteredLine::new(), set_last_wrapped 2022-07-24 08:18:23 -07:00
Wez Furlong
c2dfba27f3 termwiz: don't claim that visible_cells is double-ended
It's not; there's important state that only works in forward iteration.
2022-07-24 08:11:11 -07:00
Wez Furlong
7be01110ca termwiz: avoid cluster -> vec conversions in a few more cases
This reduces the resident memory by another ~10% because it avoids
keeping as many runs of whitespace.

Runtime for `time cat enwiki8.wiki` is still ~11-12s, resident: 530K

refs: https://github.com/wez/wezterm/issues/1626
2022-07-24 07:57:33 -07:00
Wez Furlong
8002a17242 term: default to cluster storage
The previous commit added the option to convert the storage to
the cluster format.  That saves memory as rows are moved to scrollback,
but makes scrolling back more expensive due to that conversion.

This commit adds a fast(ish) path for the common case of simply
appending text to a new line before it gets scrolled: the default
format for lines in the screen is now the cluster format and,
provided that the cursor moves from left to right as text is
emitted, can simply append to the cluster storage in-place
and avoids a conversion when the line is moved to scrollback.

This improves the throughput of `time cat enwiki8.wiki` so
that the runtime is typically around 11-12s (compared to 9-10s
before introducing cluster storage).  However, this is often
a deal of variance in the measured time and I believe that
that is due to the renderer triggering conversions back to
the vec storage and introducing slowdowns from the render side.

That's what I'll investigate next.
2022-07-23 22:54:43 -07:00
Wez Furlong
c8b1b92e08 Line::as_str() -> Cow<str>
Previously this would create a new String because it had to, but
with the clustered storage we may be able to simply reference the
existing string as a str reference, so allow for that.
2022-07-23 12:10:13 -07:00
Wez Furlong
d5d161b510 termwiz: add clustered line storage for line
Adds the option to use an alternative clusted line storage for
the cells component of the line.

This structure is not optimal for mutation, but is better structured
for:

* matching/extracting textual content
* using less memory than the prior simple vector

For some contrast: the line "hello" occupies 5 Cells in the cell based
storage; that 5 discrete Cells each with their own tiny string
and a copy of their attributes.

The clustered version of the line stores one copy of the cell
attributes, the string "hello" and some small (almost constant size)
overhead for some metadata.  For simple lines of ascii text, the
clustered version is smaller as there are fewer copies of the cell
attributes.  Over the span of a large scrollback and typical terminal
display composition, this saving is anticipated to be significant.

The clustered version is also cheaper to search as it doesn't require
building a copy of the search text for each line (provided the line is
already in clustered form).

This commit introduces the capability: none of the internals request the
new form yet, and there are likely a few call sites that need to be
tweaked to avoid coersion from clustered to vector form.
2022-07-23 12:03:00 -07:00
Wez Furlong
614900f85c line: introduce possibility of alternate cell backing
Uses an enum as a way to use an alternative to Vec<Cell>, but
doesn't provide that alternative in this commit.
2022-07-23 08:18:34 -07:00
Wez Furlong
e26d634da1 termwiz: refactor Line::visible_cells()
Don't promise that we iterate Cell directly, but things that smell
like Cell.  That makes it easier to adjust internal storage in
a later commit.
2022-07-23 07:03:34 -07:00
Wez Furlong
bd79ee0bff termwiz: add bench for Cell creation/drop 2022-07-22 19:23:53 -07:00
Wez Furlong
9eeb9b7c43 termwiz: add static WcLookupTable to codegen
This shaves off some tiny overheads from lazy_static
2022-07-22 19:23:14 -07:00
Wez Furlong
308bbc9038 termwiz: micro-optimize grapheme_column_width
While profiling `time cat bigfile` I noted that a big chunk of the
time is spent computing widths, so I wanted to dig into a bit.

After playing around with a few options, I settled on the approach
in this commit.

The key observations:

* WcWidth::from_char performs a series of binary searches.
  The fast path was for ASCII, but anything outside that range
  suffered in terms of latency.
* Binary search does a lot more work than a simple table lookup,
  so it is desirable to use a lookup, and moreso to combine the
  different tables into a single table so that classification
  is an O(1) super fast lookup in the most common cases.

Here's some benchmarking results comparing the prior implementation
(grapheme_column_width) against this new pre-computed table
implementation (grapheme_column_width_tbl).

The ASCII case is more than 5x faster than before at a reasonably snappy
~3.5ns, with the more complex cases being closer to a constant ~20ns
down from 120ns in some cases.

There are changes here to widechar_width.rs that should get
upstreamed.

```
column_width ASCII/grapheme_column_width
                        time:   [23.413 ns 23.432 ns 23.451 ns]
column_width ASCII/grapheme_column_width_tbl
                        time:   [3.4066 ns 3.4092 ns 3.4121 ns]

column_width variation selector/grapheme_column_width
                        time:   [119.99 ns 120.13 ns 120.28 ns]
column_width variation selector/grapheme_column_width_tbl
                        time:   [21.185 ns 21.253 ns 21.346 ns]

column_width variation selector unicode 14/grapheme_column_width
                        time:   [119.44 ns 119.56 ns 119.69 ns]
column_width variation selector unicode 14/grapheme_column_width_tbl
                        time:   [21.214 ns 21.236 ns 21.264 ns]

column_width WidenedIn9/grapheme_column_width
                        time:   [99.652 ns 99.905 ns 100.18 ns]
column_width WidenedIn9/grapheme_column_width_tbl
                        time:   [21.394 ns 21.419 ns 21.446 ns]

column_width Unassigned/grapheme_column_width
                        time:   [82.767 ns 82.843 ns 82.926 ns]
column_width Unassigned/grapheme_column_width_tbl
                        time:   [24.230 ns 24.319 ns 24.428 ns]
```

Here's the benchmark summary after cleaning this diff up ready
to commit; it shows ~70-80% improvement in these cases:

```
; cargo criterion -- column_width
column_width ASCII/grapheme_column_width
                        time:   [3.4237 ns 3.4347 ns 3.4463 ns]
                        change: [-85.401% -85.353% -85.302%] (p = 0.00 < 0.05)
                        Performance has improved.

column_width variation selector/grapheme_column_width
                        time:   [20.918 ns 20.935 ns 20.957 ns]
                        change: [-82.562% -82.384% -82.152%] (p = 0.00 < 0.05)
                        Performance has improved.

column_width variation selector unicode 14/grapheme_column_width
                        time:   [21.190 ns 21.210 ns 21.233 ns]
                        change: [-82.294% -82.261% -82.224%] (p = 0.00 < 0.05)
                        Performance has improved.

column_width WidenedIn9/grapheme_column_width
                        time:   [21.603 ns 21.630 ns 21.662 ns]
                        change: [-78.429% -78.375% -78.322%] (p = 0.00 < 0.05)
                        Performance has improved.

column_width Unassigned/grapheme_column_width
                        time:   [23.283 ns 23.355 ns 23.435 ns]
                        change: [-71.826% -71.734% -71.641%] (p = 0.00 < 0.05)
                        Performance has improved.
```
2022-07-22 19:21:43 -07:00
Wez Furlong
354f3577b3 termwiz: add criterion benches for wcwidth 2022-07-22 19:21:25 -07:00
Wez Furlong
27df31e0d7 termwiz: make emoji presentation very slightly more efficient
When resolving the variation, we can pre-compute the base presentation
so let's combine that with the data in the perfect hash map.
2022-07-22 19:20:17 -07:00