Introduces a heap-based quad allocator whose output we cache on a
per-line basis: if a line is unchanged, we simply copy its previously
computed set of quads into the GPU quad buffer.
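A minimal sketch of the shape of that cache (stand-in types: `Quad`,
the line ids and the seqno-based invalidation here are illustrative,
not the actual wezterm code):
```
use std::collections::HashMap;

// Stand-in for the real per-cell vertex data.
#[derive(Clone)]
struct Quad {
    // vertex attributes elided
}

#[derive(Default)]
struct QuadCache {
    // keyed by a stable per-line id; the seqno records which version
    // of the line the cached quads were computed for
    lines: HashMap<u64, (u64, Vec<Quad>)>,
}

impl QuadCache {
    fn quads_for_line(
        &mut self,
        line_id: u64,
        seqno: u64,
        compute: impl FnOnce() -> Vec<Quad>,
    ) -> &[Quad] {
        let slot = self
            .lines
            .entry(line_id)
            .or_insert_with(|| (seqno.wrapping_sub(1), Vec::new()));
        if slot.0 != seqno {
            // the line changed (or is new): recompute its quads
            *slot = (seqno, compute());
        }
        // an unchanged line takes the fast path: the caller just
        // copies the cached quads into the gpu quad buffer
        &slot.1
    }
}
```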
The results are encouraging with respect to constructing those quads:
`quad_buffer_apply` is the cost of the copy operation; compare it with
`render_screen_line_opengl`, which is the cost of computing the quads.
The copy is ~300x better at p50 and >100x better at p95 for a
full-screen-updating program (`top` on a full 2880x1800 screen):
```
STAT                            p50       p75       p95
Key(quad_buffer_apply)          2.26µs    5.22µs    9.60µs
Key(render_screen_line_opengl)  610.30µs  905.22µs  1.33ms
Key(gui.paint.opengl)           35.39ms   37.75ms   45.88ms
```
However, the extra buffering does increase the latency of
`gui.paint.opengl` (the overall cost of painting a frame); contrast the
numbers above with the latency in the same scenario on the current
`main` (rather than this branch):
```
Key(gui.paint.opengl) 19.14ms 21.10ms 28.18ms
```
Note that for an idle screen this latency is ~1.5ms but that is also true
of `main`.
While the overall latency in the histogram isn't a slam dunk,
running `time cat bigfile` is ~10% faster on my Mac.
I'm sure there's something that can be shaved off to get a more
convincing win.
This is really a proof-of-concept commit: I want to be able to pass
more structured data into the shader as uniforms, and the basic
macros provided by glium make that a bit awkward.
What I came up with is a slightly more dynamic uniform builder
thingy.
I'm using this to pass in a copy of the various blinking easing
functions.
Those are incomplete and unused, but they show that the technique works.
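The builder is essentially a list of name→value pairs that implements
glium's `Uniforms` trait. A minimal sketch of the idea (the names and
the owned-value enum here are hypothetical, not the actual wezterm
code):
```
use glium::uniforms::{UniformValue, Uniforms};

// Owned storage so we don't fight UniformValue's borrowed lifetime.
enum OwnedValue {
    Float(f32),
    Vec4([f32; 4]),
    // ...extend with whatever the shader needs
}

#[derive(Default)]
struct UniformBuilder {
    values: Vec<(String, OwnedValue)>,
}

impl UniformBuilder {
    fn add_float(&mut self, name: &str, v: f32) {
        self.values.push((name.to_string(), OwnedValue::Float(v)));
    }
    fn add_vec4(&mut self, name: &str, v: [f32; 4]) {
        self.values.push((name.to_string(), OwnedValue::Vec4(v)));
    }
}

impl Uniforms for UniformBuilder {
    fn visit_values<'a, F: FnMut(&str, UniformValue<'a>)>(&'a self, mut visit: F) {
        for (name, value) in &self.values {
            match value {
                OwnedValue::Float(v) => visit(name.as_str(), UniformValue::Float(*v)),
                OwnedValue::Vec4(v) => visit(name.as_str(), UniformValue::Vec4(*v)),
            }
        }
    }
}
```
Unlike the `uniform!` macro, the set of uniforms doesn't have to be
known at compile time, which is what makes it easier to feed in
structured data like the easing function parameters.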
It's not the first time that I've solved a problem by slowing things
down... In this case, a couple of very inefficient TUI programs
produced flickering output in wezterm because they were filling a
buffer with a bunch of spaces to erase the screen before sending the
main body of their updates in a subsequent buffer chunk. wezterm would
render the intervening, partially blank frame and appear to flicker.
The resolution is to add a small delay (3ms by default) before sending
data to the terminal model. If more output becomes readable within
that window, we accumulate it with the pending set of actions so that
the whole batch can be applied "more atomically".
Take care: `time cat bigfile` is sensitive to this, so we want to keep
the latency as small as possible, and we also want to avoid
accumulating actions indefinitely and only flushing them at the end of
the file. We use the existing buffer size (~1MB) as a threshold: we
track the number of input bytes that produced the current set of
actions, and if that count exceeds the buffer size, we flush.
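In outline, the accumulation logic looks something like this (a hedged
sketch with made-up names; `Action` stands in for the parsed terminal
actions, and a timer elsewhere also calls `flush` once the delay
elapses with no further output):
```
use std::time::{Duration, Instant};

// Placeholder for the parsed terminal actions.
struct Action;

const COALESCE_DELAY: Duration = Duration::from_millis(3);
const FLUSH_THRESHOLD: usize = 1024 * 1024; // ~1MB: the existing buffer size

#[derive(Default)]
struct Coalescer {
    pending: Vec<Action>,
    pending_input_bytes: usize,
    first_chunk: Option<Instant>,
}

impl Coalescer {
    // Called for each chunk read from the pty; `input_len` is the
    // number of raw input bytes that produced `actions`.
    fn accumulate(&mut self, input_len: usize, actions: Vec<Action>) {
        self.pending.extend(actions);
        self.pending_input_bytes += input_len;
        let first = *self.first_chunk.get_or_insert_with(Instant::now);
        // Flush if we've held data for the full delay window, or if
        // we've accumulated a buffer's worth of input; the latter keeps
        // `time cat bigfile` from buffering everything until EOF.
        if first.elapsed() >= COALESCE_DELAY
            || self.pending_input_bytes >= FLUSH_THRESHOLD
        {
            self.flush();
        }
    }

    fn flush(&mut self) {
        let batch = std::mem::take(&mut self.pending);
        self.pending_input_bytes = 0;
        self.first_chunk = None;
        apply_to_terminal_model(batch); // the batch lands "more atomically"
    }
}

fn apply_to_terminal_model(_actions: Vec<Action>) {
    // stub: hands the whole batch to the terminal model in one go
}
```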
refs: https://github.com/wez/wezterm/issues/2443
Weirdly, BOOL is considered to be bool when I compile locally,
but in the CI:
```
error[E0308]: mismatched types
--> window/src/os/macos/connection.rs:170:22
|
170 | let max_fps = if has_max_fps {
| ^^^^^^^^^^^ expected `bool`, found `i8`
```
I can't explain the difference in behavior (it feels like a compiler
bug?), but let's try comparing explicitly against YES.
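For illustration, the shape of the workaround, assuming the objc
crate's `BOOL`/`YES` definitions (the function and its parameters are
hypothetical):
```
use objc::runtime::{BOOL, YES};

// BOOL is `bool` on some targets and `i8` (signed char) on others,
// and YES is defined appropriately for each, so an explicit comparison
// compiles everywhere, while a bare `if has_max_fps` does not.
fn query_max_fps(has_max_fps: BOOL, queried: i64) -> Option<i64> {
    if has_max_fps == YES {
        Some(queried)
    } else {
        None
    }
}
```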
Use the font height, rather than the width, as the basis for the
button size, to avoid the buttons being too condensed.
Explicitly use the pixel height for both dimensions so that the
buttons are square.
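Illustratively (a hypothetical helper, not the actual code):
```
// Both dimensions derive from the font's pixel height, so the
// resulting button is square regardless of the cell width.
fn button_dimensions(font_pixel_height: usize) -> (usize, usize) {
    (font_pixel_height, font_pixel_height)
}
```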
refs: https://github.com/wez/wezterm/issues/2399
The prior mutually exclusive behavior kept surprising people so let's
just flip this around.
This is potentially a "breaking" change for folks, but I think it is
worth it.
d2892c6 switched to using recency only, but neglected to verify that
the edges of the candidate panes were actually touching, leading to
some weird results.
This commit uses recency only when the edges intersect; otherwise the
candidate scores 0.
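In effect (an illustrative sketch, not the literal scoring code):
```
// Recency only participates when the candidate pane's edge actually
// touches the edge we're navigating across.
fn score(edges_intersect: bool, recency: usize) -> usize {
    if edges_intersect {
        1 + recency // most recently used adjacent pane wins
    } else {
        0 // not actually adjacent: never selected
    }
}
```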
refs: #2374
This slightly improves the startup time of wezterm.
Right now we query the portal appearance value over dbus every time
that we access it, for example every time that the user calls
wezterm.gui.get_appearance() from the Lua interface.
Queries over dbus are slow; they usually take a few milliseconds to
complete. On my system, for example, a portal query over dbus takes
around 2 milliseconds.
Wezterm also automatically queries the portal during its own internal
x11/wayland connection initialization, so right now wezterm queries
the appearance portal setting n+1 times on startup, where n is the
number of times that the user calls get_appearance() from the config.
To fix this problem, we simply cache the portal appearance value.
This patch thus decreases the startup time by 2ms for users that
configure wezterm to follow the global system theme, and potentially
by more for users that call get_appearance() in inflationary amounts.
With the naive implementation wezterm would be subject to the following
race condition:
1. wezterm calls get_appearance() and caches the value
2. System-wide dark mode changes
3. wezterm subscribes to portal notifications
In that scenario wezterm would miss the dark mode switch entirely and
would cache the wrong value until dark mode switches again at some
point after wezterm has subscribed.
To fix this race condition we call read_setting() again **after** we
have subscribed, just to be on the safe side.
Note that while this still introduces a second "redundant" dbus query
for the same value, this time it does not actually block startup,
since it happens on another thread.
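As a hedged sketch of that flow, with stubs standing in for the dbus
calls and hypothetical names throughout:
```
use std::sync::Mutex;

#[derive(Clone, Copy, PartialEq, Debug)]
enum Appearance {
    Light,
    Dark,
}

static CACHED: Mutex<Option<Appearance>> = Mutex::new(None);

// Stub standing in for the ~2ms dbus round-trip to the portal.
fn read_setting() -> Appearance {
    Appearance::Light
}

// Stub standing in for subscribing to portal change notifications.
fn subscribe_to_changes(_on_change: impl Fn(Appearance) + Send + 'static) {}

fn get_appearance() -> Appearance {
    if let Some(cached) = *CACHED.lock().unwrap() {
        return cached; // fast path: no dbus query
    }
    let value = read_setting();
    *CACHED.lock().unwrap() = Some(value);

    subscribe_to_changes(|new_value| {
        *CACHED.lock().unwrap() = Some(new_value);
    });

    // Re-read *after* subscribing, off the startup path, so a theme
    // change that raced the subscription can't leave a stale cache.
    std::thread::spawn(|| {
        *CACHED.lock().unwrap() = Some(read_setting());
    });

    value
}
```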
refs: #2258
I was seeing a black "hole" in the center of this gradient:
```
background = {
  {
    source = {
      Gradient = {
        colors = { "rgb(45,26,109)", "black" },
        orientation = {
          Radial = {
            cx = 0.75,
            cy = 0.75,
            radius = 1.25,
          },
        },
      },
    },
    width = "100%",
    height = "100%",
  },
}
```
Setting noise=0 "fixed" it, so this commit localizes that fix to the
center of the gradient by preventing the noise from wrapping around
the gradient.
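The shape of the fix, as an illustrative sketch (the real change lives
in wezterm's gradient code; this just contrasts clamping with
wrapping):
```
// Clamp the noise-perturbed sample position to the ends of the
// gradient instead of letting it wrap, so noise near t=0 can't pull
// in colors from the far end and punch a "hole" in the middle.
fn sample_index(t: f64, noise: f64, num_colors: usize) -> usize {
    let perturbed = (t + noise).clamp(0.0, 1.0);
    (perturbed * (num_colors - 1) as f64).round() as usize
}
```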
e6421d1b72 removed this bit from the
example:
```
-
- // Note that we're waiting until after we've read the output
- // to call `wait` on the process.
- // On macOS Catalina, waiting on the process seems to prevent
- // its output from making it into the pty.
- println!("child status: {:?}", child.wait().unwrap());
```
This commit revisits that and puts in place a differently horrible
solution for macOS.
The issue is that if we don't put in a short sleep on macOS, then
for a short-lived process like `whoami` in this example, we end up
reading output like this:
```
; cargo run --example whoami
Compiling portable-pty v0.8.0 (/Users/wez/wez-personal/wezterm/pty)
Finished dev [unoptimized + debuginfo] target(s) in 0.60s
Running `/Users/wez/wez-personal/wezterm/target/debug/examples/whoami`
child status: ExitStatus { code: 0, signal: None }
output: \r\n^D\u{8}\u{8}wez\r\n%
```
where the EOT that we send on Drop somehow gets *prepended* to the
output that we read from the pty, with a couple of backspaces thrown
in.
I'm not sure WTF is happening on macOS for that to occur; it feels
like some kind of race wrt. process startup and initializing the pty
in the system.
The reader also has to be started before we close the write end,
otherwise the same issue can occur.
This breaking API change allows us to explicitly generate EOF when the
taken writer is dropped.
The examples have been updated to show how to manage reading, writing
and waiting without deadlock on both Linux and Windows.
Need to confirm that this is still good on macOS, but my
confidence is high.
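As a condensed sketch of that shape using the portable-pty API (this
is not the full deadlock-safe example from the repo; the 20ms sleep
and the error handling here are illustrative):
```
use portable_pty::{native_pty_system, CommandBuilder, PtySize};
use std::io::Read;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let pty_system = native_pty_system();
    let pair = pty_system.openpty(PtySize::default())?;

    let mut child = pair.slave.spawn_command(CommandBuilder::new("whoami"))?;
    // Release our handle on the slave side so that reads can observe
    // EOF once the child exits.
    drop(pair.slave);

    // Start the reader before closing the write side; on macOS a
    // short sleep also helps short-lived children get their output
    // into the pty.
    let mut reader = pair.master.try_clone_reader()?;
    #[cfg(target_os = "macos")]
    std::thread::sleep(std::time::Duration::from_millis(20));

    // Dropping the taken writer is what generates EOF for the child.
    drop(pair.master.take_writer()?);

    // Drain the output before waiting on the child (see the macOS
    // Catalina note above), then reap it.
    let mut output = String::new();
    reader.read_to_string(&mut output)?;
    drop(pair.master);

    println!("output: {output:?}");
    println!("child status: {:?}", child.wait()?);
    Ok(())
}
```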
I've also removed ssh2 support from this crate as part of this
change. We haven't used it directly in wezterm in a long while
and removing it from here means that there is slightly less code
to keep compiling over and over.
refs: https://github.com/wez/wezterm/discussions/2392
refs: https://github.com/wez/wezterm/issues/1396
I'd forgotten that this was in here.
The recent-ish fixes to flush for various terminal queries likely
need this to be 100% correct. I'm surprised this hasn't come up as a
bug already :-/