Switches from using a dynamic vertex buffer to an immutable
vertex buffer. This feels counter-intuitive to me; the purpose
of dynamic is to sustain frequent updates, but mapping the buffer
needs to synchronize with the GPU, and if we are rapidly invalidating
the window that can stall painting by tens of milliseconds.
Switching to an immutable buffer avoids the stall and makes
quad mapping more consistently < 10ms, but its still not
ideal.
refs: https://github.com/wez/wezterm/issues/546
* Make window invalidation more efficient by avoiding spawning a call
that spawns a call to invalidate the window. Just directly mark as
invalidated.
* Suppress default background erase
* hoist the bg_color calc for quads that don't have Cells outside of
its loop.
refs: https://github.com/wez/wezterm/issues/546
The dwrote crate offers functions that can extract the underlying
font file name(s) from the system, so let's use those to get
OnDisk font handles and save some memory.
refs: https://github.com/wez/wezterm/issues/559
I'm not sure what exactly changed (perhaps it was a Windows updated?)
but window_background_opacity was only taking effect for windows
with no title bar.
I found that explicitly configuring a region makes transparency
work again.
refs: #553
In an earlier incarnation of both wezterm and freetype, FT_New_Face
would lead file descriptors into child processes because it didn't
set O_CLOEXEC. That led to the slightly pessimistic approach of
loading the font into memory for the lifetime of the wezterm process.
With the improved fallback handling on macos, this can result in
hundreds of MB of font data being loaded, in some cases multiple
times.
Since those days, freetype now sets O_CLOEXEC and wezterm has some
logic to close other random fds, so the descriptor leaking problem
is gone and we can now let freetype manage a file handle instead
of a memory-baked font.
This reduces the memory utilization by at least 1GB in the case
that a glyphs need to be resolved from the system fallback fonts.
refs: https://github.com/wez/wezterm/issues/559
terminus-bold.otb reports 0 height!
Detect and force that case to go through the bitmap strike loading
code path.
Improve the size selection heuristic for bitmap strikes: previously,
we would just pick the largest bitmap and allow it to be scaled down,
which was OK for emoji fonts that just had 128px square glyphs, but
is not ok for pre-rendered pixel strikes like terminus.
Note that IncreaseFontSize works in terms of percentages only,
so using a font like this may have "gaps" when ctrl-+ or - to change
the font size.
refs: https://github.com/wez/wezterm/issues/560
I don't understand how fish ends up blocking forever in the related
issue, but it shouldn't block us too! The price of this situation
is likely a lingering zombie child process but that seems fine.
refs: https://github.com/wez/wezterm/issues/558
I've had a few people comment that the screen repaints stutter
more since the most recent release.
One of the main changes in that area was to increase throughput
for timg case, where a lot of data was being pumped through.
I think that, ironically, the decreased latency results in more
frequent repaints where not all of the updated screen is visible
in a full screen redraw, so it appears more janky.
This commit introduces a small 1ms delay to see if additional
output is forthcoming when parsing the data. It will keep
delaying and accumulating until there's at least one parsed
output action to process, so there is a small constant latency
overhead added to a single character output (thread context
switch + 1ms delay).
This small delay is counter-balanced with raising the priority
of dispatching the render actions; previously we'd spawn them
at lower-than-input priority. With the batching potential,
I think spawning them at the same priority is OK; the main
reason for the lower priority was to ensure timely ctrl-c
processing when a lot of output is being dumped to the terminal.
It's hard for me to gauge whether this fixes the reported issue,
as I've been unable to reproduce it for myself.
refs: https://github.com/wez/wezterm/issues/559
refs: https://github.com/wez/wezterm/issues/546
I didn't realize that xterm inherited some additional mappings from
the X server, so this commit should make us more comformant with
xterms behavior.
Verified this by comparing `showkey -a` under both xterm and wezterm:
```
wezterm -n --config disable_default_key_bindings=true --config debug_key_events=true start -- showkey -a
```
refs: https://github.com/wez/wezterm/issues/236
refs: https://github.com/wez/wezterm/discussions/556
This is a bit unfortunate, but necessary, because the system fallback
list contains a handful of special fonts that apple doesn't ship on
disk in ttf/otf files.
One of those is `.AppleSymbolsFB` which would normally satisfy
the symbol lookup.
This commit hard codes the "Apple Symbols" font to use instead,
which is a disk based font. I don't know what the difference
is between it and `.AppleSymbolsFB`, but this is sufficient
to satisfy the glyph in question from the referenced issue.
refs: https://github.com/wez/wezterm/issues/506
CI got broken by the termwiz release. This commit teaches the
various `git describe --tags` calls to filter to the wezterm
tags which all start with the year. We're match `20*` which should
be good for the next 79 years.
I've removed the vergen dependency as there was no way to teach it
to do the equivalent matching, and it wasn't a terrible burden
to just inline the git describe call anyway.
I'm not convinced that this is 100% good, but @fanzeyi reported
some latency when using tmux to mirror two sessions. The session
that was accepting interactive input responded quickly, but the
mirroring session was laggy.
This change connects the mux pane output event to window invalidation,
which should cause repaints to happen more often.
I couldn't reproduce the scenario above on my M1 mac, but that may
just be because M1 has dark magicks.
A casualty of b8dcfba9a4 was that
the decoded gif would get reset each time the texture filled up.
Take care to move that cached into the newly minted glyphcache.
Continuing along the same lines as the prior commit, the goal
of this commit is to remove the buffer transformation that was
part of uploading the texture to the GPU provided surface.
In order to do so:
* The sense of our local textures needs to change from bgra32 to rgba32.
bgra32 was a hangover from earlier versions of our window crate that
allowed direct-to-fb writes in software mode. We had to pick bgra32
for that for the broadest OS compatibility. I believe that that
constraint has been totally removed, although there is a chance that
this will flip the colors on macos.
* There was an additional linear-to-srgb conversion inlined in that
buffer transformation. I have no idea where that is needed because
the source data is carefully constructed as SRGB. I don't yet know
how to signal that, but for now I've moved that gamma correction
into the shader when we sample the texture.
With this change, timg playback now has vtparse as the hottest
region of code.
refs: #537
Two issues highlighted by profiling:
* Clearing the texture takes a non-trivial percentage of the profile.
The docs suggest that it is better to create a new texture than
to update large portions of a texture, so add some plumbing so
that we can do that in the first texture-full case.
* Next on the list is the code that translates from linear BGRA to
SRGBA. This is present for reasons that I believe are now legacy,
but for the moment: those two primitives now have faster and
easier implementations, so simplify to those.
This improves the timg video playback performance by ~10% for me.
refs: #537
I've been meaning to do this for a while; this commit moves
the escape sequence parsing into the thread that reads the
pty output which achieves two goals:
* Large escape sequences (eg: image protocols) that span multiple
4k buffers can be processed without ping-ponging between the
reader thread and the main gui thread
* That parsing can happen in the reader thread, keeping the gui
thread more responsive.
These changes free up the CPU during intensive operations such
as timg video playback.
This is a slight layering violation, in that this processing
really belongs to local pane (or any pane that embeds Terminal),
rather than generically at the Mux layer, but it's not any
worse a violation than `advance_bytes` already was.
refs: https://github.com/wez/wezterm/issues/537
The default downscaling provided by the GPU can result in noisy
artifacts on highly detailed images.
This commit employs a cubic Catmull-Rom sampling filter for the
case where we have a single frame image that is being reduced
in size. This isn't the highest quality filter but strikes
a good balance with speed vs appearance and is strictly better
than the GPU texture sampling options that I could try.
In a couple of quick tests, this doesn't seem to break cat'ing
the emoji test data so it seems like it isn't needed anymore.
In addition, it removes a bottleneck from processing image
streams produced by timg, which tend to be very large with
new newlines.
With this change, I'm able to view a 25fps source video at
about 60fps.
refs: https://github.com/wez/wezterm/issues/537
when the size is set to auto, we'd essentially take the image as-is
and overflow the terminal.
This commit makes auto scale down the image to fit the terminal dimensions
if it is too big.
The leftmost pixel was being set to at least 1 by the scale
function.
Fix that up by computing the x coordinate without calling
the scale function.
refs: https://github.com/wez/wezterm/issues/536
Using a boxed slice means that we hold exactly the memory required
for the file data, rather than the next-power-of-two, which can
be wasteful when a large number of images are being sent to
the terminal.
This is a API breaking change for termwiz, so bump its version.
refs: #534
While adding gif support I let this become unbounded.
This commit resolves that by categorizing images as either
single frame or animations.
Single frame images are decoded and held entirely in the texture
atlas, so occupy no additional space beyond the image file contents
and their sprite region in the texture atlas.
Animations are decoded into a set of frame bitmaps. There can be
up to 16 animations (each with their full set of frames) cached.
The individual frames may also exist within the texture atlas
if space permits.
refs: #534