Do it "more properly": use intrusive list linkage to manage three
indices:
* hash lookup
* recency
* frequency
eviction is frequency based, but in order to avoid things that were
super hot in the past and that are no longer hot clogging up the cache,
eviction will incrementally age out least recently used entries that
haven't been active in the span of some number of get/put operations.
Aging scales down the frequency value to reduce its strength, so an aged
item isn't necessarily immediately a candidate for removal, it just
makes it more likely to be picked up by the frequency based removal as
it goes unused for an extended period.
This sequence forms a grapheme with cell_width=2, but harfbuzz returns
it as two distinct clusters, causing our prior logic to shape them
independently from different fonts, but our logic for assessing width
would resolve them both to the same cell and double-count their width,
leading to issues with the rendered result.
This commit revises our clustering logic to add a pass that compares
the harfbuzz cluster positions with the cell-based positions from
the presentation_width that may have been provided. We use the starting
cell positions from that to order and de-dup so that clusters aren't
split in the wrong place.
refs: https://github.com/wez/wezterm/issues/2572
I was going to upgrade to the unicode 15 font, but in testing this I
decided that the logic is slightly complex and the glyphs are often
difficult to see at most terminal font sizes, which generates questions
from users, so just fall back to notdef.
The clustered line storage impl of get_cell() is O(column-index)
as cells have to be iterated in order.
The render pass was calling it 3 times to resolve information
about the cursor; reduce that to just once.
It was also calling it once per cell to determine whether the
cell needed to be replaced with a custom glyph if custom block
glyphs were enabled, making it accidentally quadratic!
While it wasn't especially expensive, it did show up in profiling,
so this commit removes that call: we can cache the block glyph
key in the shaper info which is a better place for it anyway,
so that's what we do.
Similarly, there was some extraneous work to call get_cell
when computing some shaper info; remove that too!
That one might be slightly contentious: the is-followed-by-space
logic used to check the successor cell by index to see if it
was a space, but now looks at the successor shaped glyph to see
if it was a space. That might actually be a better choice, but
it may have some ripple effects.
According to its benchmarks, it's almost 2x faster than
unicode_segmentation. It doesn't appear to make a visible
difference to `time cat bigfile`, but I'll take anything
that gives more headroom for such little effort of switching.
When falling back to presentation-ambivalent font lookup, restart
our shaper run from font_idx=0 rather than the current font index.
For:
```
wezterm ls-fonts --text "$(printf \\U1faf0)"
```
the current font index was the last resort font so we'd never
search over anything useful in the second pass.
Looking back at 4fc8dfb374, I can't
see a good reason for starting at the later offset.
We were getting data like this back from harfbuzz:
shaped font_idx=0 "파인더에서 만든 폴더" as: [.notdef=0+448|.notdef=0+448|.notdef=6+448|.notde=6+448|.notdef=6+448|.notdef=15+448|.notdef=15+448|.notdef=21+448|.notdef=21+448|.notdef=27+448|.notdef=27+448|space=33+448|.notdef=34+448|.notdef=34+448|.notdef=34+448|.notdef=43+448|.notdef=43+448|.notdef=43+448|space=52+448|.notdef=53+448|.notdef=53+448|.notdef=53+448|.notdef=62+448|.notdef=62+448]
which had a series of discontiguous and repeated runs for the same
cluster/region, but due to a logic error, we weren't coalescing them
together and were passing somewhat nonsensical ranges to the
next font for shaping.
This commit "simply" ensures that we code the checks for codepoint==0
(the `.nodef` seen above) and the results are at least correct
for this particular case.
refs: https://github.com/wez/wezterm/issues/2482
This commit is a result of debugging a problem with a particular font
where a VS2 variation of a glyph produced incorrect shaping.
Further investigation showed that this was specific to using freetype
font functions and that switching to opentype functions produces the
correct shaping information in that case.
Further-further investigation showed that this difference in behavior
was really due to the font file being out of spec and freetype returning
unexpected data as a result.
This commit allows switching to using harfbuzz's own opentype font+face
(rather than freetype) and/or switching the freetype-based font to using
opentype font functions. These two options don't yield equivalent
results in the wezterm integration: a couple of our shaping tests
fail due to the x-advances not being the same. Those can be drastically
different in some cases (eg: I seem to get 0 for certain bitmap strikes,
and for Roboto the value is off by a factor of about 1.5).
This is incomplete and not something I want to turn on by default at
the moment, but I don't want to lose this work, hence this commit.
refs: https://github.com/wez/wezterm/issues/2475
refs: https://github.com/harfbuzz/harfbuzz/issues/3806
We've been using hb with freetype faces. This gives us the option
of using harfbuzz's own opentype based face and function set.
It doesn't change any behavior: this is just newly available code.
refs: https://github.com/wez/wezterm/issues/2475
CTRL-SHIFT-U is a new default key assignment for this new modal.
It opens up a fuzzy searchable browser that defaults to showing
emoji/emoticons. The category can by cycled through the suggested
emoji categories using CTRL-r. Unlike the system emoji palette,
wezterm includes a category for nerdfont symbols, and another
that is a list of all unicode codepoint names, so you should be
able to browse for pretty much any codepoint you can think of.
The modal also allows fuzzy searching based on:
* The official unicode name
* The github shortcode
* codepoint value in hex
so if you know the codepoint value but not the name, you can
still find a way to input what you're looking for.
Pressing Enter will copy the selected item to the clipboard
send it to the active pane, and cancel the modal. You can therefore
repeat the insert by simply pasting.
I plan to add frecency to this in a later commit: that way the
frequently/recently used selections will show in a category of
their own and make it easier to re-input them.
refs: https://github.com/wez/wezterm/issues/2163
This avoids a confusing situation where we'd not bother to resolve
a font for a given codepoint for say bold text if we had previously
resolved it for normal text.
This hashset would be more useful if we maintained a mapping of
codepoint to font at that layer, but we don't.
The simplest thing to do for this is to allow the search to continue
and succeed.
refs: https://github.com/wez/wezterm/issues/2158
refs: https://github.com/wez/wezterm/issues/1913
wezterm makes two passes over fonts when resolving emoji, based on
the presentation of the text. If the text has a defined presentation
via a variation selector, then the first pass will only consider
fonts that match that presentation.
If no fonts match, a second pass is made to locate any font with
the appropriate glyphs.
Previously, the font was assumed to have emoji presentation if it
had color glyphs. That meant that if you chose a monochrome emoji
font then it would be skipped for regular emoji presentation and
we'd end up matching against the default fallback for the Noto Color
Emoji font.
This commit changes the behavior to consider any explicitly listed
font with "Emoji" in the name as having emoji presentation.
This is also an imperfect heuristic, but perhaps it is good enough?
refs: #1959
Freetype has a compile-time feature that, when enabled, rewrites the
font names of PCF fonts to include the foundry and wide status of the
font in order to disambiguate the various versions of fonts all named
"Fixed".
That option is enabled by default in some linux distributions but not
all; it's not enabled in Fedora, for example.
When that feature is enabled it causes problems for the Terminus font as
the PCF version of the fonts are no longer resolvable via the simple
"Terminus" name but via "xos4 Terminus" instead.
wezterm builds its own version of freetype (for consistency and cross
platform support reasons), and is unaware of the choice used by the
distro.
The result of that is that fontconfig may see PCF fonts as having
different font names than how wezterm sees them.
A concrete problem is on such a system, when requesting "xos4 Terminus",
fontconfig will present a match with that name, but when wezterm opens
the font and sees that it has name "Terminus" (because of the difference
in this feature in the freetype libraries in use), wezterm will reject
that match.
This commit enables that option in the freetype library and adds a
wezterm config level option (freetype_pcf_long_family_names) that can be
used to control the underlying pcf font driver configuration.
The upshot of this is that this commit doesn't change any default
behavior, but allows users of those systems to set
`freetype_pcf_long_family_names = true` to turn that behavior on.
My personal opinion on this is that users probably shouldn't use this if
they can avoid it (and PCF fonts in general), and instead install the
OTB version of the Terminus font, which doesn't have this legacy baggage
associated with it!
refs: https://github.com/wez/wezterm/issues/2100
For fonts like Lucida Console on Windows which do not have a bold
variant, we were not synthesizing bold.
The reason was that the config-level "make bold" logic works by adding
200 to the weight which takes normal -> demibold, but the bold synthesis
logic is enabled only for bold and higher.
This commit changes the threshold for synthesis to demibold or higher.
refs: https://github.com/wez/wezterm/issues/2074
harfbuzz can return incomplete overlapping runs when it processes
text in unicode NFD. Add another check for the case where we've
accumulated the bytes in the range 0-12 and then harfbuzz returns
another range of 6-12. We coalesce the two together so that we can
pass the full unresolved sequence to the next fallback pass.
refs: https://github.com/wez/wezterm/issues/2032
This is still a bit of a WIP, but this commit:
* Introduces a new "Modal" concept to the GUI layer. The intent is
that modal intercepts key and mouse events while active, and renders
over the top of the rest of the normal display.
I think there might be a couple of cases where key events skirt
through this, but this is good enough as a first step.
Also, the render is forced into layer 1 which has some funny side
effects: if the modal choses to render transparent, it will poke
a hole in the window because all the rendering happens together:
there aren't distinct layer compositing passes.
* Add a new PaneSelect action that is implemented as a modal.
It uses quickselect style alphabet -> pane label generation and
renders the labels over ~the middle of each pane using an
enlarged version of the window frame font. Typing the label
will activate that pane. Escape will cancel the modal.
More styling and docs will follow in a later commit.
refs: #1975
Go directly to the underlying env_logger crate, as pretty_env_logger
hasn't been updated in some time, and I'd like to be able to redirect
the log output to a file more directly, and that feature is in a newer
version of the env logger than pretty_env_logger was pulling in.
Some versions of fontconfig classify some fonts as having charcell
spacing. We need to consider those as monospace as well.
refs: https://github.com/wez/wezterm/issues/1820
We were using a simple hashmap of name -> parsed file, but for
something like Terminus where there are ~10 files per weight
but for different pixel sizes, we'd end up forgetting files
beyond the first.
This commit tracks the full list in the font db.
related to #1820 in the sense that I needed this to confirm that
we do handle BDFs, but it is not the issue reported there because
that was sourcing fonts via fontconfig on linux.
The gist of the issue is that when setting eg: scale=1.2 to draw
a larger CJK glyph, it is drawn at the same descender level, which
makes it more likely to leave the top of the cell.
This commit adjusts the y position by the difference between the
original and the scaled descender so that is less likely to cause
problems.
refs: https://github.com/wez/wezterm/issues/1803
The broot icon font has glyphs with horizontal advance set to 0. That
would cause us to consider the glyph to be zero width, so handle that as
a special case. Note that it is legit for certain cells to end up with
a zero advance/width during shaping if they represent combining
characters: this is more common in Arabic scripts.
refs: https://github.com/wez/wezterm/issues/1787
This commit allows specifying a scaling factor as part of the font
attribute definition. This scaling factor is fed through to the
rasterizer and the shaper to adjust the actual font size that is
loaded.
The intent is to provide manual control for situations where the
fallback font has a different scale to the primary font and renders
either too small or too large.
The concrete example is
https://github.com/wez/wezterm/issues/1761#issuecomment-1079708207 where
the CJK fallback looks too small.
The scaling factor doesn't influence font metrics so it may also be
desirable to configure line height.
```lua
local wezterm = require 'wezterm'
return {
line_height = 1.2,
font = wezterm.font_with_fallback({
"JetBrains Mono",
{family="Microsoft YaHei", scale=1.5},
}),
}
```
freetype can't handle a wide range of encodings for
font names and can return strings like `?????` when
the family name is only present in the font as a non-unicode encoding,
such as Chinese.
This commit improves our handling of the font name table
and prefers to use results from processing that over the
results returned for eg: font family directly from the
freetype API.
refs: https://github.com/wez/wezterm/issues/1761
after reading around italic vs. oblique, I think "slant" is a misleading
way to categorize this, as slant implies something about the angle of
the font but really the difference between italic and oblique is purely
stylistic and without a suggested angle.
style more closely matches the CSS name which is well understood by
many, so we go for that.
refs: #1646
There are three slants that are broadly recognized; normal, italic and
oblique. Prior to this commit, we only considered normal or italic.
This is mostly a mechanical change to replace the boolean with the enum,
and rename the field from `italic` to `slant`.
For the sake of backwards compatibility with existing configs, the lua
helpers for working with fonts continue to accept boolean italic values
and rewrite them to the equivalent slant value.
refs: #1646
Resolves a little bit of the awkward duplication of color types
between some of the crates by factoring them a little bit better.
This is prep for allowing specifying alpha for some colors
in the config.