mirror of
https://github.com/wez/wezterm.git
synced 2024-12-29 00:21:57 +03:00
087e11ebbb
Previously the same emoji could appear multiple times in the CharSelect modal for emoji input, because one emoji might have multiple aliases. Aliases often have similar names, which makes it especially likely that a fuzzy match hits several aliases at once. The same Unicode char may even match more than once, both as a Character::Unicode and as a Character::Emoji.

To make the deduplication easy, store the results in a hash map instead of a vector. We use the glyph as the key of the map to get free deduplication, and only update the mapped value if a duplicate entry would improve the score.

Performance-wise this is pretty much identical to the previous state. We do see a minor performance regression for very large n (expected, as we do more work), but the use of the HashMap makes up for a large part of it. If the user types more than 3 characters, the performance is effectively identical. For fewer than 3 characters, the performance was unacceptable anyway (about 700 ms before this patch, about 800 ms after this patch on my system).

Here is a side-by-side comparison for a user iteratively typing the query "no-evil":

# | Before | After
---|---|---
1 | 718.361276ms | 837.612275ms
2 | 719.532450ms | 816.348394ms
3 | 349.625101ms | 369.726458ms
4 | 356.349671ms | 354.367768ms
5 | 363.862194ms | 361.985546ms
6 | 372.339582ms | 370.022932ms
7 | 381.123785ms | 378.349672ms

In fact, for small n, the hash map seems to perform even slightly better than the vector. For large n we need to optimize the performance anyway, as both 700 ms and 800 ms are unacceptable. Thus, this is worth it for the benefit of Unicode symbol deduplication.
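The deduplication described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the `Match` struct, its fields, the `dedup_by_glyph` function, and the alias names are all hypothetical, and it assumes that a higher score means a better match.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical stand-in for one fuzzy-match result (not WezTerm's actual type).
#[derive(Debug, Clone, PartialEq)]
struct Match {
    glyph: String, // the character shown to the user; used as the dedup key
    alias: String, // the alias that matched the query
    score: u32,    // assumption: higher is better
}

/// Keep only the best-scoring match per glyph.
fn dedup_by_glyph(matches: impl IntoIterator<Item = Match>) -> Vec<Match> {
    let mut best: HashMap<String, Match> = HashMap::new();
    for m in matches {
        match best.entry(m.glyph.clone()) {
            // Glyph already present: replace it only if the new score is better.
            Entry::Occupied(mut e) => {
                if m.score > e.get().score {
                    e.insert(m);
                }
            }
            // First time this glyph is seen: store it.
            Entry::Vacant(e) => {
                e.insert(m);
            }
        }
    }
    // HashMap iteration order is unspecified, so sort for a stable display order:
    // best score first, ties broken by glyph.
    let mut out: Vec<Match> = best.into_values().collect();
    out.sort_by(|a, b| b.score.cmp(&a.score).then_with(|| a.glyph.cmp(&b.glyph)));
    out
}
```

Keying on the glyph means that a character matched through several aliases, or once as a Unicode name and once as an emoji alias, collapses into a single entry without a separate deduplication pass.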
Name |
---|
src |
build.rs |
Cargo.toml |