mirror of https://github.com/wez/wezterm.git (synced 2024-12-24 22:01:47 +03:00)
00ddfbf9b8
Introduces a heap-based quad allocator that we cache on a per-line basis, so if a line is unchanged we simply need to copy the previously computed set of quads for it into the gpu quad buffer.

The results are encouraging wrt. constructing those quads; the `quad_buffer_apply` is the cost of the copy operation, compare with `render_screen_line_opengl` which is the cost of computing the quads; it's 300x better at the p50 and >100x better at p95 for a full-screen updating program:

full 2880x1800 screen top:

```
STAT                            p50       p75       p95
Key(quad_buffer_apply)          2.26µs    5.22µs    9.60µs
Key(render_screen_line_opengl)  610.30µs  905.22µs  1.33ms
Key(gui.paint.opengl)           35.39ms   37.75ms   45.88ms
```

However, the extra buffering does increase the latency of `gui.paint.opengl` (the overall cost of painting a frame); contrast the above with the latency in the same scenario with the current `main` (rather than this branch):

```
Key(gui.paint.opengl)           19.14ms   21.10ms   28.18ms
```

Note that for an idle screen this latency is ~1.5ms but that is also true of `main`.

While the overall latency in the histogram isn't a slam dunk, running `time cat bigfile` is ~10% faster on my mac.

I'm sure there's something that can be shaved off to get a more convincing win.
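The per-line caching idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `Quad`, `LineQuadCache`, `compute_quads`, the cell size constants, and the choice to detect changes by comparing the line text are all hypothetical stand-ins, and a plain `Vec<Quad>` plays the role of the gpu quad buffer.

```rust
use std::collections::HashMap;

// Hypothetical cell size in pixels, used only for this illustration.
const CELL_WIDTH: f32 = 8.0;
const CELL_HEIGHT: f32 = 16.0;

/// Hypothetical stand-in for one quad (a rectangle to be drawn).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Quad {
    pub x: f32,
    pub y: f32,
    pub width: f32,
    pub height: f32,
}

/// Heap-allocated quads for one line, plus the text they were built from.
struct CachedLine {
    content: String,
    quads: Vec<Quad>,
}

/// Per-line cache of previously computed quads (assumed design).
#[derive(Default)]
pub struct LineQuadCache {
    lines: HashMap<usize, CachedLine>,
    pub hits: usize,
    pub misses: usize,
}

/// The "expensive" step: build one quad per non-whitespace cell.
fn compute_quads(row: usize, text: &str) -> Vec<Quad> {
    text.chars()
        .enumerate()
        .filter(|(_, c)| !c.is_whitespace())
        .map(|(col, _)| Quad {
            x: col as f32 * CELL_WIDTH,
            y: row as f32 * CELL_HEIGHT,
            width: CELL_WIDTH,
            height: CELL_HEIGHT,
        })
        .collect()
}

impl LineQuadCache {
    /// Append the quads for `row` to `gpu_buffer`, reusing the cached
    /// quads when the line text is unchanged since the last call.
    pub fn render_line(&mut self, row: usize, text: &str, gpu_buffer: &mut Vec<Quad>) {
        if let Some(cached) = self.lines.get(&row) {
            if cached.content == text {
                // Unchanged line: only pay for the copy.
                self.hits += 1;
                gpu_buffer.extend_from_slice(&cached.quads);
                return;
            }
        }
        // New or changed line: recompute, copy, and remember the result.
        self.misses += 1;
        let quads = compute_quads(row, text);
        gpu_buffer.extend_from_slice(&quads);
        self.lines.insert(
            row,
            CachedLine {
                content: text.to_string(),
                quads,
            },
        );
    }
}
```

In this sketch the cache hit path corresponds loosely to the copy cost measured as `quad_buffer_apply`, and the miss path to the quad computation measured as `render_screen_line_opengl`. A real implementation would presumably use a cheaper change-detection key than a full text comparison.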
src
Cargo.toml