zed/crates/rope
Thorsten Ball 0533923f91
Reduce memory usage to represent buffers by up to 50% (#10321)
This should help with some of the memory problems reported in
https://github.com/zed-industries/zed/issues/8436, especially the ones
related to large files (see:
https://github.com/zed-industries/zed/issues/8436#issuecomment2037442695),
by **reducing the memory required to represent a buffer in Zed by
~50%.**

### How?

Zed's memory consumption is dominated by the in-memory representation of
buffer contents.

On the lowest level, the buffer is represented as a
[Rope](https://en.wikipedia.org/wiki/Rope_(data_structure)) and that's
where the most memory is used. The layers above — buffer, syntax map,
fold map, display map, ... — basically use "no memory" compared to the
Rope.

Zed's `Rope` data structure is itself implemented as [a `SumTree` of
`Chunks`](8205c52d2b/crates/rope/src/rope.rs (L35-L38)).

An important constant at play here is `CHUNK_BASE`:

`CHUNK_BASE` is the maximum length of a single text `Chunk` in the
`SumTree` underlying the `Rope`. In other words: It determines into how
many pieces a given buffer is split up.

By changing `CHUNK_BASE` we can adjust the level of granularity
withwhich we index a given piece of text. Theoretical maximum is the
length of the text, theoretical minimum is 1. Sweet spot is somewhere
inbetween, where memory use and performance of write & read access are
optimal.

We started with `16` as the `CHUNK_BASE`, but that wasn't the result of
extensive benchmarks, more the first reasonable number that came to
mind.

### What

This changes `CHUNK_BASE` from `16` to `64`. That reduces the memory
usage, trading it in for slight reduction in performance in certain
benchmarks.

### Benchmarks

I added a benchmark suite for `Rope` to determine whether we'd regress
in performance as `CHUNK_BASE` goes up. I went from `16` to `32` and
then to `64`. While `32` increased performance and reduced memory usage,
`64` had one slight drop in performance, increases in other benchmarks
and substantial memory savings.

| `CHUNK_BASE` from `16` to `32` | `CHUNK_BASE` from `16` to `64` |
|-------------------|--------------------|
|
![chunk_base_16_to_32](https://github.com/zed-industries/zed/assets/1185253/fcf1f9c6-4f43-4e44-8ef5-29c1e5d8e2b9)
|
![chunk_base_16_to_64](https://github.com/zed-industries/zed/assets/1185253/d82a0478-eeef-43d0-9240-e0aa9df8d946)
|

### Real World Results

We tested this by loading a 138 MB `*.tex` file (parsed as plain text)
into Zed and measuring in `Instruments.app` the allocation.

#### standard allocator
Before, with `CHUNK_BASE: 16`, the memory usage was ~827MB after loading
the buffer.

| `CHUNK_BASE: 16` |
|---------------------|
|
![memory_consumption_chunk_base_16_std_alloc](https://github.com/zed-industries/zed/assets/1185253/c1e04c34-7d1a-49fa-bb3c-6ad10aec6e26)
|


After, with `CHUNK_BASE: 64`, the memory usage was ~396MB after loading
the buffer.

| `CHUNK_BASE: 64` |
|---------------------|
|
![memory_consumption_chunk_base_64_std_alloc](https://github.com/zed-industries/zed/assets/1185253/c728e134-1846-467f-b20f-114a582c7b5a)
|


#### `mimalloc`

`MiMalloc` by default and that seems to be pretty aggressive when it
comes to growing memory. Whereas the std allocator would go up to
~800mb, MiMalloc would jump straight to 1024MB.

I also can't get `MiMalloc` to work properly with `Instruments.app` (it
always shows 15MB of memory usage) so I had to use these `Activity
Monitor` screenshots:

| `CHUNK_BASE: 16` |
|---------------------|
|
![memory_consumption_chunk_base_16_mimalloc](https://github.com/zed-industries/zed/assets/1185253/1e6e05e9-80c2-4ec7-9b0e-8a6fa78836eb)
|

| `CHUNK_BASE: 64` |
|---------------------|
|
![memory_consumption_chunk_base_64_mimalloc](https://github.com/zed-industries/zed/assets/1185253/8a47e982-a675-4db0-b690-d60f1ff9acc8)
|

### Release Notes

Release Notes:

- Reduced memory usage for files by up to 50%.

---------

Co-authored-by: Antonio <antonio@zed.dev>
2024-04-09 18:07:53 +02:00
..
benches Reduce memory usage to represent buffers by up to 50% (#10321) 2024-04-09 18:07:53 +02:00
src Reduce memory usage to represent buffers by up to 50% (#10321) 2024-04-09 18:07:53 +02:00
Cargo.toml Reduce memory usage to represent buffers by up to 50% (#10321) 2024-04-09 18:07:53 +02:00
LICENSE-GPL chore: Change AGPL-licensed crates to GPL (except for collab) (#4231) 2024-01-24 00:26:58 +01:00