Merge pull request #6425 from m3nu/issue/6305/performance-faq

Add discussion and note on file attrs.
TW 2022-03-09 16:05:00 +01:00 committed by GitHub
commit 68d5e33b06


@@ -880,13 +880,30 @@ If you run into that, try this:
What's the expected backup performance?
---------------------------------------

Compared to simply copying files (e.g. with ``rsync``), Borg has more work to do.
This can make creation of the first archive slower, but it saves time
and disk space on subsequent runs. Here is what Borg does when you run
``borg create`` (a simplified sketch in Python follows the list):

- Borg chunks the file (using the relatively expensive buzhash algorithm).
- It then computes the "id" of the chunk (hmac-sha256, which is often slow unless
  your CPU has sha256 acceleration, or blake2b, which is fast in software).
- Then it checks whether this chunk is already in the repo (local hashtable lookup,
  fast). If so, the processing of the chunk is completed here. Otherwise it needs to
  process the chunk:

  - Compresses it (the default lz4 is super fast).
  - Encrypts it (AES, usually fast if your CPU has AES acceleration, which has been
    common for about 10 years).
  - Authenticates ("signs") it using hmac-sha256 or blake2b (see above).
  - Transmits it to the repo. If the repo is remote, this usually involves an SSH
    connection (which does its own encryption / authentication).
  - Stores the chunk into a key/value store (the key is the chunk id, the value
    is the data). While doing that, it computes a CRC32 of the data (repo low-level
    checksum, used by ``borg check --repository``) and also updates the repo index
    (another hashtable).
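
To make the above concrete, here is a hypothetical, heavily simplified sketch of the
per-chunk decision in Python. It uses fixed-size chunks instead of buzhash, a keyed
blake2b id, zlib instead of lz4, and a plain dict as the key/value store, and it leaves
out encryption, repo checksums and the SSH transport entirely, so it only illustrates
the dedup logic, not how Borg is actually implemented::

    import hashlib
    import zlib

    def backup_file(path, repo, key=b"id-key"):
        """Store the chunks of one file into ``repo`` (a dict used as a toy
        key/value store), skipping chunks that are already present."""
        with open(path, "rb") as f:
            while chunk := f.read(2 ** 20):           # fixed 1 MiB chunks (real Borg: buzhash)
                chunk_id = hashlib.blake2b(chunk, key=key).digest()  # keyed chunk id
                if chunk_id in repo:                  # hashtable lookup: dedup hit?
                    continue                          # yes: nothing more to do for this chunk
                repo[chunk_id] = zlib.compress(chunk)  # no: compress (+ encrypt/transmit in Borg) and store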

Subsequent backups are usually very fast if most files are unchanged and only
a few are new or modified. The high performance on unchanged files primarily depends
only on a few factors (like FS recursion + metadata reading performance and the
files cache working as expected) and much less on other factors.

E.g., for this setup:

@@ -904,14 +921,37 @@ few FAQ entries below.

.. _slow_backup:

Why is my backup so slow?
-------------------------

If you feel your Borg backup is too slow, here is what you can do (an example
command combining several of these options follows the list):

- Make sure Borg has enough RAM (depends on how big your repo is / how many
  files you have).
- Use one of the blake2 modes for ``--encryption``, unless you positively know
  your CPU (and OpenSSL) accelerates sha256 (then stay with hmac-sha256).
- Don't use any expensive compression. The default, lz4, is super fast;
  uncompressed is often slower than lz4.
- Just wait. You can also interrupt it and start it again as often as you like;
  it will converge towards a valid "completed" state (see ``--checkpoint-interval``,
  maybe use the default, but in any case don't make it too short). Each run starts
  from the beginning, but it is still faster then, as it does not store data into
  the repo which it already has there from the last checkpoint.
- If you don't need additional file attributes, you can disable them with ``--noflags``,
  ``--noacls``, ``--noxattrs``. This can lead to noticeable performance improvements
  when your backup consists of many small files.
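
As a hypothetical illustration of the options above (repository path, archive name and
source directory are placeholders, and lz4 is spelled out even though it is the default),
a repo using a blake2 mode and a backup run with file attributes disabled could look
like this::

    $ borg init --encryption=repokey-blake2 /path/to/repo
    $ borg create --compression lz4 --noflags --noacls --noxattrs \
          /path/to/repo::{hostname}-{now} ~/data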

If you feel that Borg "freezes" on a file, it could be in the middle of processing a
large file (like ISOs or VM images). Borg < 1.2 announces file names *after* finishing
with the file, so while a big file is being processed, the progress display can keep
showing the name of a previously finished, small file for a long time. With Borg 1.2
this was changed to announcing the filename before starting to process it.

To see which files have changed and take more time to process, you can also add
``--list --filter=AME --stats`` to your ``borg create`` call to produce more log output,
including a file list (with file status characters) and also some statistics at
the end of the backup.
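
For example (repository path, archive name and source directory are again placeholders)::

    $ borg create --list --filter=AME --stats /path/to/repo::{hostname}-{now} ~/data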

Then you do the backup and look at the log output: