Merge pull request #6425 from m3nu/issue/6305/performance-faq

Add discussion and note on file attrs.
TW 2022-03-09 16:05:00 +01:00 committed by GitHub
commit 68d5e33b06


@@ -880,13 +880,30 @@ If you run into that, try this:
What's the expected backup performance?
---------------------------------------

Compared to simply copying files (e.g. with ``rsync``), Borg has more work to do.
This can make creation of the first archive slower, but it saves time
and disk space on subsequent runs. Here is what Borg does when you run
``borg create`` (a simplified sketch in Python follows the list):

- Borg chunks the file (using the relatively expensive buzhash algorithm).
- It then computes the "id" of the chunk (hmac-sha256, which is often slow unless
  your CPU has sha256 acceleration, or blake2b, which is fast in software).
- Then it checks whether this chunk is already in the repo (local hashtable lookup,
  fast). If so, the processing of the chunk is completed here. Otherwise it needs to
  process the chunk:

  - Compresses it (the default lz4 is super fast).
  - Encrypts it (AES, usually fast if your CPU has AES acceleration, which has been
    common for about 10 years).
  - Authenticates ("signs") it using hmac-sha256 or blake2b (see above).
  - Transmits it to the repo. If the repo is remote, this usually involves an SSH
    connection (which does its own encryption / authentication).
  - Stores the chunk into a key/value store (the key is the chunk id, the value
    is the data). While doing that, it computes a CRC32 of the data (repo low-level
    checksum, used by ``borg check --repository``) and also updates the repo index
    (another hashtable).
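
To make the above concrete, here is a hypothetical, heavily simplified sketch of the
per-chunk decision in Python. It uses fixed-size chunks instead of buzhash, a keyed
blake2b id, zlib instead of lz4, and a plain dict as the key/value store, and it leaves
out encryption, repo checksums and the SSH transport entirely, so it only illustrates
the dedup logic, not how Borg is actually implemented::

    import hashlib
    import zlib

    def backup_file(path, repo, key=b"id-key"):
        """Store the chunks of one file into ``repo`` (a dict used as a toy
        key/value store), skipping chunks that are already present."""
        with open(path, "rb") as f:
            while chunk := f.read(2 ** 20):           # fixed 1 MiB chunks (real Borg: buzhash)
                chunk_id = hashlib.blake2b(chunk, key=key).digest()  # keyed chunk id
                if chunk_id in repo:                  # hashtable lookup: dedup hit?
                    continue                          # yes: nothing more to do for this chunk
                repo[chunk_id] = zlib.compress(chunk)  # no: compress (+ encrypt/transmit in Borg) and store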

Subsequent backups are usually very fast if most files are unchanged and only
a few are new or modified. The high performance on unchanged files primarily depends
only on a few factors (like FS recursion + metadata reading performance and the
files cache working as expected) and much less on other factors.

E.g., for this setup:

@@ -904,14 +921,37 @@ few FAQ entries below.

.. _slow_backup:

Why is my backup so slow?
-------------------------

If you feel your Borg backup is too slow, here is what you can do (an example
command combining several of these options follows the list):

- Make sure Borg has enough RAM (depends on how big your repo is / how many
  files you have).
- Use one of the blake2 modes for ``--encryption``, unless you positively know
  your CPU (and OpenSSL) accelerates sha256 (then stay with hmac-sha256).
- Don't use any expensive compression. The default, lz4, is super fast;
  uncompressed is often slower than lz4.
- Just wait. You can also interrupt it and start it again as often as you like;
  it will converge towards a valid "completed" state (see ``--checkpoint-interval``,
  maybe use the default, but in any case don't make it too short). Each run starts
  from the beginning, but it is still faster then, as it does not store data into
  the repo which it already has there from the last checkpoint.
- If you don't need additional file attributes, you can disable them with ``--noflags``,
  ``--noacls``, ``--noxattrs``. This can lead to noticeable performance improvements
  when your backup consists of many small files.
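
As a hypothetical illustration of the options above (repository path, archive name and
source directory are placeholders, and lz4 is spelled out even though it is the default),
a repo using a blake2 mode and a backup run with file attributes disabled could look
like this::

    $ borg init --encryption=repokey-blake2 /path/to/repo
    $ borg create --compression lz4 --noflags --noacls --noxattrs \
          /path/to/repo::{hostname}-{now} ~/data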

If you feel that Borg "freezes" on a file, it could be in the middle of processing a
large file (like ISOs or VM images). Borg < 1.2 announces file names *after* finishing
with the file, so while a big file is being processed, the progress display can keep
showing the name of a previously finished, small file for a long time. With Borg 1.2
this was changed to announcing the filename before starting to process it.

To see which files have changed and take more time to process, you can also add
``--list --filter=AME --stats`` to your ``borg create`` call to produce more log output,
including a file list (with file status characters) and also some statistics at
the end of the backup.
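
For example (repository path, archive name and source directory are again placeholders)::

    $ borg create --list --filter=AME --stats /path/to/repo::{hostname}-{now} ~/data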

Then you do the backup and look at the log output: