files cache improvements, fixes #8385, fixes #5658

- changes to locally stored files cache:

  - store as files.<H(archive_name)>
  - user can manually control the suffix via the BORG_FILES_CACHE_SUFFIX env var
  - if the local files cache is not found, build it from the previous archive.
- enable rebuilding the files cache by loading the previous
  archive's metadata from the repo (better than starting with an
  empty files cache and having to read/chunk/hash all files).
  previous archive == same archive name, latest timestamp in repo.
- remove AdHocCache (no longer needed, slow)
- remove BORG_CACHE_IMPL, we only have one implementation now
- remove the cache lock (it was blocking parallel backups to the
  same repo from the same machine/user).

Cache entries now have ctime AND mtime.

Note: TTL and age are still needed for discarding removed files.
      But since there is now a separate files cache per series, the
      TTL was lowered to 2 (from 20).
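
The per-series cache file name is derived roughly like this (a condensed
sketch of the new files_cache_name() shown in the diff below; xxh64 and
bin_to_hex are borg's own helpers):

    import os

    from borg.checksums import xxh64
    from borg.helpers import bin_to_hex

    FILES_CACHE_NAME = "files"

    def files_cache_name(archive_name):
        # BORG_FILES_CACHE_SUFFIX overrides; otherwise, hash the archive
        # (series) name to get a short, filename-safe suffix.
        suffix = os.environ.get("BORG_FILES_CACHE_SUFFIX", "")
        if not suffix:
            suffix = bin_to_hex(xxh64(archive_name.encode()))
        return FILES_CACHE_NAME + "." + suffix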
Thomas Waldmann 2024-09-17 23:25:23 +02:00
parent 385eeeb4d5
commit a891559578
14 changed files with 180 additions and 270 deletions


@@ -837,50 +837,29 @@ already used.
 By default, ctime (change time) is used for the timestamps to have a rather
 safe change detection (see also the --files-cache option).
 
-Furthermore, pathnames recorded in files cache are always absolute, even if you
-specify source directories with relative pathname. If relative pathnames are
-stable, but absolute are not (for example if you mount a filesystem without
-stable mount points for each backup or if you are running the backup from a
-filesystem snapshot whose name is not stable), borg will assume that files are
-different and will report them as 'added', even though no new chunks will be
-actually recorded for them. To avoid this, you could bind mount your source
-directory in a directory with the stable path.
+Furthermore, pathnames used as key into the files cache are **as archived**,
+so make sure these are always the same (see ``borg list``).
 
 .. _always_chunking:
 
 It always chunks all my files, even unchanged ones!
 ---------------------------------------------------
 
-Borg maintains a files cache where it remembers the timestamp, size and
+Borg maintains a files cache where it remembers the timestamps, size and
 inode of files. When Borg does a new backup and starts processing a
 file, it first looks whether the file has changed (compared to the values
 stored in the files cache). If the values are the same, the file is assumed
 unchanged and thus its contents won't get chunked (again).
 
-Borg can't keep an infinite history of files of course, thus entries
-in the files cache have a "maximum time to live" which is set via the
-environment variable BORG_FILES_CACHE_TTL (and defaults to 20).
-Every time you do a backup (on the same machine, using the same user), the
-cache entries' ttl values of files that were not "seen" are incremented by 1
-and if they reach BORG_FILES_CACHE_TTL, the entry is removed from the cache.
-So, for example, if you do daily backups of 26 different data sets A, B,
-C, ..., Z on one machine (using the default TTL), the files from A will be
-already forgotten when you repeat the same backups on the next day and it
-will be slow because it would chunk all the files each time. If you set
-BORG_FILES_CACHE_TTL to at least 26 (or maybe even a small multiple of that),
-it would be much faster.
-
-Besides using a higher BORG_FILES_CACHE_TTL (which also increases memory usage),
-there is also BORG_FILES_CACHE_SUFFIX which can be used to have separate (smaller)
-files caches for each backup set instead of the default one (big) unified files cache.
-
-Another possible reason is that files don't always have the same path, for
-example if you mount a filesystem without stable mount points for each backup
-or if you are running the backup from a filesystem snapshot whose name is not
-stable. If the directory where you mount a filesystem is different every time,
-Borg assumes they are different files. This is true even if you back up these
-files with relative pathnames - borg uses full pathnames in files cache regardless.
+The files cache is stored separately (using a different filename suffix) per
+archive series, thus using always the same name for the archive is strongly
+recommended. The "rebuild files cache from previous archive in repo" feature
+also depends on that.
+
+Alternatively, there is also BORG_FILES_CACHE_SUFFIX which can be used to
+manually set a custom suffix (if you can't just use the same archive name).
+
+Another possible reason is that files don't always have the same path -
+borg uses the paths as seen in the archive when using ``borg list``.
 
 It is possible for some filesystems, such as ``mergerfs`` or network filesystems,
 to return inconsistent inode numbers across runs, causing borg to consider them changed.
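
The change detection described above boils down to roughly this
(simplified sketch; CachedStat stands in for borg's FileCacheEntry and
holds plain nanosecond integers, while the real cache stores msgpack
timestamps and compares them via timestamp_to_int):

    from collections import namedtuple

    CachedStat = namedtuple("CachedStat", "size inode ctime_ns mtime_ns")

    def seems_unchanged(entry, st, cache_mode):
        """entry: CachedStat from the files cache, st: os.stat_result from disk."""
        if entry.size != st.st_size:
            return False  # size differs -> changed
        if "i" in cache_mode and entry.inode != st.st_ino:
            return False  # inode differs (or the fs has unstable inodes)
        if "c" in cache_mode and entry.ctime_ns != st.st_ctime_ns:
            return False  # ctime-based change detection
        if "m" in cache_mode and entry.mtime_ns != st.st_mtime_ns:
            return False  # mtime-based change detection
        return True  # assumed unchanged -> reuse the cached chunks list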


@@ -474,18 +474,20 @@ guess what files you have based on a specific set of chunk sizes).
 The cache
 ---------
 
-The **files cache** is stored in ``cache/files`` and is used at backup time to
-quickly determine whether a given file is unchanged and we have all its chunks.
+The **files cache** is stored in ``cache/files.<SUFFIX>`` and is used at backup
+time to quickly determine whether a given file is unchanged and we have all its
+chunks.
 
 In memory, the files cache is a key -> value mapping (a Python *dict*) and contains:
 
-* key: id_hash of the encoded, absolute file path
+* key: id_hash of the encoded path (same path as seen in archive)
 * value:
 
+  - age (0 [newest], ..., BORG_FILES_CACHE_TTL - 1)
   - file inode number
   - file size
-  - file ctime_ns (or mtime_ns)
-  - age (0 [newest], 1, 2, 3, ..., BORG_FILES_CACHE_TTL - 1)
+  - file ctime_ns
+  - file mtime_ns
   - list of chunk (id, size) tuples representing the file's contents
 
 To determine whether a file has not changed, cached values are looked up via
@@ -514,7 +516,7 @@ be told to ignore the inode number in the check via --files-cache.
 The age value is used for cache management. If a file is "seen" in a backup
 run, its age is reset to 0, otherwise its age is incremented by one.
 If a file was not seen in BORG_FILES_CACHE_TTL backups, its cache entry is
-removed. See also: :ref:`always_chunking` and :ref:`a_status_oddity`
+removed.
 
 The files cache is a python dictionary, storing python objects, which
 generates a lot of overhead.
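
Concretely, each value is a FileCacheEntry namedtuple (borg/cache.py), kept
msgpack-packed to reduce that per-object overhead (sketch; timestamps are
shown as plain integers here, the real entries hold msgpack Timestamp objects):

    from collections import namedtuple

    import msgpack  # borg wraps msgpack; the plain library works for this sketch

    FileCacheEntry = namedtuple("FileCacheEntry", "age inode size ctime mtime chunks")

    entry = FileCacheEntry(
        age=0,  # 0 = seen in the most recent backup
        inode=1234567,
        size=4096,
        ctime=1_726_608_323_000_000_000,  # ns
        mtime=1_726_608_323_000_000_000,  # ns
        chunks=[(b"\x00" * 32, 4096)],  # (chunk id, size) tuples
    )
    packed = msgpack.packb(entry)  # about 240 bytes per file in practice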


@@ -66,8 +66,7 @@ General:
         cache entries for backup sources other than the current sources.
     BORG_FILES_CACHE_TTL
         When set to a numeric value, this determines the maximum "time to live" for the files cache
-        entries (default: 20). The files cache is used to determine quickly whether a file is unchanged.
-        The FAQ explains this more detailed in: :ref:`always_chunking`
+        entries (default: 2). The files cache is used to determine quickly whether a file is unchanged.
     BORG_USE_CHUNKS_ARCHIVE
         When set to no (default: yes), the ``chunks.archive.d`` folder will not be used. This reduces
         disk space usage but slows down cache resyncs.
@@ -85,15 +84,6 @@ General:
         - ``pyfuse3``: only try to load pyfuse3
         - ``llfuse``: only try to load llfuse
         - ``none``: do not try to load an implementation
-    BORG_CACHE_IMPL
-        Choose the implementation for the clientside cache, choose one of:
-
-        - ``adhoc``: builds a non-persistent chunks cache by querying the repo. Chunks cache contents
-          are somewhat sloppy for already existing chunks, concerning their refcount ("infinite") and
-          size (0). No files cache (slow, will chunk all input files). DEPRECATED.
-        - ``adhocwithfiles``: Like ``adhoc``, but with a persistent files cache. Default implementation.
-        - ``cli``: Determine the cache implementation from cli options. Without special options, will
-          usually end up with the ``local`` implementation.
     BORG_SELFTEST
         This can be used to influence borg's builtin self-tests. The default is to execute the tests
         at the beginning of each borg command invocation.
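
For reference, borg picks up the files cache settings from the environment
(as in _write_files_cache() and files_cache_name() in this commit):

    import os

    # files cache entries not "seen" for this many consecutive backups get dropped
    ttl = int(os.environ.get("BORG_FILES_CACHE_TTL", 2))
    # optional manual override for the files cache filename suffix
    suffix = os.environ.get("BORG_FILES_CACHE_SUFFIX", "")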


@@ -1345,7 +1345,7 @@ def process_file(self, *, path, parent_fd, name, st, cache, flags=flags_normal,
                     item.chunks.append(chunk_entry)
             else:  # normal case, no "2nd+" hardlink
                 if not is_special_file:
-                    hashed_path = safe_encode(os.path.join(self.cwd, path))
+                    hashed_path = safe_encode(item.path)  # path as in archive item!
                     started_hashing = time.monotonic()
                     path_hash = self.key.id_hash(hashed_path)
                     self.stats.hashing_time += time.monotonic() - started_hashing


@@ -161,13 +161,12 @@ def wrapper(self, args, **kwargs):
             if "compression" in args:
                 manifest_.repo_objs.compressor = args.compression.compressor
             if secure:
-                assert_secure(repository, manifest_, self.lock_wait)
+                assert_secure(repository, manifest_)
             if cache:
                 with Cache(
                     repository,
                     manifest_,
                     progress=getattr(args, "progress", False),
-                    lock_wait=self.lock_wait,
                     cache_mode=getattr(args, "files_cache_mode", FILES_CACHE_MODE_DISABLED),
                     iec=getattr(args, "iec", False),
                 ) as cache_:
@@ -230,7 +229,7 @@ def wrapper(self, args, **kwargs):
             manifest_ = Manifest.load(
                 repository, compatibility, ro_cls=RepoObj if repository.version > 1 else RepoObj1
             )
-            assert_secure(repository, manifest_, self.lock_wait)
+            assert_secure(repository, manifest_)
             if manifest:
                 kwargs["other_manifest"] = manifest_
             if cache:
@@ -238,7 +237,6 @@ def wrapper(self, args, **kwargs):
                     repository,
                     manifest_,
                     progress=False,
-                    lock_wait=self.lock_wait,
                     cache_mode=getattr(args, "files_cache_mode", FILES_CACHE_MODE_DISABLED),
                     iec=getattr(args, "iec", False),
                 ) as cache_:


@@ -222,10 +222,9 @@ def create_inner(archive, cache, fso):
                 repository,
                 manifest,
                 progress=args.progress,
-                lock_wait=self.lock_wait,
-                prefer_adhoc_cache=args.prefer_adhoc_cache,
                 cache_mode=args.files_cache_mode,
                 iec=args.iec,
+                archive_name=args.name,
             ) as cache:
                 archive = Archive(
                     manifest,
@@ -787,12 +786,6 @@ def build_parser_create(self, subparsers, common_parser, mid_common_parser):
             help="only display items with the given status characters (see description)",
         )
         subparser.add_argument("--json", action="store_true", help="output stats as JSON. Implies ``--stats``.")
-        subparser.add_argument(
-            "--prefer-adhoc-cache",
-            dest="prefer_adhoc_cache",
-            action="store_true",
-            help="experimental: prefer AdHocCache (w/o files cache) over AdHocWithFilesCache (with files cache).",
-        )
         subparser.add_argument(
             "--stdin-name",
             metavar="NAME",


@@ -37,7 +37,7 @@ def _list_inner(cache):
 
         # Only load the cache if it will be used
         if ItemFormatter.format_needs_cache(format):
-            with Cache(repository, manifest, lock_wait=self.lock_wait) as cache:
+            with Cache(repository, manifest) as cache:
                 _list_inner(cache)
         else:
             _list_inner(cache=None)


@@ -111,7 +111,7 @@ def do_prune(self, args, repository, manifest):
             keep += prune_split(archives, rule, num, kept_because)
 
         to_delete = set(archives) - set(keep)
-        with Cache(repository, manifest, lock_wait=self.lock_wait, iec=args.iec) as cache:
+        with Cache(repository, manifest, iec=args.iec) as cache:
             list_logger = logging.getLogger("borg.output.list")
             # set up counters for the progress display
             to_delete_len = len(to_delete)


@@ -12,11 +12,12 @@
 files_cache_logger = create_logger("borg.debug.files_cache")
 
 from .constants import CACHE_README, FILES_CACHE_MODE_DISABLED, ROBJ_FILE_STREAM
+from .checksums import xxh64
 from .hashindex import ChunkIndex, ChunkIndexEntry
 from .helpers import Error
 from .helpers import get_cache_dir, get_security_dir
-from .helpers import hex_to_bin, parse_stringified_list
-from .helpers import format_file_size
+from .helpers import hex_to_bin, bin_to_hex, parse_stringified_list
+from .helpers import format_file_size, safe_encode
 from .helpers import safe_ns
 from .helpers import yes
 from .helpers import ProgressIndicatorMessage
@@ -25,14 +26,13 @@
 from .item import ChunkListEntry
 from .crypto.key import PlaintextKey
 from .crypto.file_integrity import IntegrityCheckedFile, FileIntegrityError
-from .fslocking import Lock
 from .manifest import Manifest
 from .platform import SaveFile
 from .remote import RemoteRepository
 from .repository import LIST_SCAN_LIMIT, Repository
 
-# note: cmtime might be either a ctime or a mtime timestamp, chunks is a list of ChunkListEntry
-FileCacheEntry = namedtuple("FileCacheEntry", "age inode size cmtime chunks")
+# chunks is a list of ChunkListEntry
+FileCacheEntry = namedtuple("FileCacheEntry", "age inode size ctime mtime chunks")
 
 
 class SecurityManager:
@@ -154,7 +154,7 @@ def assert_key_type(self, key):
         if self.known() and not self.key_matches(key):
             raise Cache.EncryptionMethodMismatch()
 
-    def assert_secure(self, manifest, key, *, warn_if_unencrypted=True, lock_wait=None):
+    def assert_secure(self, manifest, key, *, warn_if_unencrypted=True):
         # warn_if_unencrypted=False is only used for initializing a new repository.
         # Thus, avoiding asking about a repository that's currently initializing.
         self.assert_access_unknown(warn_if_unencrypted, manifest, key)
@@ -194,9 +194,9 @@ def assert_access_unknown(self, warn_if_unencrypted, manifest, key):
             raise Cache.CacheInitAbortedError()
 
 
-def assert_secure(repository, manifest, lock_wait):
+def assert_secure(repository, manifest):
     sm = SecurityManager(repository)
-    sm.assert_secure(manifest, manifest.key, lock_wait=lock_wait)
+    sm.assert_secure(manifest, manifest.key)
 
 
 def cache_dir(repository, path=None):
@@ -204,13 +204,11 @@ def cache_dir(repository, path=None):
 
 
 class CacheConfig:
-    def __init__(self, repository, path=None, lock_wait=None):
+    def __init__(self, repository, path=None):
         self.repository = repository
         self.path = cache_dir(repository, path)
         logger.debug("Using %s as cache", self.path)
         self.config_path = os.path.join(self.path, "config")
-        self.lock = None
-        self.lock_wait = lock_wait
 
     def __enter__(self):
         self.open()
@@ -235,7 +233,6 @@ def create(self):
             config.write(fd)
 
     def open(self):
-        self.lock = Lock(os.path.join(self.path, "lock"), exclusive=True, timeout=self.lock_wait).acquire()
         self.load()
 
     def load(self):
@@ -278,9 +275,7 @@ def save(self, manifest=None):
                 self._config.write(fd)
 
     def close(self):
-        if self.lock is not None:
-            self.lock.release()
-            self.lock = None
+        pass
 
     def _check_upgrade(self, config_path):
         try:
@@ -296,10 +291,6 @@ def _check_upgrade(self, config_path):
             raise Exception("%s does not look like a Borg cache." % config_path) from None
 
 
-def get_cache_impl():
-    return os.environ.get("BORG_CACHE_IMPL", "adhocwithfiles")
-
-
 class Cache:
     """Client Side cache"""
 
@@ -330,8 +321,7 @@ class RepositoryReplay(Error):
     @staticmethod
     def break_lock(repository, path=None):
-        path = cache_dir(repository, path)
-        Lock(os.path.join(path, "lock"), exclusive=True).break_lock()
+        pass
 
     @staticmethod
     def destroy(repository, path=None):
@@ -350,71 +340,117 @@ def __new__(
         sync=True,
         warn_if_unencrypted=True,
         progress=False,
-        lock_wait=None,
-        prefer_adhoc_cache=False,
         cache_mode=FILES_CACHE_MODE_DISABLED,
         iec=False,
+        archive_name=None,
     ):
-        def adhocwithfiles():
-            return AdHocWithFilesCache(
-                manifest=manifest,
-                path=path,
-                warn_if_unencrypted=warn_if_unencrypted,
-                progress=progress,
-                iec=iec,
-                lock_wait=lock_wait,
-                cache_mode=cache_mode,
-            )
-
-        def adhoc():
-            return AdHocCache(manifest=manifest, lock_wait=lock_wait, iec=iec)
-
-        impl = get_cache_impl()
-        if impl != "cli":
-            methods = dict(adhocwithfiles=adhocwithfiles, adhoc=adhoc)
-            try:
-                method = methods[impl]
-            except KeyError:
-                raise RuntimeError("Unknown BORG_CACHE_IMPL value: %s" % impl)
-            return method()
-
-        return adhoc() if prefer_adhoc_cache else adhocwithfiles()
+        return AdHocWithFilesCache(
+            manifest=manifest,
+            path=path,
+            warn_if_unencrypted=warn_if_unencrypted,
+            progress=progress,
+            iec=iec,
+            cache_mode=cache_mode,
+            archive_name=archive_name,
+        )
 
 
 class FilesCacheMixin:
     """
-    Massively accelerate processing of unchanged files by caching their chunks list.
-    With that, we can avoid having to read and chunk them to get their chunks list.
+    Massively accelerate processing of unchanged files.
+    We read the "files cache" (either from cache directory or from previous archive
+    in repo) that has metadata for all "already stored" files, like size, ctime/mtime,
+    inode number and chunks id/size list.
+    When finding a file on disk, we use the metadata to determine if the file is unchanged.
+    If so, we use the cached chunks list and skip reading/chunking the file contents.
    """
 
     FILES_CACHE_NAME = "files"
 
-    def __init__(self, cache_mode):
+    def __init__(self, cache_mode, archive_name=None):
+        self.archive_name = archive_name  # ideally a SERIES name
+        assert not ("c" in cache_mode and "m" in cache_mode)
+        assert "d" in cache_mode or "c" in cache_mode or "m" in cache_mode
         self.cache_mode = cache_mode
         self._files = None
-        self._newest_cmtime = None
+        self._newest_cmtime = 0
+        self._newest_path_hashes = set()
 
     @property
     def files(self):
         if self._files is None:
-            self._files = self._read_files_cache()
+            self._files = self._read_files_cache()  # try loading from cache dir
+        if self._files is None:
+            self._files = self._build_files_cache()  # try loading from repository
+        if self._files is None:
+            self._files = {}  # start from scratch
         return self._files
 
+    def _build_files_cache(self):
+        """rebuild the files cache by reading previous archive from repository"""
+        if "d" in self.cache_mode:  # d(isabled)
+            return
+        if not self.archive_name:
+            return
+        from .archive import Archive
+
+        # get the latest archive with the IDENTICAL name, supporting archive series:
+        archives = self.manifest.archives.list(match=self.archive_name, sort_by=["ts"], last=1)
+        if not archives:
+            # nothing found
+            return
+        prev_archive = archives[0]
+        files = {}
+        logger.debug(
+            f"Building files cache from {prev_archive.name} {prev_archive.ts} {bin_to_hex(prev_archive.id)} ..."
+        )
+        files_cache_logger.debug("FILES-CACHE-BUILD: starting...")
+        archive = Archive(self.manifest, prev_archive.id)
+        for item in archive.iter_items(preload=False):
+            # only put regular files' infos into the files cache:
+            if stat.S_ISREG(item.mode):
+                path_hash = self.key.id_hash(safe_encode(item.path))
+                # keep track of the key(s) for the most recent timestamp(s):
+                ctime_ns = item.ctime
+                if ctime_ns > self._newest_cmtime:
+                    self._newest_cmtime = ctime_ns
+                    self._newest_path_hashes = {path_hash}
+                elif ctime_ns == self._newest_cmtime:
+                    self._newest_path_hashes.add(path_hash)
+                mtime_ns = item.mtime
+                if mtime_ns > self._newest_cmtime:
+                    self._newest_cmtime = mtime_ns
+                    self._newest_path_hashes = {path_hash}
+                elif mtime_ns == self._newest_cmtime:
+                    self._newest_path_hashes.add(path_hash)
+                # add the file to the in-memory files cache (age=0: just "seen")
+                entry = FileCacheEntry(
+                    0, item.get("inode", 0), item.size, int_to_timestamp(ctime_ns), int_to_timestamp(mtime_ns), item.chunks
+                )
+                files[path_hash] = msgpack.packb(entry)  # takes about 240 Bytes per file
+        # deal with special snapshot / timestamp granularity case, see FAQ:
+        for path_hash in self._newest_path_hashes:
+            del files[path_hash]
+        files_cache_logger.debug("FILES-CACHE-BUILD: finished, %d entries loaded.", len(files))
+        return files
+
     def files_cache_name(self):
         suffix = os.environ.get("BORG_FILES_CACHE_SUFFIX", "")
-        return self.FILES_CACHE_NAME + "." + suffix if suffix else self.FILES_CACHE_NAME
+        # when using archive series, we automatically make up a separate cache file per series.
+        # when not, the user may manually do that by using the env var.
+        if not suffix:
+            # avoid issues with too complex or long archive_name by hashing it:
+            suffix = bin_to_hex(xxh64(self.archive_name.encode()))
+        return self.FILES_CACHE_NAME + "." + suffix
 
-    def discover_files_cache_name(self, path):
-        return [
-            fn for fn in os.listdir(path) if fn == self.FILES_CACHE_NAME or fn.startswith(self.FILES_CACHE_NAME + ".")
-        ][0]
-
-    def _create_empty_files_cache(self, path):
-        with IntegrityCheckedFile(path=os.path.join(path, self.files_cache_name()), write=True) as fd:
-            pass  # empty file
-        return fd.integrity_data
+    def discover_files_cache_names(self, path):
+        return [fn for fn in os.listdir(path) if fn.startswith(self.FILES_CACHE_NAME + ".")]
 
     def _read_files_cache(self):
+        """read files cache from cache directory"""
         if "d" in self.cache_mode:  # d(isabled)
             return
@@ -447,17 +483,17 @@ def _read_files_cache(self):
         except FileIntegrityError as fie:
             msg = "The files cache is corrupted. [%s]" % str(fie)
         if msg is not None:
-            logger.warning(msg)
-            logger.warning("Continuing without files cache - expect lower performance.")
-            files = {}
-        files_cache_logger.debug("FILES-CACHE-LOAD: finished, %d entries loaded.", len(files))
+            logger.debug(msg)
+            files = None
+        files_cache_logger.debug("FILES-CACHE-LOAD: finished, %d entries loaded.", len(files or {}))
         return files
 
     def _write_files_cache(self, files):
+        """write files cache to cache directory"""
         if self._newest_cmtime is None:
             # was never set because no files were modified/added
             self._newest_cmtime = 2**63 - 1  # nanoseconds, good until y2262
-        ttl = int(os.environ.get("BORG_FILES_CACHE_TTL", 20))
+        ttl = int(os.environ.get("BORG_FILES_CACHE_TTL", 2))
         files_cache_logger.debug("FILES-CACHE-SAVE: starting...")
         # TODO: use something like SaveFile here, but that didn't work due to SyncFile missing .seek().
         with IntegrityCheckedFile(path=os.path.join(self.path, self.files_cache_name()), write=True) as fd:
@@ -469,7 +505,7 @@ def _write_files_cache(self, files):
                 entry = FileCacheEntry(*msgpack.unpackb(item))
                 if (
                     entry.age == 0
-                    and timestamp_to_int(entry.cmtime) < self._newest_cmtime
+                    and max(timestamp_to_int(entry.ctime), timestamp_to_int(entry.mtime)) < self._newest_cmtime
                     or entry.age > 0
                     and entry.age < ttl
                 ):
@@ -517,10 +553,10 @@ def file_known_and_unchanged(self, hashed_path, path_hash, st):
         if "i" in cache_mode and entry.inode != st.st_ino:
             files_cache_logger.debug("KNOWN-CHANGED: file inode number has changed: %r", hashed_path)
             return True, None
-        if "c" in cache_mode and timestamp_to_int(entry.cmtime) != st.st_ctime_ns:
+        if "c" in cache_mode and timestamp_to_int(entry.ctime) != st.st_ctime_ns:
             files_cache_logger.debug("KNOWN-CHANGED: file ctime has changed: %r", hashed_path)
             return True, None
-        elif "m" in cache_mode and timestamp_to_int(entry.cmtime) != st.st_mtime_ns:
+        if "m" in cache_mode and timestamp_to_int(entry.mtime) != st.st_mtime_ns:
             files_cache_logger.debug("KNOWN-CHANGED: file mtime has changed: %r", hashed_path)
             return True, None
         # we ignored the inode number in the comparison above or it is still same.
@@ -538,30 +574,25 @@ def file_known_and_unchanged(self, hashed_path, path_hash, st):
     def memorize_file(self, hashed_path, path_hash, st, chunks):
         if not stat.S_ISREG(st.st_mode):
             return
-        cache_mode = self.cache_mode
         # note: r(echunk) modes will update the files cache, d(isabled) mode won't
-        if "d" in cache_mode:
+        if "d" in self.cache_mode:
             files_cache_logger.debug("FILES-CACHE-NOUPDATE: files cache disabled")
             return
-        if "c" in cache_mode:
-            cmtime_type = "ctime"
-            cmtime_ns = safe_ns(st.st_ctime_ns)
-        elif "m" in cache_mode:
-            cmtime_type = "mtime"
-            cmtime_ns = safe_ns(st.st_mtime_ns)
-        else:  # neither 'c' nor 'm' in cache_mode, avoid UnboundLocalError
-            cmtime_type = "ctime"
-            cmtime_ns = safe_ns(st.st_ctime_ns)
+        ctime_ns = safe_ns(st.st_ctime_ns)
+        mtime_ns = safe_ns(st.st_mtime_ns)
         entry = FileCacheEntry(
-            age=0, inode=st.st_ino, size=st.st_size, cmtime=int_to_timestamp(cmtime_ns), chunks=chunks
+            age=0,
+            inode=st.st_ino,
+            size=st.st_size,
+            ctime=int_to_timestamp(ctime_ns),
+            mtime=int_to_timestamp(mtime_ns),
+            chunks=chunks,
         )
         self.files[path_hash] = msgpack.packb(entry)
-        self._newest_cmtime = max(self._newest_cmtime or 0, cmtime_ns)
+        self._newest_cmtime = max(self._newest_cmtime or 0, ctime_ns)
+        self._newest_cmtime = max(self._newest_cmtime or 0, mtime_ns)
         files_cache_logger.debug(
-            "FILES-CACHE-UPDATE: put %r [has %s] <- %r",
-            entry._replace(chunks="[%d entries]" % len(entry.chunks)),
-            cmtime_type,
-            hashed_path,
+            "FILES-CACHE-UPDATE: put %r <- %r", entry._replace(chunks="[%d entries]" % len(entry.chunks)), hashed_path
         )
@@ -615,7 +646,7 @@ def seen_chunk(self, id, size=None):
         if entry.refcount and size is not None:
             assert isinstance(entry.size, int)
             if not entry.size:
-                # AdHocWithFilesCache / AdHocCache:
+                # AdHocWithFilesCache:
                 # Here *size* is used to update the chunk's size information, which will be zero for existing chunks.
                 self.chunks[id] = entry._replace(size=size)
         return entry.refcount != 0
@@ -659,7 +690,13 @@ def add_chunk(
 
 class AdHocWithFilesCache(FilesCacheMixin, ChunksMixin):
     """
-    Like AdHocCache, but with a files cache.
+    An ad-hoc chunks and files cache.
+
+    Chunks: it does not maintain accurate reference count.
+    Chunks that were not added during the current lifetime won't have correct size set (0 bytes)
+    and will have an infinite reference count (MAX_VALUE).
+
+    Files: if a previous_archive_id is given, ad-hoc build an in-memory files cache from that archive.
     """
 
     def __init__(
@@ -668,16 +705,15 @@ def __init__(
         path=None,
         warn_if_unencrypted=True,
         progress=False,
-        lock_wait=None,
         cache_mode=FILES_CACHE_MODE_DISABLED,
         iec=False,
+        archive_name=None,
     ):
         """
         :param warn_if_unencrypted: print warning if accessing unknown unencrypted repository
-        :param lock_wait: timeout for lock acquisition (int [s] or None [wait forever])
         :param cache_mode: what shall be compared in the file stat infos vs. cached stat infos comparison
         """
-        FilesCacheMixin.__init__(self, cache_mode)
+        FilesCacheMixin.__init__(self, cache_mode, archive_name)
         ChunksMixin.__init__(self)
         assert isinstance(manifest, Manifest)
         self.manifest = manifest
@@ -688,7 +724,7 @@ def __init__(
         self.path = cache_dir(self.repository, path)
         self.security_manager = SecurityManager(self.repository)
-        self.cache_config = CacheConfig(self.repository, self.path, lock_wait)
+        self.cache_config = CacheConfig(self.repository, self.path)
 
         # Warn user before sending data to a never seen before unencrypted repository
         if not os.path.exists(self.path):
@@ -721,7 +757,6 @@ def create(self):
         with open(os.path.join(self.path, "README"), "w") as fd:
             fd.write(CACHE_README)
         self.cache_config.create()
-        self._create_empty_files_cache(self.path)
 
     def open(self):
         if not os.path.isdir(self.path):
@@ -757,7 +792,6 @@ def check_cache_compatibility(self):
     def wipe_cache(self):
         logger.warning("Discarding incompatible cache and forcing a cache rebuild")
         self._chunks = ChunkIndex()
-        self._create_empty_files_cache(self.path)
         self.cache_config.manifest_id = ""
         self.cache_config._config.set("cache", "manifest", "")
@@ -773,46 +807,3 @@ def update_compatibility(self):
         self.cache_config.ignored_features.update(repo_features - my_features)
         self.cache_config.mandatory_features.update(repo_features & my_features)
-
-
-class AdHocCache(ChunksMixin):
-    """
-    Ad-hoc, non-persistent cache.
-
-    The AdHocCache does not maintain accurate reference count, nor does it provide a files cache
-    (which would require persistence).
-
-    Chunks that were not added during the current AdHocCache lifetime won't have correct size set
-    (0 bytes) and will have an infinite reference count (MAX_VALUE).
-    """
-
-    def __init__(self, manifest, warn_if_unencrypted=True, lock_wait=None, iec=False):
-        ChunksMixin.__init__(self)
-        assert isinstance(manifest, Manifest)
-        self.manifest = manifest
-        self.repository = manifest.repository
-        self.key = manifest.key
-        self.repo_objs = manifest.repo_objs
-        self.security_manager = SecurityManager(self.repository)
-        self.security_manager.assert_secure(manifest, self.key, lock_wait=lock_wait)
-
-    # Public API
-
-    def __enter__(self):
-        self._chunks = None
-        return self
-
-    def __exit__(self, exc_type, exc_val, exc_tb):
-        if exc_type is None:
-            self.security_manager.save(self.manifest, self.key)
-        self._chunks = None
-
-    files = None  # type: ignore
-    cache_mode = "d"
-
-    def file_known_and_unchanged(self, hashed_path, path_hash, st):
-        files_cache_logger.debug("UNKNOWN: files cache not implemented")
-        return False, None
-
-    def memorize_file(self, hashed_path, path_hash, st, chunks):
-        pass


@@ -1189,7 +1189,7 @@ def default(self, o):
         from ..legacyremote import LegacyRemoteRepository
         from ..remote import RemoteRepository
         from ..archive import Archive
-        from ..cache import AdHocCache, AdHocWithFilesCache
+        from ..cache import AdHocWithFilesCache
 
         if isinstance(o, (LegacyRepository, LegacyRemoteRepository)) or isinstance(o, (Repository, RemoteRepository)):
             return {"id": bin_to_hex(o.id), "location": o._location.canonical_path()}
@@ -1197,8 +1197,6 @@ def default(self, o):
             return o.info()
         if isinstance(o, (AdHocWithFilesCache,)):
             return {"path": o.path}
-        if isinstance(o, AdHocCache):
-            return {}
         if callable(getattr(o, "to_json", None)):
             return o.to_json()
         return super().default(o)


@@ -1,4 +1,3 @@
-import io
 import json
 import os
 from configparser import ConfigParser
@@ -16,26 +15,6 @@ def corrupt_archiver(archiver):
     archiver.cache_path = json.loads(cmd(archiver, "repo-info", "--json"))["cache"].get("path")
 
 
-def corrupt(file, amount=1):
-    with open(file, "r+b") as fd:
-        fd.seek(-amount, io.SEEK_END)
-        corrupted = bytes(255 - c for c in fd.read(amount))
-        fd.seek(-amount, io.SEEK_END)
-        fd.write(corrupted)
-
-
-def test_cache_files(archiver):
-    corrupt_archiver(archiver)
-    if archiver.cache_path is None:
-        pytest.skip("no cache path for this kind of Cache implementation")
-    cmd(archiver, "create", "test", "input")
-    corrupt(os.path.join(archiver.cache_path, "files"))
-    out = cmd(archiver, "create", "test1", "input")
-    # borg warns about the corrupt files cache, but then continues without files cache.
-    assert "files cache is corrupted" in out
-
-
 def test_old_version_interfered(archiver):
     corrupt_archiver(archiver)
     if archiver.cache_path is None:


@@ -11,7 +11,6 @@
 import pytest
 
 from ... import platform
-from ...cache import get_cache_impl
 from ...constants import *  # NOQA
 from ...manifest import Manifest
 from ...platform import is_cygwin, is_win32, is_darwin
@@ -525,25 +524,6 @@ def test_create_pattern_intermediate_folders_first(archivers, request):
     assert out_list.index("d x/b") < out_list.index("- x/b/foo_b")
 
 
-@pytest.mark.skipif(get_cache_impl() != "adhoc", reason="only works with AdHocCache")
-def test_create_no_cache_sync_adhoc(archivers, request):  # TODO: add test for AdHocWithFilesCache
-    archiver = request.getfixturevalue(archivers)
-    create_test_files(archiver.input_path)
-    cmd(archiver, "repo-create", RK_ENCRYPTION)
-    cmd(archiver, "repo-delete", "--cache-only")
-    create_json = json.loads(
-        cmd(archiver, "create", "--no-cache-sync", "--prefer-adhoc-cache", "--json", "test", "input")
-    )
-    info_json = json.loads(cmd(archiver, "info", "-a", "test", "--json"))
-    create_stats = create_json["cache"]["stats"]
-    info_stats = info_json["cache"]["stats"]
-    assert create_stats == info_stats
-    cmd(archiver, "repo-delete", "--cache-only")
-    cmd(archiver, "create", "--no-cache-sync", "--prefer-adhoc-cache", "test2", "input")
-    cmd(archiver, "repo-info")
-    cmd(archiver, "check")
-
-
 def test_create_archivename_with_placeholder(archivers, request):
     archiver = request.getfixturevalue(archivers)
     create_test_files(archiver.input_path)
@@ -676,7 +656,7 @@ def test_file_status(archivers, request):
     assert "A input/file1" in output
     assert "A input/file2" in output
     # should find first file as unmodified
-    output = cmd(archiver, "create", "--list", "test2", "input")
+    output = cmd(archiver, "create", "--list", "test", "input")
     assert "U input/file1" in output
     # although surprising, this is expected. For why, see:
     # https://borgbackup.readthedocs.org/en/latest/faq.html#i-am-seeing-a-added-status-for-a-unchanged-file
@@ -693,13 +673,13 @@ def test_file_status_cs_cache_mode(archivers, request):
     time.sleep(1)  # file2 must have newer timestamps than file1
     create_regular_file(archiver.input_path, "file2", size=10)
     cmd(archiver, "repo-create", RK_ENCRYPTION)
-    cmd(archiver, "create", "test1", "input", "--list", "--files-cache=ctime,size")
+    cmd(archiver, "create", "test", "input", "--list", "--files-cache=ctime,size")
     # modify file1, but cheat with the mtime (and atime) and also keep same size:
     st = os.stat("input/file1")
     create_regular_file(archiver.input_path, "file1", contents=b"321")
     os.utime("input/file1", ns=(st.st_atime_ns, st.st_mtime_ns))
     # this mode uses ctime for change detection, so it should find file1 as modified
-    output = cmd(archiver, "create", "test2", "input", "--list", "--files-cache=ctime,size")
+    output = cmd(archiver, "create", "test", "input", "--list", "--files-cache=ctime,size")
     assert "M input/file1" in output
@@ -710,12 +690,12 @@ def test_file_status_ms_cache_mode(archivers, request):
     time.sleep(1)  # file2 must have newer timestamps than file1
     create_regular_file(archiver.input_path, "file2", size=10)
     cmd(archiver, "repo-create", RK_ENCRYPTION)
-    cmd(archiver, "create", "--list", "--files-cache=mtime,size", "test1", "input")
+    cmd(archiver, "create", "--list", "--files-cache=mtime,size", "test", "input")
     # change mode of file1, no content change:
     st = os.stat("input/file1")
     os.chmod("input/file1", st.st_mode ^ stat.S_IRWXO)  # this triggers a ctime change, but mtime is unchanged
     # this mode uses mtime for change detection, so it should find file1 as unmodified
-    output = cmd(archiver, "create", "--list", "--files-cache=mtime,size", "test2", "input")
+    output = cmd(archiver, "create", "--list", "--files-cache=mtime,size", "test", "input")
     assert "U input/file1" in output
@@ -726,9 +706,9 @@ def test_file_status_rc_cache_mode(archivers, request):
     time.sleep(1)  # file2 must have newer timestamps than file1
     create_regular_file(archiver.input_path, "file2", size=10)
     cmd(archiver, "repo-create", RK_ENCRYPTION)
-    cmd(archiver, "create", "--list", "--files-cache=rechunk,ctime", "test1", "input")
+    cmd(archiver, "create", "--list", "--files-cache=rechunk,ctime", "test", "input")
     # no changes here, but this mode rechunks unconditionally
-    output = cmd(archiver, "create", "--list", "--files-cache=rechunk,ctime", "test2", "input")
+    output = cmd(archiver, "create", "--list", "--files-cache=rechunk,ctime", "test", "input")
     assert "A input/file1" in output
@@ -748,7 +728,7 @@ def test_file_status_excluded(archivers, request):
     if has_lchflags:
         assert "- input/file3" in output
     # should find second file as excluded
-    output = cmd(archiver, "create", "test1", "input", "--list", "--exclude-nodump", "--exclude", "*/file2")
+    output = cmd(archiver, "create", "test", "input", "--list", "--exclude-nodump", "--exclude", "*/file2")
     assert "U input/file1" in output
     assert "- input/file2" in output
     if has_lchflags:
@@ -781,14 +761,14 @@ def to_dict(borg_create_output):
     create_regular_file(archiver.input_path, "testfile1", contents=b"test1")
     time.sleep(1.0 if is_darwin else 0.01)  # testfile2 must have newer timestamps than testfile1
     create_regular_file(archiver.input_path, "testfile2", contents=b"test2")
-    result = cmd(archiver, "create", "--stats", "test_archive2", archiver.input_path)
+    result = cmd(archiver, "create", "--stats", "test_archive", archiver.input_path)
     result = to_dict(result)
     assert result["Added files"] == 2
     assert result["Unchanged files"] == 0
     assert result["Modified files"] == 0
     # Archive a dir with 1 unmodified file and 1 modified
     create_regular_file(archiver.input_path, "testfile1", contents=b"new data")
-    result = cmd(archiver, "create", "--stats", "test_archive3", archiver.input_path)
+    result = cmd(archiver, "create", "--stats", "test_archive", archiver.input_path)
     result = to_dict(result)
     # Should process testfile2 as added because of
     # https://borgbackup.readthedocs.io/en/stable/faq.html#i-am-seeing-a-added-status-for-an-unchanged-file
@@ -826,18 +806,18 @@ def test_create_topical(archivers, request):
     output = cmd(archiver, "create", "test", "input")
     assert "file1" not in output
     # shouldn't be listed even if unchanged
-    output = cmd(archiver, "create", "test0", "input")
+    output = cmd(archiver, "create", "test", "input")
     assert "file1" not in output
     # should list the file as unchanged
-    output = cmd(archiver, "create", "test1", "input", "--list", "--filter=U")
+    output = cmd(archiver, "create", "test", "input", "--list", "--filter=U")
     assert "file1" in output
     # should *not* list the file as changed
-    output = cmd(archiver, "create", "test2", "input", "--list", "--filter=AM")
+    output = cmd(archiver, "create", "test", "input", "--list", "--filter=AM")
     assert "file1" not in output
     # change the file
     create_regular_file(archiver.input_path, "file1", size=1024 * 100)
     # should list the file as changed
-    output = cmd(archiver, "create", "test3", "input", "--list", "--filter=AM")
+    output = cmd(archiver, "create", "test", "input", "--list", "--filter=AM")
     assert "file1" in output


@@ -1,17 +1,17 @@
-import os.path
+import os
 
 import pytest
 
 from .hashindex import H
 from .key import TestKey
 from ..archive import Statistics
-from ..cache import AdHocCache
+from ..cache import AdHocWithFilesCache
 from ..crypto.key import AESOCBRepoKey
 from ..manifest import Manifest
 from ..repository import Repository
 
 
-class TestAdHocCache:
+class TestAdHocWithFilesCache:
     @pytest.fixture
     def repository(self, tmpdir):
         self.repository_location = os.path.join(str(tmpdir), "repository")
@@ -32,7 +32,7 @@ def manifest(self, repository, key):
     @pytest.fixture
     def cache(self, repository, key, manifest):
-        return AdHocCache(manifest)
+        return AdHocWithFilesCache(manifest)
 
     def test_does_not_contain_manifest(self, cache):
         assert not cache.seen_chunk(Manifest.MANIFEST_ID)
@@ -40,11 +40,6 @@ def test_does_not_contain_manifest(self, cache):
     def test_seen_chunk_add_chunk_size(self, cache):
         assert cache.add_chunk(H(1), {}, b"5678", stats=Statistics()) == (H(1), 4)
 
-    def test_files_cache(self, cache):
-        assert cache.file_known_and_unchanged(b"foo", bytes(32), None) == (False, None)
-        assert cache.cache_mode == "d"
-        assert cache.files is None
-
     def test_reuse_after_add_chunk(self, cache):
         assert cache.add_chunk(H(3), {}, b"5678", stats=Statistics()) == (H(3), 4)
         assert cache.reuse_chunk(H(3), 4, Statistics()) == (H(3), 4)
@@ -52,3 +47,9 @@ def test_reuse_after_add_chunk(self, cache):
     def test_existing_reuse_after_add_chunk(self, cache):
         assert cache.add_chunk(H(1), {}, b"5678", stats=Statistics()) == (H(1), 4)
         assert cache.reuse_chunk(H(1), 4, Statistics()) == (H(1), 4)
+
+    def test_files_cache(self, cache):
+        st = os.stat(".")
+        assert cache.file_known_and_unchanged(b"foo", bytes(32), st) == (False, None)
+        assert cache.cache_mode == "d"
+        assert cache.files == {}


@@ -104,7 +104,6 @@ def archiver(tmp_path, set_env_variables):
     archiver.patterns_file_path = os.fspath(tmp_path / "patterns")
     os.environ["BORG_KEYS_DIR"] = archiver.keys_path
     os.environ["BORG_CACHE_DIR"] = archiver.cache_path
-    # os.environ["BORG_CACHE_IMPL"] = "adhocwithfiles"
     os.mkdir(archiver.input_path)
     os.chmod(archiver.input_path, 0o777)  # avoid troubles with fakeroot / FUSE
     os.mkdir(archiver.output_path)