sapling/eden/scm/tests/test-fb-hgext-remotefilelog-ruststores-lfs-duplicated.t

#chg-compatible
$ newserver master
$ setconfig extensions.lfs= lfs.url=file:$TESTTMP/lfs-server
$ clone master shallow --noupdate
$ switchrepo shallow
$ setconfig extensions.lfs= lfs.url=file:$TESTTMP/lfs-server lfs.threshold=10B
$ echo "THIS IS AN LFS BLOB" > x
$ hg commit -qAm x
# Copy the packfiles that contain LFS pointers before they get removed by the following repack.
$ cp .hg/store/packs/*.data{pack,idx} $TESTTMP
$ setconfig remotefilelog.lfs=True remotefilelog.localdatarepack=True
revisionstore: disallow reading LFS pointers from packfiles Summary: For repositories that have the old-style LFS extension enabled, the pointers are stored in packfiles/indexedlog along with a flag that signifies to the upper layers that the blob is externally stored. With the new way of doing LFS, pointers are stored separately. When both are enabled, we are observing some interesting behavior where different get and get_meta calls may return different blobs/metadata for the same filenode. This may happen if a filenode is stored both in a packfile as an LFS pointer and in the LFS store. Guaranteeing that the revisionstore code is deterministic in this situation is unfortunately way too costly (a get_meta call would for instance have to fully validate the sha256 of the blob, and this wouldn't guarantee that it wouldn't become corrupted on disk before calling get). The solution taken here is to simply ignore all the LFS pointers from packfiles/indexedlog when remotefilelog.lfs is enabled. This way, there is no risk of reading the metadata from the packfiles and the blob from the LFSStore. This, however, brings another complication for user-created blobs: these are stored in packfiles and would thus become unreadable. The solution is to simply perform a one-time full repack of the local store to make sure that all the pointers are moved from the packfiles to the LFSStore. In the code, the Python bindings use ExtStoredPolicy::Ignore directly, as these are only used in the treemanifest code where no LFS pointers should be present; the repack code uses ExtStoredPolicy::Use to be able to read the pointers, which it wouldn't be able to do otherwise. Reviewed By: DurhamG Differential Revision: D22951598 fbshipit-source-id: 0e929708ba5a3bb2a02c0891fd62dae1ccf18204
2020-09-10 04:26:00 +03:00
$ setconfig remotefilelog.maintenance.timestamp.localrepack=1 remotefilelog.maintenance=localrepack
$ hg repack
Running a one-time local repack, this may take some time
Done with one-time local repack
# Copy back the packfiles. We now have a filenode with a pointer in 2 different locations: the packfile and the LFS store.
$ cp "$TESTTMP/"*.data{pack,idx} .hg/store/packs
# Make sure that bundle isn't confused by this.
$ hg bundle -q -r . $TESTTMP/test-bundle
$ clone master shallow2 --noupdate
$ switchrepo shallow2
$ setconfig remotefilelog.lfs=True lfs.url=file:$TESTTMP/lfs-server lfs.threshold=10GB
$ hg unbundle -q -u $TESTTMP/test-bundle
$ cat x
THIS IS AN LFS BLOB
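
The read policy described in the commit message can be illustrated with a small model. This is only a sketch in Python, not the actual Rust revisionstore code: the `Entry`, `PackStore`, and `read_blob` names are hypothetical, and the `ExtStoredPolicy` enum here merely mirrors the two variants named in the commit message (`Use` and `Ignore`). It assumes a filenode that exists both as a flagged pointer in a packfile and as a blob in the LFS store, which is the situation this test sets up.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class ExtStoredPolicy(Enum):
    """Hypothetical mirror of the two variants named in the commit message."""
    USE = "use"        # surface LFS pointers found in packfiles (needed by repack)
    IGNORE = "ignore"  # treat packfile entries flagged as LFS pointers as absent


@dataclass(frozen=True)
class Entry:
    data: bytes
    is_lfs_pointer: bool = False  # models the "externally stored" flag


class PackStore:
    """Toy stand-in for a packfile/indexedlog store (an assumption, not the real API)."""

    def __init__(self, policy: ExtStoredPolicy) -> None:
        self.policy = policy
        self.entries: Dict[str, Entry] = {}

    def get(self, node: str) -> Optional[Entry]:
        entry = self.entries.get(node)
        if entry is None:
            return None
        # Under IGNORE, a pointer entry is skipped so that data and metadata
        # for the filenode can only come from the LFS store.
        if entry.is_lfs_pointer and self.policy is ExtStoredPolicy.IGNORE:
            return None
        return entry


def read_blob(node: str, pack: PackStore, lfs_store: Dict[str, bytes]) -> Optional[bytes]:
    """Look in the pack first, then fall back to the LFS store."""
    entry = pack.get(node)
    if entry is not None:
        return entry.data
    return lfs_store.get(node)


def read_duplicated(policy: ExtStoredPolicy) -> Optional[bytes]:
    """Read a filenode that is present in both the packfile and the LFS store."""
    pack = PackStore(policy)
    pack.entries["node-x"] = Entry(b"old-style pointer", is_lfs_pointer=True)
    lfs_store = {"node-x": b"THIS IS AN LFS BLOB\n"}
    return read_blob("node-x", pack, lfs_store)
```

In this model, reading with `ExtStoredPolicy.IGNORE` always resolves to the LFS store, while `ExtStoredPolicy.USE` still returns the packfile pointer, which is what a repack would need in order to migrate it.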