Commit Graph

8 Commits

Jun Wu
034f666e1c lfs: use sortdict to stabilize tests
Summary: Previously, the verbose output was not stable.

Differential Revision: D6901941

fbshipit-source-id: 2423612f3a620d92a0bd515f311ca70614bbe001
2018-04-13 21:51:04 -07:00
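The fix above swaps a plain dict for Mercurial's insertion-ordered sortdict so that verbose output is deterministic. As a general illustration of the idea (not the actual change), stable output can be produced by fixing the key order before rendering; the helper name here is made up:

```python
# Illustrative sketch only: render dict contents in sorted key order so
# that verbose/debug output does not depend on hash/iteration order.
def stable_repr(d):
    # Sorting the keys makes the rendered string deterministic.
    return "{" + ", ".join("%s: %s" % (k, d[k]) for k in sorted(d)) + "}"

print(stable_repr({"b": 2, "a": 1, "c": 3}))  # → {a: 1, b: 2, c: 3}
```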
Jun Wu
fb3c9c5675 lfs: add adhoc commands to upload and download blobs outside a repo
Summary:
We'd like to have a lightweight, portable tool to upload and download certain
binaries without checking them in.

This diff makes LFS do that. It's intended to be used for Rust vendored assets,
Cython and other dependencies' tarballs, and a prebuilt Python using MSVC on
Windows.

Compared with dewey, this would unblock our Windows build.

Test Plan: Added a test.

Reviewers: durham, #mercurial

Reviewed By: durham

Subscribers: durham

Differential Revision: https://phabricator.intern.facebook.com/D6699099

Signature: 6699099:1515633647:2fb90c8ecb4395b0b12e8e8baf1c5ee7fa4d84b0
2018-01-10 16:07:18 -08:00
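Uploading and downloading blobs outside a repo means speaking the git-lfs "batch" API directly. A hedged sketch of the client side of that request follows; the endpoint and helper name are hypothetical, but the JSON shape follows the LFS batch specification:

```python
import json

# Hypothetical helper sketching the git-lfs batch API request body that
# an adhoc upload/download command would POST to <url>/objects/batch
# with Content-Type "application/vnd.git-lfs+json".
def build_batch_request(operation, oid, size):
    # operation is "download" or "upload"; oid is the blob's sha256 hex.
    return {
        "operation": operation,
        "transfers": ["basic"],
        "objects": [{"oid": oid, "size": size}],
    }

print(json.dumps(build_batch_request("download", "ab" * 32, 1024)))
```

The server's response maps each object to signed "actions" (e.g. a download href) that the client then follows with plain HTTP.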
Jun Wu
f076d26725 lfs: remove internal url in test
Summary:
D5617283 requires test-lfs-test-server.t to be able to resolve an internal
domain, and also requires certain implementation details at the endpoint
(ex. setting error.code).

That will not work outside the internal network. Let's change the test to just
use `lfs-test-server`. Note the logic also needs to be changed, since
lfs-test-server does not set error.code to 404 but instead removes "download"
from "actions".

Test Plan: Ran all tests.

Reviewers: durham, davidsp, #mercurial

Reviewed By: durham

Differential Revision: https://phabricator.intern.facebook.com/D6698170

Signature: 6698170:1515633241:2496a4c02de6916a8f74ac67c4628e6e3a049b1b
2018-01-10 15:08:16 -08:00
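The behavioral difference described above can be sketched as a small availability check over a batch-response object; the function name is illustrative, but the field names ("error", "code", "actions", "download") follow the git-lfs batch API:

```python
# Hedged sketch: a blob is unavailable either when the server reports an
# explicit 404 in "error.code", or (as lfs-test-server does) when it
# silently omits the "download" action from "actions".
def blob_available(batch_object):
    error = batch_object.get("error")
    if error is not None and error.get("code") == 404:
        return False  # explicit per-object error from the server
    # lfs-test-server instead drops "download" from "actions"
    return "download" in batch_object.get("actions", {})

print(blob_available({"oid": "x", "actions": {"download": {"href": "u"}}}))
print(blob_available({"oid": "x", "actions": {}}))
```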
Wojciech Lis
f2ebe55279 lfs: using workers in lfs prefetch
This significantly speeds up lfs prefetch. With a fast network we are
seeing a ~50% improvement in overall prefetch times.
Because of the workers' API on POSIX, we lose fine-grained progress updates and
only see progress when a file finishes downloading.

Test Plan:
Run tests:
./run-tests.py -l test-lfs*
....
# Ran 4 tests, 0 skipped, 0 failed.
Run commands resulting in lfs prefetch e.g. hg sparse --enable-profile

Differential Revision: https://phab.mercurial-scm.org/D1568
2017-12-11 17:02:02 -08:00
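The shape of the change can be sketched with a standard-library worker pool; `fetch_one` is a stand-in for the real transfer code, and the actual extension uses Mercurial's own worker module rather than concurrent.futures:

```python
from concurrent.futures import ThreadPoolExecutor

# Hedged sketch of parallel blob prefetch: hand oids to a pool of
# workers and tick progress only when a whole file completes (the
# coarser granularity mentioned above).
def fetch_one(oid):
    return oid, len(oid)  # stand-in for downloading one blob

def prefetch(oids, workers=4):
    done = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for oid, nbytes in pool.map(fetch_one, oids):
            done.append(oid)  # progress only advances per finished file
    return done

print(prefetch(["aa", "bb", "cc"]))
```

`pool.map` yields results in input order, so the progress sequence is deterministic even though downloads overlap.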
Matt Harbison
1eac95a707 lfs: introduce a user level cache for lfs files
This is the same mechanism in place for largefiles, and solves several problems
working with multiple local repositories.  The existing largefiles method is
reused in place, because I suspect that there are other functions that can be
shared.  If we wait a bit to identify more before `hg cp lfutil.py ...`, the
history will be easier to trace.

The push between repo14 and repo15 in test-lfs.t arguably shouldn't be uploading
any files with a local push.  Maybe we can revisit that when `hg push` without
'lfs.url' can upload files to the push destination.  Then it would be consistent
for blobs in a local push to be linked to the local destination's cache.

The cache property is added to run-tests.py, the same as the largefiles
property, so that test-generated files don't pollute the real location.  Having
files available locally broke a couple of existing lfs-test-server tests, so the
cache is cleared in a few places to force file download.
2017-12-06 22:56:15 -05:00
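The user-level cache amounts to a lookup-then-link step in front of the download path. A minimal sketch of that flow, with illustrative paths and a hypothetical `download` callback (the real code reuses largefiles' lfutil machinery):

```python
import os
import shutil
import tempfile

# Hedged sketch of a user-level blob cache shared across local clones:
# fill the cache on a miss, then hardlink (or copy) hits into the
# repo-local store so repeated clones never re-download.
def fetch_via_cache(oid, cache_dir, store_dir, download):
    cached = os.path.join(cache_dir, oid)
    target = os.path.join(store_dir, oid)
    if not os.path.exists(cached):
        with open(cached, "wb") as f:
            f.write(download(oid))  # miss: populate the shared cache
    try:
        os.link(cached, target)  # cheap hardlink on one filesystem
    except OSError:
        shutil.copyfile(cached, target)  # fall back across filesystems
    return target

cache = tempfile.mkdtemp()
store = tempfile.mkdtemp()
path = fetch_via_cache("deadbeef", cache, store, lambda oid: b"blob")
print(open(path, "rb").read())  # → b'blob'
```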
Matt Harbison
555700adde test-lfs: allow the test server to be killed on Windows
Apparently '$!' doesn't return a Win32 PID, so the process was never killed, and
the next run was screwed up.  Oddly, without the explicit killdaemons.py at the
end, the test seems to hang.  This spawning is just sad, so I limited it to
Windows.
2017-11-21 00:24:09 -05:00
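The underlying issue is that the shell's `$!` is not a usable Win32 PID, so a portable kill has to go through the real OS PID that the daemon writes to a pidfile, which is what killdaemons.py does. A hedged sketch of that pidfile approach (helper names are illustrative):

```python
import os
import signal
import tempfile

# Hedged illustration of the pidfile pattern: the daemon records its
# real OS PID in a file, and the harness kills by that PID instead of
# trusting the shell's $! (not a Win32 PID on Windows).
def read_pid(pidfile):
    with open(pidfile) as f:
        return int(f.read().strip())

def kill_daemon(pidfile):
    os.kill(read_pid(pidfile), signal.SIGTERM)

with tempfile.NamedTemporaryFile("w", delete=False, suffix=".pid") as f:
    f.write("12345\n")
print(read_pid(f.name))  # → 12345
```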
Matt Harbison
f4937719cf hghave: add a check for lfs-test-server
This is consistent with how the other tests require a feature.
2017-11-14 22:35:42 -05:00
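hghave gates tests on named features via registered check functions. A self-contained sketch in that style follows; the registry and decorator mirror how hghave.py is organized, though the upstream detection may differ (e.g. matching `lfs-test-server --help` output rather than a PATH lookup):

```python
import shutil

# Hedged sketch of an hghave-style feature check registry.
checks = {}

def check(name, desc):
    # Register a named feature whose presence gates tests.
    def register(func):
        checks[name] = (func, desc)
        return func
    return register

@check("lfs-test-server", "lfs-test-server serving blobs")
def has_lfsserver():
    # Illustrative detection: is the binary on PATH?
    return shutil.which("lfs-test-server") is not None

print("lfs-test-server" in checks)  # → True
```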
Matt Harbison
85811b33f9 lfs: import the Facebook git-lfs client extension
The purpose of this is the same as the built-in largefiles extension: to handle
huge files outside of the normal storage system, generally to reduce the amount
of data that is cloned.  There are several benefits of implementing the
git-lfs protocol instead of using the largefiles extension:

  - Bitbucket and Github support (and probably wider support in 3rd party
    hosting sites in general). [1][2]

  - The number of hg internals monkey patched is several orders of magnitude
    lower, so it will be easier to reason about and maintain.  Future commands
    will likely just work, without requiring various wrappers.

  - The "standin" files are only written to the filelog, not the disk.  That
    should avoid weird edge cases where the largefile and standin files get out
    of sync. [3]  It also avoids the occasional printing of the "hidden" standin
    file in various messages.

  - Filesets like size() will work, even if the file isn't present.  (It always
    says 41 bytes for largefiles, whether present or not.)
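The "standin" that lfs records in the filelog is a small pointer payload rather than an on-disk file. A hedged sketch of that payload, following the git-lfs pointer format (Mercurial's lfs stores it via a flag processor; the helper name is made up):

```python
import hashlib

# Hedged sketch of the pointer ("standin") text recorded in place of the
# real blob, per the git-lfs pointer file format: a version line, the
# blob's sha256 oid, and its size in bytes.
def make_pointer(data):
    oid = hashlib.sha256(data).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:%s\n"
        "size %d\n" % (oid, len(data))
    )

print(make_pointer(b"hello"))
```

Because only this small text reaches the filelog, the pointer and blob cannot drift out of sync the way a largefiles standin on disk can.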

The only place that I see where largefiles comes out on top is that it works
with `hg serve` for simple sharing, without external infrastructure.  Getting
lfs-test-server working was a hassle, and took a while to figure out.  Maybe we
can do something to make it work in the future.

Long term, I expect that this will be highly preferred over largefiles.  But if
we are to recommend this to largefile users, there are some UI issues to
bikeshed.  Until they are resolved, I've marked this experimental, and am not
putting a pointer to this in the largefiles help.  The (non-exhaustive) list of
issues I've seen so far are:

  - It isn't sufficient to just enable the largefiles extension; you have to
    explicitly add a file with --large before it will pay attention to the
    configured sizes and patterns on future adds.  The justification being that
    once you use it, you're stuck with it.  I've seen people confused by this,
    and haven't liked it myself.  But it's also saved me a few times.  Should we
    do something like have a specific enabling config setting that must be set
    in the local repo config, so that enabling this extension in the user or
    system hgrc doesn't silently start storing lfs files?

  - The largefiles extension adds a repo requirement when the first largefile is
    committed, so that the extension must always be enabled in the future.  This
    extension is not doing that, and since I only enabled it locally to avoid
    infecting other repos, I got a cryptic error about missing flag processors
    when I cloned.  Is there no repo requirement due to shallow/narrow clone
    considerations (or other future advanced things)?

  - In the (small amount of) reading I've done about the git implementation, it
    seems that the files and sizes are stored in a tracked .gitattributes file.
    I think a tracked file for this would be extremely useful for consistency
    across developers, but this kind of touches on the tracked hgrc file
    proposal a few months back.

  - The git client can specify file patterns, not just sizes.

  - The largefiles extension has a cache directory in the local repo, but also a
    system wide one.  We should probably implement a system wide cache too, so
    that multiple clones don't have to refetch the files from the server.

  - Jun mentioned other missing features, like SSH authentication, gc, etc.

The code corresponds to c0492b73c7ef in hg-experimental. [4]  The only tweaks
are to load the extension in the tests with 'lfs=' instead of
'lfs=$TESTDIR/../hgext3rd/lfs', change the import in the *.py test to hgext
(from hgext3rd), add the 'testedwith' declaration, and mark it experimental for
now.  The infinite-push, p4fastimport, and remotefilelog tests were left behind.
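Loading the extension with 'lfs=' corresponds to an hgrc stanza along these lines; the [lfs] values below are illustrative only (the commit itself mentions 'lfs.url', and a size threshold existed in this era of the extension):

```ini
[extensions]
# In-tree form used by the tests ('lfs='); the old out-of-tree form
# pointed at hgext3rd/lfs explicitly.
lfs =

[lfs]
# Illustrative settings, not a recommendation:
url = https://example.com/lfs
threshold = 10M
```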

The devel-warnings for unregistered config options are not corrected yet, nor
are the import check warnings.

[1] https://www.mercurial-scm.org/pipermail/mercurial/2017-November/050699.html
[2] https://bitbucket.org/site/master/issues/3843/largefiles-support-bb-3903
[3] https://bz.mercurial-scm.org/show_bug.cgi?id=5738
[4] https://bitbucket.org/facebook/hg-experimental
2017-11-14 00:06:23 -05:00