Commit Graph

28 Commits

Author SHA1 Message Date
Jun Wu
4532efb04d lfs: split prepush hook into individual functions
Summary:
This makes it possible to reuse part of them - like uploading blobs for
given revisions without going through prepush hook.

`pointer.tostoreids()` was changed to `pointer.tostoreid()` to simplify
things a bit.

Unnecessary remoterepo assignment was removed.

Test Plan: `arc unit`

Reviewers: #mercurial, davidsp, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5009560

Signature: t1:5009560:1494230285:6469a2701baa8cfa4511a08149a37fc429733343
2017-05-08 11:20:50 -07:00
Jun Wu
3793eee754 lfs: remove 40 char length limit
Summary:
The `lfs-test-server` reference implementation [1] validates the hash, and
40-byte sha256 hexdigest will not pass that check.

[1]: https://github.com/git-lfs/lfs-test-server

Test Plan: A test will be added in the next diff.

Reviewers: #mercurial, davidsp

Reviewed By: davidsp

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5009319

Signature: t1:5009319:1494006078:32950c5490935d9786553fc07d01f1fc92aacf25
2017-05-08 11:12:11 -07:00
Jun Wu
3f004c2689 lfs: ensure storeid.size is an integer
Summary:
The request JSON data used in the batch API should have `size` field as an
integer, not a string. This was discovered when testing against GitHub's
`lfs-test-server` implementation [1]. The latter was written in Golang and has
strong type requirement, it will treat `{"size": "42"}` as `{"size": 0}`.

[1]: https://github.com/git-lfs/lfs-test-server

Test Plan:
An integration test with `lfs-test-server` will be added once remaining issues
are resolved.

Reviewers: #mercurial, ikostia, davidsp

Reviewed By: ikostia, davidsp

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5009248

Signature: t1:5009248:1493977299:5daf8d32cd8c933be71a41afcc7ff832eb7edb5a
2017-05-08 11:11:47 -07:00
Jun Wu
fd5d8b9996 lfs: simplify blobstore config options
Summary:
This diff simplifies lfs remote server configs to a single item: `url`,
similar to what git-lfs has.

Compare:

```
  Before                        | After
 -------------------------------+-----------------------------------
  remoteurl = http://a.com/lfs  | url = http://foo:pass@a.com/lfs
  remoteuser = foo              |
  remotepassword = pass         |
  remotestore = git-lfs         |
  ------------------------------+-----------------------------------
  remotepath = /tmp/lfs-test    | url = file:///tmp/lfs-test
  remotestore = dummy           |
```

Test Plan: Modified existing cases.

Reviewers: #mercurial, davidsp, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5008882

Signature: t1:5008882:1494230125:822b0e92f45dff2a37e26e6b3e44b559b4a47e6d
2017-05-08 11:11:14 -07:00
David Soria Parra
cd7eed4dd0 lfs: ensure svfs is set
Summary:
Some remotestores (in particular null) do not have a svfs.
Check for it before we set it.

Test Plan:
Used a nullstore with a threshold set during push and saw it
now worked.

Reviewers: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4984664
2017-05-04 16:18:32 -07:00
Jun Wu
64daf52ec8 lfs: use lfsvfs in dummy store
Summary: This simplifies code a lot.

Test Plan: Updated existing test

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5001837

Signature: t1:5001837:1493885337:eea8b5cba7234453b32eba46fd58b4d5904367bf
2017-05-04 09:25:20 -07:00
Jun Wu
fb4cd9891d lfs: remove chunking feature
Summary:
Per discussion with @davidsp, we want to stick to Git-LFS specification and
avoid non-standard behavior. The chunking behavior will happen at LFS server
transparently.

The direct motivation for this is to make it possible to implement an
efficient `filectx.cmp` that just compares hashes.

Test Plan: Updated existing test

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: mjpieters, davidsp

Differential Revision: https://phabricator.intern.facebook.com/D5001827

Signature: t1:5001827:1493914639:c58694873e79a8ca910bb8ee01bf593885896664
2017-05-04 09:20:34 -07:00
Jun Wu
3d56581781 lfs: store isbinary information in LFS metadata
Summary:
Usually LFS files are binary files. But there could be exceptions. This diff
adds a new customized field `x-is-binary` to record those exceptions.

The `filectx.isbinary` API is changed to use that metadata as a fast path.

This allows us to provide a transparent user experience (whether a file is
stored in LFS or not does not affect its original `isbinary` property),
while still being able to skip loading the LFS blob if the file is binary.

Test Plan: Added a new test case

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5001117

Signature: t1:5001117:1493892406:2a8ebd42d0ae0dbae39b87b9ea16db22b99f3d12
2017-05-04 09:18:04 -07:00
Jun Wu
d3f8c4e9dc lfs: override filelog.renamed code path
Summary:
An `hg pull` test triggers the following code path server-side when using
remotefilelog:

  ...
  remote:   File "/usr/lib64/python2.7/site-packages/remotefilelog/remotefilelogserver.py", line 308, in streamer
  remote:     text = _loadfileblob(repo, cachepath, path, node)
  remote:   File "/usr/lib64/python2.7/site-packages/remotefilelog/remotefilelogserver.py", line 223, in _loadfileblob
  remote:     text = createfileblob(filectx)
  remote:   File "/usr/lib64/python2.7/site-packages/remotefilelog/remotefilelogserver.py", line 348, in createfileblob
  remote:     ancestors.extend([f for f in filectx.ancestors()])
  remote:   File "/usr/lib64/python2.7/site-packages/mercurial/context.py", line 1072, in ancestors
  remote:     for parent in c.parents()[:cut]:
  remote:   File "/usr/lib64/python2.7/site-packages/mercurial/context.py", line 923, in parents
  remote:     r = fl.renamed(self._filenode)
  remote:   File "/usr/lib64/python2.7/site-packages/mercurial/filelog.py", line 62, in renamed
  remote:     t = self.revision(node)
  ...

That triggers downloading a blob. We don't want to do that server-side. So
override the `renamed` method to use LFS fast path to test rename.

Practically, this reverts part of D4906074.

Test Plan: Run existing tests. This change was made on the server.

Reviewers: davidsp, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4992421

Signature: t1:4992421:1493802628:2bf2cf819bfed2aa61ea1c2323c03ab428732815
2017-05-03 11:10:48 -07:00
David Soria Parra
f8bb4e2122 lfs: add a 'null' store that doesn't require any setup
Summary:
We have to use the dummy store to ensure we don't try to open an HTTP
connection during push, which can fail and must not happen during p4fastimport.
However `dummy` requires some configuration. So just add an internal `null`
store that allows us to operate LFS in a mode that will ignore all large files
completely.

Test Plan: used it in p4fastimport

Reviewers: #sourcecontrol, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4963638

Signature: t1:4963638:1493316573:c012bd97794f9a57c3cf8c15d868a67ae3c03c31
2017-04-27 11:13:54 -07:00
Jun Wu
0d6151f530 remotefilelog: add lfs integration test
Summary:
The test covers common workflows like clone, commit, push, update, pull. It
exercises the remotefilelog plain store and Python datapack store to make sure
they won't lose the revlog flag. The test also tries to verify rename works
correctly.

Since the lfs extension may be eventually upstreamed, it seems a good idea to
make remotefilelog call `lfs.wrapfilelog` so lfs is free from remotefilelog
code.

Test Plan: Added a test

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4904281

Signature: t1:4904281:1492560308:5fd9f214ada6de795735ea7d737d30c1bf39812a
2017-04-26 19:55:02 -07:00
Jun Wu
00a9465989 lfs: be compatible with filelog metadata
Summary:
This diff changes lfs `revision(raw=False)` output to include hg filelog
metadata. The LFS blob does not contain filelog metadata as before.

This hurts performance if there is a rename, or the binary starts with the
magic `\1\n`. But compatibility is greatly improved - it's now possible to swap
a non-lfs revision with mercurial rename to a lfs revision, and easier to be
compatible with remotefilelog (namely, remotefilelog defers filelog.add until
commit hash is known).

Test Plan: Modified existing test.

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: rmcelroy, durham, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4906074

Signature: t1:4906074:1492559513:09b25fc1026d4ce8fd784a044d6724f12e8bda45
2017-04-21 19:56:27 -07:00
Jun Wu
11cd7af681 lfs: add a fast path for filelog.size
Summary:
Previously, `filelog.size` requires loading the lfs blob in memory. This
makes it unnecessary.

Test Plan: Run existing tests.

Reviewers: #mercurial, mitrandir

Reviewed By: mitrandir

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4861358

Signature: t1:4861358:1491923349:f1d2fe5656158854e1dd50987ea3db64a9793db6
2017-04-11 13:20:58 -07:00
Jun Wu
2f1756608e lfs: remove manifest walk in prepush hook
Summary:
`ctx.files` should contain changed files from both p1 and p2. There is no
need to walk the manifest, which could also be painfully slow for large
repos. So the manifest walk got removed.

Test Plan: `arc unit`

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4856552

Signature: t1:4856552:1491639067:8080f9bc7e246a8d36fcc3504716e9c95ee24fcf
2017-04-10 11:22:54 -07:00
Jun Wu
46d44c8b07 lfs: use non-chunking spec format if possible
Summary:
The "chunking" feature is not specified by [the current Git LFS standard](21e1695220/docs/spec.md).

Therefore avoid using it if possible - if there is only one chunk, use the
standard specification (`https://git-lfs.github.com/spec/v1`).

An upload message is slightly changed to be more accurate.

Test Plan: Changed existing tests

Reviewers: davidsp, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4856527

Signature: t1:4856527:1491638822:1c6f555b706e7bb22dd9090afa156f2161bf9f7f
2017-04-10 11:17:48 -07:00
Jun Wu
38632025a9 lfs: add bundle support
Summary:
This diff adds bundle support for lfs:

  - Let `hg bundle` use changegroup3 instead of changegroup2 to record revlog
    flags.
  - Hook related functions so `hg -R bundle.hg` works with LFS.

Test Plan: Added a test

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4856335

Signature: t1:4856335:1491638660:d56ba54ea5f59262f009418b9c7e48c8f2a25ed6
2017-04-10 11:11:37 -07:00
Jun Wu
cdc9accca5 lfs: do not store mercurial filelog metadata in lfs blobs
Summary:
Per discussion with @davidsp, it's better for LFS to not store Mercurial
filelog metadata, which is currently used to store rename information. That has
many advantages:

  - Large blobs could be reused across renames
  - No need to special handle files starting with `\1\n`
  - P4 LFS server implementation is much easier
  - remotefilelog LFS support is easier and cleaner

That said, the rename information is stored as lfs metadata using the
non-standard `x-hg-copy`, `x-hg-copyrev` keys. So they still exist and are
functional.

The disadvantage is that rename gets no longer hashed, which is probably fine.


Test Plan: Added a test

Reviewers: davidsp, #sourcecontrol, rmcelroy

Reviewed By: rmcelroy

Subscribers: jsgf, rmcelroy, stash, mjpieters, davidsp

Differential Revision: https://phabricator.intern.facebook.com/D4849764

Signature: t1:4849764:1491580506:1d80ad476b9cbd6773843cb52aee6745f478a0b0
2017-04-07 18:29:35 -07:00
Jun Wu
de84869150 lfs: cleanup user-facing messages
Summary:
The diff cleans up messages shown to the user. It makes verbose messages
gated by `if ui.verbose`, and simplifies some words.

The resulting user experience is, when there is no large file involved, lfs
shows nothing. When there are largefiles being downloaded or uploaded, show
progress bar if it takes long. The progress bar is the only user visible
output from lfs by default.

Test Plan: `rt test-lfs.t`

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4813297

Signature: t1:4813297:1491213089:c9ec363eb65df4c85282c90b230da3321b93b5e0
2017-04-05 15:58:56 -07:00
Jun Wu
b1cdf3f0e6 lfs: remove setup.py
Summary:
The logic in `setup.py` is now simple enough to be moved to `reposetup`.
`setup.py` becomes unused and gets removed.

Test Plan: `rt test-lfs.t`

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4813193

Signature: t1:4813193:1491212154:0238ea448269b0e5516c927c1fa989d6d9130d4a
2017-04-05 15:49:08 -07:00
Jun Wu
f0b6b81970 lfs: simplify remote blobstore.logic
Summary:
Previously, the remote blobstore could be either `git-lfs` or `dummy`. The
application code does not really care, it only wants a "remote" blobstore.

This diff adds a factory method and makes `git-lfs` and `dummy` stores private.

The `@staticmethod get(vfs)` interface is also removed as it's duplicated and
unnecessary - as long as mercurial calls `reposetup`, the blob store objects
are set, and they cannot be missing.

The error message about an unsupported store is also changed to be consistent
with mercurial style. A test was added to test the error.

Test Plan: `rt test-lfs.t`

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4813023

Signature: t1:4813023:1491382755:b9d8ec6518141d0ba8263e16c53f430ce80c39f0
2017-04-05 15:48:10 -07:00
Jun Wu
039e232a0f lfs: fix an error caused by a wrong merge resolution
The change should belong to D4812131. But the change gets lost during a wrong
merge resolution. This patch fixes it.
2017-04-04 16:22:31 -07:00
Jun Wu
00e6156a07 lfs: let blobstore.local and blobstore.remote take an repo argument
Summary:
Previously local takes a path, remote takes a ui. This diff makes it more
consistent.

Test Plan: `rt test-lfs.t`

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4812884

Signature: t1:4812884:1491211049:6b3709e3b7e054b11d4a3f9a92e7921fe55de9fa
2017-04-04 16:13:16 -07:00
Jun Wu
ead899b26b lfs: cleanup option passing
Summary:
For options passed to revlog, use `repo.svfs.options`. For objects, use
`repo.svfs.objname`.

Unused assignments and unused functions are removed.

Test Plan: `arc unit`

Reviewers: #mercurial, simonfar, rmcelroy

Reviewed By: simonfar, rmcelroy

Subscribers: rmcelroy, simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4812131

Signature: t1:4812131:1491210581:004297885b8b09d23b58e0a3bee147910615ce62
2017-04-04 16:10:29 -07:00
Jun Wu
6d19f83af9 lfs: add a "bypass" config option
Summary:
The bypass option limits lfs's functionality to only skip hash checks. It is
intended to be used server-side, to make it more predictable - the server
never interacts with the lfs blob service.

Test Plan: Added a test case

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4811828

Signature: t1:4811828:1491301114:c2e3c4200ce4cc84b9c5872a8b9a040176bb002a
2017-04-04 16:08:36 -07:00
Jun Wu
abb7a83354 lfs: make tests stronger
Summary:
This diff makes the `test-lfs.t` much more stronger. It reveals a lot of core
hg issues in this area. I'll send patches to fix them all.

Regarding on lfs, there are some changes:

  - An existence check in its push hook was added. Otherwise pushing a revision
    with rename will cause crash.
  - The "read" processor is responsible for downloading blobs, and translate
    raw revision to lfs text. It should always return lfs text. But it may
    return raw revision text on error currently. That error handler was
    removed to avoid further damage.


Test Plan: Added new test cases. I also added `hg verify` to sanity check things are good.

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4792641

Signature: t1:4792641:1490868946:8f09c84dc9ebda3889d6a1ea04c49a06acbf38a3
2017-03-30 16:41:46 -07:00
Jun Wu
07d5d0f836 lfs: fix a pyflakes issue 2017-03-28 15:43:52 -07:00
Jun Wu
eeeb0210ae lfs: allow blobstore to be outside the repo
Summary:
Previously `lfs.blobstore` must be a relative path. If an absolute
blobstore path is set, it will traceback because vfs audit fails:

```
  File "hg/mercurial/revlog.py", line 1356, in _processflags
    text, vhash = writetransform(self, text)
  File "fb-hgext/hgext3rd/lfs/wrapper.py", line 69, in writetostore
    blobstore.local.get(self.opener).write(storeid, chunk)
  File "fb-hgext/hgext3rd/lfs/blobstore.py", line 41, in write
    fp = self._opener(self.filename(storeid), 'w+', atomictemp=True)
  File "hg/mercurial/vfs.py", line 344, in __call__
    self.audit(path)
  File "hg/mercurial/pathutil.py", line 64, in __call__
    raise error.Abort(_("path contains illegal component: %s") % path)
  Abort: path contains illegal component: /home/quark/lfslocalblobstore/d7/dbc611df1fe7dfacfe267a2bfd32ba8fc27ad16aa72af7e6c553a120b92f18
```

That was because the code was using `repo.vfs`. This diff adds a new `lfsvfs`
to avoid the issue. The `lfsvfs` also did the correct filename check (the
old `re.match` check will not match the whole string), so `blobstore.local`
could be simplified a lot.


Test Plan:
A new test case was added to make sure absolute blobstore path works. I also
did some cleanups for the test file to de-dup hgrc, and avoid writing files
outside `$TESTTMP`.

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters, remi

Differential Revision: https://phabricator.intern.facebook.com/D4785084

Signature: t1:4785084:1490693306:b42eef8e53af107897c2e1bc2984c090bdd2e465
2017-03-28 15:41:59 -07:00
Jun Wu
c3fd525bf8 lfs: fix tests
Summary:
A bunch of modifications to get the test pass with the new lfs code.

- Move `lfs` to `hgext3rd`. The code was supposed for hg-core. For now, we do
  them in fb-hgext to speed up the process
- Remove the windows test, which is not supported by `run-tests.py` and is
  duplicated with `test-lfs.t`.
- Do import `mercurial.i18n._` correctly.
- Change some i18n logic a bit so it's more translator-friendly.
- Change `revlog.RevlogError` to `error.RevlogError`.
- Avoid direct symbol import of `mercurial.util.bytecount`, which will fail the
  upstream importchecker test.
- Fix various lint issues like lines being too long etc.
- Document lfs config options.

Test Plan: `arc unit`

Reviewers: #sourcecontrol, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4772216

Signature: t1:4772216:1490401458:1ad3c18ab80e1d31085d0b6b4c630e62a7dc7930
2017-03-24 19:01:42 -07:00