Commit Graph

61 Commits

Author SHA1 Message Date
Durham Goode
3102e456ab infinitepush: backout lfs fix
Summary: This broke hg bundle on non-generaldelta repositories.

Test Plan: Ran the tests

Reviewers: #fbhgext, stash, quark

Reviewed By: #fbhgext, quark

Differential Revision: https://phab.mercurial-scm.org/D74
2017-07-13 13:53:40 -07:00
stash@fb.com
35d988093a infinitepush: handle lfs correctly
Summary:
Make sure that infinitepush work fine with lfs. To achieve it we try to use
changegroup3 if it's available and also run prepush hook before infinitepush.

Test Plan: Run `arc unit`

Reviewers: #fbhgext, quark, dsp

Reviewed By: #fbhgext, dsp

Differential Revision: https://phab.mercurial-scm.org/D17
2017-07-12 09:52:03 -07:00
Jun Wu
7e8b62aef6 lfs: add retry logic for transferring a single object
Summary:
We got TCP RESETs frequently. It's hard to pinpoint the root cause so let's
just add retry for now.

Test Plan: eyes

Reviewers: #mercurial, #ovrsource_warroom, steaphan

Reviewed By: steaphan

Subscribers: steaphan, mjpieters, medson

Differential Revision: https://phabricator.intern.facebook.com/D5265928

Tasks: 19419154

Signature: t1:5265928:1497635499:17cf5d5cb69b406330326d693a9eceb1d22861f8
2017-06-16 10:54:07 -07:00
Durham Goode
e34660b057 commands: update to use registrar instead of cmdutil
Summary: Upstream has deprecated cmdutil.commands() in favor of registrar.commands()

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5106486

Signature: t1:5106486:1495485074:0e20f00622cc651e8c9dda837f84dd84cc51099e
2017-05-22 13:38:37 -07:00
Jun Wu
f532dddaa4 lfs: implement byte-level progress bar
Summary:
Previously, the progress bar is file-level - it moves when a file is transferred.

When uploading or downloading a single giant file, progress bar matters. And
this diff adds it to make people more patient.

Test Plan:
Manually upload and download several big files.
Make sure the progress bar appears in both cases.

Also make sure the output is sane with `-v`.

Reviewers: #mercurial, davidsp, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5100014

Signature: t1:5100014:1495316719:c500ccd63d3a495e41575cadd72419a41f5fb259
2017-05-22 11:03:14 -07:00
Jun Wu
3e43e09a97 lfs: add a lfs_files template
Summary:
This allows automation to know which modified or added files are LFS for
specific changesets.

Test Plan: Added a test

Reviewers: #mercurial, davidsp

Reviewed By: davidsp

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5091591

Signature: t1:5091591:1495163606:138638d8ccc57b8ed6c1e324750ec5dc15c198a0
2017-05-18 22:18:20 -07:00
Jun Wu
546cff40f8 lfs: skip uploading when remotestore is a null store
Summary:
When remotestore is a null store, uploading is a no-op and can be skipped.
This diff makes it so.

Test Plan: arc unit

Reviewers: #mercurial, davidsp

Reviewed By: davidsp

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5090686

Signature: t1:5090686:1495144964:956549307912f4a44201aa305959bf8de26ac028
2017-05-18 15:10:39 -07:00
Jun Wu
a82574317b lfs: upload blobs during 'hg bundle'
Summary:
When a bundle is created, the bundle could be exchanged via copy and possibly
eventually reach to a publishing repo. If we don't upload LFS blobs, hg server
could have revisions that can never be checked out or verified.

So let's just assume bundles generated by `hg bundle` will be public, and
upload LFS blobs automatically, without depending on other code review tooling.

Note: there is a `preoutgoing` hook which will be triggered in this case,
however it's not useful since it does not have the `outgoing` information.

Test Plan: Will add a new test

Reviewers: davidsp, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5050945

Signature: t1:5050945:1494958648:0c46ab1d85755838ba189bbb1e0673882922bb58
2017-05-16 15:49:32 -07:00
Jun Wu
7098094e7e lfs: cleanup file headers
Summary:
This diff removes `# coding=UTF-8` and adds standard GPL2 headers to lfs
files.

Test Plan: arc unit

Reviewers: davidsp, #mercurial, simonfar, rmcelroy

Reviewed By: simonfar, rmcelroy

Subscribers: simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5067881

Signature: t1:5067881:1494947941:e669d4d9485c390138f76031608a7f727c57bf55
2017-05-16 15:44:06 -07:00
Jun Wu
f0c7a3cef9 lfs: remove util.py
Summary:
`util.py` only contains 2 small things: `lfsvfs` and `sha256`. Both of them
only have one user (in terms of files). Therefore just move the code to
related files to make it simpler.

Test Plan: arc unit

Reviewers: davidsp, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5067862

Signature: t1:5067862:1494955824:b99021f1bb44568dc0c738b0e472eb732f2bc91a
2017-05-16 15:43:36 -07:00
Jun Wu
ae8a1ccb8f lfs: update old PointerDeserializationError error handling
Summary:
The error was renamed to `InvalidPointer`. This diff updates error handling
to catch that instead.

Test Plan: arc unit

Reviewers: davidsp, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5067496

Signature: t1:5067496:1494955510:5f0162a8cc6f1d0d83e3ab6319ec3202028684d8
2017-05-16 15:42:30 -07:00
Jun Wu
acfcc44094 lfs: add a fctx.islfs method
Summary:
This simplifies code a bit, and could be useful for 3rd party code. We already
have `isbinary`, `islink`, `isexec`, `isabsent`. So another `is` method looks
fine.

Test Plan: Run existing tests

Reviewers: davidsp, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5067369

Signature: t1:5067369:1494955466:ae17e310e743c704cc0d9bb73e0d2e82adcffff8
2017-05-16 15:41:39 -07:00
Jun Wu
2eef188bc0 lfs: add a filectx.cmp fast path
Summary:
The `filectx.cmp` fast path allows us to show "binary file changed" diff output
when two binaries have the same size but different content.

Test Plan: Added a test case

Reviewers: davidsp, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5067266

Signature: t1:5067266:1494954783:a41e8213782afe24618ef2930c9576f21610fd3e
2017-05-16 15:38:51 -07:00
Jun Wu
9eedb98b61 lfs: downgrade "computing set of blobs to upload" to debug message
Summary:
Calculating what blobs to upload should be fast since we use changelog `files`
information. Users will see `lfs: uploading ...` very soon. So let's downgrade
`lfs: computing set of blobs to upload` to a debug message to make `-v`
cleaner.

Test Plan: Updated existing test.

Reviewers: davidsp, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5066600

Signature: t1:5066600:1494954115:c00925a1930f9b53e914078d32b6c4e4161099ee
2017-05-16 15:27:25 -07:00
Jun Wu
ae541c9f80 lfs: show status per object when ui.verbose is set
Summary:
Downloading or uploading a single LFS object could take long. Instead of
showing a summary of what objects are downloaded or uploaded, print a message
per object if --verbose is provided.

Test Plan: Added a test

Reviewers: davidsp, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5066327

Signature: t1:5066327:1494953894:3593436ff53eaff117a2932476dfc4b867fb9f17
2017-05-16 15:24:04 -07:00
Jun Wu
28de141267 lfs: handle "wrong action" error earlier
Summary:
Previously, when an object is not found server-side, we error out during the
basic transfer API. This diffs move the check to the batch metadata API so the
check is preformed earlier.

That also means, when batch fetching many LFS objects, we could report multiple
objects being missing instead of just one.

Test Plan: Modified an existing test

Reviewers: davidsp, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5065623

Signature: t1:5065623:1494952131:cb1df73e1cea21e07bfd0ec630bacec60e5b385c
2017-05-16 15:19:40 -07:00
Jun Wu
b78dbc3dad lfs: check object errors from batch API early
Summary:
The batch API allows server to return errors per object, handling them early
seems to be a good idea.

This is an attempt to distinguish "cannot upload - no permission" from
"no need to upload - already exist".

Unfortunately, `lfs-test-server` reference implementation does not provide
the error message so it's not reflected in tests.

Reviewers: #mercurial, davidsp, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5057816

Signature: t1:5057816:1494951879:07f2b67408e9d77255104b932baae2e50af9890b
2017-05-16 12:30:23 -07:00
Jun Wu
adb61e1976 lfs: split basic transfer API out from _batch
Summary:
The basic transfer API [1] worths a separate method.

[1]: https://github.com/git-lfs/git-lfs/blob/master/docs/api/basic-transfers.md

Reviewers: #mercurial, davidsp, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5057683

Signature: t1:5057683:1494951784:de3d3a28a69366fbbc4546f9e68b4d664ba00dfa
2017-05-16 12:26:30 -07:00
Jun Wu
d5a9c91ff9 lfs: move batch API to a separate method
Summary:
The LFS uploading or downloading process consists of 2 parts: a batch API to
get metadata about objects, followed by `len(objects)` requests.

This diff moves the first metadata API out from a giant `_batch` method to
make the code structure easier to maintain. Later diffs will further split
`_batch` method.

A side effect is `lfs: mapping blobs to #{action} URLs` message is removed.
More user-friendly message will be added back in a later patch.

Reviewers: #mercurial, davidsp, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5057541

Signature: t1:5057541:1494949989:d510c3d9369744e7ee776c0d0a26d263dfaf8352
2017-05-16 12:24:20 -07:00
Jun Wu
31882a9791 lfs: remove total parameters from remote store APIs
Summary:
Displaying total bytes to upload is currently inaccurate - the server could
already have some objects so only a subset of selected objects will be
uploaded.

Besides, we pass `pointers` to related upload and download APIs so `total`
could be calculated from them.

This diff removes the inaccurate "need to upload" message and unnecessary
parameters. An accurate message will be added in a later patch.

Reviewers: #mercurial, davidsp, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5057470

Signature: t1:5057470:1494949554:07e637a8b58c894598b3f0c2dd53f80a5ade25a8
2017-05-16 11:19:39 -07:00
Jun Wu
f484dba4a0 lfs: narrow down try block
Summary:
A try block is only effective for the HTTP error. Let's narrow it down to
just the HTTP request.

Test Plan: arc unit

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5057107

Signature: t1:5057107:1494634308:584877babc6a4705fa0740788dcf58affdf51bec
2017-05-12 17:28:45 -07:00
Jun Wu
853b228a8a lfs: handle batch API's HTTP and JSON error
Summary:
There are `1+n` HTTP requests. `n` of them is handled. But the first one
about JSON metadata is not. Let's handle it.

Test Plan: arc unit

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5057053

Signature: t1:5057053:1494634184:fa0e562dc7879379106bf37915a2e5ea3cc01a4d
2017-05-12 17:26:06 -07:00
Jun Wu
d5e501b573 lfs: provide more detailed message on network error
Summary:
The server or the HTTP library may provide more detailed error message.
Let's use them. `RequestFailedError` is also renamed to `LfsRemoteError` to
make it clear it's related to LFS and network.

Test Plan: arc unit

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5057014

Signature: t1:5057014:1494634052:00c1cb1b337a0e4cb92828cd49e95b7b0ff7e540
2017-05-12 17:22:02 -07:00
Jun Wu
b808e0f70f lfs: remove UnavailableBatchOperationError
Summary:
It's caught, and re-raise with RequestFailedError. We can just raise
RequestFailedError directly.

Test Plan: arc unit

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5056886

Signature: t1:5056886:1494633307:3fdf2fab13e783224d343266a1278c41ed4394e8
2017-05-12 17:18:05 -07:00
Jun Wu
34ffa6d4ce lfs: use ProgrammingError for invalid action in _batch
Summary:
_batch is called by programmers. If action is not expected, it's a
ProgrammingError.

Test Plan: arc unit

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5056822

Signature: t1:5056822:1494634121:59737eb0d3f42d706ee8b0dc18ec7a4b49ada34f
2017-05-12 17:14:52 -07:00
Jun Wu
d3b28f4b42 lfs: remove StoreID
Summary:
For long I have been wondering - we have `oid` already and an `oid` could
identify an object already, why do we need a `StoreID` which contains both
`oid` and `size`? It does not seem necessary?

I figured out the answer - the batch download or upload API *requires* size
information per object [1].

Since we will have LFS pointers during upload or download, we can just use
pointers. This diff removes `StoreID`, changes remote blob store to take
`pointers` directly, and local blob store to take `oid` directly.

As a side effect, `debuglfsupload` no longer takes `-o`, which seems fine
because `-r` should be more useful.

[1] https://github.com/git-lfs/git-lfs/blob/master/docs/api/batch.md

Test Plan: `arc unit`

Reviewers: davidsp, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D5043694

Signature: t1:5043694:1494496522:ea10f3dd12fc2ae3ab77b8f623eeead84dcc25fa
2017-05-11 17:57:42 -07:00
Jun Wu
8641f3cc37 lfs: rename GithubPointer to gitlfspointer
Summary:
It's Git-LFS, not GitHub.
Besides, mercurial uses lowercase for class names.

Test Plan: Run existing tests

Reviewers: davidsp, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5043570

Signature: t1:5043570:1494493515:1df24d12d8b9be51480909ef75cad56cd9e22a2e
2017-05-11 17:54:33 -07:00
Jun Wu
a024f96812 lfs: add validation to pointer
Summary:
Read the Git-LFS specification [1] and implement most checks to prevent
programming error and detect data corruption.

The new code should be stronger than what was before the refactoring, since the
old code only checks keys but not values.

[1]: https://github.com/git-lfs/git-lfs/blob/master/docs/spec.md

Test Plan: Added a new test

Reviewers: #mercurial, davidsp, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5046740

Signature: t1:5046740:1494543072:2f0dbfc07cd52a10572550a277d850bdf3f78d27
2017-05-11 17:51:44 -07:00
Jun Wu
143b8c67d9 lfs: rewrite pointer logic
Summary:
Since we decided to only support GitHub's Git LFS specification, there is no
need to support multiple pointer types, so the code could be simplified.

The old code special case keys like `version`, `oid`, `hashalgo`, which makes
it longer than necessary. The new code is a rewrite treating everything as a
normal dict entry so the pointer class is much shorter: 76 -> 21 lines.

Data validation is temporary lost, which will be added back (and stronger) by
the next diff. It is separated to make review easier.


Test Plan: Run existing tests

Reviewers: #mercurial, davidsp, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D5043547

Signature: t1:5043547:1494543031:ac1100939a10a79dfd749ab6ac9c3bb7fcd84dbf
2017-05-11 17:48:21 -07:00
Jun Wu
ec852f8f5f lfs: disable lfs code path if filenode is None
Summary:
`filelog.filenode` could be `None` when it's unknown - like working copy. That
breaks `isbinary` test. Let's modify `_islfs` check to return False to disable
lfs code path in that case.

Test Plan: Added a new test.

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5034058

Signature: t1:5034058:1494431790:a02b69addf5d2f9b6b9a27ad71ed4f136b2bfd2b
2017-05-10 13:28:39 -07:00
Jun Wu
ee34d8ca1e lfs: remove lfs.bypass config option
Summary:
The config option was designed to be used server-side to avoid accessing remote
LFS blob stores. However, we now have `lfs.url=null:` as a clean alternative,
which will explicitly raise if the server ever tries to download remote
content. So `bypass` is no longer needed.

`bypass` could also be used for displaying raw content. That is doable using
`debugdata`.

Test Plan: Modified existing tests

Reviewers: #mercurial, rmcelroy, davidsp

Reviewed By: davidsp

Subscribers: davidsp, rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5009725

Signature: t1:5009725:1494260384:332d7bd658359c004342efd01e4f13b6fc5499b1
2017-05-08 11:22:01 -07:00
Jun Wu
8909f8ded8 lfs: remove lfs.blobstore option
Summary:
`lfs.blobstore` is a path used to store local blobs. Some of the blobs are not
uploaded yet so they couldn't be easily discarded. Although blobs downloaded
from the server could be removed if necessary.

It does not make much sense to make this a config option - the data should be
stored reliably. It's also dangerous to allow write to arbitrary paths under
`repo.vfs`.

This diff makes the local blob store fixed path at `.hg/store/lfs/objects`.
The choice is similar to `.git/lfs/objects`, but with `store`, shared repo
could share their lfs stores.

Test Plan: Modified existing test.

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5009664

Signature: t1:5009664:1493978440:c3f6351d0ea0cca2ce3caa9f7260c5d65bcc0e5f
2017-05-08 11:21:34 -07:00
Jun Wu
186df9d185 lfs: add a command to upload lfs blobs
Summary: This makes it possible to upload lfs blobs without going through push.

Test Plan: The feature will be used in the next change.

Reviewers: #mercurial, davidsp, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5009622

Signature: t1:5009622:1494230385:0df33ed12c96bc91f6252398ce5091057fe32a21
2017-05-08 11:21:11 -07:00
Jun Wu
4532efb04d lfs: split prepush hook into individual functions
Summary:
This makes it possible to reuse part of them - like uploading blobs for
given revisions without going through prepush hook.

`pointer.tostoreids()` was changed to `pointer.tostoreid()` to simplify
things a bit.

Unnecessary remoterepo assignment was removed.

Test Plan: `arc unit`

Reviewers: #mercurial, davidsp, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5009560

Signature: t1:5009560:1494230285:6469a2701baa8cfa4511a08149a37fc429733343
2017-05-08 11:20:50 -07:00
Jun Wu
3793eee754 lfs: remove 40 char length limit
Summary:
The `lfs-test-server` reference implementation [1] validates the hash, and
40-byte sha256 hexdigest will not pass that check.

[1]: https://github.com/git-lfs/lfs-test-server

Test Plan: A test will be added in the next diff.

Reviewers: #mercurial, davidsp

Reviewed By: davidsp

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5009319

Signature: t1:5009319:1494006078:32950c5490935d9786553fc07d01f1fc92aacf25
2017-05-08 11:12:11 -07:00
Jun Wu
3f004c2689 lfs: ensure storeid.size is an integer
Summary:
The request JSON data used in the batch API should have `size` field as an
integer, not a string. This was discovered when testing against GitHub's
`lfs-test-server` implementation [1]. The latter was written in Golang and has
strong type requirement, it will treat `{"size": "42"}` as `{"size": 0}`.

[1]: https://github.com/git-lfs/lfs-test-server

Test Plan:
An integration test with `lfs-test-server` will be added once remaining issues
are resolved.

Reviewers: #mercurial, ikostia, davidsp

Reviewed By: ikostia, davidsp

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5009248

Signature: t1:5009248:1493977299:5daf8d32cd8c933be71a41afcc7ff832eb7edb5a
2017-05-08 11:11:47 -07:00
Jun Wu
fd5d8b9996 lfs: simplify blobstore config options
Summary:
This diff simplifies lfs remote server configs to a single item: `url`,
similar to what git-lfs has.

Compare:

```
  Before                        | After
 -------------------------------+-----------------------------------
  remoteurl = http://a.com/lfs  | url = http://foo:pass@a.com/lfs
  remoteuser = foo              |
  remotepassword = pass         |
  remotestore = git-lfs         |
  ------------------------------+-----------------------------------
  remotepath = /tmp/lfs-test    | url = file:///tmp/lfs-test
  remotestore = dummy           |
```

Test Plan: Modified existing cases.

Reviewers: #mercurial, davidsp, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5008882

Signature: t1:5008882:1494230125:822b0e92f45dff2a37e26e6b3e44b559b4a47e6d
2017-05-08 11:11:14 -07:00
David Soria Parra
cd7eed4dd0 lfs: ensure svfs is set
Summary:
Some remotestores (in particular null) do not have a svfs.
Check for it before we set it.

Test Plan:
Used a nullstore with a threshold set during push and saw it
now worked.

Reviewers: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4984664
2017-05-04 16:18:32 -07:00
Jun Wu
64daf52ec8 lfs: use lfsvfs in dummy store
Summary: This simplifies code a lot.

Test Plan: Updated existing test

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5001837

Signature: t1:5001837:1493885337:eea8b5cba7234453b32eba46fd58b4d5904367bf
2017-05-04 09:25:20 -07:00
Jun Wu
fb4cd9891d lfs: remove chunking feature
Summary:
Per discussion with @davidsp, we want to stick to Git-LFS specification and
avoid non-standard behavior. The chunking behavior will happen at LFS server
transparently.

The direct motivation for this is to make it possible to implement an
efficient `filectx.cmp` that just compares hashes.

Test Plan: Updated existing test

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: mjpieters, davidsp

Differential Revision: https://phabricator.intern.facebook.com/D5001827

Signature: t1:5001827:1493914639:c58694873e79a8ca910bb8ee01bf593885896664
2017-05-04 09:20:34 -07:00
Jun Wu
3d56581781 lfs: store isbinary information in LFS metadata
Summary:
Usually LFS files are binary files. But there could be exceptions. This diff
adds a new customized field `x-is-binary` to record those exceptions.

The `filectx.isbinary` API is changed to use that metadata as a fast path.

This allows us to provide a transparent user experience (whether a file is
stored in LFS or not does not affect its original `isbinary` property),
while still being able to skip loading the LFS blob if the file is binary.

Test Plan: Added a new test case

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5001117

Signature: t1:5001117:1493892406:2a8ebd42d0ae0dbae39b87b9ea16db22b99f3d12
2017-05-04 09:18:04 -07:00
Jun Wu
d3f8c4e9dc lfs: override filelog.renamed code path
Summary:
An `hg pull` test triggers the following code path server-side when using
remotefilelog:

  ...
  remote:   File "/usr/lib64/python2.7/site-packages/remotefilelog/remotefilelogserver.py", line 308, in streamer
  remote:     text = _loadfileblob(repo, cachepath, path, node)
  remote:   File "/usr/lib64/python2.7/site-packages/remotefilelog/remotefilelogserver.py", line 223, in _loadfileblob
  remote:     text = createfileblob(filectx)
  remote:   File "/usr/lib64/python2.7/site-packages/remotefilelog/remotefilelogserver.py", line 348, in createfileblob
  remote:     ancestors.extend([f for f in filectx.ancestors()])
  remote:   File "/usr/lib64/python2.7/site-packages/mercurial/context.py", line 1072, in ancestors
  remote:     for parent in c.parents()[:cut]:
  remote:   File "/usr/lib64/python2.7/site-packages/mercurial/context.py", line 923, in parents
  remote:     r = fl.renamed(self._filenode)
  remote:   File "/usr/lib64/python2.7/site-packages/mercurial/filelog.py", line 62, in renamed
  remote:     t = self.revision(node)
  ...

That triggers downloading a blob. We don't want to do that server-side. So
override the `renamed` method to use LFS fast path to test rename.

Practically, this reverts part of D4906074.

Test Plan: Run existing tests. This change was made on the server.

Reviewers: davidsp, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4992421

Signature: t1:4992421:1493802628:2bf2cf819bfed2aa61ea1c2323c03ab428732815
2017-05-03 11:10:48 -07:00
David Soria Parra
f8bb4e2122 lfs: add a 'null' store that doesn't require any setup
Summary:
We have to use the dummy store to ensure we don't try to open an HTTP
connection during push, which can fail and must not happen during p4fastimport.
However `dummy` requires some configuration. So just add an internal `null`
store that allows us to operate LFS in a mode that will ignore all large files
completely.

Test Plan: used it in p4fastimport

Reviewers: #sourcecontrol, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4963638

Signature: t1:4963638:1493316573:c012bd97794f9a57c3cf8c15d868a67ae3c03c31
2017-04-27 11:13:54 -07:00
Jun Wu
0d6151f530 remotefilelog: add lfs integration test
Summary:
The test covers common workflows like clone, commit, push, update, pull. It
exercises the remotefilelog plain store and Python datapack store to make sure
they won't lose the revlog flag. The test also tries to verify rename works
correctly.

Since the lfs extension may be eventually upstreamed, it seems a good idea to
make remotefilelog call `lfs.wrapfilelog` so lfs is free from remotefilelog
code.

Test Plan: Added a test

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4904281

Signature: t1:4904281:1492560308:5fd9f214ada6de795735ea7d737d30c1bf39812a
2017-04-26 19:55:02 -07:00
Jun Wu
00a9465989 lfs: be compatible with filelog metadata
Summary:
This diff changes lfs `revision(raw=False)` output to include hg filelog
metadata. The LFS blob does not contain filelog metadata as before.

This hurts performance if there is a rename, or the binary starts with the
magic `\1\n`. But compatibility is greatly improved - it's now possible to swap
a non-lfs revision with mercurial rename to a lfs revision, and easier to be
compatible with remotefilelog (namely, remotefilelog defers filelog.add until
commit hash is known).

Test Plan: Modified existing test.

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: rmcelroy, durham, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4906074

Signature: t1:4906074:1492559513:09b25fc1026d4ce8fd784a044d6724f12e8bda45
2017-04-21 19:56:27 -07:00
Jun Wu
11cd7af681 lfs: add a fast path for filelog.size
Summary:
Previously, `filelog.size` requires loading the lfs blob in memory. This
makes it unnecessary.

Test Plan: Run existing tests.

Reviewers: #mercurial, mitrandir

Reviewed By: mitrandir

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4861358

Signature: t1:4861358:1491923349:f1d2fe5656158854e1dd50987ea3db64a9793db6
2017-04-11 13:20:58 -07:00
Jun Wu
2f1756608e lfs: remove manifest walk in prepush hook
Summary:
`ctx.files` should contain changed files from both p1 and p2. There is no
need to walk the manifest, which could also be painfully slow for large
repos. So the manifest walk got removed.

Test Plan: `arc unit`

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4856552

Signature: t1:4856552:1491639067:8080f9bc7e246a8d36fcc3504716e9c95ee24fcf
2017-04-10 11:22:54 -07:00
Jun Wu
46d44c8b07 lfs: use non-chunking spec format if possible
Summary:
The "chunking" feature is not specified by [the current Git LFS standard](21e1695220/docs/spec.md).

Therefore avoid using it if possible - if there is only one chunk, use the
standard specification (`https://git-lfs.github.com/spec/v1`).

An upload message is slightly changed to be more accurate.

Test Plan: Changed existing tests

Reviewers: davidsp, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4856527

Signature: t1:4856527:1491638822:1c6f555b706e7bb22dd9090afa156f2161bf9f7f
2017-04-10 11:17:48 -07:00
Jun Wu
38632025a9 lfs: add bundle support
Summary:
This diff adds bundle support for lfs:

  - Let `hg bundle` use changegroup3 instead of changegroup2 to record revlog
    flags.
  - Hook related functions so `hg -R bundle.hg` works with LFS.

Test Plan: Added a test

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4856335

Signature: t1:4856335:1491638660:d56ba54ea5f59262f009418b9c7e48c8f2a25ed6
2017-04-10 11:11:37 -07:00
Jun Wu
cdc9accca5 lfs: do not store mercurial filelog metadata in lfs blobs
Summary:
Per discussion with @davidsp, it's better for LFS to not store Mercurial
filelog metadata, which is currently used to store rename information. That has
many advantages:

  - Large blobs could be reused across renames
  - No need to special handle files starting with `\1\n`
  - P4 LFS server implementation is much easier
  - remotefilelog LFS support is easier and cleaner

That said, the rename information is stored as lfs metadata using the
non-standard `x-hg-copy`, `x-hg-copyrev` keys. So they still exist and are
functional.

The disadvantage is that rename gets no longer hashed, which is probably fine.


Test Plan: Added a test

Reviewers: davidsp, #sourcecontrol, rmcelroy

Reviewed By: rmcelroy

Subscribers: jsgf, rmcelroy, stash, mjpieters, davidsp

Differential Revision: https://phabricator.intern.facebook.com/D4849764

Signature: t1:4849764:1491580506:1d80ad476b9cbd6773843cb52aee6745f478a0b0
2017-04-07 18:29:35 -07:00