Summary:
_walkstreamfiles() uses mercurial.store.decodedir(), so
mercurial.store needs to be imported.
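A minimal sketch of the fix; the function body here is illustrative, not remotefilelog's actual loop:

    from mercurial import store

    def _walkstreamfiles(repo):
        # decodedir() lives in mercurial.store; without the import
        # above, this call raised a NameError during streaming clones.
        for name in repo.store.fncache:
            yield store.decodedir(name)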
Test Plan:
Confirmed that _walkstreamfiles() no longer throws an exception when cloning a
remote shallow repository.
Reviewers: durham, pyd, rmcelroy
Reviewed By: rmcelroy
Subscribers: net-systems-diffs@, exa, yogeshwer
Differential Revision: https://phabricator.fb.com/D2409648
Signature: t1:2409648:1441245825:00a758f6f0884b77572078589f18592ca6cb6fa4
Streaming clones were taking a while because apparently self.datafiles()
actually stats each .i file instead of just returning the list straight from
fncache. To fix this, let's not call datafiles() when we know the matcher is
going to reject everything anyway.
This significantly speeds up streaming clones.
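Shape of the fix, with rejectsall() as a hypothetical stand-in for however we detect an all-rejecting matcher:

    def _walkstreamfiles(orig, repo, matcher=None):
        # 'rejectsall' is hypothetical: it stands in for detecting a
        # matcher that can never match anything.
        if matcher is not None and rejectsall(matcher):
            # Skip datafiles() entirely: it stat()s every .i file
            # instead of returning the list straight from fncache.
            return iter([])
        return orig(repo, matcher)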
Previously we'd just send one enormous batch for everything to the
server. This led to prolonged periods of no progress output for the
user. Now we send batches in smaller chunks (default is 100) which
gives the user some idea that things are working.
Includes a trivial test, which doesn't really verify that the batching
logic is used as described, but at least prevents the boneheaded error
I had in an earlier (unmailed) version of this patch, where I forgot to
use configint() when loading the config setting.
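Roughly the chunking described, assuming hg 3.x's peer.batch() API; the 'remotefilelog.batchsize' knob and the names 'files', 'remote', and 'writetocache' are illustrative:

    batchsize = ui.configint('remotefilelog', 'batchsize', 100)
    for i in range(0, len(files), batchsize):
        chunk = files[i:i + batchsize]
        batch = remote.batch()
        futures = [batch.getfile(f, node) for f, node in chunk]
        batch.submit()
        # Each completed chunk lets us emit progress before the next
        # round trip begins.
        for (f, node), fut in zip(chunk, futures):
            writetocache(f, node, fut.value)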
Without this, the only way to report a failure of a file load in a
batched set of getfile requests is to fail the entire batch, which is
potentially painful. Instead, add our own error reporting in-band
which the client can then detect and raise.
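The in-band convention amounts to a result code prefixed to the payload; the exact codes in this sketch are assumptions, not the wire format verbatim:

    from mercurial import error

    def parsegetfileresponse(raw):
        # The reply is '<code>\0<payload>'; code '0' means success and
        # anything else carries an error message to surface client-side.
        code, _, payload = raw.partition('\0')
        if code == '0':
            return payload
        raise error.Abort('getfile failed: %s' % payload)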
I'm not completely happy with the somewhat ad hoc error reporting here,
but we expect our server to have at least one additional error ("not
allowed to see file contents") which will require some special
handling on our end, so we need some level of flexibility in the error
reporting protocol so we can extend it later. Sigh.
Open question: should we reserve some range of error codes so that
it's easy for strange custom servers to have related monkeypatches to
client code for custom handling of unforeseen-by-remotefilelog
conditions?
I couldn't figure out how to actually get the client to try loading
file contents over http in the test, but the get-with-headers test at
least proves that the server responses look the way I expect.
We were not prefetching the potential dependent files for the filelog revisions
we received over the wire. This resulted in a lot of non-batched downloads,
which was super slow. This fixes it by batch downloading the parents and delta
parents of the incoming filelog revisions and adds a test.
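A sketch of the idea; 'incoming', 'haslocally', and 'batchfetch' are placeholders for the real plumbing:

    from mercurial.node import nullid

    # Gather every parent and delta parent we don't already have, then
    # fetch them in one batched download instead of one request each.
    missing = set()
    for fname, node, deltabase, p1, p2 in incoming:
        for dep in (p1, p2, deltabase):
            if dep != nullid and not haslocally(fname, dep):
                missing.add((fname, dep))
    batchfetch(missing)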
touch -t is portable, but requires some computation to get a date
value that's a week ago. A Python one-liner is a little goofy, but
seemed like a straightforward enough answer, so I chose that.
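Along the lines of the one-liner in question (an illustration, not the exact line from the test):

    import time

    # 'touch -t' takes YYYYMMDDhhmm; compute that for one week ago.
    weekago = time.time() - 7 * 24 * 60 * 60
    print(time.strftime('%Y%m%d%H%M', time.localtime(weekago)))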
This lets clients send many getfile requests in a single transaction.
Note that this requires 76fcf62accb0 be applied to your Mercurial, or
you'll be bitten by a bug[0] in Mercurial's wireproto batching. As a
result of this change, remotefilelog now effectively requires the
upcoming Mercurial 3.5 if you want to run a released version.
0: http://bz.selenic.com/show_bug.cgi?id=4739
Right now, this is a naive fetch-one-file method. The next change will
mark the method as batchable and use a batch in the client so that
many files can be requested in a single RPC.
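The next change will use hg's batchable pattern, roughly like this; argument encoding and response decoding are elided:

    from mercurial.wireproto import batchable, future

    @batchable
    def getfile(self, file, node):
        f = future()
        # First yield ships the encoded arguments and a future; the
        # second yields the decoded result once the future resolves.
        yield {'file': file, 'node': node}, f
        yield f.value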
The way the protocol is defined for getfiles interleaves reading
filenames and sending file contents, which works fine over ssh but is
incompatible with http.
This change is probably not necessary now that remotefilelog
correctly checks for its own capability first, but it helped me debug
so I left it in for completeness.
If we instead wrap wireproto.capabilities, then our capabilities don't
get transmitted via the hello command, so not all clients will notice
the new capability unless we do the wrapping here.
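A sketch of wrapping the shared helper that both hello and capabilities consult in hg 3.x; the capability token is illustrative:

    from mercurial import extensions, wireproto

    def _capabilities(orig, repo, proto):
        caps = orig(repo, proto)
        caps.append('getfile')  # the capability token is illustrative
        return caps

    extensions.wrapfunction(wireproto, '_capabilities', _capabilities)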
Test output is in the test that previously demonstrated the
defect. Note that there's still a defect: we're advertising the
capability over http even though we have no hope of the getfiles
method working over http.
1) The client doesn't look-before-you-leap on the remotefilelog capability
2) The http server crashes ungracefully when handling a remotefilelog
request (ideally it should respond with '400 No Such Method' as a
server without the extension would.)
3) Capabilities are inconsistently advertised between cmd=hello and
cmd=capabilities.
Future patches will attempt to clean up most of this.
The magic string 'internal' causes Mercurial to never blame
remotefilelog for being broken. I had suspected that remotefilelog
might work with 3.4, but the tests fail against 3.4.1, so I'm just
making testedwith empty.
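That is:

    # Empty rather than 'internal': Mercurial can then report version
    # mismatches instead of never blaming the extension for breakage.
    testedwith = ''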
The rev graph building code was flawed because it didn't track second parents
correctly. This was caught when someone developing an extension
attempted to commit a merge.
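The fix amounts to recording both parents while walking the graph; 'nodes' and 'filelog' stand in for the real locals:

    from mercurial.node import nullid

    # Record p2 as well as p1; merges were losing their second parent.
    parentmap = {}
    for node in nodes:
        p1, p2 = filelog.parents(node)
        parentmap[node] = [p for p in (p1, p2) if p != nullid]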
repo.sopener has been deprecated since hg 2.3, and repo.svfs replaces
it. Since it's been dead for so long, let's just use svfs and call it
good enough.
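The rename in practice:

    # Old spelling (deprecated since hg 2.3): repo.sopener(path, mode)
    # New spelling, same store vfs:
    fp = repo.svfs('somefile', 'w')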
Summary:
The incominghook was meant to pregenerate any remotefilelog blobs that were
likely to be needed shortly. Unfortunately it actually just slows down pushes,
since in large repos the hook sometimes takes longer than the push itself.
So let's just remove it.
Test Plan: Apparently there were no tests for this :p
Reviewers: sid0, lcharignon, mitrandir, ericsumner, rmcelroy
Reviewed By: rmcelroy
Differential Revision: https://phabricator.fb.com/D2185894
Signature: t1:2185894:1435126819:e1e1125520411356eccff4baee31ab2938ebc0fe
Summary: I really don't think it should be in this list.
Test Plan: `hg`
Reviewers: durham, #sourcecontrol, rmcelroy
Reviewed By: durham, #sourcecontrol, rmcelroy
Subscribers: rmcelroy
Differential Revision: https://phabricator.fb.com/D1997655
Signature: t1:1997655:1429189594:aa8f355a6fc61e300f824be6b2fbd64a42dde2b5
Summary:
When adjustlinkrevs was moved to filectx upstream, we incorrectly
moved our wrapping to remotefilectx inside remotefilelog. We don't actually use
remotefilectx on the server, so wrapping it did nothing.
The fix is to move the wrapping to be in remotefilelogserver.py so it is
executed on the server side.
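A sketch of the move; the exact attribute wrapped here is an assumption:

    # In remotefilelogserver.py, so it runs where the server code lives:
    from mercurial import context, extensions

    def _adjustlinkrev(orig, self, *args, **kwargs):
        # The server-aware adjustment would go here.
        return orig(self, *args, **kwargs)

    extensions.wrapfunction(context.basefilectx, '_adjustlinkrev',
                            _adjustlinkrev)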
Test Plan:
Did a checkout with my shallow client pointed at a full repo with no
blob cache. Verified it went quickly (minutes, instead of hours).
Reviewers: pyd
Differential Revision: https://phabricator.fb.com/D2097851
Summary:
Since we only prefetch things that are in the sparse checkout, copy tracing
(which touches everything in the manifest diff) would do individual file
downloads for every file. Let's just remove those files from the copy tracing
check entirely since the user probably doesn't care if they're outside the
sparse checkout.
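Roughly the filter described; sparsematch() is an assumed interface to the sparse extension:

    def _computeforwardmissing(orig, a, b, match=None):
        missing = orig(a, b, match=match)
        sparsematch = a._repo.sparsematch()  # assumed sparse API
        # Files outside the sparse checkout would each trigger an
        # individual download, so drop them from copy tracing.
        return [f for f in missing if sparsematch(f)]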
Test Plan: Added a test
Reviewers: sid0, rmcelroy, lcharignon, pyd
Differential Revision: https://phabricator.fb.com/D2083768
Summary:
Match the latest version of core to pass the test.
There were a couple of changes in core that broke the extension, I matched
those changes to make the test pass.
Test Plan: The tests are all passing
Reviewers: durham
Differential Revision: https://phabricator.fb.com/D2053958
Upstream now has a matcher on _computeforwardmissing which will allow us to only
prefetch the necessary parts of a sparse checkout.
Since we're now handed an iterator, we need to convert it to a list,
because we both iterate over it and return it.
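The conversion, sketched with a placeholder prefetch call:

    def _computeforwardmissing(orig, a, b, match=None):
        # Materialize the iterator: we consume it once for prefetching
        # and still need to hand it back to the caller.
        missing = list(orig(a, b, match=match))
        prefetch(missing)  # placeholder for the real prefetch call
        return missing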
Summary:
Previously remotefilelog would prefetch every file in a commit. With the sparse
checkout extension we want to only prefetch things in the sparse checkout.
This commit makes remotefilelog aware of the possible existence of a sparse
matcher.
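Sketchwise, with repo.sparsematch() again an assumed interface:

    from mercurial import util

    files = ctx.files()
    if util.safehasattr(repo, 'sparsematch'):
        # Only prefetch what the sparse checkout can actually contain.
        m = repo.sparsematch(ctx.rev())
        files = [f for f in files if m(f)]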
Test Plan: Added tests
Reviewers: sid0, rmcelroy, pyd, lcharignon
Subscribers: kang
Differential Revision: https://phabricator.fb.com/D1967207
Summary:
Per @pyd's review of D1933267, we need to check for the linknode in cl.nodemap,
not in cl (whose __contains__ method only looks for revs and doesn't even check
for visibility... lolz).
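The distinction, spelled out:

    cl = repo.changelog
    found = linknode in cl.nodemap  # membership by node hash: correct
    wrong = linknode in cl          # __contains__ tests rev numbers,
                                    # not nodes, so this is silently False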
Test Plan: ran tests
Reviewers: durham, sid0, pyd, ericsumner, lcharignon, davidsp, mitrandir
Reviewed By: mitrandir
Subscribers: akushner, daviser, pyd
Differential Revision: https://phabricator.fb.com/D1934941
Tasks: 6573011
Signature: t1:1934941:1427130649:b084635db9bfcd28c4d4a1bcf12a7500c06b323c
Summary:
The new version of adjust linknodes wasn't accounting for the fact that some
ancestries contained nodes that no longer exist. Check for that before looking
for common ancestors.
The old version of this code survived by luck. We were catching KeyErrors as one
base case, and it just happens that LookupError from the changelog is also a
KeyError, so it was getting caught and eaten.
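The guard, roughly; 'ancestry' is a placeholder:

    cl = repo.changelog
    # Only nodes that still exist can join the common-ancestor
    # computation; missing ones raised LookupError, which (being a
    # KeyError subclass in Mercurial) our base case silently ate.
    present = [n for n in ancestry if n in cl.nodemap]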
Test Plan:
We should probably add a test, but I have to leave shortly and this is pretty
broken, so we'll have to take a rain check.
Reviewers: rmcelroy, pyd, sid0
Differential Revision: https://phabricator.fb.com/D1933267
Summary:
The new fixmappinglinknodes function was using recursion to traverse the file
history, but this would break for files whose history was extremely long
(stack overflow). Switch to a manual, explicit-stack approach.
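The shape of the rewrite; parents() is a placeholder for the real history walk:

    from mercurial.node import nullid

    def fixmappinglinknodes(startnode):
        # An explicit stack replaces recursion, so arbitrarily long
        # file histories cannot overflow the interpreter stack.
        stack = [startnode]
        while stack:
            node = stack.pop()
            # ... fix up this node's linknode mapping ...
            stack.extend(p for p in parents(node) if p != nullid)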
Test Plan: Ran the tests (I'd added a test to cover this logic before).
Reviewers: sid0, davidsp, mitrandir, lcharignon, pyd, rmcelroy
Reviewed By: rmcelroy
Subscribers: michaelbarton
Differential Revision: https://phabricator.fb.com/D1931944
Signature: t1:1931944:1426884986:3a0ef144fb55b8c0533e5c5de90699a1823b891f
Summary: I'm going to add a new parameter upstream. Make this more generic so that we don't have to try to support both the old and the new versions.
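That is, the usual future-proof wrapper shape:

    def wrapped(orig, *args, **kwargs):
        # Forward whatever signature upstream currently has, so a new
        # parameter needs no second code path here.
        return orig(*args, **kwargs)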
Test Plan: Ran tests with both old and new hg.
Reviewers: davidsp, rmcelroy, akushner, pyd, daviser, mitrandir, ericsumner, durham
Reviewed By: durham
Differential Revision: https://phabricator.fb.com/D1920172
Signature: t1:1920172:1426615175:d90bda3b3cc30f6e5f3149af82ae9e43dee39455
Summary:
Previously remotefilelog did not produce all the necessary local data blobs
when doing a peer push/pull if the incoming changegroup had two manifests
that referred to the same file revision. We would only create a file blob
containing the history for the first occurrence, then if the user tried to
access the file history for other occurrences they got an exception.
The fix is to add linkrev fixup logic, similar to the adjustlinkrev() method
from core Mercurial's filectx. Now, if no valid local file blob can be found, we
will compute a valid history by reading the changelog.
We might be able to write this data to disk in the future as well to prevent
having to repeatedly compute this.
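A sketch of the fallback described (a linear scan, purely to illustrate the idea):

    def findlinknode(repo, fname, fnode):
        # When the stored linknode is invalid, walk the changelog for a
        # commit that actually introduces this file revision.
        for rev in repo.changelog.revs():
            ctx = repo[rev]
            if (fname in ctx.files() and fname in ctx and
                    ctx.filenode(fname) == fnode):
                return ctx.node()
        return None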
Test Plan: Added a test
Reviewers: sid0, rmcelroy, pyd, mitrandir, lcharignon
Differential Revision: https://phabricator.fb.com/D1904453
Summary:
For hg-git conversions we're going to create commits without actually updating to the base. Currently, this will cause lots of individual fetches.
The test demonstrates the issue -- without this patch the two files are fetched in two separate requests, but with it they come down in a single fetch.
Test Plan: Ran the tests.
Reviewers: davidsp, rmcelroy, akushner, pyd, daviser, mitrandir, ericsumner, durham
Reviewed By: durham
Differential Revision: https://phabricator.fb.com/D1893721
Tasks: 6390769
Signature: t1:1893721:1425624679:5651f71d5023919e9321646275b681b573847c44