Commit Graph

45 Commits

Author SHA1 Message Date
Durham Goode
f43ba75915 remotefilelog: fix pyflakes and module import errors
Summary:
This fixes all the pyflakes and module errors for the main remotefilelog
code base.

Test Plan: ./run-tests.py test-check* test-remotefilelog*

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4055537

Signature: t1:4055537:1477049663:ee904d311d17d3659e055e2c109c68c9023cfd1f
2016-10-21 11:02:09 -07:00
Durham Goode
a07c114b8d remotefilelog: fix to use new manifestlog functions
Summary:
readfast and readdelta have moved onto the manifestctx structures in upstream
hg, so let's change remotefilelog to use them.

This breaks the ability for this version of remotefilelog to work with old
versions of Mercurial, but we never really guaranteed this to begin with.

Test Plan: Ran the tests, they now pass

Reviewers: #sourcecontrol

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3915785
2016-09-23 13:28:36 -07:00
Durham Goode
3b7beae747 ctree: add ctreemanifest hg extension
Summary:
Adds the initial extension that sets up the ctreemanifest. It currently relies
on the fastmanifest extension to hook into all the manifest APIs to construct
ctreemanifests.

Test Plan:
In a future patch, I was able to run 'hg manifests' on a commit and
have it return the manifest contents by reading the treemanifest.

Reviewers: #fastmanifest, ttung

Reviewed By: ttung

Subscribers: ttung

Differential Revision: https://phabricator.intern.facebook.com/D3755327

Signature: t1:3755327:1472114482:0c5862cba68ed4db643d28c2fae01f33f5352970
2016-08-29 16:19:52 -07:00
Tony Tung
f642d24f1c [cdatapack] create a fastdatapack class
Summary:
fastdatapack is the same as datapack. Add a selector in datapackstore to determine which datapack to create.

test-datapack-fast.t is the same as test-datapack.t, except that it enables fastdatapack.

Test Plan: pass test-datapack.t test-datapack-fast.t

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3666932

Signature: t1:3666932:1470426499:45292064e2868caab152d9a5b788840c5f63e4e4
2016-08-05 14:35:29 -07:00
Durham Goode
d7722fcc7c stores: reverse order of cache and local stores
In the old days we would check the cache first, then the local store. This was
important because the cache is more likely to contain correct data (since it
comes from the final pushed version of commits), versus local data which may
contain information about stripped commits.

As part of the big store refactor, this order got switched unintentionally. So
let's switch it back.
2016-06-16 10:22:31 -07:00
Durham Goode
6f3d6c53f5 utils: unify cachepath access through a util function
Summary:
Previously a bunch of different places accessed the cachepath through ui.config
directly. This is a problem because we need to resolve any environment variables
in the path, and some spots didn't do this. So let's unify all accesses through
a helper function that takes care of the environment variables.

Test Plan: Added a test

Reviewers: mitrandir, lcharignon, #sourcecontrol, ttung, simonfar

Reviewed By: simonfar

Subscribers: simonfar

Differential Revision: https://phabricator.intern.facebook.com/D3385583

Signature: t1:3385583:1464971813:5b9ee5ed3d6ff9f1a78cb9e0269e433844758c9d
2016-06-03 09:45:58 -07:00
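A minimal sketch of the kind of helper this commit describes, assuming a plain dict stands in for `ui.config` and that the setting lives under a `remotefilelog.cachepath` key. The function name `getcachepath` and the `DEMO_CACHE_ROOT` variable are hypothetical and are not taken from the actual patch.

```python
import os


def getcachepath(config):
    """Return the cache path with environment variables resolved.

    `config` is a plain dict keyed by (section, name); it is only a
    stand-in for Mercurial's ui.config in this sketch.
    """
    path = config.get(("remotefilelog", "cachepath"))
    if path is None:
        raise ValueError("remotefilelog.cachepath is not set")
    # Resolve $VAR / ${VAR} references in exactly one place, so every
    # caller sees the same expanded path.
    return os.path.expandvars(path)


# Demo with a hypothetical environment variable.
os.environ["DEMO_CACHE_ROOT"] = "/tmp/demo"
resolved = getcachepath({("remotefilelog", "cachepath"): "$DEMO_CACHE_ROOT/hgcache"})
```

Routing every access through one function means an unexpanded path cannot leak out of a call site that forgot to resolve variables.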
Durham Goode
c9621d3d1a repack: don't require complete history during data repack
Summary:
Previously, when repacking deltas we would require a full history of the node so
we could order the hashes optimally. In some situations though, we don't have
the full history available (like if we're only repacking a subset of packs), so
we need to be able to repack even without full history.

This patch handles the case where a given delta doesn't have history
information. We just store it as a full text.

This becomes useful in an upcoming series that will introduce incremental
packing that only packs a subset of the packs.

Test Plan: Added a test

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3278346

Signature: t1:3278346:1463086170:54c0fbefe78f9cafa7efc4b6f037887a924ab4a5
2016-05-16 10:59:09 -07:00
Durham Goode
a5ed23b7ca fileserverclient: separate data and history in prefetching
Summary:
Previously the fileserverclient logic only checked the data store when
determining whether to prefetch a given key or not. This meant that if the file
was in the data store but not the history store, it could result in a history
lookup failing to prefetch.

The fix is to separate the notions of data and history in the prefetch logic. By
default we just check the data store like before, but an optional argument
allows us to specify checking the history store as well (and we change the
remotemetadatastore to pass that argument).

Test Plan:
Ran the tests. Also ran the repack scenario in my large repo that
reproed the issue. In another diff, I'm going to come back and add a suite of
tests around the various repack permutations that I've seen cause issues.

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3277239

Signature: t1:3277239:1463085690:49ad478048cd13836b60f7ac9190e2294f5e9c64
2016-05-16 10:59:09 -07:00
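The data/history split described above can be sketched as follows. `SetStore`, `keystoprefetch`, and the `fetchhistory` argument are hypothetical names for illustration only; the real fileserverclient signature is not shown in this log.

```python
class SetStore:
    """Hypothetical store that only knows which (name, node) keys it holds."""

    def __init__(self, keys):
        self._keys = set(keys)

    def getmissing(self, keys):
        # Return the keys this store does not contain.
        return [key for key in keys if key not in self._keys]


def keystoprefetch(keys, datastore, historystore, fetchhistory=False):
    """Return the keys that must be fetched from the server.

    By default only the data store is consulted. With fetchhistory=True,
    a key that is missing from the history store is fetched as well,
    even if its data is already present.
    """
    missing = set(datastore.getmissing(keys))
    if fetchhistory:
        missing.update(historystore.getmissing(keys))
    return sorted(missing)


keys = [("a.txt", "n1"), ("b.txt", "n2")]
datastore = SetStore(keys)                  # data for both files is present
historystore = SetStore([("a.txt", "n1")])  # history for b.txt is missing

dataonly = keystoprefetch(keys, datastore, historystore)
withhistory = keystoprefetch(keys, datastore, historystore, fetchhistory=True)
```

In this sketch the default call finds nothing to fetch, while the history-aware call reports `b.txt`, which is the case the commit message says previously failed to prefetch.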
Durham Goode
05ceb8b419 store: basic wire protocol for bundle delivery
Summary:
This adds a new wire protocol command to allow clients to request a set
of file contents and histories from the server and receive them in pack format.
It's pretty simple and always returns all the history for every node requested
(which is a bit overkill), but it's labeled v1 and we can iterate on it.

Test Plan: Added a test

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3277212

Signature: t1:3277212:1463421279:459cc84265502175b47df293647aab7e7a830185
2016-05-16 10:59:09 -07:00
Durham Goode
a54d3f1257 store: make repack only repack the shared cache
Summary:
Previously, hg repack would repack all the objects in all the stores and dump the
new packs in .hg/store/packs. Initially we only want to repack the shared cache
though, so let's change repack to only operate on shared stores, and to write
out the new packs to the hgcache, under the appropriate repo name.

In a future patch I'm going to go through all this store stuff and replace all
uses of os.path and direct file reads/writes with a mercurial vfs.

Test Plan:
Ran repack in a large repo and verified packs were produced in
$HGCACHE/$REPONAME/packs

Ran hg cat on a file to verify that it read the data from the pack and did not do any remotefilelog network calls.

Reviewers: lcharignon, rmcelroy, ttung, quark, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3250213

Signature: t1:3250213:1462315927:694661795141e2c869ba661a54cea8f4b90823df
2016-05-04 14:52:33 -07:00
Durham Goode
22948ce7e1 checkcode: add check code test
Summary: Adds the same check code test that upstream Mercurial uses.

Test Plan:
Ran it, and fixed all the failures. I won't land this commit until
all the failure fixes are landed.

Reviewers: #sourcecontrol, ttung, rmcelroy, wez

Reviewed By: wez

Subscribers: quark, rmcelroy, wez

Differential Revision: https://phabricator.intern.facebook.com/D3221380

Signature: t1:3221380:1461802769:19f5bdc209c05edb442faa70ae572ce31e2fbc95
2016-04-28 10:18:47 -07:00
Durham Goode
18cde8ba89 store: add a historypack class that can read histpacks
Summary:
The previous patch added logic to repack store history and write it to
a histpack file. This patch adds a pack reader implementation that knows how to
read histpacks.

Test Plan:
Ran the tests.  Also tested this in conjunction with the next patch
which actually reads from the data structure.

Reviewers: #sourcecontrol, ttung, mitrandir, rmcelroy

Reviewed By: mitrandir, rmcelroy

Subscribers: rmcelroy, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3219764

Signature: t1:3219764:1461718081:9d812b6aea87fe9eb48fdac9dbef282e4775c3c9
2016-04-27 16:49:24 -07:00
Durham Goode
43ed70b6f1 store: add datapack store that reads pack files from .hg/store/packs
Summary:
Now that we can read and write datapack files, let's add a store implementation that
can serve packed content. With this patch, it's technically possible for someone
to prefetch and repack large portions of history for long term storage with
remotefilelog.

My next set of commits (which haven't been written yet) will:
- add tests for all of this
- add an indexpack format for packing ancestor metadata (the datapack only packs
  revision content)

Test Plan:
Ran the tests. Also repacked a repo, deleted the old cache files, ran
hg up null && hg up master, and verified it checked out master with the
right files and without fetching blobs from the server.

Reviewers: #sourcecontrol, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3205351

Signature: t1:3205351:1461751649:45a56b57d962a282aeef9478500a3b23495a0eb7
2016-04-27 16:49:15 -07:00
Durham Goode
f362c9a3a8 store: add getfiles() api to store
Summary:
This adds an API to the store contract that allows the store to return a list of
the name/node pairs that it contains. This will be used to allow a repack
algorithm to list the contents of the store so it can repack it into another
store. The old remotefilelog blob store used namehash+node keys, which is
different from the new store API's name+node keys, so the getfiles()
implementation here has to perform a reverse namehash->name lookup so it can
satisfy the store API contract.

In the remotefilelog basestore implementation, it reads the file names from the
local data directory and the shared cache directory, and reverse resolves the
file name hashes into filenames to produce the list.

Test Plan: ran the tests

Reviewers: #sourcecontrol, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3205321

Signature: t1:3205321:1461751437:a7c44c2bbe153122a3b85b8d82907a112cf77b1a
2016-04-27 16:49:06 -07:00
Durham Goode
7e1047d11f store: change union stores to accept a list of stores
Summary:
Instead of hard coding the list of stores in each union store, let's make it a
list and just test each store in order. This will allow easily adding new stores
and reordering the priority of the existing ones.

Also fix the remote store's contains function. 'contains' is the old name, and
it now needs to be getmissing in order to fit the store contract.

Test Plan: ran the tests

Reviewers: #sourcecontrol, ttung, rmcelroy

Reviewed By: rmcelroy

Differential Revision: https://phabricator.fb.com/D3205314

Signature: t1:3205314:1461606028:3a513ac82c5de668a7e40bbf7cc88d8754e2f0bb
2016-04-26 15:10:38 -07:00
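A minimal sketch of a union store that takes an ordered list of stores, assuming each store exposes `get` and `getmissing` as the commit message suggests. `DictStore` and `UnionStore` are illustrative classes, not the remotefilelog implementation.

```python
class DictStore:
    """Hypothetical in-memory store keyed by (name, node) pairs."""

    def __init__(self, entries):
        self._entries = dict(entries)

    def get(self, name, node):
        return self._entries[(name, node)]

    def getmissing(self, keys):
        # Return the subset of keys this store does not contain.
        return [key for key in keys if key not in self._entries]


class UnionStore:
    """Consults an ordered list of stores; earlier stores take priority."""

    def __init__(self, stores):
        self.stores = list(stores)

    def get(self, name, node):
        for store in self.stores:
            try:
                return store.get(name, node)
            except KeyError:
                continue
        raise KeyError((name, node))

    def getmissing(self, keys):
        # Narrow the missing set store by store, stopping early once
        # every key has been found somewhere.
        missing = list(keys)
        for store in self.stores:
            if not missing:
                break
            missing = store.getmissing(missing)
        return missing


cache = DictStore({("a.txt", "n1"): b"cached"})
local = DictStore({("a.txt", "n1"): b"local", ("b.txt", "n2"): b"only-local"})
union = UnionStore([cache, local])
```

Because priority is just list order, reordering the stores or adding a new one only changes the list passed to the constructor.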
Durham Goode
9cfbf5a59e store: keep track of the writable store instead of hard coding it
Summary:
A future patch is going to change the union store to just contain an ordered
list of stores. Therefore we need a special spot to record which store is the
one that should receive writes.

Test Plan: ran the tests

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.fb.com/D3205307
2016-04-26 15:10:38 -07:00
Durham Goode
84bc49f25d checkcode: fix shallowrepo, shallowutil, and setup.py
Summary: Fix failures found by check-code.

Test Plan: Ran the tests

Reviewers: #sourcecontrol, ttung

Reviewed By: ttung

Differential Revision: https://phabricator.fb.com/D3221375

Signature: t1:3221375:1461648312:7dbdd59e6370cb32b90d864a623d8066028741e7
2016-04-26 13:00:31 -07:00
Durham Goode
8ca8f7f6ca stores: remove fetch logic and replace with a remote store fallthrough
The old way of fetching from the server required the base store api expose a way
for outside callers to add fetch handlers to the store. This exposed some of the
underlying details of how data is fetched in an unnecessary way and added an
awkward subscription api.

Let's just treat our remote caches as another store we can fetch from, and
require that the overarching configure logic (in shallowrepo.py) can connect
all our stores together in a union store.
2016-04-04 16:26:12 -07:00
Durham Goode
1d97924c54 store: construct store during repo creation
We are refactoring the storage to be behind more abstract APIs. This patch
creates the new store objects on the repo and passes them to the
fileserverclient so it can add itself as a file provider, in the case of misses.
2016-04-04 16:26:12 -07:00
Durham Goode
faccfe65d4 Add prefetching to checklookup
Summary:
During hg status Mercurial sometimes needs to look at the size or contents of
the file and compare it to what's in history, which requires the file blob.

This patch causes those files to be batch downloaded before they are compared.

There was a previous attempt at this (see the deleted code), but it only wrapped
the dirstate once at the beginning, so it was lost if the dirstate object was
replaced at any point.

Test Plan: Added a test to verify unknown files require only one fetch.

Reviewers: #sourcecontrol, ttung

Reviewed By: ttung

Subscribers: dcapra

Differential Revision: https://phabricator.fb.com/D2756768

Signature: t1:2756768:1450130997:7c7101efe66c998e3182dfbd848aa6b1a57d509f
2015-12-14 14:44:08 -08:00
Martin von Zweigbergk
7251d9b51b repo: replace repo.parents() by repo[None].parents()
repo.parents() was removed in hg revision d5d613de0f44 (commands:
inline definition of localrepo.parents() and drop the method (API),
2015-11-11).
2015-12-10 17:25:14 -08:00
Durham Goode
ca8028eb16 Add kwargs to repo.sparsematch
2015-10-06 10:07:01 -07:00
Augie Fackler
5eecca9702 remotefilelog: handle the death of repo.sopener (hg change 0bbe3294361a)
repo.sopener has been deprecated since hg 2.3, and repo.svfs replaces
it. Since it's been dead for so long, let's just use svfs and call it
good enough.
2015-06-30 10:12:38 -04:00
Durham Goode
8bf6e4f004 sparse: make remotefilelog aware of sparse checkouts
Summary:
Previously remotefilelog would prefetch every file in a commit. With the sparse
checkout extension we want to only prefetch things in the sparse checkout.

This commit makes remotefilelog aware of the possible existence of a sparse
matcher.

Test Plan: Added tests

Reviewers: sid0, rmcelroy, pyd, lcharignon

Subscribers: kang

Differential Revision: https://phabricator.fb.com/D1967207
2015-04-02 09:58:46 -07:00
Siddharth Agarwal
43e26aff3b shallowrepo: prefetch files before a commitctx
Summary:
For hg-git conversions we're going to cause commits without actually updating to the base. Currently, this will cause lots of individual fetches.

The test demonstrates the issue -- without this patch it'll fetch the 2 files over 2 fetches, but with it it'll fetch the files over 1 fetch.

Test Plan: Ran the tests.

Reviewers: davidsp, rmcelroy, akushner, pyd, daviser, mitrandir, ericsumner, durham

Reviewed By: durham

Differential Revision: https://phabricator.fb.com/D1893721

Tasks: 6390769

Signature: t1:1893721:1425624679:5651f71d5023919e9321646275b681b573847c44
2015-03-05 16:06:12 -08:00
Durham Goode
f9730cd521 Fix dirstate wrapping to match upstream
Upstream Mercurial commit f447144c8ada changed the dirstate.status output. This
updates remotefilelog to match that new output.
2014-10-22 12:36:53 -07:00
Durham Goode
37798a0827 Fix pull wrapping to match upstream
Upstream Mercurial has moved localrepo.pull into exchange.pull. This moves our
wrapping of that command out of shallowrepo and into __init__. Exchange is
becoming an increasingly important class, so we may want to think about moving
all exchange wrapper logic out to a separate module in remotefilelog.
2014-10-14 15:50:04 -07:00
Durham Goode
65503211ed Fix revset indexing bug and update test output
repo.revs() no longer returns an object that can be indexed, so we can't use []
on it anymore. So let's use list() on it first.

The bookmark output from upstream Mercurial has also changed, so we need to
update the tests.
2014-10-14 15:30:38 -07:00
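The failure mode can be illustrated without Mercurial. `LazyRevs` below is a hypothetical stand-in for the object that `repo.revs()` returns; it is iterable but defines no `__getitem__`, so indexing fails until the result is materialized with `list()`.

```python
class LazyRevs:
    """Hypothetical stand-in for an iterable, non-indexable revset result."""

    def __init__(self, revs):
        self._revs = revs

    def __iter__(self):
        return iter(self._revs)


revs = LazyRevs([3, 5, 8])

try:
    revs[0]
except TypeError:
    # No __getitem__ is defined, so [] is rejected.
    indexable = False
else:
    indexable = True

# Materialize first, then index.
first = list(revs)[0]
```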
Durham Goode
8a5a5330c1 Fix pullprefetch for recently landed commits
Summary:
Pull-prefetch would not download file versions from the server if the file
version already existed in the local cache or the local store data.
Unfortunately, if someone landed their commit, then later stripped their local
version, the local store data file version might become invalid and no local
cache version would exist. Meaning things like 'commit' might fail when offline.

This changes prefetch to always fetch from the server when dealing with files it
knows are from revs on the server.

Test Plan:
Added a test that makes local commits that already exist on the
server, and verifies that a pull-prefetch fetches the server file version,
despite that same version existing locally.

Reviewers: sid0, pyd, davidsp

Subscribers: orip

Differential Revision: https://phabricator.fb.com/D1607260
2014-10-09 15:20:54 -07:00
Durham Goode
17c16cf610 Optimize pullprefetch to limit number of stats
Summary:
Previously, if pullprefetch was set, we'd perform a prefetch of the
entire manifest of the specified revs (usually the public bookmarks). This
involved stat-ing all the relevant files in the cache to see if they already
existed, which added an extra 6 seconds or so to every pull.

Now we only prefetch the files that are different from our working copy. We
assume we already have all the files that are in our working copy. This reduces
the pullprefetch overhead significantly.

Test Plan:
Did a pull on my laptop. Verified it didn't hang for 6 seconds at the
prefetch stage. Also updated a test

Reviewers: davidsp, pyd, sid0

Reviewed By: sid0

Differential Revision: https://phabricator.fb.com/D1505841

Tasks: 4608894
2014-08-19 09:33:31 -07:00
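The optimization amounts to diffing two manifests instead of checking every file in the cache. A minimal sketch, assuming manifests are plain dicts mapping path to file node; `filestoprefetch` is a hypothetical helper name.

```python
def filestoprefetch(targetmanifest, workingmanifest):
    """Return (path, node) pairs in the target that differ from the working copy.

    Files whose node matches the working copy are assumed to be available
    already, so they are skipped without touching the cache on disk.
    """
    return sorted(
        (path, node)
        for path, node in targetmanifest.items()
        if workingmanifest.get(path) != node
    )


working = {"README": "aaa", "src/main.c": "bbb"}
target = {"README": "aaa", "src/main.c": "ccc", "src/new.c": "ddd"}

needed = filestoprefetch(target, working)
```

Only the changed and newly added files end up in `needed`, so the cost scales with the size of the difference rather than the size of the manifest.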
Durham Goode
e5228d9989 Fix pullprefetch that uses bookmarks
Summary:
Previously, pullprefetch was executed during the repo.pull stage. This happens
before the bookmarks have been moved, so revsets like 'bookmark()' would
prefetch the wrong commits.

This change moves the pullprefetch logic to after the pull command is completely
finished.  Updated a test to make sure this is caught.

Also fixes a bug where we were using linkrevs to read a manifest rev entry. We
should be using the manifest rev instead.

Test Plan: Added a test. Ran it.

Reviewers: sid0, pyd, davidsp

Differential Revision: https://phabricator.fb.com/D1483345
2014-08-06 18:50:57 -07:00
Durham Goode
13058fb30c Allow auto-prefetching during pulls
Summary:
Adds a remotefilelog.pullprefetch config option that accepts a revset. Whenever
a pull is run, the revs matched by that revset will be prefetched. The most
common value for this will be '(bookmark() + heads(all())) & public()', since it will download
almost everything necessary to work offline.

Test Plan: Added a test. Ran it.

Reviewers: davidsp, pyd, sid0

Reviewed By: sid0

Differential Revision: https://phabricator.fb.com/D1419420
2014-07-03 13:05:11 -07:00
Durham Goode
c5b2f574a0 Fix changegroup wrapping with new upstream Mercurial
Summary:
Recent changes to upstream Mercurial have moved localrepo.getbundle and
localrepo.addchangegroupfiles to changegroup.py.  remotefilelog wraps these
functions, and thus needs to be updated.

Applyupdate also had a function signature change, which is fixed here.

Minor fix to a test as well, which had a hard coded time instead of a glob.

Test Plan: ./run-tests.py --with-hg=/data/users/durham/hg/hg

Reviewers: sid0, davidsp, pyd, dschleimer

Differential Revision: https://phabricator.fb.com/D1260737
2014-04-04 15:55:06 -07:00
Durham Goode
bdea38dd56 Move fileservice to be per repo instead of global
Previously the file service client was a global object that all repos could
share. This was a bit hacky and is no longer needed. Now the file service
client exists per repo instance.

This is part of a series of changes to abstract the local caching and remote
file service in such a way that we can plug and play implementations.
2014-02-11 14:41:56 -08:00
Durham Goode
17f5a0d712 Fix issues with hg pulling from svn
2013-12-12 12:34:39 -08:00
Durham Goode
393958c76b Allow naming repos
Enables specifying a name for a repo that is used in the cache key.
This allows multiple repos on a machine to share a cache without the
risk of keys overlapping.
2013-08-15 11:00:51 -07:00
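A sketch of how a repo name can keep cache keys from colliding, assuming keys are built from the repo name, a hash of the file path, and the file node. The exact layout and the `cachekey` helper are assumptions made for illustration, not a description of the on-disk format.

```python
import hashlib
import os


def cachekey(reponame, path, node):
    """Build a cache key that is namespaced by repo name (hypothetical layout)."""
    pathhash = hashlib.sha1(path.encode("utf-8")).hexdigest()
    # Prefixing with the repo name means two repos that contain the same
    # path and node still get distinct keys in a shared cache.
    return os.path.join(reponame, pathhash[:2], pathhash[2:], node)


keya = cachekey("repo-a", "src/main.c", "1f0dee")
keyb = cachekey("repo-b", "src/main.c", "1f0dee")
```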
Durham Goode
85e48b58fd Move server and debug logic into their own files
__init__.py was getting quite large. This change moves the server and debug
logic into their own files.  Client-side logic remains in __init__.py
2013-11-25 16:36:44 -08:00
Durham Goode
d9d4477013 Remove global variable for tracking shallow remotes
Previously we used a global variable to track if the incoming connection was
from a shallow remote (based on if the network command was a *_shallow command).
This is hacky and overall a bad idea. The new implementation stores the shallow
flag as a bundlecapability passed to the getbundle command.

A side effect of this is remotefilelog won't work with versions of mercurial
that don't use the getbundle command.
2013-11-25 14:22:56 -08:00
Durham Goode
e5f5e3244b Add more comments explaining various complexities
2013-11-05 17:19:59 -08:00
Durham Goode
1275d15990 Add include and exclude configuration settings
The remotefilelog extension currently doesn't work with tags. Adding include and
exclude patterns allows users to specify which files they want to treat as
shallow and which they want to download the entire history for. By excluding
.hgtags from being shallow, this enables tags to work in a mostly shallow repo.

This also enables largefile-like scenarios where most files are full and only a
few large ones are kept remote.
2013-09-26 10:46:06 -07:00
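The include/exclude decision can be sketched with glob patterns. The precedence used here (exclude wins, and an empty include list means everything is shallow) is an assumption for illustration, as are the `isshallow` name and the use of `fnmatch`; the extension's actual matcher may behave differently.

```python
from fnmatch import fnmatch


def isshallow(path, include=(), exclude=()):
    """Decide whether a file is treated as shallow (history kept remote)."""
    # Assumption: an exclude match always wins.
    if any(fnmatch(path, pattern) for pattern in exclude):
        return False
    # Assumption: with include patterns set, only matching files are shallow.
    if include:
        return any(fnmatch(path, pattern) for pattern in include)
    # Assumption: with no include patterns, every file is shallow.
    return True


# Mostly shallow repo with full history kept for .hgtags.
tagsshallow = isshallow(".hgtags", exclude=[".hgtags"])
sourceshallow = isshallow("src/main.c", exclude=[".hgtags"])

# Largefile-like setup: only a few large files are kept remote.
binshallow = isshallow("assets/video.bin", include=["*.bin"])
codeshallow = isshallow("src/main.c", include=["*.bin"])
```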
Durham Goode
6781d80d25 Fix local pulls to send file data
2013-09-09 11:44:08 -07:00
Durham Goode
3619a1911d Cut down number of sys calls during filelog reads
When the cache is stored on a filesystem, excessive stat calls can slow
mercurial updates down dramatically. This reduces it to a single open call for
the cache location and if that fails, a single open call for the local location.
2013-09-09 10:23:29 -07:00
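The read path described above can be sketched as "try to open, fall through on failure" rather than "stat, then open". `readblob` and the file names are hypothetical, and the sketch uses Python 3 exception types rather than whatever the original code caught.

```python
import os
import tempfile


def readblob(cachepath, localpath):
    """Read a file blob with at most one open() call per location."""
    for path in (cachepath, localpath):
        try:
            # No exists()/stat() check first: a failed open is the check.
            with open(path, "rb") as f:
                return f.read()
        except FileNotFoundError:
            continue
    raise KeyError("blob not found in cache or local store")


# Demo: the cache copy is missing, so the local copy is returned.
with tempfile.TemporaryDirectory() as tmp:
    localpath = os.path.join(tmp, "local-blob")
    with open(localpath, "wb") as f:
        f.write(b"file contents")
    data = readblob(os.path.join(tmp, "missing-cache-blob"), localpath)
```

On a cache hit this costs one open call, and on a miss it costs two.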
Durham Goode
4d70ed4fce Fix a bug with status prefetching in merge scenarios
2013-09-04 19:07:01 -07:00
Durham Goode
4edeed8417 Prefetch lookup set during hg status
2013-08-30 11:09:19 -07:00
Durham Goode
f16a3a4134 Rename to remotefilelog since shallowrepo is already taken
2013-06-21 10:14:29 -07:00