288 Commits

Author SHA1 Message Date
Durham Goode
2d1dcb4b97 Fix missing 'grp' import 2016-04-18 11:46:06 -07:00
Durham Goode
5b2914142a Fix status returning invalid results
The recent refactor caused remotefilelog.size() to include rename metadata in
the size count, which meant the size didn't match what the rest of Mercurial
expected. This caused clean files to show up as dirty in hg status if they had a
'lookup' dirstate state and were renames.
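
A minimal sketch of the size mismatch described above, assuming the standard Mercurial filelog convention in which copy metadata is wrapped between two `\x01\n` markers at the start of the stored text. The helper name `content_size` is hypothetical and is not the actual remotefilelog code.

```python
META_MARKER = b"\x01\n"


def content_size(raw):
    """Return the size of the file content, excluding any rename metadata.

    Hypothetical helper: assumes metadata, when present, is framed by
    META_MARKER at the very start of the stored text.
    """
    if raw.startswith(META_MARKER):
        # Find the marker that closes the metadata block.
        end = raw.index(META_MARKER, len(META_MARKER))
        return len(raw) - (end + len(META_MARKER))
    return len(raw)
```

Comparing `len(raw)` against the size recorded in the dirstate is what would make a clean renamed file look dirty; comparing `content_size(raw)` avoids that.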
2016-04-10 09:46:24 -07:00
Durham Goode
2e93ca187a Add byte count checking when receiving from the server
Summary:
We've received a few complaints that receivemissing is throwing corrupt data
exceptions. My best guess is that we're not receiving all of the data for some
reason. Let's add an assertion to ensure all the data is present, so we can
narrow it down to a connection issue instead of actual corrupt data.
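
The kind of check described here can be sketched as follows. This is an illustration only: `read_exactly` and its parameters are hypothetical names, not the function added in this commit.

```python
import io


def read_exactly(stream, expected_size):
    """Read expected_size bytes from stream, failing loudly on a short read."""
    chunks = []
    remaining = expected_size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            # The stream ended before all of the data arrived.
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    data = b"".join(chunks)
    if len(data) != expected_size:
        raise IOError(
            "expected %d bytes but received %d" % (expected_size, len(data))
        )
    return data
```

A short read then surfaces as a connection-level error rather than as a later "corrupt data" exception.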

Test Plan: Ran the tests

Reviewers: #sourcecontrol, ttung

Differential Revision: https://phabricator.fb.com/D3136203
2016-04-05 09:50:12 -07:00
Durham Goode
24323a759c store: address code review feedback
This was meant to be part of the previous stack of commits, but I pushed the
wrong stack. This patch addresses a number of code review feedback points, the
most visible being to rename 'contains' to something else (in this case
'getmissing').
2016-04-04 16:48:55 -07:00
Durham Goode
8ca8f7f6ca stores: remove fetch logic and replace with a remote store fallthrough
The old way of fetching from the server required the base store api expose a way
for outside callers to add fetch handlers to the store. This exposed some of the
underlying details of how data is fetched in an unnecessary way and added an
awkward subscription api.

Let's just treat our remote caches as another store we can fetch from, and
require that the overarching configuration logic (in shallowrepo.py) can connect
all our stores together in a union store.
2016-04-04 16:26:12 -07:00
Durham Goode
ece19111e0 ioutil: rename ioutil to shallowutil
The old name was not very descriptive. There's already a shallowutil, so let's
just use that.
2016-04-04 16:26:12 -07:00
Durham Goode
29ea8ada1e store: delete the localcache class
Now that all functionality has been moved to the new store, we no longer need
the localcache class. So let's delete it.
2016-04-04 16:26:12 -07:00
Durham Goode
ecf4378d18 store: implement gc in the new store
The last major piece of functionality that needs to be moved into the new store
is the gc algorithm. This is just a copy paste of the one that exists in
localcache.
2016-04-04 16:26:12 -07:00
Durham Goode
d70897e18c store: implement markrepo on the new store
Now that most of our storage has been moved behind the new store, let's also
move the ability to mark the repo behind that storage abstraction.
2016-04-04 16:26:12 -07:00
Durham Goode
0dd4247520 store: make remotefilelog.ancestormap use the new store
Now that we have a metadatastore, let's use it to implement
remotefilelog.ancestormap. This gets rid of a bunch of ugly code.
2016-04-04 16:26:12 -07:00
Durham Goode
ad473d5a6b store: make remotefilelog.linknode use the new store
Now that we have the new metadatastore, let's use it to fetch the linknode
instead of parsing the data ourselves.
2016-04-04 16:26:12 -07:00
Durham Goode
82bc4468ed store: make remotefilelog.renamed use the store
Now that we have a metadata store, let's switch remotefilelog.renamed to consult
it, instead of parsing the data itself.
2016-04-04 16:26:12 -07:00
Durham Goode
aba161c424 store: implement metadatastore functions
This implements the metadatastore APIs that were previously just stubs.
2016-04-04 16:26:12 -07:00
Durham Goode
8ad3ce6f41 store: change fileserverclient to write via new store
Now that we have the new store abstraction, and now that remotefilelog.py writes
via it, let's also make fileserverclient write to the store via that API.

This required some refactoring of how receivemissing worked, so we could pass
the filename down, as that is required for writing to the store.
2016-04-04 16:26:12 -07:00
Durham Goode
721f54d0df store: move remotefilelog content writing to be done via basestore
Now that we have the new store abstraction, let's route writes through it as
well.
2016-04-04 16:26:12 -07:00
Durham Goode
ffb239bdcb store: switch remotefilelog.size to use new store
Now that we can read data via the new store, let's switch remotefilelog to use
that instead of talking to the filesystem directly.
2016-04-04 16:26:12 -07:00
Durham Goode
69aff18063 store: switch remotefilelog.read to use self.revision
Now that remotefilelog.revision is implemented using the new contentstore, let's
switch remotefilelog.read to use that instead. This logic is almost identical to
what's in filelog.read
2016-04-04 16:26:12 -07:00
Durham Goode
50df2e518f store: switch remotefilelog.revision to use new store
Now that the new contentstore has get(), let's switch remotefilelog.revision to
use it instead.
2016-04-04 16:26:12 -07:00
Durham Goode
cb48cd034a store: add store data validation
The old store logic has validation that checks whether the data it's reading is
corrupt. Let's copy and paste that over to the new store.
2016-04-04 16:26:12 -07:00
Durham Goode
dfb49ad597 store: implement contentstore.get
This implements the basic function for fetching content data from the
remotefilelog store.
2016-04-04 16:26:12 -07:00
Durham Goode
647684cca8 store: implement basestore.contains
This implements the basic contains function that checks if the given (filename,
node) pairs are in the store.
2016-04-04 16:26:12 -07:00
Durham Goode
1d97924c54 store: construct store during repo creation
We are refactoring the storage to be behind more abstract APIs. This patch
creates the new store objects on the repo and passes them to the
fileserverclient so it can add itself as a file provider, in the case of misses.
2016-04-04 16:26:12 -07:00
Durham Goode
9c88142860 store: add union stores
Future patches will refactor the storage logic into a more abstract API. This
patch adds a union store, which will allow us to check both local client storage and
shared cache storage, without exposing the difference at higher levels.
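
A rough sketch of the union idea, assuming each store exposes `get` and `getmissing` over `(filename, node)` keys. `DictStore` and `UnionStore` here are hypothetical stand-ins, not the classes added by this patch.

```python
class DictStore(object):
    """Hypothetical in-memory store keyed by (filename, node)."""

    def __init__(self, blobs=None):
        self._blobs = dict(blobs or {})

    def get(self, name, node):
        return self._blobs[(name, node)]

    def getmissing(self, keys):
        return [key for key in keys if key not in self._blobs]


class UnionStore(object):
    """Checks each underlying store in order, hiding which one answered."""

    def __init__(self, *stores):
        self.stores = stores

    def get(self, name, node):
        for store in self.stores:
            try:
                return store.get(name, node)
            except KeyError:
                continue
        raise KeyError((name, node))

    def getmissing(self, keys):
        # Each store only needs to be asked about what is still missing.
        missing = list(keys)
        for store in self.stores:
            if not missing:
                break
            missing = store.getmissing(missing)
        return missing
```

Callers hold a single `UnionStore` and never need to know whether a blob came from local client storage or the shared cache.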
2016-04-04 16:26:12 -07:00
Durham Goode
b62ef50278 store: add stubs for storage classes
Future patches will refactor the storage into a more abstract API. These are the
initial stubs for that API.
2016-04-04 16:26:12 -07:00
Durham Goode
492b9af06e ioutil: move helper functions to ioutil
Future patches will refactor the storage into more abstract APIs. Let's move
these utility functions out to be on their own.
2016-04-04 16:26:12 -07:00
Jun Wu
b7e6384e9c Allow repo = None in runcommand
Summary:
When running inside chg, `reposetup` will be called once since `serve` is not
a `norepo` command. Then if the user runs a `norepo` command like `help`,
`runcommand` will receive `repo = None` and error out. Fix it by checking
`repo` explicitly.
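
The shape of the fix might look like the sketch below. The wrapper signature and the `on_finish` callback are hypothetical; the real `runcommand` wrapper in remotefilelog is not shown in this log.

```python
def wrap_runcommand(orig, on_finish):
    """Wrap a command runner so per-repo cleanup is skipped when repo is None."""

    def runcommand(repo, *args, **kwargs):
        try:
            return orig(repo, *args, **kwargs)
        finally:
            # norepo commands such as `help` pass repo=None; check explicitly.
            if repo is not None:
                on_finish(repo)

    return runcommand
```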

Test Plan: Run `chg help` and no exception is thrown.

Reviewers: #sourcecontrol, ttung, durham

Reviewed By: durham

Differential Revision: https://phabricator.fb.com/D3136328

Signature: t1:3136328:1459811387:3b86df9765aa5e20677031d6e9fc4bc3d524efa6
2016-04-04 16:22:16 -07:00
Durham Goode
f774b1b204 adjustlinknode: remove unnecessary ancestor walk
Summary:
Since we added the C code ancestor walk to this function, this python ancestor
walk is completely unnecessary, and can cause significant slowdowns if none of
the ancestors are known linknodes (it walks the entire history).

Test Plan: Ran the tests

Reviewers: #sourcecontrol, ttung

Differential Revision: https://phabricator.fb.com/D3136150
2016-04-04 15:30:47 -07:00
Jun Wu
2ec49732fd Add shallowrepo check in wrapped log function
Summary:
Discovered by `hg log filename` in the hg-committed repo. It seems we missed
a check here.

Test Plan:
Run `hg log filename` in a non-remotefilelog repo with remotefilelog enabled
and make sure "warning: file log can be slow on large repos" is not printed.

Reviewers: #sourcecontrol, ttung, durham

Reviewed By: durham

Differential Revision: https://phabricator.fb.com/D3132523

Signature: t1:3132523:1459801676:bcba3bbcaf1c358ad11e8ad25c0a1d3cc2637a76
2016-04-04 13:33:28 -07:00
Kostia Balytskyi
4e61e19a3d remotefilelog: do ui.log of remotecache hit rate
Summary: We would like to utilize Martijn's logtoprocess extension to log cache hit rate.

Test Plan: None so far, will update the diff later.

Reviewers: #sourcecontrol, ttung

Differential Revision: https://phabricator.fb.com/D3094765
2016-04-01 03:16:00 -07:00
Mateusz Kwapich
cc54a98956 addchangegroup: adjust for new upstream API
Summary:
addchangegroupfiles doesn't take the pr function as a parameter anymore.
The upstream change is https://selenic.com/hg/rev/982e3ef7f5bf

Test Plan: tests are passing now on the release branch

Reviewers: #sourcecontrol, ttung, durham

Reviewed By: durham

Differential Revision: https://phabricator.fb.com/D3107217

Signature: t1:3107217:1459211189:4ece7531aff6043fc3acbfe43e2f471781c25c9d
2016-03-30 14:17:49 -07:00
Jun Wu
be261ba980 Fix tests for chg
Summary:
When running tests under chg, the server process won't exit per command.
Change tests to make them chg-friendly.

Test Plan:
Run tests with latest chg (with some additional patches not in core yet)

  cd ~/remotefilelog/tests
  ~/hg/tests/run-tests.py --chg

Reviewers: #sourcecontrol, ttung, durham

Reviewed By: durham

Subscribers: durham

Differential Revision: https://phabricator.fb.com/D3026087

Signature: t1:3026087:1457546610:bd469e51a50dbe49505afd661848b7280feacb53
2016-03-22 10:45:18 -07:00
Augie Fackler
9eb0009839 fileserverclient: use new iterbatch() method
This allows the client to send a single batch request for all file contents
and then handle the responses as they stream back to the client, which should
improve both running time and the user experience of progress reporting.
2016-03-22 10:06:24 -07:00
Ryan McElroy
d839113ded more leniency for slow network
Summary:
I saw a test failure on a loaded box where we got a low enough
throughput to write out "bytes/sec" instead of "KB/sec".

Test Plan: run all tests

Reviewers: #sourcecontrol, ttung, durham

Reviewed By: durham

Subscribers: durham

Differential Revision: https://phabricator.fb.com/D3018195

Signature: t1:3018195:1457379358:24cd762f4c2788b49bc6fa409f6a935f37a70980
2016-03-08 03:47:33 -08:00
Augie Fackler
86ea8ed060 commands: norepo was removed in e1563031f528
Use the decorator form instead, introduced in hg 3.1.
2016-03-03 13:40:31 -05:00
Wez Furlong
2ec314e26a remotefilelog: add separate option to validate localcache files
Summary:
We've recently had to dig into two different issues that resulted in broken
files landing in the localcache; one was due to a problem with the data source
for our cacheprocess becoming corrupt and the other was due to a failed write
(ENOSPC) causing a truncated file to be left in the local cache.

It is desirable to perform some lightweight consistency checks before we return
data up to the caller of localcache, but prior to this diff the validation
functionality was coupled to configuring a log file.

Due to the shared nature of the localcache it's not always clear cut where we
want to log localcache consistency issues, so it feels more flexible to
decouple logging from enabling checks.

This diff introduces `remotefilelog.validatecache` as a separate option that
can have three values:

* `off` - no checks are performed
* `on` - checks are performed during read and write
* `strict` - checks are performed during __contains__, read and write

The default is now `on`.
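
The three modes above can be read as a small lookup table. This is a sketch under assumptions: `VALIDATE_OPS` and `should_validate` are hypothetical names, and only the option values and the operations they cover come from this message.

```python
# Which operations trigger a consistency check for each validatecache value.
VALIDATE_OPS = {
    "off": frozenset(),
    "on": frozenset(["read", "write"]),
    "strict": frozenset(["contains", "read", "write"]),
}

DEFAULT_MODE = "on"


def should_validate(operation, mode=DEFAULT_MODE):
    """Return True if the given operation should be validated in this mode."""
    try:
        return operation in VALIDATE_OPS[mode]
    except KeyError:
        raise ValueError("unknown validatecache mode: %r" % (mode,))
```

For example, `should_validate("contains", "strict")` is true while `should_validate("contains")` is false under the default.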

Test Plan: `./run-tests.py --with-hg=../../hg-crew/hg`

Reviewers: #sourcecontrol, ttung

Differential Revision: https://phabricator.fb.com/D2941067

Tasks: 10044183, 9987694
2016-02-18 08:34:33 -08:00
Durham Goode
a7a78cda1e More robust adjustlinknode code for None srcrev's
Summary:
The srcrev passed to adjustlinknode can sometimes be None, which causes an
exception. The code that throws the exception was introduced recently as part of
taking advantage of a C fast path.

The fix is to move the srcrev check to be after the None handling.

Test Plan:
I'm not sure how to repro this naturally actually.  I tried writing
tests that did rebases of renames, but it didn't trigger.  I manually verified
it by using the debugger to insert a None for the srcrev at the beginning of
adjustlinknode

Reviewers: lcharignon, #sourcecontrol, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.fb.com/D2944899

Tasks: 10066192

Signature: t1:2944899:1455735567:c8eea240885847061239bf3df0ea59dbbd0e4858
2016-02-17 11:01:45 -08:00
Wez Furlong
fd584f7e56 remotefilelog: more graceful handling of write errors for localcache
Summary:
I debugged an issue this past week where a set of machines had exhausted the
disk space available on the partition where the local cache was situated.  This
particular tier didn't use cacheprocess, only the local cache.  There were some
number of truncated files in the local cache.

Inspecting the code here, it looks like we're using atomictempfile incorrectly.
atomictempfile.close() will unconditionally rename the temp file into place,
and we were calling this from a finally handler.

It seems safest to remove the try/finally from around this section of code and
just let the destructor trigger to clean up the temporary file in the error
path, and if we make it through writing the data, then call close and have it
move the file into place.
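
The pattern being described, sketched with the standard library rather than Mercurial's atomictempfile (an assumption made so the example is self-contained; `write_atomically` is a hypothetical name):

```python
import os
import tempfile


def write_atomically(path, data):
    """Write data to path so that a failed write never leaves a truncated file."""
    directory = os.path.dirname(path) or "."
    fd, tmppath = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
    except Exception:
        # Error path: discard the temp file and leave the destination alone.
        os.unlink(tmppath)
        raise
    # Success path only: move the fully written file into place.
    os.replace(tmppath, path)
```

The key point is that the rename happens only after the write succeeds, never from a `finally` block.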

Test Plan:
ran the tests.  They don't cover this case, but at least I didn't
obviously break anything:

```
 $ ./run-tests.py --with-hg=../../hg-crew/hg
...................
# Ran 19 tests, 0 skipped, 0 warned, 0 failed.
```

Reviewers: #sourcecontrol, ttung, mitrandir

Reviewed By: mitrandir

Subscribers: scyost

Differential Revision: https://phabricator.fb.com/D2940861

Tasks: 10044183

Signature: t1:2940861:1455673078:a7593d70c32151e13c8ccc31f92387e9c8cb23a0
2016-02-17 08:03:38 -08:00
Augie Fackler
6a4b66c8cc test-cacheprocess: add a new test that demonstrates talking to a cacheprocess
Right now this only demonstrates reading from the cache. Writing is
not currently implemented. This tests both the cache-hit and
cache-miss cases, both with and without the file path included in the
cache request.
2016-02-08 14:42:14 -08:00
Augie Fackler
73ce6a3a01 hgignore: also ignore testtimes from hg testrunner
If you use ~/path/to/hg's/tests/run-tests.py to run tests, you get
this file now, so let's ignore it.
2016-02-08 14:40:27 -08:00
Durham Goode
2cce4008b6 adjustlinknode: use C fastpath
Summary:
The adjustlinknode logic was pretty slow, since it did all the ancestry
traversal in python. This patch makes it first use the C fastpath to check if
the provided linknode is correct (which it usually is), before proceeding to the
slow path.

The fastpath can process about 300,000 commits per second, versus the 9,000
commits per second by the slow path.

This cuts 'hg log <file>' down from 5s to 2.5s in situations where the log spans
several hundred thousand commits.
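
The control flow (cheap check first, expensive scan only on failure) can be sketched in pure Python. This is not the real adjustlinknode: the `parents` and `introduced` maps are hypothetical stand-ins for the changelog, and the speedup in the actual patch comes from a C ancestor check that this sketch does not reproduce.

```python
def is_ancestor(parents, candidate, descendant):
    """Return True if candidate is descendant or one of its ancestors."""
    stack = [descendant]
    seen = set()
    while stack:
        node = stack.pop()
        if node == candidate:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents.get(node, ()))
    return False


def adjust_linknode(parents, introduced, srcnode, linknode, filenode):
    """Pick a linknode for filenode that is an ancestor of srcnode."""
    # Fast path: the recorded linknode is usually already correct.
    if is_ancestor(parents, linknode, srcnode):
        return linknode
    # Slow path: scan the ancestors of srcnode for a commit that
    # introduced this file node.
    stack = [srcnode]
    seen = set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if filenode in introduced.get(node, ()):
            return node
        stack.extend(parents.get(node, ()))
    return None
```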

Test Plan:
Ran the tests, and ran hg log <file> on a file with a lot of history
and verified the time gain.

Reviewers: pyd, #sourcecontrol, ttung, quark

Reviewed By: quark

Subscribers: quark

Differential Revision: https://phabricator.fb.com/D2908532

Signature: t1:2908532:1454718666:c4e63d73057572f035082943ef2e6fe0a49238c1
2016-02-08 14:40:07 -08:00
Simon Farnsworth
6cdf20e7ad remotefilelog: Make TortoiseHG work with remotefilelog 2016-02-05 14:53:45 +00:00
Durham Goode
16d12ec27c Remove limit on adjust linknode lookup
Previously we limited the changelog scan for old commits to the most recent
100,000, under the assumption that most changes would be within that time frame.
This turned out to not be a good assumption, so let's remove the limitation.
2016-01-27 15:56:36 -08:00
Augie Fackler
afca077cf9 fileserverclient: add option to provide file path to cacheprocess
For our uses of remotefilelog, life is significantly easier if we also
have the file path rather than just a hash of the file path. Hide this
behind a config knob so users can enable it or not as makes sense.
2016-01-27 13:22:22 -08:00
Durham Goode
4ee8e7278d changegroup: support new _packermap name
Upstream changed changegroup.packermap to be changegroup._packermap. So we need
to update accordingly.
2016-01-19 16:34:53 -08:00
Durham Goode
77a1e268d2 tests: update tests to include cg3 in header
Upstream added a new changegroup format, so our headers changed slightly.
2016-01-19 16:34:20 -08:00
Durham Goode
13c2a7823f Add alternative linkrev lookup logic
Summary:
The old linkrev lookup logic depended on the repo containing the latest commit
to have contained that particular version of the file. If the latest version had
been stripped however (like what happens in rebase --abort currently), the
linkrev function would attempt to scan history from the current rev,
trying to find the linkrev node.

If the filectx was not provided with a 'current node', the linkrev function
would return None. This caused certain places to break, like the Mercurial
merge conflict resolution logic (which constructs a filectx using only a
fileid, and no changeid, for the merge ancestor).

The fix is to allow scanning all the latest commits in the repo, looking for the
appropriate linkrev. This is pretty slow (1 second for every 14,000 commits
inspected), but is better than just returning None and crashing.

Test Plan:
Manually repro'd the issue by making a commit, amending it, stripping the
amended version and going back to the original, making two sibling commits on
top of the original, then rebasing sibling 1 onto sibling 2 (so that the
original commit that had the bad linknode data was the ancestor during the
merge). Previously this failed, now it passes. I'd write a test, but it's 11pm
and I'm tired and I need this in by early tomorrow morning to make the cut.

Reviewers: #sourcecontrol, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: trunkagent, rmcelroy

Differential Revision: https://phabricator.fb.com/D2826850

Signature: t1:2826850:1452680293:cb8c1f8c20ce13ad632925137dbdce6e994ab360
2016-01-13 11:25:26 -08:00
Laurent Charignon
707f243248 remotefilelog: make the wrapping of dispatch.run safer
Summary:
I somehow got a stacktrace with IPython on a non-remotefilelog repo that ran
this code and complained that fileservice didn't exist. I am not sure how it
happened but let's make the call safer to match the pattern used elsewhere in
the file.

Test Plan: No stacktrace seen after that, one line change

Reviewers: durham

Differential Revision: https://phabricator.fb.com/D2819402
2016-01-11 10:48:51 -08:00
Kostia Balytskyi
9500813607 remotefilelog: removing filelog check from verification process
Differential Revision: https://phabricator.fb.com/D2812664
2016-01-07 16:57:39 -08:00
Stanislau Hlebik
33b7e1013a remotefilelog: make .hg/store/data blobs read only
Summary:
Today, people running codemods or search/replace on their repos often accidentally corrupt their repos, and everyone ends up sad.
It's better to make them read-only.
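
Clearing the write bits on a blob could look like this. The helper `make_read_only` is a hypothetical name for illustration, not the function this diff adds.

```python
import os
import stat


def make_read_only(path):
    """Remove the write permission bits for user, group, and others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
```

Tools that open files for writing will then fail with a permission error instead of silently rewriting store data (unless they run with elevated privileges).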

Test Plan: python run-tests.py

Reviewers: rmcelroy, #sourcecontrol, durham, ttung

Reviewed By: durham

Subscribers: mitrandir, quark, durham

Differential Revision: https://phabricator.fb.com/D2807369

Tasks: 9431187

Signature: t1:2807369:1452192329:b5ed6606cb66b1c830fc3d3fb5a81e6120387b38
2016-01-07 13:37:36 -08:00
Durham Goode
422b0cece0 tests: fix packermap dependency with upstream
Upstream added a changegroup3 format, and the test needs to account for that.
2016-01-04 14:59:10 -08:00