Commit Graph

14 Commits

Author SHA1 Message Date
Jeroen Vaelen
069330ec9f remotefilelog: lazily load history pack fanout table
Summary: Many operations that did not require access to histpacks were constructing them regardless. By postponing histpack initialization until we require access to it, we save up to 55ms on such operations.

Test Plan:
1. Stat profiler no longer shows ~6.5% time spent in basepack constructor during `hg book`. Also no longer shows up in the `hg show` profile.
2. Timed how much time was spent in basepack constructor, this was up to 55ms (mostly around 35ms).
3. Ran tests, they all pass.

Reviewers: #sourcecontrol, durham, quark, simonfar

Reviewed By: simonfar

Subscribers: simonfar, quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4500999

Signature: t1:4500999:1486502571:cda6cfb7be54eacb341b58fbff209113ea9cf670
2017-02-09 13:01:29 -08:00
Durham Goode
d3c6def7b8 remotefilelog: move pack file permissions to be in mutablebasepack
Summary:
Treemanifest had a bug where the pack files it created were 400 instead of 444.
This meant people sharing a cache on the same machine couldn't access them. In
most of the remotefilelog code we had set opener.createmode before creating the
pack but we forgot to for treemanifest.

This patch moves the opener creation and createmode setting into the mutable
pack class so we can't screw this up again.

Test Plan:
Tests that wrote to the packs before, now failed and had to be
updated to chmod the packs before writing to them.

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4411580

Tasks: 15469140

Signature: t1:4411580:1484321293:9aa78254677548a6dc2270c58cee0ec6f57dd089
2017-01-13 09:42:25 -08:00
Adam Simpkins
ab81106b08 [remotefilelog] don't crash on invalid pack files
Summary:
We have run into some cases where users ended up with empty pack file in their
packs directory.  Just log a warning in this case and skip this pack file,
rather than letting the exception propagate up and crashing the command.

Test Plan:
Created empty 0000000000000000000000000000000000000000.histpack and
0000000000000000000000000000000000000000.histidx files in my repository's
hgcache directory, and confirmed that "hg log" now simply warns about them
instead of crashing.

I didn't really test the perftest.py or treemanifest_correctness.py extensions
much.  They seem to throw exceptions, and look like they have maybe gotten a
bit stale.  I fixed one minor typo, but didn't dig into the other exceptions
too much.

Reviewers: durham

Reviewed By: durham

Subscribers: net-systems-diffs@, yogeshwer, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4402516

Tasks: 15428659

Signature: t1:4402516:1484155254:96d2980efcec2d735257b08910e1ca437ef1dad6
2017-01-12 09:47:29 -08:00
Durham Goode
c9037da765 remotefilelog: refactor mutablepack close/abort
Summary:
This makes mutablepacks close and abort function accessible from outside the
__exit__ logic. This will be useful later when we tie the lifetime of a
mutablepack to a transaction.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4055730

Signature: t1:4055730:1477059602:ffdfd66e65279ddf3ff43d7c2ee65b00f1fd2600
2016-10-21 11:02:17 -07:00
Durham Goode
f43ba75915 remotefilelog: fix pyflakes and module import errors
Summary:
This fixes all the pyflaks and module errors for the main remotefilelog
code base.

Test Plan: ./run-tests.py test-check* test-remotefilelog*

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4055537

Signature: t1:4055537:1477049663:ee904d311d17d3659e055e2c109c68c9023cfd1f
2016-10-21 11:02:09 -07:00
Durham Goode
df86b3486d treemanifest: auto create manifest pack directory
The pack/manifests directory wasn't being automatically created, so let's make
it so.
2016-09-21 13:51:39 -07:00
Durham Goode
8ce9d887d1 store: add markforrefresh api
Summary:
This api will be used to tell the store that new data has been added and the
store should check disk for new data (like new pack files).

Test Plan:
Used as part of a future diff that generates new tree entries during
hg pull time.

Reviewers: #fastmanifest, ttung

Reviewed By: ttung

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3837694

Signature: t1:3837694:1473454036:0f1f37064cadc0fe3dabc6ecc6e3f7b3319e3d56
2016-09-12 11:44:53 -07:00
Tony Tung
155c9bcabf basepack: handle race condition between incremental packing and pack loading
Summary: If someone removes the pack file, getpack will throw an IOError.  Catch it and don't bother trying to add the pack to the list of available packs.

Test Plan: pdb.set_trace() in this method, then remove a file while the debugger is halted.  continue without a traceback.

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3712310

Tasks: 12660285

Signature: t1:3712310:1471080740:85f2e3f49b09f348ec654362eadf5f1e689b8a19
2016-08-15 11:41:49 -07:00
Durham Goode
073f8e3d22 repack: unmap memory occasionally to reclaim space
Summary:
When running large repack operations, the resident size of the process
could become quite large, since we're scanning in entire pack files. Linux/OSX
have api calls for telling the kernel it's ok to release some of that memory,
but those apis are not exposed to python.

So instead, let's unmap and remap the mmap's once a certain amount of data has
been read. I also tried changing the mmap accessors to use the file oriented api
(mmap.read(), mmap.seek(), etc) so we could switch to actual file handles during
repack, but it had a drastic affect on normal performance (repack took 1 hour
instead of a few minutes).

Long term we should move all of this logic to c++ so we can use the more
powerful APIs.

Test Plan:
Did a full repack on a laptop and verified memory capped out at 2GB
instead of exceeding 5GB.

Reviewers: #sourcecontrol, ttung

Differential Revision: https://phabricator.intern.facebook.com/D3545171
2016-07-12 11:46:48 -07:00
Durham Goode
77192943d4 repack: handle race condition with background repacks
Summary:
There was a race condition where if a repack is running and another hg process
launches, the new process will only see the original packs, and not any of the
new packs (even though the source blobs are being deleted from disk by the
repack).

The fix is to allow our pack store to refresh it's list of packs every so often.
In this particular implementation we do it at most every 100ms. A more robust
strategy would be to group key misses and only check for new packs at the end
once we have a list of all the misses, but this would require significant
refactoring to make everything grouped. This case should only ever happen during
repacks, so it should almost never occur more than once during a command, so the
100ms version is probably good enough.

Test Plan:
Ran `hg up && hg pull && sleep 0.2 && hg up master` in a loop with a
break point in the refresh code and caught it executing in a situation where the
background repack had removed the original sources and put them in a new pack.
Verified that it loaded the data from the new pack correctly.

Reviewers: #mercurial, ttung, lcharignon

Reviewed By: lcharignon

Subscribers: lcharignon

Differential Revision: https://phabricator.intern.facebook.com/D3524314

Signature: t1:3524314:1467907680:85be07ad953811000c468852eb0626f4d8b53a13
2016-07-07 15:59:06 -07:00
Jeroen Vaelen
07efaadb9d [remotefilelog] use hashlib to compute sha1 hashes
Summary:
hg-crew's c27dc3c3122 and c27dc3c3122^ were breaking our extensions:

```
$ hg log -r c27dc3c3122^
changeset:   9010734b79911d2d2e7405d91a4df479b35b3841
user:        Augie Fackler <raf@durin42.com>
date:        Thu, 09 Jun 2016 21:12:33 -0700
s.ummary:     cleanup: replace uses of util.(md5|sha1|sha256|sha512) with hashlib.\1
```

```
$ hg log -r c27dc3c3122
changeset:   0d55a7b8d07bf948c935822e6eea85b044383f00
user:        Augie Fackler <raf@durin42.com>
date:        Thu, 09 Jun 2016 21:13:23 -0700
s.ummary:     util: drop local aliases for md5, sha1, sha256, and sha512
```

I did a grep over facebook-hg-rpms to see what was affected:
```
$ grep "util\.\(md5\|sha1\|sha256\|sha512\)" -r ~/facebook-hg-rpms
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/basestore.py:            sha = util.sha1(filename).digest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/basestore.py:                sha = util.sha1(filename).digest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/shallowutil.py:    pathhash = util.sha1(file).hexdigest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/shallowutil.py:    pathhash = util.sha1(file).hexdigest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/debugcommands.py:    filekey = util.sha1(file).hexdigest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/historypack.py:        namehash = util.sha1(name).digest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/historypack.py:        node = util.sha1(filename).digest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/historypack.py:        files = ((util.sha1(filename).digest(), offset, size)
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/fileserverclient.py:    pathhash = util.sha1(file).hexdigest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/fileserverclient.py:    pathhash = util.sha1(file).hexdigest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/basepack.py:        self.sha = util.sha1()
/home/jeroenv/facebook-hg-rpms/remotefilelog/tests/test-datapack.py:        return util.sha1(content).digest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/tests/test-histpack.py:        return util.sha1(content).digest()
Binary file /home/jeroenv/facebook-hg-rpms/hg-crew/.hg/store/data/mercurial/revlog.py.i matches
/home/jeroenv/facebook-hg-rpms/fb-hgext/sparse.py:            return util.sha1(fh.read()).hexdigest()
/home/jeroenv/facebook-hg-rpms/fb-hgext/sparse.py:        sha1 = util.sha1()
/home/jeroenv/facebook-hg-rpms/fb-hgext/sparse.py:        sha1 = util.sha1()
/home/jeroenv/facebook-hg-rpms/fb-hgext/sparse.py:        sha1 = util.sha1()
/home/jeroenv/facebook-hg-rpms/fb-hgext/sparse.py:    sha1 = util.sha1()
/home/jeroenv/facebook-hg-rpms/mutable-history/hgext/simple4server.py:        sha = util.sha1()
/home/jeroenv/facebook-hg-rpms/mutable-history/hgext/evolve.py:        sha = util.sha1()
```
This diff is part of the fix.

Test Plan:
Ran the tests.
```
$MERCURIALRUNTEST -S -j 48 --with-hg ~/local/facebook-hg-rpms/hg-crew/hg
```

Reviewers: #sourcecontrol, ttung

Differential Revision: https://phabricator.intern.facebook.com/D3440041

Tasks: 11762191
2016-06-15 15:48:16 -07:00
Durham Goode
304c6f5bd0 pack: move common pack logic into basepack
Summary:
This moves the common logic from datapack and historypack into a common
basepack. At the moment the only common logic is the constructor, which handles
version checking, fanout initialization, and mmap stuff.

Test Plan: Ran the tests

Reviewers: mitrandir, #mercurial, ttung, mjpieters

Reviewed By: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3306558

Signature: t1:3306558:1463474571:35d3d2e71849b8111e5455da2dd4810725a35523
2016-05-24 02:15:58 -07:00
Durham Goode
bb3a62a266 pack: move common pack store logic to basepackstore
Summary:
This takes the duplicate logic from datapackstore and historypackstore and moves
it into a common subclass.

Test Plan: Ran the tests

Reviewers: ttung, mitrandir, #mercurial, mjpieters

Reviewed By: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3306551

Signature: t1:3306551:1463474958:ddd36a43c2a3cbb34254c2b0153c0f3c5d431edb
2016-05-24 02:15:58 -07:00
Durham Goode
7cce219abb pack: move common logic out of mutabledatapack into base class
Summary:
mutabledatapack and mutablehistorypack share a lot of common code, especially
around the index and fanout table. Let's move much of the code to a common
mutablebasepack class and out of mutabledatapack. In the next patch I will make
mutablehistorypack a subclass of mutablebasepack and delete all the duplicate
logic.

Test Plan: Ran the tests

Reviewers: mitrandir, #mercurial, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: quark, rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3306542

Signature: t1:3306542:1463611860:16bc68416c9bbed87748a50f55a3bac7c618fdf1
2016-05-20 09:31:37 -07:00