559 Commits

Author SHA1 Message Date
Tony Tung
96f89fa049 [ctree] fix findChild
Summary: The strings are not necessarily null-terminated and length needs to be considered.

Test Plan: used in later diff to find a path.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3770358

Signature: t1:3770358:1472173786:9a2f681c90476aafd481c301ff65ac8b199214ec
2016-08-26 13:49:17 -07:00
Tony Tung
79f0ebb34a [ctree] create a new method appendbinfromhex
Summary:
appendbinfromhex appends the binary representation of a 40-byte hex string onto a std::string.  This allows us to reuse a std::string rather than to allocate a new one every time.

Also:

1. converted binfromhex to use this method.
2. updated the docblocks to actually reflect reality.
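
A rough illustration of the reuse pattern, in Python rather than the C++ of this diff. The function names mirror the ones above, but the bodies are assumptions for illustration and not the actual implementation; a `bytearray` stands in for the reused `std::string`.

```python
import binascii

HEX_NODE_SIZE = 40  # a SHA-1 node is 40 hex characters, 20 bytes in binary


def appendbinfromhex(hexnode, out):
    """Append the binary form of a 40-character hex node to `out` in place."""
    if len(hexnode) != HEX_NODE_SIZE:
        raise ValueError("expected a 40-character hex string")
    out.extend(binascii.unhexlify(hexnode))


def binfromhex(hexnode):
    """Convenience wrapper built on top of appendbinfromhex."""
    out = bytearray()
    appendbinfromhex(hexnode, out)
    return bytes(out)


# One buffer is reused across calls instead of allocating a new one each time.
buf = bytearray()
appendbinfromhex(b"96f89fa049" * 4, buf)
appendbinfromhex(b"79f0ebb34a" * 4, buf)
```

After the two calls, `buf` holds both 20-byte nodes back to back.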

Test Plan: `make local && cd ~/work/fbsource && PYTHONPATH=~/work/mercurial/facebook-hg-rpms/remotefilelog:~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/remotefilelog/tests/perftest.py testtree --kind flat,ctree,fast --test fulliter,diff,find --build "master~5::master"`

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3770048

Signature: t1:3770048:1472105520:eac79a42360ebfa258519346b68fc4541c2dbb7c
2016-08-26 13:49:17 -07:00
Tony Tung
a6bc389217 [ctree] free memory when destroying a treemanifest object
Summary: Everyone who holds heap-allocated memory gets destructors!

Test Plan: valgrind and confirmed no memory leaking from ctreemanifest

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3763327

Tasks: 12818084

Signature: t1:3763327:1472105103:8ac9d9694be4bf3b09e19e4381737622c94a2dac
2016-08-26 14:00:37 -07:00
Tony Tung
118cc2f8f0 [ctree] cache the root manifest upon retrieval
Summary: In most cases, this is pretty straightforward.  The only unusual case is `_treemanifest_find`, which does the actual resolution of the root manifest (unlike diff and iter).

Test Plan: make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3763324

Signature: t1:3763324:1472173572:3dcda9b318ad818d2f51e8c3472c7770739faafe
2016-08-26 13:49:17 -07:00
Tony Tung
5436c47a0d [ctree] rename node to rootNode
Summary: This more accurately describes its purpose.

Test Plan: simple refactor, so just make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3763323

Signature: t1:3763323:1472104318:b48246777db0527c7022066438c211f54c88703e
2016-08-26 13:49:17 -07:00
Tony Tung
75c02e78a9 [ctree] initialize all the fields in ManifestEntry constructor
Summary: It makes a destructor possible.

Test Plan: make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3763322

Signature: t1:3763322:1472104242:e76ef1943c2082ddecff872c6472de4706505922
2016-08-26 13:49:17 -07:00
Ryan McElroy
f4dd73e113 remotefilelog: pass modern check-code
Test Plan: run-tests.py test-check-code-hg.t

Reviewers: #mercurial, ttung, simonfar

Reviewed By: simonfar

Subscribers: simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3777581

Tasks: 12855049

Signature: t1:3777581:1472224785:a15040cec1c95ca60d1be837d905b3c3d87be362
2016-08-26 08:48:07 -07:00
Ryan McElroy
de1c93d6e5 Move files in old remotefilelog root to proper locations
Test Plan: Inspection, Code Review

Reviewers: #mercurial, ttung, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3777384

Tasks: 12855049

Signature: t1:3777384:1472217126:435190970cefea03ad366b474a115efaddb97391
2016-08-26 06:27:59 -07:00
Adam Simpkins
e25889f60f fix crash in "hg pull" when remotefilelog not enabled
Summary:
764cd9916c94 recently introduced code that was unconditionally checking the
repo.includepattern and repo.excludepattern attributes on a local repository
without first checking if this is a shallow repository.  These attributes only
exist on shallow repositories, causing "hg pull" to crash on non-shallow
repositories.  This crash wouldn't happen in simple circumstances, since the
remotefilelog extension only gets fully set up once a shallow repository object
has been created, however when using chg you can end up with scenarios where a
non-shallow repository is used in the same hg process after a shallow one.

This refactors the code to now store the local repository object on the remote
peer rather than trying to store the individual shallow, includepattern, and
excludepattern attributes.

Overall this code does still feel a bit janky to me -- the rest of the peer API
is independent of the local repository, but the _callstream() wrapper cares
about the local repository being referenced.  It seems like we should ideally
redesign the APIs so that _callstream() receives the local repository data as
an argument (or we should make the peer <--> local repository association more
formal and explicit if we think it's better to force an association here).

Test Plan: Added a new test which triggered the crash, but passes with these changes.

Reviewers: ttung, mitrandir, durham

Reviewed By: durham

Subscribers: net-systems-diffs@, yogeshwer

Differential Revision: https://phabricator.intern.facebook.com/D3756493

Tasks: 12823586

Signature: t1:3756493:1471971600:9666e9c31bf59070c3ace0821d47d322671eb5b1
2016-08-23 14:14:42 -07:00
Tony Tung
de4ad14a9c [ctree] make path a reference in diff
Summary: It's a fixed reference that we use, so no need for a pointer.

Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/remotefilelog:~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/remotefilelog/tests/perftest.py testtree --kind flat,ctree,fast --test fulliter,diff,find --build "master~5::master"`

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3753463

Signature: t1:3753463:1471905506:802315db9d2aa34363c7b0bccaefcbcda21b1a1e
2016-08-22 15:47:46 -07:00
Tony Tung
aab04b7fcf [ctree] have ManifestIterator::next return the entry directly
Summary: Since we're no longer returning a struct, we no longer need to return the value through a pointer argument.

Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/remotefilelog:~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/remotefilelog/tests/perftest.py testtree --kind flat,ctree,fast --test fulliter,diff,find --build "master~5::master"`

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3752376

Tasks: 12818084

Signature: t1:3752376:1471891162:e7cc159d2077d77d234902e4598b9661f15840c0
2016-08-22 15:47:31 -07:00
Tony Tung
fc7dffd109 [ctree] change the ManifestIterator to return ManifestEntry *
Summary: This allows us to modify the ManifestEntry stored in memory.  This also requires us to remove the const qualifier in a number of places.

Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/remotefilelog:~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/remotefilelog/tests/perftest.py testtree --kind flat,ctree,fast --test fulliter,diff,find --build "master~5::master"`

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3752366

Tasks: 12818084

Signature: t1:3752366:1471891101:1d42a04d85b7e8db34644dc8fbf1bb3481fbb7bc
2016-08-22 15:47:16 -07:00
Tony Tung
a3b14127d3 [ctree] extract treemanifest code to its own file
Test Plan: make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3733819

Signature: t1:3733819:1471542946:8b322a7099eceb41826bb9e0edeca52e2daceb4c
2016-08-22 15:45:06 -07:00
Tony Tung
4b9e4ad3ee [ctree] rename treemanifest.cpp to py-treemanifest.cpp
Summary: py-treemanifest.cpp will be mostly just python binding stuff.

Test Plan: make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3733832

Signature: t1:3733832:1471542792:d6f1130b5c16487f4d0173cedd91857fd3711c1d
2016-08-22 15:44:50 -07:00
Tony Tung
e953a3cc27 [ctree] get rid of manifestkey
Summary: Directly pass in the path + len and the node.  Note that the path is now a char* + len, because this allows us to use the path in treemanifest_find directly, rather than to construct a new path.

Test Plan: run existing perftest without crash.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3738280

Signature: t1:3738280:1471890923:c13283f1c61dc020ba1918ee9b25c24dfd2fc19b
2016-08-22 15:42:03 -07:00
Tony Tung
8205ab233f [ctree] methods to find and add children
Test Plan: used in later diff.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3732102

Signature: t1:3732102:1471905481:000ff5d976c348bac9f993f9c0b56f4b9c8b84f0
2016-08-22 15:41:52 -07:00
Tony Tung
33f0d41725 [ctree] modularize the code
Summary:
I like many small files.

There is one place where I'm making a functional change (convert.h) to satisfy angry compilers.

Test Plan: make local.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3732584

Signature: t1:3732584:1471542758:d0b7804753ea4fd39a507090338ae3c5104dc7fa
2016-08-22 15:40:56 -07:00
Tony Tung
b05f64d3f0 [ctree] add constructor for ManifestEntry that allocates memory and places its data
Summary: For adding new children to Manifests, we need to be able to create new ManifestEntries.  Since these ManifestEntries will not be backed by datapack data structures, we need ManifestEntries that have their own memory allocation.

Test Plan: used in later diff.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3732101

Signature: t1:3732101:1471541200:17b60af0977109757610168637a276f5dd999f8b
2016-08-22 13:58:42 -07:00
Tony Tung
f22b6105d5 [ctree] remove hack for clang
Summary: D3730823 removes the need for it.

Test Plan: compiles

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mathieubaudet, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3730880

Signature: t1:3730880:1471467917:2b202c5cab1a10fcfe5899670b11bc44740983f7
2016-08-22 13:58:10 -07:00
Tony Tung
efd9e50796 [ctree] remove dead code
Summary: This is not used.

Test Plan: compiles

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3726149

Signature: t1:3726149:1471401105:3c27f091b180dd14493a3e50d4043b10b077142b
2016-08-22 13:57:58 -07:00
Tony Tung
fd38308160 [ctree] add a Manifest pointer to ManifestEntry
Summary:
If a manifest has already been loaded for a ManifestEntry, we should cache that entry and reuse it.  Two reasons:

1) if someone makes a modification to a tree, we need to persist that.
2) better performance.

Missing in this diff: memory cleanup
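
A minimal Python sketch of the caching idea, not the C++ in this diff. The `resolved` attribute, the `get_manifest` method, and the `fetcher` callable are hypothetical names chosen for illustration.

```python
class ManifestEntry(object):
    def __init__(self, filename, node):
        self.filename = filename
        self.node = node
        self.resolved = None  # cached child manifest, filled in on first use

    def get_manifest(self, fetcher):
        # Load the child manifest once, then hand back the same object.
        if self.resolved is None:
            self.resolved = fetcher(self.filename, self.node)
        return self.resolved


calls = []


def fetcher(path, node):
    # Stand-in for an expensive store lookup (hypothetical).
    calls.append((path, node))
    return {"entries": []}


entry = ManifestEntry("dir", b"\x00" * 20)
first = entry.get_manifest(fetcher)
first["entries"].append("newfile")  # an in-memory modification
second = entry.get_manifest(fetcher)
```

Because `second` is the same object as `first`, the modification persists and the store is only consulted once, which covers both reasons listed above.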

Test Plan: compiles

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3725842

Signature: t1:3725842:1471401136:cbf4a987c35ea19ca432059cc15e299f0aa5568b
2016-08-22 13:57:41 -07:00
Tony Tung
f1ce770ce5 [ctree] convert ManifestFetcher to standard declaration + definitions layout
Summary: This allows us to use ManifestFetcher inside the ManifestEntry class.

Test Plan: compiles

Reviewers: #fastmanifest, durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3725838
2016-08-22 13:55:43 -07:00
Tony Tung
71837a8c16 [ctree] transact in manifests rather than manifestkeys
Summary: This will allow us to replace portions of manifests with in-memory representations.

Test Plan: existing script doesn't crash.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3725831

Signature: t1:3725831:1471400740:4d9891c01f8567f4ceab76d6bd36e7dc595de4a6
2016-08-22 13:55:19 -07:00
Tony Tung
0727c6d030 [ctree] remove the need for a nextentrystart by returning the end of the entry
Summary: `parseptr` refers to where the parsing is going next.  This simplifies the code and makes ManifestEntry less tied to the original memory allocation.
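
A Python sketch of the "return where parsing continues" pattern. The entry layout used here (`name\0`, 40 hex characters, an optional flag character, then a newline) is an assumption for illustration, and `parse_entry` and `parse_all` are hypothetical names.

```python
def parse_entry(data, start):
    """Parse one entry beginning at `start`; return (entry, next_start)."""
    nul = data.index(b"\0", start)
    end = data.index(b"\n", nul)
    name = data[start:nul]
    node = data[nul + 1:nul + 41]
    flag = data[nul + 41:end]  # empty when no flag is set
    # The caller resumes right after the newline, so no separate
    # "next entry start" field has to be stored on the entry.
    return (name, node, flag), end + 1


def parse_all(data):
    entries = []
    pos = 0
    while pos < len(data):
        entry, pos = parse_entry(data, pos)
        entries.append(entry)
    return entries


raw = b"a.txt\0" + b"1" * 40 + b"\n" + b"dir\0" + b"2" * 40 + b"t\n"
entries = parse_all(raw)
```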

Test Plan: the existing test i've been running doesn't crash.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3717940

Signature: t1:3717940:1471383988:9b718e137b8ffaf8aad09f78f15791eec57fce6e
2016-08-22 13:42:55 -07:00
Tony Tung
dc8b5e0229 [ctree] parse manifests at load time
Summary: Parse manifests when we construct the Manifest object.  This allows us to do manipulations to the Manifest object and not have to deal with a parallel set of data structures.

Test Plan:
`PYTHONPATH=~/work/mercurial/facebook-hg-rpms/remotefilelog:~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/ valgrind ~/work/mercurial/facebook-hg-rpms/hg-crew/hg  --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/remotefilelog/tests/perftest.py testtree --kind flat,ctree,fast --test fulliter,diff,find --build "@~2::@"`

note that I'm running this with Valgrind.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3717936

Signature: t1:3717936:1471383856:012634c1e59f1da9fc1e5a918e7f7d99d30d6992
2016-08-22 13:42:40 -07:00
Tony Tung
4245bfb5d8 [ctree] get rid of unused index field
Summary: It's not necessary for the direction we're going in.

Test Plan: compiles

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3717930

Signature: t1:3717930:1471383901:d8bfddb7ea7fb08d26315cbfce7af65d14662bc8
2016-08-22 13:42:27 -07:00
Tony Tung
56a6784b4d [ctree] Manifest objects are now allocated on the heap to permit them to be persisted
Summary:
We need manifest objects to be able to stick around in memory, because now they have overrides and all that other good stuff.

This probably introduces a metric ton of memory leaks, but we'll slowly whittle them down.

Test Plan: same script.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3717304

Signature: t1:3717304:1471385439:9269ab248d233a970c6725dbb4ca3eb661f6a96e
2016-08-22 13:41:27 -07:00
Tony Tung
a89dcaf23f [ctree] remove InMemoryManifest
Summary: All manifests will be in-memory to simplify the design.

Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/remotefilelog:~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/ ~/work/mercurial/facebook-hg-rpms/hg-crew/hg  --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/remotefilelog/tests/perftest.py testtree --kind flat,ctree,fast --test fulliter,diff,find --build "@~2::@"`

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3717296

Signature: t1:3717296:1471383272:0fca293beaf811ef39de87300437b0bf880e1ca7
2016-08-22 13:40:57 -07:00
Tony Tung
25b86ae774 [ctree] use a common idiom for naming structures
Summary: xxx is the C++ class, py_xxx is the python wrapper for it.

Test Plan: make local

Reviewers: #fastmanifest, akushner

Reviewed By: akushner

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3701536

Signature: t1:3701536:1471150787:5f3dfc3a65360233f9ee2ea1757ed4f368f70d62
2016-08-22 13:40:15 -07:00
Tony Tung
ff44d5f816 [ctree] handle construction and destruction in treemanifest object
Summary: py_treemanifest is initialized to 0s.  We initialize the `tm` field by explicitly calling the constructor in `treemanifest_init` and we destroy everything by explicitly calling the destructor in `treemanifest_dealloc`

Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/remotefilelog:~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/remotefilelog/tests/perftest.py testtree --kind flat,ctree,fast --test fulliter,diff,find --build "master~5::master"` in fbsource

Reviewers: #fastmanifest

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3730823
2016-08-22 13:39:35 -07:00
Tony Tung
536f6aaaee [ctree] create a py_treemanifest struct that handles the exclusively-python stuff
Summary: treemanifest will become the C++-only treemanifest object.

Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/remotefilelog:~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/ ~/work/mercurial/facebook-hg-rpms/hg-crew/hg  --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/remotefilelog/tests/perftest.py testtree --kind flat,ctree,fast --test fulliter,diff,find --build "@~2::@"`

Reviewers: #fastmanifest, durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3701468
2016-08-22 13:37:53 -07:00
Tony Tung
11823e1845 [ctree] move conversion methods to convert.h
Summary:
yo refactoring.

Depends on D3699264

Test Plan: compiles

Reviewers: #fastmanifest, akushner

Reviewed By: akushner

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3699498

Signature: t1:3699498:1471151476:b08a69fb474413c9fc780eaffba4058aca00dacc
2016-08-18 14:44:26 -07:00
Tony Tung
8894da1d35 [ctree] fix comments for clarity/grammar
Test Plan: meh.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3718053

Signature: t1:3718053:1471384016:be6ad07218e90e03708eae3c9f8a9bc4b1c67520
2016-08-17 12:42:42 -07:00
Tony Tung
1f25a00b79 [ctree] fix comment style
Summary: Matches javadoc/doxygen style

Test Plan: compiles

Reviewers: #fastmanifest, akushner

Reviewed By: akushner

Subscribers: akushner, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3699264

Signature: t1:3699264:1471152249:d917e1f53426ace7b2e26df2cc72379fa97f93c1
2016-08-17 12:41:54 -07:00
Tony Tung
1083a7147d [ctree] if the flag is unset, return the null character as the flag
Summary: If the flag is not present, then `flag` field is set to NULL.  In that case, the current code will segfault.  Now we will assign `\0` to `*resultflag`.

Test Plan:
run `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/remotefilelog:~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/ ~/work/mercurial/facebook-hg-rpms/hg-crew/hg  --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/remotefilelog/tests/perftest.py testtree --kind flat,ctree,fast --test fulliter,diff,find --build "@~2::@"` without crashing.

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3700352
2016-08-17 12:37:19 -07:00
Tony Tung
d1a61c4d0f [ctree] move the treemf reference up to fileiter
Summary: This is the first step in disentangling the C++ code from the python interface.

Test Plan: `PYTHONPATH=~/remotefilelog/build/lib.linux-x86_64-2.6/ python ~/hg/hg --config extensions.remotefilelog=~/remotefilelog/remotefilelog --config extensions.perftest=~/remotefilelog/tests/perftest.py testtree --kind flat,ctree --test fulliter,diff,find --build "master~5000::master"`

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3700627

Signature: t1:3700627:1471382403:d8dc6dc5c295ec55878ca020b91fc0b30d930ce8
2016-08-17 11:48:01 -07:00
Tony Tung
c882541a19 CMakeLists.txt to build both ctreemanifest and cdatapack
Summary: Useful if you run CLion.

Test Plan: built everything.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3700542

Signature: t1:3700542:1471382263:260eb0a1588480f1d4c798df60d559c63ed19c8a
2016-08-17 11:42:31 -07:00
Tony Tung
155c9bcabf basepack: handle race condition between incremental packing and pack loading
Summary: If someone removes the pack file, getpack will throw an IOError.  Catch it and don't bother trying to add the pack to the list of available packs.
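
A minimal sketch of the pattern, assuming a loader that raises IOError when the file is gone. `load_packs` and the stand-in `getpack` below are hypothetical and only illustrate the control flow, not the basepack code itself.

```python
import os
import tempfile


def load_packs(paths, getpack):
    """Return the packs that could be opened, skipping ones that vanished."""
    packs = []
    for path in paths:
        try:
            packs.append(getpack(path))
        except IOError:
            # The pack was removed between listing and loading; skip it.
            continue
    return packs


def getpack(path):
    # Stand-in loader for this sketch: just read the file's bytes.
    with open(path, "rb") as f:
        return f.read()


tmpdir = tempfile.mkdtemp()
present = os.path.join(tmpdir, "present.datapack")
with open(present, "wb") as f:
    f.write(b"pack")
missing = os.path.join(tmpdir, "missing.datapack")  # never created

packs = load_packs([missing, present], getpack)
```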

Test Plan: pdb.set_trace() in this method, then remove a file while the debugger is halted.  continue without a traceback.

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3712310

Tasks: 12660285

Signature: t1:3712310:1471080740:85f2e3f49b09f348ec654362eadf5f1e689b8a19
2016-08-15 11:41:49 -07:00
Tony Tung
30ba9cdd24 [cdatapack] madvise the memory away
Summary: Once we're done reading the delta data, we madvise it away.

Test Plan:
dump all the hashes from a datapack into a separate file.  then run a script to fetch all the delta chains.  observed that the memory footprint did not increase significantly.

```
#!/usr/bin/env python

import binascii

import cdatapack

dp = cdatapack.datapack('/dev/shm/hgcache/fbsource/packs/8b5d28f7a5bd7391a0b060c88af8cca3af357c24')

for ix, line in enumerate(open('/tmp/hashes', 'r')):
    line = line.strip()
    dp.getdeltachain(binascii.unhexlify(line))
```

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3686830

Signature: t1:3686830:1470716333:e8fc8e3e3fa29c1931f69222c17e10f519a4a8c2
2016-08-15 11:39:14 -07:00
Jun Wu
372dc6028a fileserverclient: remotefilepeer may not implement "cleanup"
Summary:
We actually have no idea what "hg.peer" will return and should check
if it has a "cleanup" method or not before wrapping it.
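
A small sketch of the check, assuming nothing about the real peer classes. `wrap_cleanup`, `SshLikePeer`, and `HttpLikePeer` are hypothetical names for illustration.

```python
def wrap_cleanup(peer, wrapper):
    """Wrap peer.cleanup with `wrapper` only when the peer defines it."""
    orig = getattr(peer, "cleanup", None)
    if orig is None:
        return False  # nothing to wrap, leave the peer untouched
    peer.cleanup = lambda: wrapper(orig)
    return True


class SshLikePeer(object):
    def __init__(self):
        self.cleaned = False

    def cleanup(self):
        self.cleaned = True


class HttpLikePeer(object):
    pass  # no cleanup method at all


log = []


def logged(orig):
    log.append("before")
    orig()


ssh = SshLikePeer()
http = HttpLikePeer()
wrapped_ssh = wrap_cleanup(ssh, logged)
wrapped_http = wrap_cleanup(http, logged)
ssh.cleanup()
```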

Test Plan: Code Review

Reviewers: ttung, #mercurial, rmcelroy

Reviewed By: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3703201

Signature: t1:3703201:1470924859:852eaf275c89ceced285a4b74d09938e489d9ee0

Blame Revision: D3685587
2016-08-11 15:14:54 +01:00
Tony Tung
aa27058507 [ctree] fix build on clang
Summary:
clang has a bug where the fully qualified class name cannot be used to invoke the destructor https://llvm.org/bugs/show_bug.cgi?id=12350

This is one of the suggested workarounds.

Test Plan: compiles on darwin

Reviewers: #fastmanifest, lcharignon

Reviewed By: lcharignon

Subscribers: mathieubaudet, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3692812

Signature: t1:3692812:1470852812:09ef7de2a322034a01f5569b574c47fcc6f0c8d7
2016-08-10 13:04:38 -07:00
Tony Tung
a401e1db1c [cdatapack] free the data segments allocated for delta chains
Summary: Since we allocate the memory for the uncompressed data on the heap, we need to free it.

Test Plan: compiles.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3686334

Signature: t1:3686334:1470716123:10fae5169e73ab581b282f771d188f804198d169
2016-08-09 13:48:32 -07:00
Jun Wu
037146cbd9 Workaround an issue where sshpeer + workers lead to deadlocks
Summary:
The deadlock happens in sshpeer.cleanup:

```
  # sshpeer.cleanup
  def cleanup(self):
      if self.pipeo is None:
          return
      self.pipeo.close() # [1]
      self.pipei.close()
      try:
          # read the error descriptor until EOF
          for l in self.pipee: # [2]
              self.ui.status(_("remote: "), l)
      except (IOError, ValueError):
          pass
      self.pipee.close()
```

Notes:

  1. Normally, this closes "stdin" of the server-side ssh process so
     the "ssh" process running server-side will detect EOF and end.
     However, with workers calling "os.fork", there could be other
     processes keeping "pipeo" open thus "ssh" won't receive EOF.
  2. Deadlock happens here, if the ssh process cannot get EOF and
     does not end itself, thus does not close its stderr.

The deadlock happens with these steps:

  1. "hg update ..." starts. Let's call it "the master process".
  2. A memcache miss happens, and an sshpeer is started by fileserverclient.
     pipei, pipeo, pipee get created.
  3. Workers start. They inherit pipei, pipeo, pipee.
  4. The master process waits for the workers without closing pipeo.
  5. The workers are at the "cleanup" method.
  6. The server-side "ssh" reading its stdin hangs because the master
     process hasn't closed pipeo, so its stdin never sees EOF.
  7. The server-side "ssh" process never ends. It does not close its
     stderr.
  8. The workers reading from pipee never end. They never get EOF
     because the server-side "ssh" won't close its stderr.
  9. The master process waiting for workers never completes, because
     workers won't exit before they read all of pipee.

The patch closes pipee for forked processes to address the issue.
Ideally, we want this in sshpeer.py because it could in theory
affect other sshpeer use-cases. But let's do it for remotefilelog
for now since remotefilelog is currently the only victim of this
deadlock pattern.
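
The underlying pipe behaviour can be seen in a small POSIX-only sketch. Here `os.dup` stands in for the descriptor copy that a forked process inherits, and the read end is made non-blocking so the demonstration cannot hang; the variable names are illustrative and none of this is the remotefilelog code.

```python
import os

read_fd, write_fd = os.pipe()
inherited_fd = os.dup(write_fd)  # stands in for the copy another process holds
os.set_blocking(read_fd, False)  # so the demo reports "would block" instead of hanging

os.close(write_fd)  # one holder closes its write end, as cleanup() does
try:
    os.read(read_fd, 1)
    saw_eof_early = True
except BlockingIOError:
    # No data and no EOF: another descriptor still keeps the write end open.
    saw_eof_early = False

os.close(inherited_fd)  # once every copy is closed...
saw_eof_late = os.read(read_fd, 1) == b""  # ...the reader finally sees EOF
os.close(read_fd)
```

With a blocking read, the first `os.read` would wait forever, which is the hang described in the steps above.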

Test Plan:
Add some extra debugging logs. Check the wrapped `_cleanup` gets called
and things work normally.

Reviewers: #mercurial, ttung, durham

Reviewed By: durham

Differential Revision: https://phabricator.intern.facebook.com/D3685587

Tasks: 12563156

Signature: t1:3685587:1470688591:0f4f97508699b273e17df867898d65205ee52434
2016-08-08 20:13:12 +01:00
Durham Goode
f63c7b38a0 ctree: large refactor to introduce Manifest, ManifestFetcher, and InMemoryManifest
This is a large refactor of the original code base. I apologize for it not being
broken up into multiple commits.

This adds the Manifest class (that represents a single Manifest directory
entry), the ManifestFetcher class (which allows fetching children manifests),
and the InMemoryManifest class (that represents an uncommitted Manifest).

The combination of these classes does a few things:
1. It removes most of the references to PythonObj from the primary algorithms.
So in the future we could use this code from other C++ code without relying on
Python.
2. It refactors the manifest access in such a way that we allow manifests to be
stored in either python strings or in memory. This opens the doors to
implementing manifest editing APIs.
2016-08-08 12:20:44 -07:00
Durham Goode
c8ca415b0e ctree: add getattr to PythonObj
Summary:
Now that we have a wrapper around python objects, let's add a getattr() function
to easily retrieve attributes off the object.

Test Plan: Ran the perf test

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3674366
2016-08-08 12:16:25 -07:00
Durham Goode
18a54f3d14 ctree: switch all PyObjects to use new wrapper class
Summary:
This patch introduces a PythonObj class which implements the copy-constructor,
destructor, and assignment operator in a way that will manage the ref count
automatically. If we move to C++ 11 we could also implement the move constructor
and move assignment operators to make this even more efficient.

The current implementation allows implicitly converting to and from PyObject*,
which may be questionable design wise, but makes switching to and using this
class much cleaner since we can do things like `PythonObj foo = PyObject_New()`
and `PyObject_DoStuff(myPythonObj)`.

Test Plan: Ran the perf test. It succeeded, and I saw no effect on perf.

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3674311
2016-08-08 12:16:25 -07:00
Durham Goode
81e68944ae ctree: use new pyexception type to propagate python exceptions
Summary:
Instead of returning NULL and propagating it up the call stack, let's throw
pyexception (which assumes the python error string has already been set) and the
top of the stack can just return NULL.

Test Plan: Ran my perf test suite

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3673804
2016-08-08 12:16:25 -07:00
Durham Goode
f9a0e52da4 ctree: implement diff
Summary:
This implements treemanifest.diff(). It takes two manifests and iterates over
them to produce a python dictionary containing the differences.

I'm not proud of this.  Just putting it up for review for completeness since it
completes the find, diff, iter trifecta. I need to refactor it to remove some of
the duplication before it gets accepted.
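
For reference, a simplified Python sketch of what a manifest diff produces. It works on flat `{path: (node, flag)}` dictionaries and ignores the tree structure entirely, so it is not the algorithm in this diff; the result shape and the `(None, "")` marker for a missing side are assumptions for illustration.

```python
def diff(m1, m2):
    """Return {path: ((node1, flag1), (node2, flag2))} for differing paths."""
    missing = (None, "")
    result = {}
    for path in sorted(set(m1) | set(m2)):
        left = m1.get(path, missing)
        right = m2.get(path, missing)
        if left != right:
            result[path] = (left, right)
    return result


old = {"README": ("aa", ""), "bin/run": ("bb", "x"), "src/a.c": ("cc", "")}
new = {"README": ("aa", ""), "bin/run": ("bb", ""), "src/b.c": ("dd", "")}
changes = diff(old, new)
```

Unchanged paths such as `README` are left out, a flag-only change shows up with the same node on both sides, and added or removed files have the missing marker on one side.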

Test Plan:
Ran it as part of a perf test suite using diffs across various
distances.  It takes 250ms to diff across 5000 commits, and 900ms to diff across
50,000 commits.

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3646003
2016-08-08 12:16:25 -07:00
Durham Goode
031783c47f ctree: convert manifesttree struct to class
Summary:
By making this a class we can encapsulate common operations like parsing and
directory checking. The later diff that implements treemanifest_diff uses this
a lot.

Test Plan: Ran my perf tests for find and iter (and for the later patch diff).

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3673228
2016-08-08 12:16:25 -07:00
Durham Goode
c43476ae8a ctree: explicitly cast to Py_ssize_t
Summary:
The varargs style Py_BuildValue functions have no idea what type the incoming
arguments are, so it passed the hard-coded ints as 32-bit instead of 64-bit.
Let's explicitly cast every number being passed to that function as Py_ssize_t
instead.

Test Plan: We were seeing segfaults without this change. Now we don't.

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3673217
2016-08-08 12:16:25 -07:00
Durham Goode
bced440ce6 ctree: implement find
Summary:
This implements treemanifest.find(), which takes a filename and returns the node
and flag for it, if it exists.

This isn't the prettiest function I've ever written.  I need to think about how
to refactor this to unify the various traversal algorithms that are used in
treemanifest.
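
A Python sketch of the lookup, assuming the tree is held as nested dictionaries with `(node, flag)` tuples for files. That representation is an assumption for illustration and differs from the C implementation, which reads manifests from the store.

```python
def find(root, path):
    """Return (node, flag) for `path`, or None if it is not in the tree."""
    current = root
    parts = path.split("/")
    for part in parts[:-1]:
        current = current.get(part)
        if not isinstance(current, dict):
            return None  # missing directory, or a file where a directory was expected
    entry = current.get(parts[-1])
    if isinstance(entry, tuple):
        return entry
    return None  # missing, or the path names a directory


tree = {
    "README": ("aa", ""),
    "bin": {"run": ("bb", "x")},
    "src": {"lib": {"a.c": ("cc", "")}},
}
```

Only the directories along the path are visited, so the cost depends on the path depth and not on the total number of files.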

Test Plan:
Ran it as part of a perf test suite. It's basically 0 milliseconds in
every case.

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3645967
2016-08-08 12:16:25 -07:00
Durham Goode
d5ae2f2fdf ctree: implement __iter__
Summary:
This implements the basic __iter__ logic. It returns an iterator that returns
every file path in the manifest.
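
A minimal Python 3 sketch of the idea, assuming the tree is modelled as nested dictionaries in which a `dict` value is a directory and anything else is a file entry. That model is an assumption for illustration only, not the C data structure.

```python
def iterpaths(tree, prefix=""):
    """Yield every file path in a nested-dict tree, in sorted order."""
    for name in sorted(tree):
        child = tree[name]
        if isinstance(child, dict):
            # Directory: descend and extend the path prefix.
            for path in iterpaths(child, prefix + name + "/"):
                yield path
        else:
            # File entry: emit the full path.
            yield prefix + name
```

Because it is a generator, callers can stop early without walking the rest of the tree.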

Test Plan:
Ran a separate perf test suite to verify the performance of this. It
can iterate over 1 million files in about 550 milliseconds, assuming a fast
store.

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3645890
2016-08-08 12:16:25 -07:00
Durham Goode
f3263cc7ab ctree: define basic fileiter type
Summary:
This defines the basic type for representing an iteration over all the keys in
the treemanifest. The next patch will add the logic that constructs and mutates
this type as it iterates.

Test Plan: I ran a perf test in a future patch that executes all of this code.

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3645768
2016-08-08 12:16:25 -07:00
Durham Goode
28b736554e ctree: add common helper methods
Summary:
This adds getdata and binfromhex functions. These are common functions that will
be used throughout the implementation of the ctreemanifest.

Test Plan: Ran a perf test suite on the code in a later diff.

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3644960
2016-08-08 12:16:25 -07:00
Durham Goode
d55020433d ctree: add an initial c type definition for treemanifest
Summary:
This is the basic definition of the c treemanifest type. Future patches will add
functions to this type.

Test Plan: Tested by running a perf suite on a later version of this series

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3644935
2016-08-08 12:16:25 -07:00
Durham Goode
aa0ad156a2 ctree: adds an initial cpython file for the ctreemanifest implementation
Summary: Just a simple module declaration with no logic yet.

Test Plan: Ran perf tests against a later implementation of this

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3644720
2016-08-08 12:16:25 -07:00
Tony Tung
f642d24f1c [cdatapack] create a fastdatapack class
Summary:
fastdatapack is the same as datapack.  Add a selector in datapackstore to determine which datapack to create.

test-datapack-fast.t is the same as test-datapack.t, except it enables fastdatapack

Test Plan: pass test-datapack.t test-datapack-fast.t

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3666932

Signature: t1:3666932:1470426499:45292064e2868caab152d9a5b788840c5f63e4e4
2016-08-05 14:35:29 -07:00
Tony Tung
bbeadc1631 [cdatapack] wire up getdeltachain
Summary: Call the getdeltachain in C, format the results for python.

Test Plan: pass test-repack.t

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3671744

Signature: t1:3671744:1470382442:efae340c8cf5b173407c909c0816bf26704c7bf5
2016-08-05 11:44:19 -07:00
Tony Tung
4a9136f325 [cdatapack] optimize fanout table lookups by doing the calculations at load time
Summary: Do the divides when we load up the index table.  Lookups are now cheaper.

Test Plan: pass test-repack.t

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3671719

Signature: t1:3671719:1470382246:937fd19f60ad71304e6a75f348fa92f067aba895
2016-08-05 11:43:39 -07:00
Durham Goode
baaff31090 repack: move background repack into requirement check
Previously we were kicking off background repacks even for non remotefilelog
repos. Moving the repack to be inside the remotefilelog requirement check will
prevent this.
2016-08-05 10:00:22 -07:00
Tony Tung
e45027053c [cdatapack] iterator should return deltalen and not delta
Summary: This is to match the Python API.

Test Plan: used in later diff to pass test-datapack.t

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3668366

Signature: t1:3668366:1470341366:7c77684e646fe31a742e76158133597b46307797
2016-08-04 13:52:55 -07:00
Tony Tung
437072e8ec [cdatapack] fix bug in following deltabasenode pointers
Summary: The raw index is a byte offset, not an entry number.  I hope the compiler is smart enough to optimize out the divide and multiply. :)

Test Plan: cdatapack_get on a delta chain that has a deltabasenode does not crash!

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3667033

Signature: t1:3667033:1470341429:b37da6c9ea6e37fe79b48ec6766c857b5e56c36a
2016-08-04 13:52:41 -07:00
Tony Tung
4fc0694b79 [cdatapack] use int for string length
Summary: Unless `PY_SSIZE_T_CLEAN` is defined, the length is int.

Test Plan: compiles.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3666998

Signature: t1:3666998:1470341329:3ddd1f7e36389aff3e18482db4d0b28ab8d7c12f
2016-08-04 13:52:24 -07:00
Tony Tung
66ffd84e1d [cdatapack] add some error handling
Summary:
Replace some TODOs with actual error handling code.

Also lumped in typo fixes and style changes.  Sorry.

Test Plan: Used in a later diff to pass test-datapack.t

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3666875

Signature: t1:3666875:1470341306:67cd439341ad30fcd690ff2e399e8beacea1c0bb
2016-08-04 13:51:45 -07:00
Tony Tung
66368c07a1 [cdatapack] fix bugs in the fanout table
Summary:
1. When bisecting, we don't want to wrap around.  If middle == 0 and the target sorts before that entry, we should just fail.
2. The large fanout check should be header->config & LARGE_FANOUT.  Using | means it's always a large fanout.
3. The format of the fanout table on disk makes it impossible to differentiate between an empty fanout entry and the first fanout entry, as they are both '0'.  Therefore, any entry before the *second* fanout entry must implicitly search the 0th element.
4. Fixed a bug in the calculation of the last index entry.
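
The first two points can be sketched in Python 3 as follows. The flag value and the function names are hypothetical and chosen only for this example; the real code is C and works on unsigned indices, which is where the wrap-around comes from.

```python
LARGE_FANOUT = 0x80  # hypothetical flag bit, for illustration only


def uses_large_fanout(config):
    # `config | LARGE_FANOUT` is never zero, so a test written with `|`
    # would always report a large fanout. `&` isolates the flag bit.
    return bool(config & LARGE_FANOUT)


def bisect_index(entries, target):
    """Return the position of target in the sorted list entries, or None."""
    low, high = 0, len(entries) - 1
    while low <= high:
        middle = (low + high) // 2
        if entries[middle] == target:
            return middle
        if target < entries[middle]:
            if middle == 0:
                # With unsigned arithmetic, middle - 1 would wrap around,
                # so give up here instead of continuing the search.
                return None
            high = middle - 1
        else:
            low = middle + 1
    return None
```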

Test Plan: passed test-datapack.t with other fixes applied.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3666770

Signature: t1:3666770:1470341277:3f4f63a365e8bb0f4da6e574fc7f15228877c682
2016-08-04 13:51:07 -07:00
Tony Tung
8237209f77 [cdatapack] stricter const
Summary: We don't ever need to modify the node sha data, so make it const.

Test Plan: compiles

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3666714

Signature: t1:3666714:1470340104:080ffa290a49388e797dcefc66976f6341932b76
2016-08-04 13:50:00 -07:00
Tony Tung
c299d186f4 [cdatapack] implement _find()
Summary: Depends on D3660087, D3654810

Test Plan:
```
#!/usr/bin/env python

import binascii

import cdatapack

a = cdatapack.datapack('d864669a5651d04505ec6e5e9dba1319cde71f7b')

bin = binascii.unhexlify('f2e53f83c5dc806aa2eda87bb15fe0367baf3a7e')
print a._find(bin)
```

yields:

```
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:eaf5d75> python foo.py
('\xf2\xe5?\x83\xc5\xdc\x80j\xa2\xed\xa8{\xb1_\xe06{\xaf:~', 4294967295, 285122348L, 8374L)
```

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3660091

Signature: t1:3660091:1470339492:21a7f3067bda7822c8f396120f99f1dc6e4e26b5
2016-08-04 13:49:19 -07:00
Tony Tung
9b036e561e [cdatapack] expose the find interface
Summary: Needed if we want to do a hybrid implementation of cdatapack

Test Plan: used in following diff.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3660087

Signature: t1:3660087:1470339373:4e8b548f1509af7f34d0a4bf8bd85723f38d238d
2016-08-04 13:48:32 -07:00
Tony Tung
113f23b65e [cdatapack] implement iteritems()
Summary: `iteritems()` differs from `__iter__()` slightly in that it yields the delta base and delta.

Test Plan:
run this toy program

```
#!/usr/bin/env python

import cdatapack

a = cdatapack.datapack('d864669a5651d04505ec6e5e9dba1319cde71f7b')
for x in a.iterentries():
    print x[0], repr(x[1]), repr(x[2]), len(x[3])
for x in a:
    print x[0], repr(x[1])
```

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3659133

Signature: t1:3659133:1470339839:dbdce5990a30ffe019ccc44fce97925b64524acd
2016-08-04 13:48:21 -07:00
Tony Tung
a28137668e [cdatapack] skeleton for iterator type
Summary: cdatapack now has a getiter function, and it returns a cdatapack_iterator.

Test Plan:
using this toy program, dumped a pack file.

```
#!/usr/bin/env python

import cdatapack

a = cdatapack.datapack('d864669a5651d04505ec6e5e9dba1319cde71f7b')
for x in a:
    print x
```

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3659005

Signature: t1:3659005:1470339657:aa39cc57a669b9bc4604933ce35ed20b3f81b468
2016-08-04 13:48:04 -07:00
Tony Tung
e8da5b62df [cdatapack] fix build on linux hosts
Summary:
1. Get ntohl from arpa/inet.h as per the posix spec
2. Get ntohll from endian.h's be64toh

Test Plan: make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3671211

Signature: t1:3671211:1470341382:e6b0fe12094246aeb6be09252122bde9680e4599
2016-08-04 13:23:11 -07:00
Tony Tung
7b70d8c572 [cdatapack] skeleton for the python type
Summary:
This is the skeleton for the python type.  Only initialization and the destructor are filled in.

Depends on D3654786.

Test Plan:
```
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:0478c29> ls -l d864669a5651d04505ec6e5e9dba1319cde71f7b*
-r--r--r--  1 tonytung  staff     947666 Jul 26 14:08 d864669a5651d04505ec6e5e9dba1319cde71f7b.dataidx
-r--r--r--  1 tonytung  staff  285130722 Jul 26 14:08 d864669a5651d04505ec6e5e9dba1319cde71f7b.datapack
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:0478c29> cat foo.py
#!/usr/bin/env python

import cdatapack

a = cdatapack.datapack('d864669a5651d04505ec6e5e9dba1319cde71f7b')
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:0478c29> python foo.py
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:0478c29>
```

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3654810

Signature: t1:3654810:1470175451:c2d4e4acc138685c1030ed98f0afd9379f9fa0c4
2016-08-03 15:29:44 -07:00
Tony Tung
76f5986ab9 [cdatapack] adds an initial cpython file for the cdatapack implementation
Summary: Just a simple module declaration with no logic yet.

Test Plan:
```
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:2445a3a> make local

<output snipped>

[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:2445a3a> python
Python 2.7.11 (default, Mar  1 2016, 18:40:10)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import cdatapack
>>>
```

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3654786

Signature: t1:3654786:1470175354:c7e8847dcc74c83483d21888ad30cd9242fb461c
2016-08-03 15:29:29 -07:00
Tony Tung
58e395d2e6 [cdatapack] utility to retrieve and checksum the delta chain
Summary:
Given a node sha, find it in the index file and retrieve the deltas.  Checksum the data and dump it.

Depends on D3637000, D3636945

Test Plan:
```
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:68cd351> /Users/tonytung/Library/Caches/CLion2016.2/cmake/generated/cdatapack-64b7828e/64b7828e/Debug0/cdatapack_get  d864669a5651d04505ec6e5e9dba1319cde71f7b  f2e53f83c5dc806aa2eda87bb15fe0367baf3a7e

source/zippydb/tier_spec/tier_settings/zippydb.wildcard.tmpfs.zippydb_settings.cconf
Node                                      Delta Base                                Delta SHA1                                Delta Length
f2e53f83c5dc806aa2eda87bb15fe0367baf3a7e  0000000000000000000000000000000000000000  f32b366a6c44430df6526133f82f9638426ba9c5  37769
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:68cd351> hg debugdatapack d864669a5651d04505ec6e5e9dba1319cde71f7b --node f2e53f83c5dc806aa2eda87bb15fe0367baf3a7e

source/zippydb/tier_spec/tier_settings/zippydb.wildcard.tmpfs.zippydb_settings.cconf
Node                                      Delta Base                                Delta SHA1                                Delta Length
f2e53f83c5dc806aa2eda87bb15fe0367baf3a7e  0000000000000000000000000000000000000000  f32b366a6c44430df6526133f82f9638426ba9c5  37769
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:68cd351>
```

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3637416

Signature: t1:3637416:1470094723:bce7e903cd0b80c293e16b7532c49e552d3039ef
2016-08-03 15:29:01 -07:00
Olivier Trempe
af90628992 flogheads: return an empty list when requesting heads of a non-existing filelog
2016-08-02 10:41:41 -07:00
Olivier Trempe
8005fc4b2f fileserverclient: add wireproto command for requesting a filelog's heads
Allowing discovery of all the heads of a filelog allows supporting some existing
Mercurial use cases, like viewing all the versions of a file in a UI.
2016-08-02 10:40:42 -07:00
Olivier Trempe
51e02acf2c filelogrevset: Return revset.baseset instead of plain list. Add test for kind in path.
2016-08-02 09:40:50 -07:00
Tony Tung
974339f97e [cdatapack] fix index retrieval bugs
Summary:
1. offsets are absolute byte offsets.  convert them to entry offsets to make the bisect code a lot simpler.
2. when writing entries to pack chain, we need to advance the pointer.
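
The conversion in point 1 amounts to the following Python 3 sketch. The parameter names `table_start` and `entry_size` are assumptions made for this example and are not taken from the C source.

```python
def entry_number(byte_offset, table_start, entry_size):
    """Convert an absolute byte offset into a zero-based entry number."""
    relative = byte_offset - table_start
    if relative < 0 or relative % entry_size != 0:
        raise ValueError("offset does not fall on an entry boundary")
    return relative // entry_size
```

Once offsets are entry numbers, the bisect code can compare and step by one instead of carrying the entry size through every calculation.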

Depends on D3627122

Test Plan: used in later diff.

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3637000

Signature: t1:3637000:1469741885:c2416a3b30e5bb2b64e6bb7062f4c02098be91eb
2016-08-01 14:18:35 -07:00
Tony Tung
641c5f4f01 [debugcommands] return the uncompressed delta length when iterating over a datapack
Summary:
When retrieving a delta chain, datapack.py uncompresses the delta chain data.  However, when iterating over the datapack, we get the compressed length.  This is not desirable as the output is no longer consistent.  This diff peeks into the lz4 header to get the uncompressed length when iterating.
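
A Python 3 sketch of the peek, under the assumption that each compressed blob starts with a 4-byte little-endian uncompressed size. That prefix layout is an assumption here; check it against the lz4 binding actually in use before relying on it.

```python
import struct


def uncompressed_length(blob):
    """Read the size prefix without decompressing the payload."""
    if len(blob) < 4:
        raise ValueError("blob too short to contain a size prefix")
    # Assumed layout: 4-byte little-endian length, then the compressed data.
    (length,) = struct.unpack_from("<I", blob, 0)
    return length
```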

Depends on D3627119

Test Plan:
```
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:e2ef218> hg debugdatapack d864669a5651d04505ec6e5e9dba1319cde71f7b --node f2e53f83c5dc806aa2eda87bb15fe0367baf3a7e

source/zippydb/tier_spec/tier_settings/zippydb.wildcard.tmpfs.zippydb_settings.cconf
Node                                      Delta Base                                Delta SHA1                                Delta Length
f2e53f83c5dc806aa2eda87bb15fe0367baf3a7e  0000000000000000000000000000000000000000  f32b366a6c44430df6526133f82f9638426ba9c5  37769
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:e2ef218> hg debugdatapack d864669a5651d04505ec6e5e9dba1319cde71f7b | tail -n 4

source/zippydb/tier_spec/tier_settings/zippydb.wildcard.tmpfs.zippydb_settings.cconf
Node          Delta Base    Delta Length
f2e53f83c5dc  000000000000  37769
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:e2ef218>
```

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3636945

Signature: t1:3636945:1469811243:b21d90d9599244ed4600c5336818b9a18eacf3ff
2016-08-01 14:16:29 -07:00
Tony Tung
20126b3bd8 [cdatapack] fix memory handling for cdatapack
Summary:
`->index_table` is not heap-allocated.  However, `->fanout_table` is and should be released.

Also added call to `close_datapack()` at the end of `cdatapack_dump.c`.

Depends on D3627122

Test Plan: valgrind is much happier now.

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3631368

Signature: t1:3631368:1469741779:e0c4e5d59c7e73c8aa3507901df3005383f0d3f5
2016-08-01 14:11:16 -07:00
Tony Tung
705c0731b6 [remotefilelog] initial checkin of a c datapack parser
Summary: This is not yet complete, but seems to be able to parse a data file.

Test Plan:
`/Users/tonytung/Library/Caches/CLion2016.2/cmake/generated/cdatapack-64b7828e/64b7828e/Debug/cdatapack_dump d864669a5651d04505ec6e5e9dba1319cde71f7b > /tmp/2`

compare it with the output of `hg debugdatapack --long d864669a5651d04505ec6e5e9dba1319cde71f7b > /tmp/1`

and it exactly matches.

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3627122

Signature: t1:3627122:1470085301:c9b9e8b2fa57bb7a09dd56d3c811ff8eadbb85ba
2016-08-01 14:05:37 -07:00
Tony Tung
9e557758b0 [datapack] add --node as a parameter to dump extra data about a node
Summary:
It obtains the deltachain and dumps the chain to the console.

Depends on D3627117.

Test Plan:
```
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:3266095> hg debugdatapack d864669a5651d04505ec6e5e9dba1319cde71f7b --node ba5fbf1aba48f25d46228626917b2705adc9e7c8

arcanist/__phutil_library_map__.php
Node                                      Delta Base                                Delta SHA1                                Delta Length
ba5fbf1aba48f25d46228626917b2705adc9e7c8  0000000000000000000000000000000000000000  df442a6f976b946c266f76b0f63a198e8aabf809  3993
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:3266095>
```

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3627119

Signature: t1:3627119:1469738313:d61726585a020ed4cbabbb1f623eb202ccd51b9f
2016-07-28 17:15:21 -07:00
Tony Tung
7111de0994 [datapack] allow for long hashes to be printed
Summary:
This will help verify the C datapack reader.

Depends on D3627112.

Test Plan:
```
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:4063a65> hg debugdatapack --long  d864669a5651d04505ec6e5e9dba1319cde71f7b  | head -n 15

arcanist/__phutil_library_map__.php
Node                                      Delta Base                                Delta Length
ba5fbf1aba48f25d46228626917b2705adc9e7c8  0000000000000000000000000000000000000000  1265

arcanist/canary/FacebookConfigeratorArcanistCanaryWorkflow.php
Node                                      Delta Base                                Delta Length
142f9991fca1a16c6544cb6e5a0071296e712268  0000000000000000000000000000000000000000  6546

arcanist/lint/FacebookConfigeratorLintEngine.php
Node                                      Delta Base                                Delta Length
c8630501c45f1bc1dc47df2ee2ad354993438cdb  0000000000000000000000000000000000000000  2811

arcanist/lint/linter/FbcodePyFlake8Linter.php
Node                                      Delta Base                                Delta Length
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:4063a65> hg debugdatapack   d864669a5651d04505ec6e5e9dba1319cde71f7b  | head -n 15

arcanist/__phutil_library_map__.php
Node          Delta Base    Delta Length
ba5fbf1aba48  000000000000  1265

arcanist/canary/FacebookConfigeratorArcanistCanaryWorkflow.php
Node          Delta Base    Delta Length
142f9991fca1  000000000000  6546

arcanist/lint/FacebookConfigeratorLintEngine.php
Node          Delta Base    Delta Length
c8630501c45f  000000000000  2811

arcanist/lint/linter/FbcodePyFlake8Linter.php
Node          Delta Base    Delta Length
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:4063a65>
```

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3627117

Signature: t1:3627117:1469735318:103e9a21be082749332572c9c4f9942ea9c1c248
2016-07-28 17:07:04 -07:00
Tony Tung
3da845968f [datapack] fix computation of the paged-in size
Summary:
It should include the filelen and the deltalen fields, which are
2 and 8 bytes.

Test Plan: visual.

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3627108

Signature: t1:3627108:1469742083:ffb59768906d9e5463065eec92e1c80cc8482884
2016-07-28 17:06:50 -07:00
Olivier Trempe
a2ce732706 fileserverclient: fixed lingering ssh connection due to reference cycle on pull operations
Calling wrapfunction on the remotefilepeer(sshpeer) object in exchangepull
function introduces a reference cycle. Hence, this object will not be deleted
until the process dies. This is not a big issue for processes with a short
lifetime (e.g. launched from the command line).
However, for persistent processes (e.g. TortoiseHg), this can lead to multiple
lingering ssh connections to the server (actually one per pull operation).

The fix is to not wrap the remotefilepeer._callstream. This method is defined
right into the remotefilepeer object. The required repo data is made available
in the remotefilepeer object by monkeypatching this object in the exchangepull
function.
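
The cycle can be reproduced in isolation with the Python 3 sketch below. `Peer` and `wrap_method` are hypothetical stand-ins for the peer object and for a wrapfunction-style helper, and the behaviour shown relies on CPython reference counting.

```python
import gc
import weakref


class Peer(object):
    def _callstream(self, command):
        return command


def wrap_method(obj, name, wrapper):
    original = getattr(obj, name)  # bound method: holds a reference to obj

    def wrapped(*args, **kwargs):
        return wrapper(original, *args, **kwargs)

    # obj -> instance dict -> wrapped -> original -> obj: a reference cycle.
    setattr(obj, name, wrapped)


def passthrough(orig, *args, **kwargs):
    return orig(*args, **kwargs)


def survives_without_gc(wrap):
    """Report whether a Peer outlives its last name when cycle GC is off."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        peer = Peer()
        if wrap:
            wrap_method(peer, "_callstream", passthrough)
        ref = weakref.ref(peer)
        del peer
        return ref() is not None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up the cycle created by this demonstration
```

With `wrap=True` the object is still alive after `del peer`, because only the cycle collector can reclaim it; with `wrap=False` reference counting frees it immediately.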
2016-07-22 13:47:02 -07:00
Olivier Trempe
0368ca40fc Fix filelogrevset not properly handling "kind" in path
2016-07-22 13:09:48 -07:00
Durham Goode
df65096278 pull: add more requirement checking
In some situations the remotefilelog setup logic could be called, wrapping
certain functions, and a later call on a repo that wasn't remotefilelog would
then run some remotefilelog code because of the wrapping.

Normally we take care of this by checking for the remotefilelog requirement. We
missed it in this one spot though.
2016-07-22 12:33:56 -07:00
Martin von Zweigbergk
0df928d828 shallowbundle: specifically compare instance to remotefilelog.remotefilelog
In two places, we were checking if a revlog was an instance of
revlog.revlog and, I think, treating it as a
remotefilelog.remotefilelog otherwise. I noticed this when I created
another non-revlog.revlog revlog in narrowhg and remotefilelog thought
it was a remotefilelog.remotefilelog. Let's specifically check if it's
a remotefilelog.remotefilelog instead.
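
The difference between the two checks, sketched in Python 3 with placeholder classes. These classes are stand-ins for this example only and are not the real Mercurial types.

```python
class revlog(object):
    pass


class remotefilelog(object):
    pass


class narrowlog(object):
    """Hypothetical third storage type that is neither of the above."""


def looks_like_remotefilelog_old(log):
    # Old check: anything that is not a revlog is assumed to be a remotefilelog.
    return not isinstance(log, revlog)


def is_remotefilelog(log):
    # New check: test for the specific type we care about.
    return isinstance(log, remotefilelog)
```

The old check misclassifies `narrowlog()` as a remotefilelog, while the explicit check does not.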
2016-07-15 23:53:09 -07:00
Durham Goode
073f8e3d22 repack: unmap memory occasionally to reclaim space
Summary:
When running large repack operations, the resident size of the process
could become quite large, since we're scanning in entire pack files. Linux/OSX
have api calls for telling the kernel it's ok to release some of that memory,
but those apis are not exposed to python.

So instead, let's unmap and remap the mmap's once a certain amount of data has
been read. I also tried changing the mmap accessors to use the file oriented api
(mmap.read(), mmap.seek(), etc) so we could switch to actual file handles during
repack, but it had a drastic effect on normal performance (repack took 1 hour
instead of a few minutes).
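
The unmap-and-remap idea looks roughly like this Python 3 sketch. The class name, the `limit` parameter and the `remaps` counter are hypothetical, and the threshold used by the actual change is not stated here.

```python
import mmap


class RemappingReader(object):
    """Read slices from a file via mmap, remapping after `limit` bytes."""

    def __init__(self, path, limit):
        self._file = open(path, "rb")
        self._limit = limit
        self._pagedin = 0  # bytes read since the last (re)map
        self.remaps = 0
        self._map = self._newmap()

    def _newmap(self):
        return mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)

    def read(self, offset, length):
        if self._pagedin > self._limit:
            # Dropping the mapping lets the kernel reclaim the pages
            # that earlier reads touched.
            self._map.close()
            self._map = self._newmap()
            self._pagedin = 0
            self.remaps += 1
        data = self._map[offset:offset + length]
        self._pagedin += len(data)
        return data

    def close(self):
        self._map.close()
        self._file.close()
```

Slicing the map keeps the fast random-access path, which is the part the file-oriented API gave up.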

Long term we should move all of this logic to c++ so we can use the more
powerful APIs.

Test Plan:
Did a full repack on a laptop and verified memory capped out at 2GB
instead of exceeding 5GB.

Reviewers: #sourcecontrol, ttung

Differential Revision: https://phabricator.intern.facebook.com/D3545171
2016-07-12 11:46:48 -07:00
Durham Goode
77192943d4 repack: handle race condition with background repacks
Summary:
There was a race condition where if a repack is running and another hg process
launches, the new process will only see the original packs, and not any of the
new packs (even though the source blobs are being deleted from disk by the
repack).

The fix is to allow our pack store to refresh its list of packs every so often.
In this particular implementation we do it at most every 100ms. A more robust
strategy would be to group key misses and only check for new packs at the end
once we have a list of all the misses, but this would require significant
refactoring to make everything grouped. This case should only ever happen during
repacks, so it should almost never occur more than once during a command, so the
100ms version is probably good enough.
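
A Python 3 sketch of the rate-limited refresh. `PackStore`, `listpacks` and the injectable `clock` are hypothetical names for this example, and packs are modelled as plain dictionaries.

```python
import time

REFRESH_INTERVAL = 0.1  # seconds: the 100ms mentioned above


class PackStore(object):
    def __init__(self, listpacks, clock=time.monotonic):
        self._listpacks = listpacks  # callable returning a list of dict "packs"
        self._clock = clock
        self._packs = listpacks()
        self._lastrefresh = clock()

    def _lookup(self, key):
        for pack in self._packs:
            if key in pack:
                return pack[key]
        return None

    def get(self, key):
        value = self._lookup(key)
        if value is None and self._clock() - self._lastrefresh >= REFRESH_INTERVAL:
            # Miss: a repack may have written new packs since we last looked.
            self._packs = self._listpacks()
            self._lastrefresh = self._clock()
            value = self._lookup(key)
        if value is None:
            raise KeyError(key)
        return value
```

Only misses pay for a rescan, and at most one rescan happens per interval, so the common hit path is unchanged.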

Test Plan:
Ran `hg up && hg pull && sleep 0.2 && hg up master` in a loop with a
break point in the refresh code and caught it executing in a situation where the
background repack had removed the original sources and put them in a new pack.
Verified that it loaded the data from the new pack correctly.

Reviewers: #mercurial, ttung, lcharignon

Reviewed By: lcharignon

Subscribers: lcharignon

Differential Revision: https://phabricator.intern.facebook.com/D3524314

Signature: t1:3524314:1467907680:85be07ad953811000c468852eb0626f4d8b53a13
2016-07-07 15:59:06 -07:00
Durham Goode
15fcba5c21 cachegroup: fix directory permissions for shared cache
Summary:
The shared cache needs to be completely g+ws so that all members of the group
can write to each directory in it. The old code only applied g+ws to the leaf
directories, so other users aren't able to write to non-leaf directories (for
example, with hgcache/7a/83beca8.../, others couldn't write to 7a/)
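
On a POSIX system the intended behaviour can be sketched in Python 3 as below. The function name is hypothetical, and this is not the code from the change itself.

```python
import os
import stat


def mkdirs_group_writable(root, relpath):
    """Create root/relpath and apply g+ws to every directory along the way."""
    current = root
    for part in relpath.split("/"):
        if not part:
            continue
        current = os.path.join(current, part)
        if not os.path.isdir(current):
            os.mkdir(current)
        mode = stat.S_IMODE(os.stat(current).st_mode)
        # g+w lets group members write; g+s makes new children inherit the group.
        os.chmod(current, mode | stat.S_IWGRP | stat.S_ISGID)
    return current
```

Applying the mode inside the loop, not only to the final path, is what covers intermediate directories such as `7a/`.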

Test Plan:
Updated a test to view group permissions for the intermediate
directories

Reviewers: #mercurial, ttung, simpkins

Reviewed By: simpkins

Subscribers: lcharignon, net-systems-diffs@, simpkins, mbolin

Differential Revision: https://phabricator.intern.facebook.com/D3523918

Signature: t1:3523918:1467930221:452b11b56a2e69896bf8d2cd0acd7131b41f90d8
2016-07-07 15:58:59 -07:00
Durham Goode
7c44b94bb0 repack: fix repack heuristic to account for unusual copies
Summary:
Previously, the history repack logic would stop traversing history for a given
filename once it encountered a rename. This isn't quite right, since the history
could eventually be traversed back to the original file, where we'd need to
continue processing. So now we check for when the copyfrom becomes the filename.

Also, if the copy source file and the copy target file have two nodes with the
same value, we would not process the one in the copy target (since it was marked
do not process). We fix this by explicitly checking if the node is one of the
known entries in the file being processed.

Test Plan: Added a test

Reviewers: #mercurial, ttung, mitrandir, rmcelroy

Reviewed By: mitrandir, rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3523215

Signature: t1:3523215:1467828169:bd487c8f296352c1a1b9355cb55f9001bd5e19a9
2016-07-07 15:58:47 -07:00
Martin von Zweigbergk
adcdb9289c commands: tell @command decorator about arguments
Before this patch, debugremotefilelog and verifyremotefilelog would
crash if not given a path. Also, many commands would accept arguments
they then ignored.
2016-06-30 10:14:17 -07:00
Martin von Zweigbergk
c9390fde26 debugdatapack: make function name match command
2016-06-30 10:11:37 -07:00
Olivier Trempe
d8d662c766 shallowutil: windows compatibility for readonly files
On Windows, os.rename cannot rename readonly files and cannot overwrite
destination if it already exists. Create small wrappers to handle these cases.
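
A wrapper along these lines might look like the Python 3 sketch below. The function name is hypothetical and the sequence is not atomic, so treat it as an outline, not as the code in shallowutil.

```python
import os
import stat


def rename_readonly(source, destination):
    """Rename source over destination even if either file is read-only."""
    mode = stat.S_IMODE(os.stat(source).st_mode)
    # Make the source writable so the rename is allowed on Windows.
    os.chmod(source, mode | stat.S_IWRITE)
    if os.path.exists(destination):
        # Windows refuses to overwrite, so remove the old file first.
        os.chmod(destination, stat.S_IWRITE | stat.S_IREAD)
        os.unlink(destination)
    os.rename(source, destination)
    # Restore the original (possibly read-only) permissions.
    os.chmod(destination, mode)
```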
2016-07-05 00:30:42 -07:00
Durham Goode
0b111a5610 basestore: fix incorrect variable name
2016-06-22 15:55:38 -07:00
Laurent Charignon
9a1fb623cc shallowutil: add missing import
Summary:
Before this patch, we were not importing mercurial.error, this was
causing a crash when calling error.Abort. This patch adds the missing import.

Test Plan: Tests pass, and added a new test

Reviewers: durham

Differential Revision: https://phabricator.intern.facebook.com/D3457086
2016-06-20 15:18:14 -07:00
Durham Goode
d7722fcc7c stores: reverse order of cache and local stores
In the old days we would check the cache first, then the local store. This was
important because the cache is more likely to contain correct data (since it
comes from the final pushed version of commits), versus local data which may
contain information about stripped commits.

As part of the big store refactor, this order got switched unintentionally. So
let's switch it back.
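
Why the order matters can be seen in a small Python 3 sketch, with stores modelled as dictionaries. `UnionStore` here is a simplified stand-in written for this example.

```python
class UnionStore(object):
    """Look a key up in each store in order and return the first hit."""

    def __init__(self, *stores):
        self._stores = stores

    def get(self, key):
        for store in self._stores:
            try:
                return store[key]
            except KeyError:
                continue
        raise KeyError(key)
```

With `UnionStore(cache, local)` a key present in both resolves to the cache's copy, and the local store is consulted only for keys the cache lacks.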
2016-06-16 10:22:31 -07:00
Jeroen Vaelen
07efaadb9d [remotefilelog] use hashlib to compute sha1 hashes
Summary:
hg-crew's c27dc3c3122 and c27dc3c3122^ were breaking our extensions:

```
$ hg log -r c27dc3c3122^
changeset:   9010734b79911d2d2e7405d91a4df479b35b3841
user:        Augie Fackler <raf@durin42.com>
date:        Thu, 09 Jun 2016 21:12:33 -0700
s.ummary:     cleanup: replace uses of util.(md5|sha1|sha256|sha512) with hashlib.\1
```

```
$ hg log -r c27dc3c3122
changeset:   0d55a7b8d07bf948c935822e6eea85b044383f00
user:        Augie Fackler <raf@durin42.com>
date:        Thu, 09 Jun 2016 21:13:23 -0700
s.ummary:     util: drop local aliases for md5, sha1, sha256, and sha512
```

I did a grep over facebook-hg-rpms to see what was affected:
```
$ grep "util\.\(md5\|sha1\|sha256\|sha512\)" -r ~/facebook-hg-rpms
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/basestore.py:            sha = util.sha1(filename).digest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/basestore.py:                sha = util.sha1(filename).digest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/shallowutil.py:    pathhash = util.sha1(file).hexdigest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/shallowutil.py:    pathhash = util.sha1(file).hexdigest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/debugcommands.py:    filekey = util.sha1(file).hexdigest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/historypack.py:        namehash = util.sha1(name).digest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/historypack.py:        node = util.sha1(filename).digest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/historypack.py:        files = ((util.sha1(filename).digest(), offset, size)
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/fileserverclient.py:    pathhash = util.sha1(file).hexdigest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/fileserverclient.py:    pathhash = util.sha1(file).hexdigest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/basepack.py:        self.sha = util.sha1()
/home/jeroenv/facebook-hg-rpms/remotefilelog/tests/test-datapack.py:        return util.sha1(content).digest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/tests/test-histpack.py:        return util.sha1(content).digest()
Binary file /home/jeroenv/facebook-hg-rpms/hg-crew/.hg/store/data/mercurial/revlog.py.i matches
/home/jeroenv/facebook-hg-rpms/fb-hgext/sparse.py:            return util.sha1(fh.read()).hexdigest()
/home/jeroenv/facebook-hg-rpms/fb-hgext/sparse.py:        sha1 = util.sha1()
/home/jeroenv/facebook-hg-rpms/fb-hgext/sparse.py:        sha1 = util.sha1()
/home/jeroenv/facebook-hg-rpms/fb-hgext/sparse.py:        sha1 = util.sha1()
/home/jeroenv/facebook-hg-rpms/fb-hgext/sparse.py:    sha1 = util.sha1()
/home/jeroenv/facebook-hg-rpms/mutable-history/hgext/simple4server.py:        sha = util.sha1()
/home/jeroenv/facebook-hg-rpms/mutable-history/hgext/evolve.py:        sha = util.sha1()
```
This diff is part of the fix.
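
The replacement at each call site follows the pattern below, shown as a Python 3 sketch in which the input must be bytes. The helper name `pathhash` mirrors the variable in the grep output but is otherwise illustrative.

```python
import hashlib


def pathhash(filename):
    # Before: util.sha1(filename).hexdigest()
    # After: call hashlib directly, since the util alias no longer exists.
    return hashlib.sha1(filename).hexdigest()
```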

Test Plan:
Ran the tests.
```
$MERCURIALRUNTEST -S -j 48 --with-hg ~/local/facebook-hg-rpms/hg-crew/hg
```

Reviewers: #sourcecontrol, ttung

Differential Revision: https://phabricator.intern.facebook.com/D3440041

Tasks: 11762191
2016-06-15 15:48:16 -07:00
Durham Goode
d860baf210 cachegroup: fix pack path use of cachegroup
Summary:
The pack path logic did not use the correct unix group when
remotefilelog.cachegroup was specified. This fixes that.

Test Plan:
I manually tested it by deleting a pack dir and running repack. This
is hard to create an automated test for since the feature isn't really cross
platform, and we don't have a way to know what groups they have on their
machine.

Reviewers: #sourcecontrol, ttung, rmcelroy

Reviewed By: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3400756

Tasks: 11584114

Signature: t1:3400756:1465342537:ed023f6dc830117df5e85e294a41486f072714c9
2016-06-08 09:09:06 -07:00
Durham Goode
7cb6908a76 copyfrom: fix copy metadata in local blobs
The new pack stores return None for the copyfrom field, instead of the expected
''. We need the local file blob generator to handle this case, instead of just
putting None in the copyfrom field.
2016-06-06 14:16:06 -07:00
Durham Goode
3d127ad4a3 repack: cleanup empty directories
Summary:
Now that repack can clean up old remotefilelog blobs, let's have it also delete
any empty directories that get left behind.

Test Plan: Updated an existing test to cover it

Reviewers: mitrandir, lcharignon, #sourcecontrol, ttung, simonfar

Reviewed By: simonfar

Subscribers: simonfar

Differential Revision: https://phabricator.intern.facebook.com/D3385546

Signature: t1:3385546:1464972782:5ca63cf0a5589bb8a537957f50b4bc5ec4e0f0f5
2016-06-06 10:04:18 -07:00
Durham Goode
6f3d6c53f5 utils: unify cachepath access through a util function
Summary:
Previously a bunch of different places accessed the cachepath through ui.config
directly. This is a problem because we need to resolve any environment variables
in the path, and some spots didn't do this. So let's unify all accesses through
a helper function that takes care of the environment variables.
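
A minimal sketch of such a helper, for illustration only. The name `getcachepath` and the dict-of-dicts `config` argument are assumptions standing in for the real ui object; the actual helper may differ.

```python
import os


def getcachepath(config, section="remotefilelog", name="cachepath"):
    """Return the configured cache path with env vars and ~ expanded.

    `config` is a hypothetical dict-of-dicts standing in for ui.config.
    """
    raw = config.get(section, {}).get(name)
    if not raw:
        raise ValueError("%s.%s is not configured" % (section, name))
    # Resolve $VAR references first, then a leading ~.
    return os.path.expanduser(os.path.expandvars(raw))
```

Every caller then goes through this one function, so no call site can forget the expansion step.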

Test Plan: Added a test

Reviewers: mitrandir, lcharignon, #sourcecontrol, ttung, simonfar

Reviewed By: simonfar

Subscribers: simonfar

Differential Revision: https://phabricator.intern.facebook.com/D3385583

Signature: t1:3385583:1464971813:5b9ee5ed3d6ff9f1a78cb9e0269e433844758c9d
2016-06-03 09:45:58 -07:00
Durham Goode
01595d2684 repack: allow background repacks to repack non-pack stores
Previously, background repacks would only repack pack files, which meant there
was no automated way to repack loose remotefilelog files without manually
running 'hg repack'. This allows incremental repacks to also pack the loose
files.

It also changes the config knob for background repacks, so we can enable pack
file usage without the server having to support it just yet.
2016-06-01 10:06:35 -07:00
Durham Goode
8b6c78b675 unionstore: allow incomplete delta chains
A previous patch allowed the unionmetadatastore to return partial histories if a
certain config was provided. This allowed repack to get partial history information.
This patch does the same for deltachains. This isn't currently used, but will be
used in the future to allow repacking packs with partial delta chains by just
lifting them out of one pack and putting them directly in another.
2016-05-26 02:15:46 -07:00
Durham Goode
2450b3f243 unionstore: allow partial history output from union stores
Previously, a union store required that it be able to compute the entire history
of the revision. This caused problems in repack, since it may only have a
partial history. Instead of throwing a KeyError and giving the repack algorithm
no history information at all, we add a config knob to let the repack logic
specify that it's ok with partial histories.

The next patch will do the same for contentstore.
2016-05-26 02:13:53 -07:00
Durham Goode
a2646d8da9 packs: change LookupError to KeyError
We've unified on KeyError being the error thrown when the pack is missing the
desired filename+filehash, but there were a few old places still using
LookupError. This patch changes them to also be KeyError.

This fixes an issue where a repack could throw a LookupError when it only had a
partial history of a file. Now that we throw a KeyError, the exception is caught
and handled appropriately.
2016-05-26 02:07:11 -07:00
Durham Goode
304c6f5bd0 pack: move common pack logic into basepack
Summary:
This moves the common logic from datapack and historypack into a common
basepack. At the moment the only common logic is the constructor, which handles
version checking, fanout initialization, and mmap stuff.

Test Plan: Ran the tests

Reviewers: mitrandir, #mercurial, ttung, mjpieters

Reviewed By: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3306558

Signature: t1:3306558:1463474571:35d3d2e71849b8111e5455da2dd4810725a35523
2016-05-24 02:15:58 -07:00
Durham Goode
bb3a62a266 pack: move common pack store logic to basepackstore
Summary:
This takes the duplicate logic from datapackstore and historypackstore and moves
it into a common base class.

Test Plan: Ran the tests

Reviewers: ttung, mitrandir, #mercurial, mjpieters

Reviewed By: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3306551

Signature: t1:3306551:1463474958:ddd36a43c2a3cbb34254c2b0153c0f3c5d431edb
2016-05-24 02:15:58 -07:00
Durham Goode
5343c35df7 pack: make mutablebasepack the base for mutablehistorypack
Summary:
Now that we have a mutablebasepack base class, we can get rid of all the
redundant logic in mutablehistorypack. This also has the side effect of making
the historypack's fanout table dynamically sized, just like the datapack already
does. That required a few changes to the historypack reader class as well.

Test Plan: Ran the tests

Reviewers: mitrandir, #mercurial, ttung

Differential Revision: https://phabricator.intern.facebook.com/D3306546
2016-05-24 02:15:58 -07:00
Martin von Zweigbergk
14962d6e74 fileserverclient: make iterbatch() case work with new store
The iterbatch() handling added in f93aa99d4f1e (fileserverclient: use
new iterbatch() method, 2016-03-22) was broken by 31e88bf6faf0 (store:
change fileserviceclient to write via new store, 2016-04-04). Fix it
by copying the pattern introduced elsewhere in that change.
2016-05-18 22:39:20 -07:00
Durham Goode
7cce219abb pack: move common logic out of mutabledatapack into base class
Summary:
mutabledatapack and mutablehistorypack share a lot of common code, especially
around the index and fanout table. Let's move much of the code to a common
mutablebasepack class and out of mutabledatapack. In the next patch I will make
mutablehistorypack a subclass of mutablebasepack and delete all the duplicate
logic.

Test Plan: Ran the tests

Reviewers: mitrandir, #mercurial, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: quark, rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3306542

Signature: t1:3306542:1463611860:16bc68416c9bbed87748a50f55a3bac7c618fdf1
2016-05-20 09:31:37 -07:00
Durham Goode
142c8f9f66 packfetch: remove copy metadata from data before sending over the wire
Summary:
In normal Mercurial, the filelog entry's contents contains extra metadata that
stores the copy source. In the new pack format, that information is stored in
the history store, not in the data store. Therefore we need to change the server
side logic that responds to requests for packs to move that information over to
the history side before it sends the data.

Test Plan: Added a test

Reviewers: ttung, mitrandir, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3306539

Signature: t1:3306539:1463609462:0c1e33e0892f96effcc96c8f78401cf0d8ab5cbd
2016-05-20 09:31:34 -07:00
Durham Goode
b6871085ab repack: add incremental repacking for history packs
Summary:
Previously we only had incremental repacking for data packs. This patch adds it
for history packs as well. The algorithm here is simpler, since the amount of
history data is generally much smaller than the amount of delta data.

The algorithm is basically: if there are 2 things bigger than 100MB, repack
them; else repack up to 100MB of smaller things.
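
The selection rule above can be sketched roughly as follows. This is not the code from the diff: the function name and the `sizes` mapping are hypothetical, and it assumes "2 things" means "at least two" and that smaller packs are taken smallest first.

```python
LIMIT = 100 * 1024 * 1024  # the 100MB threshold from the description


def choose_history_packs(sizes, limit=LIMIT):
    """Pick history packs to repack. `sizes` maps pack name -> bytes."""
    large = sorted(name for name, size in sizes.items() if size > limit)
    if len(large) >= 2:
        # Two or more big packs: repack those together.
        return large
    chosen, total = [], 0
    # Otherwise gather small packs, smallest first, up to the limit.
    for name, size in sorted(sizes.items(), key=lambda kv: (kv[1], kv[0])):
        if size > limit:
            continue
        if total + size > limit:
            break
        chosen.append(name)
        total += size
    return chosen
```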

The datapack hashes changed because having the history available during a repack
allows us to make different decisions about delta ordering, etc.

Test Plan: Updated the tests

Reviewers: mitrandir, #mercurial, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3306535

Signature: t1:3306535:1463696613:f40ed10c9dfed40d7bc455582592a7aed108ec3a
2016-05-20 09:31:31 -07:00
Durham Goode
93fbca3e39 repack: add automatic incremental background repacking after pull
Summary:
This runs the incremental background repacking logic after hg pull.

As part of adding tests, I also added a 'hg debugwaitonrepack' function that
will wait until any pending repack is done before returning, so the tests can
wait on repacks without so many sleeps.

Test Plan: Adds a test

Reviewers: mitrandir, #mercurial, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3306526

Signature: t1:3306526:1463696933:9e27daf0c08076468e8f365a3c372fa7d4f56bde
2016-05-20 09:31:28 -07:00
Durham Goode
7227563c61 repack: add incremental repack
Summary:
This adds a --incremental flag to the hg repack command. This flag causes repack
to look at the distribution of pack files in the repo and performs the most
minimal repack to keep the repo in good shape. Currently it's only implemented
for datapacks.

The new remotefilelog.datagenerations config contains a list of the sizes for
the different generations of pack files. For instance:

  [remotefilelog]
  datagenerations=1GB
    100MB
    1MB

Designates 4 generations - packs over 1GB, packs over 100MB, packs over 1MB, and
implicitly packs under 1MB. The incremental algorithm will try to keep each
generation to less than 3 pack files (prioritizing the larger generations
first). When performing a repack it will grab at least 2 packs, and will grab
more if the total pack size is less than 100MB (since repacking at that level is
pretty cheap).
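
A rough sketch of how the generations could drive pack selection. The helper names, the size parser, and the smallest-first ordering are assumptions made for illustration; only the thresholds and the "fewer than 3 packs, grab at least 2, keep grabbing while under 100MB" rules come from the description above.

```python
UNITS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}
CHEAP = 100 * 1024 ** 2  # repacking below this total is considered cheap


def parse_size(text):
    """Turn a string such as '100MB' into a byte count."""
    text = text.strip().upper()
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return int(float(text[:-len(suffix)]) * factor)
    return int(text)


def bucket_packs(sizes, generations):
    """Group pack names by generation, largest generation first."""
    limits = sorted((parse_size(g) for g in generations), reverse=True)
    buckets = [[] for _ in range(len(limits) + 1)]
    for name, size in sorted(sizes.items()):
        for i, limit in enumerate(limits):
            if size > limit:
                buckets[i].append(name)
                break
        else:
            buckets[-1].append(name)  # implicit smallest generation
    return buckets


def choose_packs(sizes, generations, maxcount=3):
    """Return the packs to repack, or [] if every generation is in shape."""
    for bucket in bucket_packs(sizes, generations):
        if len(bucket) < maxcount:
            continue
        ordered = sorted(bucket, key=lambda n: (sizes[n], n))
        chosen = ordered[:2]  # always grab at least two packs
        total = sum(sizes[n] for n in chosen)
        for name in ordered[2:]:
            if total >= CHEAP:
                break
            chosen.append(name)
            total += sizes[name]
        return chosen
    return []
```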

I have no idea if this is a good algorithm. We'll have to see and iterate.

Test Plan: Adds a test

Reviewers: mitrandir, #mercurial, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3306523

Signature: t1:3306523:1463697129:c87f4a397ef357b5ca4a80d01e9a6ca4d61f9d3d
2016-05-20 09:31:25 -07:00
Durham Goode
a36b9bd403 repack: move repack logic to static functions
Summary:
A future patch will be adding incremental repack, so let's move our repack logic
to the repack module so it's easier to refactor and extend.

Also adds a message for when a background repack kicks off (since we'll be
calling that from other places eventually).

Test Plan: Adds a test

Reviewers: mitrandir, #mercurial, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3306521

Signature: t1:3306521:1463602886:cece3d517f0672b829702866482c902812f9ae27
2016-05-20 09:31:22 -07:00
Durham Goode
39fb6f6f14 packs: prevent creation of empty packs
Summary:
Previously, creating a mutable pack and then not adding anything to it would result
in a 1 byte pack file. This patch fixes it so it doesn't produce any pack file.
A future patch found this bug and includes a test that executes this code path.

Test Plan: Future patch adds a test

Reviewers: mitrandir, #mercurial, ttung, rmcelroy

Reviewed By: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3306520

Signature: t1:3306520:1463602354:f47e9478ea686b0b86525b5979dec9ce56301b2a
2016-05-20 09:31:19 -07:00
Durham Goode
02c488a9f6 util: close temporary file descriptor
When we moved away from the core mercurial atomic temp logic, we forgot to close
the file descriptor that was opened by mkstemp.
2016-05-20 08:41:04 -07:00
Olivier Trempe
9f5acf47bd windows: chmod file to be writable before deleting
On windows, you cannot delete a file if it is readonly. Since remotefilelog
makes heavy use of readonly files, we need to chmod them to be writable before
deleting them.

We only do this on windows, since on unix based systems if the current user is
not the owner of the file (and is just accessing the file via a shared group),
they cannot chmod the file.
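
As an illustration of the idea (the function name and the `iswindows` parameter are hypothetical, not taken from the patch):

```python
import os
import stat


def unlink_readonly(path, iswindows=(os.name == "nt")):
    """Delete `path`, clearing the read-only bit first on Windows."""
    if iswindows:
        # Windows refuses to delete read-only files, so add the write bit.
        mode = os.stat(path).st_mode
        os.chmod(path, mode | stat.S_IWRITE)
    # On Unix the chmod is skipped: a user who only has group access
    # cannot chmod the file, but can still unlink it if the directory allows.
    os.unlink(path)
```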

Renames some variables to disambiguate the stat module as well.
2016-05-20 08:37:49 -07:00
Olivier Trempe
66b95b7f12 windows: PWD environment variable not available
The PWD environment variable isn't available on windows, so let's just not
bother checking the current directory for repos in that case.
2016-05-20 08:33:34 -07:00
Olivier Trempe
bf52350bbe windows: os.getuid not supported
Windows does not support os.getuid, so avoid using it.
2016-05-20 08:08:57 -07:00
Olivier Trempe
abf6481106 windows: grp module not supported
On windows the grp module is not present, so we need to avoid importing it. This
means the shared group feature of remotefilelog is not supported on windows.
2016-05-20 08:08:57 -07:00
Olivier Trempe
a1a32e5f37 windows: use binary mode for reading and writing files
On windows, there is text mode and binary mode for reading and writing files.
Since we're dealing with files as just blob data, we always want binary mode.
2016-05-20 08:08:57 -07:00
Martin von Zweigbergk
41226c0217 debugindex: use normal handling for --dir
'debugindex --dir' is used for looking up the revlog of a directory
manifest, so it should use the normal (local file system) handling,
not remotefilelog.
2016-01-04 21:32:21 -08:00
Durham Goode
56411349f7 datapack: make fanout size dynamic
Summary:
Previously all fanout tables were 2^16 in length. For small packs, this resulted
in very sparse fanout tables that had to be linearly scanned to find the end of
the search bounds, which was slow.

With this patch, the data index now dynamically chooses what fanout table size
to use.  If the pack has over 2^16 / 10 entries, we use 2^16, otherwise we use
2^8. The reasoning is in the code.
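
The size choice can be sketched as below. `fanout_size` follows the rule stated above; `fanout_slot` assumes that the fanout slot is the leading one or two bytes of the node hash, which is an assumption for illustration rather than something this message states.

```python
SMALL_FANOUT = 2 ** 8
LARGE_FANOUT = 2 ** 16


def fanout_size(entrycount):
    """Use the large table only when the pack has over 2^16 / 10 entries."""
    if entrycount > LARGE_FANOUT // 10:
        return LARGE_FANOUT
    return SMALL_FANOUT


def fanout_slot(node, size):
    """Map a binary node to its fanout slot (assumed: leading hash bytes)."""
    prefixlen = 2 if size == LARGE_FANOUT else 1
    return int.from_bytes(node[:prefixlen], "big")
```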

The patch is a bit large because we had to take a bunch of constants and
duplicate them, and change their accessors to access them through member
variables.

Test Plan: Added a test

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3279000

Signature: t1:3279000:1463170357:9746d19dde14743bac9a8c40cafbc618504c420f
2016-05-16 10:59:09 -07:00
Durham Goode
c9621d3d1a repack: don't require complete history during data repack
Summary:
Previously, when repacking deltas we would require a full history of the node so
we could order the hashes optimally. In some situations though, we don't have
the full history available (like if we're only repacking a subset of packs), so
we need to be able to repack even without full history.

This patch handles the case where a given delta doesn't have history
information. We just store it as a full text.

This becomes useful in an upcoming series that will introduce incremental
packing that only packs a subset of the packs.

Test Plan: Added a test

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3278346

Signature: t1:3278346:1463086170:54c0fbefe78f9cafa7efc4b6f037887a924ab4a5
2016-05-16 10:59:09 -07:00
Durham Goode
9c3aa14c26 repack: remove copyfrom detection
Summary:
In an old version of the code, we would walk the entire history of a node during
data repack, which meant we had to keep track of when we saw a rename, and stop
walking there. Since then, we've changed the code to no longer walk the entire
history, and instead walk just the parts it was told to repack for this
particular file. This means we no longer ever walk across a copy, and therefore
don't need this copy detection logic.

Test Plan: Ran the tests

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3278443

Signature: t1:3278443:1463086137:c6d9eb6637bf3b8636a3df7e531f265d51cab0de
2016-05-16 10:59:09 -07:00
Durham Goode
12ec6ebcb4 store: force remote stores to always check the shared cache
Summary:
In a remotecontent/metadatastore a `get` request first runs prefetch, then reads
the resulting data from the shared cache store. Before this patch, the prefetch
would not download a value if it existed in the local data store, which means
nothing would be added to the shared cache store, and the `get` would fail. This
patch changes the remote stores to always prefetch based only on the contents of
the shared cache, so data will always be written.

This issue showed up when attempting to repack pack files that contained
references to nodes that were in the local store (which it didn't have access
to) but not the shared cache.

Test Plan:
Manually verified my issue disappeared. This isn't actually an issue
anymore, since future patches refactor the way repack works to not rely on the
remote stores, so this shouldn't be hit again. But it's a safe change
regardless.

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3278362

Signature: t1:3278362:1463086099:987d2fdd1c75e518f815c3159473e8cb22a15ba0
2016-05-16 10:59:09 -07:00
Durham Goode
e13e9ef243 historypack: fix handling of section lookup key errors
Summary:
In the old days _findsection would return None if the section wasn't found. We
have since changed it to throw a KeyError like all the other operations in the
packs. We need to update getmissing to eat that error like usual.

Test Plan: ran the repack that was failing, it succeeded.  Added a unit test

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3277245

Signature: t1:3277245:1463085831:0b32f852f49bd45ef2dfd3298313b9c2b87f75b6
2016-05-16 10:59:09 -07:00
Durham Goode
a5ed23b7ca fileserverclient: separate data and history in prefetching
Summary:
Previously the fileserverclient logic only checked the data store when
determining whether to prefetch a given key or not. This meant that if the file
was in the data store but not the history store, it could result in a history
lookup failing to prefetch.

The fix is to separate the notions of data and history in the prefetch logic. By
default we just check the data store like before, but an optional argument
allows us to specify checking the history store as well (and we change the
remotemetadatastore to pass that argument).

Test Plan:
Ran the tests. Also ran the repack scenario in my large repo that
reproed the issue. In another diff, I'm going to come back and add a suite of
tests around the various repack permutations that I've seen cause issues.

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3277239

Signature: t1:3277239:1463085690:49ad478048cd13836b60f7ac9190e2294f5e9c64
2016-05-16 10:59:09 -07:00
Durham Goode
5bb799ed73 prefetch: add progress to pack prefetch
Summary:
Add a simple progress bar on the client when receiving a pack from the
server.

Test Plan: Ran it

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3277265

Signature: t1:3277265:1463085643:aa0960092958c4f56a6d1c3a3901348dba48aa91
2016-05-16 10:59:09 -07:00
Durham Goode
5e4370b46d packs: add debug commands to view pack contents
Summary: Some simple debug commands to print the contents of each pack.

Test Plan: Ran it manually, and added a simple test

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3277233

Signature: t1:3277233:1463085196:c54fc875d536a96150bb1461b77247a5d7a9402c
2016-05-16 10:59:09 -07:00
Durham Goode
ad5352432e packs: change iterkeys() to __iter__
Summary:
In a future patch we'll add debug commands to view the contents of packs. Let's
repurpose iterkeys to just be __iter__ and have it return most of the data about
each entry.

Test Plan: Ran the tests

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Subscribers: mitrandir, quark

Differential Revision: https://phabricator.intern.facebook.com/D3277225

Signature: t1:3277225:1463170015:c1acbc7f435f1cf8d07fea4f32bf742a22de5716
2016-05-16 10:59:09 -07:00
Durham Goode
e617c24532 datapack: add index marker for no delta base
Summary:
Previously, we assumed every delta in a pack had a complete chain in that pack,
so the index had no way to indicate a deltabase offset that didn't exist in the
pack.

Let's add a new marker to indicate that the delta base doesn't exist in this
pack.

Test Plan: Tested in my large repo repack scenario.  Added a test

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3277222

Signature: t1:3277222:1463084346:1687cfa174e98c2cf3022de9e9c3808881f689cd
2016-05-16 10:59:09 -07:00
Durham Goode
05ceb8b419 store: basic wire protocol for bundle delivery
Summary:
This adds a new wire protocol command to allow clients to request a set
of file contents and histories from the server and receive them in pack format.
It's pretty simple and always returns all the history for every node requested
(which is a bit overkill), but it's labeled v1 and we can iterate on it.

Test Plan: Added a test

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3277212

Signature: t1:3277212:1463421279:459cc84265502175b47df293647aab7e7a830185
2016-05-16 10:59:09 -07:00
Durham Goode
2a938a761c store: add copyfrom information to history index
Summary:
Previously we were throwing away copy information when we repacked things into
pack files. The hope was that we could store copy information somewhere else,
and keep the history pack using fixed length entries. Since storing copy
information elsewhere is a long ways off, let's just go ahead and put copy info
in the pack file.

This makes the entries non-fixed length, which means any iteration over them has
to read the length of each entry. This also affects the historypack filename
hashes since they are content based, so the tests had to change.

This matches the old remotefilelog behavior more closely (which is why no code
had to change outside the pack logic).

Test Plan: Added a test

Reviewers: #mercurial, mitrandir, ttung

Reviewed By: mitrandir

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3262185

Signature: t1:3262185:1462562602:935683692276c7fa569d381b18aa3b18656793b1
2016-05-16 10:59:09 -07:00
Durham Goode
24c9c438d3 history: remove dependency on 80 character pack entry
Summary:
In a future diff we will be changing the format of history pack entries
to have a variable length file path in them. Before doing that, let's remove the
places that depend on each entry being exactly 80 characters.

Test Plan: Ran the tests

Reviewers: #mercurial, mitrandir, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3262154

Signature: t1:3262154:1462820474:b7f5666001b3b7f8863b4de4826266204f3e87aa
2016-05-16 10:59:09 -07:00
Durham Goode
e4fb7d66bb history: remove getparents and getlinknode apis
Summary:
These APIs weren't actually used, and the questions can be answered via
the existing getancestors() api anyway.

They were originally put in place because they are the type of question that
doesn't require the full ancestor tree, so we could answer them without doing a
traversal. In an upcoming patch we add the concept of copyfrom back into the
historypack, and getparents becomes confusing since it doesn't expose knowledge
of copy information. So I just decided to delete it all until we need it.

In the future we may want a 'gethistoryinfo(filename, node)' api that just
returns (p1, p2, linknode, copyfrom), to fulfill that original need of history
information without a full ancestor traversal.

Test Plan: Ran the tests

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3261734

Signature: t1:3261734:1462413665:987c4703e53468a75346aa323188107a5c070fde
2016-05-16 10:59:09 -07:00
Durham Goode
e7a294e65f utils: change writefile to use custom atomic temp logic
Summary:
The mercurial atomic temp logic makes some assumptions about modes and copies
the mode bits from the existing file. If the existing file is readonly, that
means the atomic temp fails to open the file for writing after it's initially
created.

Since we only actually need like 4 lines from the atomic temp code, I just
implemented it on our end, with custom logic for mode handling.
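
A sketch of what such a hand-rolled atomic write might look like. This is not the code from the diff: the signature and the `readonly` flag are assumptions, and replacing an existing read-only file this way is only expected to work on POSIX systems.

```python
import os
import tempfile


def writefile(path, content, readonly=False):
    """Atomically write `content` (bytes) to `path` via a temp file."""
    dirname, filename = os.path.split(path)
    fd, temp = tempfile.mkstemp(prefix=".%s-" % filename, dir=dirname or ".")
    try:
        # os.fdopen takes ownership of fd; the with block closes it.
        with os.fdopen(fd, "wb") as f:
            f.write(content)
        # Set the mode explicitly instead of copying it from the old file.
        os.chmod(temp, 0o444 if readonly else 0o644)
        os.replace(temp, path)
    except BaseException:
        if os.path.exists(temp):
            os.unlink(temp)
        raise
```

Because the mode is applied to the temp file rather than inherited, an existing read-only destination does not prevent the new content from being written.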

Test Plan: Ran the tests

Reviewers: #mercurial, ttung, quark

Reviewed By: quark

Differential Revision: https://phabricator.intern.facebook.com/D3288589

Signature: t1:3288589:1462993609:653a0a63266b9129b0e18807858e5de02310ecf9
2016-05-11 13:33:06 -07:00
Durham Goode
ce493adf83 repack: add --background option
Summary:
This allows triggering a repack that can be run in the background. In the future
we will trigger this automatically under certain circumstances (like too many
pack files).

Test Plan: Added a test

Reviewers: #mercurial, ttung, quark

Reviewed By: quark

Subscribers: quark

Differential Revision: https://phabricator.intern.facebook.com/D3261161

Signature: t1:3261161:1462398568:5ae25f3e5a9acd0f4b34490b34a62be33cc69e3c
2016-05-04 14:53:23 -07:00
Durham Goode
8e290a5d4a repack: add lock to limit it to only one repack
Summary:
This adds a lock that limits us to running only one repack at a time. We also
add a simple prerepack hook to allow the tests to insert a sleep to test this
functionality.

Test Plan: Added a test

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3260428

Signature: t1:3260428:1462393311:3e1bf5dd047e7f3521679ca7640b448f5e784913
2016-05-04 14:53:19 -07:00
Durham Goode
c2cdcde2fb store: read recent packs first
Summary:
Since recent packs are likely to contain more recent data, let's put them at the
front of the pack list so they are checked first.
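
For illustration, one way to order packs newest first. Using the file modification time as the measure of "recent" is an assumption, and `order_packs` is a hypothetical name.

```python
import os


def order_packs(paths):
    """Return pack paths sorted so the most recently modified come first."""
    return sorted(paths, key=os.path.getmtime, reverse=True)
```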

In a future diff I'll come back and refactor the common code between datapack and
historypack into one base class.

Test Plan:
Ran the tests. Used the debugger to verify that the sort order was
correct.

Reviewers: #mercurial, ttung, mjpieters

Reviewed By: mjpieters

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3259823

Signature: t1:3259823:1462388395:8ee48a7b02c6abc079878e53c5b336675249cb91
2016-05-04 14:53:16 -07:00
Durham Goode
1621511aa7 store: a number of performance improvements
Summary:
Some miscellaneous perf improvements:
- Get rid of unnecessary sorting
- Bail early from remotefilelog reverse filename lookup if there are no keys
- Avoid unnecessary class/enum lookups in ancestor hotpath
- Fix n^2 behavior in history repacking
- Remove unnecessary set() and tuple instantiation in hotpath
- Use __slots__ on repackentry to improve access times

Test Plan:
Ran the tests. Tested the actual perf improvements via 'hg repack' in
a large repo.  The n^2 fix caused a massive perf win, but the others shaved off
a number of seconds as well.

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3257595

Signature: t1:3257595:1462392755:80209fe66c2632d2c16a2840500b0655926e7513
2016-05-04 14:53:13 -07:00
Durham Goode
99a426f3eb store: fix fanout logic for history pack
This fix was already applied to datapack a while ago. Basically, if the pack
doesn't have a lot of revisions, it's possible that the fanout table is sparsely
populated. Therefore we need to scan forward when looking for the end bounds of
our fanout table.
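
A sketch of the forward scan, under an assumed in-memory layout: `fanout` is a list where each slot holds the index offset of the first entry with that prefix, and a hypothetical `EMPTY` marker denotes a slot with no entries.

```python
EMPTY = -1  # assumed marker for a fanout slot with no entries


def search_bounds(fanout, slot, indexend):
    """Return (start, end) index offsets to search for entries in `slot`."""
    start = fanout[slot]
    if start == EMPTY:
        raise KeyError(slot)
    # The next slot may be empty in a sparse table, so scan forward
    # until a populated slot marks the end of this slot's entries.
    for i in range(slot + 1, len(fanout)):
        if fanout[i] != EMPTY:
            return start, fanout[i]
    return start, indexend
```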
2016-05-03 15:41:57 -07:00
Durham Goode
7da17af64f store: record what files were created during a repack
Summary:
Previously, if you ran repack twice in a row, it would actually delete your
packs, because the repack produced files with the same name as before, and the
cleanup then deleted them.

The fix is to have the stores record what files they produced in the ledger,
then future clean up work can avoid deleting anything that was created during
the repack.
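
Roughly, the idea looks like this. The `Ledger` class, its method names, and the `cleanup` function here are simplified stand-ins invented for illustration, not the real interfaces.

```python
class Ledger:
    """Tracks which files a repack consumed and which it produced."""

    def __init__(self):
        self.sources = set()  # files whose contents were repacked
        self.created = set()  # files written during this repack

    def marksource(self, path):
        self.sources.add(path)

    def markcreated(self, path):
        self.created.add(path)


def cleanup(ledger, unlink):
    """Delete repacked sources, skipping anything this repack created."""
    removed = []
    for path in sorted(ledger.sources - ledger.created):
        unlink(path)
        removed.append(path)
    return removed
```

When a second repack reproduces a pack with the same content-based name, that name is in both sets and is therefore left alone.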

Test Plan: Added a test

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3255819

Signature: t1:3255819:1462393814:d32155b12535990f72fbe48de045eddbb6f7fab6
2016-05-04 14:53:10 -07:00
Durham Goode
30cc85653c store: make pack files read-only
Summary:
Since pack files should never change after they are created, let's create them
with read-only permissions. It turns out that the Mercurial vfs doesn't apply
the correct permissions to files created by mkstemp (and we have to use mkstemp
since we don't know the name of the file until after we've written all the data
to it), so we have to manually call the permission fixing code.

We also need to fix our mmap calls to be readonly now, otherwise we get a
runtime permission denied exception.
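
A small sketch of both halves, using plain os calls in place of the Mercurial vfs. The function names and the `.datapack` suffix are assumptions for illustration, as is naming the file after the SHA-1 of its contents.

```python
import hashlib
import mmap
import os
import tempfile


def write_pack(directory, payload):
    """Write `payload` to a read-only file named after its content hash."""
    # The final name is unknown until all data is written, hence mkstemp.
    fd, temp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    final = os.path.join(directory, hashlib.sha1(payload).hexdigest() + ".datapack")
    os.chmod(temp, 0o444)  # mkstemp creates 0600; fix the mode by hand
    os.replace(temp, final)
    return final


def read_pack(path):
    """Map a read-only pack; a writable mapping would be refused."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[:]
```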

Test Plan: Added a test

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3255816

Signature: t1:3255816:1462321201:dff4fb4c9301d67a77043ecc1d96262bb5d6a54a
2016-05-04 14:53:07 -07:00
Durham Goode
1d4b4dbb36 store: switch mutable packs to use openers
Summary:
Instead of passing in a path and performing joins ourselves, let's use an
opener. This will help handle all the file permission edge cases.

Test Plan: Ran the tests

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3255165

Signature: t1:3255165:1462393836:38a28c850a0dc06838d9c17672d3dffd9903bbd7
2016-05-04 14:53:04 -07:00
Durham Goode
0671f3d76a store: add version header to index
Summary: Pretty straightforward

Test Plan: Ran the tests

Reviewers: lcharignon, rmcelroy, ttung, quark, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3254892

Signature: t1:3254892:1462315520:888bb27ef121c08d463f9fd4cf9eeb3c42383a96
2016-05-04 14:53:01 -07:00
Durham Goode
d71eace818 store: refactor version number and size to constants
Summary:
A future patch will add a version number to the index file. Let's move the
version size, fanout start, and index start to constants so we can more easily
change them without changing the code.

Test Plan: Ran the tests

Reviewers: lcharignon, rmcelroy, ttung, mitrandir, quark

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3254876

Signature: t1:3254876:1462315858:63fe56e8cfcdbb0209861898ce0c45c7d7b33e35
2016-05-04 14:52:58 -07:00
Durham Goode
4aa798d76e store: don't allow gc to delete pack files
Summary:
`hg gc` is very aggressive in that it deletes any files in the cache that it
determines aren't a needed key, including files it doesn't recognize. Let's
teach it to not delete pack files.

In the future we'll need to make `hg gc` able to garbage collect the contents of
pack files as well.

Test Plan: It's probably fine!

Reviewers: lcharignon, rmcelroy, ttung, quark, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3254744

Signature: t1:3254744:1462308257:c1f932b88abf3337370f16c05c789422ea51b0e1
2016-05-04 14:52:55 -07:00
Durham Goode
312acdb24e store: implement markledger and cleanup for historypack
Summary:
Implementing these two functions allows historypacks to be repacked, either into
a new format, or by combining multiple packs into a single new one.

Test Plan: Added a test in my next patch

Reviewers: lcharignon, ttung, rmcelroy, mitrandir, quark

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3251542

Signature: t1:3251542:1462392294:f95f7666a3a5df675f1351a19af7532c4742af2b
2016-05-04 14:52:49 -07:00
Durham Goode
d86ae1e525 store: implement markledger and cleanup for datapack
Summary:
Implementing these two functions will allow datapacks to be repacked (either
into other formats, or by combining multiple packs into one).

A future patch will add a test.

Test Plan: Added a test in a future patch

Reviewers: lcharignon, ttung, rmcelroy, mitrandir, quark

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3251539

Signature: t1:3251539:1462393256:7caa09677fbcaaf57a47d7a833684883483c5b3a
2016-05-04 14:52:46 -07:00
Durham Goode
6b25f32192 store: add revision count to historypack filesection
Summary:
Previously, given a historypack file, we had no way of reading the contents,
since we had no way to know when to stop reading the revision entries for a
given file section.

This patch changes the format to have a revision count value after the filename
and before the revisions. The documentation already documented the format like
this, and therefore doesn't need updating.
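
A minimal sketch of why a per-section revision count makes the file readable, assuming a hypothetical layout (2-byte filename length, filename, 4-byte revision count, then fixed-size entries of four 20-byte nodes). The field widths and the function names are illustrative assumptions, not the actual historypack format, which is documented in the code:

```python
import struct

NODE_LEN = 20  # length of a SHA-1 node in bytes


def write_section(filename, entries):
    """Serialize one file section: name, revision count, then the entries."""
    name = filename.encode("utf-8")
    parts = [struct.pack("!H", len(name)), name,
             struct.pack("!I", len(entries))]
    for node, p1, p2, linknode in entries:
        parts.append(node + p1 + p2 + linknode)
    return b"".join(parts)


def read_section(buf, offset=0):
    """Read one file section and return (filename, entries, next_offset)."""
    (namelen,) = struct.unpack_from("!H", buf, offset)
    offset += 2
    filename = buf[offset:offset + namelen].decode("utf-8")
    offset += namelen
    # The count tells the reader exactly where this section stops.
    (count,) = struct.unpack_from("!I", buf, offset)
    offset += 4
    entries = []
    for _ in range(count):
        chunk = buf[offset:offset + 4 * NODE_LEN]
        entries.append(tuple(chunk[i:i + NODE_LEN]
                             for i in range(0, 4 * NODE_LEN, NODE_LEN)))
        offset += 4 * NODE_LEN
    return filename, entries, offset
```

Because `read_section` returns the offset of the next section, a caller can walk every section in a pack by calling it in a loop until the buffer is exhausted.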

A future patch will use this information to iterate over all the revisions in
the pack.

Test Plan: Added a test in a future patch

Reviewers: lcharignon, ttung, rmcelroy, quark, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3251538

Signature: t1:3251538:1462393282:f46b50e79237bfa8a25ff1957344588622b2699a
2016-05-04 14:52:43 -07:00
Durham Goode
548ccdeae1 store: make historypack file section writing lazier
Summary:
In a later patch we will need to add the count of revisions in a given file
section to the on-disk format. To make that easier, let's make the file section
serialization lazy, so that we will have the full list when it comes time to
count the entries.

Test Plan: added a test in a future patch

Reviewers: lcharignon, ttung, rmcelroy, quark, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3251537

Signature: t1:3251537:1462393274:60b72a47de45f5a94f4f5a8d34b3942db0aa3fda
2016-05-04 14:52:40 -07:00
Durham Goode
a54d3f1257 store: make repack only repack the shared cache
Summary:
Previously, hg repack would repack all the objects in all the stores and dump the
new packs in .hg/store/packs. Initially we only want to repack the shared cache
though, so let's change repack to only operate on shared stores, and to write
out the new packs to the hgcache, under the appropriate repo name.

In a future patch I'm going to go through all this store stuff and replace all
uses of os.path and direct file reads/writes with a mercurial vfs.

Test Plan:
Ran repack in a large repo and verified packs were produced in
$HGCACHE/$REPONAME/packs

Ran hg cat on a file to verify that it read the data from the pack and did not do any remotefilelog network calls.

Reviewers: lcharignon, rmcelroy, ttung, quark, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3250213

Signature: t1:3250213:1462315927:694661795141e2c869ba661a54cea8f4b90823df
2016-05-04 14:52:33 -07:00
Durham Goode
cfd60406ad store: add context manager to mutable pack classes
Summary:
Previously, if a repack failed, it would leave temporary pack files lying
around. By adding enter/exit functions to mutable packs, we can guarantee
cleanup happens.
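
The enter/exit idea can be sketched roughly as follows. `MutablePackSketch`, its `add`, `close`, and `abort` methods, and the `.tmp`/`.pack` suffixes are hypothetical names chosen for illustration and are not the real mutable pack classes:

```python
import os
import tempfile


class MutablePackSketch:
    """Hypothetical mutable pack that writes to a temp file until closed."""

    def __init__(self, directory):
        fd, self.tmppath = tempfile.mkstemp(suffix=".tmp", dir=directory)
        self._file = os.fdopen(fd, "wb")
        self.finalpath = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            self.close()   # success: publish the pack
        else:
            self.abort()   # failure: remove the temporary file
        return False       # never swallow the original exception

    def add(self, data):
        self._file.write(data)

    def close(self):
        self._file.close()
        self.finalpath = self.tmppath[:-len(".tmp")] + ".pack"
        os.rename(self.tmppath, self.finalpath)

    def abort(self):
        self._file.close()
        os.unlink(self.tmppath)
```

With this shape, a repack written as `with MutablePackSketch(path) as pack: ...` cleans up after itself whether the body finishes or raises.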

Test Plan: Ran repack, verified that a failure did not leave tmp files

Reviewers: rmcelroy, quark, ttung, lcharignon, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3250201

Signature: t1:3250201:1462234552:7f20260a193ed1dd858bf6e9f489ac902d859218
2016-05-03 12:34:45 -07:00
Durham Goode
d2e7ae7519 store: make repack command use new repacker
Summary:
Now that all the repack logic is in place, let's switch the repack
command to use the new version. This also means the repack command will now
clean up the old remotefilelog blobs once it's finished.

Test Plan:
Ran hg repack in a large repo. Verified it deleted the old
remotefilelog blobs, and verified that I could still update around the
repository without making any remotefilelog network requests.

A future diff will add standard .t mercurial tests for the repack command.

Reviewers: rmcelroy, ttung, lcharignon, quark, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3249601

Signature: t1:3249601:1462235506:03c0d95f6a82cfc04b340b139f39c02853941a17
2016-05-03 12:34:09 -07:00
Durham Goode
4ecb47b021 store: move history repack logic to repacker
Summary:
We had a naive repack implementation in historypack.py. Let's move it to the
repack module and do the minor adjustments to use the new repackerledger apis.

Test Plan:
Ran hg repack in conjunction with future diffs that make use of this
api

Reviewers: rmcelroy, ttung, lcharignon, quark, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3249587

Signature: t1:3249587:1462232544:591cd8bec09f781370896470746eae5a4489531f
2016-05-03 12:33:54 -07:00
Durham Goode
735aa964d5 store: move data repack logic to repacker
Summary:
We had a naive repack implementation in datapack.py. Let's move it to the repack
module and do the minor adjustments to use the new repackerledger apis.

Test Plan: Ran it in conjunction with future diffs that make use of this api.

Reviewers: rmcelroy, ttung, lcharignon, quark, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3249585

Signature: t1:3249585:1462232504:a00aa65afca9562a2c1456cc4ab48c50d1ba5b68
2016-05-03 12:33:36 -07:00
Durham Goode
c902797dc9 store: implement markledger and cleanup on stores
Summary:
This implements the new markledger and cleanup apis on the existing
remotefilelog stores. These apis are used to tell the repacker what each store
has, and allows each store to cleanup if its data has been repacked.

Test Plan:
Ran repack in conjunction with the future diffs that make use of
these apis.

Reviewers: rmcelroy, ttung, lcharignon, quark, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3249584

Signature: t1:3249584:1462226133:1e8faffc9f6bf8f7c94e6e79aee8865e3c41648c
2016-05-03 12:33:00 -07:00
Durham Goode
c429030ca4 store: add class definitions and stub for repack
Summary:
This introduces the high level classes that will implement the generic repack
logic.

Test Plan: Ran the repack in conjunction with later commits that use these apis.

Reviewers: rmcelroy, ttung, lcharignon, quark, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3249577

Signature: t1:3249577:1462225435:000f9cc29ae2a3d7fdbedf546c8936ef45d1e4cf
2016-05-03 12:32:35 -07:00
Durham Goode
b049a0910a store: datapack fix perf issue
Summary:
Using range() allocates a full list, which is 2**16 entries in the fanout case.
Let's use xrange instead. This is a notable performance win when checking many
keys.

Also removed an unused variable and use index instead of self._index since this
is a hotpath.

Test Plan: Ran hg repack

Reviewers: rmcelroy, ttung, lcharignon, quark, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3249563

Signature: t1:3249563:1462240834:c19d6cbf0b6237f15ca8d81e8da856752df0ec59
2016-05-03 12:30:44 -07:00
Durham Goode
1704e5c8fb store: add tests for historypack
Summary:
This adds a basic test suite for the historypack class, and fixes some issues it
found.

Test Plan: ./run-tests.py test-historypack.py

Reviewers: mitrandir, rmcelroy, ttung, lcharignon

Reviewed By: lcharignon

Differential Revision: https://phabricator.intern.facebook.com/D3237858

Signature: t1:3237858:1461884966:c0ec90a2735255e5ef70eade09915066a7b71ee5
2016-04-28 17:37:03 -07:00
Durham Goode
8f4d83edeb shallowbundle: fix broken fallback orig call
This was caught by tests running in an unusual configuration
2016-04-28 17:34:08 -07:00
Durham Goode
22948ce7e1 checkcode: add check code test
Summary: Adds the same check code test that upstream Mercurial uses.

Test Plan:
Ran it, and fixed all the failures. I won't land this commit until
all the failure fixes are landed.

Reviewers: #sourcecontrol, ttung, rmcelroy, wez

Reviewed By: wez

Subscribers: quark, rmcelroy, wez

Differential Revision: https://phabricator.intern.facebook.com/D3221380

Signature: t1:3221380:1461802769:19f5bdc209c05edb442faa70ae572ce31e2fbc95
2016-04-28 10:18:47 -07:00
Durham Goode
29d3dda67e checkcode: fix various store files
Summary: Fix check code for various store related files

Test Plan: Ran the tests

Reviewers: #sourcecontrol, mitrandir, ttung

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3222465

Signature: t1:3222465:1461701300:34560288be4dc921f0252d4ad8fdc9c8d9357e23
2016-04-27 16:49:33 -07:00
Durham Goode
98fd33f8cb store: add missing imports
Summary: These were missing, and only needed in exception cases.

Test Plan: nope

Reviewers: #sourcecontrol, rmcelroy, ttung

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3219749

Signature: t1:3219749:1461608742:91e3a721e78188c52431b6c5d1b3ad091e249c3a
2016-04-27 16:49:30 -07:00
Durham Goode
f92668636b store: add historypack store that reads histpack files from .hg/store/packs
Summary:
Now that we can read and write histpack files, let's add a store implementation that
can serve packed content.

My next set of commits (which haven't been written yet) will:
- add tests for all of this

Test Plan:
Ran the tests. Also repacked a repo, deleted the old cache files,
ran hg log FILE, and verified it produced results without hitting the network.

Reviewers: #sourcecontrol, ttung, mitrandir, rmcelroy

Reviewed By: mitrandir, rmcelroy

Subscribers: rmcelroy, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3219765

Signature: t1:3219765:1461717992:9b2e8646c0555472fa00ee7059c0f283fd4c2c65
2016-04-27 16:49:27 -07:00
Durham Goode
18cde8ba89 store: add a historypack class that can read histpacks
Summary:
The previous patch added logic to repack store history and write it to
a histpack file. This patch adds a pack reader implementation that knows how to
read histpacks.

Test Plan:
Ran the tests.  Also tested this in conjunction with the next patch
which actually reads from the data structure.

Reviewers: #sourcecontrol, ttung, mitrandir, rmcelroy

Reviewed By: mitrandir, rmcelroy

Subscribers: rmcelroy, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3219764

Signature: t1:3219764:1461718081:9d812b6aea87fe9eb48fdac9dbef282e4775c3c9
2016-04-27 16:49:24 -07:00
Durham Goode
f22bae206b store: add a historypack format and a repacker for it
Summary:
This is an initial implementation of a history pack file creator and a repacker
class that can produce it. A history pack is a pack file that contains no file
content, just history information (parents and linknodes).

A histpack is two files:

- a .histpack file consisting of a series of file sections, each of which
  contains a series of revision entries (node, p1, p2, linknode)
- a .histidx file containing a filename based index to the various file sections
  in the histpack.

See the code for documentation of the exact format.

Test Plan:
ran the tests.  A future diff will add unit tests for all the new pack
structures.

Ran `hg repack` on a large repo. Verified pack files were produced in
.hg/store/packs. In a future diff, I verified that the data could be read
correctly.

Reviewers: #sourcecontrol, mitrandir, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: mitrandir, rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3219762

Signature: t1:3219762:1461751982:e7bbc65e8f01c812fc1eb566d2d48208b0913766
2016-04-27 16:49:21 -07:00
Durham Goode
f17f6cc093 store: add revisions to datapack in alphabetical order
Summary:
This forces the revisions in the datapack to be added in alphabetical order.
This makes the algorithm more deterministic, but otherwise has little effect.

Test Plan: Ran the tests, ran repack

Reviewers: #sourcecontrol, rmcelroy, ttung

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3219760

Signature: t1:3219760:1461687720:7be5fdc1419f8214c8c83074494b33214b3684ae
2016-04-27 16:49:18 -07:00
Durham Goode
43ed70b6f1 store: add datapack store that reads pack files from .hg/store/packs
Summary:
Now that we can read and write datapack files, let's add a store implementation that
can serve packed content. With this patch, it's technically possible for someone
to prefetch and repack large portions of history for long term storage with
remotefilelog.

My next set of commits (which haven't been written yet) will:
- add tests for all of this
- add an indexpack format for packing ancestor metadata (the datapack only packs
  revision content)

Test Plan:
Ran the tests. Also repacked a repo, deleted the old cache files, ran
hg up null && hg up master, and verified it checked out master with the
right files and without fetching blobs from the server.

Reviewers: #sourcecontrol, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3205351

Signature: t1:3205351:1461751649:45a56b57d962a282aeef9478500a3b23495a0eb7
2016-04-27 16:49:15 -07:00
Durham Goode
56c83ea072 store: add a datapack class that can read datapacks
Summary:
The previous patch added logic to repack store contents and write it to a
datapack file. This patch adds a new store implementation that knows how to read
datapacks.

It's just a simple implementation without any parallelism. So there's room for
improvement.

Test Plan:
Ran the tests.  Also tested this in conjunction with the next patch
which actually reads from the data structure.

Reviewers: #sourcecontrol, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3205342

Signature: t1:3205342:1461750967:84377517cb1f285d37694a3f503d60ae85bacb66
2016-04-27 16:49:12 -07:00
Durham Goode
510ac021f3 store: add a basic repack and datapack format
Summary:
This is an initial implementation of a repack algorithm that can read data from
an arbitrary store (in this case the remotefilelog content store), and repack it
into a datapack.

A datapack is two files:

- a .datapack file consisting of a series of deltas (a delta may be a full text if the delta base is the nullid)
- a .dataidx file consisting of delta information and an index into the deltas

See the code for documentation of the exact format.

Test Plan:
ran the tests

Ran `hg repack` in a large repo. Verified that a datapack and a dataidx file
were created in .hg/store/packs. The datapack used 148MB instead of the 439MB the
old remotefilelog storage used.

Reviewers: #sourcecontrol, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3205334

Signature: t1:3205334:1461751366:ee4bf6a580ffb667071a8046fda6f0858b7f25ae
2016-04-27 16:49:09 -07:00
Durham Goode
f362c9a3a8 store: add getfiles() api to store
Summary:
This adds an API to the store contract that allows the store to return a list of
the name/node pairs that it contains. This will be used to allow a repack
algorithm to list the contents of the store so it can repack it into another
store. The old remotefilelog blob store used namehash+node keys, which is
different from the new store API's name+node keys, so the getfiles()
implementation here has to perform a reverse namehash->name lookup so it can
satisfy the store API contract.

In the remotefilelog basestore implementation, it reads the file names from the
local data directory and the shared cache directory, and reverse resolves the
file name hashes into filenames to produce the list.

Test Plan: ran the tests

Reviewers: #sourcecontrol, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3205321

Signature: t1:3205321:1461751437:a7c44c2bbe153122a3b85b8d82907a112cf77b1a
2016-04-27 16:49:06 -07:00
Durham Goode
438db1be81 store: allow union metadatastore to combine ancestors from many stores
Summary:
The old store api required that each store be able to return the complete
ancestor history for a given name/node pair. This patch allows a store to return
only the parts of history it knows about, and the union store will combine that
history with the history from other stores to produce the full result. This is
useful for stores like bundle files, where they contain only a partial history
that needs to be annotated by the real store.
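
A rough sketch of how a union store might stitch partial histories together. The class names, the `getancestors(name, node)` signature, and the `{node: (p1, p2)}` return shape are simplifying assumptions for illustration, not the real store contract:

```python
NULLID = "null"  # stand-in for the null node


class DictStore:
    """Hypothetical store that knows only part of a file's history."""

    def __init__(self, history):
        self.history = history  # {node: (p1, p2)}

    def getancestors(self, name, node):
        # Return only the portion of the ancestry this store knows about.
        known = {}
        stack = [node]
        while stack:
            n = stack.pop()
            if n in known or n not in self.history:
                continue
            known[n] = self.history[n]
            stack.extend(self.history[n])
        return known


class UnionMetadataStoreSketch:
    def __init__(self, stores):
        self.stores = stores

    def getancestors(self, name, node):
        ancestors = {}
        pending = [node]
        while pending:
            current = pending.pop()
            if current == NULLID or current in ancestors:
                continue
            # Ask each store in turn; the first one with an answer wins.
            for store in self.stores:
                partial = store.getancestors(name, current)
                if partial:
                    break
            else:
                raise KeyError((name, current))
            for n, parents in partial.items():
                if n not in ancestors:
                    ancestors[n] = parents
                    pending.extend(parents)  # keep walking missing parents
        return ancestors
```

Here a bundle-like store holding only the newest revision and a second store holding the older ones together yield the complete ancestor map.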

Test Plan: ran the tests

Reviewers: #sourcecontrol, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3205319

Signature: t1:3205319:1461751511:210740b82cc6767b2f0c393715ac93d8f1b96bc7
2016-04-27 16:49:04 -07:00
Durham Goode
cce75d4663 store: add concept of delta chain to content store
Summary:
The old store contracts required that every store be able to produce the full
text for a revision. This patch modifies the contract so that a store (like a
bundle file store) can serve a delta chain and the union store can combine delta
chains from multiple stores together to create the final full text.
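
The following sketch illustrates combining delta chains across stores. The delta encoding `(start, end, replacement)`, the `getdeltachain` method name, and both classes are hypothetical simplifications. The real stores use Mercurial's own delta format:

```python
NULLID = None  # stand-in for the null node: the entry holds a full text


def apply_delta(base, delta):
    """Replace base[start:end] with the replacement bytes."""
    start, end, data = delta
    return base[:start] + data + base[end:]


class ChainStore:
    """Hypothetical store mapping node -> (deltabase, delta or full text)."""

    def __init__(self, entries):
        self.entries = entries

    def getdeltachain(self, node):
        # Follow delta bases for as long as this store has them.
        chain = []
        while node is not NULLID and node in self.entries:
            base, payload = self.entries[node]
            chain.append((node, base, payload))
            node = base
        return chain


class UnionContentStoreSketch:
    def __init__(self, stores):
        self.stores = stores

    def get(self, node):
        chain = []
        current = node
        while current is not NULLID:
            for store in self.stores:
                part = store.getdeltachain(current)
                if part:
                    break
            else:
                raise KeyError(current)
            chain.extend(part)
            current = part[-1][1]  # continue from the last delta base
        # The last entry is a full text; apply deltas from oldest to newest.
        _, _, text = chain.pop()
        while chain:
            _, _, delta = chain.pop()
            text = apply_delta(text, delta)
        return text
```

A store that can only serve a delta (such as a bundle) contributes the top of the chain, and another store supplies the rest down to a full text.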

Test Plan: ran the tests

Reviewers: #sourcecontrol, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.fb.com/D3205315

Signature: t1:3205315:1461669845:3eb8968566285f6221c7c44435b855cc65da33f4
2016-04-26 15:10:38 -07:00
Durham Goode
7e1047d11f store: change union stores to accept a list of stores
Summary:
Instead of hard coding the list of stores in each union store, let's make it a
list and just test each store in order. This will allow easily adding new stores
and reordering the priority of the existing ones.

Also fix the remote store's contains function. 'contains' is the old name, and
it now needs to be getmissing in order to fit the store contract.
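
As a sketch of the ordered-list approach, assuming each store exposes a `getmissing(keys)` method that returns the keys it does not have (the classes below are hypothetical stand-ins, not the real union store):

```python
class SetStore:
    """Hypothetical store backed by a set of keys."""

    def __init__(self, keys):
        self.keys = set(keys)

    def getmissing(self, keys):
        return [k for k in keys if k not in self.keys]


class UnionStoreSketch:
    def __init__(self, stores):
        # Order matters: earlier stores are consulted first.
        self.stores = list(stores)

    def getmissing(self, keys):
        missing = list(keys)
        for store in self.stores:
            if not missing:
                break
            # Each store narrows the list to what is still unaccounted for.
            missing = store.getmissing(missing)
        return missing
```

Adding a new store or changing priority then only requires editing the list passed to the constructor.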

Test Plan: ran the tests

Reviewers: #sourcecontrol, ttung, rmcelroy

Reviewed By: rmcelroy

Differential Revision: https://phabricator.fb.com/D3205314

Signature: t1:3205314:1461606028:3a513ac82c5de668a7e40bbf7cc88d8754e2f0bb
2016-04-26 15:10:38 -07:00
Durham Goode
9cfbf5a59e store: keep track of the writable store instead of hard coding it
Summary:
A future patch is going to change the union store to just contain an ordered
list of stores. Therefore we need a special spot to record which store is the
one that should receive writes.

Test Plan: ran the tests

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.fb.com/D3205307
2016-04-26 15:10:38 -07:00
Durham Goode
c3c047f0b7 Move sortnodes into shallowutil
Summary:
This is a generic topological sort and will be useful in the upcoming repacking
code.
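
For reference, a generic topological sort can be sketched like this. It uses Kahn's algorithm and a hypothetical `sortnodes_sketch(nodes, parentfunc)` signature, and is not a copy of the actual `sortnodes` implementation:

```python
from collections import deque


def sortnodes_sketch(nodes, parentfunc):
    """Return nodes ordered so that every parent precedes its children.

    Parents that are not in `nodes` are ignored.
    """
    nodes = list(nodes)
    nodeset = set(nodes)
    children = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for n in nodes:
        # set() avoids counting a parent twice when p1 == p2.
        for p in set(parentfunc(n)):
            if p in nodeset:
                children[p].append(n)
                indegree[n] += 1

    ready = deque(n for n in nodes if indegree[n] == 0)
    result = []
    while ready:
        n = ready.popleft()
        result.append(n)
        for child in children[n]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    if len(result) != len(nodes):
        raise ValueError("cycle detected in node graph")
    return result
```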

Test Plan: Ran the tests

Reviewers: #sourcecontrol, ttung

Reviewed By: ttung

Differential Revision: https://phabricator.fb.com/D3204124

Signature: t1:3204124:1461260520:e1cb5c9d496f11e5f44e0cdbc5ba851b1573d2e1
2016-04-26 15:10:38 -07:00
Durham Goode
84bc49f25d checkcode: fix shallowrepo, shallowutil, and setup.py
Summary: Fix failures found by check-code.

Test Plan: Ran the tests

Reviewers: #sourcecontrol, ttung

Reviewed By: ttung

Differential Revision: https://phabricator.fb.com/D3221375

Signature: t1:3221375:1461648312:7dbdd59e6370cb32b90d864a623d8066028741e7
2016-04-26 13:00:31 -07:00
Durham Goode
3817826242 checkcode: fix remotefilelogserver and shallowbundle
Summary: Fix failures found by check-code.

Test Plan: Ran the tests

Reviewers: #sourcecontrol, ttung

Reviewed By: ttung

Differential Revision: https://phabricator.fb.com/D3221373

Signature: t1:3221373:1461648284:23203c17f4a87e33ff4e9be17a8b99bddbcdff05
2016-04-26 13:00:31 -07:00
Durham Goode
39d350996f checkcode: fix remotefilectx and remotefilelog
Summary: Fix failures found by check-code.

Test Plan: Ran the tests

Reviewers: #sourcecontrol, ttung

Reviewed By: ttung

Differential Revision: https://phabricator.fb.com/D3221371

Signature: t1:3221371:1461648217:e9702d761ab8fd6f85dee60a4c192cf25e784f11
2016-04-26 13:00:31 -07:00
Durham Goode
859510b65e checkcode: fix fileserverclient.py
Summary: Fix failures found by check-code.

Test Plan: Ran the tests

Reviewers: #sourcecontrol, ttung

Reviewed By: ttung

Differential Revision: https://phabricator.fb.com/D3221369

Signature: t1:3221369:1461648197:185cbbba61a9d1a7a1beacd64153185d0d0826ed
2016-04-26 13:00:31 -07:00
Durham Goode
71bd8c2561 checkcode: fix errors in cacheclient and debugcommands
Summary: Fix failures found by check-code.

Test Plan: Ran the tests

Reviewers: #sourcecontrol, ttung

Reviewed By: ttung

Differential Revision: https://phabricator.fb.com/D3221366

Signature: t1:3221366:1461648117:088f3a5837393499e1a383af860bd1a935e0cba7
2016-04-26 13:00:31 -07:00
Durham Goode
495a853d78 checkcode: fix __init__.py
Summary: Fix failures found by check-code.

Test Plan: Ran the tests

Reviewers: #sourcecontrol, ttung

Reviewed By: ttung

Differential Revision: https://phabricator.fb.com/D3221365

Signature: t1:3221365:1461646159:efeb0478c66cbd49d4a0a6c02a79d530b42f8248
2016-04-26 13:00:31 -07:00
Jun Wu
ead8969797 Fix missing errno import
Summary: Apparently we need to `import errno` in `shallowutil.py`

Test Plan: Code Review

Reviewers: #sourcecontrol, ttung, durham

Reviewed By: durham

Differential Revision: https://phabricator.fb.com/D3195117

Signature: t1:3195117:1461031210:424912a96448a2a8cb37197f006cfa95d4ab1cb1
2016-04-18 19:04:58 -07:00
Durham Goode
2d1dcb4b97 Fix missing 'grp' import
2016-04-18 11:46:06 -07:00
Durham Goode
5b2914142a Fix status returning invalid results
The recent refactor caused remotefilelog.size() to include rename metadata in
the size count, which meant the size didn't match what the rest of Mercurial
expected. This caused clean files to show up as dirty in hg status if they had a
'lookup' dirstate state and were renames.
2016-04-10 09:46:24 -07:00
Durham Goode
2e93ca187a Add byte count checking when receiving from the server
Summary:
We've received a few complaints that receivemissing is throwing corrupt data
exceptions. My best guess is that we're not receiving all of the data for some
reason. Let's add an assertion to ensure all the data is present, so we can
narrow it down to a connection issue instead of actual corrupt data.
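
A minimal sketch of such a check, assuming the expected size is known before reading. `read_exactly` is a hypothetical helper, not the function added by this commit:

```python
import io


def read_exactly(stream, size):
    """Read exactly `size` bytes from `stream` or raise IOError."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            break  # the stream ended early
        chunks.append(chunk)
        remaining -= len(chunk)
    data = b"".join(chunks)
    if len(data) != size:
        # A short read points at the connection, not at corrupt content.
        raise IOError("expected %d bytes, got %d" % (size, len(data)))
    return data
```

For example, `read_exactly(io.BytesIO(b"abc"), 5)` raises, which separates a truncated transfer from a payload that arrived whole but is malformed.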

Test Plan: Ran the tests

Reviewers: #sourcecontrol, ttung

Differential Revision: https://phabricator.fb.com/D3136203
2016-04-05 09:50:12 -07:00
Durham Goode
24323a759c store: address code review feedback
This was meant to be part of the previous stack of commits, but I pushed the
wrong stack. This patch addresses a number of code review feedback points, the
most visible being to rename 'contains' to something else (in this case
'getmissing').
2016-04-04 16:48:55 -07:00
Durham Goode
8ca8f7f6ca stores: remove fetch logic and replace with a remote store fallthrough
The old way of fetching from the server required the base store api expose a way
for outside callers to add fetch handlers to the store. This exposed some of the
underlying details of how data is fetched in an unnecessary way and added an
awkward subscription api.

Let's just treat our remote caches as another store we can fetch from, and
require that the overarching configure logic (in shallowrepo.py) can connect
all our stores together in a union store.
2016-04-04 16:26:12 -07:00
Durham Goode
ece19111e0 ioutil: rename ioutil to shallowutil
The old name was not very descriptive. There's already a shallowutil, so let's
just use that.
2016-04-04 16:26:12 -07:00
Durham Goode
29ea8ada1e store: delete the localcache class
Now that all functionality has been moved to the new store, we no longer need
the localcache class. So let's delete it.
2016-04-04 16:26:12 -07:00
Durham Goode
ecf4378d18 store: implement gc in the new store
The last major piece of functionality that needs to be moved into the new store
is the gc algorithm. This is just a copy paste of the one that exists in
localcache.
2016-04-04 16:26:12 -07:00
Durham Goode
d70897e18c store: implement markrepo on the new store
Now that most of our storage has been moved behind the new store, let's also
move the ability to mark the repo to behind that storage abstraction.
2016-04-04 16:26:12 -07:00
Durham Goode
0dd4247520 store: make remotefilelog.ancestormap use the new store
Now that we have a metadatastore, let's use it to implement
remotefilelog.ancestormap. This gets rid of a bunch of ugly code.
2016-04-04 16:26:12 -07:00
Durham Goode
ad473d5a6b store: make remotefilelog.linknode use the new store
Now that we have the new metadatastore, let's use it to fetch the linknode
instead of parsing the data ourselves.
2016-04-04 16:26:12 -07:00
Durham Goode
82bc4468ed store: make remotefilelog.renamed use the store
Now that we have a metadata store, let's switch remotefilelog.renamed to consult
it, instead of parsing the data itself.
2016-04-04 16:26:12 -07:00