Commit Graph

40 Commits

Author SHA1 Message Date
Durham Goode
7e3a739970 treemanifest: set allowincomplete to true for repacks
Summary:
It's possible for history packs to not have all of history, so let's allow that
during repacks.

Also fix one exception to use hex()

Test Plan: A future patch has tests that exposed this

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4956078

Signature: t1:4956078:1493264261:81c12520b7352d5040cdb027509c975678c10069
2017-04-27 10:44:34 -07:00
Jun Wu
4240bd017e remotefilelog: let content stores support metadata
Summary:
This diffs add a `getmeta` method to all content stores. The cdatapack code is
modified to pass the tests, it needs further change to support `getmeta`.

The datapack format is bumped to v1 from v0. For v1, we append a `metadata`
dict at the end of each revision. The dict is currently used to store revlog
flags and rawsize of raw revlog fulltext. In the future we can put more data
like a second hash etc, without changing API or format again.

This diff focuses on correctness. A datapack caching layer to speed up
`getmeta` will be added later.

Tests are updated since we write new v1 packfile now and the format change
leads to different content and packfile names.

`Makefile`, `ls-l.py` are added to make tests easier to maintain.

Test Plan: Updated existing tests.

Reviewers: #mercurial, rmcelroy, durham

Reviewed By: durham

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4903917

Signature: t1:4903917:1493255844:7ef5d487096cd2f78f2aaae672a68d49f33632ee
2017-04-26 19:50:36 -07:00
Durham Goode
9dc2f1efb7 treemanifest: improve progress during data repack
Summary:
Large repacks had some long pauses where there was no feedback to the user. This
adds more granualar progress updates.

Test Plan: Ran repack in a large repo and saw more granular progress

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4915492

Signature: t1:4915492:1492641504:9db31d534fe201bec838e77ba470c9051d3be04f
2017-04-19 21:14:04 -07:00
Durham Goode
68030f1dd7 treemanifest: add known argument to getancestor api
Summary:
During a repack we often want to access the ancestory for a bunch of nodes that
might be ancestors of each other. Using getancestors for that results in a lot
of duplicated work. For instance, getancestors(0) returns [0], getancestors(1)
returns [0, 1], getancestors(2) returns [0, 1, 2], etc. Which is n^2

This patch adds an optional `known` argument for getancestors that let's the
caller tell getancestors what ancestors it's already aware of. Then getancestors
can short circuit when it reaches that level. This avoids duplicate work during
repack.

Test Plan:
Ran treemanifest repack in our large repo and verified it made
progress over the nodes much faster than before

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4901308

Signature: t1:4901308:1492640896:27d4a90c2993cd1fefbd8dbc211f2ec181178bce
2017-04-19 21:14:04 -07:00
Durham Goode
dae50fc99e treemanifest: support repacking revlogs into packs
Summary:
Accessing treemanifests in revlogs is incredibly slow. This patch adds the
ability for `hg repack` to repack the revlog content into a data pack file, and
teaches the getserverpack code to look in the pack file first.

Test Plan: Need to add a test. Sent out anyway for review

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4901298

Signature: t1:4901298:1492640706:2850b9f0e9bbee77952f46af3b784aa81253e626
2017-04-19 21:14:04 -07:00
Durham Goode
63f4fa69a0 datapack: remove delta reuse
Summary:
This code that reused deltas if the delta parent wasn't available was bugged
because it meant you could end up with a cycle in the delta chains. This was an
old optimization from before trees had history, so let's drop the optimization
(since trees now have history and can be correctly repacked).

Test Plan:
Ran repack on a packfile that previously caused cycles. Verified the
new version did not with `hg debugdatapack foo.datapack'

Reviewers: #mercurial

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4724520
2017-03-18 19:38:45 -07:00
Durham Goode
70ce116529 treemanifest: add history data to tree repacks
Summary:
Previously, tree repacks did not take into account tree history. It would just
look at the delta base and if the base existed, it would just reuse the delta.
This would A) result in very long chains, and B) result in chains where the full
text was the oldest version, instead of the newest (recent full texts means
faster access to recent versions).

This patch threads tree history into the repacker, which already knows how to
use history for repacks.

Test Plan:
Updated the tests, and inspected the new test results to ensure tree
entries that were not deltas before the repack became reverse deltas during the
repack.

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4647359

Signature: t1:4647359:1488882710:dba72cf488766ce827b7641735164fa0efc9a303
2017-03-07 11:15:26 -08:00
Durham Goode
940796814d treemanifest: add option for using native store
Summary:
Adds a treemanifest.usecunionstore config flag for enabling and disabling use of
the native code uniondatapackstore.

Since we haven't implemented the repack APIs on the native datapack stores, we
currently have to force repack to use the old python implementations. Instead of
trying to expose just the appropriate APIs through the python interface, I think
we'll rewrite all of repack to be in C++ at a future time, since we can take
advantage of parallelism, etc.

Test Plan:
Updated test-treemanifest.t to use the c datapackstore. Also run all
the tests with --extra-config-opt=treemanifest.usecunionstore=True.

These tests caught a missing null check in the C++ code as well.

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4609795

Signature: t1:4609795:1488365341:203362db5f470b613c4d6484686cd32c3fa8458f
2017-03-01 16:55:19 -08:00
Durham Goode
d3c6def7b8 remotefilelog: move pack file permissions to be in mutablebasepack
Summary:
Treemanifest had a bug where the pack files it created were 400 instead of 444.
This meant people sharing a cache on the same machine couldn't access them. In
most of the remotefilelog code we had set opener.createmode before creating the
pack but we forgot to for treemanifest.

This patch moves the opener creation and createmode setting into the mutable
pack class so we can't screw this up again.

Test Plan:
Tests that wrote to the packs before, now failed and had to be
updated to chmod the packs before writing to them.

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4411580

Tasks: 15469140

Signature: t1:4411580:1484321293:9aa78254677548a6dc2270c58cee0ec6f57dd089
2017-01-13 09:42:25 -08:00
Durham Goode
67f6d86cd7 treemanifest: add test for incremental tree repack
Summary:
This adds a test for hg repack --incremental handling tree packs. It fixes a few
bugs that were caught in the process:

1. Since remotefilelog was being imported via it's file path, it was being
loaded into the process as hgext_remotefilelog. When treemanifest loaded it into
the process via 'import remotefilelog' it was being imported as 'remotefilelog'.
This meant we had the same types imported into the same process with two
different names, which meant 'isinstance()' checks could fail when they
shouldn't (which affects incremental repacks). So we just drop the filepath.

2. We need to allow repacking local tree manifest data even if the full delta
chain isn't available (since part of the delta chain may be in the cache). So we
add allowincomplete to the data pack in this case.

Test Plan: Ran it

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4262412

Signature: t1:4262412:1480706110:45bb0a1e1b031f9cfd4658a5071bbac5df2f6543
2016-12-02 14:38:00 -08:00
Durham Goode
27e07eeed4 remotefilelog: make repack work for non-remotefilelog repos
Summary:
treemanifest also has the concept of repack and shares the same code as
remotefilelog's repack. We want treemanifest to be usable even without
remotefilelog, so let's update the repack code to not require the presence of
remotefilelog configured members on the repo object.

In the long term we'll probably move the repack and pack code out of
remotefilelog entirely.

Test Plan: A future patch adds a test that caught this

Reviewers: #mercurial, dsyang, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4261571

Signature: t1:4261571:1480705343:2ceabbbbdeaf7408ef8b94ad3e670b9423162399
2016-12-02 14:37:49 -08:00
Durham Goode
edd8b25c30 repack: add incremental tree repacking
Summary:
Previously we only supported full repacks of the manifest stores. This patch
adds incremental repacking for both the shared manifest store and the local
manifest store.

Test Plan:
Did a few pulls to produce a few treemanifest pack files. Ran hg
repack --incremental after each one and verified it was a no-op until there were
enough pack files to trigger the repack.

Will add tests later today.

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4260602

Signature: t1:4260602:1480705304:b17d3ec5ff8d4caafe935b2c4e941454052fe3ec
2016-12-02 14:37:47 -08:00
Durham Goode
046fb2ca08 repack: enable repacking of local manifest stores
Summary:
Previous hg repack would only repack the shared manifest cache store. This patch
makes it also repack the local manifest store too.

Test Plan:
Made a local commit in my test repo with treemanifests enabled.
Rebased the commit to master so now there were two local tree packs. Ran hg
repack and verified they were combined into one pack in
.hg/store/packs/manifests

I'll add a test later today

Reviewers: #mercurial, mjpieters

Reviewed By: mjpieters

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4260580

Signature: t1:4260580:1480696151:8f989e299dda50281ca63489e870202eb195d714
2016-12-02 14:37:45 -08:00
Durham Goode
26e1154b71 repack: make repack more generic
Summary:
Previously the repack logic was hardcoded to only work on the shared cache
stores. This patch refactors that knowledge to a higher level so a future patch
can also repack the local stores as well.

Test Plan: Ran the tests

Reviewers: #mercurial, mjpieters

Reviewed By: mjpieters

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4260569

Signature: t1:4260569:1480694802:30ab24dd031661227b4da1febc3d3daf9ad598fb
2016-12-02 14:37:42 -08:00
Durham Goode
6f15ced334 remotefilelog: repack tree manifests
Summary:
Previously hg repack would only repack file content pack files. This patch makes
it also repack tree manifest pack files.

Test Plan:
Ran pull repack in a repo and verified the manifest packs were
repacked. I'll add some tests around this at some point.

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4240723
2016-11-29 16:00:39 -08:00
Durham Goode
fe6f541ed8 remotefilelog: reuse deltabase when history is not available
Summary:
Previously, if there was no history available for a given revision, we would
just store the full text in the pack file. This patch makes it attempt to reuse
the existing delta base instead. This will be useful for repacking
treemanifests, since we currently don't have histpacks for them (we just delta
them efficiently when they are first added to the repo).

Test Plan: Ran tests

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4240705
2016-11-29 15:37:58 -08:00
Stanislau Hlebik
2a0e8205d4 extutil: create new package and move runshellfast here
Summary:
I'm going to use runshellfast in infinitepush.
To avoid copy-paste let's move it to the separate package.
Note: fastmanifest also has runshellfast but it's implementation
is a bit different and seems that it doesn't work with remotefilelog.
So I'll leave it alone for now.

Test Plan: python run-tests.py -j20

Reviewers: #sourcecontrol

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4175836
2016-11-21 00:52:30 -08:00
Durham Goode
4235752853 remotefilelog: rename getpackpath to getcachepackpath
Summary:
In a future diff we will be introducing packs into .hg/store, so we need to
differentiate between cache packs and local packs. This patch renames
getpackpath to getcachepackpath. A future diff will add getlocalpackpath.

This exposed a pyflakes error, so we fix that too.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4055815

Signature: t1:4055815:1477059415:e0221557bbeec6701820c826f00390d3a71cd2d3
2016-10-21 11:02:20 -07:00
Durham Goode
f43ba75915 remotefilelog: fix pyflakes and module import errors
Summary:
This fixes all the pyflaks and module errors for the main remotefilelog
code base.

Test Plan: ./run-tests.py test-check* test-remotefilelog*

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4055537

Signature: t1:4055537:1477049663:ee904d311d17d3659e055e2c109c68c9023cfd1f
2016-10-21 11:02:09 -07:00
Durham Goode
df86b3486d treemanifest: auto create manifest pack directory
The pack/manifests directory wasn't being automatically created, so let's make
it so.
2016-09-21 13:51:39 -07:00
Durham Goode
3b7beae747 ctree: add ctreemanifest hg extension
Summary:
Adds the initial extension that sets up the ctreemanifest. It currently relies
on the fastmanifest extension to hook into all the manifest APIs to construct
ctreemanifests.

Test Plan:
In a future patch, I was able to run 'hg manifests' on a commit and
have it return the manifest contents by reading the treemanifest.

Reviewers: #fastmanifest, ttung

Reviewed By: ttung

Subscribers: ttung

Differential Revision: https://phabricator.intern.facebook.com/D3755327

Signature: t1:3755327:1472114482:0c5862cba68ed4db643d28c2fae01f33f5352970
2016-08-29 16:19:52 -07:00
Ryan McElroy
f4dd73e113 remotefilelog: pass modern check-code
Test Plan: run-tests.py test-check-code-hg.t

Reviewers: #mercurial, ttung, simonfar

Reviewed By: simonfar

Subscribers: simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3777581

Tasks: 12855049

Signature: t1:3777581:1472224785:a15040cec1c95ca60d1be837d905b3c3d87be362
2016-08-26 08:48:07 -07:00
Durham Goode
7c44b94bb0 repack: fix repack heuristic to account for unusual copies
Summary:
Previously, the history repack logic would stop traversing history for a given
filename once it encountered a rename. This isn't quite right, since the history
could eventually be traversed back to the original file, where we'd need to
continue processing. So now we check for when the copyfrom becomes the filename.

Also, if the copy source file and the copy target file have two nodes with the
same value, we would not process the one in the copy target (since it was marked
do not process). We fix this by explicitly checking if the node is one of the
known entries in the file being processed.

Test Plan: Added a test

Reviewers: #mercurial, ttung, mitrandir, rmcelroy

Reviewed By: mitrandir, rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3523215

Signature: t1:3523215:1467828169:bd487c8f296352c1a1b9355cb55f9001bd5e19a9
2016-07-07 15:58:47 -07:00
Durham Goode
d860baf210 cachegroup: fix pack path use of cachegroup
Summary:
The pack path logic did not use the correct unix group when
remotefilelog.cachegroup was specified. This fixes that.

Test Plan:
I manually tested it by deleting a pack dir and running repack. This
is hard to create an automated test for since the feature isn't really cross
platform, and we don't have a way to know what groups they have on their
machine.

Reviewers: #sourcecontrol, ttung, rmcelroy

Reviewed By: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3400756

Tasks: 11584114

Signature: t1:3400756:1465342537:ed023f6dc830117df5e85e294a41486f072714c9
2016-06-08 09:09:06 -07:00
Durham Goode
6f3d6c53f5 utils: unify cachepath access through a util function
Summary:
Previously a bunch of different places accessed the cachepath through ui.config
directly. This is a problem because we need to resolve any environment variables
in the path, and some spots didn't do this. So let's unify all accesses through
a helper function that takes care of the environment variables.

Test Plan: Added a test

Reviewers: mitrandir, lcharignon, #sourcecontrol, ttung, simonfar

Reviewed By: simonfar

Subscribers: simonfar

Differential Revision: https://phabricator.intern.facebook.com/D3385583

Signature: t1:3385583:1464971813:5b9ee5ed3d6ff9f1a78cb9e0269e433844758c9d
2016-06-03 09:45:58 -07:00
Durham Goode
01595d2684 repack: allow background repacks to repack non-pack stores
Previously, background repacks would only repack pack files, which meant there
was no automated way to repack loose remotefilelog files without manually
running 'hg repack'. This allows incremental repacks to also pack the loose
files.

It also changes the config knob for background repacks, so we can enable pack
file usage without the server having to support it just yet.
2016-06-01 10:06:35 -07:00
Durham Goode
2450b3f243 unionstore: allow partial history output from union stores
Previousy, a union store required that it be able to compute the entire history
of the revision. This caused problems in repack, since it may only have a
partial history. Instead of throwing a KeyError and giving the repack algorithm
no history information at all, we add a config knob to let the repack logic
specify that it's ok with partial histories.

The next patch will do the same for contentstore.
2016-05-26 02:13:53 -07:00
Durham Goode
b6871085ab repack: add incremental repacking for history packs
Summary:
Previously we only had incremental repacking for data packs. This patch adds it
for history packs as well. The algorithm here is simpler, since the amount of
history data is generally much smaller than the amount of delta data.

The algorithm is basically: if there are 2 things bigger than 100MB, repack
them; else repack up to 100MB of smaller things.

The datapack hashes changed because having the history available during a repack
allows us to make different decisions about delta ordering, etc.

Test Plan: Updated the tsets

Reviewers: mitrandir, #mercurial, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3306535

Signature: t1:3306535:1463696613:f40ed10c9dfed40d7bc455582592a7aed108ec3a
2016-05-20 09:31:31 -07:00
Durham Goode
93fbca3e39 repack: add automatic incremental background repacking after pull
Summary:
This runs the incremental background repacking logic after hg pull.

As part of adding tests, I also added a 'hg debugwaitonrepack' function that
will wait until any pending repack is done before returning, so the tests can
wait on repacks without so many sleeps.

Test Plan: Adds a test

Reviewers: mitrandir, #mercurial, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3306526

Signature: t1:3306526:1463696933:9e27daf0c08076468e8f365a3c372fa7d4f56bde
2016-05-20 09:31:28 -07:00
Durham Goode
7227563c61 repack: add incremental repack
Summary:
This adds a --incremental flag to the hg repack command. This flag causes repack
to look at the distribution of pack files in the repo and performs the most
minimal repack to keep the repo in good shape. Currently it's only implemented
for datapacks.

The new remotefilelog.datagenerations config contains a list of the sizes for
the different generations of pack files. For instance:

  [remotefilelog]
  datagenerations=1GB
    100MB
    1MB

Designates 4 generations - packs over 1GB, packs over 100MB, packs over 1MB, and
implicitly packs undex 1MB. The incremental algorithm will try to keep each
generation to less than 3 pack files (prioritizing the larger generations
first). When performing a repack it will grab at least 2 packs, and will grab
more if the total pack size is less than 100MB (since repacking at that level is
pretty cheap).

I have no idea if this is a good algorithm. We'll how to see and iterate.

Test Plan: Adds a test

Reviewers: mitrandir, #mercurial, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3306523

Signature: t1:3306523:1463697129:c87f4a397ef357b5ca4a80d01e9a6ca4d61f9d3d
2016-05-20 09:31:25 -07:00
Durham Goode
a36b9bd403 repack: move repack logic to static functions
Summary:
A future patch will be adding incremental repack, so let's move our repack logic
to the repack module so it's easier to refactor and extend.

Also adds a message for when a background repack kicks off (since we'll be
calling that from other places eventually).

Test Plan: Adds a test

Reviewers: mitrandir, #mercurial, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3306521

Signature: t1:3306521:1463602886:cece3d517f0672b829702866482c902812f9ae27
2016-05-20 09:31:22 -07:00
Durham Goode
c9621d3d1a repack: don't require complete history during data repack
Summary:
Previously, when repacking deltas we would require a full history of the node so
we could order the hashes optimally. In some situations though, we don't have
the full history available (like if we're only repacking a subset of packs), so
we need to be able to repack even without full history.

This patch handles the case where a given delta doesn't have history
information. We just store it as a full text.

This becomes useful in an upcoming series that will introduce incremental
packing that only packs a subset of the packs.

Test Plan: Added a test

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3278346

Signature: t1:3278346:1463086170:54c0fbefe78f9cafa7efc4b6f037887a924ab4a5
2016-05-16 10:59:09 -07:00
Durham Goode
9c3aa14c26 repack: remove copyfrom detection
Summary:
In an old version of the code, we would walk the entire history of a node during
data repack, which meant we had to keep track of when we saw a rename, and stop
walking there. Since then, we've changed the code to no longer walk the entire
history, and instead walk just the parts it was told to repack for this
particular file. This means we no longer ever walk across a copy, and therefore
don't need this copy detection logic.

Test Plan: Ran the tests

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3278443

Signature: t1:3278443:1463086137:c6d9eb6637bf3b8636a3df7e531f265d51cab0de
2016-05-16 10:59:09 -07:00
Durham Goode
2a938a761c store: add copyfrom information to history index
Summary:
Previously we were throwing away copy information when we repacked things into
pack files. The hope was that we could store copy information somewhere else,
and keep the history pack using fixed length entries. Since storing copy
information elsewhere is a long ways off, let's just go ahead and put copy info
in the pack file.

This makes the entries non-fixed length, which means any iteration over them has
to read the length of each entry. This also affects the historypack filename
hashes since they are content based, so the tests had to change.

This matches the old remotefilelog behavior more closely (which is why no code
had to change outside the pack logic).

Test Plan: Added a test

Reviewers: #mercurial, mitrandir, ttung

Reviewed By: mitrandir

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3262185

Signature: t1:3262185:1462562602:935683692276c7fa569d381b18aa3b18656793b1
2016-05-16 10:59:09 -07:00
Durham Goode
8e290a5d4a repack: add lock to limit it to only one repack
Summary:
This adds a lock that limits us to running only one repack at a time. We also
add a simple prerepack hook to allow the tests to insert a sleep to test this
functionality.

Test Plan: Added a test

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3260428

Signature: t1:3260428:1462393311:3e1bf5dd047e7f3521679ca7640b448f5e784913
2016-05-04 14:53:19 -07:00
Durham Goode
1621511aa7 store: a number of performance improvements
Summary:
Some miscellaneous perf improvements:
- Get rid of unnecessary sorting
- Bail early from remotefilelog reverse filename lookup if there are no keys
- Avoid unnecessary class/enum lookups in ancestor hotpath
- Fix n^2 behavior in history repacking
- Remove unnecessary set() and tuple instantiation in hotpath
- Use __slots__ on repackentry to improve access times

Test Plan:
Ran the tests. Tested the actual perf improvements via 'hg repack' in
a large repo.  The n^2 fix caused a massive perf when, but the others shaved off
a number of seconds as well.

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3257595

Signature: t1:3257595:1462392755:80209fe66c2632d2c16a2840500b0655926e7513
2016-05-04 14:53:13 -07:00
Durham Goode
7da17af64f store: record what files were created during a repack
Summary:
Previously, if you ran repack twice in a row, it would actually delete your
packs, because the repack produced files with the same name as before, and the
cleanup then deleted them.

The fix is to have the stores record what files they produced in the ledger,
then future clean up work can avoid deleting anything that was created during
the repack.

Test Plan: Added a test

Reviewers: #mercurial, ttung, mitrandir

Reviewed By: mitrandir

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3255819

Signature: t1:3255819:1462393814:d32155b12535990f72fbe48de045eddbb6f7fab6
2016-05-04 14:53:10 -07:00
Durham Goode
4ecb47b021 store: move history repack logic to repacker
Summary:
We had a naive repack implementation in historypack.py. Let's move it to the
repack module and do the minor adjustments to use the new repackerledger apis.

Test Plan:
Ran hg repack in conjunction with future diffs that make use of this
api

Reviewers: rmcelroy, ttung, lcharignon, quark, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3249587

Signature: t1:3249587:1462232544:591cd8bec09f781370896470746eae5a4489531f
2016-05-03 12:33:54 -07:00
Durham Goode
735aa964d5 store: move data repack logic to repacker
Summary:
We had a naive repack implementation in datapack.py. Let's move it to the repack
module and do the minor adjustments to use the new repackerledger apis.

Test Plan: Ran it in conjunction with future diffs that make use of this api.

Reviewers: rmcelroy, ttung, lcharignon, quark, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3249585

Signature: t1:3249585:1462232504:a00aa65afca9562a2c1456cc4ab48c50d1ba5b68
2016-05-03 12:33:36 -07:00
Durham Goode
c429030ca4 store: add class definitions and stub for repack
Summary:
This introduces the high level classes that will implement the generic repack
logic.

Test Plan: Ran the repack in conjunction with later commits that use these apis.

Reviewers: rmcelroy, ttung, lcharignon, quark, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3249577

Signature: t1:3249577:1462225435:000f9cc29ae2a3d7fdbedf546c8936ef45d1e4cf
2016-05-03 12:32:35 -07:00