Commit Graph

16 Commits

Author SHA1 Message Date
Durham Goode
96ea7cd405 histpack: add per-node index
Summary:
Previously looking up a particular node in a histpack required a bisect to find
the file section, then a linear scan to find the particular node.  If you needed
to look up the latest 3000 nodes one by one, that involved 3000 linear scans,
many of which traversed the same nodes over and over.

This patch adds additional index at the end of the current histidx file. In a
future patch, we will change getnodeinfo() to use this index instead of the
linear scan logic.

Test Plan:
Ran the tests. I haven't actually verified that the data in these
indexes is correct. My next patch will add logic that reads these indexes and
will add tests around it. I won't land this until I've confirmed it's correct.

Reviewers: quark, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4983690

Signature: t1:4983690:1493798877:ae0802b4896b54bf066df9f684d94554855fd35a
2017-05-03 10:19:45 -07:00
Durham Goode
d1a927d335 packs: add entry count to pack index
Summary:
Previously, we used the length of the index file to determine the upper bounds
of the bisect. In a future patch we'll want to add more data to the end of the
index file, so we need to record how long the index portion of the index is.
This patch adds that information.

Test Plan: Ran the tests.

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4983682

Signature: t1:4983682:1493693255:57ab9af2030847fedff05b6755113ba8ce0c933b
2017-05-03 10:19:45 -07:00
Durham Goode
8674a90bb5 histpack: add version 1
Summary:
This patch just bumps the histpack version number to 1 and adds a config flag to
enable writing v1 pack files. The format hasn't actually changed in this patch,
I'm just doing the verison bump so I can update all the hashes in the tests
without working about functionality change.

In the next patch I will modify the index format, which won't affect the hashes.

Test Plan:
Ran the tests. I also ran the tests with some debug code to manually
force the sha to include 0 instead of 1 and verified that the hash didn't change
(which confirms that all of these hash changes are just because of that one byte
version change).

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4983675

Signature: t1:4983675:1493692444:5d88df4d46ce487f1b791417754ba000ecf10a1e
2017-05-03 10:19:45 -07:00
Durham Goode
2a7e625b34 treemanifest: add history pack data during commit
Summary:
Previously, a local commit would only write data packs, and it just threw away
the history data entirely. Let's add history packs and record them on commit.

Test Plan:
The tests are updated to show these new packs. In some cases the
datapacks got smaller as well, since they can now take advantage of history data
for delta choices.

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4956105

Signature: t1:4956105:1493265399:d3fa1052c207fba0045cbb92b4d833d18d48e099
2017-04-27 10:44:34 -07:00
Durham Goode
df97f609bf treemanifest: move history sorting responsibility into MutableHistoryPack
Summary:
Previously, the logic that added data to a mutable history pack was required to
add it in the correct order (all entries for a certain file at once, and in
newest-first order). This required the callers to jump through weird hoops if
the data came in out of order or at different times in the transaction.

This patch moves the ordering logic to be inside MutableHistoryPack, so callers
can add the data in any order they wish, and it will get sorted before being
serialized.

This does add memory pressure to things that read a lot of history, like repack.
If this becomes a problem we may want to add a 'historypack.flush()' api that
let's us tell the history pack it's ok to flush it's current contents to disk.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4956096

Signature: t1:4956096:1493264693:a2275a49e35565d4b11244e3e5dd82c25de7e16e
2017-04-27 10:44:34 -07:00
Durham Goode
10f743a300 treemanifest: chop off trailing slash when requesting manifests
Summary:
The name being passed to the store was wrong, because it had a trailing slash.
This wasn't caught before because the code was reading and writing the paths
with a slash at the end, so it matched as long as we only interacted with packs
produced by the code. The issue became more obvious when I tried to have packs
generated from revlogs interact with this code.

All the tests are affected since the entry keys changed.

Also use 'const ManifestFetcher &x' to pass the ref to avoid the copy.

Test Plan: Tests updated

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4901274

Signature: t1:4901274:1492638476:ff28f8976657baec99effbd82ecd436f6282ea5b
2017-04-19 21:14:04 -07:00
Durham Goode
3f65bd2532 treemanifest: don't build pack files that depend on other pack files
Summary:
Previously, when treemanifest would create packs of trees during pull, we
allowed trees to be delta'd against trees in other packs. This resulted in
smaller packs, but if the other pack disappeared for some reason (since it's a
cache), the chain broke.

This patch ensures that the first version of every tree added to a pack is a
full text.

This temporarily makes repacks worse, since the repacker doesn't know about
history to produce deltas when combining packs. The next patch adds history
awareness which improves the repack deltafication.

Test Plan:
Updated the tests, and inspected the new test results to ensure that
all packs only had deltas within the pack.

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4647348

Signature: t1:4647348:1488882214:e850622a853a534fc60caeef604c88c30740c60d
2017-03-07 11:15:25 -08:00
Durham Goode
943afede25 treemanifest: add support for creating history packs
Summary:
Previously the treemanifest auto-tree-creation logic only produced data packs
containing the actual contents of the tree blobs. This lost history information
which is important for our ability to efficiently repack the data files.

This patch creates history packs during pull as well. A future patch will also
create history packs for the local tree blob store.

Test Plan: Updated the tests to cover this

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4638865

Signature: t1:4638865:1488449992:48b60961b50b90b6d0e75a64af1f36fb29944e7a
2017-03-07 11:15:25 -08:00
Durham Goode
940796814d treemanifest: add option for using native store
Summary:
Adds a treemanifest.usecunionstore config flag for enabling and disabling use of
the native code uniondatapackstore.

Since we haven't implemented the repack APIs on the native datapack stores, we
currently have to force repack to use the old python implementations. Instead of
trying to expose just the appropriate APIs through the python interface, I think
we'll rewrite all of repack to be in C++ at a future time, since we can take
advantage of parallelism, etc.

Test Plan:
Updated test-treemanifest.t to use the c datapackstore. Also run all
the tests with --extra-config-opt=treemanifest.usecunionstore=True.

These tests caught a missing null check in the C++ code as well.

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4609795

Signature: t1:4609795:1488365341:203362db5f470b613c4d6484686cd32c3fa8458f
2017-03-01 16:55:19 -08:00
Durham Goode
760ab3473c treemanifest: error out if invalid configuration
Summary:
treemanifest requires fastmanifest, and fastmanifest.usetree requires
treemanifest. Let's make these dependencies explicit in the code and error out
if they are incorrect.

Test Plan: Added a test

Reviewers: #mercurial, wez

Reviewed By: wez

Subscribers: wez, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4417512

Signature: t1:4417512:1484347447:7e18340813fac0b298aa51a7cc2f89fc6953680f
2017-01-13 14:58:20 -08:00
Durham Goode
6492dbbf03 hybridmanifest: don't reuse node in match results
Summary:
Previously, _converttohybridmanifest would always create a new hybrid manifest
with the same node as the original. This meant that some code paths would
attempt to use the treemanifest from the node, instead of the already prepared
matches result. This meant the output could contain all the values from the
original tree, instead of just the matches output.

This is actually a regression from 98ba34a5194c09. Prior to that, matches did
not reuse the node.

Test Plan: Manually inspected the results in the debugger during a rebase.

Reviewers: #mercurial, ikostia

Reviewed By: ikostia

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4247821

Signature: t1:4247821:1480499268:27f4a1b92ecf5d10009996b5b8f22bac02f3f38e
2016-12-02 14:37:28 -08:00
Durham Goode
a0dc16174d treemanifest: write deltas for trees
Summary:
Previously, when we wrote each tree entry into a pack file, it wasn't delta'd in
any way. This patch makes it store the delta against p1 in the pack file.

Testing in a large repo shows this reduces tree pack size by about 22x.

Test Plan:
Ran the tests. Did a pull in a large repo and saw the pack file was
22x smaller than before (and still usable).

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4202088
2016-11-29 15:37:58 -08:00
Durham Goode
b179a3593c treemanifest: sort parent nodes for hash computation
Summary:
Core mercurial sorts p1 and p2 before computing the hash, so it's deterministic.
We need to do the same.

Test Plan: Ran the tests, saw a hash changed

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4202063
2016-11-29 15:37:58 -08:00
Durham Goode
8aad6cabdd tests: glob away system specific output
ls -l prints a 'total #' line which apparently is system specific. So let's make
it a glob.
2016-10-28 16:50:27 -07:00
Durham Goode
df79de6e69 treemanifest: add test for commit producing trees
Summary: Adds a test for verifying that hg commit adds a tree pack to local storage.

Test Plan: Ran it

Reviewers: #mercurial, zamsden

Reviewed By: zamsden

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4082915

Signature: t1:4082915:1477590191:6637b9ab0fdad27a7d4933934168b57620dee235
2016-10-27 15:25:34 -07:00
Durham Goode
7ae5ac490d treemanifest: add test for autocreatetrees
Summary:
Adds a simple test for checking that trees are created during pull when
autocreatetrees is enabled.

Test Plan: Ran it

Reviewers: #mercurial, zamsden

Reviewed By: zamsden

Subscribers: zamsden, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4082871

Signature: t1:4082871:1477590276:09e594f4b87628bae8654ffdfc0bba18beb7ccad
2016-10-27 15:25:32 -07:00