Summary:
Previously, looking up a particular node in a histpack required a bisect to find
the file section, then a linear scan to find the particular node. If you needed
to look up the latest 3000 nodes one by one, that involved 3000 linear scans,
many of which traversed the same nodes over and over.
This patch adds an additional index at the end of the current histidx file. In a
future patch, we will change getnodeinfo() to use this index instead of the
linear scan logic.
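A minimal sketch of the difference between the two lookup strategies, using
in-memory Python structures rather than the real on-disk histidx layout. The
names sections, lookup_linear and lookup_indexed are hypothetical and only
illustrate the idea; they are not the actual histpack API.

```python
import bisect

# Hypothetical in-memory stand-in for a histpack: (filename, nodes) sections
# sorted by filename. The real on-disk layout is different.
sections = [
    ("a.txt", ["n3", "n2", "n1"]),
    ("b.txt", ["m2", "m1"]),
]
filenames = [name for name, _ in sections]


def find_section(filename):
    """Bisect over the sorted filenames to find the file section."""
    i = bisect.bisect_left(filenames, filename)
    if i == len(filenames) or filenames[i] != filename:
        raise KeyError(filename)
    return sections[i][1]


def lookup_linear(filename, node):
    """Old approach: bisect to the section, then scan it node by node."""
    for offset, candidate in enumerate(find_section(filename)):
        if candidate == node:
            return offset
    raise KeyError(node)


# New approach (assumed shape): a per-file node index built once, so repeated
# lookups no longer rescan the same nodes.
node_index = {
    name: {node: offset for offset, node in enumerate(nodes)}
    for name, nodes in sections
}


def lookup_indexed(filename, node):
    """Look the node up directly instead of scanning the section."""
    return node_index[filename][node]
```

Both functions return the same position; only the cost per lookup differs.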
Test Plan:
Ran the tests. I haven't actually verified that the data in these
indexes is correct. My next patch will add logic that reads these indexes and
will add tests around it. I won't land this until I've confirmed it's correct.
Reviewers: quark, #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4983690
Signature: t1:4983690:1493798877:ae0802b4896b54bf066df9f684d94554855fd35a
Summary:
Previously, we used the length of the index file to determine the upper bound
of the bisect. In a future patch we'll want to add more data to the end of the
index file, so we need to record how long the bisectable portion of the file is.
This patch adds that information.
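As a rough illustration of why an explicit length helps, the sketch below
bisects over a byte buffer whose header records how many bytes of index
entries follow. The entry and header layouts (ENTRY, HEADER) are assumptions
made for the example and are not the real histidx format.

```python
import struct

# Hypothetical layouts, chosen only for this example.
ENTRY = struct.Struct("!20sQ")  # 20-byte key, 8-byte offset
HEADER = struct.Struct("!Q")    # number of bytes occupied by index entries


def build_index(entries, trailer=b""):
    """Serialize sorted entries, prefixed by the length of the entry region."""
    body = b"".join(ENTRY.pack(key, offset) for key, offset in sorted(entries))
    return HEADER.pack(len(body)) + body + trailer


def find(index, key):
    """Bisect using the recorded length, ignoring any data after the entries."""
    (indexlen,) = HEADER.unpack_from(index, 0)
    start = HEADER.size
    lo, hi = 0, indexlen // ENTRY.size
    while lo < hi:
        mid = (lo + hi) // 2
        midkey, offset = ENTRY.unpack_from(index, start + mid * ENTRY.size)
        if midkey == key:
            return offset
        if midkey < key:
            lo = mid + 1
        else:
            hi = mid
    return None
```

With the length recorded up front, extra data appended after the entries does
not change the bounds of the bisect.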
Test Plan: Ran the tests.
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4983682
Signature: t1:4983682:1493693255:57ab9af2030847fedff05b6755113ba8ce0c933b
Summary:
This patch just bumps the histpack version number to 1 and adds a config flag to
enable writing v1 pack files. The format hasn't actually changed in this patch,
I'm just doing the version bump so I can update all the hashes in the tests
without worrying about functionality changes.
In the next patch I will modify the index format, which won't affect the hashes.
Test Plan:
Ran the tests. I also ran the tests with some debug code to manually
force the sha to include 0 instead of 1 and verified that the hash didn't change
(which confirms that all of these hash changes are just because of that one byte
version change).
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4983675
Signature: t1:4983675:1493692444:5d88df4d46ce487f1b791417754ba000ecf10a1e
Summary:
Previously, a local commit would only write data packs, and it just threw away
the history data entirely. Let's add history packs and record them on commit.
Test Plan:
The tests are updated to show these new packs. In some cases the
datapacks got smaller as well, since they can now take advantage of history data
for delta choices.
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: quark, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4956105
Signature: t1:4956105:1493265399:d3fa1052c207fba0045cbb92b4d833d18d48e099
Summary:
Previously, the logic that added data to a mutable history pack was required to
add it in the correct order (all entries for a certain file at once, and in
newest-first order). This required the callers to jump through weird hoops if
the data came in out of order or at different times in the transaction.
This patch moves the ordering logic to be inside MutableHistoryPack, so callers
can add the data in any order they wish, and it will get sorted before being
serialized.
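A minimal sketch of the buffering idea, assuming entries are keyed by filename
and node and that "newest first" means a node is written before its ancestors.
BufferedHistoryPack is a hypothetical stand-in, not the real
MutableHistoryPack, and None stands in for the null parent.

```python
from collections import defaultdict


class BufferedHistoryPack:
    """Hypothetical stand-in: accepts entries in any order, sorts on output."""

    def __init__(self):
        self._entries = defaultdict(dict)  # filename -> {node: (p1, p2)}

    def add(self, filename, node, p1, p2):
        self._entries[filename][node] = (p1, p2)

    def sorted_entries(self):
        """Yield entries grouped by file, each node before its ancestors."""
        for filename in sorted(self._entries):
            entries = self._entries[filename]
            for node in self._newest_first(entries):
                yield (filename, node) + entries[node]

    @staticmethod
    def _newest_first(entries):
        # Iterative post-order walk emits parents before children;
        # reversing that order puts the newest nodes first.
        order, seen = [], set()
        for root in entries:
            stack = [(root, False)]
            while stack:
                node, expanded = stack.pop()
                if expanded:
                    order.append(node)
                    continue
                if node in seen or node not in entries:
                    continue
                seen.add(node)
                stack.append((node, True))
                for parent in entries[node]:
                    stack.append((parent, False))
        order.reverse()
        return order
```

Because everything is held until serialization, memory use grows with the
number of buffered entries.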
This does add memory pressure to things that read a lot of history, like repack.
If this becomes a problem we may want to add a 'historypack.flush()' api that
lets us tell the history pack it's ok to flush its current contents to disk.
Test Plan: Ran the tests
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: quark, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4956096
Signature: t1:4956096:1493264693:a2275a49e35565d4b11244e3e5dd82c25de7e16e
Summary:
The name being passed to the store was wrong, because it had a trailing slash.
This wasn't caught before because the code was reading and writing the paths
with a slash at the end, so it matched as long as we only interacted with packs
produced by the code. The issue became more obvious when I tried to have packs
generated from revlogs interact with this code.
All the tests are affected since the entry keys changed.
Also use 'const ManifestFetcher &x' to pass the ref to avoid the copy.
Test Plan: Tests updated
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: quark, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4901274
Signature: t1:4901274:1492638476:ff28f8976657baec99effbd82ecd436f6282ea5b
Summary:
Previously, when treemanifest would create packs of trees during pull, we
allowed trees to be delta'd against trees in other packs. This resulted in
smaller packs, but if the other pack disappeared for some reason (since it's a
cache), the chain broke.
This patch ensures that the first version of every tree added to a pack is a
full text.
This temporarily makes repacks worse, since the repacker doesn't know about
history to produce deltas when combining packs. The next patch adds history
awareness which improves the repack deltafication.
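The rule can be sketched as follows, assuming each entry is a (node, p1) pair
and that p1 is the only delta base considered. plan_pack and
choose_delta_base are hypothetical helpers for illustration, not the actual
treemanifest code.

```python
def choose_delta_base(p1, nodes_in_pack):
    """Return the node to delta against, or None to store a full text."""
    return p1 if p1 in nodes_in_pack else None


def plan_pack(entries):
    """Decide, for (node, p1) pairs in write order, what each is stored as.

    A delta is only allowed when its base is already in this same pack, so
    the first version of each tree always ends up as a full text.
    """
    in_pack = set()
    plan = []
    for node, p1 in entries:
        plan.append((node, choose_delta_base(p1, in_pack)))
        in_pack.add(node)
    return plan
```

For example, if t1 lives in some other pack, t2 (whose p1 is t1) is stored as a
full text, while t3 (whose p1 is t2) can still be a delta against t2.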
Test Plan:
Updated the tests, and inspected the new test results to ensure that
all packs only had deltas within the pack.
Reviewers: #mercurial, simonfar
Reviewed By: simonfar
Subscribers: simonfar, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4647348
Signature: t1:4647348:1488882214:e850622a853a534fc60caeef604c88c30740c60d
Summary:
Previously, the treemanifest auto-tree-creation logic only produced data packs
containing the actual contents of the tree blobs. This lost history information,
which is important for our ability to efficiently repack the data files.
This patch creates history packs during pull as well. A future patch will also
create history packs for the local tree blob store.
Test Plan: Updated the tests to cover this
Reviewers: #mercurial, simonfar
Reviewed By: simonfar
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4638865
Signature: t1:4638865:1488449992:48b60961b50b90b6d0e75a64af1f36fb29944e7a
Summary:
Adds a treemanifest.usecunionstore config flag for enabling and disabling use of
the native code uniondatapackstore.
Since we haven't implemented the repack APIs on the native datapack stores, we
currently have to force repack to use the old python implementations. Instead of
trying to expose just the appropriate APIs through the python interface, I think
we'll rewrite all of repack to be in C++ at a future time, since we can take
advantage of parallelism, etc.
Test Plan:
Updated test-treemanifest.t to use the C datapackstore. Also ran all
the tests with --extra-config-opt=treemanifest.usecunionstore=True.
These tests caught a missing null check in the C++ code as well.
Reviewers: #mercurial, simonfar
Reviewed By: simonfar
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4609795
Signature: t1:4609795:1488365341:203362db5f470b613c4d6484686cd32c3fa8458f
Summary:
treemanifest requires fastmanifest, and fastmanifest.usetree requires
treemanifest. Let's make these dependencies explicit in the code and error out
if they are incorrect.
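A sketch of what such a check could look like, assuming the enabled extensions
are available as a set of names and config values as a dict keyed by
(section, name). check_dependencies and ConfigError are hypothetical names,
not the functions this patch adds.

```python
class ConfigError(Exception):
    """Raised when extension/config dependencies are not satisfied."""


def check_dependencies(enabled_extensions, config):
    """Fail early if treemanifest and fastmanifest are configured inconsistently."""
    if ("treemanifest" in enabled_extensions
            and "fastmanifest" not in enabled_extensions):
        raise ConfigError("treemanifest requires fastmanifest to be enabled")
    if (config.get(("fastmanifest", "usetree"))
            and "treemanifest" not in enabled_extensions):
        raise ConfigError("fastmanifest.usetree requires treemanifest to be enabled")
```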
Test Plan: Added a test
Reviewers: #mercurial, wez
Reviewed By: wez
Subscribers: wez, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4417512
Signature: t1:4417512:1484347447:7e18340813fac0b298aa51a7cc2f89fc6953680f
Summary:
Previously, _converttohybridmanifest would always create a new hybrid manifest
with the same node as the original. This meant that some code paths would
attempt to use the treemanifest from the node, instead of the already prepared
matches result. As a result, the output could contain all the values from the
original tree, instead of just the matches output.
This is actually a regression from 98ba34a5194c09. Prior to that, matches did
not reuse the node.
Test Plan: Manually inspected the results in the debugger during a rebase.
Reviewers: #mercurial, ikostia
Reviewed By: ikostia
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4247821
Signature: t1:4247821:1480499268:27f4a1b92ecf5d10009996b5b8f22bac02f3f38e
Summary:
Previously, when we wrote each tree entry into a pack file, it wasn't delta'd in
any way. This patch makes it store the delta against p1 in the pack file.
Testing in a large repo shows this reduces tree pack size by about 22x.
Test Plan:
Ran the tests. Did a pull in a large repo and saw the pack file was
22x smaller than before (and still usable).
Reviewers: #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D4202088
Summary:
Core mercurial sorts p1 and p2 before computing the hash, so it's deterministic.
We need to do the same.
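For illustration, a small sketch of a parent-order-independent hash: the two
parents are sorted, then hashed together with the text using SHA-1. node_hash
is a hypothetical helper name; the tree code touched by this patch may be
structured differently.

```python
import hashlib


def node_hash(text, p1, p2):
    """Hash text with its parents, sorting them so the order doesn't matter."""
    first, second = sorted([p1, p2])
    s = hashlib.sha1(first)
    s.update(second)
    s.update(text)
    return s.digest()
```

Swapping p1 and p2 yields the same digest, which is what makes the result
deterministic.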
Test Plan: Ran the tests and saw that a hash changed
Reviewers: #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D4202063
Summary: Adds a test for verifying that hg commit adds a tree pack to local storage.
Test Plan: Ran it
Reviewers: #mercurial, zamsden
Reviewed By: zamsden
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4082915
Signature: t1:4082915:1477590191:6637b9ab0fdad27a7d4933934168b57620dee235
Summary:
Adds a simple test for checking that trees are created during pull when
autocreatetrees is enabled.
Test Plan: Ran it
Reviewers: #mercurial, zamsden
Reviewed By: zamsden
Subscribers: zamsden, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4082871
Signature: t1:4082871:1477590276:09e594f4b87628bae8654ffdfc0bba18beb7ccad