Summary:
Previously our data packs would have incredibly long delta chains, where to read
the last entry in the file you had to read every previous delta all the way to
the beginning. This was very expensive for chains of 2+ million deltas. Let's
add a config option to limit how long the deltas get, and for now we'll default
it to 1000.
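
The effect of the limit can be sketched as follows. This is only an illustration of capping chain length: `plan_entries` and `maxchainlen` are hypothetical names, not the actual remotefilelog code or config key, and the oldest-to-newest ordering is an assumption.

```python
# Hypothetical sketch: names are illustrative, not the remotefilelog API.
MAX_CHAIN_LEN = 1000  # the default mentioned in this summary


def plan_entries(revisions, maxchainlen=MAX_CHAIN_LEN):
    """Decide, per revision, whether to store a delta or a full text.

    `revisions` is an iterable of node ids, assumed oldest-to-newest.
    Returns (node, deltabase) pairs; deltabase is None for a full text.
    """
    plan = []
    chainlen = 0
    prev = None
    for node in revisions:
        if prev is None or chainlen >= maxchainlen:
            # Chain is at the limit (or empty): start over with a full text.
            plan.append((node, None))
            chainlen = 0
        else:
            # Still under the limit: delta against the previous revision.
            plan.append((node, prev))
            chainlen += 1
        prev = node
    return plan
```

With a cap in place, reading any entry applies at most `maxchainlen` deltas on top of one full text, at the cost of storing more full texts.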
Test Plan: Added a test
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5095603
Signature: t1:5095603:1495213707:737d63129cf459ad6927a1f4deb0dda5a8ce0a7f
Summary:
Previously looking up a particular node in a histpack required a bisect to find
the file section, then a linear scan to find the particular node. If you needed
to look up the latest 3000 nodes one by one, that involved 3000 linear scans,
many of which traversed the same nodes over and over.
This patch adds an additional index at the end of the current histidx file. In a
future patch, we will change getnodeinfo() to use this index instead of the
linear scan logic.
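
As a rough sketch of why a per-node index helps, the lookup below replaces a linear scan with a bisect over sorted (node, offset) pairs. The function names and the in-memory list are assumptions for illustration; the real on-disk layout of the new index is not described in this summary.

```python
import bisect


def build_node_index(entries):
    """Sort (node, offset) pairs so they can be bisected by node."""
    return sorted(entries)


def lookup(index, node):
    """Return the offset for `node`, or raise KeyError if it is absent."""
    # A 1-tuple sorts before any (node, offset) pair with the same node,
    # so bisect_left lands on the matching entry when one exists.
    i = bisect.bisect_left(index, (node,))
    if i < len(index) and index[i][0] == node:
        return index[i][1]
    raise KeyError(node)
```

Each lookup then costs a logarithmic number of comparisons, so 3000 lookups no longer re-traverse the same nodes.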
Test Plan:
Ran the tests. I haven't actually verified that the data in these
indexes is correct. My next patch will add logic that reads these indexes and
will add tests around it. I won't land this until I've confirmed it's correct.
Reviewers: quark, #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4983690
Signature: t1:4983690:1493798877:ae0802b4896b54bf066df9f684d94554855fd35a
Summary:
Previously, we used the length of the index file to determine the upper bounds
of the bisect. In a future patch we'll want to add more data to the end of the
index file, so we need to record how long the index portion of the index is.
This patch adds that information.
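
A minimal sketch of the idea, assuming a made-up layout (an 8-byte big-endian length header followed by fixed-size entries). The struct formats and function names are hypothetical and are not the real histidx format.

```python
import struct

HEADER = struct.Struct("!Q")     # hypothetical: byte length of the index portion
ENTRY = struct.Struct("!20sQ")   # hypothetical: 20-byte key + 8-byte offset


def write_index(entries, trailer=b""):
    """Serialize sorted entries, prefixed by the index length.

    `trailer` stands in for any extra data appended after the index.
    """
    body = b"".join(ENTRY.pack(key, off) for key, off in sorted(entries))
    return HEADER.pack(len(body)) + body + trailer


def find(buf, key):
    """Bisect only the recorded index portion, ignoring trailing data."""
    (indexlen,) = HEADER.unpack_from(buf, 0)
    start = HEADER.size
    lo, hi = 0, indexlen // ENTRY.size
    while lo < hi:
        mid = (lo + hi) // 2
        k, off = ENTRY.unpack_from(buf, start + mid * ENTRY.size)
        if k < key:
            lo = mid + 1
        elif k > key:
            hi = mid
        else:
            return off
    return None
```

Because the upper bound comes from the header rather than the file size, appending more data after the index does not disturb the bisect.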
Test Plan: Ran the tests.
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4983682
Signature: t1:4983682:1493693255:57ab9af2030847fedff05b6755113ba8ce0c933b
Summary:
This patch just bumps the histpack version number to 1 and adds a config flag to
enable writing v1 pack files. The format hasn't actually changed in this patch,
I'm just doing the version bump so I can update all the hashes in the tests
without worrying about functionality change.
In the next patch I will modify the index format, which won't affect the hashes.
Test Plan:
Ran the tests. I also ran the tests with some debug code to manually
force the sha to include 0 instead of 1 and verified that the hash didn't change
(which confirms that all of these hash changes are just because of that one byte
version change).
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4983675
Signature: t1:4983675:1493692444:5d88df4d46ce487f1b791417754ba000ecf10a1e
Summary:
This diff makes 2 changes to v1 packfile metadata:
1. Move `key` in a metadata entry to before `size`.
```
old: [entry-size: 2 byte] [key: 1 byte] [data: var length]
new: [key: 1 byte] [data-size: 2 byte] [data: var length]
```
Previously, `entry-size == 0` did not make sense.
2. Use binary to represent sizes, instead of ASCII.
Related utility methods are cleaned up a bit so it's harder to make mistakes.
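
The new entry layout can be sketched like this. The function names are hypothetical, and big-endian byte order for the 2-byte size is an assumption, since this summary does not state the byte order.

```python
import struct


def pack_meta(meta):
    """Encode {key: value} as [key: 1 byte][data-size: 2 byte][data]."""
    out = []
    for key, value in sorted(meta.items()):
        if len(key) != 1:
            raise ValueError("metadata key must be exactly 1 byte")
        if len(value) > 0xFFFF:
            raise ValueError("metadata value does not fit a 2-byte size")
        # Assumption: the size is an unsigned big-endian 16-bit integer.
        out.append(key + struct.pack("!H", len(value)) + value)
    return b"".join(out)


def unpack_meta(buf):
    """Decode a buffer produced by pack_meta back into a dict."""
    meta = {}
    pos = 0
    while pos < len(buf):
        key = buf[pos:pos + 1]
        (size,) = struct.unpack_from("!H", buf, pos + 1)
        pos += 3
        value = buf[pos:pos + size]
        if len(value) != size:
            raise ValueError("truncated metadata entry")
        meta[key] = value
        pos += size
    return meta
```

With the size describing only the data, a size of zero is a valid entry: a key with an empty value.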
Test Plan: Updated existing tests
Reviewers: #mercurial, durham
Reviewed By: durham
Subscribers: durham, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4983189
Signature: t1:4983189:1493689852:22d544d73ed63fac83f849786de035af304161ce
Summary:
Previously, the logic that added data to a mutable history pack was required to
add it in the correct order (all entries for a certain file at once, and in
newest-first order). This required the callers to jump through weird hoops if
the data came in out of order or at different times in the transaction.
This patch moves the ordering logic to be inside MutableHistoryPack, so callers
can add the data in any order they wish, and it will get sorted before being
serialized.
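
A minimal sketch of buffering entries and ordering them at serialization time. `BufferedHistory` is a hypothetical stand-in, not the real MutableHistoryPack, and sorting file names alphabetically is an assumption.

```python
class BufferedHistory:
    """Hypothetical stand-in that accepts history entries in any order."""

    def __init__(self):
        self._files = {}  # filename -> {node: (p1, p2)}

    def add(self, filename, node, p1, p2):
        self._files.setdefault(filename, {})[node] = (p1, p2)

    def ordered(self):
        """Yield entries grouped by file, newest-first within each file."""
        for filename in sorted(self._files):
            entries = self._files[filename]
            for node in _newest_first(entries):
                yield filename, node, entries[node]


def _newest_first(entries):
    """Topologically order nodes so children come before their parents."""
    seen = set()
    order = []  # post-order: parents are appended before children
    for root in sorted(entries):  # sorted for deterministic output
        stack = [(root, False)]
        while stack:
            node, expanded = stack.pop()
            if expanded:
                order.append(node)
                continue
            if node in seen or node not in entries:
                continue
            seen.add(node)
            stack.append((node, True))
            for parent in entries[node]:
                if parent is not None:
                    stack.append((parent, False))
    return reversed(order)
```

Everything is held in memory until `ordered()` is consumed, which is the source of the memory pressure noted below.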
This does add memory pressure to things that read a lot of history, like repack.
If this becomes a problem we may want to add a 'historypack.flush()' api that
lets us tell the history pack it's ok to flush its current contents to disk.
Test Plan: Ran the tests
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: quark, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4956096
Signature: t1:4956096:1493264693:a2275a49e35565d4b11244e3e5dd82c25de7e16e
Summary:
This diff adds a `getmeta` method to all content stores. The cdatapack code is
modified to pass the tests, but it needs further changes to support `getmeta`.
The datapack format is bumped to v1 from v0. For v1, we append a `metadata`
dict at the end of each revision. The dict is currently used to store revlog
flags and rawsize of raw revlog fulltext. In the future we can put more data
like a second hash etc, without changing API or format again.
This diff focuses on correctness. A datapack caching layer to speed up
`getmeta` will be added later.
Tests are updated since we now write v1 packfiles, and the format change
leads to different content and packfile names.
`Makefile`, `ls-l.py` are added to make tests easier to maintain.
Test Plan: Updated existing tests.
Reviewers: #mercurial, rmcelroy, durham
Reviewed By: durham
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4903917
Signature: t1:4903917:1493255844:7ef5d487096cd2f78f2aaae672a68d49f33632ee
Summary:
The old test output had some flakiness where:
1. The linknode for y:1406e7411862 would flicker between commit 0 and commit 6,
even though y never existed in commit 0. This is caused by some ambiguity in how
the store represents the ancestor list (it's keyed on node instead of (name,
node)). To fix the test let's just modify file y so it doesn't have the same
hash as file x at any point.
2. The order of nodes in the histpack would flicker because there was no link
between the two separate histories of file y. This is caused by a lack of
sorting of the separate roots before writing to the histpack.
Test Plan: Ran the test many times
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: quark, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4915472
Signature: t1:4915472:1492641328:d4542665e3a69fe2fc188b450f75ff10e6d681de
Summary:
The code that reused deltas if the delta parent wasn't available was buggy
because it meant you could end up with a cycle in the delta chains. This was an
old optimization from before trees had history, so let's drop the optimization
(since trees now have history and can be correctly repacked).
Test Plan:
Ran repack on a packfile that previously caused cycles. Verified with
`hg debugdatapack foo.datapack` that the new version did not.
Reviewers: #mercurial
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4724520
When remotefilelog moved from its own repo, the tests needed to be updated to
adjust the PYTHONPATH to ensure the in-repo remotefilelog was loaded instead of
the system one.
This meant any local runs of remotefilelog tests would've been using the system
remotefilelog unless the user had manually set the PYTHONPATH themselves.
Summary:
This fixes test-check-config-hg.t for tweakdefaults and cleans up some other
minor issues.
I was trying to implement another feature (along with the clean-up) in
tweakdefaults, eventually realized it was infeasible, and dropped the feature.
The clean-up still seems useful, so I'm sending it here.
Also change `cp -r` to `cp -R` to pass the upstream check-code test.
Test Plan: `arc unit`
Reviewers: #sourcecontrol, mitrandir
Reviewed By: mitrandir
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4253852
Signature: t1:4253852:1480613350:398e9b234fcc2360dcb8a3e3ad4e5bc5c4377857
Summary:
Previously, if there was no history available for a given revision, we would
just store the full text in the pack file. This patch makes it attempt to reuse
the existing delta base instead. This will be useful for repacking
treemanifests, since we currently don't have histpacks for them (we just delta
them efficiently when they are first added to the repo).
Test Plan: Ran tests
Reviewers: #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D4240705
Summary:
This makes it possible to run all remotefilelog tests without the others.
It also avoids some issues with name collisions in the upcoming merge.
Test Plan: next commit is a merge and no conflicts in tests/
Reviewers: #sourcecontrol, ttung, durham, mitrandir, simonfar
Reviewed By: mitrandir, simonfar
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3764379
Tasks: 12855049
Signature: t1:3764379:1472217061:67a0cc8f1fc29f991be08fe965679535ff6df27a