Commit Graph

14 Commits

Author SHA1 Message Date
Mihails Smolins
a2bdf68679 remotefilelog: limit number of changesets to be prefetched
Summary:
Added config option 'prefetchdays' which indicates that commits older than
'prefetchdays' days should not be prefetched. This option is necessary to avoid
prefetch of huge amount of data. The default value is set to 14 days.

Test Plan: Ensure that unit tests pass

Reviewers: ryanmce, simonfar, durham, #fbhgext, simpkins

Reviewed By: #fbhgext, simpkins

Subscribers: simpkins

Differential Revision: https://phab.mercurial-scm.org/D420
2017-08-16 13:21:31 -07:00
Durham Goode
79639557a1 datapack: allow 'long' for metadata types
Summary:
Previously the code required that sizes be of type int. Since python plays loose
with integer types, we also need to support long.

Test Plan:
The existing test-remotefilelog-repack-fast.t test was completely
broken. It only enabled fast datapacks for the server repo, not the clients.
Enabling it for the clients as well catches this issue and verifies the fix.

Reviewers: #fbhgext, quark

Reviewed By: #fbhgext, quark

Differential Revision: https://phab.mercurial-scm.org/D54
2017-07-11 17:02:15 -07:00
Durham Goode
96ea7cd405 histpack: add per-node index
Summary:
Previously looking up a particular node in a histpack required a bisect to find
the file section, then a linear scan to find the particular node.  If you needed
to look up the latest 3000 nodes one by one, that involved 3000 linear scans,
many of which traversed the same nodes over and over.

This patch adds additional index at the end of the current histidx file. In a
future patch, we will change getnodeinfo() to use this index instead of the
linear scan logic.

Test Plan:
Ran the tests. I haven't actually verified that the data in these
indexes is correct. My next patch will add logic that reads these indexes and
will add tests around it. I won't land this until I've confirmed it's correct.

Reviewers: quark, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4983690

Signature: t1:4983690:1493798877:ae0802b4896b54bf066df9f684d94554855fd35a
2017-05-03 10:19:45 -07:00
Durham Goode
d1a927d335 packs: add entry count to pack index
Summary:
Previously, we used the length of the index file to determine the upper bounds
of the bisect. In a future patch we'll want to add more data to the end of the
index file, so we need to record how long the index portion of the index is.
This patch adds that information.

Test Plan: Ran the tests.

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4983682

Signature: t1:4983682:1493693255:57ab9af2030847fedff05b6755113ba8ce0c933b
2017-05-03 10:19:45 -07:00
Durham Goode
8674a90bb5 histpack: add version 1
Summary:
This patch just bumps the histpack version number to 1 and adds a config flag to
enable writing v1 pack files. The format hasn't actually changed in this patch,
I'm just doing the verison bump so I can update all the hashes in the tests
without working about functionality change.

In the next patch I will modify the index format, which won't affect the hashes.

Test Plan:
Ran the tests. I also ran the tests with some debug code to manually
force the sha to include 0 instead of 1 and verified that the hash didn't change
(which confirms that all of these hash changes are just because of that one byte
version change).

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4983675

Signature: t1:4983675:1493692444:5d88df4d46ce487f1b791417754ba000ecf10a1e
2017-05-03 10:19:45 -07:00
Jun Wu
4a613bcaef datapack: update the format about metadata
Summary:
This diff makes 2 changes to v1 packfile metadata:

1. Move `key` in a metadata entry to before `size`.

```
old: [entry-size: 2 byte] [key: 1 byte] [data: var length]
new: [key: 1 byte] [data-size: 2 byte] [data: var length]
```

Previously `entry-size == 0` does not make sense.

2. Use binary to represent sizes, instead of ASCII.

Related utility methods are cleaned up a bit so it's harder to make mistakes.

Test Plan: Updated existing tests

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: durham, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4983189

Signature: t1:4983189:1493689852:22d544d73ed63fac83f849786de035af304161ce
2017-05-01 19:03:25 -07:00
Durham Goode
df97f609bf treemanifest: move history sorting responsibility into MutableHistoryPack
Summary:
Previously, the logic that added data to a mutable history pack was required to
add it in the correct order (all entries for a certain file at once, and in
newest-first order). This required the callers to jump through weird hoops if
the data came in out of order or at different times in the transaction.

This patch moves the ordering logic to be inside MutableHistoryPack, so callers
can add the data in any order they wish, and it will get sorted before being
serialized.

This does add memory pressure to things that read a lot of history, like repack.
If this becomes a problem we may want to add a 'historypack.flush()' api that
let's us tell the history pack it's ok to flush it's current contents to disk.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4956096

Signature: t1:4956096:1493264693:a2275a49e35565d4b11244e3e5dd82c25de7e16e
2017-04-27 10:44:34 -07:00
Jun Wu
4240bd017e remotefilelog: let content stores support metadata
Summary:
This diffs add a `getmeta` method to all content stores. The cdatapack code is
modified to pass the tests, it needs further change to support `getmeta`.

The datapack format is bumped to v1 from v0. For v1, we append a `metadata`
dict at the end of each revision. The dict is currently used to store revlog
flags and rawsize of raw revlog fulltext. In the future we can put more data
like a second hash etc, without changing API or format again.

This diff focuses on correctness. A datapack caching layer to speed up
`getmeta` will be added later.

Tests are updated since we write new v1 packfile now and the format change
leads to different content and packfile names.

`Makefile`, `ls-l.py` are added to make tests easier to maintain.

Test Plan: Updated existing tests.

Reviewers: #mercurial, rmcelroy, durham

Reviewed By: durham

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4903917

Signature: t1:4903917:1493255844:7ef5d487096cd2f78f2aaae672a68d49f33632ee
2017-04-26 19:50:36 -07:00
Durham Goode
2ed79d7eb9 treemanifest: fix test flakiness
Summary:
The old test output had some flakiness where:

1. The linknode for y:1406e7411862 would flicker between commit 0 and commit 6,
even though y never existed in commit 0. This is caused by some ambiguity in how
the store represents the ancestor list (it's keyed on node instead of (name,
node)). To fix the test let's just modify file y so it doesn't have the same
hash as file x at any point.

2. The order of nodes in the histpack would flicker because there was no link
between the two separate histories of file y. This is caused by a lack of
sorting of the separate roots before writing to the histpack.

Test Plan: Ran the test many times

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4915472

Signature: t1:4915472:1492641328:d4542665e3a69fe2fc188b450f75ff10e6d681de
2017-04-19 21:14:04 -07:00
Durham Goode
63f4fa69a0 datapack: remove delta reuse
Summary:
This code that reused deltas if the delta parent wasn't available was bugged
because it meant you could end up with a cycle in the delta chains. This was an
old optimization from before trees had history, so let's drop the optimization
(since trees now have history and can be correctly repacked).

Test Plan:
Ran repack on a packfile that previously caused cycles. Verified the
new version did not with `hg debugdatapack foo.datapack'

Reviewers: #mercurial

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4724520
2017-03-18 19:38:45 -07:00
Durham Goode
41d4153092 remotefilelog: adjust PYTHONPATH during tests
When remotefilelog moved from its own repo, the tests needed to be updated to
adjust the PYTHONPATH to ensure the in-repo remotefilelog was loaded instead of
the system one.

This meant any local runs of remotefilelog tests would've been using the system
remotefilelog unless the user had manually set the PYTHONPATH themselves.
2017-02-09 18:02:52 -08:00
Jun Wu
5d5fedd10c tweakdefaults: fix check config test
Summary:
This fixes test-check-config-hg.t for tweakdefaults. And did some clean-up
for other minor issues.

I was trying to implement another feature (along with the clean-up) in
tweakdefaults and finally realized it's infeasible and drop the feature. But
the clean-up seems useful thus sent here.

Also change `cp -r` to `cp -R` to pass the usptream check-code test.

Test Plan: `arc unit`

Reviewers: #sourcecontrol, mitrandir

Reviewed By: mitrandir

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4253852

Signature: t1:4253852:1480613350:398e9b234fcc2360dcb8a3e3ad4e5bc5c4377857
2016-11-30 20:58:08 +00:00
Durham Goode
fe6f541ed8 remotefilelog: reuse deltabase when history is not available
Summary:
Previously, if there was no history available for a given revision, we would
just store the full text in the pack file. This patch makes it attempt to reuse
the existing delta base instead. This will be useful for repacking
treemanifests, since we currently don't have histpacks for them (we just delta
them efficiently when they are first added to the repo).

Test Plan: Ran tests

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4240705
2016-11-29 15:37:58 -08:00
Ryan McElroy
99a672e000 remotefilelog: rename tests/test-* to tests/test-remotefilelog-*
Summary:
This makes it possible to run all remotefilelog tests without others
It also avoids some issues with name collisions in the upcoming merge.

Test Plan: next commit is a merge and no conflicts in tests/

Reviewers: #sourcecontrol, ttung, durham, mitrandir, simonfar

Reviewed By: mitrandir, simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3764379

Tasks: 12855049

Signature: t1:3764379:1472217061:67a0cc8f1fc29f991be08fe965679535ff6df27a
2016-08-26 06:11:27 -07:00