Commit Graph

598 Commits

Author SHA1 Message Date
Jun Wu
b1a694579e remotefilelog: wrap flog.addrawrevision instead of flog.add
Summary:
remotefilelog needs to wait for changelog creation to get the commit hash so
all filelog appending operations are pending if linkrev is an integer.

That is better done in the `addrawrevision` layer since it will catch more code
paths. Besides, that means remotefilelog won't need to hash the same revision
twice.

Test Plan: arc unit

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5061330

Signature: t1:5061330:1494871717:a7224ebdc0f221fbaabbd2a58de975caec0e4b05
2017-05-16 16:23:37 -07:00
Jun Wu
a9a7b9784f remotefilelog: implement addrawrevision
Summary: This is required (but not complete) to allow LFS fast path.

Test Plan: arc unit

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: durham

Differential Revision: https://phabricator.intern.facebook.com/D5061283

Signature: t1:5061283:1494959731:2d7ed7465c1724cd1f231a206f949da16f90649c
2017-05-16 16:01:13 -07:00
Jun Wu
8bf7723507 remotefilelog: move _createfileblob to a separate method
Summary: This makes the next patch cleaner to read.

Test Plan: arc unit

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5061271

Signature: t1:5061271:1494869273:967b647281847fa39e88558805bcf3b9a9e2b57b
2017-05-16 15:57:12 -07:00
Jun Wu
c10d3a3dff remotefilelog: implement nodemap
Summary:
Core HG code uses revlog.nodemap to test node existence. A future LFS code
path will hit that check, so let's add a nodemap to remotefilelog.

Currently, the code path won't be hit. In the future, it should only be hit by
`repo._filecommit` when a `remotefilectx` is used (which is an LFS fast path).
That means the `nodemap` test won't connect to the remote server for missing nodes.

In the future, we could add some "hints" to get/getmeta API to let it not look
for the remote store.

Test Plan:
A real test will be added when we can actually hit that code path. But the new code is
short and looks fine.

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: durham, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5061254

Signature: t1:5061254:1494869146:88a0e1d04d292e2a64f29fdf52660f48b906665c
2017-05-16 15:56:15 -07:00
Durham Goode
12f99274cb treemanifest: add incremental server repack
Summary:
When the server performs a repack, it would read all the data from all the
revlogs. This was very slow and expensive. Let's add a --incremental option that
makes it only read the revlog entries that have a linkrev newer than the latest
linkrev data that's already in a pack file.

Test Plan: Adds a test

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4997260

Signature: t1:4997260:1493904216:c4f5b6c9652bbf8f66c1bc2b2547c898324d22cd
2017-05-16 15:28:13 -07:00
Jun Wu
8b7cc1bc20 remotefilelog: fix filelog.revdiff
Summary: `revdiff` should use raw revisions. 5d11b5ed in core hg is a similar fix.

Test Plan: Added a test.

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D5031916

Signature: t1:5031916:1494366439:80d496ed6f69a784d345510147eab6c479496f84
2017-05-09 14:47:58 -07:00
Durham Goode
0723212833 histpack: switch getnodeinfo to use common _findnode function
Summary:
It should've probably been using this function from the beginning, I just
forgot. Let's switch it to use _findnode so it takes advantage of the new
indexes.

This changes the return value of _findnode, but that's fine because the only
caller was getmissing() which didn't use the return value.

Test Plan: Ran the tests. The getnodeinfo unit test covers this code path.

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4991411

Signature: t1:4991411:1493810207:7f694d41089a6efab822ae002645ed1531f6b344
2017-05-03 10:19:46 -07:00
Durham Goode
980bf7ecdb histpack: switch _findnode to use node index
Summary:
Now that we have node indexes, let's switch _findnode (which is responsible for
random access node lookups) to use that index.

Test Plan: Ran the tests. The getmissing unit test covers this code path.

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4991400

Signature: t1:4991400:1493802781:6422be2522c3a423c80e41ce776057720b32ee98
2017-05-03 10:19:46 -07:00
Durham Goode
ef4d4cca3a histpack: move bisect to its own function
Summary:
A future patch will be doing a similar bisect over a different index, so let's
move the bisect logic to its own function.

Test Plan: Ran the tests

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4991393

Signature: t1:4991393:1493802679:1fb20c1e1ae6271f10953889f9d4d524d3e03a76
2017-05-03 10:19:46 -07:00
Durham Goode
bbebdda760 histpack: move entry reading to a function
Summary:
A future patch will be doing entry reading from another code path, so let's
refactor it out.

Test Plan: Ran the tests

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4991388

Signature: t1:4991388:1493802655:c20b3e394a785b31c8e94cfb0285575c6ceed70d
2017-05-03 10:19:45 -07:00
Durham Goode
b559010043 treemanifest: refactor _findnode
Summary:
Now that _getancestors is an iterator, we can get rid of the custom scan code in
_findnode.

Test Plan: Ran the tests

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4991386

Signature: t1:4991386:1493798953:e3f7faa828ef3148f8c66930d04c20c6fa83b83b
2017-05-03 10:19:45 -07:00
Durham Goode
16b8b169b7 treemanifest: return node index from _findsection
Summary:
Now that we have node indexes, let's return their location from _findsection.
Patch includes a slight rename to a constant that was poorly named as well.

Test Plan: Ran the tests

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4991385

Signature: t1:4991385:1493798906:c3147d00b6fc67b47191592032519afccf71e9af
2017-05-03 10:19:45 -07:00
Durham Goode
96ea7cd405 histpack: add per-node index
Summary:
Previously looking up a particular node in a histpack required a bisect to find
the file section, then a linear scan to find the particular node.  If you needed
to look up the latest 3000 nodes one by one, that involved 3000 linear scans,
many of which traversed the same nodes over and over.

This patch adds an additional index at the end of the current histidx file. In a
future patch, we will change getnodeinfo() to use this index instead of the
linear scan logic.
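
The lookup pattern described above can be sketched roughly as follows. This is only an illustration, assuming the per-node index for one file section is a sorted list of `(node, offset)` pairs; the helper names `build_node_index` and `find_node` are hypothetical and the real on-disk histidx layout is not shown here.

```python
import bisect


def build_node_index(entries):
    """Sort (node, offset) pairs for one file section (hypothetical layout)."""
    return sorted(entries)


def find_node(index, node):
    """Binary-search the sorted index instead of scanning the whole section."""
    # (node,) sorts before any (node, offset), so bisect_left lands on the
    # first entry whose node is >= the one we want.
    pos = bisect.bisect_left(index, (node,))
    if pos < len(index) and index[pos][0] == node:
        return index[pos][1]
    return None
```

With such an index, looking up many nodes one by one costs a binary search each, rather than repeated linear scans over the same entries.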

Test Plan:
Ran the tests. I haven't actually verified that the data in these
indexes is correct. My next patch will add logic that reads these indexes and
will add tests around it. I won't land this until I've confirmed it's correct.

Reviewers: quark, #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4983690

Signature: t1:4983690:1493798877:ae0802b4896b54bf066df9f684d94554855fd35a
2017-05-03 10:19:45 -07:00
Durham Goode
d1a927d335 packs: add entry count to pack index
Summary:
Previously, we used the length of the index file to determine the upper bounds
of the bisect. In a future patch we'll want to add more data to the end of the
index file, so we need to record how long the index portion of the index is.
This patch adds that information.

Test Plan: Ran the tests.

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4983682

Signature: t1:4983682:1493693255:57ab9af2030847fedff05b6755113ba8ce0c933b
2017-05-03 10:19:45 -07:00
Durham Goode
8674a90bb5 histpack: add version 1
Summary:
This patch just bumps the histpack version number to 1 and adds a config flag to
enable writing v1 pack files. The format hasn't actually changed in this patch,
I'm just doing the version bump so I can update all the hashes in the tests
without worrying about functionality changes.

In the next patch I will modify the index format, which won't affect the hashes.

Test Plan:
Ran the tests. I also ran the tests with some debug code to manually
force the sha to include 0 instead of 1 and verified that the hash didn't change
(which confirms that all of these hash changes are just because of that one byte
version change).

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4983675

Signature: t1:4983675:1493692444:5d88df4d46ce487f1b791417754ba000ecf10a1e
2017-05-03 10:19:45 -07:00
Jun Wu
491a1ce44c datapack: fix a freememory issue
Summary:
The issue was introduced by the getmeta refactoring. It didn't get caught by
tests because tests didn't trigger freememory.

Test Plan: The fix was verified working on fbsource

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4978835

Signature: t1:4978835:1493689136:296d425cf5d08b807b898c0e8cd881c9207c6359
2017-05-01 19:09:06 -07:00
Jun Wu
4a613bcaef datapack: update the format about metadata
Summary:
This diff makes 2 changes to v1 packfile metadata:

1. Move `key` in a metadata entry to before `size`.

```
old: [entry-size: 2 byte] [key: 1 byte] [data: var length]
new: [key: 1 byte] [data-size: 2 byte] [data: var length]
```

Previously, `entry-size == 0` did not make sense.

2. Use binary to represent sizes, instead of ASCII.

Related utility methods are cleaned up a bit so it's harder to make mistakes.
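
A minimal sketch of the new entry layout, for illustration only. It assumes big-endian (network order) sizes and uses hypothetical helper names `pack_meta` and `unpack_meta`; the actual remotefilelog utility methods may look different.

```python
import struct


def pack_meta(meta):
    """Serialize {key: data}, where each key is exactly one byte."""
    chunks = []
    for key in sorted(meta):
        data = meta[key]
        if len(key) != 1:
            raise ValueError("metadata key must be exactly 1 byte")
        # [key: 1 byte] [data-size: 2 byte, binary] [data: var length]
        chunks.append(key + struct.pack("!H", len(data)) + data)
    return b"".join(chunks)


def unpack_meta(buf):
    """Parse the layout written by pack_meta back into a dict."""
    meta = {}
    offset = 0
    while offset < len(buf):
        key = buf[offset:offset + 1]
        (size,) = struct.unpack_from("!H", buf, offset + 1)
        start = offset + 3
        meta[key] = buf[start:start + size]
        offset = start + size
    return meta
```

Because the size now counts only the data, a size of zero has a clear meaning: an entry with a key and empty data.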

Test Plan: Updated existing tests

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: durham, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4983189

Signature: t1:4983189:1493689852:22d544d73ed63fac83f849786de035af304161ce
2017-05-01 19:03:25 -07:00
Jun Wu
9a8545ebc1 cdatapack: implement getmeta for C module
Summary:
This diff implements getmeta in C and enables related tests.

Now all content stores support `getmeta`.

Test Plan: Run existing tests.

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4960926

Signature: t1:4960926:1493611048:55095c32927fac74e698f21d47173cb8a7523fb6
2017-05-01 13:29:19 -07:00
Durham Goode
7e191da991 treemanifest: don't copy 00manifesttree.d during streaming clones
The old logic that prevented this clone only covered 00manifesttree.i. Let's
also cover 00manifesttree.d.
2017-04-27 12:03:56 -07:00
Durham Goode
2b12f514c7 treemanifest: include tree pack during pushes
Summary:
To enable pushing between peers (and eventually pushing to the server), let's
teach bundle creation to include the trees being pushed.

Test Plan: Adds a test

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4957456

Signature: t1:4957456:1493266296:67f98a2b3d691644bde9098a713d05266f349cde
2017-04-27 10:44:34 -07:00
Durham Goode
afacc9433b treemanifest: unify client and server tree history access patterns
Summary:
Previously the server would access the tree data in an adhoc manner. Sometimes
it would talk straight to revlogs, sometimes it would create stores and talk to
data packs. This patch makes it access trees the same way clients do, through
repo.svfs.manifestdatastore and manifesthistorystore.

This also cleans up the client store creation just a little and adds a
unionmetadatastore for unified history access.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4957441

Signature: t1:4957441:1493263349:e76d177f7a9f45343e6f984d6c0ae2c7cacba035
2017-04-27 10:44:34 -07:00
Durham Goode
2f2f03d8d5 store: add history.getnodeinfo api
Summary:
Previously, history stores only had getancestors() apis which returned all the
ancestors. This was expensive if there were a lot of ancestors, like for the root
tree of treemanifests. Let's add an api for accessing a single history entry.
This will be useful in pack generation for only fetching the history related to
the trees we're sending at that time.

Test Plan: Added a test

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4957432

Signature: t1:4957432:1493263124:a155ac5a70c35f7e25a5cc48c9d9c2126d4c5858
2017-04-27 10:44:34 -07:00
Durham Goode
df97f609bf treemanifest: move history sorting responsibility into MutableHistoryPack
Summary:
Previously, the logic that added data to a mutable history pack was required to
add it in the correct order (all entries for a certain file at once, and in
newest-first order). This required the callers to jump through weird hoops if
the data came in out of order or at different times in the transaction.

This patch moves the ordering logic to be inside MutableHistoryPack, so callers
can add the data in any order they wish, and it will get sorted before being
serialized.
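
The idea can be sketched as below: buffer entries in any order, then emit them grouped by file with every node ahead of its parents. This is a simplified illustration, not the MutableHistoryPack code; the class name `BufferedHistoryPack` and its methods are hypothetical, and it relies on `graphlib` from the standard library (Python 3.9+).

```python
from collections import defaultdict
from graphlib import TopologicalSorter


class BufferedHistoryPack:
    """Hypothetical stand-in that sorts entries only when they are read out."""

    def __init__(self):
        # filename -> {node: (p1, p2)}
        self._entries = defaultdict(dict)

    def add(self, filename, node, p1, p2):
        # Callers may add entries in any order, at any time.
        self._entries[filename][node] = (p1, p2)

    def sorted_entries(self):
        """Yield (filename, node), all entries of a file together, newest first."""
        for filename in sorted(self._entries):
            nodes = self._entries[filename]
            sorter = TopologicalSorter()
            for node, parents in nodes.items():
                sorter.add(node)
                for parent in parents:
                    if parent in nodes:
                        # Making the child a predecessor of its parent means
                        # the child is emitted first.
                        sorter.add(parent, node)
            for node in sorter.static_order():
                yield filename, node
```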

This does add memory pressure to things that read a lot of history, like repack.
If this becomes a problem we may want to add a 'historypack.flush()' api that
lets us tell the history pack it's ok to flush its current contents to disk.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4956096

Signature: t1:4956096:1493264693:a2275a49e35565d4b11244e3e5dd82c25de7e16e
2017-04-27 10:44:34 -07:00
Durham Goode
7e3a739970 treemanifest: set allowincomplete to true for repacks
Summary:
It's possible for history packs to not have all of history, so let's allow that
during repacks.

Also fix one exception to use hex()

Test Plan: A future patch has tests that exposed this

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4956078

Signature: t1:4956078:1493264261:81c12520b7352d5040cdb027509c975678c10069
2017-04-27 10:44:34 -07:00
Durham Goode
2b154613e2 pack: switch readers to read from file handle
Summary:
A future patch will move the pack wireprotocol to use bundle2. In this new world
we'll be given a file handle instead of a remote peer, so let's switch the
utility methods to work on a file handle instead.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4924553

Signature: t1:4924553:1493050882:7a9ee8b282bf47ef393362dd0114d801dc2a68d5
2017-04-27 10:44:33 -07:00
Jun Wu
0d6151f530 remotefilelog: add lfs integration test
Summary:
The test covers common workflows like clone, commit, push, update, pull. It
exercises the remotefilelog plain store and Python datapack store to make sure
they won't lose the revlog flag. The test also tries to verify rename works
correctly.

Since the lfs extension may be eventually upstreamed, it seems a good idea to
make remotefilelog call `lfs.wrapfilelog` so lfs is free from remotefilelog
code.

Test Plan: Added a test

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4904281

Signature: t1:4904281:1492560308:5fd9f214ada6de795735ea7d737d30c1bf39812a
2017-04-26 19:55:02 -07:00
Jun Wu
2b3b8e46a5 remotefilelog: add filelog methods
Summary:
Filelog methods like `addrevision`, `revision(raw=True)` are needed for flag
processor (lfs) to work correctly. Add them in remotefilelog so lfs wrapper
code could replace them.

Test Plan:
Run existing tests. Stronger tests and lfs integration test will be added when
this area is more complete.

Reviewers: #mercurial, durham, rmcelroy

Reviewed By: durham

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4903959

Signature: t1:4903959:1493163809:5ebd88fac21d8225a12ce68bfc63a2867ee43769
2017-04-26 19:52:20 -07:00
Jun Wu
4240bd017e remotefilelog: let content stores support metadata
Summary:
This diff adds a `getmeta` method to all content stores. The cdatapack code is
modified to pass the tests, it needs further change to support `getmeta`.

The datapack format is bumped to v1 from v0. For v1, we append a `metadata`
dict at the end of each revision. The dict is currently used to store revlog
flags and rawsize of raw revlog fulltext. In the future we can put more data
like a second hash etc, without changing API or format again.

This diff focuses on correctness. A datapack caching layer to speed up
`getmeta` will be added later.

Tests are updated since we write new v1 packfile now and the format change
leads to different content and packfile names.

`Makefile`, `ls-l.py` are added to make tests easier to maintain.

Test Plan: Updated existing tests.

Reviewers: #mercurial, rmcelroy, durham

Reviewed By: durham

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4903917

Signature: t1:4903917:1493255844:7ef5d487096cd2f78f2aaae672a68d49f33632ee
2017-04-26 19:50:36 -07:00
Jun Wu
8fcd86af16 remotefilelog: move constants to class to prepare index format change
Summary:
To be able to bump version and change formats, the related constants need to
be moved to individual classes. So a class (ex. datapack) can be subclassed to
handle different formats.

Test Plan: `arc unit`

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4927284

Signature: t1:4927284:1493152641:e3274dd735d50baf193b7615dd314f4e6cf161f0
2017-04-26 13:34:15 -07:00
Jun Wu
0e2c18e2cd remotefilelog: add revlog flags information to the protocol
Summary:
Make the unpacked file format include the revlog flag information, and make
the getfile(s) protocol support it.

Note: The `getpackv1` protocol and packfile format is not changed yet.

Test Plan:
Run existing tests. Stronger tests and lfs integration test will be added when
this area is more complete.

Reviewers: #mercurial, rmcelroy, durham

Reviewed By: durham

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4903772

Signature: t1:4903772:1493152451:ab393b0208f0eee199ffc4c8fcfdfd5dd6d0f3ac
2017-04-26 13:08:13 -07:00
Durham Goode
9dc2f1efb7 treemanifest: improve progress during data repack
Summary:
Large repacks had some long pauses where there was no feedback to the user. This
adds more granular progress updates.

Test Plan: Ran repack in a large repo and saw more granular progress

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4915492

Signature: t1:4915492:1492641504:9db31d534fe201bec838e77ba470c9051d3be04f
2017-04-19 21:14:04 -07:00
Durham Goode
f17ea5b392 treemanifest: add config to not clone tree revlogs during remotefilelog clone
Summary:
Previously a streaming clone would pick up the 00manifesttree.i file. This patch
adds a config to prevent that file (and other meta/ manifests) from being
transferred. The config defaults to False since Google is currently depending on
the previous behavior.

Test Plan: Add a test

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4915481

Signature: t1:4915481:1492641427:6c8a91da4c8c6e00e23f96a93807d3c426b50532
2017-04-19 21:14:04 -07:00
Durham Goode
2ed79d7eb9 treemanifest: fix test flakiness
Summary:
The old test output had some flakiness where:

1. The linknode for y:1406e7411862 would flicker between commit 0 and commit 6,
even though y never existed in commit 0. This is caused by some ambiguity in how
the store represents the ancestor list (it's keyed on node instead of (name,
node)). To fix the test let's just modify file y so it doesn't have the same
hash as file x at any point.

2. The order of nodes in the histpack would flicker because there was no link
between the two separate histories of file y. This is caused by a lack of
sorting of the separate roots before writing to the histpack.

Test Plan: Ran the test many times

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4915472

Signature: t1:4915472:1492641328:d4542665e3a69fe2fc188b450f75ff10e6d681de
2017-04-19 21:14:04 -07:00
Durham Goode
68030f1dd7 treemanifest: add known argument to getancestor api
Summary:
During a repack we often want to access the ancestry for a bunch of nodes that
might be ancestors of each other. Using getancestors for that results in a lot
of duplicated work. For instance, getancestors(0) returns [0], getancestors(1)
returns [0, 1], getancestors(2) returns [0, 1, 2], etc., which is O(n^2) work.

This patch adds an optional `known` argument for getancestors that lets the
caller tell getancestors what ancestors it's already aware of. Then getancestors
can short circuit when it reaches that level. This avoids duplicate work during
repack.
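
A rough sketch of the short circuit, assuming history is given as a plain `{node: [parents]}` mapping. The real store API returns richer ancestor data; this simplified `getancestors` is only an illustration.

```python
def getancestors(parents, node, known=None):
    """Return node and its ancestors, stopping at anything already in known."""
    known = set() if known is None else known
    found = []
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current in known or current in seen:
            # The caller already has this node (and therefore its ancestors),
            # or we reached it through another path: do not walk past it.
            continue
        seen.add(current)
        found.append(current)
        stack.extend(parents.get(current, ()))
    return found
```

A caller that feeds each result back into `known` walks every node once overall, instead of re-walking the shared history on each call.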

Test Plan:
Ran treemanifest repack in our large repo and verified it made
progress over the nodes much faster than before

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4901308

Signature: t1:4901308:1492640896:27d4a90c2993cd1fefbd8dbc211f2ec181178bce
2017-04-19 21:14:04 -07:00
Durham Goode
dae50fc99e treemanifest: support repacking revlogs into packs
Summary:
Accessing treemanifests in revlogs is incredibly slow. This patch adds the
ability for `hg repack` to repack the revlog content into a data pack file, and
teaches the getserverpack code to look in the pack file first.

Test Plan: Need to add a test. Sent out anyway for review

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4901298

Signature: t1:4901298:1492640706:2850b9f0e9bbee77952f46af3b784aa81253e626
2017-04-19 21:14:04 -07:00
Durham Goode
3214c9a4df treemanifest: move revlogdatastore to contentstore.py
Summary: This is a generally useful store class. Let's move it to the store code.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4901293

Signature: t1:4901293:1492647203:0f76a6e78fd0035b61c4e45e18cd9fce59359768
2017-04-19 21:14:04 -07:00
Durham Goode
6a6c59d23d treemanifest: remove n^m behavior in ancestor logic
Summary:
The algorithm did a bfs over the commit graph, but it didn't check if it had
already processed a commit before. This meant every merge ended up traversing
both sides of the merge entirely (even if there were duplicates), and if there were
multiple merges this resulted in n^m behavior.
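
The shape of the fix can be illustrated with a breadth-first walk that remembers what it has already queued. This is a generic sketch over a `{node: [parents]}` mapping, not the treemanifest code itself; `walk_ancestors` is a hypothetical name.

```python
from collections import deque


def walk_ancestors(parents, start):
    """Breadth-first walk that visits each commit once, even across merges."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for parent in parents.get(node, ()):
            # Without this check, both sides of every merge would re-walk
            # their shared history.
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return order
```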

Test Plan:
Did a treemanifest repack in our big repo and verified it actually
made progress instead of getting stuck in cpu usage for hours

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4901287

Signature: t1:4901287:1492639247:b547e7f4a2051117aff41183ceb78aae44695b7a
2017-04-19 21:14:04 -07:00
Kostia Balytskyi
aeef0ad5cc compatibility: migrate from scmutil.vfs to mercurial.vfs.vfs
Differential Revision: https://phabricator.intern.facebook.com/D4908906
2017-04-18 14:42:33 -07:00
Durham Goode
902af15d1e remotefilelog: refactor pack wireprotocol to separate file
Summary:
The pack wireprotocol will be useful for exchanging treemanifests, so let's
refactor it out to its own file. There's a slight protocol change here, where
we terminate the response with 10 null bytes (2 for 0 length filename, 4 for 0
length data, 4 for 0 length history) instead of the original 2 null bytes (for
0 length filename) which didn't let us handle entries with '' as the name.
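
As a small illustration of the new terminator, using only the field widths quoted above. The network byte order and the helper name `is_terminator` are assumptions, and the order of the fields on the wire is not taken from the source.

```python
import struct

# Assumed widths, from the byte counts in the summary above:
# 2-byte filename length, 4-byte data count, 4-byte history count.
TERMINATOR = struct.pack("!HII", 0, 0, 0)


def is_terminator(filename, datacount, historycount):
    """An empty name alone no longer ends the stream; all three must be empty."""
    return filename == b"" and datacount == 0 and historycount == 0
```

An entry whose name is '' but which carries data or history is therefore distinguishable from the end of the response.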

This pack exchange code isn't even used in production, since most remotefilelog
downloads are done via the loose file (getfile/getfiles) format.

Test Plan:
Ran the tests. Even though this code isn't used in production, the
prefetch and repack tests still cover it.

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4860556

Signature: t1:4860556:1491853521:c3810a4a681606571354b270b957e8df0962c86a
2017-04-10 17:56:01 -07:00
Kyle Lippincott
1bb2683a91 shallowbundle: rename prev->prevnode, compare vs nullid not nullrev
As far as I can tell, 'nodechunk' is internal to remotefilelog (i.e. this
should not be called by mercurial directly) and every callsite has nodes
instead of revisions here.
2017-03-22 18:56:35 -07:00
Jun Wu
b2516a6de3 remotefilelog: try to set cachegroup even if the directory exists
Summary:
Sometimes the cache directory has wrong group set and our hg code fails with
permission errors. Try to solve that by detecting wrong groups or modes and
reset them.

Test Plan:
```
    In [2]: from remotefilelog import shallowutil
    In [5]: ui.setconfig('remotefilelog', 'cachegroup', 'kvm')
    In [6]: shallowutil.mkstickygroupdir(ui, '/tmp/d1/d2')
    # make sure /tmp/d1, /tmp/d1/d2 have group=kvm and the sticky bit set.
    # run `sudo chown -R quark:quark /tmp/d1`
    # run `sudo chmod g-s -R /tmp/d1`
    In [7]: shallowutil.mkstickygroupdir(ui, '/tmp/d1/d2')
    # make sure /tmp/d1/d2 is owned by "kvm" group and has the sticky bit
    # again, while /tmp/d1 remains unchanged.
```

Reviewers: #sourcecontrol, stash

Reviewed By: stash

Subscribers: stash, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4701627

Tasks: 16473317

Signature: t1:4701627:1489481138:64a6038ec5cc067cf05bad3b14ee9985e0bf6d96
2017-03-20 11:15:41 -07:00
Durham Goode
63f4fa69a0 datapack: remove delta reuse
Summary:
This code that reused deltas if the delta parent wasn't available was bugged
because it meant you could end up with a cycle in the delta chains. This was an
old optimization from before trees had history, so let's drop the optimization
(since trees now have history and can be correctly repacked).

Test Plan:
Ran repack on a packfile that previously caused cycles. Verified the
new version did not with `hg debugdatapack foo.datapack`

Reviewers: #mercurial

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4724520
2017-03-18 19:38:45 -07:00
Kostia Balytskyi
0c2d706810 remotefilelog: wrap lz4 imports for compatibility
Differential Revision: https://phabricator.intern.facebook.com/D4707115
2017-03-17 14:02:26 -07:00
Durham Goode
5903428735 remotefilelog: add metric logging for prefetch
Summary:
This adds ui.log() output for prefetch statistics. Extensions that hook into
ui.log() can now log this data to external metrics systems.

Test Plan:
Ran a hg prefetch with the config flags enabled, while ptailing the dev command
timer. Verified the result contained remotefilelogfetches*

```
CHGDISABLE=1 FB_HG_DIAGS=1 hg --config
extensions.remotefilelog=../fb-hgext/remotefilelog/ --config
sampling.key.remotefilelog.prefetch=perfpipe_dev_command_timers prefetch -r .~9
ptail -f perfpipe_dev_command_timers | grep durham
```

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4711096

Signature: t1:4711096:1489591144:1c91a4fbd118a3c10c2a2c68391c9f5b0dbcedf3
2017-03-15 20:57:32 -07:00
Augie Fackler
fde49a1a56 datapack: don't depend on demandimport when cstore isn't available
We've got a goofy test binary that doesn't use demandimport, and this
was tripping it up.
2017-03-14 12:44:43 -07:00
Jun Wu
02d79f722e codemod: replace repo.join with repo.vfs.join
Summary: Upstream has deprecated `repo.join`. Let's use `repo.vfs.join` instead.

Test Plan: `arc unit`

Reviewers: #sourcecontrol, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4704018

Signature: t1:4704018:1489462391:532b15c24faa33d584f25bb9382c1ff4b2c0c483
2017-03-13 20:51:37 -07:00
Durham Goode
70ce116529 treemanifest: add history data to tree repacks
Summary:
Previously, tree repacks did not take into account tree history. It would just
look at the delta base and if the base existed, it would just reuse the delta.
This would A) result in very long chains, and B) result in chains where the full
text was the oldest version, instead of the newest (recent full texts means
faster access to recent versions).

This patch threads tree history into the repacker, which already knows how to
use history for repacks.

Test Plan:
Updated the tests, and inspected the new test results to ensure tree
entries that were not deltas before the repack became reverse deltas during the
repack.

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4647359

Signature: t1:4647359:1488882710:dba72cf488766ce827b7641735164fa0efc9a303
2017-03-07 11:15:26 -08:00
Durham Goode
83dc9949d6 histpack: support history pack entries with '' as the filename
Summary:
To support treemanifests in history packs we need to support the empty filename
(i.e. the root of the repo). This removes some checks that prevented that from
working.

Test Plan:
A future patch will add history pack support for treemanifests,
including tests that cover this.

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4637917

Signature: t1:4637917:1488429178:c25d03b73eb379d4126ebbcee4bb5797f7b841b2
2017-03-07 11:15:25 -08:00
Durham Goode
940796814d treemanifest: add option for using native store
Summary:
Adds a treemanifest.usecunionstore config flag for enabling and disabling use of
the native code uniondatapackstore.

Since we haven't implemented the repack APIs on the native datapack stores, we
currently have to force repack to use the old python implementations. Instead of
trying to expose just the appropriate APIs through the python interface, I think
we'll rewrite all of repack to be in C++ at a future time, since we can take
advantage of parallelism, etc.

Test Plan:
Updated test-treemanifest.t to use the c datapackstore. Also run all
the tests with --extra-config-opt=treemanifest.usecunionstore=True.

These tests caught a missing null check in the C++ code as well.

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4609795

Signature: t1:4609795:1488365341:203362db5f470b613c4d6484686cd32c3fa8458f
2017-03-01 16:55:19 -08:00
Kostia Balytskyi
8458ee33af remotefilelog: make _getfiles step size configurable
Summary:
This makes the `_getfiles` batch size configurable. Let me know if some other config name will serve this purpose better.

**Reason for this fix**
Currently, `_getfiles` writes 10000 lines of text into a pipe and only after that write succeeds does it read file blobs from another pipe. The serving process starts writing file blobs into the second pipe as soon as it sees something in the first pipe. The second pipe's buffer fills up because the client does not read from it until it has written all 10K file requests. Meanwhile, the 10K file requests fill the buffer of the first pipe, and we have a deadlock.
Ideally, we should make the client check whether it can write to the first pipe and, if not, go and read from the second pipe, but that is a bigger fix.
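
The effect of a smaller step size can be sketched as follows. This is a generic illustration, not the `_getfiles` implementation: `fetch_in_batches`, `send`, and `receive` are hypothetical names standing in for writing a request to the first pipe and reading a blob from the second.

```python
def fetch_in_batches(keys, send, receive, step):
    """Send at most `step` requests before draining their responses."""
    if step <= 0:
        raise ValueError("step must be positive")
    results = []
    for start in range(0, len(keys), step):
        batch = keys[start:start + step]
        for key in batch:
            send(key)
        # Read this batch's responses before sending more, so at most
        # `step` responses are ever waiting in the second pipe.
        for _ in batch:
            results.append(receive())
    return results
```

Choosing `step` small enough that a full batch of requests and responses fits in the pipe buffers avoids the situation described above, at the cost of more round trips.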

Test Plan:
- run local tests, see them all passing
- except for `test-cstore.t`, but it fails for me without my changes as well
- this generally makes sense

Reviewers: #sourcecontrol, durham

Reviewed By: durham

Subscribers: durham, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4620152

Signature: t1:4620152:1488221258:04555177926d129c6ba41bc982ad4e913cb31b20
2017-02-27 14:45:51 -08:00