Summary: Removed 'missing' list variable as it is not needed
Test Plan: No. All tests pass
Reviewers: simonfar, durham, stash
Reviewed By: stash
Subscribers: stash, medson, mjpieters, #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D5461432
Signature: t1:5461432:1500559459:9bca4b429807082dd9a2a2a0283ab7b23df82b70
Summary:
If remotefilelog server cache files are truncated, when the server
returns these files to the client, the client will crash trying to
decompress them. Detect truncated files and treat them as cache
misses.
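A minimal sketch of the detection, assuming a hypothetical cache entry layout of a 4-byte length header followed by a zlib-compressed payload. The real remotefilelog cache format may differ, and `encode_cache_entry` and `read_cache_entry` are illustrative names only:

```python
import struct
import zlib

# Assumed layout (illustration only): 4-byte big-endian payload length,
# followed by the zlib-compressed payload.
HEADER = struct.Struct(">I")


def encode_cache_entry(text):
    """Build a cache entry in the assumed layout."""
    payload = zlib.compress(text)
    return HEADER.pack(len(payload)) + payload


def read_cache_entry(raw):
    """Return the decompressed text, or None (a cache miss) if truncated."""
    if len(raw) < HEADER.size:
        return None  # the header itself is cut off
    (expected,) = HEADER.unpack_from(raw)
    payload = raw[HEADER.size:]
    if len(payload) != expected:
        return None  # payload length does not match the header
    try:
        return zlib.decompress(payload)
    except zlib.error:
        return None  # corrupt data is also treated as a miss
```

Returning None instead of raising lets the caller fall through to its normal cache-miss path.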
Test Plan: Add unit test for truncated server cache files.
Reviewers: #fbhgext, simonfar
Reviewed By: #fbhgext, simonfar
Differential Revision: https://phab.mercurial-scm.org/D170
Summary:
If bgprefetchrevs is specified in config, background prefetch will be executed on
update and commit commands as well as on other commands which use update or
commit. Background prefetch will fetch revisions specified in config and if the
repack flag is set will also execute a repack.
Test Plan: Added test cases
Reviewers: simonfar, durham
Reviewed By: durham
Subscribers: medson, mjpieters, #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D5454939
Tasks: 19727407
Signature: t1:5454939:1500551182:e9cce84f6dc98182b5cb30faeb811fd7fa5e22b0
Summary:
The history pack writer had a bug where if the same node was added to the
mutablehistorypack N times, it would write out that it had N entries, but then
it would only write a single entry. This caused corruption (the length value
didn't match the actual number of entries) that broke repack.
This primarily affected users who used the old version of treemanifest (where
trees were converted on the client side). The new version of treemanifest only
seems to repro this in rare cases, like when rebasing multiple commits that
create the same trees.
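A rough sketch of the invariant being restored, using a simplified, hypothetical section layout (name length, name, entry count, then fixed-size entries). The actual histpack format has more fields. The point is that duplicates are collapsed before the count is written:

```python
import struct


def serialize_section(filename, entries):
    """Serialize one file section: name, entry count, then the entries.

    `entries` is a list of (node, p1, p2, linknode) tuples of 20-byte
    strings. Duplicates of the same node are collapsed *before* the count
    is written, so the count always matches what follows.
    """
    unique = {}
    for node, p1, p2, linknode in entries:
        unique[node] = (node, p1, p2, linknode)  # last write wins

    out = [
        struct.pack(">H", len(filename)),
        filename,
        struct.pack(">I", len(unique)),  # count of entries actually written
    ]
    for entry in unique.values():
        out.append(b"".join(entry))
    return b"".join(out)
```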
Test Plan: Added a test. It failed before and passes after.
Reviewers: #fbhgext, mitrandir
Reviewed By: #fbhgext, mitrandir
Subscribers: akushner
Differential Revision: https://phab.mercurial-scm.org/D128
Summary:
Previously, prefetch on pull ran in the foreground. Now it can run in the
background if specified. Optionally, background prefetch can be followed by a
background repack.
Test Plan: Added a test case
Reviewers: simonfar, durham
Reviewed By: simonfar
Subscribers: medson, mjpieters, #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D5406626
Tasks: 19727343
Signature: t1:5406626:1500059334:207b4100cca536cbe33f6c6dfad596d03a6fa14f
Summary:
We've seen certain machines end up with just datapack files and no dataidx
files. While we can't repro this, the only time this could possibly happen is
between the rename of the two temp files. So let's add error handling around
that logic.
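One way the error handling could look, as a sketch only. `publish_pack` and its arguments are hypothetical names, not the function in the patch:

```python
import os


def publish_pack(tmp_pack, tmp_idx, final_pack, final_idx):
    """Move both temp files into place; leave nothing behind on failure."""
    try:
        os.rename(tmp_pack, final_pack)
        os.rename(tmp_idx, final_idx)
    except Exception:
        # Remove whichever files exist so no pack is left without its index.
        for path in (tmp_pack, tmp_idx, final_pack, final_idx):
            try:
                os.unlink(path)
            except OSError:
                pass  # that file was never created or was already moved
        raise
```

Re-raising after cleanup keeps the failure visible to the caller while guaranteeing a datapack never survives without its dataidx.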
Test Plan: Ran the tests. Manually inserted an exception between the renames, ran the tests with --keep-tmpdir, and verified there were no packs left over when the exception fired.
Reviewers: #fbhgext, quark
Reviewed By: #fbhgext, quark
Differential Revision: https://phab.mercurial-scm.org/D61
Summary:
Previously the code required that sizes be of type int. Since Python plays loose
with integer types, we also need to support long.
Test Plan:
The existing test-remotefilelog-repack-fast.t test was completely
broken. It only enabled fast datapacks for the server repo, not the clients.
Enabling it for the clients as well catches this issue and verifies the fix.
Reviewers: #fbhgext, quark
Reviewed By: #fbhgext, quark
Differential Revision: https://phab.mercurial-scm.org/D54
Summary:
Eventually we will want to only send trees instead of flat manifests. Let's add
a config to disable sending them. For now it will still loop over the flat
manifests, since that's needed to discover which files to send, but it won't
actually send the flat manifests.
This makes it easier to test the pushrebase pack handling code in a future
patch.
Test Plan:
A future diff adds testing for pushing only tree packs and ensuring
they are received correctly.
Reviewers: #mercurial, mjpieters
Reviewed By: mjpieters
Subscribers: mitrandir, medson, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5330005
Signature: t1:5330005:1499713117:7e52dd79ab6759717dbac8431d01ce2e55d1f683
Summary: Number of files fetched from server is now logged to scuba.
Test Plan:
* rm -fr $(hg showconfig remotefilelog.cachepath)/* to clean cache
* hg show #rev
* check Scuba table 'Hg Remotefilelog' to ensure that # of files fetched from server is logged
Reviewers: durham, simonfar
Reviewed By: simonfar
Subscribers: medson, mjpieters, #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D5375753
Tasks: 19727278
Signature: t1:5375753:1499347945:97ecef74418ad2fcbabfcd850a37622b23551815
Summary:
Previously we tried to keep incremental repacks small by only allowing 2 packs
to be repacked at once. This causes problems with treemanifest since hg pull
could create a pack file, which then gets repacked with a single other pack
file. This meant the total number of packs did not decrease. Let's increase the
number of files we pack at once to 3 so we can guarantee that a repack after
adding a pack still decreases the total number.
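The arithmetic can be checked with a toy model, assuming one incremental repack folds up to `repack_limit` packs into a single new pack (the function name and model are illustrative):

```python
def packs_after_repack(existing, repack_limit):
    """Pack count after one new pack arrives and one incremental repack runs."""
    total = existing + 1                # e.g. hg pull adds a pack
    merged = min(repack_limit, total)   # packs folded into a single new pack
    return total - merged + 1


# With a limit of 2 the count is unchanged; with 3 it shrinks by one.
print(packs_after_repack(5, 2), packs_after_repack(5, 3))
```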
Test Plan:
Ran a local tree repack that was previously only causing two files to
be repacked. Now it repacked three files.
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: medson, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5287238
Signature: t1:5287238:1497995177:54e229b564137ddc35ea28fd3d649a1085de0c72
Summary:
Previously remotefilelog would open a connection and leave the getfiles command
running on that connection, so it didn't have to reopen the ssh connection each
time. We want to reuse this ssh connection for treemanifest and fastannotate, so
let's switch it to a pool model where the connection is kept open but the
getfiles command is not left open.
If an exception happens while the connection is out of the pool, it is discarded
instead of being added back to the pool.
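A toy version of the pool model, for illustration. `ConnectionPool` here is a sketch, and `opener` is a hypothetical callable that returns an object with a `close()` method; the real connectionpool API may differ:

```python
import contextlib


class ConnectionPool(object):
    """Toy pool: `opener` is a hypothetical callable returning a connection."""

    def __init__(self, opener):
        self._opener = opener
        self._idle = []

    @contextlib.contextmanager
    def get(self):
        # Reuse an idle connection if there is one, otherwise open a new one.
        conn = self._idle.pop() if self._idle else self._opener()
        try:
            yield conn
        except Exception:
            # The connection may be in an unknown state: discard it.
            conn.close()
            raise
        else:
            # Clean exit: the connection goes back into the pool.
            self._idle.append(conn)
```

Callers would use `with pool.get() as conn:` and run one command at a time on the borrowed connection.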
Test Plan:
Ran the tests. The fastannotate tests changed to reflect the new way
the connectionpool allows use.
Reviewers: quark, #mercurial, mitrandir
Reviewed By: mitrandir
Subscribers: mitrandir, medson, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5280323
Signature: t1:5280323:1497975420:e3ae1ee854a1afc90816502543a19ff36f59b497
Summary:
If the repo being repacked has commits being pushed, it's possible that the
repack will encounter file revisions whose linkrevs don't exist in the in-memory
changelog. Let's set an upper bound on what linkrevs to repack so we only
process revisions whose commits we can see.
Test Plan: Added test
Reviewers: #mercurial, mjpieters
Reviewed By: mjpieters
Subscribers: medson, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5279171
Signature: t1:5279171:1497911997:a48c56abebd14a1c066c9fc1ee4098f813d062df
Summary:
When committing a node which already exists, the linkrev provided to filelog
is actually wrong (len(changelog)). The upstream Mercurial will not update
existing file revisions but remotefilelog will.
The wrong linkrev triggers the ProgrammingError added by
D5061330. This diff detects that case and avoids both writing bad linkrevs and
raising ProgrammingError.
Test Plan: Added a test.
Reviewers: #mercurial, durham
Reviewed By: durham
Subscribers: medson, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5278060
Signature: t1:5278060:1497910883:b8c3b6281ad9c0a516b942ba8ca41c51801b2d6b
Creating a remotefilelog blob does a walk of the ancestors, but the walk did not
properly check whether a node was already queued (instead it checked whether the
node had already been processed, which might not have happened yet despite it
being queued). This resulted in extremely long wait times for files with lots of
merges in history because the walk became exponential.
This fixes that.
Summary:
The remotefilectx ancestor walk did not check if it had already walked a node,
and therefore very mergey history would cause this to be exponential runtime.
Let's add a visited check.
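A minimal sketch of such a walk, assuming a hypothetical `parents` mapping from node to a tuple of parent nodes. Nodes are marked as seen when they are queued, so each one is visited once regardless of how many merge paths reach it:

```python
import collections


def walk_ancestors(start, parents):
    """Yield `start` and each of its ancestors exactly once.

    `parents` is a hypothetical mapping from node to a tuple of parent nodes.
    """
    seen = {start}
    queue = collections.deque([start])
    while queue:
        node = queue.popleft()
        yield node
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)  # mark at queue time, not at processing time
                queue.append(parent)
```

Without the `seen` check, a history of N stacked merges would be walked along every path, which grows as 2^N.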
Test Plan: hg log on a file with very mergey history
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: quark, mjpieters, medson
Differential Revision: https://phabricator.intern.facebook.com/D5191981
Tasks: 19121837
Signature: t1:5191981:1496865201:66b58b6831e9ae1707f834965456d95ec339cc86
Summary: This will be safer if the upstream API ever changes again.
Test Plan: arc unit
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5147881
Signature: t1:5147881:1496326141:c70559ffe2bb43953a1a0584163ca24c3235c5f3
Summary:
It has bitten us a few times already. Failing `_adjustlinknode()` only because
we couldn't download a file from the server seems incorrect. There may be a
transient network issue, for example.
Let's wrap this code in try/except and warn in the case of failure.
Test Plan: arc unit
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: mjpieters, #sourcecontrol
Differential Revision: https://phabricator.intern.facebook.com/D5147669
Signature: t1:5147669:1496225752:3cd6d9758e447b7f91b246a9ed1a20e6f73ed176
Summary:
This adds the `skiprevs` parameter to all `filectx.annotate` methods
to match the upstream change.
For fastannotate, things are a bit more complex since it's a mix of two
algorithms. For now we just fall back to the slow path for correctness.
I'll think about adding back a fast path later.
Test Plan: arc unit
Reviewers: sid0, #mercurial, quark
Subscribers: stash, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5128392
Summary: Upstream has deprecated cmdutil.commands() in favor of registrar.commands()
Test Plan: Ran the tests
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5106486
Signature: t1:5106486:1495485074:0e20f00622cc651e8c9dda837f84dd84cc51099e
Summary:
The `cmp` logic does not look any different from the original `basefilectx.cmp`,
so it can be removed. This means we will benefit from LFS's override of
`basefilectx.cmp` and have a fast path for LFS binary diff.
Test Plan: Added a test. Make sure the old code fails that test.
Reviewers: #mercurial, durham
Reviewed By: durham
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5100529
Signature: t1:5100529:1495468055:ff7fa30f3277b942bf49ba1e9b8764effacd972a
Summary:
This adds a simple progress bar while the client receives a pack. We don't know
ahead of time how large it will be, so we can only provide the bouncy bar kind.
But we can show an increasing number of things downloaded, so they know
something is actually happening.
Test Plan:
Ran hg pull with the extension enabled, verified there was a progress
bar during the tree prefetch.
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5098564
Signature: t1:5098564:1495235358:0020b28da890813f11dba70b157e526e7be418f8
Summary:
Previously our data packs would have incredibly long delta chains, where to read
the last entry in the file you had to read every previous delta all the way to
the beginning. This was very expensive for chains of 2+ million deltas. Let's
add a config option to limit how long the deltas get, and for now we'll default
it to 1000.
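The idea can be sketched as follows. `plan_chain` is an illustrative helper, not the repack code, and it assumes revisions are processed in a simple linear order:

```python
def plan_chain(revisions, max_chain_len=1000):
    """Decide, per revision, whether to store a delta or a full text.

    `revisions` is an ordered list of revision ids. Returns a list of
    (rev, deltabase) pairs where deltabase is None for a full text.
    """
    plan = []
    chain_len = 0
    prev = None
    for rev in revisions:
        if prev is None or chain_len >= max_chain_len:
            plan.append((rev, None))  # start a new chain with a full text
            chain_len = 0
        else:
            plan.append((rev, prev))  # delta against the previous revision
            chain_len += 1
        prev = rev
    return plan
```

With the limit in place, reading any entry requires applying at most `max_chain_len` deltas on top of a full text.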
Test Plan: Added a test
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5095603
Signature: t1:5095603:1495213707:737d63129cf459ad6927a1f4deb0dda5a8ce0a7f
Summary:
remotefilelog needs to wait for changelog creation to get the commit hash so
all filelog appending operations are pending if linkrev is an integer.
That is better done in the `addrawrevision` layer since it will catch more code
paths. Besides, that means remotefilelog won't need to hash the same revision
twice.
Test Plan: arc unit
Reviewers: #mercurial, durham
Reviewed By: durham
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5061330
Signature: t1:5061330:1494871717:a7224ebdc0f221fbaabbd2a58de975caec0e4b05
Summary: This is required (but not complete) to allow LFS fast path.
Test Plan: arc unit
Reviewers: #mercurial, durham
Reviewed By: durham
Subscribers: durham
Differential Revision: https://phabricator.intern.facebook.com/D5061283
Signature: t1:5061283:1494959731:2d7ed7465c1724cd1f231a206f949da16f90649c
Summary: This makes the next patch cleaner to read.
Test Plan: arc unit
Reviewers: #mercurial, durham
Reviewed By: durham
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5061271
Signature: t1:5061271:1494869273:967b647281847fa39e88558805bcf3b9a9e2b57b
Summary:
Core HG code uses revlog.nodemap to test node existence. A future LFS code
path will hit this, so let's add a nodemap to remotefilelog.
Currently, the code path won't be hit. In the future, it should only be hit by
`repo._filecommit` when a `remotefilectx` is used (which is an LFS fast path).
That means the `nodemap` test won't connect to the remote server for missing nodes.
In the future, we could add some "hints" to the get/getmeta API to let it not look
in the remote store.
Test Plan:
A real test will be added once we can hit that code path. But the new code is
short and looks fine.
Reviewers: #mercurial, durham
Reviewed By: durham
Subscribers: durham, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5061254
Signature: t1:5061254:1494869146:88a0e1d04d292e2a64f29fdf52660f48b906665c
Summary:
When the server performs a repack, it would read all the data from all the
revlogs. This was very slow and expensive. Let's add a --incremental option that
makes it only read the revlog entries that have a linkrev newer than the latest
linkrev data that's already in a pack file.
Test Plan: Adds a test
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4997260
Signature: t1:4997260:1493904216:c4f5b6c9652bbf8f66c1bc2b2547c898324d22cd
Summary: `revdiff` should use raw revisions. 5d11b5ed in core hg is a similar fix.
Test Plan: Added a test.
Reviewers: #mercurial, durham
Reviewed By: durham
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D5031916
Signature: t1:5031916:1494366439:80d496ed6f69a784d345510147eab6c479496f84
Summary:
It should've probably been using this function from the beginning, I just
forgot. Let's switch it to use _findnode so it takes advantage of the new
indexes.
This changes the return value of _findnode, but that's fine because the only
caller was getmissing() which didn't use the return value.
Test Plan: Ran the tests. The getnodeinfo unit test covers this code path.
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4991411
Signature: t1:4991411:1493810207:7f694d41089a6efab822ae002645ed1531f6b344
Summary:
Now that we have node indexes, let's switch _findnode (which is responsible for
random access node lookups) to use that index.
Test Plan: Ran the tests. The getmissing unit test covers this code path.
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4991400
Signature: t1:4991400:1493802781:6422be2522c3a423c80e41ce776057720b32ee98
Summary:
A future patch will be doing a similar bisect over a different index, so let's
move the bisect logic to its own function.
Test Plan: Ran the tests
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4991393
Signature: t1:4991393:1493802679:1fb20c1e1ae6271f10953889f9d4d524d3e03a76
Summary:
A future patch will be doing entry reading from another code path, so let's
refactor it out.
Test Plan: Ran the tests
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4991388
Signature: t1:4991388:1493802655:c20b3e394a785b31c8e94cfb0285575c6ceed70d
Summary:
Now that _getancestors is an iterator, we can get rid of the custom scan code in
_findnode.
Test Plan: Ran the tests
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4991386
Signature: t1:4991386:1493798953:e3f7faa828ef3148f8c66930d04c20c6fa83b83b
Summary:
Now that we have node indexes, let's return their location from _findsection.
The patch also renames a poorly named constant.
Test Plan: Ran the tests
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4991385
Signature: t1:4991385:1493798906:c3147d00b6fc67b47191592032519afccf71e9af
Summary:
Previously looking up a particular node in a histpack required a bisect to find
the file section, then a linear scan to find the particular node. If you needed
to look up the latest 3000 nodes one by one, that involved 3000 linear scans,
many of which traversed the same nodes over and over.
This patch adds an additional index at the end of the current histidx file. In a
future patch, we will change getnodeinfo() to use this index instead of the
linear scan logic.
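To illustrate the difference, here is a sketch that contrasts a linear scan with a bisect over a sorted node index. The in-memory list of (node, offset) pairs is a stand-in for the on-disk index, whose actual layout is not shown here:

```python
import bisect


def find_by_scan(entries, node):
    """Linear scan over (node, offset) pairs in section order."""
    for candidate, offset in entries:
        if candidate == node:
            return offset
    return None


def build_node_index(entries):
    """Sorted (node, offset) pairs, standing in for the on-disk node index."""
    return sorted(entries)


def find_by_index(index, node):
    """Bisect the sorted index instead of scanning the whole section."""
    pos = bisect.bisect_left(index, (node,))
    if pos < len(index) and index[pos][0] == node:
        return index[pos][1]
    return None
```

Each lookup through the index costs O(log n) comparisons, so 3000 lookups no longer repeat 3000 scans over the same entries.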
Test Plan:
Ran the tests. I haven't actually verified that the data in these
indexes is correct. My next patch will add logic that reads these indexes and
will add tests around it. I won't land this until I've confirmed it's correct.
Reviewers: quark, #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4983690
Signature: t1:4983690:1493798877:ae0802b4896b54bf066df9f684d94554855fd35a
Summary:
Previously, we used the length of the index file to determine the upper bounds
of the bisect. In a future patch we'll want to add more data to the end of the
index file, so we need to record how long the index portion of the index is.
This patch adds that information.
Test Plan: Ran the tests.
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4983682
Signature: t1:4983682:1493693255:57ab9af2030847fedff05b6755113ba8ce0c933b
Summary:
This patch just bumps the histpack version number to 1 and adds a config flag to
enable writing v1 pack files. The format hasn't actually changed in this patch;
I'm just doing the version bump so I can update all the hashes in the tests
without worrying about functionality changes.
In the next patch I will modify the index format, which won't affect the hashes.
Test Plan:
Ran the tests. I also ran the tests with some debug code to manually
force the sha to include 0 instead of 1 and verified that the hash didn't change
(which confirms that all of these hash changes are just because of that one byte
version change).
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4983675
Signature: t1:4983675:1493692444:5d88df4d46ce487f1b791417754ba000ecf10a1e
Summary:
The issue was introduced by the getmeta refactoring. It didn't get caught by
tests because tests didn't trigger freememory.
Test Plan: The fix was verified working on fbsource
Reviewers: #mercurial, durham
Reviewed By: durham
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4978835
Signature: t1:4978835:1493689136:296d425cf5d08b807b898c0e8cd881c9207c6359
Summary:
This diff makes 2 changes to v1 packfile metadata:
1. Move `key` in a metadata entry to before `size`.
```
old: [entry-size: 2 byte] [key: 1 byte] [data: var length]
new: [key: 1 byte] [data-size: 2 byte] [data: var length]
```
Previously `entry-size == 0` did not make sense.
2. Use binary to represent sizes, instead of ASCII.
Related utility methods are cleaned up a bit so it's harder to make mistakes.
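The new entry layout could be read and written roughly like this. The sketch assumes network byte order for the 2-byte size, which the summary does not state, and `build_metadata` / `parse_metadata` are illustrative names rather than the utility methods in the patch:

```python
import struct

# [key: 1 byte] [data-size: 2 byte] [data: var length]
# Assumption: the size is an unsigned 16-bit integer in network byte order.
ENTRY_HEADER = struct.Struct("!cH")


def build_metadata(meta):
    """Serialize {key: bytes} as repeated [key][data-size][data] entries."""
    out = []
    for key in sorted(meta):
        value = meta[key]
        out.append(ENTRY_HEADER.pack(key, len(value)))
        out.append(value)
    return b"".join(out)


def parse_metadata(raw):
    """Inverse of build_metadata; raises ValueError on truncated input."""
    meta = {}
    offset = 0
    while offset < len(raw):
        if len(raw) - offset < ENTRY_HEADER.size:
            raise ValueError("truncated entry header")
        key, size = ENTRY_HEADER.unpack_from(raw, offset)
        offset += ENTRY_HEADER.size
        if len(raw) - offset < size:
            raise ValueError("truncated entry data")
        meta[key] = raw[offset:offset + size]
        offset += size
    return meta
```

Because the size now counts only the data, a size of 0 is a valid entry with an empty value.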
Test Plan: Updated existing tests
Reviewers: #mercurial, durham
Reviewed By: durham
Subscribers: durham, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4983189
Signature: t1:4983189:1493689852:22d544d73ed63fac83f849786de035af304161ce
Summary:
This diff implements getmeta in C and enables related tests.
Now all content stores support `getmeta`.
Test Plan: Run existing tests.
Reviewers: #mercurial, durham
Reviewed By: durham
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4960926
Signature: t1:4960926:1493611048:55095c32927fac74e698f21d47173cb8a7523fb6
Summary:
To enable pushing between peers (and eventually pushing to the server), let's
teach bundle creation to include the trees being pushed.
Test Plan: Adds a test
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: quark, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4957456
Signature: t1:4957456:1493266296:67f98a2b3d691644bde9098a713d05266f349cde
Summary:
Previously the server would access the tree data in an ad hoc manner. Sometimes
it would talk straight to revlogs, sometimes it would create stores and talk to
data packs. This patch makes it access trees the same way clients do, through
repo.svfs.manifestdatastore and manifesthistorystore.
This also cleans up the client store creation just a little and adds a
unionmetadatastore for unified history access.
Test Plan: Ran the tests
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4957441
Signature: t1:4957441:1493263349:e76d177f7a9f45343e6f984d6c0ae2c7cacba035
Summary:
Previously, history stores only had getancestors() apis which returned all the
ancestors. This was expensive if there were a lot of ancestors, like for the root
tree of treemanifests. Let's add an api for accessing a single history entry.
This will be useful in pack generation for only fetching the history related to
the trees we're sending at that time.
Test Plan: Added a test
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: quark, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4957432
Signature: t1:4957432:1493263124:a155ac5a70c35f7e25a5cc48c9d9c2126d4c5858
Summary:
Previously, the logic that added data to a mutable history pack was required to
add it in the correct order (all entries for a certain file at once, and in
newest-first order). This required the callers to jump through weird hoops if
the data came in out of order or at different times in the transaction.
This patch moves the ordering logic to be inside MutableHistoryPack, so callers
can add the data in any order they wish, and it will get sorted before being
serialized.
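A sketch of the ordering step under stated assumptions: entries are buffered as (filename, node, p1, p2) tuples, and "newest-first" is taken to mean that every node is written before its parents. `order_entries` is an illustrative helper, not the MutableHistoryPack code:

```python
def order_entries(entries):
    """Group history entries by file and emit each file newest-first.

    `entries` is a list of (filename, node, p1, p2) tuples in any order.
    Within a file, every node is emitted before its parents.
    """
    byfile = {}
    for filename, node, p1, p2 in entries:
        byfile.setdefault(filename, {})[node] = (p1, p2)

    ordered = []
    for filename in sorted(byfile):
        parents = byfile[filename]
        done = set()
        oldest_first = []
        for root in sorted(parents):
            # Iterative depth-first walk; a node is emitted after its parents.
            stack = [(root, False)]
            while stack:
                node, expanded = stack.pop()
                if expanded:
                    oldest_first.append(node)
                    continue
                if node in done:
                    continue
                done.add(node)
                stack.append((node, True))
                for p in parents[node]:
                    # Parents outside this pack (e.g. the null id) are skipped.
                    if p in parents and p not in done:
                        stack.append((p, False))
        for node in reversed(oldest_first):
            ordered.append((filename, node) + parents[node])
    return ordered
```

Since everything is buffered until serialization, memory use grows with the number of entries added.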
This does add memory pressure to things that read a lot of history, like repack.
If this becomes a problem we may want to add a 'historypack.flush()' api that
lets us tell the history pack it's ok to flush its current contents to disk.
Test Plan: Ran the tests
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: quark, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4956096
Signature: t1:4956096:1493264693:a2275a49e35565d4b11244e3e5dd82c25de7e16e