Commit Graph

593 Commits

Author SHA1 Message Date
Ryan McElroy
5e1c5d97c9 remotefilelog: make variable names more self-explanatory in adjustlinknode
Test Plan: unit tests, check-code

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4528598

Signature: t1:4528598:1486576997:11071e8ae7b90a6e28a29d790fa5d19be78d2546
2017-02-09 03:35:58 -08:00
Ryan McElroy
36860751ce remotefilelog: allow linkrev to be valid for any passed rev
Summary:
Sometimes we pass in two revs because we're creating a merge commit.
Let's handle that case better.

Test Plan: tests still pass

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4528594

Signature: t1:4528594:1486576897:6bd80017dfe2933e4c1a9685ba35bb1af0440eb0
2017-02-09 03:35:58 -08:00
Ryan McElroy
8d8edfb6e0 remotefilelog: force prefetch when it will speed up linkrev calculation
Summary:
See comment in code for details. This addresses a performance
problem that arises when the same remotefilelog repo makes two changes
to a file far apart in history with no other repo making changes to the
same file in between.

Test Plan: new test verifies that linkrevs are fixed up using remote blob

Reviewers: #mercurial, durham

Reviewed By: durham

Subscribers: stash, quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4475183

Tasks: 12958591

Signature: t1:4475183:1486576777:81f4f9815cb677c94954c23ecd8aa7edb006ffd9
2017-02-09 03:35:58 -08:00
Ryan McElroy
493f4333e4 remotefilelog: introduce helper function to verify linkrevs
Summary:
We will be reusing this logic in the next patch; for now we're simply
factoring it out of the main loop

Test Plan: tests continue to pass

Reviewers: #mercurial, durham, stash

Reviewed By: stash

Subscribers: stash, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4528290

Tasks: 12958591

Signature: t1:4528290:1486567826:cb17e2fa1ab71f491482eba1a88c107a18333781
2017-02-09 03:35:58 -08:00
Ryan McElroy
fefcb2ac11 remotefilelog: remove unused ancestor generation
Test Plan: tests still pass, visual inspection

Reviewers: #mercurial, durham, stash

Reviewed By: stash

Subscribers: stash, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4528287

Tasks: 12958591

Signature: t1:4528287:1486566142:59192cb6aba3f48d84790c4ba026a18a658b9a96
2017-02-09 03:35:58 -08:00
Stanislau Hlebik
665bafef73 remotefilelog: remove unused variable
Test Plan: arc unit

Reviewers: #sourcecontrol, mjpieters

Reviewed By: mjpieters

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4507718

Signature: t1:4507718:1486140219:3f4b6935a5146d06e12099cf786021a05863591a
2017-02-07 00:37:14 -08:00
Durham Goode
4eb48da7a3 remotefilelog: back out threading change
This threading changed introduced a deadlock. I'm not sure what caused it, but
let's back it out for now.
2017-02-01 07:00:31 -08:00
Adam Simpkins
9d76f8392d remotefilelog: don't define a sparsematch() method
Summary:
The remotefilelog extension previously unconditionally added a sparsematch()
method to repository objects, which just called super.sparsematch() if this is
a sparse repository, and returned None if this isn't a sparse repository.

However, defining a sparsematch() method that returns None confuses the sparse
extension.  The sparse extension expects that all repositories that define
sparesmatch() are actually sparse repositories, and never expects this method
to return null.

This updates the remotefilelog code to call its method maybesparsematch()
instead.

Test Plan:
Confirmed existing remotefilelog unit tests pass, and that the sparse extension
no longer crashes when it is used with non-sparse repositories.  (This happens
when using the share extension, if the current working directory is non-sparse
but the sharedpath repository configuration loads the sparse extension.)

Reviewers: tja, durham

Reviewed By: durham

Subscribers: net-systems-diffs@, yogeshwer, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4466108

Signature: t1:4466108:1485451483:287a519151e35bdb99f5be0d9287b4698386183e
2017-01-26 11:01:31 -08:00
Olivier Trempe
7a3f21ad7d remotefilelog: deadlock due to OS buffer filled up on large batch requests
Summary:
On Windows, the default SSH client shipped with TortoiseHg is TortoisePlink.
It is very slow and for some reason, we did not experiment deadlocks using this
client. (My unverified hypothesis is that it may have an internal buffer to
manage large requests).

Using another SSH client such as MSYS ssh.exe (shipped with git Windows
installer) dramatically speeds up tranfers (~8 times). However, using this client
yields the deadlock problem.

NOTE: I don't know why it's not a problem on linux boxes.

The fix is to issue requests from a separate thread, so we don't enter a
deadlock on the client.
2017-01-26 09:13:20 -08:00
Mateusz Kwapich
708d51fe90 remotefilelog: ommit prefetching when commitctx has a manifest
Summary:
this is neccessary for the new metaedit to not spend time in
remotefilelog

Test Plan: tests are passing

Reviewers: #sourcecontrol, rmcelroy, durham

Reviewed By: durham

Subscribers: durham, rmcelroy, quark, jeroenv, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4326589

Tasks: 14548013

Signature: t1:4326589:1485343332:df8f7f6169e148b4724b088878ca8dfff34e9275
2017-01-25 06:07:19 -08:00
Durham Goode
d3c6def7b8 remotefilelog: move pack file permissions to be in mutablebasepack
Summary:
Treemanifest had a bug where the pack files it created were 400 instead of 444.
This meant people sharing a cache on the same machine couldn't access them. In
most of the remotefilelog code we had set opener.createmode before creating the
pack but we forgot to for treemanifest.

This patch moves the opener creation and createmode setting into the mutable
pack class so we can't screw this up again.

Test Plan:
Tests that wrote to the packs before, now failed and had to be
updated to chmod the packs before writing to them.

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4411580

Tasks: 15469140

Signature: t1:4411580:1484321293:9aa78254677548a6dc2270c58cee0ec6f57dd089
2017-01-13 09:42:25 -08:00
Adam Simpkins
ab81106b08 [remotefilelog] don't crash on invalid pack files
Summary:
We have run into some cases where users ended up with empty pack file in their
packs directory.  Just log a warning in this case and skip this pack file,
rather than letting the exception propagate up and crashing the command.

Test Plan:
Created empty 0000000000000000000000000000000000000000.histpack and
0000000000000000000000000000000000000000.histidx files in my repository's
hgcache directory, and confirmed that "hg log" now simply warns about them
instead of crashing.

I didn't really test the perftest.py or treemanifest_correctness.py extensions
much.  They seem to throw exceptions, and look like they have maybe gotten a
bit stale.  I fixed one minor typo, but didn't dig into the other exceptions
too much.

Reviewers: durham

Reviewed By: durham

Subscribers: net-systems-diffs@, yogeshwer, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4402516

Tasks: 15428659

Signature: t1:4402516:1484155254:96d2980efcec2d735257b08910e1ca437ef1dad6
2017-01-12 09:47:29 -08:00
Aaron Grattafiori
7221a13080 remotefilelog: add prefetch progress bar
Summary: Adding progress bar to slow prefetch process.

Test Plan: Perform a prefetch with :-10 or :-100 and see a progress bar displayed.

Reviewers: hanw, mjpieters

Reviewed By: mjpieters

Subscribers: mjpieters, hanw

Differential Revision: https://phabricator.intern.facebook.com/D4398078

Tasks: 9993344

Signature: t1:4398078:1484071212:f906493464340bea103be66133aa7026284143c7
2017-01-10 12:07:15 -08:00
Jun Wu
d3424dd296 fastannotate: work better with remotefilelog
Summary:
fastannotate will tell remotefilelog what nodes of a file is already known in
linelog + revmap, so remotefilelog will not prefetch those files. Previously,
fastannotate either prevents all prefetching or allows all prefetching, which
is sub-optimal.

fastannotate could now donate its sshpeer to remotefilelog, so remotefilelog
won't need to start another one (assuming they can share a same sshpeer,
could be turned off via config options). This should reduce run time if SSH
handshake is expensive.

fastannotate could now also steal sshpeer from remotefilelog, so fastannotate
won't need to start another one. Combined with the above change, there would
always be only one sshpeer shared by fastannotate and remotefilelog.

Test Plan: Modified existing tests

Reviewers: #sourcecontrol, durham

Reviewed By: durham

Subscribers: ikostia, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4325382

Signature: t1:4325382:1481933531:39d6344b2c076fbbff1f07997cd268e7cee25e80
2016-12-19 12:13:51 -08:00
Jun Wu
d633064ec0 remotefilelog: correct ResponseError usage
Summary:
This fixed a user report (P56867550) where the error message does not get
shown correctly, because ResponseError takes two parameters.

Also did a minor change for the first ResponseError to comply the format.

Test Plan: `arc diff`

Reviewers: durham, #sourcecontrol, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4280775

Signature: t1:4280775:1481016196:4762161be699e5d215ec86b2d1385493d977a2b6
2016-12-05 23:02:22 +00:00
Durham Goode
67f6d86cd7 treemanifest: add test for incremental tree repack
Summary:
This adds a test for hg repack --incremental handling tree packs. It fixes a few
bugs that were caught in the process:

1. Since remotefilelog was being imported via it's file path, it was being
loaded into the process as hgext_remotefilelog. When treemanifest loaded it into
the process via 'import remotefilelog' it was being imported as 'remotefilelog'.
This meant we had the same types imported into the same process with two
different names, which meant 'isinstance()' checks could fail when they
shouldn't (which affects incremental repacks). So we just drop the filepath.

2. We need to allow repacking local tree manifest data even if the full delta
chain isn't available (since part of the delta chain may be in the cache). So we
add allowincomplete to the data pack in this case.

Test Plan: Ran it

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4262412

Signature: t1:4262412:1480706110:45bb0a1e1b031f9cfd4658a5071bbac5df2f6543
2016-12-02 14:38:00 -08:00
Durham Goode
9514bca8ce packs: make debug*pack commands take more paths
Summary:
Previously 'hg debug[data|hist]pack' required passing the filepath without the
suffix (just up to the hash). This was kind of a pain when using tab complete,
and a pain in tests when trying to use 'find' to run hg debugdatapack on a
number of files. Let's make the debug commands more forgiving.

Test Plan: Manual verification

Reviewers: #mercurial, dsyang, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4261586

Signature: t1:4261586:1480705548:1aa5dbe44ed5e29c28a29a8256de12ffee8d7387
2016-12-02 14:37:53 -08:00
Durham Goode
27e07eeed4 remotefilelog: make repack work for non-remotefilelog repos
Summary:
treemanifest also has the concept of repack and shares the same code as
remotefilelog's repack. We want treemanifest to be usable even without
remotefilelog, so let's update the repack code to not require the presence of
remotefilelog configured members on the repo object.

In the long term we'll probably move the repack and pack code out of
remotefilelog entirely.

Test Plan: A future patch adds a test that caught this

Reviewers: #mercurial, dsyang, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4261571

Signature: t1:4261571:1480705343:2ceabbbbdeaf7408ef8b94ad3e670b9423162399
2016-12-02 14:37:49 -08:00
Durham Goode
edd8b25c30 repack: add incremental tree repacking
Summary:
Previously we only supported full repacks of the manifest stores. This patch
adds incremental repacking for both the shared manifest store and the local
manifest store.

Test Plan:
Did a few pulls to produce a few treemanifest pack files. Ran hg
repack --incremental after each one and verified it was a no-op until there were
enough pack files to trigger the repack.

Will add tests later today.

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4260602

Signature: t1:4260602:1480705304:b17d3ec5ff8d4caafe935b2c4e941454052fe3ec
2016-12-02 14:37:47 -08:00
Durham Goode
046fb2ca08 repack: enable repacking of local manifest stores
Summary:
Previous hg repack would only repack the shared manifest cache store. This patch
makes it also repack the local manifest store too.

Test Plan:
Made a local commit in my test repo with treemanifests enabled.
Rebased the commit to master so now there were two local tree packs. Ran hg
repack and verified they were combined into one pack in
.hg/store/packs/manifests

I'll add a test later today

Reviewers: #mercurial, mjpieters

Reviewed By: mjpieters

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4260580

Signature: t1:4260580:1480696151:8f989e299dda50281ca63489e870202eb195d714
2016-12-02 14:37:45 -08:00
Durham Goode
26e1154b71 repack: make repack more generic
Summary:
Previously the repack logic was hardcoded to only work on the shared cache
stores. This patch refactors that knowledge to a higher level so a future patch
can also repack the local stores as well.

Test Plan: Ran the tests

Reviewers: #mercurial, mjpieters

Reviewed By: mjpieters

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4260569

Signature: t1:4260569:1480694802:30ab24dd031661227b4da1febc3d3daf9ad598fb
2016-12-02 14:37:42 -08:00
Jun Wu
22742a1054 fastannotate: conditionally disable remotefilelog annotate prefetching
Summary:
If the annotate cache is up-to-date on the main branch, there is likely no need
to prefetch file contents, unless the user is annotating a side branch, which
requires a deeper integration between fastannotate and remotefilelog to just
prefetch the missing part correctly - I'll think about it later while this diff seems
good enough for common cases.

Test Plan: Added a new test

Reviewers: durham, #sourcecontrol, stash

Reviewed By: stash

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4256370

Signature: t1:4256370:1480612767:c2e2a9225fb63ff36ffb579f9641851eb8cd8b39
2016-12-01 02:12:53 +00:00
Durham Goode
6f15ced334 remotefilelog: repack tree manifests
Summary:
Previously hg repack would only repack file content pack files. This patch makes
it also repack tree manifest pack files.

Test Plan:
Ran pull repack in a repo and verified the manifest packs were
repacked. I'll add some tests around this at some point.

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4240723
2016-11-29 16:00:39 -08:00
Durham Goode
fe6f541ed8 remotefilelog: reuse deltabase when history is not available
Summary:
Previously, if there was no history available for a given revision, we would
just store the full text in the pack file. This patch makes it attempt to reuse
the existing delta base instead. This will be useful for repacking
treemanifests, since we currently don't have histpacks for them (we just delta
them efficiently when they are first added to the repo).

Test Plan: Ran tests

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4240705
2016-11-29 15:37:58 -08:00
Jun Wu
f08e17d3ed testedwith: change testedwith to "ships-with-fb-hgext"
Summary:
Using `testedwith = 'internal'` is not a good habit [1]. Having it
auto-updated in batch would also introduce a lot of churn. This diff makes
them "ships-with-fb-hgext". If we do want to fill the ideal "testedwith"
information, we could put it in a centric place, like a "fbtestedwith"
extension rewriting those "ships-with-fb-hgext" on the fly.

Maybe having in-repo tags for tested Mercurial releases is also a good idea.

[1]: www.mercurial-scm.org/repo/hg/rev/2af1014c2534

Test Plan: `arc lint`

Reviewers: #sourcecontrol, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4244689

Signature: t1:4244689:1480440027:3dc18d017b48beba1176fbfd120351889259eb4b
2016-11-29 13:24:07 +00:00
Martijn Pieters
d574fb13ed import mercurial.debugcommands if present
Summary:
The debugindex and debugindexdot commands have moved and are not registered
unless you import the new mercurial.debugcommands module.

Test Plan:
Run

   hg --config=extensions.remotefilelog=fb-hgext/remotefilelog help

and observe that you get help info, rather that the error

   hg: unknown command 'debugindex'

then run the fb-hgext test suite.

Reviewers: rmcelroy, quark, simonfar

Reviewed By: simonfar

Subscribers: mjpieters, #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4244047

Signature: t1:4244047:1480427216:dcaa1ca441ea189bdf68f1f619b4078d8c1d09dc
2016-11-29 13:48:59 +00:00
Stanislau Hlebik
2a0e8205d4 extutil: create new package and move runshellfast here
Summary:
I'm going to use runshellfast in infinitepush.
To avoid copy-paste let's move it to the separate package.
Note: fastmanifest also has runshellfast but it's implementation
is a bit different and seems that it doesn't work with remotefilelog.
So I'll leave it alone for now.

Test Plan: python run-tests.py -j20

Reviewers: #sourcecontrol

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4175836
2016-11-21 00:52:30 -08:00
Durham Goode
508d4b493b remotefilelog: update to work with manifestlog
Summary:
Upstream has refactored the manifest class, so we need to update remotefilelog
to work with the new structure.

Also fix a mismatch with the new adjustlinkrev signature.

Test Plan: Ran the tests

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4180879

Signature: t1:4180879:1479285946:d5ac954c495dc8b802d276ab5ac71626a1726612
2016-11-16 12:11:08 -08:00
Jun Wu
77d215b481 remotefilelog: check file existence in _revertprefetch
Summary: This is to address a crash report: P56804194.

Test Plan: `arc unit`

Reviewers: durham, #sourcecontrol, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4141002

Signature: t1:4141002:1478547571:2a06a48c3687ce91f6a52b59ddc01a17466782a8
2016-11-07 18:30:10 +00:00
Durham Goode
201d5f2149 remotefilelog: fix getchangegroup signature
Summary:
The upstream getchangegroup function changed back in August and our attempt at
fixing it here did not correclty handle the case where remotefilelog is loaded
but not enabled for this repository.

This patches fixes the signature, and makes our wrapping more resitant to future
signature changes.

Test Plan: Added a test.

Reviewers: #mercurial, jsgf

Reviewed By: jsgf

Subscribers: jsgf, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4099519

Signature: t1:4099519:1477698801:adcb406f1908c4f7e508e5f99e126f383c6f2574
2016-10-31 13:40:33 -07:00
Durham Goode
36ec03fd19 remotefilelog: improve robustness of hg gc loop
Summary:
We've gotten reports of hg gc failing on some service machines because
`peer._repo.name` complains that repo has no attribute 'name'. I'm not sure how
this could happen, but it makes sense to make the hg gc loop more robust to the
possibility that the repos in the 'repos' file have changed their configuration
since they were added to the file.

Test Plan: Ran the tests

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4072719

Signature: t1:4072719:1477385020:24d532b9442292ce8234cc91bc7de503d3b0c88f
2016-10-25 12:30:59 -07:00
Augie Fackler
857f31c0d7 fileserverclient: avoid ever requesting nullid nodes
I keep tripping over bugs where this angers some part of our
stack. It's dumb to even be fetching these, but it's harmless to skip
them, so issue a developer warning when we encounter one and refuse to
fetch it.
2016-10-25 12:29:44 -07:00
Durham Goode
fe06171422 remotefilelog: fix inconsistency with nonoverlap in upstream
Summary:
Upstream changed the signature of computenonoverlap. Let's change our wrapping
of it to be more robust to signature changes.

Test Plan: ./run-tests.py test-remotefilelog* test-check*

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4062705

Tasks: 14037455

Signature: t1:4062705:1477096049:5a011a7a5edf9bb01475694777c738cdb02453f5
2016-10-21 17:29:06 -07:00
Durham Goode
3707e186b6 treemanifest: add local pack store to unionstore
Summary:
In a future patch we will start writing user local tree data into a local
directory. This patches add the local store to the union store so the contents
will be accessible once we start writing it.

It also renames manifest to manifests for consistency with the other store names
(like 'packs').

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4055827

Signature: t1:4055827:1477059457:0d5b0d999b47d88c73f5ab2721d8d27deacc01bc
2016-10-21 11:02:22 -07:00
Durham Goode
4235752853 remotefilelog: rename getpackpath to getcachepackpath
Summary:
In a future diff we will be introducing packs into .hg/store, so we need to
differentiate between cache packs and local packs. This patch renames
getpackpath to getcachepackpath. A future diff will add getlocalpackpath.

This exposed a pyflakes error, so we fix that too.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4055815

Signature: t1:4055815:1477059415:e0221557bbeec6701820c826f00390d3a71cd2d3
2016-10-21 11:02:20 -07:00
Durham Goode
c9037da765 remotefilelog: refactor mutablepack close/abort
Summary:
This makes mutablepacks close and abort function accessible from outside the
__exit__ logic. This will be useful later when we tie the lifetime of a
mutablepack to a transaction.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4055730

Signature: t1:4055730:1477059602:ffdfd66e65279ddf3ff43d7c2ee65b00f1fd2600
2016-10-21 11:02:17 -07:00
Durham Goode
f43ba75915 remotefilelog: fix pyflakes and module import errors
Summary:
This fixes all the pyflaks and module errors for the main remotefilelog
code base.

Test Plan: ./run-tests.py test-check* test-remotefilelog*

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4055537

Signature: t1:4055537:1477049663:ee904d311d17d3659e055e2c109c68c9023cfd1f
2016-10-21 11:02:09 -07:00
Jun Wu
9e2525ff74 remotefilelog: cache linkrev
Summary:
Calculating linkrev can be expensive for remotefilelog, while
non-remotefilelog code would usually expect `fctx.linkrev()` to be very cheap.
So let's cache it.

Test Plan: `arc diff`

Reviewers: #sourcecontrol, durham

Reviewed By: durham

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3936757

Signature: t1:3936757:1475078470:7630d3246a5ca9407456384ae0a9aff1035d5201
2016-09-28 14:28:20 +01:00
Durham Goode
a07c114b8d remotefilelog: fix to use new manifestlog functions
Summary:
readfast and readdelta have moved onto the manifestctx structures in upstream
hg, so let's change remotefilelog to use them.

This breaks the ability for this version of remotefilelog to work with old
versions of Mercurial, but we never really guaranteed this to begin with.

Test Plan: Ran the tests, they now pass

Reviewers: #sourcecontrol

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3915785
2016-09-23 13:28:36 -07:00
Durham Goode
df86b3486d treemanifest: auto create manifest pack directory
The pack/manifests directory wasn't being automatically created, so let's make
it so.
2016-09-21 13:51:39 -07:00
Durham Goode
50d6b599f4 Move ctreemanifest and cdatapack out of remotefilelog
These don't really have any dependencies on remotefilelog, so let's move them
out.
2016-09-21 13:55:12 -07:00
Durham Goode
596c74e744 ctreemanifest: replace treemanifest.copy with copy constructor
Summary:
Originally we wanted to use treemanifest.copy so it was explicit when a user was
performing a copy. Unfortunately, pytreemanifest contains a treemanifest as a
field (not a pointer), which means we can't call copy and assign the result to
that location. Let's use a copy constructor instead which lets us perform the
copy using a placement allocation.

There is no change to the consumers of this api, because nothing actually
consumed it. The pytreemanifest code was performing a copy using the default
copy constructor, so things were just broken before.

Test Plan: Later patches worked

Reviewers: #fastmanifest

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3885018
2016-09-19 13:25:34 -07:00
Durham Goode
b8f0befd50 ctreemanifest: initialize FindContext values
Summary:
FindContext did not initialize any of its values, and some of its callers did
not either. This meant 'invalidate_checksums' was set to a random value, which
meant it was almost always treated as true. Let's initialize the fields.

Test Plan: Later patches worked

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3885017
2016-09-19 13:25:34 -07:00
Durham Goode
8e46cac9f0 ctreemanifest: replace fetcher.get() calls in diffrecurse
Summary:
diffrecurse was still calling the fetcher directly. This caused it to not access
'resolved' manifests correctly in some cases. Let's change all uses to use
entry->get_manifest() which handles this stuff correctly.

Test Plan: Later patches worked

Reviewers: #fastmanifest

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3885016
2016-09-19 13:25:34 -07:00
Durham Goode
f9d88db548 ctreemanifest: prevent manifest entry resolve pre-filling in some cases
Summary:
entry->initialize(filename, filenamelen, ...) function also sets the this->resolved
to a new empty Manifest. This has a couple bad side effects. 1) If
this->resolved was already present, it is overwritten and the memory leaks (such
as when entry->update() is called after it has already been resolved. 2) For the
root level manifest, it means when something goes to access it, it doesn't
bother to try to load the actual manifest data from the store. It just serves an
empty manifest instead. So this completely broke reading from the store.

This temporary fix is to handle entry->resolved manually outside the call in a
couple spots (root manifestentry creation, and entry->update). Once some of my
other changes go in, we should come back here and probably remove the "new
Manifest()" from entry->initialize() entirely.

Test Plan: Later patches worked

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3885015
2016-09-19 13:25:34 -07:00
Durham Goode
e0483997ca remotefilelog: fix ctreemanifest serialization
Summary:
The serialization was completely broken because std::string.append("\0") treated
the argument as a c-string which was completely empty. Let's use the
append(char*, size_t) version instead to make sure it appends the correct
characters.

Also makes othere uses of append in that function match the same pattern.

Test Plan: Later patches worked

Reviewers: #fastmanifest

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3885014
2016-09-19 13:25:34 -07:00
Durham Goode
55d5e7a777 remotefilelog: fix Manifest copy construction and clean up iterator usage
Summary:
The Manifest(PythonObj) constructor parses the given python string and creates a
list of entries for them. During a copy, if the list of entries on the copy
source has been modified to differ from the original PythonObj, this results in
the new copy having incorrect entries. Instead of using the
ManifestEntry(PythonObj) constructor, let's use the default constructor, then
manually fill in the entries ourselves.

We still need to copy the PythonObj over to the new version, since that
increments the reference count and keeps the string alive (since some of our
entries may be referencing it).

This also cleans up some usages of iterators to be less dependent on the
internal side effects of called functions (i.e. don't depend on the called
function to move the iterator forward).

Test Plan: Later patches worked

Reviewers: #fastmanifest

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3885013
2016-09-19 13:25:34 -07:00
Durham Goode
1fad56201c remotefilelog: don't use reference param for ManifestEntry init
Summary:
The old ManifestEntry init used a reference char* so it could move the pointer
to the end of the current read before it return, so the caller could then call
the function again to get the next entry. This caused problems when we tried to
use that initialization function to copy ManifestEntry's later, since it moved
the internal pointer of the entry we were copying.

Let's get rid of the & and make the function just return the pointer to the end
of the read, and the caller can decide to use it or not.

Test Plan: Later patches worked

Reviewers: #fastmanifest

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3885012
2016-09-19 13:25:34 -07:00
Durham Goode
22ba1933b9 remotefilelog: fix missing import
Summary:
This is used by markforrefresh, which starts being called in a later
patch.

Test Plan: Later patches worked

Reviewers: #fastmanifest

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3885011
2016-09-19 13:25:34 -07:00
Durham Goode
29486895ce ctree: implement treemanifest.write
Summary:
This adds a new write() function to treemanifest that allows us to serialize a
treemanifest instance into a provided data store.

Test Plan: A future diff uses this to serialize trees into a pack file.

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3838787
2016-09-20 12:42:23 -07:00
Durham Goode
3dd98f3934 ctree: add compareName function to ManifestEntry
Summary:
This adds a useful function for performing name comparisons on manifest entries.
This will be used in a future patch to simplify iterating over two manifests at
the same time.

Test Plan:
Used this as part of a future diff that serializes manifests into
pack files.

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3837743
2016-09-19 13:25:34 -07:00
Durham Goode
60625cf752 ctree: add Manifest.computeNode
Summary:
This adds a function for computing the final node of a manifest, given its two
parent nodes.

Test Plan:
Used this as part of a future diff that serialize manifests into a
pack file.

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3837737
2016-09-19 13:25:34 -07:00
Jun Wu
6007cd6e15 remotefilelog: remove duplicated testedwith
Summary: There is a line `testedwith = ''` below.

Test Plan: Code Review

Reviewers: #mercurial, ttung, rmcelroy

Reviewed By: rmcelroy

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3886521

Signature: t1:3886521:1474301901:cceadfc1d5b181435362d7ae75014bad1a75576f
2016-09-19 15:52:20 +01:00
Tony Tung
2bbc0657b8 [ctree] use unsigned characters to do the mercurial sorting
Summary:
This is required to get the proper sorting with characters with the MSb set.

Wrote unit test to cover it.

Test Plan:
1. passed that unit test.
2. `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.treemanifest_correctness=~/work/mercurial/facebook-hg-rpms/fb-hgext/tests/treemanifest_correctness.py --config remotefilelog.fastdatapack=True testtree  --build "master~50000::master" --revs 'master + master~5000'` is clean now!

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3869292

Signature: t1:3869292:1473922557:4c691f696ea991a578a151b8091ae44beff528df
2016-09-18 13:46:47 -07:00
Tony Tung
97688eb7a1 [ctree] fix output for diff
Summary: No flag == '' in mercurial world, not None.

Test Plan: much less spewage from `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.treemanifest_correctness=~/work/mercurial/facebook-hg-rpms/fb-hgext/tests/treemanifest_correctness.py --config remotefilelog.fastdatapack=True testtree  --build "master~50000::master" --revs 'master + master~5000'`

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3869289

Signature: t1:3869289:1473922533:d3f340c7f9cdca2a449b71c6791cb980fa244de6
2016-09-18 13:46:16 -07:00
Tony Tung
9ace5a2f03 [ctree] allow a directory and a file of the same name to coexist
Summary:
Mercurial core does this sometimes where it sets the new entries before it removes the old entries.  In that case, you might have:

old manifest:

```
abc
```

operations:
```
+ abc/def
- abc
```

If you add `abc/def` to a manifest already containing `abc`, it will throw a `TypeError` today.  With this change, we accept this weird state.

Test Plan: Update the tests to cover both versions of this scenario (directory was there first vs file was there first).  Pass all tests.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3865617

Signature: t1:3865617:1473897481:4cf4f122dfd3a3759fe84ea167b8cb0e78238bc2
2016-09-18 13:44:50 -07:00
Tony Tung
40e41cf202 [ctree] fileiter can return in mercurial-sorted order
Summary:
fileiter/stackframe gain a `sorted` field, which it uses to determine whether to use a ManifestIterator or a SortedManifestIterator.

add a unit test to test this scenario

Test Plan: pass unit tests.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3868088

Signature: t1:3868088:1474225427:64d19f3f58376682850e46a3547e7a6ce1de97f4
2016-09-18 13:44:17 -07:00
Tony Tung
4a1692385e [ctree] add SortedManifestIterator
Summary:
Mercurial has this odd sort order where the full path is used in the sort order.  Therefore 'abc.def' would precede 'abc/def'.

This introduces a comparator that follows this rule.  We add a new class, SortedManifestIterator, which returns entries in this order.

Test Plan: compiles

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3862339

Signature: t1:3862339:1473900496:7f6fed3cdba055bb1791fbc8ae08546e24c96f10
2016-09-18 13:42:09 -07:00
Tony Tung
ee31c666ba [ctree] use a const char* to represent the flag
Summary: This way, we can have a true nullable (null, a.k.a., not present, and any character).

Test Plan: make local and pass tests

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3855340

Signature: t1:3855340:1473807404:54c1ae17c735665935a52b0140b58094f0b45a26
2016-09-18 13:40:14 -07:00
Tony Tung
9d7c2ba6d2 [ctree] remove double initialization of root.resolved
Summary: `initialize()` does it already.

Test Plan: run unit tests.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3869524

Signature: t1:3869524:1474225262:57ccfd92a4d02bc92674007bcafaeafedef413a0
2016-09-18 13:38:32 -07:00
Tony Tung
70ecb670d8 [ctree] fix root resolution
Summary: D3842603 broke reads from datapacks because `.initialize()` creates a blank manifest and attaches it to the manifest entry.  When we subsequently try to read the root, we don't properly actually resolve it to the manifest in the datapack.  Instead, we use that blank manifest which is wrong.

Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/  /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/fb-hgext/tests/perftest.py --config remotefilelog.fastdatapack=False testtree --kind flat,ctree,fast --test fulliter --build "master~50000::master" --revs 'master + master~5000'

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3862319

Signature: t1:3862319:1474225268:f01ef876d569bc09c1e0ace71492bbaae017404e
2016-09-18 13:38:01 -07:00
Tony Tung
5e9146e7ad [ctree] invalidate nodes when descendent nodes have been modified
Summary: Any modifying operation sholud set the `invalidate_checksum` flag, and as we unwind the stack, we clear the nodes if the flag is set.

Test Plan: not yet tested

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3842843

Tasks: 13268688

Signature: t1:3842843:1473709799:89a82ffade1e56fe64ccae50abecd3de4b46f6bd
2016-09-12 15:30:07 -07:00
Durham Goode
b2af4f7300 ctree: add Manifest.serialize function
Summary:
This adds a Manifest function that can produce a serialized version of the
manifest that is appropriate for writing into a store.

Test Plan:
Used it as part of a future diff that serialized manifests to a pack
file.

Reviewers: #fastmanifest, ttung

Reviewed By: ttung

Subscribers: ttung, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3837714

Signature: t1:3837714:1473454260:1c5864730744a560cbad4fc4ab5263ef4447f7dc
2016-09-12 11:44:59 -07:00
Durham Goode
8ce9d887d1 store: add markforrefresh api
Summary:
This api will be used to tell the store that new data has been added and the
store should check disk for new data (like new pack files).

Test Plan:
Used as part of a future diff that generates new tree entries during
hg pull time.

Reviewers: #fastmanifest, ttung

Reviewed By: ttung

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3837694

Signature: t1:3837694:1473454036:0f1f37064cadc0fe3dabc6ecc6e3f7b3319e3d56
2016-09-12 11:44:53 -07:00
Durham Goode
4a0a22abdc ctree: implement treemanifest_copy
Summary:
This implements the copy function, which will be used to create a copy of a
parent manifest when beginning to create a new manifest.

Test Plan:
Used in conjunction with future diffs to serialize a new treemanifest
into a pack file.

Reviewers: #fastmanifest, ttung

Reviewed By: ttung

Subscribers: ttung, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3837415

Signature: t1:3837415:1473453896:494fc856256764f14741c65258f354e08383f1f7
2016-09-12 11:44:47 -07:00
Tony Tung
578f49b708 [ctree] switch the root to a ManifestEntry object
Summary: We can't properly mark root node unless it's... well, markable.

Test Plan: `make -C .. local && PYTHONPATH=~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/tests/run-tests.py test-remotefilelog-datapack.py`

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3842603

Signature: t1:3842603:1473550335:91e23cf0fb1c0abb51d12f6b289d87b955543228
2016-09-10 16:32:28 -07:00
Tony Tung
430b9a37ec [ctree] get rid of the remaining magic numbers for hash size (bin and hex)
Test Plan: make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3837310

Signature: t1:3837310:1473377616:d6c917e93efd3b64cebe4ac05fb60f966283b3ab
2016-09-10 16:27:53 -07:00
Tony Tung
f8a094bef9 [ctree] python bindings for remove and simple test
Summary:
Removes are treated as set to None.

The API is not exactly correct, as manifestdict needs to implement size/getitem/setitem, but we can refactor in a subsequent diff.

Test Plan: pass unit test

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3836197

Signature: t1:3836197:1473363870:21a447e4bc17148193fde7ed4887927089e54930
2016-09-10 16:27:40 -07:00
Tony Tung
997b26f2fe [ctree] c++ interface for remove
Summary: Uses `walk()` to locate the element.  Callback is super simple again.

Test Plan: used in later diff.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3836142

Signature: t1:3836142:1473363781:0384d2c0b7154d3acbbab436c12bbdb55a5a63a7
2016-09-09 15:24:31 -07:00
Tony Tung
db1cad6baa [ctree] python bindings to treemanifest.set() and tests!!!!!!!!
Summary:
Python binding is pretty straightforward.  The test exercises a few simple paths: set, update, and conflict detection.

Not yet done: tests to set a deeply nested directory structure.

Test Plan: PASSED THE SIMPLE TESTS.  YES!

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3832682

Signature: t1:3832682:1473363415:60240c6bbbcec82184b4588b83bbda750e8b77f1
2016-09-09 15:24:06 -07:00
Tony Tung
9218a52668 [ctree] c++ interface for setting/updating a node
Summary: This uses the `treemanifest.find(..)` interface to locate the desired tree entry and add/update a leaf node (a.k.a. file).  The `set()` method is mostly setup and teardown logic for find, and the little unique code that needs to exist is in `set_callback()`

Test Plan: used in later diff.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: trunkagent, durham, mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3832664

Signature: t1:3832664:1473459754:a258f42791b4c4d561abad043d8ad21ddb48fbed
2016-09-09 15:22:42 -07:00
Tony Tung
54195bfd97 [ctree] method to update a manifest entry's node and flags
Summary: This is needed when we're updating a node that already exists.

Test Plan: used in later diff.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3832631

Signature: t1:3832631:1473362920:8a39f02ced91cc9328c57e416d7dfc9a704b0cc6
2016-09-08 13:39:48 -07:00
Tony Tung
d7fb495d51 [ctree] return the bin node instead of the hex node for .find()
Summary: Internally, mercurial transacts in bin nodes.

Test Plan: used in later diff

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3832614

Signature: t1:3832614:1473359939:033cb1d944680287fb6563523c7782a80118b9e2
2016-09-08 13:39:31 -07:00
Tony Tung
65ccf8ee98 [ctree] change how ManifestEntry.initialize works to match initialization from flat text
Summary:
flags should be == NULL if it's not present, not that it points to NULL.

Since the flags pointer isn't always set, we have to use the filename pointer to set the separator field ('\n').

Test Plan: used in later diff.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3832555

Signature: t1:3832555:1473359723:9f05dd353f0c1866232214cd0d489d1fb4573f51
2016-09-08 13:34:56 -07:00
Durham Goode
8180b78c83 ctree: implement treemanifest.walk
Summary:
This implements the walk function, which is almost identical to iterkeys, except
it filters the result using a provided matcher.

Test Plan:
As before, not sure how to execute this from in mercurial, but it's
pretty small so we'll test it with the full mercurial test suite once we can do
edits.

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3775631
2016-09-08 11:40:08 -07:00
Durham Goode
83e26c7a23 ctree: implement treemanifest.iterentries
Summary:
This implements the ability to iterate of the entries of the manifest. We reuse
the key iter, but add a boolean option for returning the full (path, node, flag)
tuple instead of just the path.

Test Plan:
I'm not sure of how to execute this via an hg command, so for now I'm
just going to

doitlive

Reviewers: #fastmanifest

Differential Revision: https://phabricator.intern.facebook.com/D3775619
2016-09-08 11:42:34 -07:00
Durham Goode
ac2b28669a ctree: initialize variables before use 2016-09-08 11:40:08 -07:00
Tony Tung
68dd0c9944 [ctree] conversion method to hex from bin.
Summary: This is needed when we execute a `set`, which comes in with a binary node, but is stored as a hex string.

Test Plan: make local; used in later diff.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3831923

Signature: t1:3831923:1473292766:bd4c1fdde356a8de5b4b74328e3805b7e1eae752
2016-09-07 17:50:30 -07:00
Tony Tung
bdf87070fa [ctree] clean up warnings
Summary:
1. remove unused `fetcher` variables.
2. std::string's second arg is a size_t, not a Py_ssize_t.

Test Plan: make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3830569

Signature: t1:3830569:1473292747:823b5c06490c45c0bfcc59373e7935af0c1b3630
2016-09-07 17:50:16 -07:00
Tony Tung
0986a590b3 [ctree] funnel all access to the root manifest via getManifest() method
Summary: Since a lot of accesses are coming from within the treemanifest class, I renamed the cached variable as rootManifest_DO_NOT_ACCESS_DIRECTLY to make people think twice before accessing it.

Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/fb-hgext/tests/perftest.py  testtree --kind flat,ctree,fast --test fulliter,diff,find --build "master~5::master"`

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3823297

Signature: t1:3823297:1473213565:54657eeecff35549c71dc0af91c71cca9d426ac1
2016-09-07 13:00:55 -07:00
Tony Tung
199a37f983 [ctreemanifest] constructor should allow for the construction of an empty manifest
Summary:
The root node is now optional.  If it's not specified, an empty tree is constructed.

Added an unit test to cover this code path.

Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/fb-hgext/tests/perftest.py --config remotefilelog.fastdatapack=True testtree --kind flat,ctree,fast --test fulliter,diff,find --build "master~5::master"` still runs.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3803569

Signature: t1:3803569:1472759620:02c1f723fe3b7f4cdca9da6399fc2296c6d5a022
2016-09-06 12:40:32 -07:00
Tony Tung
3c8381a899 [cfastmanifest] only create a Manifest object for ManifestEntry::initialize if it's a directory
Test Plan: make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3803565

Signature: t1:3803565:1472759508:750a6c87d271260df7d6a3af1e1de48707145839
2016-09-06 12:40:15 -07:00
Tony Tung
1a4389406c [ctreemanifest] add node as a param for addChild
Summary: This allows us to set the node hash when we add the child.

Test Plan: make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3803564

Signature: t1:3803564:1472759281:a5e413ba7cf12319c4460da213ad8be7a0eed673
2016-09-06 12:39:07 -07:00
Tony Tung
96a13a14bf [ctreemanifest] const-ify the manifest directory flag
Summary:
DRY.  Also, it's safer.

Depends on D3772947

Test Plan: make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3796249

Signature: t1:3796249:1472663415:944a5209e0fecd0f9e6a5c54bf40df430b53016e
2016-09-06 12:38:54 -07:00
Tony Tung
6226cfb0f6 [ctreemanifest] no need for treemanifest_XXX now that we're part of the treemanifest class
Test Plan: make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3803563

Signature: t1:3803563:1472759119:8b609f8725dae5c7658160427cd1d6709abe3355
2016-09-06 12:38:40 -07:00
Tony Tung
31dc781871 [ctreemanifest] move treemanifest_find inside treemanifest object
Summary: It's logical.

Test Plan: make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3803562

Signature: t1:3803562:1472759066:83bb06dd04a993677bc59598610d3e0a82480eb8
2016-09-06 12:38:28 -07:00
Tony Tung
fb19ea1321 [ctreemanifest] move treemanifest_get inside the treemanifest class
Summary: This removes a bunch of extra arguments we were passing in because the treemanifest knows about it already.  Also inlined the entire resolve-the-root-manifest operation into a separate method.

Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/fb-hgext/tests/perftest.py --config remotefilelog.fastdatapack=True testtree --kind flat,ctree,fast --test fulliter,diff,find --build "master~5::master"`

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3800793

Signature: t1:3800793:1472758998:c4617894c947f84268a3fa1d0cdde03793889640
2016-09-06 12:38:15 -07:00
Tony Tung
dac5c4103c [ctreemanifest] store the fetcher instead of the store object
Summary: We don't actually care about the store.  We want to invoke the fetcher.  At some future point, we could replace the fetcher with something that goes to the network or whatever, or unit tests can point to something special.

Test Plan: make local && `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/fb-hgext/tests/perftest.py --config remotefilelog.fastdatapack=True testtree --kind flat,ctree,fast --test fulliter,diff,find --build "master~5::master"`

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3800789

Signature: t1:3800789:1472758892:67cee86424da4b7b903f2495e31bdd03d793b21a
2016-09-06 12:11:11 -07:00
Tony Tung
f17f92c8a0 [cfastmanifest] move Find structures to the top of the file
Summary: 1 pass compiler!

Test Plan: make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3800784

Signature: t1:3800784:1472758844:2cb86d11f422837e88230821b77c802def049e32
2016-09-06 12:10:35 -07:00
Tony Tung
b5d0b9f6ea [ctree] move diff C++ code to treemanifest.{cpp,h}
Test Plan: make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3795008

Tasks: 12889491

Signature: t1:3795008:1472663226:63a578eacb007b832007d752c80215117f8d4879
2016-09-06 12:09:55 -07:00
Tony Tung
b168ae251e [ctree] replace _treemanifest_find with treemanifest_get
Summary: treemanifest_find is now a generic method to traverse a tree (like tree_path in cfastmanifest).  It manages the tree by adding and removing intermediate nodes (if requested).  Unlike _treemanifest_find, it uses lower-level APIs such as findChild and addChild, which uses std::list::iterator to avoid duplication of work.  Previously, we would need to traverse the entire list of children to find a child, and then if it's not found, traverse the entire list again to properly locate it.  Now we just find the exact location where the child is or should be, and then save that.

Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/remotefilelog:~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/remotefilelog/tests/perftest.py testtree --kind flat,ctree,fast --test fulliter,diff,find --build "master~5::master"`

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mjpieters, durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3772947

Signature: t1:3772947:1472758787:8e8597c30f2cf1512ac7195bfa44ed62c201deb2
2016-09-06 12:09:42 -07:00
Ryan McElroy
5e7e9048ae remotefilelog: fix for upstream api change
Summary:
In upstream mercurial, c472ca028b32 changed changegroup.getchangegroup() to not
take a 'common' argument anymore.

Test Plan: run tests, no more stack traces

Reviewers: #mercurial, ttung, durham, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3819304

Signature: t1:3819304:1473091041:587ecb88dfcdd806d372825ec4d7512eb7fdb94e
2016-09-05 12:02:28 -07:00
Durham Goode
fd8f1d7c11 ctree: free mmap pages after 1GB instead of 100MB
Summary:
We were seeing performance issues on linux if we performed the madvise every
100MB. Let's do it once every gigabyte instead. The performance cost is pretty
negligble at that level.

Test Plan:
Ran the perf test for fulliter and verified the time went back down
to normal (a factor of 6x).

Reviewers: #fastmanifest, ttung

Reviewed By: ttung

Subscribers: ttung, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3799820

Signature: t1:3799820:1472675258:d279ac773aa4a2e027f4ecbe8a485b299a364efe
2016-08-31 13:28:18 -07:00
Tony Tung
430216e787 [cdatapack] make cdatapack_get valgrind-safe
Summary: strnlen on some platforms use xmm registers to parallelize the operation.  this may cause reads from uninitialized memory, which valgrind generates a false positive error.  to avoid that, we preinitialize the memory *beyond* what we normally write so valgrind does not flag that as reading from uninitialized memory.

Test Plan: valgrind only complains about one uninitialized read (fixed in next diff)

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3799193

Signature: t1:3799193:1472672134:c866b6295cbfc9f4dd5e0d84a3af9d41144981c8
2016-08-31 12:41:32 -07:00
Jun Wu
f68fc35c58 cdatapack: get be64toh defined correctly
Summary:
This is to fix the below compiling error:

```
remotefilelog/cdatapack/cdatapack.c:102:28: error: implicit declaration of function ‘be64toh’ [-Werror=implicit-function-declaration]
   packindex->data_offset = ntoh_data_offset(
                            ^^^^^^^^^^^^^^^^
```

Feature test macros must be defined before libc `#include`s.

Test Plan: `make local` and the above error disappers.

Reviewers: #sourcecontrol, ttung

Reviewed By: ttung

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3797669

Signature: t1:3797669:1472671003:9f8674a14c39042d24c673e41321683807a60fab
2016-08-31 15:29:10 +01:00
Tony Tung
3bc89da6fd [cdatapack] set the return code when everything is fine.
Summary: D3786546 was broken in that it used an uninitialized return value.

Test Plan: `valgrind  /Users/tonytung/Library/Caches/CLion2016.2/cmake/generated/fb-hgext-c4fc6e6f/c4fc6e6f/RelWithDebInfo1/cdatapack_get ~/.hgcache/fbsource/packs/e70f2fc9ec08516a196a6f5f3a475b21b6cee3b2 749a00bf1a195a711de2818b23da78e7f1ee034a` is 100% clean.

Reviewers: #fastmanifest, quark

Reviewed By: quark

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3799213

Signature: t1:3799213:1472671038:51522987c9b3bebc281d17f72a422af542d3d815
2016-08-31 12:21:58 -07:00
Jun Wu
3bd6e0c1e4 ctreemanifest: fix treemanifest_diffrecurse
Summary:
Seems like a logic error.

```
remotefilelog/ctreemanifest/py-treemanifest.cpp:289:34: error: self-comparison always evaluates to false [-Werror=tautological-compare]
           ((bool)selfentry->flag != (bool)selfentry->flag)
            ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
```

Test Plan: `make local`

Reviewers: #sourcecontrol, ttung

Reviewed By: ttung

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3797686

Signature: t1:3797686:1472656256:f8637d6787b0726743d8ced96dc179dc090bf029
2016-08-31 15:37:25 +01:00
Durham Goode
121c178e45 ctree: fix matcher use in py-treemanifest
There was a dumb bug where we attempted to use the matcher if it was NULL,
instead of only using the matcher when it was not NULL.  We just need to remove
the bad !.
2016-08-30 17:44:25 -07:00
Tony Tung
8f4429ebd2 [cdatapack] track how much memory we've paged in for reading the datapack, and madvise away if over a limit
Summary:
Things I considered:
1. doing this after a fixed number of delta chain reads (this would likely be just fine, actually).  there is the risk of a few very long/large chains blowing out the memory footprint.
2. tracking the actual memory addresses and not madvising away everything away. (unlikely to help, since it's probably the syscall + page table updates that is costly, not the address range)
3. letting FreeBSD / MACOS go nuts with MADV_FREE while limiting only linux (which uses the more costly MADV_DONTNEED)

In the end, I settled on this because we're already using this for the python version and it hasn't killed anyone.

Test Plan:
run:

```
#!/usr/bin/env python2.7

import binascii

import cdatapack

dp = cdatapack.datapack('/Users/tonytung/.hgcache/fbsource/packs/e70f2fc9ec08516a196a6f5f3a475b21b6cee3b2')

for ix, line in enumerate(open('/tmp/hashes', 'r')):
    line = line.strip()
    try:
        dp.getdeltachain(binascii.unhexlify(line))
    except:
        print "failed when retrieving " + line
        raise
```

and observed that the memory footprint stayed quite low.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3794420

Tasks: 12829688

Signature: t1:3794420:1472597468:6642f1e6cd85859dbf7cce3e4ce35d5375601be0
2016-08-30 16:21:24 -07:00
Tony Tung
12f10ef039 [cdatapack] handle quirk in lz4's treatment of 0-length inputs
Summary: When lz4 gets a 0-length input to compress, it outputs a standard header (0 byte uncompressed output, followed by 12 mystery bytes.)  When we uncompress this, `LZ4_decompress_fast` returns -1 (!).  So we treat this as an oddball case and don't declare it corrupt.

Test Plan: dump all the entries in my fbsource datapack without any corruption reports

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3794368

Signature: t1:3794368:1472597620:a1368e55b6f74882c778036a5459e92d9c026b6a
2016-08-30 16:21:11 -07:00