Summary:
Sometimes we pass in two revs because we're creating a merge commit.
Let's handle that case better.
Test Plan: tests still pass
Reviewers: #mercurial, durham
Reviewed By: durham
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4528594
Signature: t1:4528594:1486576897:6bd80017dfe2933e4c1a9685ba35bb1af0440eb0
Summary:
See comment in code for details. This addresses a performance
problem that arises when the same remotefilelog repo makes two changes
to a file far apart in history with no other repo making changes to the
same file in between.
Test Plan: new test verifies that linkrevs are fixed up using remote blob
Reviewers: #mercurial, durham
Reviewed By: durham
Subscribers: stash, quark, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4475183
Tasks: 12958591
Signature: t1:4475183:1486576777:81f4f9815cb677c94954c23ecd8aa7edb006ffd9
Summary:
We will be reusing this logic in the next patch; for now we're simply
factoring it out of the main loop
Test Plan: tests continue to pass
Reviewers: #mercurial, durham, stash
Reviewed By: stash
Subscribers: stash, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4528290
Tasks: 12958591
Signature: t1:4528290:1486567826:cb17e2fa1ab71f491482eba1a88c107a18333781
Summary:
The remotefilelog extension previously unconditionally added a sparsematch()
method to repository objects, which just called super.sparsematch() if this is
a sparse repository, and returned None if this isn't a sparse repository.
However, defining a sparsematch() method that returns None confuses the sparse
extension. The sparse extension expects that all repositories that define
sparesmatch() are actually sparse repositories, and never expects this method
to return null.
This updates the remotefilelog code to call its method maybesparsematch()
instead.
Test Plan:
Confirmed existing remotefilelog unit tests pass, and that the sparse extension
no longer crashes when it is used with non-sparse repositories. (This happens
when using the share extension, if the current working directory is non-sparse
but the sharedpath repository configuration loads the sparse extension.)
Reviewers: tja, durham
Reviewed By: durham
Subscribers: net-systems-diffs@, yogeshwer, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4466108
Signature: t1:4466108:1485451483:287a519151e35bdb99f5be0d9287b4698386183e
Summary:
On Windows, the default SSH client shipped with TortoiseHg is TortoisePlink.
It is very slow and for some reason, we did not experiment deadlocks using this
client. (My unverified hypothesis is that it may have an internal buffer to
manage large requests).
Using another SSH client such as MSYS ssh.exe (shipped with git Windows
installer) dramatically speeds up tranfers (~8 times). However, using this client
yields the deadlock problem.
NOTE: I don't know why it's not a problem on linux boxes.
The fix is to issue requests from a separate thread, so we don't enter a
deadlock on the client.
Summary:
this is neccessary for the new metaedit to not spend time in
remotefilelog
Test Plan: tests are passing
Reviewers: #sourcecontrol, rmcelroy, durham
Reviewed By: durham
Subscribers: durham, rmcelroy, quark, jeroenv, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4326589
Tasks: 14548013
Signature: t1:4326589:1485343332:df8f7f6169e148b4724b088878ca8dfff34e9275
Summary:
Treemanifest had a bug where the pack files it created were 400 instead of 444.
This meant people sharing a cache on the same machine couldn't access them. In
most of the remotefilelog code we had set opener.createmode before creating the
pack but we forgot to for treemanifest.
This patch moves the opener creation and createmode setting into the mutable
pack class so we can't screw this up again.
Test Plan:
Tests that wrote to the packs before, now failed and had to be
updated to chmod the packs before writing to them.
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4411580
Tasks: 15469140
Signature: t1:4411580:1484321293:9aa78254677548a6dc2270c58cee0ec6f57dd089
Summary:
We have run into some cases where users ended up with empty pack file in their
packs directory. Just log a warning in this case and skip this pack file,
rather than letting the exception propagate up and crashing the command.
Test Plan:
Created empty 0000000000000000000000000000000000000000.histpack and
0000000000000000000000000000000000000000.histidx files in my repository's
hgcache directory, and confirmed that "hg log" now simply warns about them
instead of crashing.
I didn't really test the perftest.py or treemanifest_correctness.py extensions
much. They seem to throw exceptions, and look like they have maybe gotten a
bit stale. I fixed one minor typo, but didn't dig into the other exceptions
too much.
Reviewers: durham
Reviewed By: durham
Subscribers: net-systems-diffs@, yogeshwer, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4402516
Tasks: 15428659
Signature: t1:4402516:1484155254:96d2980efcec2d735257b08910e1ca437ef1dad6
Summary: Adding progress bar to slow prefetch process.
Test Plan: Perform a prefetch with :-10 or :-100 and see a progress bar displayed.
Reviewers: hanw, mjpieters
Reviewed By: mjpieters
Subscribers: mjpieters, hanw
Differential Revision: https://phabricator.intern.facebook.com/D4398078
Tasks: 9993344
Signature: t1:4398078:1484071212:f906493464340bea103be66133aa7026284143c7
Summary:
fastannotate will tell remotefilelog what nodes of a file is already known in
linelog + revmap, so remotefilelog will not prefetch those files. Previously,
fastannotate either prevents all prefetching or allows all prefetching, which
is sub-optimal.
fastannotate could now donate its sshpeer to remotefilelog, so remotefilelog
won't need to start another one (assuming they can share a same sshpeer,
could be turned off via config options). This should reduce run time if SSH
handshake is expensive.
fastannotate could now also steal sshpeer from remotefilelog, so fastannotate
won't need to start another one. Combined with the above change, there would
always be only one sshpeer shared by fastannotate and remotefilelog.
Test Plan: Modified existing tests
Reviewers: #sourcecontrol, durham
Reviewed By: durham
Subscribers: ikostia, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4325382
Signature: t1:4325382:1481933531:39d6344b2c076fbbff1f07997cd268e7cee25e80
Summary:
This fixed a user report (P56867550) where the error message does not get
shown correctly, because ResponseError takes two parameters.
Also did a minor change for the first ResponseError to comply the format.
Test Plan: `arc diff`
Reviewers: durham, #sourcecontrol, simonfar
Reviewed By: simonfar
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4280775
Signature: t1:4280775:1481016196:4762161be699e5d215ec86b2d1385493d977a2b6
Summary:
This adds a test for hg repack --incremental handling tree packs. It fixes a few
bugs that were caught in the process:
1. Since remotefilelog was being imported via it's file path, it was being
loaded into the process as hgext_remotefilelog. When treemanifest loaded it into
the process via 'import remotefilelog' it was being imported as 'remotefilelog'.
This meant we had the same types imported into the same process with two
different names, which meant 'isinstance()' checks could fail when they
shouldn't (which affects incremental repacks). So we just drop the filepath.
2. We need to allow repacking local tree manifest data even if the full delta
chain isn't available (since part of the delta chain may be in the cache). So we
add allowincomplete to the data pack in this case.
Test Plan: Ran it
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4262412
Signature: t1:4262412:1480706110:45bb0a1e1b031f9cfd4658a5071bbac5df2f6543
Summary:
Previously 'hg debug[data|hist]pack' required passing the filepath without the
suffix (just up to the hash). This was kind of a pain when using tab complete,
and a pain in tests when trying to use 'find' to run hg debugdatapack on a
number of files. Let's make the debug commands more forgiving.
Test Plan: Manual verification
Reviewers: #mercurial, dsyang, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4261586
Signature: t1:4261586:1480705548:1aa5dbe44ed5e29c28a29a8256de12ffee8d7387
Summary:
treemanifest also has the concept of repack and shares the same code as
remotefilelog's repack. We want treemanifest to be usable even without
remotefilelog, so let's update the repack code to not require the presence of
remotefilelog configured members on the repo object.
In the long term we'll probably move the repack and pack code out of
remotefilelog entirely.
Test Plan: A future patch adds a test that caught this
Reviewers: #mercurial, dsyang, rmcelroy
Reviewed By: rmcelroy
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4261571
Signature: t1:4261571:1480705343:2ceabbbbdeaf7408ef8b94ad3e670b9423162399
Summary:
Previously we only supported full repacks of the manifest stores. This patch
adds incremental repacking for both the shared manifest store and the local
manifest store.
Test Plan:
Did a few pulls to produce a few treemanifest pack files. Ran hg
repack --incremental after each one and verified it was a no-op until there were
enough pack files to trigger the repack.
Will add tests later today.
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4260602
Signature: t1:4260602:1480705304:b17d3ec5ff8d4caafe935b2c4e941454052fe3ec
Summary:
Previous hg repack would only repack the shared manifest cache store. This patch
makes it also repack the local manifest store too.
Test Plan:
Made a local commit in my test repo with treemanifests enabled.
Rebased the commit to master so now there were two local tree packs. Ran hg
repack and verified they were combined into one pack in
.hg/store/packs/manifests
I'll add a test later today
Reviewers: #mercurial, mjpieters
Reviewed By: mjpieters
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4260580
Signature: t1:4260580:1480696151:8f989e299dda50281ca63489e870202eb195d714
Summary:
Previously the repack logic was hardcoded to only work on the shared cache
stores. This patch refactors that knowledge to a higher level so a future patch
can also repack the local stores as well.
Test Plan: Ran the tests
Reviewers: #mercurial, mjpieters
Reviewed By: mjpieters
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4260569
Signature: t1:4260569:1480694802:30ab24dd031661227b4da1febc3d3daf9ad598fb
Summary:
If the annotate cache is up-to-date on the main branch, there is likely no need
to prefetch file contents, unless the user is annotating a side branch, which
requires a deeper integration between fastannotate and remotefilelog to just
prefetch the missing part correctly - I'll think about it later while this diff seems
good enough for common cases.
Test Plan: Added a new test
Reviewers: durham, #sourcecontrol, stash
Reviewed By: stash
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4256370
Signature: t1:4256370:1480612767:c2e2a9225fb63ff36ffb579f9641851eb8cd8b39
Summary:
Previously hg repack would only repack file content pack files. This patch makes
it also repack tree manifest pack files.
Test Plan:
Ran pull repack in a repo and verified the manifest packs were
repacked. I'll add some tests around this at some point.
Reviewers: #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D4240723
Summary:
Previously, if there was no history available for a given revision, we would
just store the full text in the pack file. This patch makes it attempt to reuse
the existing delta base instead. This will be useful for repacking
treemanifests, since we currently don't have histpacks for them (we just delta
them efficiently when they are first added to the repo).
Test Plan: Ran tests
Reviewers: #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D4240705
Summary:
Using `testedwith = 'internal'` is not a good habit [1]. Having it
auto-updated in batch would also introduce a lot of churn. This diff makes
them "ships-with-fb-hgext". If we do want to fill the ideal "testedwith"
information, we could put it in a centric place, like a "fbtestedwith"
extension rewriting those "ships-with-fb-hgext" on the fly.
Maybe having in-repo tags for tested Mercurial releases is also a good idea.
[1]: www.mercurial-scm.org/repo/hg/rev/2af1014c2534
Test Plan: `arc lint`
Reviewers: #sourcecontrol, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4244689
Signature: t1:4244689:1480440027:3dc18d017b48beba1176fbfd120351889259eb4b
Summary:
The debugindex and debugindexdot commands have moved and are not registered
unless you import the new mercurial.debugcommands module.
Test Plan:
Run
hg --config=extensions.remotefilelog=fb-hgext/remotefilelog help
and observe that you get help info, rather that the error
hg: unknown command 'debugindex'
then run the fb-hgext test suite.
Reviewers: rmcelroy, quark, simonfar
Reviewed By: simonfar
Subscribers: mjpieters, #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D4244047
Signature: t1:4244047:1480427216:dcaa1ca441ea189bdf68f1f619b4078d8c1d09dc
Summary:
I'm going to use runshellfast in infinitepush.
To avoid copy-paste let's move it to the separate package.
Note: fastmanifest also has runshellfast but it's implementation
is a bit different and seems that it doesn't work with remotefilelog.
So I'll leave it alone for now.
Test Plan: python run-tests.py -j20
Reviewers: #sourcecontrol
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4175836
Summary:
Upstream has refactored the manifest class, so we need to update remotefilelog
to work with the new structure.
Also fix a mismatch with the new adjustlinkrev signature.
Test Plan: Ran the tests
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4180879
Signature: t1:4180879:1479285946:d5ac954c495dc8b802d276ab5ac71626a1726612
Summary:
The upstream getchangegroup function changed back in August and our attempt at
fixing it here did not correclty handle the case where remotefilelog is loaded
but not enabled for this repository.
This patches fixes the signature, and makes our wrapping more resitant to future
signature changes.
Test Plan: Added a test.
Reviewers: #mercurial, jsgf
Reviewed By: jsgf
Subscribers: jsgf, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4099519
Signature: t1:4099519:1477698801:adcb406f1908c4f7e508e5f99e126f383c6f2574
Summary:
We've gotten reports of hg gc failing on some service machines because
`peer._repo.name` complains that repo has no attribute 'name'. I'm not sure how
this could happen, but it makes sense to make the hg gc loop more robust to the
possibility that the repos in the 'repos' file have changed their configuration
since they were added to the file.
Test Plan: Ran the tests
Reviewers: #mercurial, simonfar
Reviewed By: simonfar
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4072719
Signature: t1:4072719:1477385020:24d532b9442292ce8234cc91bc7de503d3b0c88f
I keep tripping over bugs where this angers some part of our
stack. It's dumb to even be fetching these, but it's harmless to skip
them, so issue a developer warning when we encounter one and refuse to
fetch it.
Summary:
Upstream changed the signature of computenonoverlap. Let's change our wrapping
of it to be more robust to signature changes.
Test Plan: ./run-tests.py test-remotefilelog* test-check*
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4062705
Tasks: 14037455
Signature: t1:4062705:1477096049:5a011a7a5edf9bb01475694777c738cdb02453f5
Summary:
In a future patch we will start writing user local tree data into a local
directory. This patches add the local store to the union store so the contents
will be accessible once we start writing it.
It also renames manifest to manifests for consistency with the other store names
(like 'packs').
Test Plan: Ran the tests
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4055827
Signature: t1:4055827:1477059457:0d5b0d999b47d88c73f5ab2721d8d27deacc01bc
Summary:
In a future diff we will be introducing packs into .hg/store, so we need to
differentiate between cache packs and local packs. This patch renames
getpackpath to getcachepackpath. A future diff will add getlocalpackpath.
This exposed a pyflakes error, so we fix that too.
Test Plan: Ran the tests
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4055815
Signature: t1:4055815:1477059415:e0221557bbeec6701820c826f00390d3a71cd2d3
Summary:
This makes mutablepacks close and abort function accessible from outside the
__exit__ logic. This will be useful later when we tie the lifetime of a
mutablepack to a transaction.
Test Plan: Ran the tests
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4055730
Signature: t1:4055730:1477059602:ffdfd66e65279ddf3ff43d7c2ee65b00f1fd2600
Summary:
This fixes all the pyflaks and module errors for the main remotefilelog
code base.
Test Plan: ./run-tests.py test-check* test-remotefilelog*
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4055537
Signature: t1:4055537:1477049663:ee904d311d17d3659e055e2c109c68c9023cfd1f
Summary:
Calculating linkrev can be expensive for remotefilelog, while
non-remotefilelog code would usually expect `fctx.linkrev()` to be very cheap.
So let's cache it.
Test Plan: `arc diff`
Reviewers: #sourcecontrol, durham
Reviewed By: durham
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3936757
Signature: t1:3936757:1475078470:7630d3246a5ca9407456384ae0a9aff1035d5201
Summary:
readfast and readdelta have moved onto the manifestctx structures in upstream
hg, so let's change remotefilelog to use them.
This breaks the ability for this version of remotefilelog to work with old
versions of Mercurial, but we never really guaranteed this to begin with.
Test Plan: Ran the tests, they now pass
Reviewers: #sourcecontrol
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3915785
Summary:
Originally we wanted to use treemanifest.copy so it was explicit when a user was
performing a copy. Unfortunately, pytreemanifest contains a treemanifest as a
field (not a pointer), which means we can't call copy and assign the result to
that location. Let's use a copy constructor instead which lets us perform the
copy using a placement allocation.
There is no change to the consumers of this api, because nothing actually
consumed it. The pytreemanifest code was performing a copy using the default
copy constructor, so things were just broken before.
Test Plan: Later patches worked
Reviewers: #fastmanifest
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3885018
Summary:
FindContext did not initialize any of its values, and some of its callers did
not either. This meant 'invalidate_checksums' was set to a random value, which
meant it was almost always treated as true. Let's initialize the fields.
Test Plan: Later patches worked
Reviewers: #fastmanifest
Differential Revision: https://phabricator.intern.facebook.com/D3885017
Summary:
diffrecurse was still calling the fetcher directly. This caused it to not access
'resolved' manifests correctly in some cases. Let's change all uses to use
entry->get_manifest() which handles this stuff correctly.
Test Plan: Later patches worked
Reviewers: #fastmanifest
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3885016
Summary:
entry->initialize(filename, filenamelen, ...) function also sets the this->resolved
to a new empty Manifest. This has a couple bad side effects. 1) If
this->resolved was already present, it is overwritten and the memory leaks (such
as when entry->update() is called after it has already been resolved. 2) For the
root level manifest, it means when something goes to access it, it doesn't
bother to try to load the actual manifest data from the store. It just serves an
empty manifest instead. So this completely broke reading from the store.
This temporary fix is to handle entry->resolved manually outside the call in a
couple spots (root manifestentry creation, and entry->update). Once some of my
other changes go in, we should come back here and probably remove the "new
Manifest()" from entry->initialize() entirely.
Test Plan: Later patches worked
Reviewers: #fastmanifest
Differential Revision: https://phabricator.intern.facebook.com/D3885015
Summary:
The serialization was completely broken because std::string.append("\0") treated
the argument as a c-string which was completely empty. Let's use the
append(char*, size_t) version instead to make sure it appends the correct
characters.
Also makes othere uses of append in that function match the same pattern.
Test Plan: Later patches worked
Reviewers: #fastmanifest
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3885014
Summary:
The Manifest(PythonObj) constructor parses the given python string and creates a
list of entries for them. During a copy, if the list of entries on the copy
source has been modified to differ from the original PythonObj, this results in
the new copy having incorrect entries. Instead of using the
ManifestEntry(PythonObj) constructor, let's use the default constructor, then
manually fill in the entries ourselves.
We still need to copy the PythonObj over to the new version, since that
increments the reference count and keeps the string alive (since some of our
entries may be referencing it).
This also cleans up some usages of iterators to be less dependent on the
internal side effects of called functions (i.e. don't depend on the called
function to move the iterator forward).
Test Plan: Later patches worked
Reviewers: #fastmanifest
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3885013
Summary:
The old ManifestEntry init used a reference char* so it could move the pointer
to the end of the current read before it return, so the caller could then call
the function again to get the next entry. This caused problems when we tried to
use that initialization function to copy ManifestEntry's later, since it moved
the internal pointer of the entry we were copying.
Let's get rid of the & and make the function just return the pointer to the end
of the read, and the caller can decide to use it or not.
Test Plan: Later patches worked
Reviewers: #fastmanifest
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3885012
Summary:
This is used by markforrefresh, which starts being called in a later
patch.
Test Plan: Later patches worked
Reviewers: #fastmanifest
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3885011
Summary:
This adds a new write() function to treemanifest that allows us to serialize a
treemanifest instance into a provided data store.
Test Plan: A future diff uses this to serialize trees into a pack file.
Reviewers: #fastmanifest
Differential Revision: https://phabricator.intern.facebook.com/D3838787
Summary:
This adds a useful function for performing name comparisons on manifest entries.
This will be used in a future patch to simplify iterating over two manifests at
the same time.
Test Plan:
Used this as part of a future diff that serializes manifests into
pack files.
Reviewers: #fastmanifest
Differential Revision: https://phabricator.intern.facebook.com/D3837743
Summary:
This adds a function for computing the final node of a manifest, given its two
parent nodes.
Test Plan:
Used this as part of a future diff that serialize manifests into a
pack file.
Reviewers: #fastmanifest
Differential Revision: https://phabricator.intern.facebook.com/D3837737
Summary:
This is required to get the proper sorting with characters with the MSb set.
Wrote unit test to cover it.
Test Plan:
1. passed that unit test.
2. `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.treemanifest_correctness=~/work/mercurial/facebook-hg-rpms/fb-hgext/tests/treemanifest_correctness.py --config remotefilelog.fastdatapack=True testtree --build "master~50000::master" --revs 'master + master~5000'` is clean now!
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3869292
Signature: t1:3869292:1473922557:4c691f696ea991a578a151b8091ae44beff528df
Summary:
Mercurial core does this sometimes where it sets the new entries before it removes the old entries. In that case, you might have:
old manifest:
```
abc
```
operations:
```
+ abc/def
- abc
```
If you add `abc/def` to a manifest already containing `abc`, it will throw a `TypeError` today. With this change, we accept this weird state.
Test Plan: Update the tests to cover both versions of this scenario (directory was there first vs file was there first). Pass all tests.
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: durham, mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3865617
Signature: t1:3865617:1473897481:4cf4f122dfd3a3759fe84ea167b8cb0e78238bc2
Summary:
fileiter/stackframe gain a `sorted` field, which it uses to determine whether to use a ManifestIterator or a SortedManifestIterator.
add a unit test to test this scenario
Test Plan: pass unit tests.
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: durham, mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3868088
Signature: t1:3868088:1474225427:64d19f3f58376682850e46a3547e7a6ce1de97f4
Summary:
Mercurial has this odd sort order where the full path is used in the sort order. Therefore 'abc.def' would precede 'abc/def'.
This introduces a comparator that follows this rule. We add a new class, SortedManifestIterator, which returns entries in this order.
Test Plan: compiles
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: durham, mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3862339
Signature: t1:3862339:1473900496:7f6fed3cdba055bb1791fbc8ae08546e24c96f10
Summary: This way, we can have a true nullable (null, a.k.a., not present, and any character).
Test Plan: make local and pass tests
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3855340
Signature: t1:3855340:1473807404:54c1ae17c735665935a52b0140b58094f0b45a26
Summary: `initialize()` does it already.
Test Plan: run unit tests.
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3869524
Signature: t1:3869524:1474225262:57ccfd92a4d02bc92674007bcafaeafedef413a0
Summary: D3842603 broke reads from datapacks because `.initialize()` creates a blank manifest and attaches it to the manifest entry. When we subsequently try to read the root, we don't properly actually resolve it to the manifest in the datapack. Instead, we use that blank manifest which is wrong.
Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/fb-hgext/tests/perftest.py --config remotefilelog.fastdatapack=False testtree --kind flat,ctree,fast --test fulliter --build "master~50000::master" --revs 'master + master~5000'
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: durham, mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3862319
Signature: t1:3862319:1474225268:f01ef876d569bc09c1e0ace71492bbaae017404e
Summary: Any modifying operation sholud set the `invalidate_checksum` flag, and as we unwind the stack, we clear the nodes if the flag is set.
Test Plan: not yet tested
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: durham, mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3842843
Tasks: 13268688
Signature: t1:3842843:1473709799:89a82ffade1e56fe64ccae50abecd3de4b46f6bd
Summary:
This adds a Manifest function that can produce a serialized version of the
manifest that is appropriate for writing into a store.
Test Plan:
Used it as part of a future diff that serialized manifests to a pack
file.
Reviewers: #fastmanifest, ttung
Reviewed By: ttung
Subscribers: ttung, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3837714
Signature: t1:3837714:1473454260:1c5864730744a560cbad4fc4ab5263ef4447f7dc
Summary:
This api will be used to tell the store that new data has been added and the
store should check disk for new data (like new pack files).
Test Plan:
Used as part of a future diff that generates new tree entries during
hg pull time.
Reviewers: #fastmanifest, ttung
Reviewed By: ttung
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3837694
Signature: t1:3837694:1473454036:0f1f37064cadc0fe3dabc6ecc6e3f7b3319e3d56
Summary:
This implements the copy function, which will be used to create a copy of a
parent manifest when beginning to create a new manifest.
Test Plan:
Used in conjunction with future diffs to serialize a new treemanifest
into a pack file.
Reviewers: #fastmanifest, ttung
Reviewed By: ttung
Subscribers: ttung, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3837415
Signature: t1:3837415:1473453896:494fc856256764f14741c65258f354e08383f1f7
Summary:
Removes are treated as set to None.
The API is not exactly correct, as manifestdict needs to implement size/getitem/setitem, but we can refactor in a subsequent diff.
Test Plan: pass unit test
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3836197
Signature: t1:3836197:1473363870:21a447e4bc17148193fde7ed4887927089e54930
Summary: Uses `walk()` to locate the element. Callback is super simple again.
Test Plan: used in later diff.
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3836142
Signature: t1:3836142:1473363781:0384d2c0b7154d3acbbab436c12bbdb55a5a63a7
Summary:
Python binding is pretty straightforward. The test exercises a few simple paths: set, update, and conflict detection.
Not yet done: tests to set a deeply nested directory structure.
Test Plan: PASSED THE SIMPLE TESTS. YES!
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: durham, mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3832682
Signature: t1:3832682:1473363415:60240c6bbbcec82184b4588b83bbda750e8b77f1
Summary: This uses the `treemanifest.find(..)` interface to locate the desired tree entry and add/update a leaf node (a.k.a. file). The `set()` method is mostly setup and teardown logic for find, and the little unique code that needs to exist is in `set_callback()`
Test Plan: used in later diff.
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: trunkagent, durham, mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3832664
Signature: t1:3832664:1473459754:a258f42791b4c4d561abad043d8ad21ddb48fbed
Summary: This is needed when we're updating a node that already exists.
Test Plan: used in later diff.
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3832631
Signature: t1:3832631:1473362920:8a39f02ced91cc9328c57e416d7dfc9a704b0cc6
Summary: Internally, mercurial transacts in bin nodes.
Test Plan: used in later diff
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: durham, mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3832614
Signature: t1:3832614:1473359939:033cb1d944680287fb6563523c7782a80118b9e2
Summary:
flags should be == NULL if it's not present, not that it points to NULL.
Since the flags pointer isn't always set, we have to use the filename pointer to set the separator field ('\n').
Test Plan: used in later diff.
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3832555
Signature: t1:3832555:1473359723:9f05dd353f0c1866232214cd0d489d1fb4573f51
Summary:
This implements the walk function, which is almost identical to iterkeys, except
it filters the result using a provided matcher.
Test Plan:
As before, not sure how to execute this from in mercurial, but it's
pretty small so we'll test it with the full mercurial test suite once we can do
edits.
Reviewers: #fastmanifest
Differential Revision: https://phabricator.intern.facebook.com/D3775631
Summary:
This implements the ability to iterate of the entries of the manifest. We reuse
the key iter, but add a boolean option for returning the full (path, node, flag)
tuple instead of just the path.
Test Plan:
I'm not sure of how to execute this via an hg command, so for now I'm
just going to
doitlive
Reviewers: #fastmanifest
Differential Revision: https://phabricator.intern.facebook.com/D3775619
Summary: This is needed when we execute a `set`, which comes in with a binary node, but is stored as a hex string.
Test Plan: make local; used in later diff.
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3831923
Signature: t1:3831923:1473292766:bd4c1fdde356a8de5b4b74328e3805b7e1eae752
Summary:
1. remove unused `fetcher` variables.
2. std::string's second arg is a size_t, not a Py_ssize_t.
Test Plan: make local
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3830569
Signature: t1:3830569:1473292747:823b5c06490c45c0bfcc59373e7935af0c1b3630
Summary: Since a lot of accesses are coming from within the treemanifest class, I renamed the cached variable as rootManifest_DO_NOT_ACCESS_DIRECTLY to make people think twice before accessing it.
Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/fb-hgext/tests/perftest.py testtree --kind flat,ctree,fast --test fulliter,diff,find --build "master~5::master"`
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: durham, mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3823297
Signature: t1:3823297:1473213565:54657eeecff35549c71dc0af91c71cca9d426ac1
Summary:
The root node is now optional. If it's not specified, an empty tree is constructed.
Added an unit test to cover this code path.
Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/fb-hgext/tests/perftest.py --config remotefilelog.fastdatapack=True testtree --kind flat,ctree,fast --test fulliter,diff,find --build "master~5::master"` still runs.
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3803569
Signature: t1:3803569:1472759620:02c1f723fe3b7f4cdca9da6399fc2296c6d5a022
Summary: This allows us to set the node hash when we add the child.
Test Plan: make local
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3803564
Signature: t1:3803564:1472759281:a5e413ba7cf12319c4460da213ad8be7a0eed673
Summary: This removes a bunch of extra arguments we were passing in because the treemanifest knows about it already. Also inlined the entire resolve-the-root-manifest operation into a separate method.
Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/fb-hgext/tests/perftest.py --config remotefilelog.fastdatapack=True testtree --kind flat,ctree,fast --test fulliter,diff,find --build "master~5::master"`
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: durham, mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3800793
Signature: t1:3800793:1472758998:c4617894c947f84268a3fa1d0cdde03793889640
Summary: We don't actually care about the store. We want to invoke the fetcher. At some future point, we could replace the fetcher with something that goes to the network or whatever, or unit tests can point to something special.
Test Plan: make local && `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/fb-hgext/tests/perftest.py --config remotefilelog.fastdatapack=True testtree --kind flat,ctree,fast --test fulliter,diff,find --build "master~5::master"`
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3800789
Signature: t1:3800789:1472758892:67cee86424da4b7b903f2495e31bdd03d793b21a
Summary: treemanifest_find is now a generic method to traverse a tree (like tree_path in cfastmanifest). It manages the tree by adding and removing intermediate nodes (if requested). Unlike _treemanifest_find, it uses lower-level APIs such as findChild and addChild, which uses std::list::iterator to avoid duplication of work. Previously, we would need to traverse the entire list of children to find a child, and then if it's not found, traverse the entire list again to properly locate it. Now we just find the exact location where the child is or should be, and then save that.
Test Plan: `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/remotefilelog:~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/remotenames/:~/work/mercurial/facebook-hg-rpms/lz4revlog/ /opt/local/bin/python2.7 ~/work/mercurial/facebook-hg-rpms/hg-crew/hg --config extensions.perftest=~/work/mercurial/facebook-hg-rpms/remotefilelog/tests/perftest.py testtree --kind flat,ctree,fast --test fulliter,diff,find --build "master~5::master"`
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mjpieters, durham, mitrandir
Differential Revision: https://phabricator.intern.facebook.com/D3772947
Signature: t1:3772947:1472758787:8e8597c30f2cf1512ac7195bfa44ed62c201deb2
Summary:
In upstream mercurial, c472ca028b32 changed changegroup.getchangegroup() to not
take a 'common' argument anymore.
Test Plan: run tests, no more stack traces
Reviewers: #mercurial, ttung, durham, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3819304
Signature: t1:3819304:1473091041:587ecb88dfcdd806d372825ec4d7512eb7fdb94e
Summary:
We were seeing performance issues on linux if we performed the madvise every
100MB. Let's do it once every gigabyte instead. The performance cost is pretty
negligble at that level.
Test Plan:
Ran the perf test for fulliter and verified the time went back down
to normal (a factor of 6x).
Reviewers: #fastmanifest, ttung
Reviewed By: ttung
Subscribers: ttung, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3799820
Signature: t1:3799820:1472675258:d279ac773aa4a2e027f4ecbe8a485b299a364efe
Summary: strnlen on some platforms use xmm registers to parallelize the operation. this may cause reads from uninitialized memory, which valgrind generates a false positive error. to avoid that, we preinitialize the memory *beyond* what we normally write so valgrind does not flag that as reading from uninitialized memory.
Test Plan: valgrind only complains about one uninitialized read (fixed in next diff)
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3799193
Signature: t1:3799193:1472672134:c866b6295cbfc9f4dd5e0d84a3af9d41144981c8
Summary:
This is to fix the below compiling error:
```
remotefilelog/cdatapack/cdatapack.c:102:28: error: implicit declaration of function ‘be64toh’ [-Werror=implicit-function-declaration]
packindex->data_offset = ntoh_data_offset(
^^^^^^^^^^^^^^^^
```
Feature test macros must be defined before libc `#include`s.
Test Plan: `make local` and the above error disappers.
Reviewers: #sourcecontrol, ttung
Reviewed By: ttung
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3797669
Signature: t1:3797669:1472671003:9f8674a14c39042d24c673e41321683807a60fab
Summary: D3786546 was broken in that it used an uninitialized return value.
Test Plan: `valgrind /Users/tonytung/Library/Caches/CLion2016.2/cmake/generated/fb-hgext-c4fc6e6f/c4fc6e6f/RelWithDebInfo1/cdatapack_get ~/.hgcache/fbsource/packs/e70f2fc9ec08516a196a6f5f3a475b21b6cee3b2 749a00bf1a195a711de2818b23da78e7f1ee034a` is 100% clean.
Reviewers: #fastmanifest, quark
Reviewed By: quark
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3799213
Signature: t1:3799213:1472671038:51522987c9b3bebc281d17f72a422af542d3d815
There was a dumb bug where we attempted to use the matcher if it was NULL,
instead of only using the matcher when it was not NULL. We just need to remove
the bad !.
Summary:
Things I considered:
1. doing this after a fixed number of delta chain reads (this would likely be just fine, actually). there is the risk of a few very long/large chains blowing out the memory footprint.
2. tracking the actual memory addresses and not madvising away everything away. (unlikely to help, since it's probably the syscall + page table updates that is costly, not the address range)
3. letting FreeBSD / MACOS go nuts with MADV_FREE while limiting only linux (which uses the more costly MADV_DONTNEED)
In the end, I settled on this because we're already using this for the python version and it hasn't killed anyone.
Test Plan:
run:
```
#!/usr/bin/env python2.7
import binascii
import cdatapack
dp = cdatapack.datapack('/Users/tonytung/.hgcache/fbsource/packs/e70f2fc9ec08516a196a6f5f3a475b21b6cee3b2')
for ix, line in enumerate(open('/tmp/hashes', 'r')):
line = line.strip()
try:
dp.getdeltachain(binascii.unhexlify(line))
except:
print "failed when retrieving " + line
raise
```
and observed that the memory footprint stayed quite low.
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3794420
Tasks: 12829688
Signature: t1:3794420:1472597468:6642f1e6cd85859dbf7cce3e4ce35d5375601be0
Summary: When lz4 gets a 0-length input to compress, it outputs a standard header (0 byte uncompressed output, followed by 12 mystery bytes.) When we uncompress this, `LZ4_decompress_fast` returns -1 (!). So we treat this as an oddball case and don't declare it corrupt.
Test Plan: dump all the entries in my fbsource datapack without any corruption reports
Reviewers: #fastmanifest, durham
Reviewed By: durham
Subscribers: mitrandir, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3794368
Signature: t1:3794368:1472597620:a1368e55b6f74882c778036a5459e92d9c026b6a