
49 Commits

Author SHA1 Message Date
Martin von Zweigbergk
817da894d8 remotefilelog: don't assign a size_t to an int
On some platforms, size_t is 64 bits and int is 32 bits.
2017-04-04 11:48:27 -07:00
Durham Goode
95eacecbfc treemanifest: fix incorrect tree creation for empty manifests
Summary:
There was a bug where if a commit had an empty manifest (i.e. same contents as
its parents, but different p1/p2) and it resulted in a
non-empty-but-still-a-noop delta in the revlog (i.e. a delta that deletes a line
and replaces it with the same content), this resulted in a no-op set to the tree
manifest. When the tree was serialized, it noticed that the set was a no-op, so
it didn't serialize that particular tree, but the parent didn't get notified it
was a no-op, so we serialized parent directories with pointers to sub trees that
did not exist.

The fix is to not store new sub-tree nodes on parents when the sub-tree contents
are the same. Now we just store the original sub-tree node. So we no longer
accidentally reference non-existent trees.

Unfortunately I'm not sure how Mercurial can get into this situation (how do you
produce a delta that has content, but the content is a no-op?), so I'm not sure
how to test it. The tree verification command in another patch can catch this
exception though.

Test Plan:
Ran 'hg debuggentrees' on a repo that has a manifest entry that
exhibits this problem. Verified via the debugger that only one tree (the root
node) was generated from adding that manifest.

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4724539

Signature: t1:4724539:1489789531:02a9a75a85aa2a0a6e4c16e163867bd5a6f55670
2017-03-18 19:40:15 -07:00
Durham Goode
84a862e6f3 treemanifest: support matcher in diff
Summary:
Upstream has added a matcher argument to the diff API which allows diff to avoid
traversing certain parts of the tree. This adds support for that to our native
treemanifest implementation.

Test Plan: Added tests for diff with matches

Reviewers: #mercurial, stash

Reviewed By: stash

Subscribers: stash, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4677023

Signature: t1:4677023:1489076627:dbcea209d300a68fa050f68c52b4fd9949b85302
2017-03-12 12:49:18 -07:00
Durham Goode
a365f1d7b2 treemanifest: refactor path manipulation in diff
Summary:
A bunch of if statements were doing the same thing every time. This moves that
logic out of the individual if statements, which makes it cleaner and will make
a subsequent patch that runs a matcher against the paths easier.

Test Plan: Ran the tests

Reviewers: #mercurial, stash

Reviewed By: stash

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4677011

Signature: t1:4677011:1489075541:9379597b8866358ad5dad3f1f9ae00a0a0b523f9
2017-03-12 12:49:18 -07:00
Durham Goode
5dac028b0e cstore: move pythonutil to cstore
Summary:
Now that ctreemanifest no longer depends on python.h, let's move pythonutil over
to cstore where all the python code is.

Test Plan: Ran the build and the tests

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4663988

Signature: t1:4663988:1488895919:652b3fc35a2dd12c51a9f70e32997c7b4d037c95
2017-03-07 11:39:46 -08:00
Durham Goode
9b3457e955 ctreemanifest: replace PythonObj with DiffResult in treemanifest
Summary:
This is the last piece of removing the Python dependency from the core
treemanifest code. This replaces the old PythonObj diff dictionary with a new
DiffResult class that has a PythonDiffResult implementation.

Test Plan:
Ran the test suite, including the unit tests that explicitly cover
treemanifest diff from python.

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4663984

Signature: t1:4663984:1488889615:62f064924b0d6b45dfa7e490d19418060c374f40
2017-03-07 11:39:46 -08:00
Durham Goode
8eede45c7f cstore: add Matcher class
Summary:
As part of breaking the native cstore implementation away from Python, let's
create a Matcher class that can be used to perform path match testing. Initially
the only implementation is the PythonMatcher which just wraps a python match
object.

Test Plan: Covered by existing matcher tests in cstore-treemanifest.py

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4663354

Signature: t1:4663354:1488914384:2c33c7e0e7f2eade0786b6ff41503317989fd1e5
2017-03-07 11:39:46 -08:00
Durham Goode
a79aa29030 cstore: move uniondatapackstore holder to be a shared_ptr
Summary:
In a future patch we will want to pass the uniondatapackstore around to other
objects who will contribute to the lifetime. Let's change it to a shared_ptr so
that becomes easy.

Let's also make the destructor virtual, so we can pass different types of stores
around and have them be destructed correctly.

Test Plan: Ran the tests

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4603893

Signature: t1:4603893:1487847173:2fc3505032ea8c30cf9e0f76ac4e75d64513d87d
2017-02-23 14:03:04 -08:00
Durham Goode
195cb62bde ctreemanifest: remove PythonObj from manifest data structures
Summary:
The old code kept a PythonObj around inside the ManifestFetcher for fetching
manifest contents from the store. As part of moving the treemanifest code to use
the new native cstore API let's make the manifest code depend on a Store
abstraction and have one implementation be a PythonStore.

This removes almost all of the python dependencies from the core treemanifest
code, except some logic around running the python matcher during iteration and
writing directly to the python result dict during diff. We'll abstract those
away later.

Test Plan: Built and ran the tests

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4569944

Signature: t1:4569944:1487847102:d005b6484fd7de9335961b0bc4530505b25f961d
2017-02-23 14:03:03 -08:00
Durham Goode
3b5fb8770e cstore: implement UnionDatapackStore.getdeltachain()
Summary:
Implements the getdeltachain function on the new UnionDatapackStore class.

This required some modifications to the DeltaChainIterator. Since the results of
the iterator may cross multiple different chains, we need to keep each chain
alive until the iterator is destructed, so we need to keep a reference to each
chain. We also had to remove the size() property from the iterator: since the
iterator can span multiple chains, we don't know the size up front.

Test Plan: Adds a test

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4556458

Signature: t1:4556458:1487199872:07dffa3121acfbeb6d6993b518e6f4887122d4d5
2017-02-23 14:03:03 -08:00
Durham Goode
2cd1eeb08e ctreemanifest: move treemanifest into cstore
Summary:
As part of unifying our native store data structures into a single library,
let's move the treemanifest (including the python extension) into py-cstore.

Test Plan:
Built and ran the tests. Verified there was no ctreemanifest.so
dependency in the built cstore.so by using 'ldd cstore.so' on Linux and 'otool
-L cstore.so' on OSX.

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4602484

Signature: t1:4602484:1487842683:964cbb43b7cb20d0db699ef691fe7fcf6bccf2e8
2017-02-23 14:03:03 -08:00
Durham Goode
3796c0a311 ctreemanifest: make find() throw KeyError
Summary:
Previously the find function would return None if the given file was not present
in the tree. The other manifest implementations throw a KeyError here instead.
This affects things like sparse looking for files in the null revision, where it
was receiving a None and crashing because it expected a KeyError in this case.

Test Plan: Added a test. Also, clones were working again.

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4412592

Signature: t1:4412592:1484321389:690e6cd7b3d1599714102ebf361e59fcb0ed59ef
2017-01-13 09:42:39 -08:00
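The KeyError contract described in 3796c0a311 can be sketched in Python as follows. This is only an illustration: ToyManifest and lookup_or_default are hypothetical names, not the real ctreemanifest API, and the flat dict stands in for the native tree.

```python
class ToyManifest:
    """Hypothetical stand-in for a manifest; not the real ctreemanifest API."""

    def __init__(self, entries):
        # Maps file path -> (node, flag).
        self._entries = dict(entries)

    def find(self, path):
        # Raise KeyError for a missing path instead of returning None,
        # which is what callers of the other manifest types expect.
        if path not in self._entries:
            raise KeyError(path)
        return self._entries[path]


def lookup_or_default(manifest, path, default=None):
    # A caller written against the KeyError contract. If find() returned
    # None instead, a caller unpacking the result would crash.
    try:
        node, flag = manifest.find(path)
    except KeyError:
        return default
    return node
```

A caller that unpacks the result of find() only works if a missing file raises; a None return would fail at the unpacking step instead of in the except clause.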
Durham Goode
dd42602a25 ctreemanifest: fix new tree iteration when nodes are late-detected as same
We need to keep the path in sync with the stack, and since we've already popped
the stack here, we also need to pop the path in this short-circuit case.
2016-12-31 18:22:38 -08:00
Durham Goode
51ae957ffc treemanifest: add simple test for tree repack
Summary:
This adds a simple test that verifies hg repack will pack two tree manifest
packs into one.

It caught a bug where creating a treemanifest for a commit with a null parent
produced incorrect output because it constructed an empty tree and tried to use
its node as the parent of the delta, when there should not have been any delta
in the first place. This is fixed by this diff as well.

Test Plan: Ran the new test

Reviewers: #mercurial, dsyang, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4261591

Signature: t1:4261591:1480705822:ef21fb8cebd8b89f92f58f11bb1dab59bf97664d
2016-12-02 14:37:55 -08:00
Durham Goode
7a71d2e604 ctreemanifest: implement fast path for matches
Summary:
The original python code for manifest matches has a fast path for when the
matcher contains a specific list of files; this lets it check those specific
files instead of iterating over the entire manifest. Let's add this same
optimization to our ctreemanifest implementation.

This greatly speeds up copies._computeforwardmissing() during rebases.

Test Plan: Ran rebase and verified it was much faster

Reviewers: #mercurial, ikostia

Reviewed By: ikostia

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4248075

Signature: t1:4248075:1480505248:964cbdd701fe29c38a7a0d9bf0752fb5d4bb0ed0
2016-12-02 14:37:31 -08:00
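The fast path described in 7a71d2e604 can be sketched in Python as follows. This is a simplified illustration, not the native implementation: toy_matches and its exact_files parameter are hypothetical, and a plain dict stands in for the manifest.

```python
def toy_matches(manifest, matchfn, exact_files=None):
    """Return the subset of `manifest` (a dict of path -> node) that matches.

    `exact_files` is a hypothetical parameter standing in for a matcher
    that knows the exact list of files it will accept.
    """
    if exact_files is not None:
        # Fast path: probe only the named files, so the cost depends on
        # the number of requested files, not on the manifest size.
        return {f: manifest[f] for f in exact_files if f in manifest}
    # Slow path: test every path in the manifest against the matcher.
    return {f: node for f, node in manifest.items() if matchfn(f)}
```

With a short file list and a very large manifest, the fast path does a handful of lookups where the slow path would visit every entry.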
Durham Goode
61a86bdc94 treemanifest: catch pyexception for __nonzero__
Summary:
The recently added __nonzero__ function was not handling the possibility of
pyexceptions. This adds the appropriate catch.

A pyexception is where somewhere up the stack something has set the python error
string, and we just need to return the appropriate error value from the top
level c api to indicate an error happened.

Test Plan: none. I just caught this while debugging

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4202090
2016-11-29 15:37:58 -08:00
Durham Goode
a0dc16174d treemanifest: write deltas for trees
Summary:
Previously, when we wrote each tree entry into a pack file, it wasn't delta'd in
any way. This patch makes it store the delta against p1 in the pack file.

Testing in a large repo shows this reduces tree pack size by about 22x.

Test Plan:
Ran the tests. Did a pull in a large repo and saw the pack file was
22x smaller than before (and still usable).

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4202088
2016-11-29 15:37:58 -08:00
Durham Goode
ceb783595f treemanifest: add ManifestNode type and use it in NewTreeIter
Summary:
As part of enabling deltas in our tree pack files, we need NewTreeIter to return
the p1 and p2 Manifests as well. To do this we need to return a Manifest/Node
tuple. Since this is becoming a common structure, let's define a type for it and
change NewTreeIter to use it for its return values.

Test Plan:
Ran the tests. With a future diff I built a pack file and verified it
was small because of deltas.

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4202080
2016-11-29 15:37:58 -08:00
Durham Goode
b179a3593c treemanifest: sort parent nodes for hash computation
Summary:
Core mercurial sorts p1 and p2 before computing the hash, so it's deterministic.
We need to do the same.

Test Plan: Ran the tests, saw a hash changed

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4202063
2016-11-29 15:37:58 -08:00
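The parent sorting described in b179a3593c can be sketched in Python as follows. It assumes the hash is a SHA-1 over the smaller parent, the larger parent, and then the text, which is how core Mercurial's revlog hashing works to the best of my knowledge; node_hash and NULLID are names chosen for this illustration.

```python
import hashlib

# Assumed 20-byte null node id, used when a parent is absent.
NULLID = b"\x00" * 20


def node_hash(text, p1=NULLID, p2=NULLID):
    # Sort the parents first so the result does not depend on which
    # parent the caller happens to pass as p1 and which as p2.
    lo, hi = sorted((p1, p2))
    s = hashlib.sha1(lo)
    s.update(hi)
    s.update(text)
    return s.digest()
```

Without the sort, swapping p1 and p2 would produce a different node for the same content, which is the mismatch the commit fixes.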
Durham Goode
f758ad8c4c treemanifest: fix bug in NewTreeIterator walk
Summary:
The new tree walk did not check if the compare entry was actually a directory
before traversing it. This caused problems for commits that deleted a file and
replaced it with a directory, since it attempted to recurse down the file, which
had no treemanifest.

Test Plan: Added a test

Reviewers: simpkins, #mercurial, zamsden, rmcelroy

Reviewed By: rmcelroy

Subscribers: net-systems-diffs@, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4055678

Signature: t1:4055678:1477430721:ec679afcb4ff6ea2bbf44927e04f2bfcdee8cc03
2016-10-25 14:39:52 -07:00
Durham Goode
a866710af5 treemanifest: improve performance of treemanifest.text()
Summary:
This improves the performance of treemanifest.text() from 5s to 3s. This
function is used when converting a treemanifest to a full text.

Test Plan:
Ran the unit tests.

This was visible when running hg commit with the extension enabled. I verified
the time went down from 5 to 3. A future diff will add an integration test suite
for the treemanifest extension.

Reviewers: simpkins, #mercurial, quark

Reviewed By: quark

Subscribers: quark, net-systems-diffs@, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4055693

Signature: t1:4055693:1477067172:47a61fe98e57b4b10e329b7f53f41b80bf15d205
2016-10-21 11:02:15 -07:00
Durham Goode
c0aee83142 treemanifest: fix build breaks on OSX
2016-10-17 11:47:47 -07:00
Simon Farnsworth
195bbe3fcb Fix compile-breaking format string issue
Summary: "%d" is the wrong format specifier for size_t, and thus this does not build.

Test Plan: make local and see build succeed

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4030507

Signature: t1:4030507:1476726943:44c480a1e231fe98354d386116b112d508436093
2016-10-17 11:31:18 -07:00
Durham Goode
8557048507 treemanifest: implement iteritems()
This implements the iteritems api. As part of this change we needed to expand
our existing iterator options to allow separating nodes from flags.
2016-10-14 16:01:12 -07:00
Durham Goode
8eea28cd2e treemanifest: implement treemanifest.keys()
2016-10-14 16:01:12 -07:00
Durham Goode
c3bc594c82 treemanifest: fix and test treemanifest.matches()
It had a bug where it thought the node was hex but it was already binary.
2016-10-14 16:01:12 -07:00
Durham Goode
2b41307ebe treemanifest: fix treemanifest.flags() when the file doesn't exist
manifest.flags() actually returns the default value if the filename doesn't
exist. So we need to replicate that behavior.

As part of this fix, I changed treemanifest.get() to return a boolean indicating
whether the file was found or not.
2016-10-14 16:01:12 -07:00
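The behavior described in 2b41307ebe can be sketched in Python as follows. ToyManifest is a hypothetical stand-in, and the three-value return from get() is an assumption made for this illustration; the commit only says that get() now reports whether the file was found.

```python
class ToyManifest:
    """Hypothetical stand-in; not the real treemanifest class."""

    def __init__(self, entries):
        # Maps file path -> (node, flag).
        self._entries = dict(entries)

    def get(self, path):
        # Report whether the file was found alongside its data, so
        # callers can distinguish "missing" from "present, empty flag".
        if path in self._entries:
            node, flag = self._entries[path]
            return True, node, flag
        return False, None, None

    def flags(self, path, default=""):
        # Return the default for a missing file rather than failing.
        found, _node, flag = self.get(path)
        return flag if found else default
```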
Durham Goode
f7249fd2e2 treemanifest: implement treemanifest.__nonzero__
This implements the __nonzero__ function, which is necessary for things like
`if mymanifest:`
2016-10-14 16:01:12 -07:00
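The truthiness hook from f7249fd2e2 can be sketched in Python as follows. ToyManifest is a hypothetical stand-in. Python 2 looks up __nonzero__ while Python 3 looks up __bool__, so the sketch defines both.

```python
class ToyManifest:
    """Hypothetical stand-in; not the real treemanifest class."""

    def __init__(self, entries=()):
        self._entries = dict(entries)

    def __bool__(self):
        # A manifest is truthy when it contains at least one file.
        return len(self._entries) > 0

    # Python 2 name for the same hook.
    __nonzero__ = __bool__


def describe(mf):
    # This is the `if mymanifest:` pattern the hook exists to support.
    if mf:
        return "non-empty"
    return "empty"
```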
Durham Goode
ae6de9bbdb treemanifest: implement treemanifest.dirs()
This implements the dirs function, which returns a set-like collection that can
answer whether a directory is in the manifest. Currently we do a
naive solution of using util.dirs(), which iterates over all the files. Given
that we have a tree already, we should be able to return something smarter in
the future.
2016-10-14 16:01:12 -07:00
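The naive approach described in ae6de9bbdb can be sketched in Python as follows. naive_dirs is a hypothetical helper that only mimics the idea of iterating over all the files; it returns a plain set and is not Mercurial's util.dirs().

```python
def naive_dirs(paths):
    """Collect every ancestor directory of every file path."""
    dirs = set()
    for path in paths:
        # Walk backwards over each "/" to peel off one component at a time.
        pos = path.rfind("/")
        while pos != -1:
            dirs.add(path[:pos])
            pos = path.rfind("/", 0, pos)
    return dirs
```

The cost is proportional to the total number of files, which is why the commit notes that a tree-aware answer should be possible later.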
Durham Goode
c171077ebd treemanifest: add None check for treemanifest.contains()
Mercurial sometimes checks if "None in mf", so we need to make sure we return
False in that situation.
2016-10-14 16:01:12 -07:00
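The None check from c171077ebd can be sketched in Python as follows. ToyTree is a hypothetical nested-dict tree, not the native structure; it is used here because its lookup splits the path and would fail on None without the guard.

```python
class ToyTree:
    """Hypothetical nested-dict tree; a value of None marks a file leaf."""

    def __init__(self, paths):
        self._root = {}
        for path in paths:
            node = self._root
            parts = path.split("/")
            for part in parts[:-1]:
                node = node.setdefault(part, {})
            node[parts[-1]] = None

    def __contains__(self, path):
        # Guard first: None has no split(), so without this check
        # `None in tree` would raise instead of returning False.
        if path is None:
            return False
        node = self._root
        for part in path.split("/"):
            if not isinstance(node, dict) or part not in node:
                return False
            node = node[part]
        # Only files count as members, not directories.
        return node is None
```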
Durham Goode
a68bd8bc73 treemanifest: add error checking for all argument strings
We weren't checking if the passed in string arg was successfully parsed. This
patch adds checks for all of those instances.
2016-10-14 16:01:12 -07:00
Durham Goode
75c7cb42a1 treemanifest: implement hasdir()
Implements the py-treemanifest.hasdir() function
2016-10-14 16:01:12 -07:00
Durham Goode
89ad596f78 treemanifest: allow find() to get files and directories
Previously find would only return files. For treemanifest.hasdir() we needed it
to find directories as well. This patch adds a new enum for indicating if the
find should return files, directories, or both.
2016-10-14 16:01:12 -07:00
Durham Goode
445c6b0d4e treemanifest: fix diff on transient trees
The diff algorithm assumed every tree already had a node. If we are iterating
over an uncommitted tree, it may have tree entries with NULL as their node. We
need to always recurse in these cases.
2016-10-14 16:01:12 -07:00
Durham Goode
cd423fd42a treemanifest: implement 'clean' on diff
hg has an optional 'clean' arg on diff, which causes it to also return files
that aren't different between the two manifests. This implements it on our
treemanifest diff algorithm.
2016-10-14 16:01:12 -07:00
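The 'clean' option from cd423fd42a can be sketched in Python as follows. toy_diff is a hypothetical function over flat dicts of path -> (node, flag). The result shape (a pair of entries for changed files, None for clean files) follows my understanding of Mercurial's manifest diff and should be treated as an assumption.

```python
def toy_diff(m1, m2, clean=False):
    """Diff two flat manifests given as dicts of path -> (node, flag)."""
    missing = (None, "")  # placeholder for a file absent on one side
    result = {}
    for path in sorted(set(m1) | set(m2)):
        e1 = m1.get(path, missing)
        e2 = m2.get(path, missing)
        if e1 != e2:
            result[path] = (e1, e2)
        elif clean:
            # Unchanged files are reported only when clean=True.
            result[path] = None
    return result
```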
Durham Goode
7f2988ce25 treemanifest: implements text()
Implements the py-treemanifest.text() function
2016-10-14 16:01:12 -07:00
Durham Goode
f108eebde4 treemanifest: implements __setitem__
Implements the py-treemanifest.__setitem__ function. This also handles the
__delitem__ case.
2016-10-14 16:01:12 -07:00
Durham Goode
d96c188d00 treemanifest: implement setflag()
Implements the py-treemanifest.setflag() function
2016-10-14 16:01:12 -07:00
Durham Goode
ea9c99a376 treemanifest: implement get()
Implements the py-treemanifest.get() function
2016-10-14 16:01:12 -07:00
Durham Goode
9819c37220 treemanifest: update entry nodes even if they haven't changed
During serialization, if we encountered a tree entry that had a NULL node, but
the contents matched the prior version, we considered it unchanged and did not
replace the null pointer with a pointer to the actual hash. This meant when its
parent tried to serialize it, it would encounter a null pointer exception and
crash. Now we always fill in the node during popResult, since by definition it
will be null there (since the only way for something to be pushed is for it to
be null).
2016-10-14 16:01:12 -07:00
Durham Goode
24071807b7 treemanifest: change find() to do copy-on-write
The find() function is used to perform set and delete operations on a tree. Now
that we track Manifests via mutability and ref counting, we can change find() to
do copy-on-write.
2016-10-14 16:01:12 -07:00
Durham Goode
e8554fc3d9 treemanifest: add concept of mutability to manifest, and use it during edits
This adds the concept of mutable and immutable Manifests. During a treemanifest
copy, any sub-manifests that are immutable (such as ones that had been loaded
from a store, or those that are in memory but have been marked immutable), do not
need to be copied. This dramatically reduces the amount of memory allocation
happening when copying trees during automatic tree creation during hg pull.
2016-10-14 16:01:12 -07:00
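The sharing behavior described in e8554fc3d9 can be sketched in Python as follows. ToyManifest is a hypothetical class; the real code is native and uses ManifestPtr with reference counting, which Python's object references only approximate.

```python
class ToyManifest:
    """Hypothetical tree node; children map name -> ToyManifest or file node."""

    def __init__(self, children=None, mutable=True):
        self.children = dict(children or {})
        self.mutable = mutable

    def copy(self):
        # The copy itself is a fresh, editable node.
        new = ToyManifest()
        for name, child in self.children.items():
            if isinstance(child, ToyManifest) and child.mutable:
                # Mutable sub-manifests may still change, so copy them.
                child = child.copy()
            # Immutable sub-manifests and file nodes are shared as-is.
            new.children[name] = child
        return new
```

Only the mutable parts of the tree are duplicated, so copying a tree whose sub-manifests mostly came from a store allocates very little.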
Durham Goode
1faefff046 treemanifest: convert all ownership Manifest references to ManifestPtr
Now that we have a ManifestPtr object, let's use it in all the places we
currently have Manifest ownership and cleanup happening. We don't need to fix up
any places that are just using Manifests from a readonly, non-lifetime related
perspective.

This gets rid of all the 'delete' calls on Manifest, except the one inside
~ManifestPtr.
2016-10-14 16:01:12 -07:00
Durham Goode
b84ba7af52 treemanifest: introduce ManifestPtr and refcount
Copying our tree manifests is currently the most expensive part of converting
manifests to trees on the fly. Let's introduce refcounting to the Manifest
lifetime, so we can share Manifests across treemanifest instances. Future diffs
will convert all uses of Manifest* to ManifestPtr, then even more future diffs
will change copy and edit operations to be copy-on-write.
2016-10-14 16:01:12 -07:00
Zachary Amsden
30c28a2667 Fix bogus update
Summary:
Forgot to commit, so test build just succeeded as it built with
uncommitted change

Test Plan: ./fb_build_rpm.py --release AAAAAA

Reviewers: simpkins

Reviewed By: simpkins

Subscribers: net-systems-diffs@, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3978726

Signature: t1:3978726:1475713413:3dcb5389c562ccc1b5675d2b78bcdb2dcba38780
2016-10-05 21:34:50 -07:00
Zachary Amsden
57128e9ab7 Fix Darwin ctreemanifest build
Summary:
Apparently, clang infers that pointer variables in private
structs are unreferenced if they are aliased by parameter names in the
constructor. This doesn't appear to happen with variables passed by
reference. Unalias the field to work around the problem.

Test Plan: ./fb_build_rpm.py on OS/X

Reviewers: ttung, durham, simpkins

Reviewed By: simpkins

Subscribers: quark, net-systems-diffs@, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3978090

Tasks: 13740577

Signature: t1:3978090:1475709499:e50c751341d172f055ed02376521bd880644b01f
2016-10-05 16:39:49 -07:00
Jun Wu
ca6d644eab ctreemanifest: fix compilation error
Summary:
This fixes the following error when being compiled by clang:

  In file included from ctreemanifest/treemanifest.cpp:10:
  ctreemanifest/treemanifest.h:247:15: error: private field 'mainRoot' is not used [-Werror,-Wunused-private-field]
      Manifest *mainRoot;
                ^

Test Plan: `make local` on OS X.

Reviewers: durham, ttung, #sourcecontrol

Subscribers: simpkins, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3949603
2016-09-30 04:51:44 +01:00
Adam Simpkins
a9cad103ae [ctreemanifest] fix ambiguous call to string::erase()
Summary:
gcc 4.9 complains that erase(0) is ambiguous, and may be either string::erase(size_t)
or string::erase(iterator) (since iterator is "char*", and for historical
reasons 0 can be interpreted as a null char*).

Fix the code to explicitly indicate that it means the erase(size_t) version.

Test Plan: Confirmed that the code built successfully with gcc 4.9.

Reviewers: durham, mitrandir, ttung

Reviewed By: ttung

Subscribers: net-systems-diffs@, yogeshwer, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3911252

Signature: t1:3911252:1474589551:688ecab4d59053dbdf7b48a062f248d8363e17f3
2016-09-22 18:58:04 -07:00
Durham Goode
50d6b599f4 Move ctreemanifest and cdatapack out of remotefilelog
These don't really have any dependencies on remotefilelog, so let's move them
out.
2016-09-21 13:55:12 -07:00