Commit Graph

58 Commits

Author SHA1 Message Date
Durham Goode
10f743a300 treemanifest: chop off trailing slash when requesting manifests
Summary:
The name being passed to the store was wrong, because it had a trailing slash.
This wasn't caught before because the code was reading and writing the paths
with a slash at the end, so it matched as long as we only interacted with packs
produced by the code. The issue became more obvious when I tried to have packs
generated from revlogs interact with this code.

All the tests are affected since the entry keys changed.

Also use 'const ManifestFetcher &x' to pass the ref to avoid the copy.

Test Plan: Tests updated

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4901274

Signature: t1:4901274:1492638476:ff28f8976657baec99effbd82ecd436f6282ea5b
2017-04-19 21:14:04 -07:00
Durham Goode
45a47ec954 treemanifest: fix serialization during walksubtree of non-finalized trees
Summary:
Previously, walksubtree would segfault if you ran it on a tree that had
non-finalized entries, since they contain node pointers that point to NULL.
Let's fix this by having serialization refer to the nullid if the pointer is
null.

Test Plan: Updated the tests to cover this

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4901257

Signature: t1:4901257:1492637151:803b3183ad766e34f0ad24f14a8ec1b784600879
2017-04-19 21:14:04 -07:00
Durham Goode
e1744f2469 treemanifest: remove accidental use of '\0' * 40
Summary:
In one place we tried to set the node to null ('\0' * 20) but instead set it to
40 null bytes. If we're using a hex representation we should be using '0' * 40.
Let's make a global constant for this as well.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4901253

Signature: t1:4901253:1492637015:15bca25bff67167c3c1f51179b7c6f4db2589586
2017-04-19 21:14:04 -07:00
Durham Goode
b16fef8f08 treemanifest: add ManifestEntry.hasNode
Summary:
Previously, callers would need to be aware of how ManifestEntry stored its
no-node state, by checking if entry.node == NULL. Moving to a hasNode() api
allows us to be more flexible in how we represent an empty node.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4901246

Signature: t1:4901246:1492637096:bc193dd998d9f9653dd17b3324a2e71ac8732235
2017-04-19 21:14:04 -07:00
Durham Goode
e0d844028e treemanifest: refactor finalizing logic out of subtreeiter
Summary:
A future patch will want to use SubtreeIterator for comparing two arbitrary
trees. To do this, we need to refactor out all the logic that actually computes
permanent hashes based on the other tree (since if I'm comparing against an
arbitrary tree, I don't want to use its hashes as my p1/p2 like I would if I'm
serializing the new parts of a tree).

This drastically simplifies the code as well, making it much easier to maintain.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4895795

Signature: t1:4895795:1492629678:18617385720b0e1ef5d1513085910a0d3968e29b
2017-04-19 21:14:03 -07:00
Durham Goode
f51d0f1465 treemanifest: add FinalizerIterator
Summary:
A future patch will make SubtreeIterator more generic for diffing trees. To do
that we'll need to move the finalization logic out of SubtreeIterator. This
patch adds a new FinalizerIterator class that simply wraps SubtreeIterator. In
the next patch logic will move between the two classes.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4895794

Signature: t1:4895794:1492629514:9ee6e60c32e99041ce16f92c2923de6b4ad0b104
2017-04-19 21:14:03 -07:00
Durham Goode
d7d340e50e treemanifest: add updatebinnode and updatehexnode
Summary:
A future patch will call ManifestEntry.update in more places, and it is
convienent to have the bin-to-hex conversion unified into one place.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4895788

Signature: t1:4895788:1492629387:7d64db0169bc47e5996666c4fb8fde639e83ba8b
2017-04-19 21:14:03 -07:00
Durham Goode
f706fa0de0 treemanifest: move manifest node onto Manifest
Summary:
Previously the manifest node was only contained by the pointer to the manifest
(usually the container manifest). This made accessing the node awkward in a
bunch of places. Let's refactor it so the node is kept on the Manifest itself as
well, and it handles invalidation and computation of that node as it is editted.

Test Plan: Ran the tests

Reviewers: #mercurial, stash

Reviewed By: stash

Subscribers: stash, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4880076

Signature: t1:4880076:1492591856:f8cd065cac8d3f7cac3069ee9cb61881a24ee426
2017-04-19 21:14:03 -07:00
Durham Goode
c1f233a304 treemanifest: rename NewTreeIterator to SubtreeIterator
Summary:
The NewTreeIterator class was used for iterating over a new tree, when given
it's p1. Mainly for performing the write of a new tree. We want to make this a
more generic mechanism so we can do the diff of one tree with an arbitrary
nother (like when deciding what trees to send a client if they have a given base
node). Let's start by renaming the class.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4880048

Signature: t1:4880048:1492110564:bd51d35e4cfcca2d6b1d4cf81d78fdacf2b52f3d
2017-04-19 21:14:03 -07:00
Martin von Zweigbergk
817da894d8 remotefilelog: don't assign a size_t to an int
On some platforms, size_t is 64 bits and int is 32 bits.
2017-04-04 11:48:27 -07:00
Durham Goode
95eacecbfc treemanifest: fix incorrect tree creation for empty manifests
Summary:
There was a bug where if a commit had an empty manifest (i.e. same contents as
it's parents, but different p1/p2) and it resulted in a
non-empty-but-still-a-noop delta in the revlog (i.e. a delta that deletes a line
and replaces it with the same content), this resulted in a no-op set to the tree
manifest. When the tree was serialized, it noticed that the set was a no-op, so
it didn't serialize that particular tree, but the parent didn't get notified it
was a no-op, so we serialized parent directories with pointers to sub trees that
did not exist.

The fix is to not store new sub-tree nodes on parents when the sub-tree contents
are the same. Now we just store the original sub-tree node. So we no longer
accidentally reference non-existent trees.

Unfortunately I'm not sure how Mercurial can get into this situation (how do you
produce a delta that has content, but the content is a no-op?), so I'm not sure
how to test it. The tree verification command in another patch can catch this
exception though.

Test Plan:
Ran 'hg debuggentrees' on a repo that has a manifest entry that
exhibits this problem. Verified via the debugger that only one tree (the root
node) was generated from adding that manifest.

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4724539

Signature: t1:4724539:1489789531:02a9a75a85aa2a0a6e4c16e163867bd5a6f55670
2017-03-18 19:40:15 -07:00
Durham Goode
84a862e6f3 treemanifest: support matcher in diff
Summary:
Upstream has added a matcher argument to the diff API which allows diff to avoid
traversing certain parts of the tree. This adds support for that to our native
treemanifest implementation.

Test Plan: Added tests for diff with matches

Reviewers: #mercurial, stash

Reviewed By: stash

Subscribers: stash, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4677023

Signature: t1:4677023:1489076627:dbcea209d300a68fa050f68c52b4fd9949b85302
2017-03-12 12:49:18 -07:00
Durham Goode
a365f1d7b2 treemanifest: refactor path manipulation in diff
Summary:
A bunch of if statements were doing the same thing every time. This moves that
logic out of the individual if statements, which makes it cleaner and will make
a subsequent patch that runs a matcher against the paths easier.

Test Plan: Ran the tests

Reviewers: #mercurial, stash

Reviewed By: stash

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4677011

Signature: t1:4677011:1489075541:9379597b8866358ad5dad3f1f9ae00a0a0b523f9
2017-03-12 12:49:18 -07:00
Durham Goode
5dac028b0e cstore: move pythonutil to cstore
Summary:
Now that ctreemanifest no longer depends on python.h, let's move pythonutil over
to cstore where all the python code is.

Test Plan: Ran the build and the tests

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4663988

Signature: t1:4663988:1488895919:652b3fc35a2dd12c51a9f70e32997c7b4d037c95
2017-03-07 11:39:46 -08:00
Durham Goode
9b3457e955 ctreemanifest: replace PythonObj with DiffResult in treemanifest
Summary:
This is the last piece of removing the Python dependency from the core
treemanifest code. This replaces the old PythonObj diff dictionary with a new
DiffResult class that has a PythonDiffResult implementation.

Test Plan:
Ran the test suite, including the unit tests that explicitly cover
treemanifest diff from python.

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4663984

Signature: t1:4663984:1488889615:62f064924b0d6b45dfa7e490d19418060c374f40
2017-03-07 11:39:46 -08:00
Durham Goode
8eede45c7f cstore: add Matcher class
Summary:
As part of breaking the native cstore implementation away from Python, let's
create a Matcher class that can be used to perform path match testing. Initially
the only implementation is the PythonMatcher which just wraps a python match
object.

Test Plan: Covered by existing matcher tests in cstore-treemanifest.py

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4663354

Signature: t1:4663354:1488914384:2c33c7e0e7f2eade0786b6ff41503317989fd1e5
2017-03-07 11:39:46 -08:00
Durham Goode
a79aa29030 cstore: move uniondatapackstore holder to be a shared_ptr
Summary:
In a future patch we will want to pass the uniondatapackstore around to other
objects who will contribute to the lifetime. Let's change it to a shared_ptr so
that becomes easy.

Let's also make the destructor virtual, so we can pass different types of stores
around and have them be destructed correctly.

Test Plan: Ran the tests

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4603893

Signature: t1:4603893:1487847173:2fc3505032ea8c30cf9e0f76ac4e75d64513d87d
2017-02-23 14:03:04 -08:00
Durham Goode
195cb62bde ctreemanifest: remove PythonObj from manifest data structures
Summary:
The old code kept a PythonObj around inside the ManifestFetcher for fetching
manifest contents from the store. As part of moving the treemanifest code to use
the new native cstore API let's make the manifest code depend on a Store
abstraction and have one implementation be a PythonStore.

This removes almost all of the python dependencies from the core treemanifest
code, except some logic around running the python matcher during iteration and
writing directly to the python result dict during diff. We'll abstract those
away later.

Test Plan: Built and ran the tests

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4569944

Signature: t1:4569944:1487847102:d005b6484fd7de9335961b0bc4530505b25f961d
2017-02-23 14:03:03 -08:00
Durham Goode
3b5fb8770e cstore: implement UnionDatapackStore.getdeltachain()
Summary:
Implements the getdeltachain function on the new UnionDatapackStore class.

This required some modifications to the DeltaChainIterator. Since the results of
the iterator may cross multiple different chains, we need to keep each chain
alive until the iterator is destructed, so we need to keep a reference to each
chain. We also had to remove the size() property from the iterator since the
fact that the chain spans chains means we don't know the size up front.

Test Plan: Adds a test

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: simonfar, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4556458

Signature: t1:4556458:1487199872:07dffa3121acfbeb6d6993b518e6f4887122d4d5
2017-02-23 14:03:03 -08:00
Durham Goode
2cd1eeb08e ctreemanifest: move treemanifest into cstore
Summary:
As part of unifying our native store data structures into a single library,
let's move the treemanifest (including the python extension) into py-cstore.

Test Plan:
Built and ran the tests. Verified there was no ctreemanifest.so
dependency in the built cstore.so by using 'ldd cstore.so' on Linux and 'otools
-L cstore.so' on OSX.

Reviewers: #mercurial, simonfar

Reviewed By: simonfar

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4602484

Signature: t1:4602484:1487842683:964cbb43b7cb20d0db699ef691fe7fcf6bccf2e8
2017-02-23 14:03:03 -08:00
Durham Goode
3796c0a311 ctreemanifest: make find() throw KeyError
Summary:
Previously the find function would return None if the given file was not present
in the tree. The other manifest implementations throw a KeyError here instead.
This affects things like sparse looking for files in the null revision, where it
was receiving a None and crashing because it expected a KeyError in this case.

Test Plan: Added a test. Also, clones were working again.

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4412592

Signature: t1:4412592:1484321389:690e6cd7b3d1599714102ebf361e59fcb0ed59ef
2017-01-13 09:42:39 -08:00
Durham Goode
dd42602a25 ctreemanifest: fix new tree iteration when nodes are late-detected as same
We need to keep the path in sync with the stack, and since we've already popped
the stack here, we also need to pop the path in this short-circuit case.
2016-12-31 18:22:38 -08:00
Durham Goode
51ae957ffc treemanifest: add simple test for tree repack
Summary:
This adds a simple test that verifies hg repack will pack two tree manifest
packs into one.

It caught a bug where creating a treemanifest for a commit with a null parent
produced incorrect output because it constructed an empty tree and tried to use
it's node as the parent of the delta, when there should not have been any delta
in the first place. This is fixed by this diff as well.

Test Plan: Ran the new test

Reviewers: #mercurial, dsyang, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4261591

Signature: t1:4261591:1480705822:ef21fb8cebd8b89f92f58f11bb1dab59bf97664d
2016-12-02 14:37:55 -08:00
Durham Goode
7a71d2e604 ctreemanifest: implement fast path for matches
Summary:
The original python code for manifest matches has a fast path for when the
matcher contains a specific list of files, this let it check those specific
files instead of iterating over the entire manifest. Let's add this same
optimization to our ctreemanifest implementation.

This greatly speeds up copies._computeforwardmissing() during rebases.

Test Plan: Ran rebase and verified it was much faster

Reviewers: #mercurial, ikostia

Reviewed By: ikostia

Subscribers: rmcelroy, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4248075

Signature: t1:4248075:1480505248:964cbdd701fe29c38a7a0d9bf0752fb5d4bb0ed0
2016-12-02 14:37:31 -08:00
Durham Goode
61a86bdc94 treemanifest: catch pyexception for __nonzero__
Summary:
The recently added __nonzero__ function was not handling the possibility of
pyexceptions. This adds the appropriate catch.

A pyexception is where somewhere up the stack something has set the python error
string, and we just need to return the appropriate error value from the top
level c api to indicate an error happened.

Test Plan: none. I just caught this while debugging

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4202090
2016-11-29 15:37:58 -08:00
Durham Goode
a0dc16174d treemanifest: write deltas for trees
Summary:
Previously, when we wrote each tree entry into a pack file, it wasn't delta'd in
any way. This patch makes it store the delta against p1 in the pack file.

Testing in a large repo shows this reduces tree pack size by about 22x.

Test Plan:
Ran the tests. Did a pull in a large repo and saw the pack file was
22x smaller than before (and still usable).

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4202088
2016-11-29 15:37:58 -08:00
Durham Goode
ceb783595f treemanifest: add ManifestNode type and use it in NewTreeIter
Summary:
As part of enabling deltas in our tree pack files, we need NewTreeIter to return
the p1 and p2 Manifests as well. To do this we need to return a Manifest/Node
tuple. Since this is becoming a common structure, let's define a type for it and
change NewTreeIter to use it for it's return values.

Test Plan:
Ran the tests. With a future diff I built a pack file and verified it
was small because of deltas.

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4202080
2016-11-29 15:37:58 -08:00
Durham Goode
b179a3593c treemanifest: sort parent nodes for hash computation
Summary:
Core mercurial sorts p1 and p2 before computing the hash, so it's deterministic.
We need to do the same.

Test Plan: Ran the tests, saw a hash changed

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4202063
2016-11-29 15:37:58 -08:00
Durham Goode
f758ad8c4c treemanifest: fix bug in NewTreeIterator walk
Summary:
The new tree walk did not check if the compare entry was actually a directory
before traversing it. This caused problems for commits that deleted a file and
replaced it with a directory, since it attempted to recurse down the file, which
had no treemanifest.

Test Plan: Added a test

Reviewers: simpkins, #mercurial, zamsden, rmcelroy

Reviewed By: rmcelroy

Subscribers: net-systems-diffs@, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4055678

Signature: t1:4055678:1477430721:ec679afcb4ff6ea2bbf44927e04f2bfcdee8cc03
2016-10-25 14:39:52 -07:00
Durham Goode
a866710af5 treemanifest: improve performance of treemanifest.text()
Summary:
This improves the performance of treemanifest.text() from 5s to 3s. This
function is used when converting a treemanifest to a full text.

Test Plan:
Ran the unit tests.

This was visible when running hg commit with the extension enabled. I verified
the time went down from 5 to 3. A future diff will add an integration test suite
for the treemanifest extension.

Reviewers: simpkins, #mercurial, quark

Reviewed By: quark

Subscribers: quark, net-systems-diffs@, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4055693

Signature: t1:4055693:1477067172:47a61fe98e57b4b10e329b7f53f41b80bf15d205
2016-10-21 11:02:15 -07:00
Durham Goode
c0aee83142 treemanifest: fix build breaks on OSX 2016-10-17 11:47:47 -07:00
Simon Farnsworth
195bbe3fcb Fix compile-breaking format string issue
Summary: "%d" is the wrong format specifier for size_t, and thus this does not build.

Test Plan: make local and see build succeed

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4030507

Signature: t1:4030507:1476726943:44c480a1e231fe98354d386116b112d508436093
2016-10-17 11:31:18 -07:00
Durham Goode
8557048507 treemanifest: implement iteritems()
This implements the iteritems api. As part of this change we needed to expand
our existing iterator options to allow separating nodes from flags.
2016-10-14 16:01:12 -07:00
Durham Goode
8eea28cd2e treemanifest: implement treemanifest.keys() 2016-10-14 16:01:12 -07:00
Durham Goode
c3bc594c82 treemanifest: fix and test treemanifest.matches()
It had a bug where it thought the node was hex but it was already binary.
2016-10-14 16:01:12 -07:00
Durham Goode
2b41307ebe treemanifest: fix treemanifest.flags() when the file doesn't exist
manifest.flags() actually returns the default value if the filename doesn't
exist. So we need to replicate that behavior.

As part of this fix, I changed treemanifest.get() to return a boolean indicating
whether the file was found or not.
2016-10-14 16:01:12 -07:00
Durham Goode
f7249fd2e2 treemanifest: implement treemanifest.__nonzero__
This implements the __nonzero__ function, which is necessary for things like
`if mymanifest:`
2016-10-14 16:01:12 -07:00
Durham Goode
ae6de9bbdb treemanifest: implement treemanifest.dirs()
This implements the dirs function, which returns a collection set that can
answer the question of if a directory is in the manifest. Currently we do a
naive solution of using util.dirs(), which iterates over all the files. Given
that we have a tree already, we should be able to return something smarter in
the future.
2016-10-14 16:01:12 -07:00
Durham Goode
c171077ebd treemanifest: add None check for treemanifest.contains()
Mercurial sometimes checks if "None in mf", so we need to make sure we return
False in that situation.
2016-10-14 16:01:12 -07:00
Durham Goode
a68bd8bc73 treemanifest: add error checking for all argument strings
We weren't checking if the passed in string arg was successfuly parsed. This
patch adds checks for all of those instances.
2016-10-14 16:01:12 -07:00
Durham Goode
75c7cb42a1 treemanifest: implement hasdir()
Implements the py-treemanifest.hasdir() function
2016-10-14 16:01:12 -07:00
Durham Goode
89ad596f78 treemanifest: allow find() to get files and directories
Previously find would only return files. For treemanifest.hasdir() we needed it
to find directories as well. This patch adds a new enum for indicating if the
find should return files, directories, or both.
2016-10-14 16:01:12 -07:00
Durham Goode
445c6b0d4e treemanifest: fix diff on transient trees
The diff algorithm assumed every tree already had a node. If we are iterating
over an uncommitted tree, it may have tree entries with NULL as their node. We
need to always recurse in these cases.
2016-10-14 16:01:12 -07:00
Durham Goode
cd423fd42a treemanifest: implement 'clean' on diff
hg has an optional 'clean' arg on diff, which causes it to also return files
that aren't different between the two diffs. This implements it on our
treemanifest diff algorithm.
2016-10-14 16:01:12 -07:00
Durham Goode
7f2988ce25 treemanifest: implements text()
Implements the py-treemanifest.text() function
2016-10-14 16:01:12 -07:00
Durham Goode
f108eebde4 treemanifest: implements __setitem__
Implements the py-treemanifest.__setitem__ function. This also handles the
__delitem__ case.
2016-10-14 16:01:12 -07:00
Durham Goode
d96c188d00 treemanifest: implement setflag()
Implements the py-treemanifest.setflag() function
2016-10-14 16:01:12 -07:00
Durham Goode
ea9c99a376 treemanifest: implement get()
Implements the py-treemanifest.get() function
2016-10-14 16:01:12 -07:00
Durham Goode
9819c37220 treemanifest: update entry nodes even if they haven't changed
During serialization, if we encountered a tree entry that had a NULL node, but
the contents matched the prior version, we considered it unchanged and did not
replace the null pointer with a pointer to the actual hash. This meant when it's
parent tried to serialize it, it would encounter a null pointer exception and
crash. Now we always fill in the node during popResult, since by definition it
will be null there (since the only way for something to be pushed is for it to
be null).
2016-10-14 16:01:12 -07:00
Durham Goode
24071807b7 treemanifest: change find() to do copy-on-write
The find() function is used to perform set and delete operations on a tree. Now
that we track Manifests via mutability and ref counting, we can change find() to
do copy-on-write.
2016-10-14 16:01:12 -07:00