Summary:
The name being passed to the store was wrong, because it had a trailing slash.
This wasn't caught before because the code was reading and writing the paths
with a slash at the end, so it matched as long as we only interacted with packs
produced by the code. The issue became more obvious when I tried to have packs
generated from revlogs interact with this code.
All the tests are affected since the entry keys changed.
Also use 'const ManifestFetcher &x' to pass the ref to avoid the copy.
Test Plan: Tests updated
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: quark, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4901274
Signature: t1:4901274:1492638476:ff28f8976657baec99effbd82ecd436f6282ea5b
Summary:
Previously, walksubtree would segfault if you ran it on a tree that had
non-finalized entries, since they contain node pointers that point to NULL.
Let's fix this by having serialization refer to the nullid if the pointer is
null.
Test Plan: Updated the tests to cover this
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4901257
Signature: t1:4901257:1492637151:803b3183ad766e34f0ad24f14a8ec1b784600879
Summary:
In one place we tried to set the node to null ('\0' * 20) but instead set it to
40 null bytes. If we're using a hex representation we should be using '0' * 40.
Let's make a global constant for this as well.
Test Plan: Ran the tests
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4901253
Signature: t1:4901253:1492637015:15bca25bff67167c3c1f51179b7c6f4db2589586
Summary:
Previously, callers would need to be aware of how ManifestEntry stored its
no-node state, by checking if entry.node == NULL. Moving to a hasNode() api
allows us to be more flexible in how we represent an empty node.
Test Plan: Ran the tests
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4901246
Signature: t1:4901246:1492637096:bc193dd998d9f9653dd17b3324a2e71ac8732235
Summary:
A future patch will want to use SubtreeIterator for comparing two arbitrary
trees. To do this, we need to refactor out all the logic that actually computes
permanent hashes based on the other tree (since if I'm comparing against an
arbitrary tree, I don't want to use its hashes as my p1/p2 like I would if I'm
serializing the new parts of a tree).
This drastically simplifies the code as well, making it much easier to maintain.
Test Plan: Ran the tests
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4895795
Signature: t1:4895795:1492629678:18617385720b0e1ef5d1513085910a0d3968e29b
Summary:
A future patch will make SubtreeIterator more generic for diffing trees. To do
that we'll need to move the finalization logic out of SubtreeIterator. This
patch adds a new FinalizerIterator class that simply wraps SubtreeIterator. In
the next patch logic will move between the two classes.
Test Plan: Ran the tests
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4895794
Signature: t1:4895794:1492629514:9ee6e60c32e99041ce16f92c2923de6b4ad0b104
Summary:
A future patch will call ManifestEntry.update in more places, and it is
convienent to have the bin-to-hex conversion unified into one place.
Test Plan: Ran the tests
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4895788
Signature: t1:4895788:1492629387:7d64db0169bc47e5996666c4fb8fde639e83ba8b
Summary:
Previously the manifest node was only contained by the pointer to the manifest
(usually the container manifest). This made accessing the node awkward in a
bunch of places. Let's refactor it so the node is kept on the Manifest itself as
well, and it handles invalidation and computation of that node as it is editted.
Test Plan: Ran the tests
Reviewers: #mercurial, stash
Reviewed By: stash
Subscribers: stash, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4880076
Signature: t1:4880076:1492591856:f8cd065cac8d3f7cac3069ee9cb61881a24ee426
Summary:
The NewTreeIterator class was used for iterating over a new tree, when given
it's p1. Mainly for performing the write of a new tree. We want to make this a
more generic mechanism so we can do the diff of one tree with an arbitrary
nother (like when deciding what trees to send a client if they have a given base
node). Let's start by renaming the class.
Test Plan: Ran the tests
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4880048
Signature: t1:4880048:1492110564:bd51d35e4cfcca2d6b1d4cf81d78fdacf2b52f3d
Summary:
There was a bug where if a commit had an empty manifest (i.e. same contents as
it's parents, but different p1/p2) and it resulted in a
non-empty-but-still-a-noop delta in the revlog (i.e. a delta that deletes a line
and replaces it with the same content), this resulted in a no-op set to the tree
manifest. When the tree was serialized, it noticed that the set was a no-op, so
it didn't serialize that particular tree, but the parent didn't get notified it
was a no-op, so we serialized parent directories with pointers to sub trees that
did not exist.
The fix is to not store new sub-tree nodes on parents when the sub-tree contents
are the same. Now we just store the original sub-tree node. So we no longer
accidentally reference non-existent trees.
Unfortunately I'm not sure how Mercurial can get into this situation (how do you
produce a delta that has content, but the content is a no-op?), so I'm not sure
how to test it. The tree verification command in another patch can catch this
exception though.
Test Plan:
Ran 'hg debuggentrees' on a repo that has a manifest entry that
exhibits this problem. Verified via the debugger that only one tree (the root
node) was generated from adding that manifest.
Reviewers: #mercurial, simonfar
Reviewed By: simonfar
Subscribers: simonfar, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4724539
Signature: t1:4724539:1489789531:02a9a75a85aa2a0a6e4c16e163867bd5a6f55670
Summary:
Upstream has added a matcher argument to the diff API which allows diff to avoid
traversing certain parts of the tree. This adds support for that to our native
treemanifest implementation.
Test Plan: Added tests for diff with matches
Reviewers: #mercurial, stash
Reviewed By: stash
Subscribers: stash, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4677023
Signature: t1:4677023:1489076627:dbcea209d300a68fa050f68c52b4fd9949b85302
Summary:
A bunch of if statements were doing the same thing every time. This moves that
logic out of the individual if statements, which makes it cleaner and will make
a subsequent patch that runs a matcher against the paths easier.
Test Plan: Ran the tests
Reviewers: #mercurial, stash
Reviewed By: stash
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4677011
Signature: t1:4677011:1489075541:9379597b8866358ad5dad3f1f9ae00a0a0b523f9
Summary:
Now that ctreemanifest no longer depends on python.h, let's move pythonutil over
to cstore where all the python code is.
Test Plan: Ran the build and the tests
Reviewers: #mercurial, simonfar
Reviewed By: simonfar
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4663988
Signature: t1:4663988:1488895919:652b3fc35a2dd12c51a9f70e32997c7b4d037c95
Summary:
This is the last piece of removing the Python dependency from the core
treemanifest code. This replaces the old PythonObj diff dictionary with a new
DiffResult class that has a PythonDiffResult implementation.
Test Plan:
Ran the test suite, including the unit tests that explicitly cover
treemanifest diff from python.
Reviewers: #mercurial, simonfar
Reviewed By: simonfar
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4663984
Signature: t1:4663984:1488889615:62f064924b0d6b45dfa7e490d19418060c374f40
Summary:
As part of breaking the native cstore implementation away from Python, let's
create a Matcher class that can be used to perform path match testing. Initially
the only implementation is the PythonMatcher which just wraps a python match
object.
Test Plan: Covered by existing matcher tests in cstore-treemanifest.py
Reviewers: #mercurial, simonfar
Reviewed By: simonfar
Subscribers: simonfar, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4663354
Signature: t1:4663354:1488914384:2c33c7e0e7f2eade0786b6ff41503317989fd1e5
Summary:
In a future patch we will want to pass the uniondatapackstore around to other
objects who will contribute to the lifetime. Let's change it to a shared_ptr so
that becomes easy.
Let's also make the destructor virtual, so we can pass different types of stores
around and have them be destructed correctly.
Test Plan: Ran the tests
Reviewers: #mercurial, simonfar
Reviewed By: simonfar
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4603893
Signature: t1:4603893:1487847173:2fc3505032ea8c30cf9e0f76ac4e75d64513d87d
Summary:
The old code kept a PythonObj around inside the ManifestFetcher for fetching
manifest contents from the store. As part of moving the treemanifest code to use
the new native cstore API let's make the manifest code depend on a Store
abstraction and have one implementation be a PythonStore.
This removes almost all of the python dependencies from the core treemanifest
code, except some logic around running the python matcher during iteration and
writing directly to the python result dict during diff. We'll abstract those
away later.
Test Plan: Built and ran the tests
Reviewers: #mercurial, simonfar
Reviewed By: simonfar
Subscribers: simonfar, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4569944
Signature: t1:4569944:1487847102:d005b6484fd7de9335961b0bc4530505b25f961d
Summary:
Implements the getdeltachain function on the new UnionDatapackStore class.
This required some modifications to the DeltaChainIterator. Since the results of
the iterator may cross multiple different chains, we need to keep each chain
alive until the iterator is destructed, so we need to keep a reference to each
chain. We also had to remove the size() property from the iterator since the
fact that the chain spans chains means we don't know the size up front.
Test Plan: Adds a test
Reviewers: #mercurial, simonfar
Reviewed By: simonfar
Subscribers: simonfar, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4556458
Signature: t1:4556458:1487199872:07dffa3121acfbeb6d6993b518e6f4887122d4d5
Summary:
As part of unifying our native store data structures into a single library,
let's move the treemanifest (including the python extension) into py-cstore.
Test Plan:
Built and ran the tests. Verified there was no ctreemanifest.so
dependency in the built cstore.so by using 'ldd cstore.so' on Linux and 'otools
-L cstore.so' on OSX.
Reviewers: #mercurial, simonfar
Reviewed By: simonfar
Subscribers: mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4602484
Signature: t1:4602484:1487842683:964cbb43b7cb20d0db699ef691fe7fcf6bccf2e8
Summary:
Previously the find function would return None if the given file was not present
in the tree. The other manifest implementations throw a KeyError here instead.
This affects things like sparse looking for files in the null revision, where it
was receiving a None and crashing because it expected a KeyError in this case.
Test Plan: Added a test. Also, clones were working again.
Reviewers: #mercurial, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4412592
Signature: t1:4412592:1484321389:690e6cd7b3d1599714102ebf361e59fcb0ed59ef
We need to keep the path in sync with the stack, and since we've already popped
the stack here, we also need to pop the path in this short-circuit case.
Summary:
This adds a simple test that verifies hg repack will pack two tree manifest
packs into one.
It caught a bug where creating a treemanifest for a commit with a null parent
produced incorrect output because it constructed an empty tree and tried to use
it's node as the parent of the delta, when there should not have been any delta
in the first place. This is fixed by this diff as well.
Test Plan: Ran the new test
Reviewers: #mercurial, dsyang, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4261591
Signature: t1:4261591:1480705822:ef21fb8cebd8b89f92f58f11bb1dab59bf97664d
Summary:
The original python code for manifest matches has a fast path for when the
matcher contains a specific list of files, this let it check those specific
files instead of iterating over the entire manifest. Let's add this same
optimization to our ctreemanifest implementation.
This greatly speeds up copies._computeforwardmissing() during rebases.
Test Plan: Ran rebase and verified it was much faster
Reviewers: #mercurial, ikostia
Reviewed By: ikostia
Subscribers: rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4248075
Signature: t1:4248075:1480505248:964cbdd701fe29c38a7a0d9bf0752fb5d4bb0ed0
Summary:
The recently added __nonzero__ function was not handling the possibility of
pyexceptions. This adds the appropriate catch.
A pyexception is where somewhere up the stack something has set the python error
string, and we just need to return the appropriate error value from the top
level c api to indicate an error happened.
Test Plan: none. I just caught this while debugging
Reviewers: #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D4202090
Summary:
Previously, when we wrote each tree entry into a pack file, it wasn't delta'd in
any way. This patch makes it store the delta against p1 in the pack file.
Testing in a large repo shows this reduces tree pack size by about 22x.
Test Plan:
Ran the tests. Did a pull in a large repo and saw the pack file was
22x smaller than before (and still usable).
Reviewers: #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D4202088
Summary:
As part of enabling deltas in our tree pack files, we need NewTreeIter to return
the p1 and p2 Manifests as well. To do this we need to return a Manifest/Node
tuple. Since this is becoming a common structure, let's define a type for it and
change NewTreeIter to use it for it's return values.
Test Plan:
Ran the tests. With a future diff I built a pack file and verified it
was small because of deltas.
Reviewers: #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D4202080
Summary:
Core mercurial sorts p1 and p2 before computing the hash, so it's deterministic.
We need to do the same.
Test Plan: Ran the tests, saw a hash changed
Reviewers: #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D4202063
Summary:
The new tree walk did not check if the compare entry was actually a directory
before traversing it. This caused problems for commits that deleted a file and
replaced it with a directory, since it attempted to recurse down the file, which
had no treemanifest.
Test Plan: Added a test
Reviewers: simpkins, #mercurial, zamsden, rmcelroy
Reviewed By: rmcelroy
Subscribers: net-systems-diffs@, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4055678
Signature: t1:4055678:1477430721:ec679afcb4ff6ea2bbf44927e04f2bfcdee8cc03
Summary:
This improves the performance of treemanifest.text() from 5s to 3s. This
function is used when converting a treemanifest to a full text.
Test Plan:
Ran the unit tests.
This was visible when running hg commit with the extension enabled. I verified
the time went down from 5 to 3. A future diff will add an integration test suite
for the treemanifest extension.
Reviewers: simpkins, #mercurial, quark
Reviewed By: quark
Subscribers: quark, net-systems-diffs@, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4055693
Signature: t1:4055693:1477067172:47a61fe98e57b4b10e329b7f53f41b80bf15d205
Summary: "%d" is the wrong format specifier for size_t, and thus this does not build.
Test Plan: make local and see build succeed
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: quark, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D4030507
Signature: t1:4030507:1476726943:44c480a1e231fe98354d386116b112d508436093
manifest.flags() actually returns the default value if the filename doesn't
exist. So we need to replicate that behavior.
As part of this fix, I changed treemanifest.get() to return a boolean indicating
whether the file was found or not.
This implements the dirs function, which returns a collection set that can
answer the question of if a directory is in the manifest. Currently we do a
naive solution of using util.dirs(), which iterates over all the files. Given
that we have a tree already, we should be able to return something smarter in
the future.
Previously find would only return files. For treemanifest.hasdir() we needed it
to find directories as well. This patch adds a new enum for indicating if the
find should return files, directories, or both.
The diff algorithm assumed every tree already had a node. If we are iterating
over an uncommitted tree, it may have tree entries with NULL as their node. We
need to always recurse in these cases.
hg has an optional 'clean' arg on diff, which causes it to also return files
that aren't different between the two diffs. This implements it on our
treemanifest diff algorithm.
During serialization, if we encountered a tree entry that had a NULL node, but
the contents matched the prior version, we considered it unchanged and did not
replace the null pointer with a pointer to the actual hash. This meant when it's
parent tried to serialize it, it would encounter a null pointer exception and
crash. Now we always fill in the node during popResult, since by definition it
will be null there (since the only way for something to be pushed is for it to
be null).
The find() function is used to perform set and delete operations on a tree. Now
that we track Manifests via mutability and ref counting, we can change find() to
do copy-on-write.