Commit Graph

254 Commits

Author SHA1 Message Date
Augie Fackler
47e0320f8b cleanup: rename all iteritems methods to items and add iteritems alias
Due to a quirk of our module importer setup on Python 3, all calls and
definitions of methods named iteritems() get rewritten at import
time. Unfortunately, this means there's not a good portable way to
access these methods from non-module-loader'ed code like our unit
tests. This change fixes that, which also unblocks test-manifest.py
from passing under Python 3.

We don't presently define any itervalues methods, or we'd need to give
those similar treatment.
2017-05-29 00:00:02 -04:00
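The rename-plus-alias pattern described in 47e0320f8b can be sketched roughly as follows. ManifestLike is a hypothetical stand-in, not Mercurial's own class; the point is only that items() becomes the real method and iteritems stays available as an alias.

```python
class ManifestLike(object):
    """Hypothetical stand-in for a manifest class (not Mercurial's own)."""

    def __init__(self, entries):
        self._entries = dict(entries)

    def items(self):
        # Yield (path, node) pairs in sorted path order.
        return iter(sorted(self._entries.items()))

    # Keep the old name as an alias so existing callers continue to work.
    iteritems = items


m = ManifestLike({b"b.txt": b"n2", b"a.txt": b"n1"})
pairs = list(m.iteritems())
```

Code that bypasses the module loader, such as a unit test, can then call either name and get the same result.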
Augie Fackler
cdf974c00c manifest: use itertools.chain() instead of + for Python 3 compat
This is all pure-Python code, so I'm not too worried about perf here,
but we can come back and fix it should it be a problem.

With this change, the manifest code passes most unit tests on Python 3
(once the tests are corrected with many b prefixes). I've got a little
more to sort out there and then I'll mail that change too.
2017-05-28 21:29:58 -04:00
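A minimal sketch of why cdf974c00c swaps + for itertools.chain(), assuming the values being joined are dict key views. The function name and arguments are hypothetical and do not come from manifest.py.

```python
import itertools


def all_paths(files, dirs):
    # On Python 3, dict.keys() returns a view, so `files.keys() + dirs.keys()`
    # raises TypeError. itertools.chain() works on both Python 2 and 3.
    return sorted(itertools.chain(files.keys(), dirs.keys()))


paths = all_paths({b"a.txt": 1}, {b"dir/": 2})
```

chain() iterates lazily and builds no intermediate list, so any cost here comes from the sorted() call in this sketch, not from the chaining itself.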
Augie Fackler
83192b065a manifest: fix some pure-Python parser bits to work on Python 3 2017-05-28 21:29:15 -04:00
Yuya Nishihara
4563e16232 parsers: switch to policy importer
# no-check-commit
2016-08-13 12:23:56 +09:00
Durham Goode
941307e87c treemanifest: allow manifestrevlog to take an explicit treemanifest arg
Previously we relied on the opener options to tell the revlog to be a tree
manifest. This makes it complicated for extensions to create treemanifests and
normal manifests at the same time. Let's add a constructor argument to create a
treemanifest revlog as well.

I considered removing the options['treemanifest'] logic from manifestrevlog
entirely, but doing so shifts the responsibility to the caller which ends up
requiring changes in localrepo, bundlerepo, and unionrepo. I figured having the
dual mechanism was better than polluting other parts of the code base with
treemanifest knowledge.
2017-05-09 13:56:46 -07:00
Martin von Zweigbergk
53fada2d33 manifest: remove unused property _oldmanifest
The last use seems to have gone away in 9df18405feb6 (manifest: make
manifestlog use it's own cache, 2016-11-10).
2017-05-08 09:39:21 -07:00
Martin von Zweigbergk
62b3db1726 manifest: remove check for non-contexts in _dirmancache
It looks like the _dirmancache has contained only manifest contexts
since 4df3a0172646 (manifest: remove usages of manifest.read,
2016-11-10).
2017-05-05 14:10:58 -07:00
Durham Goode
d08640dadd treemanifest: add walksubtrees api
Adds a new function to treemanifest that allows walking over the directories in
the tree. Currently it only accepts a matcher to prune the walk, but in the
future it will also accept a list of trees and will only walk over subtrees that
differ from the versions in the list. This will be useful for identifying what
parts of the tree are new to this revision, which is useful when deciding the
minimal set of trees to send to a client given that they have a certain tree
already.

Since this is intended for an extension to use, the only current consumer is a
test. In the future this function may be useful for implementing other
algorithms like diff and changegroup generation.
2017-04-10 13:07:47 -07:00
Martin von Zweigbergk
5c6cd2e435 manifest: update comment to be about bytearray
Looks like a leftover from 54d8e724da64 (py3: use bytearray() instead
of array('c', ...) constructions, 2017-03-12).
2017-04-03 08:45:24 -07:00
Yuya Nishihara
6286308d03 py3: fix manifestdict.fastdelta() to be compatible with memoryview
This doesn't look nice, but it is a straightforward way to support Python 3.
bytes(m[start:end]) is needed because a memoryview doesn't support ordering
operations. On Python 2, m[start:end] returns a bytes object even if m is
a buffer, so calling bytes() should involve no additional copy.

I'm tired of trying cleaner alternatives, including:

 a. extend memoryview to be compatible with buffer type
    => memoryview is not an acceptable base type
 b. wrap memoryview by buffer-like class
    => zlib complains it isn't bytes-like
2017-03-26 19:06:48 +09:00
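The memoryview limitation that 6286308d03 works around can be illustrated as below. This is a sketch for Python 3 only; slice_less_than is a hypothetical helper, not the code in fastdelta().

```python
def slice_less_than(buf, a, b):
    """Compare two (start, end) slices of buf by byte order."""
    # Copy each slice to bytes first, because memoryview has no ordering.
    return bytes(buf[a[0]:a[1]]) < bytes(buf[b[0]:b[1]])


view = memoryview(b"abcxyz")

try:
    view[0:3] < view[3:6]
    ordering_supported = True
except TypeError:
    # Python 3: memoryview supports == but not <, >, <=, >=.
    ordering_supported = False

result = slice_less_than(view, (0, 3), (3, 6))
```

Under Python 3 the direct comparison raises TypeError, while the bytes() version returns the expected ordering.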
Augie Fackler
7b7bdd3bf0 manifest: refer to bytestrings as bytes, not str
Required on Python 3.
2017-03-19 01:12:03 -04:00
Augie Fackler
4840b79ea6 manifest: use node.hex instead of .encode('hex')
The latter doesn't work on Python 3.
2017-03-19 01:11:37 -04:00
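For reference, a portable replacement for .encode('hex') and .decode('hex') looks roughly like this. The sketch calls binascii directly, on the assumption that node.hex and node.bin behave like binascii.hexlify and binascii.unhexlify.

```python
import binascii

node = b"\x12\x34\xab\xcd"

# bytes.encode('hex') does not exist on Python 3; hexlify works on 2 and 3.
hexnode = binascii.hexlify(node)

# The inverse of str.decode('hex').
roundtrip = binascii.unhexlify(hexnode)
```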
Gregory Szorc
5ca0f908bf py3: add __bool__ to every class defining __nonzero__
__nonzero__ was renamed to __bool__ in Python 3. This patch simply
aliases __bool__ to __nonzero__ for every class implementing
__nonzero__.
2017-03-13 12:40:14 -07:00
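The aliasing described in 5ca0f908bf amounts to one extra line per class. Entries below is a hypothetical example class, not one taken from Mercurial.

```python
class Entries(object):
    def __init__(self, items):
        self._items = list(items)

    def __nonzero__(self):
        # Python 2 truth-testing hook.
        return len(self._items) > 0

    # Python 3 looks up __bool__ instead, so point it at the same function.
    __bool__ = __nonzero__


empty = Entries([])
full = Entries([b"a.txt"])
```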
Augie Fackler
408bc8a668 manifest: ensure paths are bytes (not str) in pure parser 2017-03-12 03:31:54 -04:00
Augie Fackler
518fdf5357 manifest: now that node.bin is available, use it directly
Previously we were getting it through revlog, which is a little unusual.
2017-03-12 03:30:15 -04:00
Augie Fackler
c6a7c91d01 manifest: use node.bin instead of .decode('hex')
The latter doesn't work in Python 3.
2017-03-12 03:29:48 -04:00
Augie Fackler
93c6a91a94 manifest: add __next__ methods for Python 3
Python 3 renamed .next() in the iterator protocol to __next__().
2017-03-12 00:43:20 -05:00
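A sketch of an iterator that works under both protocols, in the spirit of 93c6a91a94. LineIter is hypothetical and stands in for the iterator classes in the manifest code.

```python
class LineIter(object):
    def __init__(self, lines):
        self._lines = list(lines)
        self._pos = 0

    def __iter__(self):
        return self

    def next(self):
        # Python 2 iterator protocol.
        if self._pos >= len(self._lines):
            raise StopIteration
        line = self._lines[self._pos]
        self._pos += 1
        return line

    # Python 3 calls __next__(), so alias it to the same method.
    __next__ = next


lines = list(LineIter([b"a.txt", b"b.txt"]))
```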
Augie Fackler
949dee72f1 manifest: unbreak pure-python manifest parsing on Python 3 2017-03-12 00:44:21 -05:00
Augie Fackler
9a15a28705 py3: use bytearray() instead of array('c', ...) constructions
Portable from 2.6-3.6.
2017-03-12 03:32:21 -04:00
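The 'c' typecode for array does not exist on Python 3, which is why 9a15a28705 moves to bytearray(). A minimal illustration of the mutable-buffer operations bytearray offers follows; the buffer contents are made up for the example.

```python
# A mutable byte buffer that behaves the same on Python 2.6+ and 3.x.
buf = bytearray(b"a.txt\x00deadbeef\n")

# Append more data in place.
buf += b"b.txt\x00cafef00d\n"

# Slice assignment also works in place.
buf[0:1] = b"z"

text = bytes(buf)
```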
Durham Goode
2971a9e1cb treemanifest: make node reuse match flat manifest behavior
In a flat manifest, a node with the same content but different parents is still
considered a new node. In the current tree manifests however, if the content is
the same, we ignore the parents entirely and just reuse the existing node.

In our external treemanifest extension, we want to allow having one treemanifest
for every flat manifest, as a way of easing the migration to treemanifests. To
make this possible, let's change the root node treemanifest behavior to match
the behavior for flat manifests, so we can have a 1:1 relationship.

While this sounds like a BC breakage, it's not actually a state users can
normally get in because: A) you can't make empty commits, and B) even if you try
to make an empty commit (by making a commit then amending its changes away),
the higher level commit logic in localrepo.commitctx() forces the commit to use
the original p1 manifest node if no files were changed. So this would only
affect extensions and automation that reached past the normal
localrepo.commit() logic straight into the manifest logic.
2017-03-01 16:19:41 -08:00
Durham Goode
4dc06e6401 manifest: add match argument to diff and filesnotin
As part of removing manifest.matches (since it is O(manifest)), let's start by
adding match arguments to diff and filesnotin. As we'll see in later patches,
these are the only flows that actually use matchers, so by moving the matching
into the actual functions, other manifest implementations can make more efficient
algorithms.

For instance, this will allow treemanifest diffs to only iterate over the files
that are different AND meet the match criteria.

No consumers are changed in this patch, but the code is fairly easy to verify
visually. Future patches will convert consumers to use it.

One test was affected because it did not use the kwargs version of the clean
parameter.
2017-03-07 09:56:11 -08:00
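The idea in 4dc06e6401 of pushing the matcher into diff can be sketched on plain dicts. This is a simplified illustration: the real diff works on manifest objects and also tracks flags, and the match callable here is a hypothetical predicate, not a Mercurial matcher object.

```python
def diff(m1, m2, match=None):
    """Return {path: (old, new)} for paths whose values differ.

    If match is given, only paths for which match(path) is true are compared.
    """
    result = {}
    for path in set(m1) | set(m2):
        if match is not None and not match(path):
            continue
        old, new = m1.get(path), m2.get(path)
        if old != new:
            result[path] = (old, new)
    return result


before = {b"a.txt": b"n1", b"dir/b.txt": b"n2"}
after = {b"a.txt": b"n3", b"dir/b.txt": b"n4"}
only_dir = diff(before, after, match=lambda p: p.startswith(b"dir/"))
```

Because the filter runs inside diff, an implementation that knows its own layout (a tree, for example) could skip whole unmatched subtrees rather than filtering a full result afterwards.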
Durham Goode
36db5cbc5c manifest: remove _repo from manifestctx objects
We were storing the repo on the manifestctx objects so that they could access
the manifestlog via repo.manifestlog, which would refresh the structure if it
became out of date. This caused problems however when we want to have multiple
manifest logs in memory at once, like when transitioning to tree manifest from
flat manifests, since a tree manifest would try to access sub-trees via
repo.manifestlog[node], which was the flat manifest.

The solution is to just not store the repo, and instead store the manifestlog
that created this context. This removes the invalidation when the in memory
manifestlog becomes out of date, but people should probably not be keeping ctx's
around that long anyway.
2017-03-01 16:39:48 -08:00
Durham Goode
062cc68bf8 manifest: allow specifying the revlog filename
Previously we had hardcoded the manifest filename to be 00manifest.i. In our
external treemanifest extension, we want to allow writing a treemanifest side by
side with a flat manifest, so we need to be able to store the root revisions at
a different location (in our extension we use 00manifesttree.i).

This patch moves the revlog name to a parameter so we can adjust it.
2017-03-01 16:35:57 -08:00
Durham Goode
2ec73fed86 manifest: check 'if x is None' instead of 'if not x'
The old code here would end up executing __len__ on a tree manifest to determine
if 'not _data' was true or not. This was very expensive on large repos. Since
this function just cares about memoization, we can just check 'if _data is None'
instead and save a bunch of time.
2017-02-26 10:16:47 -08:00
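The difference between the two checks in 2ec73fed86 can be shown with a class that counts its __len__ calls. Both classes below are hypothetical stand-ins for the tree manifest and its context.

```python
class ExpensiveTree(object):
    """Hypothetical tree whose __len__ is costly; we count how often it runs."""
    len_calls = 0

    def __len__(self):
        ExpensiveTree.len_calls += 1
        return 0


class Ctx(object):
    def __init__(self):
        self._data = None
        self.loads = 0

    def read(self):
        # `is None` only tests identity; it never calls __len__ or __bool__.
        if self._data is None:
            self.loads += 1
            self._data = ExpensiveTree()
        return self._data


ctx = Ctx()
ctx.read()
ctx.read()
```

With `if not self._data`, every read() would call __len__, and an empty tree would be reloaded each time because it is falsy.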
Mateusz Kwapich
55fff531e4 manifest: expose the parents() method 2016-11-17 10:59:15 -08:00
Durham Goode
a25d8b7cd9 manifest: move manifestctx creation into manifestlog.get()
Most manifestctx creation already happened in manifestlog.get(), but there was
one spot in the manifestctx class itself that created an instance manually. This
patch makes that one instance go through the manifestlog. This means extensions
can just wrap manifestlog.get() and it will cover all manifestctx creations. It
also means this code path now hits the manifestlog cache.
2016-11-17 15:31:19 -08:00
Durham Goode
1b97f7fdb4 manifest: change treemanifestctx to construct subtrees from the manifestlog
Previously, treemanifestctx would directly construct its subtrees. By making it
get the subtrees through manifestlog.get() we consolidate all treemanifestctx
creation into manifestlog.get() and therefore extensions that need to wrap
manifestctx creation (like narrow-hg) can intercept manifestctxs at that single
place.

This also means fetching subtrees will take advantage of the manifestlog ctx
cache now, which it did not before.
2016-11-14 15:24:07 -08:00
Durham Goode
148f23952f manifest: make revlog verification optional
This patch adds a parameter to manifestlog.get() to disable hash checking.
This will be used in an upcoming patch to support treemanifestctx reading
sub-trees without loading them from the revlog. (This is already supported but
does not go through the manifestlog.get() code path)
2016-11-14 15:17:27 -08:00
Durham Goode
d2df1b3944 manifest: delete manifest.manifest class
Now that nothing uses the primary manifest class, we can delete it.
2016-11-10 02:13:19 -08:00
Durham Goode
588bfbe9c6 manifest: make manifestlog use it's own cache
As we start to make manifestlog the primary manifest source, the dependency on
manifest.manifest will cause circular dependency problems. Let's break this
dependency by making manifestlog use its own cache. In a near future patch we
will remove the previous manifest cache so we're not duplicating it.
2016-11-10 02:13:19 -08:00
Durham Goode
f980c11277 manifest: delete unused dirlog and _newmanifest functions
As part of migrating all manifest functionality out of manifest.manifest, let's
migrate a couple spots off of manifest.dirlog() to use the revlog specific
accessor. Then we can delete manifest.dirlog() and other unused functions.
2016-11-10 02:13:19 -08:00
Durham Goode
240c640350 manifest: move clearcaches to manifestlog
This is part of removing all functionality from manifest.manifest so we can
delete the class entirely.
2016-11-10 02:13:19 -08:00
Durham Goode
64058b3c19 manifest: remove usages of manifest.read
Now that the two manifestctx implementations have working read() functions,
let's remove the existing uses of manifest.read and drop the function.
2016-11-10 02:13:19 -08:00
Durham Goode
a9f6e04934 manifest: remove dependency on manifestrevlog being able to create trees
A future patch will be removing the read() function from the manifest class.
Since manifestrevlog currently depends on the read function that manifest
implements (as a derived class), we need to break the dependency from
manifestrevlog to read(). We do this by adding an argument to
manifestrevlog.write() which provides it with the ability to read a manifest.

This is good in general because it further separates revlog as the storage
format from the actual inmemory data structure implementation.
2016-11-10 02:13:19 -08:00
Durham Goode
6fb7c00e4d manifest: remove manifest.add and add memmfctx.write
This removes one more dependency on the manifest class by moving the write
functionality onto the memmanifestctx classes and changing the one consumer to
use the new API.

By moving the write path to a manifestctx, we now give the individual manifests
control over how they're read and serialized. This will be useful in developing
new manifest formats and storage systems.
2016-11-08 08:03:43 -08:00
Durham Goode
d111e231b7 manifest: add copy to mfctx classes
This adds copy functionality to the manifestctx classes. This will be used in an
upcoming diff to copy a manifestctx during commit so we can modify the manifest
before committing.
2016-11-08 08:03:43 -08:00
Durham Goode
ec7aab01e8 manifest: introduce memmanifestctx and memtreemanifestctx
This introduces two new classes to represent in-memory manifest instances.
Similar to memchangectx, this lets us prepare a manifest in memory, then in a
future patch we will add the apis that can commit this in memory structure.
2016-11-08 08:03:43 -08:00
Durham Goode
66a55cd598 manifestctx: add _revlog() function
The `self._repo.manifestlog._revlog` code is getting copy and pasted a lot in
manifestctx. Let's make it a function so it can be reused. This will make future
patches cleaner too.
2016-11-08 08:03:43 -08:00
Durham Goode
59f421f2ce manifest: remove manifest.find
As part of removing dependencies on manifest, this drops the find function and
fixes up the two existing callers to use the equivalent apis on manifestctx.
2016-11-08 08:03:43 -08:00
Martin von Zweigbergk
e6e8c26ad2 treemanifest: fix a "treeinmem" case
7089745181c5 (manifest: make treemanifestctx store the repo,
2016-10-18) broke most tests when run with treeinmem=True. The
treeinmem mode can not be enabled by the user, so this did not break
anything in practice, but it's useful to have it working for testing
the treemanifest code.
2016-11-04 13:49:15 -07:00
Durham Goode
847f748e59 manifest: add __nonzero__ method
This adds a __nonzero__ method to manifestdict. This isn't strictly necessary in
the vanilla Mercurial implementation, since Python will handle nonzero checks by
using __len__, but having it implemented here makes it easier for alternative
implementations to implement __nonzero__ and have them be plug-n-play with the
normal implementation.
2016-11-03 17:31:14 -07:00
Durham Goode
d793e01462 manifest: remove manifest.readshallowdelta
This removes manifest.readshallowdelta and converts its one consumer to use
manifestlog instead.
2016-11-02 17:10:47 -07:00
Durham Goode
f952eca1af manifest: get rid of manifest.readshallowfast
This removes manifest.readshallowfast and converts its one user to use
manifestlog instead.
2016-11-02 17:10:47 -07:00
Durham Goode
b393b6387d manifest: add shallow option to treemanifestctx.readdelta and readfast
The old manifest had different functions for performing shallow reads, shallow
readdeltas, and shallow readfasts. Since a lot of the code is duplicate (and
since those functions don't make sense on a normal manifestctx), let's unify
them into flags on the existing readdelta and readfast functions.

A future diff will change consumers of these functions to use the manifestctx
versions and will delete the old apis.
2016-11-02 17:10:47 -07:00
Durham Goode
974eda820c manifest: change manifestlog mancache to be directory based
In the last patch we added a get() function that allows fetching directory level
treemanifestctxs. It didn't handle caching at directory level though, so we need to
change our mancache to support multiple directories.
2016-11-02 17:10:47 -07:00
Durham Goode
cd705bb046 manifest: add manifestlog.get to obtain subdirectory instances
Previously manifestlog only allowed obtaining root level manifests. Future
patches will need direct access to subdirectory manifests as part of changegroup
creation, so let's add a get() function that knows how to deal with
subdirectories.
2016-11-02 17:24:06 -07:00
Durham Goode
c70bd2fb82 manifest: throw LookupError if node not in revlog
When accessing a manifest via manifestlog[node], let's verify that the node
actually exists and throw a LookupError if it doesn't. This matches the old read
behavior, so we don't accidentally return invalid manifestctxs.

We do this in manifestlog instead of in the manifestctx/treemanifestctx
constructors because the treemanifest code currently relies on the fact that
certain code paths can produce treemanifests without touching the revlogs (and
it has tests that verify things work if certain revlogs are missing entirely, so
they break if we add validation that tries to read them).
2016-11-02 17:33:31 -07:00
Durham Goode
9fcac302ea manifest: make treemanifestctx store the repo
Same as in the last commit, the old treemanifestctx stored a reference to the
revlog.  If the inmemory revlog became invalid, the ctx now held an old copy and
would be incorrect. To fix this, we need the ctx to go through the manifestlog
for each access.

This is the same pattern that changectx already uses (it stores the repo, and
accesses commit data through self._repo.changelog).
2016-10-18 17:44:42 -07:00
Durham Goode
46fbc1bfc1 manifest: make manifestctx store the repo
The old manifestctx stored a reference to the revlog. If the inmemory revlog
became invalid, the ctx now held an old copy and would be incorrect. To fix
this, we need the ctx to go through the manifestlog for each access.

This is the same pattern that changectx already uses (it stores the repo, and
accesses commit data through self._repo.changelog).
2016-10-18 17:44:26 -07:00
Durham Goode
871d515e3d manifest: make manifestlog a storecache
The old @property on manifestlog was broken. It meant that we would always
recreate the manifestlog instance, which meant the cache was never hit. Since
we'll eventually remove repo.manifest and make manifestlog the only property,
let's go ahead and make manifestlog the @storecache property, have manifestlog
own the manifest instance, and have repo.manifest refer to it via manifestlog.

This means all accesses go through repo.manifestlog, which is now invalidated
correctly.
2016-10-18 17:33:39 -07:00
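The problem that 871d515e3d describes with a plain @property can be reproduced in a few lines. This sketch uses hypothetical classes and a hand-rolled cached attribute; Mercurial's @storecache additionally handles invalidation when the backing file changes, which is not modeled here.

```python
class ManifestLog(object):
    """Hypothetical stand-in holding a per-instance cache."""

    def __init__(self):
        self.cache = {}


class RepoPlainProperty(object):
    @property
    def manifestlog(self):
        # A new instance on every access, so its cache starts empty each time.
        return ManifestLog()


class RepoCached(object):
    def __init__(self):
        self._manifestlog = None

    @property
    def manifestlog(self):
        # Create once and reuse the same instance afterwards.
        if self._manifestlog is None:
            self._manifestlog = ManifestLog()
        return self._manifestlog


plain, cached = RepoPlainProperty(), RepoCached()
plain.manifestlog.cache["node"] = "ctx"
cached.manifestlog.cache["node"] = "ctx"
```

The entry written through the plain property is lost on the next access, while the cached variant keeps it.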