Commit Graph

180 Commits

Author SHA1 Message Date
Pierre-Yves David
a0f6321906 configitems: register the 'bundle.mainreporoot' config 2017-06-30 03:31:26 +02:00
Jun Wu
f5ab365fb4 bundlerepo: use raw revision in revdiff()
This is similar to "revlog: use raw revisions in revdiff". revdiff()
generates raw text used in revlog directly.

This makes test-flagprocessor.t happy.
2017-04-03 09:31:39 -07:00
Jun Wu
52198a3918 bundlerepo: fix raw handling in revision()
Similar to fixes in revlog.py, this patch uses "rawtext" to explicitly label
contents expected to be raw, and makes sure content stored in _cache is raw
text.

Now test-flagprocessor.t points us to another issue.
2017-04-06 17:45:47 -07:00
Jun Wu
062c44135c bundlerepo: build revlog index with flags
This fixes bundlerevlog.flags(rev) for any revisions provided by the bundle.

Now test-flagprocessor.t points us to another issue.
2017-04-06 18:06:42 -07:00
Jun Wu
336e7d4e7c bundlerepo: make baserevision return raw text
"baserevision" returns the text that will be used to apply deltas. Since
deltas are against raw texts, "baserevision" should return raw text.

Now test-flagprocessor.t points us to a new error.
2017-04-06 17:43:29 -07:00
Jun Wu
9c45c07c8b bundlerepo: avoid unnecessary node -> rev conversion 2017-03-29 16:28:00 -07:00
Augie Fackler
9a15a28705 py3: use bytearray() instead of array('c', ...) constructions
Portable from 2.6-3.6.
2017-03-12 03:32:21 -04:00
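Editorial note on 9a15a28705: the 'c' typecode of array.array() exists only on Python 2, while bytearray() behaves the same on Python 2.6+ and Python 3. The snippet below is a minimal sketch of the portable form, not code from the patch; the buffer contents are made up for illustration.

```python
# Illustrative only: a mutable byte buffer built with bytearray(),
# which is available on both Python 2.6+ and Python 3.
buf = bytearray(b"abc")
buf[0:1] = b"X"        # in-place slice assignment replaces the first byte
buf.extend(b"def")     # append more bytes to the same buffer
result = bytes(buf)    # freeze into an immutable bytes object
```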
Pierre-Yves David
c8445658f5 vfs: use 'vfs' module directly in 'mercurial.bundlerepo'
Now that the 'vfs' classes have moved into their own module, let's use the new module
directly. We update code iteratively to help with possible bisect needs in the
future.
2017-03-02 14:47:03 +01:00
Pulkit Goyal
07314d0686 py3: convert the mode argument of os.fdopen to unicodes (1 of 2)
os.fdopen() does not accept bytes as its second argument, which represents the
mode in which the file is to be opened. This patch makes sure unicodes are
passed in py3 by using pycompat.sysstr().
2017-02-13 20:06:38 +05:30
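Editorial note on 07314d0686: a rough sketch of the idea, assuming Python 3. The sysstr() helper here is a hypothetical stand-in written for this example; it is not the real pycompat.sysstr() implementation.

```python
import os
import tempfile


def sysstr(s):
    """Hypothetical stand-in for pycompat.sysstr(): return a native str."""
    if isinstance(s, bytes):
        return s.decode("latin-1")
    return s


fd, path = tempfile.mkstemp()
try:
    # On Python 3 a bytes mode such as b"wb" is rejected by os.fdopen(),
    # so the mode is converted to a native str first.
    with os.fdopen(fd, sysstr(b"wb")) as fp:
        fp.write(b"payload")
    with open(path, "rb") as fp:
        data = fp.read()
finally:
    os.unlink(path)
```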
Remi Chaintron
dfc79cbfc3 revlog: flag processor
Add the ability for revlog objects to process revision flags and apply
registered transforms on read/write operations.

This patch introduces:
- the 'revlog._processflags()' method that looks at revision flags and applies
  flag processors registered on them. Due to the need to handle non-commutative
  operations, flag transforms are applied in stable order but the order in which
  the transforms are applied is reversed between read and write operations.
- the 'addflagprocessor()' method, which allows registering processors on flags.
  Flag processors are defined as a 3-tuple of (read, write, raw) functions to be
  applied depending on the operation being performed.
- an update on 'revlog.addrevision()' behavior. The current flagprocessor design
  relies on extensions to wrap around 'addrevision()' to set flags on revision
  data, and on the flagprocessor to perform the actual transformation of its
  contents. In the lfs case, this means we need to process flags before we meet
  the 2GB size check, leading to performing some operations before it happens:
  - if flags are set on the revision data, we assume some extensions might be
    modifying the contents using the flag processor next, and we compute the
    node for the original revision data (still allowing extension to override
    the node by wrapping around 'addrevision()').
  - we then invoke the flag processor to apply registered transforms (in lfs's
    case, drastically reducing the size of large blobs).
  - finally, we proceed with the 2GB size check.

Note: If a cachedelta is passed to 'addrevision()' and we detect that the
flag processor modified the revision data, we choose to trust the flag processor
and drop the cachedelta.
2017-01-10 16:15:21 +00:00
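Editorial note on dfc79cbfc3: the toy model below sketches why the processing order is reversed between write and read when transforms do not commute. It is not Mercurial's implementation. The flag values, the module-level registry, and the simplified one-argument transform signatures are all assumptions made for this example.

```python
import base64
import zlib

# Toy flags and their stable processing order (values are made up).
FLAG_A = 1 << 0
FLAG_B = 1 << 1
FLAG_ORDER = [FLAG_A, FLAG_B]

processors = {}


def addflagprocessor(flag, processor):
    """Register a (read, write, raw) 3-tuple for a flag."""
    if flag in processors:
        raise ValueError("processor already registered for flag %d" % flag)
    processors[flag] = processor


def processflags(text, flags, operation):
    """Apply registered transforms; reads run in the reverse order of writes."""
    if operation == "write":
        order = FLAG_ORDER
    else:
        order = list(reversed(FLAG_ORDER))
    for flag in order:
        if flags & flag:
            # The 'raw' member is carried along but unused in this toy model.
            read, write, raw = processors[flag]
            text = write(text) if operation == "write" else read(text)
    return text


addflagprocessor(FLAG_A, (zlib.decompress, zlib.compress, lambda t: False))
addflagprocessor(FLAG_B, (base64.b64decode, base64.b64encode, lambda t: False))

original = b"some revision text" * 4
stored = processflags(original, FLAG_A | FLAG_B, "write")   # compress, then encode
restored = processflags(stored, FLAG_A | FLAG_B, "read")    # decode, then decompress
```

Reading in the same order as writing would try to decompress base64 text and fail, which is the non-commutative case the commit message refers to.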
Remi Chaintron
6d11b9177b revlog: add 'raw' argument to revision and _addrevision
This patch introduces a new 'raw' argument (defaults to False) to revlog's
revision() and _addrevision() methods.
When the 'raw' argument is set to True, it indicates the revision data should be
handled as raw data by the flagprocessor.

Note: Given revlog.addgroup() calls are restricted to changegroup generation, we
can always set raw to True when calling revlog._addrevision() from
revlog.addgroup().
2017-01-05 17:16:07 +00:00
Remi Chaintron
cc88d4a3c4 revlog: merge hash checking subfunctions
This patch factors the behavior of both methods into 'checkhash'.
2016-12-13 14:21:36 +00:00
Pulkit Goyal
97f340e354 py3: use pycompat.getcwd() instead of os.getcwd()
We have pycompat.getcwd(), which returns a bytes path on Python 3. This patch
changes most of the occurrences of os.getcwd() to the pycompat one.
2016-11-23 00:03:11 +05:30
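Editorial note on 97f340e354: a minimal sketch of a bytes-returning getcwd on Python 3. The getcwd() function below is a hypothetical stand-in for pycompat.getcwd(), built on the standard-library os.getcwdb(); how pycompat defines it is not shown in this log.

```python
import os


def getcwd():
    """Hypothetical stand-in for pycompat.getcwd() on Python 3."""
    # os.getcwdb() returns the current directory as bytes instead of str.
    return os.getcwdb()


cwd = getcwd()
as_text = os.fsdecode(cwd)  # convert back to str only where text is needed
```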
Durham Goode
52b8095f37 manifest: remove last uses of repo.manifest
Now that all the functionality has been moved to manifestlog/manifestrevlog/etc,
we can finally change all the uses of repo.manifest to use the new versions. A
future diff will then delete repo.manifest.

One additional change in this commit is to change repo.manifestlog to be a
@storecache property instead of @property. This is required because some uses of
repo.manifest require that it be settable (contrib/perf.py and the static http
server). We can't do this in a prior change because we can't use @storecache on
this until repo.manifest is no longer used anywhere.
2016-11-10 02:13:19 -08:00
Durham Goode
57cfc4515a manifest: add bundlemanifestlog support
As part of deprecating manifest.manifest we need to make bundlerepo support
manifestlog.
2016-11-11 01:15:59 -08:00
Durham Goode
757b6fb5aa manifest: move manifest creation to a helper function
A future patch will be moving manifest creation to be inside manifestlog as part
of improving our cache guarantees. bundlerepo and unionrepo currently rely on
being able to hook into manifest creation, so let's temporarily move the actual
manifest creation to a helper function for them to intercept.

In the future manifest.manifest() will disappear entirely and this can
disappear.
2016-10-18 17:32:51 -07:00
Durham Goode
34f86b9344 manifest: make one use of _mancache avoid manifestctxs
In a future patch we will change manifestctx and treemanifestctx to no longer
derive from manifestdict and treemanifest, respectively. This means that
consumers of the _mancache will now need to be aware of the difference between
the two, until we get rid of the manifest entirely and the _mancache becomes
only filled with ctxs.

This fixes one case of it that can be fixed by using the other cache. Future
patches will address the other uses using the upcoming manifestctx.read()
function.
2016-09-12 14:29:09 -07:00
Pierre-Yves David
cb4c54634b manifest: backed out changeset 3e5e08efafc9
There is some suspicious failure in evolution tests. This changeset was supposed
to be dropped until we investigate.
2016-09-10 01:42:05 +02:00
Durham Goode
e8a39ee6a7 manifest: make uses of _mancache aware of contexts
In a future patch we will change manifestctx and treemanifestctx to no longer
derive from manifestdict and treemanifest, respectively. This means that
consumers of the _mancache will now need to be aware of the difference between
the two, until we get rid of the manifest entirely and the _mancache becomes
only filled with ctxs.
2016-08-29 18:02:09 -07:00
Durham Goode
9dfdbc1f92 manifest: break mancache into two caches
The old manifest cache cached both the inmemory representation and the raw text.
As part of the manifest refactor we want to separate the storage format from the
in memory representation, so let's split this cache into two caches.

This will let other manifest implementations participate in the in memory cache,
while allowing the revlog based implementations to still depend on the full text
caching where necessary.
2016-08-17 13:25:13 -07:00
Augie Fackler
ba4d11b62e bundlerepo: add support for treemanifests in cg3 bundles
This is a little messier than I'd like, and I'll probably come back
and do some more refactoring later, but as it is this unblocks
narrowhg. An alternative approach (which I may do as part of the
mentioned refactoring) would be to construct *all* dirlog instances up
front, so that we don't have to keep track of the linkmapper
method. This would avoid a reference cycle between the bundlemanifest
and the bundlerepository, but I was hesitant to do all the work up
front like that.

With this change, it's possible to do 'hg incoming' and 'hg pull' from
bundles in .hg/strip-backup in a treemanifest repository. Sadly, this
doesn't make it possible to 'hg clone' one of those (if you do 'hg
strip 0'), because the cg3 in the bundle gets written without a
treemanifest flag. Since that's going to be an involved refactor in a
different part of the code (which I *suspect* won't touch any of the
code I've just written here), let's leave it as an idea for Later.
2016-08-05 13:08:11 -04:00
Augie Fackler
7b4dc2c6d6 bundlerepo: use supportedincomingversions instead of allsupportedversions
Since bundlerepo is really a pull-like operation, this is the correct
method to use here.
2016-08-04 14:13:35 -04:00
Augie Fackler
7c233d9381 bundlerepo: introduce method to find file starts and use it
This moves us to the modern iter() technique instead of the `while
True` pattern since it's easy. Factored out as a function because I'm
about to need this in a second place.
2016-08-05 13:07:58 -04:00
Augie Fackler
ce135bfec2 bundlerevlog: use for loop over iterator instead of while True
The iter() builtin has a neat pattern where you give it a callable of
no arguments and a sentinel value, and you can then loop over the
function calls like a normal iterator. This cleans up the code a
little.
2016-08-05 13:09:50 -04:00
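Editorial note on ce135bfec2 and 3bb87a6688: the two-argument form of iter() described above can be sketched as follows. The chunk reader and the sample data are made up for this example and are not taken from bundlerepo.

```python
import io


def iterchunks(fp, size):
    """Yield fixed-size chunks until read() returns the empty-bytes sentinel."""
    # iter(callable, sentinel) calls the callable repeatedly and stops
    # as soon as it returns a value equal to the sentinel.
    for chunk in iter(lambda: fp.read(size), b""):
        yield chunk


chunks = list(iterchunks(io.BytesIO(b"abcdefgh"), 3))
```

This replaces the older shape of a `while True` loop with an explicit `if not chunk: break` inside it.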
Augie Fackler
3bb87a6688 bundlerepo: use for loop over iterator instead of while True
The iter() builtin has a neat pattern where you give it a callable of
no arguments and a sentinel value, and you can then loop over the
function calls like a normal iterator. This cleans up the code a
little.
2016-08-05 13:09:24 -04:00
Pierre-Yves David
000dd50a40 bundle2: remove 'experimental.bundle2-exp' boolean config (BC)
All users are migrated to 'devel.legacy.exchange', we can clean up the
experimental namespace.

Marking as (BC) because I know some large installations have bundle2 off and I
want to make sure they notice the change.
2016-08-03 16:23:26 +02:00
Pierre-Yves David
6236bcaa4b bundlerepo: also read the 'devel.legacy.exchange' config
Bundlerepo does its own bundle2 related logic.
2016-08-03 16:42:10 +02:00
liscju
c7ec9d159e i18n: translate abort messages
I found a few places where the message given to abort is
not translated. I don't see any reason not to translate
them.
2016-06-14 11:53:55 +02:00
liscju
f82ff5ff29 bundle: warn when update to revision existing only in a bundle (issue5004)
Currently this is done silently, so unless users really know what they are doing,
they will be surprised to find that 'hg status' doesn't work after the update.
This commit also makes the merge operation warn about a missing parent when
the revision to merge exists only in the bundle.
2016-03-23 08:55:22 +01:00
Martin von Zweigbergk
4cc86f7b27 bundle: move writebundle() from changegroup.py to bundle2.py (API)
writebundle() writes a bundle2 bundle or a plain changegroup1. Imagine
away the "2" in "bundle2.py" for a moment and this change should make
sense. The bundle wraps the changegroup, so it makes sense that it
knows about it. Another sign that this is correct is that the delayed
import of bundle2 in changegroup goes away.

I'll leave it for another time to remove the "2" in "bundle2.py"
(alternatively, extract a new bundle.py from it).
2016-03-28 14:41:29 -07:00
Pierre-Yves David
55efb3cd6d bundlerepo: properly handle hidden linkrev in manifestlog (issue4945)
The bundlerepository has to do some special magic to handle linkrev of the
bundled manifest. That logic was done from a repoview, and obsolescence markers
affecting bundled changesets could lead to a crash. We now ensure we operate on
an unfiltered repository.
2016-02-22 23:34:54 +01:00
Pierre-Yves David
7443f70cc2 bundlerepo: properly handle hidden linkrev in filelog (issue4945)
The bundlerepository has to do some special magic to handle linkrev of the
bundlerepo filerev. That logic was done from a repoview, and obsolescence markers
affecting bundled changesets could lead to a crash. We now ensure we operate on
an unfiltered repository.
2016-02-22 18:35:40 +01:00
Martin von Zweigbergk
86ca76bafe changegroup: fix pulling to treemanifest repo from flat repo (issue5066)
In b89de5ee5b31 (changegroup: don't support versions 01 and 02 with
treemanifests, 2016-01-19), I stopped supporting use of cg1 and cg2
with treemanifest repos. What I had not considered was that it's
perfectly safe to pull *to* a treemanifest repo using any changegroup
version. As reported in issue5066, I therefore broke pull from old
repos into a treemanifest repo. It was not covered by the test case,
because that pulled from a local repo while enabling treemanifests,
which enabled treemanifests on the source repo as well. After
switching to pulling via HTTP, it breaks.

Fix by splitting up changegroup.supportedversions() into
supportedincomingversions() and supportedoutgoingversions().
2016-01-27 09:07:28 -08:00
Bryan O'Sullivan
a2111cb180 bundlerepo: use context manager for file I/O in _writetempbundle 2016-01-12 14:48:27 -08:00
Martin von Zweigbergk
87d65b1188 changegroup3: add empty chunk separating directories and files
Remotefilelog overrides changegroup._addchangegroupfiles(), assuming
it is about files, which seems like a natural assumption. However, in
changegroup3, directory manifests are sent in the files section of the
changegroup. These naturally make remotefilelog unhappy.

The fact that the directories are not separated from the files
(although they do come before the files) also makes server.validate
harder to implement. Since we read one chunk at a time from the stream,
once we have found a file (non-directory) entry in the stream, we
would have to push the read data back into the stream, or otherwise
refactor the code. It will be easier if we add an empty chunk after
all directory manifests.

This change adds that empty chunk, although we don't yet take
advantage of it on the reading side. We will soon move the tree
manifest stuff out of _addchangegroupfiles() and into
_unpackmanifests().
2016-01-11 15:10:31 -08:00
Martin von Zweigbergk
e5bd6473b3 changegroup: hide packermap behind methods
This is to prepare for hiding changegroup3 behind a config option.
2016-01-12 21:01:06 -08:00
Pierre-Yves David
c5425dd70d bundlerepo: properly extract compressed changegroup from bundle2
Before this, bundle repositories were unable to work with compressed
bundle2. We use the same approach as with bundle1: we extract the
changegroup in uncompressed form into a temporary file.
2015-10-19 16:01:55 +02:00
Pierre-Yves David
a23eeef4af bundlerepo: uncompress changegroup in bundle1 case only
Uncompressing bundle2 needs to be handled differently.
2015-10-19 18:04:08 +02:00
Pierre-Yves David
d47e68c4cf bundlerepo: move temp-bundle writing logic into a closure
We will reuse this logic for bundle2
2015-10-19 17:58:04 +02:00
Pierre-Yves David
30913031d4 error: get Abort from 'error' instead of 'util'
The home of 'Abort' is 'error', not 'util'. However, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.

For great justice.
2015-10-08 12:55:45 -07:00
Pierre-Yves David
3ff87b1cf4 incoming: request a bundle2 when possible (BC)
Incoming was using bundle1 in all cases. As bundle1 is restricted to
changegroup1 and does not support general delta, this can lead to significant
CPU overhead if the server is using general delta storage. We now properly
request and store a bundle2 to disk.

If the server includes any output or error in the bundle, they will be stored on
disk and replayed when the bundle is read. As 'hg incoming' is going to read the
bundle right away, we call that 'good' enough and go back to the bigger plan of
having general delta on by default.

This was tracked as 4864
2015-10-05 00:23:20 -07:00
Pierre-Yves David
1bff95dea7 bundlerepo: indent some code to prepare next patch
We are about to add a new condition. Code is indented in a separate patch for
readability.
2015-10-05 00:18:11 -07:00
Durham Goode
291911d02c bundlerepo: let bundle repo look in the _mancache
When looking up a base revision, we were ignoring the contents that were already
available in the manifest's _mancache. This patch allows us to use that data
instead of reading from the revlog.

This is useful in our pushrebase extension (which allows rebasing on the server
side during a push) because it allows us to prefetch the bundle base manifest
before acquiring the repo lock (1 second saving), which means doing less work
inside the lock, which means a 20% higher commit rate.
2015-09-28 10:27:36 -07:00
Gregory Szorc
e1217593c5 bundlerepo: use absolute_import 2015-08-08 00:36:35 -07:00
Matt Mackall
7c02d2b4b4 bundlerepo: mark internal-only config variable 2015-06-25 17:43:24 -05:00
Martin von Zweigbergk
6b05ffe8de bundlerepo: remove unused 'repo' parameter
Revision d6a8b4e28635 (filelog: add file function to open other
filelogs, 2011-05-10) added a _file() method to revlog, which also
required a 'repo' parameter to be added to bundlefilelog's
constructor. The _file() method was then removed in 55fa3c487b0f
(filelog: remove unused _file method, 2015-01-22), which made the
constructor parameter unused, so let's remove that too.
2015-05-03 14:18:32 -07:00
Yuya Nishihara
dcadb4da71 bundlerepo: disable filtering of changelog while constructing revision text
This avoids the following error that happened if base revision of bundle file
was hidden. bundlerevlog needs it to construct revision texts from bundle
content as revlog.revision() does.

  File "mercurial/context.py", line 485, in _changeset
    return self._repo.changelog.read(self.rev())
  File "mercurial/changelog.py", line 319, in read
    text = self.revision(node)
  File "mercurial/bundlerepo.py", line 124, in revision
    text = self.baserevision(iterrev)
  File "mercurial/bundlerepo.py", line 160, in baserevision
    return changelog.changelog.revision(self, nodeorrev)
  File "mercurial/revlog.py", line 1041, in revision
    node = self.node(rev)
  File "mercurial/changelog.py", line 211, in node
    raise error.FilteredIndexError(rev)
  mercurial.error.FilteredIndexError: 1
2015-04-29 19:47:37 +09:00
FUJIWARA Katsunori
8019b19ea2 bundlerepo: use pathutil.normasprefix to ensure os.sep at the end of cwd
Since Python 2.7.9, "os.path.join(path, '')" doesn't add "os.sep" at
the end of UNC path (see issue4557 for detail).

This makes bundlerepo work incorrectly, if:

  1. cwd is the root of UNC share (e.g. "\\host\share"), and
  2. mainreporoot is near cwd (e.g. "\\host\sharefoo\repo")
     - host of UNC path is same as one of cwd
     - share of UNC path starts with one of cwd
  3. "repopath" isn't specified in bundle URI
     (e.g. "bundle:bundlefile" or just "bundlefile")

For example:

  $ hg --cwd \\host\share -R \\host\sharefoo\repo incoming bundle

In this case:

  - os.path.join(r"\\host\share", "") returns r"\\host\share",
  - r"\\host\sharefoo\repo".startswith(r"\\host\share") returns True, then
  - r"foo\repo" is treated as repopath of bundlerepo instead of
    r"\\host\sharefoo\repo"

This causes failure of combining "\\host\sharefoo\repo" and bundle
file: in addition to it, "\\host\share\foo\repo" may be combined with
bundle file, if it accidentally exists.

This patch uses "pathutil.normasprefix()" to ensure "os.sep" at the
end of cwd safely, even with some problematic encodings, which use
0x5c (= "os.sep" on Windows) as the tail byte of some multi-byte
characters.

BTW, normalization before "pathutil.normasprefix()" isn't needed in
this case, because "os.getcwd()" always returns normalized one.
2015-04-22 23:38:55 +09:00
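Editorial note on 8019b19ea2: the prefix problem can be reproduced with plain string operations on any platform. The normasprefix() below is a simplified, hypothetical stand-in that only appends a separator. The real pathutil.normasprefix() also has to cope with the multi-byte encodings mentioned in the message, which this sketch ignores.

```python
def normasprefix(path, sep="\\"):
    """Simplified stand-in: make sure the path ends with exactly one separator."""
    return path if path.endswith(sep) else path + sep


cwd = r"\\host\share"
sibling = r"\\host\sharefoo\repo"    # a different share with a similar name
child = r"\\host\share\foo\repo"     # genuinely below cwd

naive = sibling.startswith(cwd)                     # wrongly matches
safe = sibling.startswith(normasprefix(cwd))        # trailing separator prevents it
child_ok = child.startswith(normasprefix(cwd))      # real children still match
```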
Pierre-Yves David
af7d20b000 bundle2: rename format, parts and config to final names
It is finally time to freeze the bundle2 format! To do so we:
- rename HG2Y to HG20,
- drop "b2x:" prefix from all part names,
- rename capability from "bundle2-exp" to "bundle2"
- rename the hook flag from 'bundle2-exp' to 'bundle2'
2015-04-09 16:25:48 -04:00
Jordi Gutiérrez Hermoso
8eb132f5ea style: kill ersatz if-else ternary operators
Although Python supports `X = Y if COND else Z`, this was only
introduced in Python 2.5. Since we have to support Python 2.4, it was
a very common thing to write instead `X = COND and Y or Z`, which is a
bit obscure at a glance. It requires some intricate knowledge of
Python to understand how to parse these one-liners.

We change instead all of these one-liners to 4-liners. This was
executed with the following perlism:

    find -name "*.py" -exec perl -pi -e 's,(\s*)([\.\w]+) = \(?(\S+)\s+and\s+(\S*)\)?\s+or\s+(\S*)$,$1if $3:\n$1    $2 = $4\n$1else:\n$1    $2 = $5,' {} \;

I tweaked the following cases from the automatic Perl output:

    prev = (parents and parents[0]) or nullid
    port = (use_ssl and 443 or 80)
    cwd = (pats and repo.getcwd()) or ''
    rename = fctx and webutil.renamelink(fctx) or []
    ctx = fctx and fctx or ctx
    self.base = (mapfile and os.path.dirname(mapfile)) or ''

I also added some newlines wherever they seemed appropriate for readability.

There are probably a few ersatz ternary operators still in the code
somewhere, lurking away from the power of a simple regex.
2015-03-13 17:00:06 -04:00
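Editorial note on 8eb132f5ea: the commit message objects to the `COND and Y or Z` idiom on readability grounds. A separate, well-known caveat not mentioned there is that the idiom silently picks Z whenever Y is falsy. The sketch below compares both forms; the function names and values are invented for the example.

```python
def ersatz(cond, yes, no):
    # Old Python 2.4-era idiom: short-circuit operators used as a ternary.
    return cond and yes or no


def explicit(cond, yes, no):
    # The 4-line form the commit converts to.
    if cond:
        result = yes
    else:
        result = no
    return result


agree = ersatz(True, 443, 80) == explicit(True, 443, 80)
ersatz_falsy = ersatz(True, 0, 80)       # falls through to 80 because 0 is falsy
explicit_falsy = explicit(True, 0, 80)   # correctly keeps 0
```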