Commit Graph

410 Commits

Author SHA1 Message Date
Pulkit Goyal
591b6e9fb9 py3: use pycompat.strkwargs() to convert kwargs keys to str before passing 2017-06-17 15:05:11 +05:30
Martin von Zweigbergk
ad7cafdd66 exchange: switch to usual way of testing for bundle2-ness
We used safehasattr() in one place, but we use isinstance() for this
everywhere else, so switch to the latter.
2017-06-16 22:57:31 -07:00
Martin von Zweigbergk
76831e82ba exchange: use context manager for bundle1 unbundling
The lazy locking is not used for bundle1, so using a regular context
manager is clearer.
2017-06-15 22:57:20 -07:00
Martin von Zweigbergk
c8f8b266e4 clonebundle: use context managers for lock and transaction 2017-06-15 17:00:32 -07:00
Pierre-Yves David
a74c131fb4 push: add a way to allow concurrent pushes on unrelated heads
The client has a mechanism for the server to check that nothing changed server side
since the client prepared a push. That check is wide, and any head changed on
the server will lead to an aborted push. We introduce a way for the client to
send a less strict check. That logic will check that no heads impacted by
the push have been affected. If other unrelated heads (including named branch
heads) have been affected, the push will proceed.

This is very helpful for repositories with high developer traffic on different
heads, a common setup.

That behavior is currently controlled by an experimental option. The config
should live in the "server" section but bike-shedding of the name will happen
in the next changesets. Servers advertise this capability through a new bundle2
capability 'checkheads', using the value 'related'.

The 'test-push-race.t' test is updated to check the new capability on the
documented cases.
2017-05-29 05:53:58 +02:00
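The difference between the wide check and the less strict one described in this commit can be sketched as follows. This is a minimal illustration, not Mercurial's implementation: the function names and the short head identifiers are hypothetical, and real heads are changeset nodes.

```python
def strict_check(expected_heads, server_heads):
    """Wide check: any change to the server's heads aborts the push."""
    return set(expected_heads) == set(server_heads)


def related_check(affected_heads, server_heads):
    """Narrow check: only the heads the push touches must be unchanged."""
    current = set(server_heads)
    return all(head in current for head in affected_heads)


# Heads the client saw when it prepared the push (made-up identifiers).
seen = ['a1', 'b1']
# Meanwhile someone else moved the unrelated head b1 to b2.
now = ['a1', 'b2']

print(strict_check(seen, now))     # False: the wide check aborts
print(related_check(['a1'], now))  # True: a push on top of a1 may proceed
print(related_check(['b1'], now))  # False: a push on top of b1 still aborts
```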
Augie Fackler
9a722d4382 merge with stable 2017-06-03 16:33:28 -04:00
Gregory Szorc
3156c627f0 exchange: print full reason variable
This commit essentially reverts 62ad9c1dbce9.

urllib2.URLError receives a "reason" argument. It isn't always a
tuple. Mozilla has experienced at least IndexError failures due
to the reason[1] access.
https://bugzilla.mozilla.org/show_bug.cgi?id=1364687
2017-05-24 15:25:24 -07:00
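A small sketch of why printing the whole reason is safer than indexing into it. It uses `urllib.error.URLError`, the Python 3 home of the `urllib2.URLError` class named in the commit; the function name and message text are placeholders, not Mercurial's.

```python
from urllib.error import URLError


def describe_failure(err):
    # Wrap the reason in a one-element tuple so "%s" formats it whole,
    # whether it is a string, a tuple, or another exception object.
    # Indexing with err.reason[1] would raise IndexError on short values.
    return 'error fetching bundle: %s' % (err.reason,)


print(describe_failure(URLError('timed out')))
print(describe_failure(URLError((61, 'Connection refused'))))
```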
Augie Fackler
b35d966e25 merge with stable 2017-03-18 12:27:52 -04:00
Gregory Szorc
aff9286eb0 exchange: use v2 bundles for modern compression engines (issue5506)
Previously, `hg bundle zstd` on a non-generaldelta repo would
attempt to use a v1 bundle. This would fail because zstd is not
supported on v1 bundles.

This patch changes the behavior to automatically use a v2 bundle
when the user explicitly requests a bundlespec that is a compression
engine not supported on v1. If the bundlespec is <engine>-v1, it is
still explicitly rejected because that request cannot be fulfilled.
2017-03-16 12:33:15 -07:00
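The version selection described in this commit and the one before it might look roughly like the sketch below. The set of v1-capable engines, the function name, and the `generaldelta` flag are assumptions made for illustration; the real parsing lives in exchange.py and handles more cases.

```python
# Assumed set of engines that work in v1 bundles (illustrative only).
V1_ENGINES = frozenset(['none', 'gzip', 'bzip2'])


def choose_bundle_version(spec, generaldelta=False):
    """Return (compression, version) for a bundlespec such as 'zstd' or 'gzip-v1'."""
    if '-' in spec:
        compression, version = spec.split('-', 1)
        if version == 'v1' and compression not in V1_ENGINES:
            # An explicit <engine>-v1 request cannot be fulfilled.
            raise ValueError('%s is not supported on v1 bundles' % compression)
        return compression, version
    # Bare engine name: fall back to v2 when v1 cannot carry the engine.
    if generaldelta or spec not in V1_ENGINES:
        return spec, 'v2'
    return spec, 'v1'


print(choose_bundle_version('zstd'))   # ('zstd', 'v2')
print(choose_bundle_version('gzip'))   # ('gzip', 'v1')
```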
Gregory Szorc
3bbb6e2c52 exchange: reject new compression engines for v1 bundles (issue5506)
Version 1 bundles only support a fixed set of compression engines.
Before this change, we would accept any compression engine for v1
bundles, even those that may not work on v1. This could lead to
an error.

We define a fixed set of compression engines known to work with v1
bundles and we add checking to ensure a newer engine (like zstd)
won't work with v1 bundles.

I also took the liberty of adding test coverage for unknown compression
names because I noticed we didn't have coverage of it before.
2017-03-16 12:23:56 -07:00
Pierre-Yves David
695fa85daa getbundle: cleanly handle remote abort during getbundle
bundle2 allows the server to report errors explicitly. This was initially
implemented for push, but there is no reason not to use it for pull too. This
changeset adds logic similar to the one in 'unbundle' to the
client side of 'getbundle'. That logic makes sure the error is properly reported
as "remote". This will allow the server side of getbundle to send a clean "Abort"
message in the next changeset.
2017-02-10 18:17:20 +01:00
Pierre-Yves David
e8a7ecc281 bundle2: keep hint close to the primary message when remote abort
The remote hint message was ignored when reporting the remote error and
passed to the local generic abort error. I think I might initially have
tried to avoid reimplementing logic controlling the hint display depending on
the verbosity level. However, first, there does not seem to be any such
verbosity-related logic, and second, the result was wrong as the primary error
and the hint were split apart. We now properly print the hint as remote output.
2017-02-10 17:56:47 +01:00
Gregory Szorc
22633256ca exchange: use rich class for sorting clone bundle entries
Python 3 removed the "cmp" argument from sorted(). Custom sorting in
Python 3 must be implemented with the dunder comparison methods on
types and/or with a "key" function.

This patch converts our custom "cmp" function to a custom type.

The implementation is very similar to functools.cmp_to_key(). However,
cmp_to_key() doesn't exist in Python 2, so we can't use it.

This was the only use of the "cmp" argument to sorted() in the code
base.
2016-12-26 12:11:29 -07:00
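The conversion from a "cmp" function to a comparable type can be illustrated with a toy comparison. The names below are hypothetical and the comparison (by length) is only a stand-in for the clone bundle ordering; the point is that sorted() needs nothing beyond the dunder methods.

```python
def compare_length(a, b):
    """Old-style cmp function: negative, zero, or positive."""
    return (len(a) > len(b)) - (len(a) < len(b))


class LengthOrdered(object):
    """Wrapper whose rich comparisons delegate to compare_length()."""

    def __init__(self, value):
        self.value = value

    def _cmp(self, other):
        return compare_length(self.value, other.value)

    def __lt__(self, other):
        return self._cmp(other) < 0

    def __gt__(self, other):
        return self._cmp(other) > 0

    def __eq__(self, other):
        return self._cmp(other) == 0

    def __le__(self, other):
        return self._cmp(other) <= 0

    def __ge__(self, other):
        return self._cmp(other) >= 0

    def __ne__(self, other):
        return self._cmp(other) != 0


def sort_by_length(items):
    # Works on both Python 2 and 3: no "cmp" argument is passed to sorted().
    return [w.value for w in sorted(LengthOrdered(i) for i in items)]


print(sort_by_length(['ccc', 'a', 'bb']))  # ['a', 'bb', 'ccc']
```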
Stanislau Hlebik
420d75485a bookmarks: make bookmarks.comparebookmarks accept binary nodes (API)
Binary bookmark format should be used internally. It doesn't make sense to have
optional parameters `srchex` and `dsthex`. This patch removes them. It will
also be useful for `bookmarks` bundle2 part because unnecessary conversions
between hex and bin nodes will be avoided.
2016-12-09 03:22:26 -08:00
Stanislau Hlebik
420e1ab2a8 bookmarks: rename compare() to comparebookmarks() (API)
Next commit will remove optional parameters from `compare()` function.
Let's rename `compare()` to `comparebookmarks()` to avoid ambiguity for
callers in external extensions.
2016-11-22 01:33:31 -08:00
Stanislau Hlebik
7dd985ce79 exchange: add _getbookmarks() function
This function will be used to generate bookmarks bundle2 part.
It is a separate function in order to make it easy to overwrite it
in extensions. Passing `kwargs` to the function makes it easy to
add new parameters in extensions.
2016-11-17 00:59:41 -08:00
Pierre-Yves David
ffd8983573 bundle2: move function building obsmarker-part in the bundle2 module
We move it next to similar part building functions. We will need it for the
"writenewbundle" logic. This will allow us to easily include obsmarkers in
on-disk bundle, a necessary step before having `hg strip` also operate on
markers.

(Yes, the bundle2 module was already too large, but there are many
interdependencies between its components so it is non-trivial to split, this is
a quest for another adventure.)
2017-05-28 11:48:18 -07:00
Martin von Zweigbergk
c3406ac3db cleanup: use set literals
We no longer support Python 2.6, so we can now use set literals.
2017-02-10 16:56:29 -08:00
Durham Goode
a73dbb6c8d changegroup: add bundlecaps back
Commit 9233182ea547d0aa removed the unused bundlecaps argument from the
changegroup code. While it is unused in core Mercurial, it was an important
feature for the remotefilelog extension because it allowed the exchange layer to
communicate to the changegroup packer that this was a shallow repo and that
filelogs should not be included. Without bundlecaps, there is currently no other
way to pass that information along without a more extensive refactor of
exchange, bundle, and changegroup code.

This patch backs out the original removal, and merges it with some recent
changes to changegroup apis.
2017-05-15 09:35:27 -07:00
Siddharth Agarwal
ade0695cf9 bundle2: don't check for whether we can do stream clones
At the moment this isn't used and all stream clones use the legacy protocol.

In an upcoming diff, canperformstreamclone will print out a message if a stream
clone was requested but couldn't happen for some reason. Removing this call
ensures the message isn't printed twice.
2017-05-08 17:30:51 -07:00
Pierre-Yves David
a8c6292f9c bundle2: move tagsfnodecache generation in a generic function
This will help us reusing the logic for `hg bundle`.
2017-05-05 17:28:52 +02:00
Yuya Nishihara
ab046506ef base85: proxy through util module
I'm going to replace hgimporter with a simpler import function, so we can
access pure/cext modules by name:

  # util.py
  base85 = policy.importmod('base85')  # select pure.base85 or cext.base85

  # cffi/base85.py
  from ..pure.base85 import *  # may re-export pure.base85 functions

This means we'll have to use policy.importmod() function in place of the
standard import statement, but we wouldn't want to write it every place where
C extension modules are used. So this patch makes util host base85 functions.
2017-04-26 21:56:47 +09:00
Pierre-Yves David
01f1227adf exchange: directly 'getchangegroup'
It is identical to 'getlocalchangegroup' with a shorter name.
2017-05-04 12:41:36 +02:00
Martin von Zweigbergk
feebdd58e5 changegroup: delete unused 'bundlecaps' argument (API) 2017-05-02 23:47:10 -07:00
Martin von Zweigbergk
2b641873b5 merge with stable 2017-02-13 09:44:16 -08:00
Pierre-Yves David
d10a1ec092 unbundle: swap conditional branches for clarity
This is a small style update for clarity. The previous situation was:

  if foo:
    50 lines
  else:
    2 lines

In such case I tend to invert these to get the simpler branch out of the way
earlier:

  if not foo:
    2 lines
  else:
    50 lines

This makes the conditional and various alternatives fit on the same screen,
simpler to read overall.
2017-02-02 10:53:55 +01:00
Pierre-Yves David
354f9b4dae unbundle: add a small comment to tag the bundle1 case as such
This makes the code clearer to understand for someone new to it (or rusty).
2017-02-02 10:55:38 +01:00
Pierre-Yves David
433e331144 unbundle: add a small comment to clarify the 'check_heads' call
Bundle2 has its own mechanisms to check for heads (and other) changes, so push
using bundle2 is relying on the "check:heads" bundle part of unbundle and the
'check_heads' call is not checking anything. We add a small comment to make
this clearer.
2017-02-02 10:51:04 +01:00
Gregory Szorc
2b67e3476a exchange: obtain compression engines from the registrar
util.compengines has knowledge of all registered compression engines
and the metadata that associates them with various bundle types.

This patch removes the now redundant declaration of this metadata from
exchange.py and obtains it from the new source.

The effect of this patch is that once a new compression engine is
registered with util.compengines, `hg bundle -t <engine>` will just
work.
2016-11-10 23:34:15 -08:00
Mads Kiilerich
38cb771268 spelling: fixes of non-dictionary words 2016-10-17 23:16:55 +02:00
Gregory Szorc
26f6f03d4c exchange: refactor APIs to obtain bundle data (API)
Currently, exchange.getbundle() returns either a cg1unpacker or a
util.chunkbuffer (in the case of bundle2). This is kinda OK, as
both expose a .read() to consumers. However, localpeer.getbundle()
has code inferring what the response type is based on arguments and
converts the util.chunkbuffer returned in the bundle2 case to a
bundle2.unbundle20 instance. This is a sign that the API for
exchange.getbundle() is not ideal because it doesn't consistently
return an "unbundler" instance.

In addition, unbundlers mask the fact that there is an underlying
generator of changegroup data. In both cg1 and bundle2, this generator
is being fed into a util.chunkbuffer so it can be re-exposed as a
file object.

util.chunkbuffer is a nice abstraction. However, it should only be
used "at the edges." This is because keeping data as a generator is
more efficient than converting it to a chunkbuffer, especially if we
convert that chunkbuffer back to a generator (as is the case in some
code paths currently).

This patch refactors exchange.getbundle() into
exchange.getbundlechunks(). The new API returns an iterator of chunks
instead of a file-like object.

Callers of exchange.getbundle() have been updated to use the new API.

There is a minor change of behavior in test-getbundle.t. This is
because `hg debuggetbundle` isn't defining bundlecaps. As a result,
a cg1 data stream and unpacker is being produced. This is getting fed
into a new bundle20 instance via bundle2.writebundle(), which uses
a backchannel mechanism between changegroup generation to add the
"nbchanges" part parameter. I never liked this backchannel mechanism
and I plan to remove it someday. `hg bundle` still produces the
"nbchanges" part parameter, so there should be no user-visible
change of behavior. I consider this "regression" a bug in
`hg debuggetbundle`. And that bug is captured by an existing
"TODO" in the code to use bundle2 capabilities.
2016-10-16 10:38:52 -07:00
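The "generator in the middle, file object at the edges" idea from this commit can be sketched like this. The function names and the tiny chunk size are illustrative assumptions, and `io.BytesIO` stands in for util.chunkbuffer; unlike a chunkbuffer it joins every chunk in memory, which is acceptable only for a toy example.

```python
import io


def iter_chunks(payload, chunksize=4):
    """Yield the payload piece by piece, as a chunk-producing API would."""
    for start in range(0, len(payload), chunksize):
        yield payload[start:start + chunksize]


def as_file(chunks):
    """Convert to a file-like object only where a consumer needs .read()."""
    return io.BytesIO(b''.join(chunks))


# Internal code can keep passing the generator around unchanged...
chunks = iter_chunks(b'abcdefghij')
# ...and only the final consumer pays for the file-like wrapper.
fh = as_file(chunks)
print(fh.read(3))  # b'abc'
```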
Pierre-Yves David
8b00f799de pull: grab wlock during pull
Because pull might move bookmarks, and bookmarks are protected by wlock, we have
to grab wlock for pull :-(

This required a small upgrade of the 'lockdelay' extension used by
'test-clone.t' because the delay must apply to a single lock only.
2016-08-23 23:47:59 +02:00
Pierre-Yves David
7775b9bfe7 computeoutgoing: move the function from 'changegroup' to 'exchange'
Now that all users are in exchange, we can safely move the code into the
'exchange' module. This function is really about processing the argument of a
'getbundle' call, so it even makes sense to do so.
2016-08-09 17:06:35 +02:00
Pierre-Yves David
1b40b7e1c5 getchangegroup: take an 'outgoing' object as argument (API)
There are various versions of this function that differ mostly by the way they
define the bundled set. The flexibility is now available in the outgoing object
itself, so we move the complexity into the callers themselves. This will allow us
to remove a good share of the similar functions to obtain a changegroup in the
'changegroup.py' module.

An important side effect is that we stop calling 'computeoutgoing' in
'getchangegroup'. This is fine as code that needs such argument processing
is actually going through the 'exchange' module, which already calls this function
itself.
2016-08-09 17:00:38 +02:00
Augie Fackler
cb268cbd2f merge with stable 2016-08-15 12:26:02 -04:00
Augie Fackler
97b8f423b9 exchange: correctly specify url to unbundle (issue5145)
This parameter is slightly confusingly named in wireproto, so it got
mis-specified from the start as 'push' instead of the URL to which we
are pushing. Sigh. I've got a patch for that which I'll mail
separately since it's not really appropriate for stable.

Fixes a regression in bundle2 from bundle1.
2016-08-05 16:25:15 -04:00
liscju
c7ec9d159e i18n: translate abort messages
I found a few places where the message given to abort is
not translated. I don't see any reason not to translate
them.
2016-06-14 11:53:55 +02:00
liscju
767e27633e bookmarks: add 'hg pull -B .' for pulling the active bookmark (issue5258) 2016-06-01 22:58:57 +02:00
Augie Fackler
ad67b99d20 cleanup: replace uses of util.(md5|sha1|sha256|sha512) with hashlib.\1
All versions of Python we support or hope to support make the hash
functions available in the same way under the same name, so we may as
well drop the util forwards.
2016-06-10 00:12:33 -04:00
Mike Hommey
50e9c3bb84 bundle2: properly request phases during getbundle
getbundle was requesting the "phase" namespace instead of the "phases"
namespace, which led to the client still requesting the phases
separately after getbundle finished.
2016-05-05 20:57:38 +09:00
Pierre-Yves David
000dd50a40 bundle2: remove 'experimental.bundle2-exp' boolean config (BC)
All users are migrated to 'devel.legacy.exchange', we can clean up the
experimental namespace.

Marking as (BC) because I know some large installations have bundle2 off and I
want to make sure they notice the change.
2016-08-03 16:23:26 +02:00
Pierre-Yves David
d98d072a06 bundle2: add a devel option controlling bundle version used for exchange
We need an official way to force bundle1 to be used in tests. We introduce a new
option 'devel.legacy.exchange' to control this. When specified, this option
will control the list of bundle versions Mercurial considers when exchanging with
a peer. Currently valid values are 'bundle1' and 'bundle2'.

Using this option in all tests will allow us to remove the
'experimental.bundle2-exp' option. We will simplify the code once the
experimental option is dropped.
2016-08-02 14:48:21 +02:00
Pierre-Yves David
732a3dd97c bundle2: rename the _canusebundle2 method to _forcebundle1
We rename and invert the logic of the _canusebundle2 utility. The idea here is
that we need to have a way to enforce the use of bundle1 in the tests. The
Mercurial philosophy is to try to use the best method available. Currently that
best method is bundle2, but this might change in the future. Therefore expressing
"do not use bundle2" is a lousy way to say "use bundle1" and will likely create
issues in the future. As the config option will be explicitly about bundle1, we
rename the function beforehand to align with this. This will make the life of a
future developer working on bundle3 easier.
2016-08-03 15:01:23 +02:00
timeless
109fcbc79e pycompat: switch to util.urlreq/util.urlerr for py3 compat 2016-04-06 23:22:12 +00:00
Mads Kiilerich
940d175900 localrepo: refactor prepushoutgoinghook to take a pushop
prepushoutgoinghook was introduced in 8dfcd476a7f7 and largefiles is the only
in-tree use of it. Refactor it to be more useful for other use cases in
largefiles.
2016-04-13 01:09:11 +02:00
Martin von Zweigbergk
41fe4530a2 exchange: make _pushb2ctx() look more like _getbundlechangegrouppart()
The functions already have a lot in common, but were structured a
little differently.
2016-03-25 16:13:28 -07:00
Martin von Zweigbergk
5bf2894e26 exchange: get rid of "getcgkwargs" variable
This also makes the "version" argument explicit (never relies on
getlocalchangegroupraw()'s default), which I think is a good thing.
2016-03-25 16:01:40 -07:00
Nathan Goldbaum
c0c761b27a pushoperation: fix language issues in docstring 2016-03-10 17:31:38 -06:00
liscju
533e0cc9bc bookmarks: add 'hg push -B .' for pushing the active bookmark (issue4917) 2016-02-19 22:28:09 +01:00
Martin von Zweigbergk
86ca76bafe changegroup: fix pulling to treemanifest repo from flat repo (issue5066)
In b89de5ee5b31 (changegroup: don't support versions 01 and 02 with
treemanifests, 2016-01-19), I stopped supporting use of cg1 and cg2
with treemanifest repos. What I had not considered was that it's
perfectly safe to pull *to* a treemanifest repo using any changegroup
version. As reported in issue5066, I therefore broke pull from old
repos into a treemanifest repo. It was not covered by the test case,
because that pulled from a local repo while enabling treemanifests,
which enabled treemanifests on the source repo as well. After
switching to pulling via HTTP, it breaks.

Fix by splitting up changegroup.supportedversions() into
supportedincomingversions() and supportedoutgoingversions().
2016-01-27 09:07:28 -08:00
Martin von Zweigbergk
d1531da666 exchange: set 'treemanifest' param on pushed changegroups too
In 7a1ccfe03f74 (treemanifests: set bundle2 part parameter indicating
treemanifest, 2016-01-08), I didn't realize I had to set the parameter
separately for getbundle and unbundle. Having the parameter there on
push allows us to push to an empty repo and have the requirements
updated correctly.
2016-01-22 16:31:50 -08:00
Gregory Szorc
6a6f7ee7dc exchange: implement function for inferring bundle specification
We don't currently have a mechanism for inferring bundle spec strings
from bundle files. This patch adds one.

This will eventually be used to make producing clone bundle
manifests easier.
2016-01-14 22:49:03 -08:00
Martin von Zweigbergk
e5bd6473b3 changegroup: hide packermap behind methods
This is to prepare for hiding changegroup3 behind a config option.
2016-01-12 21:01:06 -08:00
Gregory Szorc
5732d02b64 exchange: make clone bundles non-experimental and enabled by default
The clone bundles feature was introduced in Mercurial 3.6 behind an
experimental and disabled by default flag. The feature has been enabled
on hg.mozilla.org for a few months and has served many terabytes of
clones. Users have been encouraged to use the feature and reception
has been very positive (mainly due to faster clones as a result of
connecting to a CDN). I have heard no feedback about changing the
feature other than inquiries about when it will be enabled by default.
So, I think the feature is ready to be enabled by default.

This patch renames experimental.clonebundles to ui.clonebundles,
documents the option, and enables it by default. References to the
experimental state of clone bundles have been removed. The remaining
config option docs in clonebundles.py have been removed because they
are redundant with `hg help config`.

There are some oddities with behavior of clone bundles. Because clones
with clone bundles are effectively 2 `hg pull` operations, there may be
2 transactions. This could result in hooks running twice. If the
subsequent pull is aborted, it could result in partial rollback and an
incomplete clone. This behavior is a bit wonky and should probably
be documented. If this patch is accepted, I'll send a follow-up to
document it. I don't think this behavior should prevent the feature
being enabled by default. Reworking the clone mechanism to support
interrupted or multi-part clones feels like a major new feature and
something that when implemented can change the hook and rollback
semantics of clone bundles. Besides, partial clone is better than
full rollback and hooks running on initial clone are likely rare, so I
think the impact is minimal.
2016-01-08 10:58:04 -08:00
Gregory Szorc
0436f8dd18 exchange: make clonebundleprefers non-experimental
In preparation for making the feature enabled by default.
2016-01-08 10:57:01 -08:00
Martin von Zweigbergk
417363259e treemanifests: set bundle2 part parameter indicating treemanifest
By adding a mandatory 'treemanifest' parameter in the bundle2 part, we
make it possible for the recipient to set repo requirements before the
manifest revlog is accessed.
2016-01-08 21:13:06 -08:00
Matt Mackall
a323d9f732 pull: make a single call to obsstore.add (issue5006)
Prior to this, a pull of 90k markers (already known locally!) was
making about 2000 calls to obsstore.add, which was repeatedly building
a full set of known markers (in addition to other transaction
overhead). This quadratic behavior accounted for about 50 seconds of a
70 second no-op pull. After this change, we're down to 20 seconds.

While it would seem simplest to just cache the known set for
obsstore.add, this would also introduce issues of correct cache invalidation.
The extra pointless transaction overhead would also remain.
2015-12-18 13:53:50 -06:00
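The cost pattern described in this commit can be reproduced with a toy store. The class and function names are hypothetical and the store is far simpler than the real obsstore; the sketch only counts how often the set of known markers is rebuilt when markers arrive in many small calls versus one call.

```python
class MarkerStore(object):
    """Toy stand-in for an obsolescence marker store."""

    def __init__(self):
        self.markers = []
        self.setbuilds = 0  # how many times the known set was rebuilt

    def add(self, newmarkers):
        # Rebuilding the full known set on every call is what makes many
        # small calls quadratic in the number of stored markers.
        known = set(self.markers)
        self.setbuilds += 1
        added = 0
        for marker in newmarkers:
            if marker not in known:
                known.add(marker)
                self.markers.append(marker)
                added += 1
        return added


def pull_in_batches(store, batches):
    for batch in batches:
        store.add(batch)


def pull_at_once(store, batches):
    # Collect everything first, then make a single call.
    store.add([m for batch in batches for m in batch])


batches = [[1, 2], [2, 3], [3, 4]]
slow, fast = MarkerStore(), MarkerStore()
pull_in_batches(slow, batches)
pull_at_once(fast, batches)
print(slow.setbuilds, fast.setbuilds)  # 3 1
```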
Gregory Szorc
9859c8fd09 exchange: use absolute_import 2015-12-23 12:32:08 -08:00
Gregory Szorc
e4e7479111 exchange: standalone function to determine if bundle2 is requested
This will be used in a subsequent patch.
2015-12-04 13:31:01 -08:00
Ryan McElroy
f7b3bc72e8 exchange: pass pushop to discovery.checkheads
Previously, we passed a bunch of parameters to discovery.checkheads, but all
of the arguments can be fetched out of pushop, which may contain a lot more
useful information for extensions now that pushop is extensible.
2015-11-10 11:13:21 -08:00
Gregory Szorc
9a4b9852b5 exchange: do not attempt clone bundle if local repo is non-empty (issue4932) 2015-11-03 12:16:54 -08:00
Gregory Szorc
059667a51c bundle2: attribute remote failures to remote (issue4788)
Before bundle2, hook output from hook failures was prefixed with
"remote: ". Up to this point with bundle2, the output was converted to
the message to print in an Abort exception. This had 2 implications:

1) It was unclear whether an error message came from the local repo
   or the remote
2) The exit code changed from 1 to 255

This patch changes the handling of error:abort bundle2 parts during push
to prefix the error message with "remote: ". This restores the old
behavior.

We still preserve the behavior of raising an Abort during bundle2
application failure. This is a regression from pre-bundle2 because the
exit code changed.

Because we no longer raise an Abort with the remote's message, we needed
to insert a message for the new Abort. So, I invented a new error
message for that. This is another change from pre-bundle2. However, I
like the new error message because it states unambiguously who aborted
the push, which I think is important for users so they can decide
what's next.
2015-10-24 00:39:22 +01:00
Mads Kiilerich
09567db49a spelling: trivial spell checking 2015-10-17 00:58:46 +02:00
timeless@mozdev.org
1b8a3b4ea1 grammar: use does instead of do where appropriate 2015-10-14 02:06:54 -04:00
Gregory Szorc
53a1c4895e exchange: support streaming clone bundles in clone bundles
Now that we have support for detecting compatible stream clone bundles
in bundle specifications, we can safely add support for applying stream
clone bundles to the clone bundles feature.
2015-10-17 11:37:08 -07:00
Gregory Szorc
dcf323b32d exchange: parse requirements from stream clone specification string
Stream clone bundles can only be consumed if the consumer supports the
exact format requirements that were present on the producer.

This patch adds support for encoding the format requirements in the bundle
specification string for a stream clone bundle and for verifying that they
are supported by the local repository. If they aren't, we raise
an UnsupportedBundleSpecification, just like we do when an unknown
compression or bundle type is encountered.

The impetus for this patch is so the clone bundles manifest can
advertise stream clone bundles and so clients can filter out stream
clones with unsupported format requirements. e.g. a stream clone
produced with the not-yet-invented "revlogv2" format will be ignored by
clients that only support "revlogv1."
2015-10-17 10:26:34 -07:00
Gregory Szorc
ddb0256917 exchange: support parameters in bundle specification strings
Sometimes a basic type string is not sufficient for representing the
contents of a bundle. Take bundle2 for example: future bundle2 files may
contain parts that today's bundle2 parser can't read. Another example is
stream clone data. These require clients to support specific
repository formats or they won't be able to read the written files. In
both scenarios, we need to describe additional metadata beyond the outer
container type. Furthermore, this metadata behaves more like an
unordered set, so an order-based declaration format (such as static
strings) is not sufficient.

We introduce support for "parameters" into the bundle specification
string. These are essentially key-value pairs that can be used to encode
additional metadata about the bundle.

Semicolons are used as the delimiter partially to increase similarity to
MIME parameter values (see RFC 2231) and because they are relatively
safe from the command line (although values will need quotes to avoid
interpretation as multiple shell commands). Alternatives considered were
spaces (a bit annoying to encode) and '&' (similar to URL query strings)
(which will do bad things in a shell if unquoted).

The parsing function now returns a dict of parsed parameters and
consumers have been updated accordingly.
2015-10-14 17:00:34 -07:00
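A minimal sketch of splitting a specification string into its base type and semicolon-delimited parameters, as this commit describes. The function name and the example parameter keys are hypothetical, and any value escaping the real parser performs is not modeled here.

```python
def parse_spec_with_params(spec):
    """Split 'type;key=value;key2=value2' into (type, params dict)."""
    if ';' not in spec:
        return spec, {}
    base, rest = spec.split(';', 1)
    params = {}
    for item in rest.split(';'):
        if '=' not in item:
            raise ValueError('missing "=" in parameter: %s' % item)
        key, value = item.split('=', 1)
        params[key] = value
    return base, params


# On a command line the whole string needs quoting, since an unquoted
# semicolon would end the shell command.
print(parse_spec_with_params('gzip-v2;a=1;b=2'))
```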
Gregory Szorc
ea298c9464 exchange: support for streaming clone bundles
Now that we have a mechanism to produce and consume streaming clone
bundles, we need to teach the human-facing bundle specification parser
and the internal bundle file header reading code to be aware of this new
format. This patch does so.

For the human-facing bundle specification, we choose the name "packed"
to describe "streaming clone bundles" because the bundle is essentially
a "pack" of raw revlog files that are "packed" together. There should
probably be a bikeshed over the name, especially since it is human
facing.
2015-10-15 13:00:45 -07:00
Gregory Szorc
8bb79e1017 exchange: don't print error codes after clone bundle failure
We don't appear to print error codes elsewhere. The error codes are
inconsistent between at least Linux and OS X and are more trouble than
they are worth. Humans care about the error string more than the code
anyway.

A glob was also added to pave over differences in error strings between
Linux and OS X.
2015-10-15 14:53:32 -07:00
Sean Farley
ca3c3b3a56 exchange: add oparg to push so that extensions can wrap pushop 2015-10-13 23:04:53 -07:00
Augie Fackler
30c5897436 exchange: use cg?unpacker.apply() instead of changegroup.addchangegroup() 2015-10-13 17:12:29 -04:00
Gregory Szorc
9f94bd29c6 exchange: advertise if a clone bundle was attempted
The client now sends a "cbattempted" boolean flag to the "getbundle"
wire protocol command to tell the server whether a clone bundle was
attempted.

The presence of this flag will enable the server to conditionally emit a
bundle2 "output" part advertising the availability of clone bundles to
compatible clients that don't have it enabled.
2015-10-14 10:36:20 -07:00
Gregory Szorc
cd1d42460a exchange: record that we attempted to fetch a clone bundle
This is needed so a subsequent patch can conditionally add a bundle2
part to the "getbundle" wire protocol command depending on whether a
clone bundle was attempted.
2015-10-13 14:55:02 -07:00
Gregory Szorc
95a4a00349 exchange: provide hint on how to disable clone bundles
If a clone bundle persistently fails to apply, users need a way to
disable it so they have a hope of the clone working. Change the hint for
the abort scenario to advertise the config option to disable clone
bundles.
2015-10-13 12:41:32 -07:00
Gregory Szorc
143c4bca55 exchange: document filterclonebundleentries 2015-10-14 10:03:26 -07:00
Sean Farley
6ffbe0061e exchange: use pushop.repo instead of repo 2015-10-13 22:53:08 -07:00
Gregory Szorc
c0efcd3e47 exchange: support sorting URLs by client-side preferences
Not all bundles are appropriate for all clients. For example, someone
with a slow Internet connection may want to prefer bz2 bundles over gzip
bundles because they are smaller and don't take as long to transfer.
This is information that a server cannot know on its own. So, we invent
a mechanism for "preferring" server-advertised URLs based on their
attributes.

We could invent a negotiation between client and server where the client
sends its preferences and the sorting/filtering is done server-side.
However, this feels complex. We can avoid complicating the wire protocol
and exposing ourselves to backwards compatibility concerns by performing
the sorting locally.

This patch defines a new config option for expressing preferred
attributes in server-advertised bundles.

At Mozilla, we leverage this feature so clients in fast data centers
prefer uncompressed bundles. (We advertise gzip bundles first because
that is a reasonable default.)

I consider this an advanced feature. I'm on the fence as to whether it
should be documented in `hg help config`.
2015-10-13 12:30:39 -07:00
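Client-side preference sorting could be sketched as below. The function name, the attribute names, and the shape of the preference list are assumptions for illustration; the real comparison in exchange.py may differ in detail. Entries keep the server's order unless a preference separates them.

```python
def sort_by_preference(entries, prefers):
    """Order manifest entries so that preferred attribute values come first.

    entries: list of dicts of manifest attributes, in server order.
    prefers: list of (attribute, value) pairs, most important first.
    """
    def rank(entry):
        # One boolean per preference: False (a match) sorts before True.
        # Earlier preferences dominate because tuples compare left to right.
        return tuple(entry.get(key) != value for key, value in prefers)

    # sorted() is stable, so ties keep the server-advertised order.
    return sorted(entries, key=rank)


entries = [
    {'URL': 'https://example.invalid/gzip.hg', 'COMPRESSION': 'gzip'},
    {'URL': 'https://example.invalid/none.hg', 'COMPRESSION': 'none'},
]
# A client in a fast data center prefers uncompressed bundles.
print(sort_by_preference(entries, [('COMPRESSION', 'none')])[0]['URL'])
```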
Gregory Szorc
9516d6763b exchange: extract bundle specification components into own attributes
An upcoming patch will enable clients to prefer certain bundles over
others. The idea is that we define values of attributes from manifests
that are desirable.

The BUNDLESPEC attribute is a complex value consisting of multiple
parts. Clients may wish to only prefer one of these parts. Having to
specify every combination of BUNDLESPEC would be annoying. So, we
extract the components of BUNDLESPEC into their own attributes so
clients can easily filter on a sub-component.
2015-10-13 12:31:19 -07:00
Gregory Szorc
3b0f2a4363 exchange: support preserving external names when parsing bundle specs
This will be needed to make client-side preferences work easier.
2015-10-13 12:29:50 -07:00
Gregory Szorc
7f6305218b clonebundles: filter on SNI requirement
Server Name Indication (SNI) is commonly used in CDNs and other hosted
environments. Unfortunately, Python <2.7.9 does not support SNI and when
these older Python versions attempt to negotiate TLS to an SNI server,
they raise an opaque error like
"_ssl.c:507: error:14094410:SSL routines:SSL3_READ_BYTES:sslv3 alert
handshake failure."

We introduce a manifest attribute to denote the URL requires SNI and
have clients without SNI support filter these entries.
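
A minimal sketch of the client-side filter. The attribute name
REQUIRESNI, its "true" value, and the function name are assumptions;
ssl.HAS_SNI is the standard library flag reporting SNI support.

```python
import ssl


def filter_sni(entries, client_has_sni=None):
    """Drop entries that require SNI when the client cannot provide it."""
    if client_has_sni is None:
        # Older Python versions may lack the flag; treat that as "no SNI".
        client_has_sni = getattr(ssl, "HAS_SNI", False)
    return [entry for entry in entries
            if client_has_sni or entry.get("REQUIRESNI") != "true"]


entries = [
    {"URL": "https://cdn.example.com/full.hg", "REQUIRESNI": "true"},
    {"URL": "https://plain.example.com/full.hg"},
]
print(filter_sni(entries, client_has_sni=False))
```

Entries without the attribute are always kept, so a client without SNI
can still use a plain URL advertised later in the list.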
2015-10-13 10:59:41 -07:00
Gregory Szorc
11b70bd7bb clonebundles: filter on bundle specification
Not all clients are capable of reading every bundle. Currently, content
negotiation to ensure a server sends a client a compatible bundle
format is performed at request time. The response bundle is dynamically
generated at request time, so this works fine.

Clone bundles are statically generated *before* the request. This means
that a modern server could produce bundles that a legacy client isn't
capable of reading. Without some kind of "type hint" in the clone
bundles manifest, a client may attempt to download an incompatible
bundle. Furthermore, a client may not realize a bundle is incompatible
until it has processed part of the bundle (imagine consuming a 1 GB
changegroup bundle2 part only to discover the bundle2 part afterwards is
incompatible). This would waste time and resources. And it isn't very
user friendly.

Clone bundle manifests thus need to advertise the *exact* format of the
hosted bundles so clients may filter out entries that they don't know
how to read. This patch introduces that mechanism.

We introduce the BUNDLESPEC attribute to declare the "bundle
specification" of the entry. Bundle specifications are parsed using
exchange.parsebundlespecification, which uses the same strings as the
"--type" argument to `hg bundle`. The supported bundle specifications
are well defined and backwards compatible.

When a client encounters a BUNDLESPEC that is invalid or unsupported, it
silently ignores the entry.
2015-10-13 11:45:30 -07:00
Gregory Szorc
cf1dfbfb60 clonebundle: support bundle2
exchange.readbundle() can return 2 different types. We weren't handling
the bundle2 case. Handle it.

At some point we'll likely want a generic API for applying a bundle from
a file handle. For now, create another one-off until we figure out what
the unified bundle API should look like (addressing this is a can of
worms I don't want to open right now).
2015-10-13 10:41:54 -07:00
Gregory Szorc
c55df1f741 exchange: refactor bundle specification parsing
The old code was tailored to `hg bundle` usage and not appropriate for
use as a general API, which clone bundles will require. The code has
been rewritten to make it more generally suitable.

We introduce dedicated error types to represent invalid and unsupported
bundle specifications. The reason we need dedicated error types (rather
than error.Abort) is because clone bundles will want to catch these
exceptions as part of filtering entries. We don't want to swallow
error.Abort on principle.
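
To illustrate why dedicated error types help, here is a sketch of a
filter that catches only the two expected failures. The class names,
the sets of known values, the "<compression>-<version>" spec shape, and
the BUNDLESPEC key are placeholders, not the names used in exchange.py.

```python
class InvalidBundleSpec(Exception):
    """The spec string is malformed."""


class UnsupportedBundleSpec(Exception):
    """The spec is well formed but names something the client cannot read."""


# Assumed values, for illustration only.
KNOWN_COMPRESSIONS = {"none", "gzip", "bzip2"}
KNOWN_VERSIONS = {"v1", "v2"}


def parse_spec(spec):
    compression, sep, version = spec.partition("-")
    if not sep or not compression or not version:
        raise InvalidBundleSpec(spec)
    if compression not in KNOWN_COMPRESSIONS or version not in KNOWN_VERSIONS:
        raise UnsupportedBundleSpec(spec)
    return compression, version


def filter_readable(entries):
    readable = []
    for entry in entries:
        try:
            parse_spec(entry["BUNDLESPEC"])
        except (InvalidBundleSpec, UnsupportedBundleSpec):
            # Skip only the failures we expect; anything else propagates.
            continue
        readable.append(entry)
    return readable


entries = [
    {"BUNDLESPEC": "gzip-v2"},
    {"BUNDLESPEC": "zstd-v9"},
    {"BUNDLESPEC": "garbage"},
]
print(filter_readable(entries))
```

Because the except clause names the two spec errors explicitly, an
unrelated failure raised during parsing is not hidden by the filter.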
2015-10-13 10:57:54 -07:00
Gregory Szorc
5e2a52d9ca exchange: move bundle specification parsing from cmdutil
Clone bundles require a well-defined string to specify the type of
bundle that is listed so clients can filter compatible file types. The
`hg bundle` command and cmdutil.parsebundletype() already establish the
beginnings of a bundle specification format.

As part of formalizing this format specification so it can be used by
clone bundles, we move the specification parsing bits verbatim to
exchange.py, which is a more suitable place than cmdutil.py. A
subsequent patch will refactor this code to make it more appropriate as
a general API.
2015-10-13 11:43:21 -07:00
Gregory Szorc
5d1b4c49ee clonebundles: support for seeding clones from pre-generated bundles
Cloning can be an expensive operation for servers because the server
generates a bundle from existing repository data at request time. For
a large repository like mozilla-central, this consumes 4+ minutes
of CPU time on the server. It also results in significant network
utilization. Multiplied by hundreds or even thousands of clients and
the ensuing load can result in difficulties scaling the Mercurial server.

Despite generation of bundles being deterministic until the next
changeset is added, the generation of bundles to service a clone request
is not cached. Each clone thus performs redundant work. This is
wasteful.

This patch introduces the "clonebundles" extension and related
client-side functionality to help alleviate this deficiency. The
client-side feature is behind an experimental flag and is not enabled by
default.

It works as follows:

1) Server operator generates a bundle and makes it available on a
   server (likely HTTP).
2) Server operator defines the URL of a bundle file in a
   .hg/clonebundles.manifest file.
3) Client `hg clone`ing sees the server is advertising bundle URLs.
4) Client fetches and applies the advertised bundle.
5) Client performs equivalent of `hg pull` to fetch changes made since
   the bundle was created.
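
As a sketch of step 3, a client could parse the advertised manifest
like this. It assumes one entry per line: a URL followed by optional
space-separated, URL-encoded KEY=VALUE attributes. That layout, the
function name, and the URLs are assumptions for illustration.

```python
from urllib.parse import unquote


def parse_manifest(text):
    """Parse manifest text into a list of attribute dicts, in file order."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            # Ignore blank lines.
            continue
        attrs = {"URL": fields[0]}
        for raw in fields[1:]:
            key, _, value = raw.partition("=")
            attrs[unquote(key)] = unquote(value)
        entries.append(attrs)
    return entries


manifest = (
    "https://cdn.example.com/full.hg BUNDLESPEC=gzip-v2\n"
    "\n"
    "https://cdn.example.com/other.hg\n"
)
print(parse_manifest(manifest))
```

The client would then fetch the first usable URL (step 4) and finish
with the equivalent of `hg pull` (step 5).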

Essentially, the server performs the expensive work of generating a
bundle once and all subsequent clones fetch a static file from
somewhere. Scaling static file serving is a much more manageable
problem than scaling a Python application like Mercurial. Assuming your
repository grows less than 1% per day, the end result is 99+% of CPU
and network load from clones is eliminated, allowing Mercurial servers
to scale more easily. Serving static files also means data can be
transferred to clients as fast as they can consume it, rather than as
fast as servers can generate it. This makes clones faster.

Mozilla has implemented functionality similar to this patch on
hg.mozilla.org using a custom extension. We are hosting bundle files in
Amazon S3 and CloudFront (a CDN) and have successfully offloaded
>1 TB/day in data transfer from hg.mozilla.org, freeing up significant
bandwidth and CPU resources. The positive impact has been stellar and
I believe it has proved its value to be included in Mercurial core. I
feel it is important for the client-side support to be enabled in core
by default because it means that clients will get faster, more reliable
clones and will enable server operators to reduce load without
requiring any client-side configuration changes (assuming clients are
up to date, of course).

The scope of this feature is narrowly and specifically tailored to
cloning, despite "serve pulls from pre-generated bundles" being a valid
and useful feature. I would eventually like for Mercurial servers to
support transferring *all* repository data via statically hosted files.
You could imagine a server that siphons all pushed data to bundle files
and instructs clients to apply a stream of bundles to reconstruct all
repository data. This feature, while useful and powerful, is
significantly more work to implement because it requires the server
component have awareness of discovery and a mapping of which changesets
are in which files. Full clone bundles, by contrast, are much simpler.

The wire protocol command is named "clonebundles" instead of something
more generic like "staticbundles" to leave the door open for a new, more
powerful and more generic server-side component with minimal backwards
compatibility implications. The name "bundleclone" is used by Mozilla's
extension and would cause problems since there are subtle differences
in Mozilla's extension.

Mozilla's experience with this idea has taught us that some form of
"content negotiation" is required. Not all clients will support all
bundle formats or even URLs (advanced TLS requirements, etc). To ensure
the highest uptake possible, a server needs to advertise multiple
versions of bundles and clients need to be able to choose the most
appropriate one from that list. The "attributes" in each
server-advertised entry facilitate this filtering and sorting. Their
use will become apparent in subsequent patches.

Initial inspiration and credit for the idea of cloning from static files
belongs to Augie Fackler and his "lookaside clone" extension proof of
concept.
2015-10-09 11:22:01 -07:00
Pierre-Yves David
30913031d4 error: get Abort from 'error' instead of 'util'
The home of 'Abort' is 'error', not 'util'. However, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.

For great justice.
2015-10-08 12:55:45 -07:00
Durham Goode
ceec7b0056 bundle2: allow lazily acquiring the lock
In the external pushrebase extension, it is valuable to be able to do some work
without taking the lock (like running expensive hooks). This enables
significantly higher commit throughput.

This patch adds an option to lazily acquire the lock. It means that all bundle2
part handlers that require writing to the repo must first call
op.gettransaction(), when in this mode.
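
A simplified sketch of the idea, using a plain threading.Lock in place
of the repository lock. The class, its attributes, and the two handler
functions are hypothetical stand-ins, not the bundle2 implementation.

```python
import threading


class LazyLockOperation:
    """Hypothetical operation object that takes its lock on first use."""

    def __init__(self, lock):
        self._lock = lock
        self.holds_lock = False

    def gettransaction(self):
        # Take the lock on the first call only; later calls reuse it.
        if not self.holds_lock:
            self._lock.acquire()
            self.holds_lock = True
        return self

    def close(self):
        if self.holds_lock:
            self._lock.release()
            self.holds_lock = False


def run_hooks(op):
    # Expensive read-only work: never asks for the transaction.
    return "hooks ok"


def write_changes(op):
    # Any handler that writes must request the transaction first.
    op.gettransaction()
    return "written"


op = LazyLockOperation(threading.Lock())
run_hooks(op)
locked_after_hooks = op.holds_lock
write_changes(op)
locked_after_write = op.holds_lock
op.close()
print(locked_after_hooks, locked_after_write)
```

The lock is held only from the first write onward, so the hook phase
does not block other writers.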
2015-10-05 16:19:54 -07:00
Gregory Szorc
ba84e5a41f exchange: add "streaming all changes" to bundle2 pulling
This is the beginning of client-side support for performing a stream
clone using bundle2. The main bundle2 pull function checks whether to
perform a streaming clone and outputs a message if so.

While we have a duplicate message, it seems easier to have all the
bundle2 console writing in one location and in an easy-to-read
conditional block.
2015-10-04 12:11:44 -07:00
Gregory Szorc
dcbb92c7a1 exchange: expose bundle2 availability on pulloperation
Like the previous patch, the value is cached and will prevent a function
level import in streamclone.py.
2015-10-04 12:03:30 -07:00
Gregory Szorc
0500a32918 exchange: expose bundle2 capabilities on pulloperation
This adds a cache and makes accessing the capabilities slightly simpler,
as you don't need to directly go through the bundle2 module. This will
also help prevent a function-level import in streamclone.py.

This patch arguably isn't necessary. But I think it makes things
slightly nicer.
2015-10-04 18:31:53 -07:00
Gregory Szorc
f283944ccb streamclone: rename and document maybeperformstreamclone()
Upcoming patches will introduce bundle2 based streaming clones. Add
"legacy" to the function name and add a docstring clarifying the intent of
the function.
2015-10-04 11:34:28 -07:00
Gregory Szorc
8ac7d32ad1 streamclone: refactor maybeperformstreamclone to take a pullop
Just like all the other pull steps. Consistency is good.

This seems a little excessive right now since maybeperformstreamclone is
such a short function. This will be addressed in a subsequent patch.
2015-10-04 11:20:52 -07:00
Gregory Szorc
13e503977f exchange: move stream clone logic into pull code path
Stream clones are a special case of clones. Clones are a special case of
pull. Most of the logic for deciding what to do at pull time is in
exchange.py. It makes sense for the stream clone determination to live
there as well.

This patch moves the calling of the stream clone code into pull(). The
checks in streamclone.canperformstreamclone() ensure that we don't
perform a stream clone unless it is possible.

A future patch will convert maybeperformstreamclone() to accept a
pullop to make it consistent with everything else in pull(). It will
also grow some functionality (in case you doubted the necessity of a 4
line function).
2015-10-02 23:04:52 -07:00
Gregory Szorc
7d658837f3 exchange: teach pull about requested stream clones
An upcoming patch will move the invocation of stream cloning logic to
the normal pull code path (from localrepository.clone). In preparation
for this, we teach pull() and pulloperation about whether a streaming
clone is requested.

The return logic in localrepository.clone() has been reformatted
slightly because of line length issues.
2015-10-02 22:16:34 -07:00
Gregory Szorc
d8e74180f0 streamclone: move code out of exchange.py
We bulk move functions from exchange.py related to streaming clones.

Function names were renamed slightly to drop a component redundant with
the module name. Docstrings and comments referencing old names and
locations were updated accordingly.
2015-10-02 16:05:52 -07:00
Gregory Szorc
b9c7577e2a exchange: add docstring to pull()
This seems like the kind of important function that should be documented
better.
2015-10-02 15:36:00 -07:00
Ryan McElroy
83dd86ff3e bundle2: generate check:heads in an independent function 2015-10-01 10:48:14 -07:00
Durham Goode
7638a913ab exchange: allow fallbackheads to use lazy set behavior
The common ancestor set implementation was made lazy a couple years ago, but
this piece of code still required processing the entire repo by putting set()
around the lazy set. The code was introduced in 984b6b21bf13, a year before the
lazy ancestor set was added.

Dropping the set() shaves 3.5 seconds off of 'push -r' in repos with hundreds of
thousands of commits.
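
A toy illustration of the cost difference, assuming a linear history
and a hypothetical lazy ancestor class that counts how many revisions
it walks. It is not Mercurial's ancestor implementation.

```python
class LazyAncestors:
    """Hypothetical lazy ancestor collection over a linear history."""

    def __init__(self, head):
        self._head = head
        self.visited = 0

    def __iter__(self):
        # Walk from the head back to revision 0, one revision at a time.
        for rev in range(self._head, -1, -1):
            self.visited += 1
            yield rev

    def __contains__(self, rev):
        # Stops walking as soon as the revision is found.
        return any(r == rev for r in self)


lazy = LazyAncestors(99999)
found_lazy = 99998 in lazy

eager = LazyAncestors(99999)
found_eager = 99998 in set(eager)  # set() forces a walk of every revision

print(lazy.visited, eager.visited)
```

Both lookups give the same answer, but wrapping the collection in set()
visits the whole history before the membership test can start.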
2015-09-07 17:08:35 -07:00
Martin von Zweigbergk
6ea2f0a8d1 exchange: fix dead assignment
The assignment of the value from bundle2.processbundle() to 'r' is
unused. It is currently the same as its third argument (if given), and
since that argument may eventually go away (according to the method's
docstring), let's reassign the return value to 'op' instead to better
prepare for that.
2015-07-20 13:39:25 -07:00
Martin von Zweigbergk
2b14caf707 exchange: s/phase/bookmark/ in _pushb2bookmarks() 2015-07-20 13:35:19 -07:00