The discovery phases for bookmarks now use the list of explicitly pushed bookmarks
to do addition, removal and overwriting.
Tests are impacted because this reduces the amount of listkeys calls issued, removes
some duplicated messages and improves the accuracy of some messages.
To gather all the bookmark pushing actions together, we need code performing
those actions to be ready for them. We need to be able to produce different
messages for different actions.
There is no reason for bookmarks to get a special treatment. As a first step we
move the code as is in the `exchange.pull` function. Integration with the rest
of the flow will come later.
Adding bookmarks to pull means that most clone paths are now pulling bookmarks
through pull. We ensure that bookmark-update messages are properly suppressed in
that case.
In test-pull-http.t the 'requesting all changes' message disappear because we
now get the authentication error on the `listkeys`command before such message
is printed.
This part is responsible for adding new bookmarks on the remote. Before that,
it was done on its own in `commands.push`. The export is still not integrated
with the rest of the push process, but at least it now dwells in the right
function.
Returning the pushop object gives access to more information (upcoming bookmark
push result for example). `localrepo.push` currently extracts the `cgresult` for
callers.
We are about to introduce more results-related attributes on pushop (for
bookmarks) so we need a more distinctive name. We now use `cgresult` as
`pulloperation` does.
The primary goal is to make it easier for extensions to alter how bundle2
parts are laid out. They now can use the getbundle2partsgenerator decorator
to add new parts, or directly act on getbundle2partsmapping to wrap existing
part functions.
Note the 'request for bundle10 must include changegroup' error was kept
under the same conditions as before, although the logic changes don't make
it obvious.
Functions like getbundle and classes like unbundle10 really manipulate
changegroups and not bundles. A HG10 bundle is the same as a changegroup
plus a small header, but this is no longer the case for a HG2X bundle,
so it's better to separate the names a bit.
We use the `cg` argument to disable the generation of a changegroup part. This
is useful to pull information when changesets are already in sync (phases,
obsmarkers).
We should only exchange obsolete markers related to the changesets
that are being exchanged. For example, if `A'` is a successor of `A`,
we do not want to push the marker if we are not exchanging
`A'`. Otherwise `A` would disappear without a successor, leading to confusion
for both users and the evolution mechanism.
Therefore we now exchange only the markers relevant to the subset of nodes
involved in the push (the nodes themselves may be already common but were
selected by --rev (or the lack of --rev)).
Note that all selected markers are still exchanged on each push. We do
not have a discovery protocol for markers in core yet. Such discovery
would save us the exchange of markers known on both side.
We can now directly prevent any evolution-related operation from happening by using
an empty set for outgoing markers. So we detect if no transfers should occur and
use an empty set in this case.
We use the `pushkey` part to exchange bookmark updates within the unified
bundle2 push. Note that this only applies on update (moving a bookmark known on both
sides) since bookmark export (creation of a new bookmark on remote) is apparently
done outside of the _push function.
The discovery of necessary bookmark updates is now done within the "discovery
phase". This opens the door to the inclusion of bookmarks in a unified bundle2
push.
We make a temporary variable for the remote bookmark data and we do not expand
all elements from `bookmark.compare` since we are going to use only one.
Before this patch, there was always an attempt to update bookmark even if prior
steps of the push failed. I cannot see a good semantic reason to do so. We
disable this possibility to simplify the push flow with bundle2. Bookmarks will
be included in the bundle and fail with other steps.
Now that ancestors has the same boolean property as a list, we can stop unrolling
the set of ancestors. This should provide a significant speedup to this step as
ancestor objects are smart and lazy.
We now pass a transaction option to this phase movement function. The object
is currently not used by the function, but it will be in the future.
All call sites have been updated. Most call sites were already enclosed in a
transaction for a long time. The handful of others have been recently
updated in previous commit.
The retractboundary function remains to be upgraded.
We were relying on the phase internals to filter out redundant phase
information from remove. However as we plan to integrate phase
movement inside the transaction, we want to avoid useless transaction
creation on no-op pulls.
Therefore we filter out all the information that already matches the current
repository state. This will let us create a transaction only when there is
actual phase movement needed.
We do not have infrastructure to include obsolescence markers in the bundle2
push from core. But extensions may so we make sure it would not be sent twice.
Sending obsmarkers through pushkey requires extra encoding (since pushkey can't
take binary content) and slicing (since we can hit http header limit). As we
send all obsolescences markers that exists in the repo for each push, we used to
just look at the content of the "obsolete" pushkey namespace (already encoded
and sliced) and send its
content.
However, future changeset will make it possible to push only parts of the
obsmarkers. To prepare this we now explicitly encode a list of markers. The list
of markers is still "all of them" but future changeset will takes care
of that.
The new code uses a "_protected" method but that seems reasonable to keep it
private as this is the is the only external user of it and this whole pushing
obsmarker through pushkey things in fairly hacky already)
Phase push is now included in the same bundle2 push as changesets. We use
multiple pushkey parts to transmit the information. Note that phase moves are
still not part of the repository "transaction".
Instead of a single list of functions, we now have a list of names and
a mapping of names to functions. This simplifies wrapping of steps
from extensions. In the same move, declaration becomes decorator-based
(syntax sugar, nom nom nom!).
Now that both options (push succeed or fall back) live in pushop, we
can move the common heads computation there too. It is a very commonly
accessed attribute so it makes a lot of sense to have it in pushop.
Bundle2 will allow pushing all different parts of the push in a single bundle.
This mean that the discovery for each part needs to be done before trying to
push. Currently we may have different behaviors for phases and obsolescence markers
when the push of changesets fails. For example, information may still be
exchanged for a part of the history where changesets are common but where
phases mismatch. So the preparation of the push need to determine what
information need to be pushed in both situations. And it needs a different set of
heads for this. Therefore we are moving heads computation within pushop for easy
access by all parties. We start with the simplest set of heads.
The ``getbundle`` function was initially design to return a changegroup bundle.
However, bundle2 allows transmitting a wide range of data. Some bundle2
requests may not include a changegroup at all.
Before this changeset, the client would request a changegroup for
``heads=[nullid]`` and receive an empty changegroup.
We introduce an official boolean parameter, ``cg``, that can be set
to false to disable changegroup generation on getbundle. A new bundle2
capability is introduced to let the client know.
When a bundle2 parts generator returns a non callable value, it should not be
used as a reply handler. The changegroup part generator is already having this
case of behavior when there is no changegroup to push. This changeset prevent a
crash for user of the experimentable bundle2 feature.
Instead of explicitly calling a few function to generate part in the bundle, we
now have a list of all part generators. This should make it easier for
extensions to adds new part in the bundle.
This new way to extend the push deprecates the old `_pushbundle2extraparts` way.
When bundle2 push includes more than just changesets, we may have no
changegroup to push yet still have other data to push.
So we now try to performs a bundle2 push in all cases. The check for changegroup
inclusion is moved into the ``_pushb2ctx`` function in charge of creating the
changegroup part.
The bundle2 part is aborted if no actual payload part have been added to the
bundle2.
This attribute will record what steps were performed during the bundle2 push.
This will control whenever the old way push must be performed or skipped. This
will ultimately be used by changegroup, phases, obsmarkers, bookmarks and any
other kind of data ones may want to exchange even when bundle2 support is
missing.
We extract the creation of changegroup related parts into its own function.
This precludes the inclusion of more diverse data during the bundle2 push.
We use a closure to carry the logic that need to be perform when processing the
server reply.
This is the first step of a refactoring that will ease the inclusion of new part
in the bundle2 push and includes more information (like phases) in this push
We need to move the function a bit sooner to be able to group the generation of
`b2x:check:heads` and `b2x:changegroup` part in an external function. We move it
sooner to preserve parts creation order bundle2 tests rely on. At the ends of this
refactoring the `_pushbundle2extraparts` will be replaced by another mechanism
anyway.
When bundle2 was enabled, if hg pull had no commits to pull, it would print
'no changes found' and then download the entire repository from the server. This
was caused by heads and common being set to None, which gets treated as
heads=cl.heads() and common=[nullid], which means download the entire repo.
Pulling bundles without a changegroup is a valid use case (like if we're just
updating bookmarks), so this modifes the bundle code to allow not adding
changegroups.
This is backport of 26ad3517a3a2.
During pulls bundle2 was checking server.bundle2, but during pushes it was
checking experimental.bundle2. This makes them both experimental.bundle2.
This is a backport of a9334b37b19a
We make sure any exceptions raised during the whole span of handling bundle2
processing are decorated. This let us catch exceptions raised by hooks prior to
transaction commit.
If the heads on the server differ from the ones reported seen by the client at
bundle time, we raise a PushRaced exception. However, the part raising the
exception was broken.
To fix it, we move the PushRaced class in the error module so it can be
accessible everywhere without an import cycle.
A test is also added to prevent regression.
We call a dedicated hook right after closing the transaction. This will let
people react to the transaction with all the information in hand. This hook is
experimental and will not survive in future versions.
We call a dedicated hook right before closing the transaction. This will let
people abort unbundling with all the information in hand. This hook is
experimental and will not survive in future versions.
All currently core parts are moved to a `bx2` namespace (for "bundle 2
experimental"). This should avoid conflicts between the final stable
format and the one about to be released.
For the same reason, we advertise this bundle2 implementation and format as
experimental. This will leave room for field testing in 3.0 but won't conflict
with a stable implementation in 3.1.
The current implementation of bundle2 is still very experimental and the 3.0
freeze is yesterday. The current bundle2 format has never been field-tested, so
we rename the header to HG2X. This leaves the HG20 header available for real
usage as a stable format in Mercurial 3.1.
We won't guarantee that future mercurial versions will keep supporting this
`HG2X` format.
We can now retrieve them from the server during push. The capabilities are
encoded the same way as in `replycaps` part (with an extra layer of urlquoting
to escape separators).
The bundle2 processing does not create a bundle2 reply by default anymore. It
is only done if the client requests it with a `replycaps` part. This part is
called `replycaps` as it will eventually contain data about which bundle2
capabilities are supported by the client.
We have to add a flag to the test command to control whether a reply is
generated or not.
We now use a bundle2 container to push all phase updates at the same time. This
is a significant step forward, even if further refactoring is needed to unify
phase push with the changeset push.
We use bundle2 to retrieve the remote phase data at the same time as
changesets. This reduces the amount of requestis and should improve consistency
as the server can ensure nothing changed between the retrieval of those parts.
A new ``listkeys`` is supported by getbundle. It is a list of namespaces whose
content should be included in the bundle.
An appropriate entry has been added to the wireproto map of getbundle arguments
and a new bundle2 capability is advertised.
There are still no codes that request such parts in core mercurial.
This function factors the creation of appropriate entries to use in
``bundlecaps`` argument of ``getbundle``. This cleans up code calling
``getbundle`` and helps its usage in more part of the code.
The process of decoding remote bundle2caps blob into a dictionary is cumbersome.
We move it into a small helper function. This will clarify code that reads
bundle2 capabilities of peers and helps using it in new places.
We are going to raise exceptions for a wider range of cases: unsupported
mandatory stream and part parameters. We rename the exception with a wider
name.
This code have been factorised and moved in its own function by 45df6ae67061. We
actually read the result of this other computation in the very line before the
deleted block. I somehow forgot to remove the original code, but it is now
dead. Good bye duplicated computation.
When a bundle2 is pushed we return a bundle instead of an integer. We use to
return a binary stream. We now return a `bundle20` bundler to make the life of
wireprotocol implementation simpler.
This input will have to travel over the wire anyway, so we feed the peer method
with a simple binary stream and rely on the server side to use `readbundle`
to create the python object.
The test output changes because the bundle is created marginally sooner and the
debug output interleaves in a different way.
For friendliness with the wire protocol implementation, the `exchange.getbundle` now
returns a binary stream when asked for a bundle2. We detect a bundle2 request and
upgrade the binary stream to an unbundler object.
In the future the unbundler may gain feature to look like a binary stream, but
we are not quite there yet.
The `readbundle` function can now recognize a bundle2 stream and return the
appropriate unbundler. This is required for proper bundle2 support over the
wire.
We first read 4 bytes to get the `HG10` bytes then we read the compression scheme
if this is `HG10`. This prepares the code for the arrival of `HG20` handling.
Using `readbundle` in the part handlers creates a circular import hell. We are
now using a simple `HG10UN` stream with no header. Some parameters may
later be introduced on the part to change parameter.
Producers are updated as well.
We now support bundle2 for local push. The unbundle function has to detect
which version of the bundle to use since the return type is different.
Note that push error handling is currently nonexistent. This is one of the
reasons why bundle2 is still disabled by default.
This patch introduces "prepushoutgoinghooks" to extend outgoing check
before pushing changesets to remote easily.
This chooses the function returning "util.hooks" instead of the one to
be overridden.
The latter may cause problems silently, if one of overriders forgets
(or fails) to execute a kind of "super(xxx, self).overridden(...)". In
the other hand, the former can ensure that all registered functions
are invoked, unless one of them raises an exception.
We are going to introduce an `unbundlepart` dedicated to reading bundle. So we
need to rename the one used to create bundle. Even if dedicated to creation, this
is still used for unbundling until we get the new class.
Localrepo now supports the unbundle method of pushing changegroups. We
plan to use the unbundle call for bundle2 so it is important that all
peers supports it. The `peer.unbundle` and `peer.addchangegroup` code
path have small difference so cause some test output changes. None of those
changes seems problematic.
The `exchange` module now contains an `unbundle` function that holds the core
unbundle logic. The wire protocol keeps its own unbundle function. It enforces
wireprotocol-specific logic and then calls the extracted function.
This aims at implementing unbundle for localrepo.
We are going to refactor the unbundle function to have it working on
a local repository too. Having this function extracted will ease the process.
In the case of non-matching heads, the function now directly raises an
exception. The top level of the function is catching it.
The bundlecaps passed to exchange.getbundle were being dropped completely. We
should pass them on through to the changegroup.
This affected the remotefilelog extension, since it relies on those bundlecaps.
This changeset refactors the pull code to use a bundle2 when available. We keep
bundle2 disabled by default. The current code is not ready for prime time.
Ultimately we'll want to unify the API of `bunde10` and `bundle20` to have less
different code. But for now, testing the bundle2 exchange flow is an higher
priority.
This function can return a `HG10` or `HG20` bundle. It uses the `bundlecaps`
parameters to decides which one to return.
This is a distinct function from `changegroup.getbundle` for two reasons. First
the APIs of `bundle10` and `bundle20` are not compatible yet. The two functions
may be reunited in the future. Second `exchange.getbundle` will grow parameters
for all kinds of data (phases, obsmarkers, ...) so it's better to keep the
changegroup generation in its own function for now.
This function will be used it in the next changesets.
The `pushoperation` object contains strictly more data the arguments currently
passed to `localrepo.checkpush` we pass the new object instead. This function is
used by MQ to abort push that includes MQ changesets.
Note: extension that may use this function will have to align.
With bundle2 we'll slowly move current steps into a single bundle2 building and
call (changegroup, phases, obsmarkers, bookmarks). But we need to keep the
old methods around for servers that do not support bundle2. So we introduce
this set of steps that remains to be done.
With bundle2 we'll have multiple code path susceptible to be responsible for
adding changeset during the pull. We move it to the pull operation to simplify
this the coming logic. The one doing the adding the changegroup will set the
value.
In the bare pull case we could add the same node multiple time to the
`pulloperation.pulledsubset`. Beside being a bit wrong this confused the new
revset implementation of `revset._revancestor` into giving bad result.
This changeset fix the pull operation part. The fix for the revset itself will
come in another changeset.
pull: move changeset pulling in its own function
Now that every necessary information is held in the `pulloperation` object, we
can finally extract the changeset pulling to it's own function.
This changeset is pure code movement only.
Tree discovey use a `fetch` variable to know what is being pulled. We move this
information in the `pulloperation` object. This make it possible to extract the
changeset pulling logic into its own function.
The computation of the subset is simple operation using two useful pull
information (1) the set of common changeset before the pull (2) the set of
pulled changeset. We move this data into the `pulloperation` object since some
phase will need them. And we turn the pulled subset computation behind a
property case as multiple pull phase will need it.
Now that every necessary information is held in the `pulloperation` object, we
can finally extract the phase synchronisation phase to it's own function.
This changeset is pure code movement only.
We compute the set of local changeset that were target of the pull. This is then
used by phases logic to decide which part of the history should have it phase
updated.
We move this information into the object to allow extraction of phase
synchronisation in its own function.
I expect obsolete marker exchange to use it too in the future.
Most local change that occurs during a pull are doing within a `transaction`.
Currently this mean (1) adding new changeset (2) adding obsolescence markers. We
want the two operations to be done in the same transaction. However we do not
want to create a transaction if nothing is added to the repo. Creating an empty
transaction would drop the previous transaction data and confuse tool and people
who are still using rollback.
So the current pull code has some logic to create and handle this transaction on
demand. We are moving this logic in to the `pulloperation` object itself to
simplify this lazy creation logic through all different par of the push.
Note that, in the future, other part of pull (phases, bookmark) will probably
want to be part of the transaction too.
The obsolescence marker exchange code was already extracted during a previous
cycle. We are moving the extracted functio in this module. This function will
read and write data in the `pulloperation` object and I prefer to have all core
function collaborating through this object in the same place.
This changeset is pure code movement only. Code change for direct consumption of
the `pulloperation` object will come later.
This object will hold all data and state gathered through the pull. This will
allow us to split the long function into multiple small one. Smaller function
will be easier to maintains and wrap. The idea is to blindly store all
information related to the pull in this object so that each step and extension
can use them if necessary.
We start by putting the `repo` variable in the object. More migration in other
function.
The localrepo class if far too big. Push and pull logic are extracted and
reworked to better fit with the fact we exchange more than bundle now.
This changeset extract the pulh code. later changeset will slowly slice it into
smaller brick.
The localrepo.pull method is kept for now to limit impact on user code. But it
will be ultimately removed, now that the public API is hold by peer.
Now that every necessary information is held in the `pushoperation` object, we
can extract the new common set computation to it's own function.
This changeset is pure code movement only.
The phase synchronisation start by computing the new set of common head between
local and remote and then do the phase synchronisation on this set. This new
common set logic will eventually be used by the obsolescence markers exchange.
So we are going to split the long phase synchronisation in two.
Now that every necessary information is held in the `pushoperation` object, we
can extract the discovery logic to it's own function.
This changeset is pure code movement only.
Now that every necessary information is held in the `pushoperation` object, we
can extract the part responsible of aborting the push to it's own function.
This changeset is mostly pure code movement. the exception is the fact this
function returns a value to decide if changeset bundle should be pushed.