We call a dedicated hook right after closing the transaction. This will let
people react to the transaction with all the information in hand. This hook is
experimental and will not survive in future versions.
We call a dedicated hook right before closing the transaction. This will let
people abort unbundling with all the information in hand. This hook is
experimental and will not survive in future versions.
All currently core parts are moved to a `bx2` namespace (for "bundle 2
experimental"). This should avoid conflicts between the final stable
format and the one about to be released.
For the same reason, we advertise this bundle2 implementation and format as
experimental. This will leave room for field testing in 3.0 but won't conflict
with a stable implementation in 3.1.
The current implementation of bundle2 is still very experimental and the 3.0
freeze is yesterday. The current bundle2 format has never been field-tested, so
we rename the header to HG2X. This leaves the HG20 header available for real
usage as a stable format in Mercurial 3.1.
We won't guarantee that future mercurial versions will keep supporting this
`HG2X` format.
We can now retrieve them from the server during push. The capabilities are
encoded the same way as in `replycaps` part (with an extra layer of urlquoting
to escape separators).
The bundle2 processing does not create a bundle2 reply by default anymore. It
is only done if the client requests it with a `replycaps` part. This part is
called `replycaps` as it will eventually contain data about which bundle2
capabilities are supported by the client.
We have to add a flag to the test command to control whether a reply is
generated or not.
We now use a bundle2 container to push all phase updates at the same time. This
is a significant step forward, even if further refactoring is needed to unify
phase push with the changeset push.
We use bundle2 to retrieve the remote phase data at the same time as
changesets. This reduces the amount of requestis and should improve consistency
as the server can ensure nothing changed between the retrieval of those parts.
A new ``listkeys`` is supported by getbundle. It is a list of namespaces whose
content should be included in the bundle.
An appropriate entry has been added to the wireproto map of getbundle arguments
and a new bundle2 capability is advertised.
There are still no codes that request such parts in core mercurial.
This function factors the creation of appropriate entries to use in
``bundlecaps`` argument of ``getbundle``. This cleans up code calling
``getbundle`` and helps its usage in more part of the code.
The process of decoding remote bundle2caps blob into a dictionary is cumbersome.
We move it into a small helper function. This will clarify code that reads
bundle2 capabilities of peers and helps using it in new places.
We are going to raise exceptions for a wider range of cases: unsupported
mandatory stream and part parameters. We rename the exception with a wider
name.
This code have been factorised and moved in its own function by 45df6ae67061. We
actually read the result of this other computation in the very line before the
deleted block. I somehow forgot to remove the original code, but it is now
dead. Good bye duplicated computation.
When a bundle2 is pushed we return a bundle instead of an integer. We use to
return a binary stream. We now return a `bundle20` bundler to make the life of
wireprotocol implementation simpler.
This input will have to travel over the wire anyway, so we feed the peer method
with a simple binary stream and rely on the server side to use `readbundle`
to create the python object.
The test output changes because the bundle is created marginally sooner and the
debug output interleaves in a different way.
For friendliness with the wire protocol implementation, the `exchange.getbundle` now
returns a binary stream when asked for a bundle2. We detect a bundle2 request and
upgrade the binary stream to an unbundler object.
In the future the unbundler may gain feature to look like a binary stream, but
we are not quite there yet.
The `readbundle` function can now recognize a bundle2 stream and return the
appropriate unbundler. This is required for proper bundle2 support over the
wire.
We first read 4 bytes to get the `HG10` bytes then we read the compression scheme
if this is `HG10`. This prepares the code for the arrival of `HG20` handling.
Using `readbundle` in the part handlers creates a circular import hell. We are
now using a simple `HG10UN` stream with no header. Some parameters may
later be introduced on the part to change parameter.
Producers are updated as well.
We now support bundle2 for local push. The unbundle function has to detect
which version of the bundle to use since the return type is different.
Note that push error handling is currently nonexistent. This is one of the
reasons why bundle2 is still disabled by default.
This patch introduces "prepushoutgoinghooks" to extend outgoing check
before pushing changesets to remote easily.
This chooses the function returning "util.hooks" instead of the one to
be overridden.
The latter may cause problems silently, if one of overriders forgets
(or fails) to execute a kind of "super(xxx, self).overridden(...)". In
the other hand, the former can ensure that all registered functions
are invoked, unless one of them raises an exception.
We are going to introduce an `unbundlepart` dedicated to reading bundle. So we
need to rename the one used to create bundle. Even if dedicated to creation, this
is still used for unbundling until we get the new class.
Localrepo now supports the unbundle method of pushing changegroups. We
plan to use the unbundle call for bundle2 so it is important that all
peers supports it. The `peer.unbundle` and `peer.addchangegroup` code
path have small difference so cause some test output changes. None of those
changes seems problematic.
The `exchange` module now contains an `unbundle` function that holds the core
unbundle logic. The wire protocol keeps its own unbundle function. It enforces
wireprotocol-specific logic and then calls the extracted function.
This aims at implementing unbundle for localrepo.
We are going to refactor the unbundle function to have it working on
a local repository too. Having this function extracted will ease the process.
In the case of non-matching heads, the function now directly raises an
exception. The top level of the function is catching it.
The bundlecaps passed to exchange.getbundle were being dropped completely. We
should pass them on through to the changegroup.
This affected the remotefilelog extension, since it relies on those bundlecaps.
This changeset refactors the pull code to use a bundle2 when available. We keep
bundle2 disabled by default. The current code is not ready for prime time.
Ultimately we'll want to unify the API of `bunde10` and `bundle20` to have less
different code. But for now, testing the bundle2 exchange flow is an higher
priority.
This function can return a `HG10` or `HG20` bundle. It uses the `bundlecaps`
parameters to decides which one to return.
This is a distinct function from `changegroup.getbundle` for two reasons. First
the APIs of `bundle10` and `bundle20` are not compatible yet. The two functions
may be reunited in the future. Second `exchange.getbundle` will grow parameters
for all kinds of data (phases, obsmarkers, ...) so it's better to keep the
changegroup generation in its own function for now.
This function will be used it in the next changesets.
The `pushoperation` object contains strictly more data the arguments currently
passed to `localrepo.checkpush` we pass the new object instead. This function is
used by MQ to abort push that includes MQ changesets.
Note: extension that may use this function will have to align.
With bundle2 we'll slowly move current steps into a single bundle2 building and
call (changegroup, phases, obsmarkers, bookmarks). But we need to keep the
old methods around for servers that do not support bundle2. So we introduce
this set of steps that remains to be done.
With bundle2 we'll have multiple code path susceptible to be responsible for
adding changeset during the pull. We move it to the pull operation to simplify
this the coming logic. The one doing the adding the changegroup will set the
value.
In the bare pull case we could add the same node multiple time to the
`pulloperation.pulledsubset`. Beside being a bit wrong this confused the new
revset implementation of `revset._revancestor` into giving bad result.
This changeset fix the pull operation part. The fix for the revset itself will
come in another changeset.
pull: move changeset pulling in its own function
Now that every necessary information is held in the `pulloperation` object, we
can finally extract the changeset pulling to it's own function.
This changeset is pure code movement only.
Tree discovey use a `fetch` variable to know what is being pulled. We move this
information in the `pulloperation` object. This make it possible to extract the
changeset pulling logic into its own function.
The computation of the subset is simple operation using two useful pull
information (1) the set of common changeset before the pull (2) the set of
pulled changeset. We move this data into the `pulloperation` object since some
phase will need them. And we turn the pulled subset computation behind a
property case as multiple pull phase will need it.
Now that every necessary information is held in the `pulloperation` object, we
can finally extract the phase synchronisation phase to it's own function.
This changeset is pure code movement only.
We compute the set of local changeset that were target of the pull. This is then
used by phases logic to decide which part of the history should have it phase
updated.
We move this information into the object to allow extraction of phase
synchronisation in its own function.
I expect obsolete marker exchange to use it too in the future.
Most local change that occurs during a pull are doing within a `transaction`.
Currently this mean (1) adding new changeset (2) adding obsolescence markers. We
want the two operations to be done in the same transaction. However we do not
want to create a transaction if nothing is added to the repo. Creating an empty
transaction would drop the previous transaction data and confuse tool and people
who are still using rollback.
So the current pull code has some logic to create and handle this transaction on
demand. We are moving this logic in to the `pulloperation` object itself to
simplify this lazy creation logic through all different par of the push.
Note that, in the future, other part of pull (phases, bookmark) will probably
want to be part of the transaction too.
The obsolescence marker exchange code was already extracted during a previous
cycle. We are moving the extracted functio in this module. This function will
read and write data in the `pulloperation` object and I prefer to have all core
function collaborating through this object in the same place.
This changeset is pure code movement only. Code change for direct consumption of
the `pulloperation` object will come later.
This object will hold all data and state gathered through the pull. This will
allow us to split the long function into multiple small one. Smaller function
will be easier to maintains and wrap. The idea is to blindly store all
information related to the pull in this object so that each step and extension
can use them if necessary.
We start by putting the `repo` variable in the object. More migration in other
function.
The localrepo class if far too big. Push and pull logic are extracted and
reworked to better fit with the fact we exchange more than bundle now.
This changeset extract the pulh code. later changeset will slowly slice it into
smaller brick.
The localrepo.pull method is kept for now to limit impact on user code. But it
will be ultimately removed, now that the public API is hold by peer.
Now that every necessary information is held in the `pushoperation` object, we
can extract the new common set computation to it's own function.
This changeset is pure code movement only.
The phase synchronisation start by computing the new set of common head between
local and remote and then do the phase synchronisation on this set. This new
common set logic will eventually be used by the obsolescence markers exchange.
So we are going to split the long phase synchronisation in two.
Now that every necessary information is held in the `pushoperation` object, we
can extract the discovery logic to it's own function.
This changeset is pure code movement only.
Now that every necessary information is held in the `pushoperation` object, we
can extract the part responsible of aborting the push to it's own function.
This changeset is mostly pure code movement. the exception is the fact this
function returns a value to decide if changeset bundle should be pushed.
The fact that there is some unknown changes on remote one of the result of
discovery. It is then used by some push validation logic.
We move it in the object to be able to extract the said logic.
Now that every necessary information is held in the `pushoperation` object, we
can extract the logic pushing changeset to it's own function.
This changeset is pure code movement only.
The heads of the remote repository are used to detect race when pushing
changeset. We now store this information in `pushoperation` object to allow
extraction of the changeset pushing part.
Now that every necessary information is held in the `pushoperation` object, we
can finally extract the phase synchronisation phase to it's own function. This
is the first concrete block of code we extract from the huge push function.
Hooray!
This changeset is pure code movement only.
The set of outgoing and common changeset are used by phases to compute the new
common set between local and remote. So we need to move it into the object to
extract the phase sync from the god function.
Note that this information will be used by obsolescence markers too.
The return code convey information about the success of changeset push. This is
used by phases to compute the new common set between local and remote. So we
need to move it into the object to extract the phase sync from the god
function.
Note that this information will be used by obsolescence markers too.
Now that pushop is holding all the push related data, we do not really need a
closure anymore. So we start feeding the object to `localphasemove` and will
make it a normal function in the next commit.
During push, we try to lock the local repo to move local phase according to what
we saw/pushed on the remote repo. Locking the repo may fail, in that case we let
the push proceed without local phase movement (printing warning).
This mean we have code in phase synchronisation that will check if the local repo is
locked or not. we need to move this information in the push object to be able to
extract the phase synchronisation in its own function. This is done as a boolean
because putting reference to the actual lock outside of the main function
sounded a bad idea.
The obsolescence marker exchange code was already extracted during a previous
cycle. We are moving the extracted functio in this module. This function will
read and write data in the `pushoperation` object and I prefer to have all core
function collaborating through this object in the same place.
This changeset is pure code movement only. Code change for direct consumption of
the `pushoperation` object will come later.
The bookmark exchange code was already extracted during a previous cycle. This
changesets moves the extracted function in this module. This function will read
and write data in the `pushoperation` object and It is preferable to have all
core function collaborating through this object in the same place.
This changeset is pure code movement only. Code change for direct consumption of
the `pushoperation` object will come later.
This object will hold all data and state gathered through the push. This will
allow us to split the long function into multiple small one. Smaller function
will be easier to maintains and wrap. The idea is to blindly store all
information related to the push in this object so that each step and extension
can use them if necessary.
We start by putting the `repo` variable in the object. More migration in other
changeset.
The localrepo class if far too big. Push and pull logic will be extracted and
reworked to better fit with the fact they now exchange more than plain changeset
bundle.
This changeset extract the push code. later changeset will slowly slice this
over 200 hundred lines and 8 indentation level function into smaller saner
brick.
The localrepo.push method is kept for now to limit impact on user code. But it
will be ultimately removed, now that the public supposed API is hold by peer.
When bundle2 was enabled, if hg pull had no commits to pull, it would print
'no changes found' and then download the entire repository from the server. This
was caused by heads and common being set to None, which gets treated as
heads=cl.heads() and common=[nullid], which means download the entire repo.
Pulling bundles without a changegroup is a valid use case (like if we're just
updating bookmarks), so this modifes the bundle code to allow not adding
changegroups.