Because bundle2 allows a more precise exchange of obsmarkers during pull, it
sends them in a different order (previously unstable because of sets.) As
a result, they are added to the repository in a different order. To stabilize
the order and ensure tests are unchanged when moving from bundle1 to bundle2 we
sort markers when exchanging them.
In the long run, the obsstore will probably not use a linear storage.
The current bundle2 processing was capturing all output. This is nice as it
provide better meta data about what output what, but this was changing two
things:
1) adding a prefix "remote: " to "other" output during local push (issue4613)
2) local and ssh push does not provide real time output anymore (issue4615)
As we are unsure about what form should be used in (1) and how to solve (2) we
disable output capture in this two cases. Output capture can be forced using an
experimental option.
Until this changeset, we were only able to save output if an error happened
during the 'transaction.close()' phase. If the 'processbundle' call raised an
exception, the 'bundleoperation' object was never returned, so the reply bundle
was never accessible and no output could be salvaged. We introduce a quick (but
not very elegant) fix to gain access to any reply created during the processing.
This conclude this output related series. We should hopefully be able client-side to see the
whole server output, in a proper order.
The code is now complex enough that a refactoring of it would make sense on
default.
We were capturing all output issue during bundle2 processing, and all output
issue during transaction rollback in case of failure. However, the output issue
during transaction commit was still roaming the land freely. It is now put back
in line.
This let the user see output from 'pretxnclose' and 'txnclose' (and related) in
the right order.
External hook used to directly write on stdout and stderr. As a result their
output was not captured by the bundle2 processing. This resulted in confusing
out of order output on the client side. We are now capturing hooks output in
this context.
The output from the transaction rollback was not included into the reply bundle.
It was eventually caught by the usual 'unbundle' output capture and sent to the
client but the result was out of order on the client side. We now capture the
output for the transaction release and transmit it the same way as all other
output.
We should probably rethink the whole output capture things but this would not be
appropriate for stable.
The is still multiple cases were output failed to be properly capture, they will
be fixed in later changesets.
The re-handling of output is happening in some 'unbundle' callers. We have to
transmit the output information to this place so we stick it on the exception.
This is the third step in our quest for preserving the server output on error
(issue4594). We want to be able to copy the output part from the aborted reply
into the exception bundle.
If the client allows "pushback", the bundle2 served back by the server may
contains parts that will write to the repository. Such parts may require the
'wlock' (eg: bookmark) so we acquire it in advance to make sure it got acquired
before the 'lock'.
A bundle2 may contain bookmark updates (or other extension content) that
requires the 'wlock' to be written. As 'wlock' must be acquired before 'lock',
we must stay on the side of caution and use both in all case to ensure their
ordering.
This argument let extensions control in what order bundle2 part are generated
server side during a pull. This is useful to ensure the transaction is in a
proper state before some actions or hooks happens.
This argument let extensions control in what order bundle2 part are generated
client side during a push. This is useful to ensure the transaction is in a
proper state before some actions or hooks happens.
The series at a17556fc1521::77b112363d48 introduced generic transaction level
hooking. This makes the experimental bundle2 specific hooks redundant, we drop
them.
It is finally time to freeze the bundle2 format! To do so we:
- rename HG2Y to HG20,
- drop "b2x:" prefix from all part names,
- rename capability to "bundle2-exp" to "bundle2"
- rename the hook flag from 'bundle2-exp' to 'bundle2'
This function refactors the logic that decides to use 'bundle2' during an
exchange (pull/push). This will help being consistent while transitioning from
the experimental protocol to the final frozen version.
I do not expect this function to survive on the long run when using 'bundle2'
will become a simple capability check.
This is also necessary to allow HG2Y support in an extension to ease transition
of companies using the experimental protocol in production (yeah...). Such
extension will be able to wrap this function to use the experimental protocol in
some case.
To support more bundle2 formats, we need a wider detection of bundle2-family
streams. The various places what were explicitly detecting the full magic string
are now matching on the first three characters of it.
To support multiple bundle2 formats, we will need a function returning
the proper unbundler according to the header. We introduce such aa
function and change the usage in the code base. The function will get
smarter in later changesets.
This is somewhat similar to the dispatching we do for 'HG10' and 'HG11'.
The main target is to allow HG2Y support in an extension to ease transition of
companies using the experimental protocol in production (yeah...) But I've no
doubt this will be useful when playing with a future HG21.
When the 'kwargs' variable was added in 038ae1f12393 (bundle2: allow
pulling changegroups using bundle2, 2014-04-01), it could contain only
'bundlecaps', 'common' and 'heads', so the check for 'format' would
always be false. Since then, _pullbundle2extraprepare() has been added
for hooks, but it seems unlikely that they would a 'format' key.
The conditional was a bit too narrow and produced buggy result when a node was
present in both common and heads (because it pleased the discovery) and it was
locally known but filtered.
This resulted in buggy getbundle request and server side crash.
We have been running discovery on unfiltered repository for quite some time.
This was aimed at two things:
- save some bandwith by prevent the repushing of common but hidden changesets
- allow phases changes on secret/hidden changeset on bare push.
The cost of this unfiltered discovery combined with evolution is actually really
high. Evolution likely create thousand of hidden heads, and the discovery is
going to try to discovery if each of them are common or not. For example,
pushing from my development mercurial repository implies 17 discovery
round-trip.
The benefit are rare corner cases while the drawback are massive. So we run the
discovery on a filtered repository again.
We add some hack to detect remote heads that are known locally and adds them to
the common set anyway, so the good behavior of most of the corner case should
remains. But this will not work in all cases.
This bring my discovery phase back from 17 round-trips to 1 or 2.
This patch series is intended to allow bundle2 push reply part handlers to
make changes to the local repository; it has been developed in parallel with
an extension that allows the server to rebase incoming changesets while applying
them.
This diff adds an experimental config option "bundle2.pushback" which provides
a transaction to the reply unbundler during a push operation. This behavior is
opt-in because of potential security issues: the response can contain any part
type that has a handler defined, allowing the server to make arbitrary changes
to the local repository.
This patch series is intended to allow bundle2 push reply part handlers to
make changes to the local repository; it has been developed in parallel with
an extension that allows the server to rebase incoming changesets while applying
them.
Most pushes already open a transaction in order to sync phase information.
This diff replaces that transaction with one that spans the entire push
operation.
This transaction will be used in a later patch to guard repository changes
made during the reply handler.
This patch series is intended to allow bundle2 push reply part handlers to
make changes to the local repository; it has been developed in parallel with
an extension that allows the server to rebase incoming changesets while applying
them.
Aside from the transaction logic, the pulloperation class is used primarily as
a logic-free data structure for storing state information. This diff extracts
the transaction logic into its own class that can be shared with push
operations.
The phase-syncing code was using bundle2 if the remote supported it. It was
doing so without regard to bundle2 activation on the client. Moreover, the
phase push is now properly included in the unified bundle2 push, so having extra
code in syncphase should be useless. If the remote is bundle2-enabled, the
phases should already be synced.
The buggy verification code was leading to a crash when a 3.2 client was pushing
to a 3.1 server. The real bundle2 path detected that their versions were
incompatible, but the syncphase code failed to, sending an incompatible bundle2
to the server.
We drop the useless and buggy code as a result. The "else" clause is
de-indented in the process.
This mirrors the API for 'pending' and 'finalize' callbacks. I do not have
immediate usage planned for it, but I'm sure some callback will be happy to
access transaction related data.
Changeset d79feb65f3ee added advertising of supported changegroup version
through the new 'b2x:changegroup' capability. However, this capability is not
new and has been around since 3.1 with an empty value. This makes new clients
unable to push to 3.2 servers through bundle2 as they cannot find a common
changegroup version to use from and empty list.
Treating empty 'b2x:changegroup' value as old client fixes it.
The 'delayupdate' method now takes a transaction object and registers its
'_writepending' method for execution in 'transaction.writepending()'. The hook can then
use 'transaction.writepending()' directly.
At some point this will allow the addition of other file creation
during writepending.
When using bundle2, we find the common subset of supported changegroup-packers
and we pick the max of them. This allow to use generaldelta aware changegroups through
bundle2.
When using bundle2, we find the common subset of supported changegroup-packers
and we pick the max of them. This allow to use generaldelta aware changegroup
through bundle2.
The commands getchangegroup, getlocalchangegroup and getsubset now each
have a version ending in -raw. The raw versions return the chunk generator
from the changegroup packer directly, without wrapping it in a chunkbuffer
and unpacker. This avoids extra chunkbuffers in the bundle2 code path.
Also, the raw versions can be extended to support alternative packers
in the future, to be used from bundle2.
48062b2d0f30 regressed the behavior of pushing an unchanged bookmark to
a remote. Before that commit, pushing a unchanged bookmark would result
in "exporting bookmark @" being printed. After that commit, we now see
an incorrect message "bookmark %s does not exist on the local or remote
repository!"
This patch fixes the regression introduced by 48062b2d0f30 by having
the bookmark error reporting code filter identical bookmarks and adds
a test for the behavior.
bookmarks.compare() previously lumped identical bookmarks in the
"invalid" bucket. This patch adds a "same" bucket.
An 8-tuple for holding this state is pretty gnarly. The return value
should probably be converted into a class to increase readability. But
that is beyond the scope of a patch intended to be a late arrival to
stable.
Hooks that run after the transaction need to be able to touch the
repository. So we need to run them after the lock release. This is
similar to what the "changegroup" hook is doing in the
`addchangegroup` function.
We are changing all integers that denote the size of a chunk to read to int32.
There are two main motivations for that.
First, we change everything to the same width (32 bits) to make it possible for
a reasonably agnostic actor to forward a bundle2 without any extra processing.
With this change, this could be achieved by just reading int32s and forwarding
chunks of the size read. A bit a smartness would be logic to detect the end of
stream but nothing too complicated.
Second, we need some capacity to transmit special information during the bundle
processing. For example we would like to be able to raise an exception while a
part is being read if this exception happend while this part was generated.
Having signed integer let us use negative numbers to trigger special events
during the parsing of the bundle.
The format is renamed for B2X to B2Y because this breaks binary
compatibility. The B2X format support is dropped. It was experimental to
allow this kind of things. All elements not directly related to the binary
format remain flagged "b2x" because they are still compatible.
We need a wider set of hooks to process all the changes that happened during the
pull transaction. We reuse the experimental `b2x-transactionclose` hook set
from server's unbundle for consistency. This hook is experimental and will not
remains as-is forever, but this will open the door for experimentation in 3.2.
The source information can, should be applied once when opening the transaction
for the pull. This will lets element processed within a bundle2 be aware of them
and open the door to running a set of hooks when closing this pull transaction.
This is similar to what is done in server's unbundle call.
We store the source and url of the current data into `transaction.hookargs` this
let us inherit it from upper layers that may have created a much wider
transaction. We have to modify bundle2 at the same time to register the source
and url in the transaction. We have to do it in the same patch otherwise, the
`addchangegroup` call would fill these values and the hook calling will crash
because of the duplicated 'source' and 'url' arguments passed to the hook call.
A bundle2 may contain multiple parts adding changegroups, in which case there
are multiple operation records for changegroups, each with its own return
value. Those multiple return values are aggregated in a single cgresult value
for the whole operation.
As can be seen in the associated test case, the situation with hooks is not
really the best, but without deeper thoughts and changes, we can't do much
better. Hopefully, things will be improved before bundle2 is enabled by default.
In the meanwhile, multiple changegroups is not expected to be in widespread
use, and even less expected to be used for pushes. Also, not many clients
cloning or pulling bundle2 with multiple changesets are not expected to have
changegroup hooks anyways.