In all the remaining cases the comprehension variable is used for the same
thing as a previous loop variable.
This will mute some pyflakes "list comprehension redefines" warnings.
The ``getbundle`` function was initially design to return a changegroup bundle.
However, bundle2 allows transmitting a wide range of data. Some bundle2
requests may not include a changegroup at all.
Before this changeset, the client would request a changegroup for
``heads=[nullid]`` and receive an empty changegroup.
We introduce an official boolean parameter, ``cg``, that can be set
to false to disable changegroup generation on getbundle. A new bundle2
capability is introduced to let the client know.
In case of an unknown parameter passed to ``getbundle``, the server prints a
message and ignores the parameter. This message was misleadingly prefixed with
"abort: ". The use of "abort: " here is clearly wrong as nothing is actually
aborted and the ``getbundle`` call proceeds without the parameter.
The message is now prefixed with "warning: " and rephrased to make it clearer
that the parameter was simply ignored.
A new ``listkeys`` is supported by getbundle. It is a list of namespaces whose
content should be included in the bundle.
An appropriate entry has been added to the wireproto map of getbundle arguments
and a new bundle2 capability is advertised.
There are still no codes that request such parts in core mercurial.
In addition to listing the expected options for ``getbundle``, we also list their
types and handle the encoding/decoding automatically. This should make it easier
for extensions to transmit additional information to getbundle.
For now, getbundle accepts a fixed number of arguments: ``heads``, ``common``
and ``bundlecaps``. We make this list exposed at the module level to let
extensions add content there. This is important for extensions that wish to use
bundle2 for other contents than changegroup.
We picked a null character to split each parameter during the transfer. This is
fragile if the same character is used in parameter name. However other
codes will already behave in a strange way in that case, so we are not
introducing any regression. A better format may be picked for the final
version of the protocol.
This is a backward compatibility breakage per se. But bundle2 was explicitly
flagged as experimental, and this is one an error path anyway. So the worse
possible outcome from this change is to still have a crash but with a different
message.
We are going to raise exceptions for a wider range of cases: unsupported
mandatory stream and part parameters. We rename the exception with a wider
name.
Same drill again. We catch the PushRaced error, check if it cames from
a bundle2 processing, if so we turn it into a bundle2 with a part
transporting error information to be reraised client side.
If the heads on the server differ from the ones reported seen by the client at
bundle time, we raise a PushRaced exception. However, the part raising the
exception was broken.
To fix it, we move the PushRaced class in the error module so it can be
accessible everywhere without an import cycle.
A test is also added to prevent regression.
Same as for Abort error, we catch the error, encode it into a bundle2 reply
(expected by the client) and stream this reply. The client processing of the
error will raise the exception again.
Clients expect a bundle2 reply to their bundle2 submission. So we
catch the Abort error and turn it into a bundle2 containing a part
transporting the exception data. The unbundling of this reply will
raise the error again.
We highlight the fact that this is experimental by moving it to an "experimental"
section, and we match the config name with the server capability name
`bundle2-exp`.
For the same reason, we advertise this bundle2 implementation and format as
experimental. This will leave room for field testing in 3.0 but won't conflict
with a stable implementation in 3.1.
The current implementation of bundle2 is still very experimental and the 3.0
freeze is yesterday. The current bundle2 format has never been field-tested, so
we rename the header to HG2X. This leaves the HG20 header available for real
usage as a stable format in Mercurial 3.1.
We won't guarantee that future mercurial versions will keep supporting this
`HG2X` format.
We can now retrieve them from the server during push. The capabilities are
encoded the same way as in `replycaps` part (with an extra layer of urlquoting
to escape separators).
We use the new method defined in the past changeset to send a bundle2 stream and
receive one in reply. The http version is missing remote output support. This
will be done later using a bundle part.
This method will be used by bundle2 pushes. It calls a command, feeds it with a
stream and receives another stream in reply.
Actual implementation for ssh and http will be done in later changesets.
This changeset makes `wireprotocol` peers advertise bundle2 capability and
comply with bundle2 `getbundle` requests.
Note that advertising bundle2 could make a client try to use it for push. Such
pushes would fail. However, I do not expect any human being to have enabled
bundle2 on their server yet.
Localrepo now supports the unbundle method of pushing changegroups. We
plan to use the unbundle call for bundle2 so it is important that all
peers supports it. The `peer.unbundle` and `peer.addchangegroup` code
path have small difference so cause some test output changes. None of those
changes seems problematic.
The `exchange` module now contains an `unbundle` function that holds the core
unbundle logic. The wire protocol keeps its own unbundle function. It enforces
wireprotocol-specific logic and then calls the extracted function.
This aims at implementing unbundle for localrepo.
We are going to refactor the unbundle function to have it working on
a local repository too. Having this function extracted will ease the process.
In the case of non-matching heads, the function now directly raises an
exception. The top level of the function is catching it.
This is a gratuitous code move aimed at reducing the localrepo bloatness.
The method had few callers, not enough to be kept in local repo.
The peer API stay unchanged.
This is a gratuitous code move aimed at reducing the localrepo bloatness.
The method had few callers, not enough to be kept in local repo.
The peer API remains unchanged.
This is a gratuitous code move aimed at reducing the localrepo bloatness.
The method had few callers, not enough to be kept in local repo.
The peer API remains unchanged.
Move move in the same direction we took for command line commands. each wire
protocol function will be decorated with its name and arguments.
Beside beside easier to read, this open the road to easily adding more metadata
(like security level or return type)
We already have multiple call function for multiple return type. The
`_decompress` function is only used for http and seems like a layer violation.
We drop it in favor of a new call type dedicated to "stream that may be useful to
compress".
The wireprotocole command use a small set of class as return value. We document
the meaning of each of them.
AS my knowledge of wire protocol is fairly shallow, the documentation can
probably use improvement. But this is a better than nothing.
It will help people that need to add capabilities (in a more subtle was that
just adding some to the list) in multiple way:
1. This function returns a list, not a string. Making it easier to look at,
extend or alter the content.
2. The original capabilities function will be store in the dictionary of wire
protocol command. So extension that wrap this function also need to update
the dictionary entry.
Both wrapping and update of the dictionary entry are needed because the
`hello` wire protocol use the function itself. This is specifically sneaky for
extension writer as ssh use the `hello` command while http use the
`capabilities` command.
With this new `_capabilities` function there is one and only one obvious
place to wrap when needed.
Moves the file walk out of the stream method so that extensions can override it.
This allows an extension to decide what files should be streamed, and in
particular allows a stream without filelogs.
The message was not very much to the point and did not in any way help an
ordinary user.
'repository changed while preparing/uploading bundle - please try again'
is more correct, gives the user some understanding of what is going on, and
tells how to 'recover' from the situation.
The 'bundle' aspect could be seen as an implementation detail that shouldn't be
mentioned, but I think it helps giving an exact error message.
The message could still leave the user wondering why Mercurial doesn't lock the
repo and how unsafe it thus is. Explaining that is however too much detail.
Now that changelog filtering is in place, it's become evident that
naming the filters according to the set of revs _not_ included in the
filtered changelog is confusing. This is especially evident in the
collaborative branch cache scheme.
This changes the names of the filters to reflect the revs that _are_
included:
hidden -> visible
unserved -> served
mutable -> immutable
impactable -> base
repoview.filteredrevs is renamed to filterrevs, so that callers read a
bit more sensibly, e.g.:
filterrevs('visible') # filter revs according to what's visible
Merely creating and using a generator has a measurable impact,
particularly since the common case for stream_out is generators that
yield just once. Avoiding generators improves stream_out performance
by about 7%.
Auditing at this stage is both pointless (paths are already trusted by
the local repo) and expensive. Skipping the audits improves stream_out
performance by about 15%.
They were previously inside the mercurial.phases module, but obsolete
logic will need them to exclude `extinct` changesets from pull and
push.
The proper and planned way to implement such filtering is still to apply a
changelog level filtering. But we are far to late in the cycle to implement and
push such a critical piece of code (changelog filtering). With Matt Mackall
approval I'm extending this quick and dirty mechanism for obsolete purpose.
Changelog level filtering should come during the next release cycle.
This change separates peer implementations from the repository implementation.
localpeer currently is a simple pass-through to localrepository, except for
legacy calls, which have already been removed from localpeer. This ensures that
the local client code only uses the most modern peer API when talking to local
repos.
Peers have a .local() method which returns either None or the underlying
localrepository (or descendant thereof). Repos have a .peer() method to return
a freshly constructed localpeer. The latter is used by hg.peer(), and also to
allow folks to pass either a peer or a repo to some generic helper methods.
We might want to get rid of .peer() eventually.
The only user of locallegacypeer is debugdiscovery, which uses it to pose as a
pre-setdiscovery client. But we decided to leave the old API defined in
locallegacypeer for clarity and maybe for other uses in the future.
It might be nice to actually define the peer API directly in peer.py as stub
methods. One problem there is, however, that localpeer implements
lock/addchangegroup, whereas the true remote peers implement unbundle.
It might be desireable to get rid of this distinction eventually.
Discovery now use an overlay above branchmap to prune invisible "secret"
changeset from branchmap.
To minimise impact on the code during the code freeze, this is achieve by
recomputing non-secret heads on the fly when any secret changeset exists. This
is a computation heavy approach similar to the one used for visible heads. But
few sever should contains secret changeset anyway. See comment in code for more
robust approach.
On local repo the wrapper is applied explicitly while the wire-protocol take
care of wrapping branchmap call in a transparent way. This could be unified by
the Peter Arrenbrecht and Sune Foldager proposal of a `peer` object.
An inappropriate `(+i heads)` may still appear when pushing new changes on a
repository with secret changeset. (see Issue3394 for details)
The `repo` object here is *always* local. Using `repo.heads()` ensure we will
reject push if any secret changeset exists.
During discovery, `visibleheads` were sent to the peer. So we can only expect it
to send us `visibleheads` back. If any secret changeset exists::
visibleheads != repo.heads()
This fix server side part of issue 3303 when pushing over the wire.
Any secret changesets will be excluded from pull and push. Phase data are
properly synchronized on pull and push if a changeset is seen as secret locally
but is non-secret remote side.
This patch does not handle the case of a changeset secret on remote but known
locally.
Remote side may add useful information alongside failure return code. For
example "ssl is required". This patch mirror what is done for the unbundle
command.
Older clients will still print the provided error message and not much else:
over ssh, this will be each line prefixed with 'remote: ' in addition to an
"abort: unexpected response: '\n'"; over http, this will be the '---%<---'
banners in addition to the 'does not appear to be a repository' message.
Currently, clients with this patch will display 'abort: remote error:\n' and
the provided error text, but it is trivial to style the error text however is
deemed appropriate.
Makes lookup, heads, known, branchmap, pushkey, and listkeys batchable.
It could, for instance, be interesting to use this to batch calls to
lookup when a pull or clone has multiple --rev arguments. The next patch
is going to batch heads and known to slightly tune discovery.
Firstly, I think we should do this for all new wire commands, just
to be on the safe side. So I want to get this into the 1.9 release.
Secondly, there actually is potential here that sometimes the server
can know that the number of its nodes which can possibly still be
undecided on the client is small. It might then just send them along
directly (cutting short the end game). This, however, requires
walking the graph on the server, which can be expensive, so for the
moment we're not actually doing it.
Changeset 20b319765bcf introduced the unbundlehash capability and
unconditionally hashed the heads on the client side. By mistake, the
heads were also cased in the heads == ['force'] case.
Send the command arguments in the HTTP headers. The command is still part
of the URL. If the server does not have the 'httpheader' capability, the
client will send the command arguments in the URL as it did previously.
Web servers typically allow more data to be placed within the headers than
in the URL, so this approach will:
- Avoid HTTP errors due to using a URL that is too large.
- Allow Mercurial to implement a more efficient wire protocol.
An alternate approach is to send the arguments as part of the request body.
This approach has been rejected because it requires the use of POST
requests, so it would break any existing configuration that relies on the
request type for authentication or caching.
Extensibility:
- The header size is provided by the server, which makes it possible to
introduce an hgrc setting for it.
- The client ignores the capability value after the first comma, which
allows more information to be included in the future.
New argument is silently ignored by both HTTP and SSH servers.
This means we can, for instance, add new flags to getbundle()
to request advanced features (like lightweight-copy-aware bundles),
and older servers will silently ignore this request and send back
a plain bundle.
Current wire protocol of unbundle sends the list of all heads in the remote
repository to avoid race condition. This causes "URL too long" error on HTTP
server when the repository has many heads.
This change allows clients to send SHA1 hash of sorted head hashes instead.
Also, this introduces "unbundlehash" capability to inform them that the server
accepts hashed heads parameter.
getbundle(common, heads) -> bundle
Returns the changegroup for all ancestors of heads which are not ancestors of common. For both
sets, the heads are included in the set.
Intended to eventually supercede changegroupsubset and changegroup. Uses heads of common region
to exclude unwanted changesets instead of bases of desired region, which is more useful and
easier to implement.
Designed to be extensible with new optional arguments (which will have to be guarded by
corresponding capabilities).
known([Node]) -> [1/0]
Returns 1/0 for each node, indicating whether it's known by the server.
Needed for new discovery protocols introduced in later patches.
Previously, branch names were ideally manipulated as UTF-8 strings,
because they were stored as UTF-8 in the dirstate and the changelog
and could not be safely converted to the local encoding and back.
However, only about 80% of branch name code was actually using the
right encoding conventions. This patch uses the localstr addition to
allow working on branch names as local strings, which simplifies
handling so that the previously incorrect code becomes correct.
The behaviour between http and ssh still differ:
- the "unsynced changes" is seen as a remote output in the http cases
- but it is correctly seen as a push error for ssh
This patch fixes issues with stream cloning in the presense of parentdelta,
lwcopy and similar additions that change the interpretation of the revlog
format, or the format itself.
Currently, the stream capability is sent like this:
stream=<version of changelog>
But the client doesn't actually check the version number; also, it only checks
the changelog and it doesn't capture the interpretation-changes and
flag-changes in parentdelta and lwcopy.
This patch removes the 'stream' capability whenever we use a non-basic revlog
format, to prevent old clients from receiving incorrect data. In those cases,
a new capability called 'streamreqs' is added instead. Instead of a revlog
version, it comes with a list of revlog-format relevant requirements, which
are a subset of the repository requirements, excluding things that are not
relevant for stream.
New clients use this to determine whether or not they can stream. Old clients
only look for the 'stream' capability, as always. New servers will still send
this when serving old repositories.