Move move in the same direction we took for command line commands. each wire
protocol function will be decorated with its name and arguments.
Beside beside easier to read, this open the road to easily adding more metadata
(like security level or return type)
We already have multiple call function for multiple return type. The
`_decompress` function is only used for http and seems like a layer violation.
We drop it in favor of a new call type dedicated to "stream that may be useful to
compress".
The wireprotocole command use a small set of class as return value. We document
the meaning of each of them.
AS my knowledge of wire protocol is fairly shallow, the documentation can
probably use improvement. But this is a better than nothing.
It will help people that need to add capabilities (in a more subtle was that
just adding some to the list) in multiple way:
1. This function returns a list, not a string. Making it easier to look at,
extend or alter the content.
2. The original capabilities function will be store in the dictionary of wire
protocol command. So extension that wrap this function also need to update
the dictionary entry.
Both wrapping and update of the dictionary entry are needed because the
`hello` wire protocol use the function itself. This is specifically sneaky for
extension writer as ssh use the `hello` command while http use the
`capabilities` command.
With this new `_capabilities` function there is one and only one obvious
place to wrap when needed.
Moves the file walk out of the stream method so that extensions can override it.
This allows an extension to decide what files should be streamed, and in
particular allows a stream without filelogs.
The message was not very much to the point and did not in any way help an
ordinary user.
'repository changed while preparing/uploading bundle - please try again'
is more correct, gives the user some understanding of what is going on, and
tells how to 'recover' from the situation.
The 'bundle' aspect could be seen as an implementation detail that shouldn't be
mentioned, but I think it helps giving an exact error message.
The message could still leave the user wondering why Mercurial doesn't lock the
repo and how unsafe it thus is. Explaining that is however too much detail.
Now that changelog filtering is in place, it's become evident that
naming the filters according to the set of revs _not_ included in the
filtered changelog is confusing. This is especially evident in the
collaborative branch cache scheme.
This changes the names of the filters to reflect the revs that _are_
included:
hidden -> visible
unserved -> served
mutable -> immutable
impactable -> base
repoview.filteredrevs is renamed to filterrevs, so that callers read a
bit more sensibly, e.g.:
filterrevs('visible') # filter revs according to what's visible
Merely creating and using a generator has a measurable impact,
particularly since the common case for stream_out is generators that
yield just once. Avoiding generators improves stream_out performance
by about 7%.
Auditing at this stage is both pointless (paths are already trusted by
the local repo) and expensive. Skipping the audits improves stream_out
performance by about 15%.
They were previously inside the mercurial.phases module, but obsolete
logic will need them to exclude `extinct` changesets from pull and
push.
The proper and planned way to implement such filtering is still to apply a
changelog level filtering. But we are far to late in the cycle to implement and
push such a critical piece of code (changelog filtering). With Matt Mackall
approval I'm extending this quick and dirty mechanism for obsolete purpose.
Changelog level filtering should come during the next release cycle.
This change separates peer implementations from the repository implementation.
localpeer currently is a simple pass-through to localrepository, except for
legacy calls, which have already been removed from localpeer. This ensures that
the local client code only uses the most modern peer API when talking to local
repos.
Peers have a .local() method which returns either None or the underlying
localrepository (or descendant thereof). Repos have a .peer() method to return
a freshly constructed localpeer. The latter is used by hg.peer(), and also to
allow folks to pass either a peer or a repo to some generic helper methods.
We might want to get rid of .peer() eventually.
The only user of locallegacypeer is debugdiscovery, which uses it to pose as a
pre-setdiscovery client. But we decided to leave the old API defined in
locallegacypeer for clarity and maybe for other uses in the future.
It might be nice to actually define the peer API directly in peer.py as stub
methods. One problem there is, however, that localpeer implements
lock/addchangegroup, whereas the true remote peers implement unbundle.
It might be desireable to get rid of this distinction eventually.
Discovery now use an overlay above branchmap to prune invisible "secret"
changeset from branchmap.
To minimise impact on the code during the code freeze, this is achieve by
recomputing non-secret heads on the fly when any secret changeset exists. This
is a computation heavy approach similar to the one used for visible heads. But
few sever should contains secret changeset anyway. See comment in code for more
robust approach.
On local repo the wrapper is applied explicitly while the wire-protocol take
care of wrapping branchmap call in a transparent way. This could be unified by
the Peter Arrenbrecht and Sune Foldager proposal of a `peer` object.
An inappropriate `(+i heads)` may still appear when pushing new changes on a
repository with secret changeset. (see Issue3394 for details)
The `repo` object here is *always* local. Using `repo.heads()` ensure we will
reject push if any secret changeset exists.
During discovery, `visibleheads` were sent to the peer. So we can only expect it
to send us `visibleheads` back. If any secret changeset exists::
visibleheads != repo.heads()
This fix server side part of issue 3303 when pushing over the wire.
Any secret changesets will be excluded from pull and push. Phase data are
properly synchronized on pull and push if a changeset is seen as secret locally
but is non-secret remote side.
This patch does not handle the case of a changeset secret on remote but known
locally.
Remote side may add useful information alongside failure return code. For
example "ssl is required". This patch mirror what is done for the unbundle
command.
Older clients will still print the provided error message and not much else:
over ssh, this will be each line prefixed with 'remote: ' in addition to an
"abort: unexpected response: '\n'"; over http, this will be the '---%<---'
banners in addition to the 'does not appear to be a repository' message.
Currently, clients with this patch will display 'abort: remote error:\n' and
the provided error text, but it is trivial to style the error text however is
deemed appropriate.
Makes lookup, heads, known, branchmap, pushkey, and listkeys batchable.
It could, for instance, be interesting to use this to batch calls to
lookup when a pull or clone has multiple --rev arguments. The next patch
is going to batch heads and known to slightly tune discovery.
Firstly, I think we should do this for all new wire commands, just
to be on the safe side. So I want to get this into the 1.9 release.
Secondly, there actually is potential here that sometimes the server
can know that the number of its nodes which can possibly still be
undecided on the client is small. It might then just send them along
directly (cutting short the end game). This, however, requires
walking the graph on the server, which can be expensive, so for the
moment we're not actually doing it.
Changeset 20b319765bcf introduced the unbundlehash capability and
unconditionally hashed the heads on the client side. By mistake, the
heads were also cased in the heads == ['force'] case.
Send the command arguments in the HTTP headers. The command is still part
of the URL. If the server does not have the 'httpheader' capability, the
client will send the command arguments in the URL as it did previously.
Web servers typically allow more data to be placed within the headers than
in the URL, so this approach will:
- Avoid HTTP errors due to using a URL that is too large.
- Allow Mercurial to implement a more efficient wire protocol.
An alternate approach is to send the arguments as part of the request body.
This approach has been rejected because it requires the use of POST
requests, so it would break any existing configuration that relies on the
request type for authentication or caching.
Extensibility:
- The header size is provided by the server, which makes it possible to
introduce an hgrc setting for it.
- The client ignores the capability value after the first comma, which
allows more information to be included in the future.
New argument is silently ignored by both HTTP and SSH servers.
This means we can, for instance, add new flags to getbundle()
to request advanced features (like lightweight-copy-aware bundles),
and older servers will silently ignore this request and send back
a plain bundle.
Current wire protocol of unbundle sends the list of all heads in the remote
repository to avoid race condition. This causes "URL too long" error on HTTP
server when the repository has many heads.
This change allows clients to send SHA1 hash of sorted head hashes instead.
Also, this introduces "unbundlehash" capability to inform them that the server
accepts hashed heads parameter.
getbundle(common, heads) -> bundle
Returns the changegroup for all ancestors of heads which are not ancestors of common. For both
sets, the heads are included in the set.
Intended to eventually supercede changegroupsubset and changegroup. Uses heads of common region
to exclude unwanted changesets instead of bases of desired region, which is more useful and
easier to implement.
Designed to be extensible with new optional arguments (which will have to be guarded by
corresponding capabilities).
known([Node]) -> [1/0]
Returns 1/0 for each node, indicating whether it's known by the server.
Needed for new discovery protocols introduced in later patches.
Previously, branch names were ideally manipulated as UTF-8 strings,
because they were stored as UTF-8 in the dirstate and the changelog
and could not be safely converted to the local encoding and back.
However, only about 80% of branch name code was actually using the
right encoding conventions. This patch uses the localstr addition to
allow working on branch names as local strings, which simplifies
handling so that the previously incorrect code becomes correct.
The behaviour between http and ssh still differ:
- the "unsynced changes" is seen as a remote output in the http cases
- but it is correctly seen as a push error for ssh
This patch fixes issues with stream cloning in the presense of parentdelta,
lwcopy and similar additions that change the interpretation of the revlog
format, or the format itself.
Currently, the stream capability is sent like this:
stream=<version of changelog>
But the client doesn't actually check the version number; also, it only checks
the changelog and it doesn't capture the interpretation-changes and
flag-changes in parentdelta and lwcopy.
This patch removes the 'stream' capability whenever we use a non-basic revlog
format, to prevent old clients from receiving incorrect data. In those cases,
a new capability called 'streamreqs' is added instead. Instead of a revlog
version, it comes with a list of revlog-format relevant requirements, which
are a subset of the repository requirements, excluding things that are not
relevant for stream.
New clients use this to determine whether or not they can stream. Old clients
only look for the 'stream' capability, as always. New servers will still send
this when serving old repositories.