We already support multiple primitive for listing files, which were
affected by the current changeset.
This patch adds files() which returns files of the current changeset
matching a given pattern or fileset query via the "set:" prefix.
There isn't a formal handshake protocol in the wire protocol. But
clients almost certainly need to perform particular actions before they
can communicate with a server optimally. So document what that is
so people understand what's going on at connection establishment time.
The Mercurial wire protocol is under-documented. This includes a lack
of source docstrings and comments as well as pages on the official
wiki.
This patch adds the beginnings of "internals" documentation on the
wire protocol.
The documentation should have nearly complete coverage on the
lower-level parts of the protocol, such as the different transport
mechanims, how commands and arguments are sent, capabilities, and,
of course, the commands themselves.
As part of writing this documentation, I discovered a number of
deficiencies in the protocol and bugs in the implementation. I've
started sending patches for some of the issues. I hope to send a lot
more.
This patch starts with the scaffolding for a new internals page.
And refactor dispatch.py to use it. As you can see, the resulting code
is much simpler.
I was tempted to inline _runcommand as part of writing this series.
However, a number of extensions wrap _runcommand. So keeping it around
is necessary (extensions can't easily wrap runcommand because it calls
hooks before and after command execution).
V2:
- move from shortest() with minlength 8 to minlength 4
- mention [templates] in config.txt
- better describe the difference between [templatealias] and [templates]
V3:
- choose a better example template
Python 2.7 supports specifying a custom cipher list to TLS sockets.
Advanced users may wish to specify a custom cipher list to increase
security. Or in some cases they may wish to prefer weaker ciphers
in order to increase performance (e.g. when doing stream clones
of very large repositories).
This patch introduces a [hostsecurity] config option for defining
the cipher list. The help documentation states that it is for
advanced users only.
Honestly, I'm a bit on the fence about providing this because
it is a footgun and can be used to decrease security. However,
there are legitimate use cases for it, so I think support should
be provided.
Currently, Mercurial will use TLS 1.0 or newer when connecting to
remote servers, selecting the highest TLS version supported by both
peers. On older Pythons, only TLS 1.0 is available. On newer Pythons,
TLS 1.1 and 1.2 should be available.
Security professionals recommend avoiding TLS 1.0 if possible.
PCI DSS 3.1 "strongly encourages" the use of TLS 1.2.
Known attacks like BEAST and POODLE exist against TLS 1.0 (although
mitigations are available and properly configured servers aren't
vulnerable).
I asked Eric Rescorla - Mozilla's resident crypto expert - whether
Mercurial should drop support for TLS 1.0. His response was
"if you can get away with it." Essentially, a number of servers on
the Internet don't support TLS 1.1+. This is why web browsers
continue to support TLS 1.0 despite desires from security experts.
This patch changes Mercurial's default behavior on modern Python
versions to require TLS 1.1+, thus avoiding known security issues
with TLS 1.0 and making Mercurial more secure by default. Rather
than drop TLS 1.0 support wholesale, we still allow TLS 1.0 to be
used if configured. This is a compromise solution - ideally we'd
disallow TLS 1.0. However, since we're not sure how many Mercurial
servers don't support TLS 1.1+ and we're not sure how much user
inconvenience this change will bring, I think it is prudent to ship
an escape hatch that still allows usage of TLS 1.0. In the default
case our users get better security. In the worst case, they are no
worse off than before this patch.
This patch has no effect when running on Python versions that don't
support TLS 1.1+.
As the added test shows, connecting to a server that doesn't
support TLS 1.1+ will display a warning message with a link to
our wiki, where we can guide people to configure their client to
allow less secure connections.
Currently, Mercurial will use TLS 1.0 or newer when connecting to
remote servers, selecting the highest TLS version supported by both
peers. On older Pythons, only TLS 1.0 is available. On newer Pythons,
TLS 1.1 and 1.2 should be available.
Security-minded people may want to not take any risks running
TLS 1.0 (or even TLS 1.1). This patch gives those people a config
option to explicitly control which TLS versions Mercurial should use.
By providing this option, one can require newer TLS versions
before they are formally deprecated by Mercurial/Python/OpenSSL/etc
and lower their security exposure. This option also provides an
easy mechanism to change protocol policies in Mercurial. If there
is a 0-day and TLS 1.0 is completely broken, we can act quickly
without changing much code.
Because setting the minimum TLS protocol is something you'll likely
want to do globally, this patch introduces a global config option under
[hostsecurity] for that purpose.
wrapserversocket() has been taught a hidden config option to define
the explicit protocol to use. This is queried in this function and
not passed as an argument because I don't want to expose this dangerous
option as part of the Python API. There is a risk someone could footgun
themselves. But the config option is a devel option, has a warning
comment, and I doubt most people are using `hg serve` to run a
production HTTPS server (I would have something not Mercurial/Python
handle TLS). If this is problematic, we can go back to using a
custom extension in tests to coerce the server into bad behavior.
hgweb currently offers limited functionality for "classifying"
repositories. This patch aims to change that.
The web.labels config option list is introduced. Its values
are exposed to the "index" and "summary" templates. Custom
templates can use template features like ifcontains() to e.g.
look for the presence of a specific label and engage specific
behavior. For example, a site operator may wish to assign a
"defunct" label to a repository so the repository is prominently
marked as dead in repository indexes.
This corrects a warning from lintian that we're shipping an executable without
a man page. Since there is a doc string in the text, let's use that for the man
page.
Now that we have a mechanism for declaring path sub-options, we can
start to pile on features!
Many power users have expressed frustration that bare `hg push`
attempts to push all local revisions to the remote. This patch
introduces the "pushrev" path sub-option to control which revisions
are pushed when no "-r" argument is specified.
The value of this sub-option is a revset, naturally.
A future feature addition could potentially introduce a "pushnames"
sub-options that declares the list of names (branches, bookmarks,
topics, etc) to push by default. The entire "what to push by default"
feature should probably be considered before this patch lands.
Recent work has introduced the [hostsecurity] config section for
defining per-host security settings. This patch builds on top
of this foundation and implements the ability to define a per-host
path to a file containing certificates used for verifying the server
certificate. It is logically a per-host web.cacerts setting.
This patch also introduces a warning when both per-host
certificates and fingerprints are defined. These are mutually
exclusive for host verification and I think the user should be
alerted when security settings are ambiguous because, well,
security is important.
Tests validating the new behavior have been added.
I decided against putting "ca" in the option name because a
non-CA certificate can be specified and used to validate the server
certificate (commonly this will be the exact public certificate
used by the server). It's worth noting that the underlying
Python API used is load_verify_locations(cafile=X) and it calls
into OpenSSL's SSL_CTX_load_verify_locations(). Even OpenSSL's
documentation seems to omit that the file can contain a non-CA
certificate if it matches the server's certificate exactly. I
thought a CA certificate was a special kind of x509 certificate.
Perhaps I'm wrong and any x509 certificate can be used as a
CA certificate [as far as OpenSSL is concerned]. In any case,
I thought it best to drop "ca" from the name because this reflects
reality.
smtp.verifycert was accidentally broken by 799db3fe9866. And,
I believe the "loose" value has been broken for longer than that.
The current code refuses to talk to a remote server unless the
CA is trusted or the fingerprint is validated. In other words,
we lost the ability for smtp.verifycert to lower/disable security.
There are special considerations for smtp.verifycert in
sslutil.validatesocket() (the "strict" argument). This violates
the direction sslutil is evolving towards, which has all security
options determined at wrapsocket() time and a unified code path and
configs for determining security options.
Since smtp.verifycert is broken and since we'll soon have new
security defaults and new mechanisms for controlling host security,
this patch formally deprecates smtp.verifycert. With this patch,
the socket security code in mail.py now effectively mirrors code
in url.py and other places we're doing socket security.
For the record, removing smtp.verifycert because it was accidentally
broken is a poor excuse to remove it. However, I would have done this
anyway because smtp.verifycert is a one-off likely used by few people
(users of the patchbomb extension) and I don't think the existence
of this seldom-used one-off in security code can be justified,
especially when you consider that better mechanisms are right around
the corner.
We introduce the [hostsecurity] config section. It holds per-host
security settings.
Currently, the section only contains a "fingerprints" option,
which behaves like [hostfingerprints] but supports specifying the
hashing algorithm.
There is still some follow-up work, such as changing some error
messages.
The post-* family of hooks will not run in case a command fails (i.e.
raises an exception). This makes it inconvenient to hook into events
such as doing something in case of a failed push.
We catch all exceptions to run the failure hook. I am not sure if this
is too aggressive, but tests apparently pass.
A pretty common pattern in templates is adding conditional separators
like so:
{node}{if(bookmarks, " {bookmarks}")}{if(tags, " {tags}")}
With this patch, the above can be simplified to:
{separate(" ", node, bookmarks, tags)}
The function is similar to the already existing join(), but with a few
differences:
* separate() skips empty arguments
* join() expects a single list argument, while separate() expects
each item as a separate argument
* separate() takes the separator first in order to allow a variable
number of arguments after it
Before this patch, when printing help text using `hg help`, or `hg log -h`,
the output will wrap at 78 chars even if the user has a bigger terminal width
and there is no config option to change it, making the experience different
from the commonly used `man` tool.
This patch introduces a new config option `ui.textwidth`, which replaces the
hardcoded number. It's set to 78 by default to maintain compatibility. When
set to 0, `hg help` will behave more like `man`.
Before this patch, the HTTP transport protocol would always zlib
compress certain responses (notably "getbundle" wire protocol commands)
at zlib compression level 6.
zlib can be a massive CPU resource sink for servers. Some server
operators may wish to reduce server-side CPU requirements while
requiring more bandwidth. This is common on corporate intranets, for
example. Others may wish to use more CPU but reduce bandwidth.
This patch introduces a config option to allow server operators
to control the zlib compression level.
On the "mozilla-unified" generaldelta repository, setting this
value to "0" (disable compression) results in server-side CPU
utilization for a `hg clone` going from ~180s to ~124s CPU time on
my i7-6700K. A level of "1" (which increases the transfer size from
~1,074 MB at level 6 to ~1,222 MB) utilizes ~132s CPU time.
This patch subtly changes the behavior of the parsing of "X.Y" values
to not set the "section" variable when rendering a known sub-topic.
Previously, "section" would be the same as the sub-topic name. This
required the sub-topic RST to have a section named the same as the
sub-topic name.
When I made this change, the descriptions from help.internalstable
started being rendered in command line output. This didn't look correct
to me, as it didn't match the formatting of main help pages. I
corrected this by moving the top section to help.internalstable and
changing the section levels of all the "internals" topics.
The end result is that "internals" topics now match the rendering of
main topics on both the CLI and HTML. And, "internals" topics no longer
require a main section matching the name of the topic.
Because parsing "$n" requires a crafted tokenizer, it exists only for backward
compatibility (as documented in revset._tokenizealias.) This patch updates the
examples so that users are encouraged to use symbolic names instead of "$n"s.
I'm going to implement alias expansion in templater, which won't support "$n"
parameters to make my life easier. Templater is more complicated than revset
because tokenizer and parser call each other.
Now template aliases are fully supported in log and formatter templates.
As I said before, aliases are not expanded in map files. This avoids possible
corruption of our stock styles and web templates. This behavior is undocumented
since no map file nor [templates] section are documented at all. Later on,
we might want to add [aliases] section to map files if it appears to be useful.
This patch introduces a new config flag ui.interface to select the interface
for interactive commands. It currently only applies to chunks selection.
The config can be overridden on a per feature basis with the flag
ui.interface.<feature>.
features for the moment can only be 'chunkselector', moving forward we expect
to have 'histedit' and other commands there.
If an incorrect value is given to ui.interface we print a warning and use the
default interface: text. If HGPLAIN is specified we also use the default
interface: text.
Note that we fail quickly if a feature does not handle all the interfaces
that we permit in ui.interface; in future, we could design a fallback path
(e.g. blackpearl to curses, curses to text), but let's leave that until we
need it.
Certificate pinning via [hostfingerprints] is a useful security
feature. Currently, we only support one fingerprint per hostname.
This is simple but it fails in the real world:
* Switching certificates breaks clients until they change the
pinned certificate fingerprint. This incurs client downtime
and can require massive amounts of coordination to perform
certificate changes.
* Some servers operate with multiple certificates on the same
hostname.
This patch adds support for defining multiple certificate
fingerprints per host. This overcomes the deficiencies listed
above. I anticipate the primary use case of this feature will
be to define both the old and new certificate so a certificate
transition can occur with minimal interruption, so this scenario
has been called out in the help documentation.
Adjust the examples (prefix and hostfingerprints) in help/config.txt
https://hg.intevation.de/ is now served with a new certificate that is signed
by a commercial CA, so all nearly all browsers will accept it automatically.
Listing both names, hg.intevation.de and hg.intevation.org, in the section
[hostfingerprints] allows using both without configuring web.cacerts and
without changing existing https URLs in the [paths] section.
The "hg purge" alias as currently described in "hgrc.5" only works, if
the caller's current working directory is identical to the repository's
root directory.
This patch slightly modifies the example by adding an empty pattern as a
file argument to the "hg status" command, thus forcing this command to
list the affected files relative to the current directory.
Before this patch, text blocks changed in this patch are shown as just
continuous text blocks like below in HTML format.
Global configuration like the username setting is typically put into:
%USERPROFILE%\mercurial.ini
$HOME/.hgrc
This patch itemizes these text blocks to increase readability in HTML
format.
Global configuration like the username setting is typically put into:
- %USERPROFILE%\mercurial.ini (on Windows)
- $HOME/.hgrc (on Unix, Plan9)
Like as other platform sensitive container-ed text blocks, this patch
also adds explicit "on PLATFORM" information to each items for
readability in HTML format, even though output of "hg help config" on
command line seems a little redundant. For example, on Unix:
Global configuration like the username setting is typically put into:
- "$HOME/.hgrc" (on Unix, Plan9)
We've discovered an issue with this flag during certain kinds of rebases. When:
(1) we're rebasing while currently on the destination commit, and
(2) an untracked or ignored file F is currently in the working copy, and
(3) the same file F is in a source commit, and
(4) F has different contents in the source commit,
then we'll try to merge the file rather than overwrite it.
An earlier patch I sent honored the options for these situations as well.
Unfortunately, rebases go through the same flow as the old, deprecated 'hg
merge --force'. We'd rather not make any changes to 'hg merge --force'
behavior, and there's no way from this point in the code to figure out whether
we're in 'hg rebase' or 'hg merge --force'.
Pierre-Yves David and I came up with the idea to split the 'force' flag up into
'force' for rebases, and 'forcemerge' for merge. Since this is a very
disruptive change and we're in freeze mode, simply undocument the options for
this release so that our hands aren't tied by BC concerns. We'll redocument
them in the next release.
This patch changes "progress.shouldprint()" so a feature name is provided to
"ui.plain()" to determine if there is an exception specificed in HGPLAINEXCEPT
for the progress extension.
This will allow user-facing scripts to provide progress output while HGPLAIN
is enabled.
For example, ":hg:`help config.default-push`" creates an invalid link
to "hgrc.5.html#default-push" in HTML, but ":hg:`help
config.paths.default-push`" creates a valid link to
"hgrc.5.html#paths".
These options were undocumented for 3.7 because of an issue found during the
freeze (see rev e5c26e4e6ec7). This issue has now been fixed, so we can
document these options again.
Closing files that have been appended to is relatively slow on
Windows/NTFS. This makes several Mercurial operations slower on
Windows.
The workaround to this issue is conceptually simple: use multiple
threads for I/O. Unfortunately, Python doesn't scale well to multiple
threads because of the GIL. And, refactoring our code to use threads
everywhere would be a huge undertaking. So, we decide to tackle this
problem by starting small: establishing a thread pool for closing
files.
This patch establishes a mechanism for closing file handles on separate
threads. The coordinator object is basically a queue of file handles to
operate on and a thread pool consuming from the queue.
When files are opened through the VFS layer, the caller can specify
that delay closing is allowed.
A proxy class for file handles has been added. We must use a proxy
because it isn't possible to modify __class__ on built-in types. This
adds some overhead. But as future patches will show, this overhead
is cancelled out by the benefit of closing file handles on background
threads.
Before this patch rebase would create divergence when you were rebasing obsolete
changesets on a destination not containing one of its successors.
This patch introduces rebase.allowdivergence to explicitly allow
divergence creation with rebase.
In some real-world cases it is preferable to allow overwriting ignored files
while continuing to abort on unknown files. This primarily happens when we're
replacing build artifacts (which are ignored) with checked in files, but
continuing to abort on differing files that aren't ignored.
We're redefining merge.checkunknown to only control the behavior for files
that aren't ignored. That's fine because this config was only very recently
introduced and has not made its way into any Mercurial releases yet.
Sometimes a txnclose or changegroup hook wants to iterate through all
the changesets in transaction: in that situation usually the revset
`$HG_NODE:` is used to select the revisions. Unfortunately this revset
sometimes may contain too many changesets because we don't have the
write lock while the hook runs newer changes may be added to
repository in the meantime.
That's why there is a need for extra variable carrying the information about
the last change in the transaction.
The clone bundles feature was introduced in Mercurial 3.6 behind an
experimental and disabled by default flag. The feature has been enabled
on hg.mozilla.org for a few months and has served many terabytes of
clones. Users have been encouraged to use the feature and reception
has been very positive (mainly due to faster clones as a result of
connecting to a CDN). I have heard no feedback about changing the
feature other than inquiries about when it will be enabled by default.
So, I think the feature is ready to be enabled by default.
This patch renames experimental.clonebundles to ui.clonebundles,
documents the option, and enables it by default. References to the
experimental state of clone bundles have been removed. The remaining
config option docs in clonebundles.py have been removed because they
are redudant with `hg help config`.
There are some oddities with behavior of clone bundles. Because clones
with clone bundles are effectively 2 `hg pull` operations, there may be
2 transactions. This could result in hooks running twice. If the
subsequent pull is aborted, it could result in partial rollback and an
incomplete clone. This behavior is a bit wonky and should probably
be documented. If this patch is accepted, I'll send a follow-up to
document it. I don't think this behavior should prevent the feature
being enabled by default. Reworking the clone mechanism to support
interrupted or multi-part clones feels like a major new feature and
something that when implemented can change the hook and rollback
semantics of clone bundles. Besides, partial clone is better than
full rollback and hooks running on initial clone are likely rare, so I
think the impact is minimal.
A 'colliding unknown file' is a file that meets all of the following
conditions:
- is untracked or ignored on disk
- is present in the changeset being merged or updated to
- has different contents
Previously, we would always abort whenever we saw such files. With this config
option we can choose to warn and back the unknown files up instead, or even
forgo the warning entirely and silently back the unknown files up.
Common use cases for this configuration include a large scale transition of
formerly ignored unknown files to tracked files. In some cases the files can be
given new names, but in other cases, external "convention over configuration"
constraints have determined that the file must retain the same name as before.
I recently implemented the server.bundle1* options to control whether
bundle1 exchange is allowed.
After thinking about Mozilla's strategy for handling generaldelta
rollout a bit more, I think server operators need an additional
lever: disable bundle1 if and only if the repo is generaldelta.
bundle1 exchange for non-generaldelta repos will not have the potential
for CPU explosion that generaldelta repos do. Therefore, it makes sense
for server operators to continue to allow bundle1 exchange for
non-generaldelta repos without having to set a per-repo hgrc option
to change the policy depending on whether the repo is generaldelta.
This patch introduces a new set of options to control bundle1 behavior
for generaldelta repos. These options enable server operators to limit
bundle1 restrictions to the class of repos that can be performance
issues. It also allows server operators to tie bundle1 access to store
format. In many server environments (including Mozilla's), legacy repos
will not be generaldelta and new repos will or might be. New repos often
aren't bound by legacy access requirements, so setting a global policy
that disallows access to new/generaldelta repos via bundle1 could be a
reasonable policy in many server environments. This patch makes this
policy very easy to implement (modify global hgrc, add options to
existing generaldelta repos to grandfather them in).
It seems like a good idea to document the revlog format.
There is a lot more that could be added to this documentation.
But you have to start somewhere.
The old messages implied that disabling a single setting would ensure
compatibility, but if you disabled one of the older flags, and
left a newer flag, that would not actually do what the docs say.
Bundle types and the high-level data format of each bundle isn't
documented anywhere. Let's document this as well.
Obviously there are many more details about bundles that could be
written about. But you have to start somewhere.
There is no formal location for spec-like technical/internal docs. The
repository makes sense as such a location because spec-like
documentation should be reviewed (ruling out a wiki). mpm has also
stated that he would like this documentation to be part of the
built-in help system. So, we establish an "internals" sub-directory
to hold this class of documentation.
The format of changegroups does not appear to be documented anywhere,
even in source code. It therefore seemed like an appropriate first thing
to document.
This patch adds low-level documentation of versions 1 and 2 of the
changegroup foromat. It currently only describes the raw data format.
There is probably room to write higher-level documentation on strategies
for producing and consuming the data. We'll leave that for another day.
The added file is not yet accessible via `hg help` nor via hgweb.
Support for this will follow in subsequent patches.
Power users often want to apply per-path configuration options. For
example, they may want to declare an alternate URL for push operations
or declare a revset of revisions to push when `hg push` is used
(as opposed to attempting to push all revisions by default).
This patch establishes the use of sub-options (config options with
":" in the name) to declare additional behavior for paths.
New sub-options are declared by using the new ``@ui.pathsuboption``
decorator. This decorator serves multiple purposes:
* Declaring which sub-options are registered
* Declaring how a sub-option maps to an attribute on ``path``
instances (this is needed to `hg paths` can render sub-options
and values properly)
* Validation and normalization of config options to attribute
values
* Allows extensions to declare new sub-options without monkeypatching
* Allows extensions to overwrite built-in behavior for sub-option
handling
As convenient as the new option registration decorator is, extensions
(and even core functionality) may still need an additional hook point
to perform finalization of path instances. For example, they may wish
to validate that multiple options/attributes aren't conflicting with
each other. This hook point could be added later, if needed.
To prove this new functionality works, we implement the "pushurl"
path sub-option. This option declares the URL that `hg push` should
use by default.
We require that "pushurl" is an actual URL. This requirement might be
controversial and could be dropped if there is opposition. However,
objectors should read the complicated code in ui.path.__init__ and
commands.push for resolving non-URL values before making a judgement.
We also don't allow #fragment in the URLs. I intend to introduce a
":pushrev" (or similar) option to define a revset to control which
revisions are pushed when "-r <rev>" isn't passed into `hg push`.
This is much more powerful than #fragment and I don't think #fragment
is useful enough to continue supporting.
The [paths] section of the "config" help page has been updated
significantly. `hg paths` has been taught to display path sub-options.
The docs mention that "default-push" is now deprecated. However, there
are several references to it that need to be cleaned up. A large part
of this is converting more consumers to the new paths API. This will
happen naturally as more path sub-options are added and more and more
components need to access them.
bundle2 is the new and preferred wire protocol format. For various
reasons, server operators may wish to force clients to use it.
One reason is performance. If a repository is stored in generaldelta,
the server must recompute deltas in order to produce the bundle1
changegroup. This can be extremely expensive. For mozilla-central,
bundle generation typically takes a few minutes. However, generating
a non-gd bundle from a generaldelta encoded mozilla-central requires
over 30 minutes of CPU! If a large repository like mozilla-central
were encoded in generaldelta and non-gd clients connected, they could
easily flood a server by cloning.
This patch gives server operators config knobs to control whether
bundle1 is allowed for push and pull operations. The default is to
support legacy bundle1 clients, making this patch backwards compatible.
New ui.graphnodetemplate option allows us to colorize a node symbol by phase
or branch,
[ui]
graphnodetemplate = {label('graphnode.{phase}', graphnode)}
[color]
graphnode.draft = yellow bold
or use a variety of unicode emoji characters, and so on. (You'll need less-481
to display non-BMP unicode character.)
[ui]
graphnodetemplate = {ifeq(obsolete, 'stable', graphnode, '\xf0\x9f\x92\xa9')}
Since we have pushed back the performance issue related to general delta behind
another configuration (Still off by default), we can safely create new
repository with general delta support. As client are compatible with it since
Mercurial 1.9 (4.5 years ago) I do no expect any significant compatibility
issues.
This option will make repositories created as general delta by default but will
not make Mercurial aggressively recompute deltas for all incoming bundle.
Instead, the delta contained in the bundle will be used. This will allow us to
start having general delta repositories created everywhere without triggering
massive recomputation costs for all new clients cloning from old servers.
Very often in my life I'm finding that the only configured merge tool
present on the system is vimdiff[0], and it's currently impossible (as
far as I can tell) short of specifying `ui.merge = `[1] to actually
*disable* a merge tool. This allows vimdiff-haters to put:
[merge-tools]
vimdiff.disable = yes
in their ~/.hgrc and never see vimdiff again. I'm stopping short of
putting this as a commented out entry in the sample new user hgrc
(seen when a user runs `hg config --edit` with no ~/.hgrc) for now,
but I might come back and do that later.
0: vimdiff is at an awkward intersection: it's usually installed by
the vim package which is often installed as a vi substitute, so it's
mere presence doesn't imply me wanting it, unlike (say) kdiff3.
1: There's a related problem I ran into today where specifying
`ui.merge = :merge` failed because :merge isn't a command, which I
think is a regression. I'll try and figure that out and at least file
a bug.
On windows, hgrc.d/*.rc would not be read if mercurial.ini was found. That was
far from obvious from the documentation and different from the behavior on
posix systems.
As a consequence of this, TortoiseHg cacert configuration placed in hgrc.d
would not be read if an old global mercurial.ini still existed.
"hg config -g" could also crash when no global configuration files could be
found.
Instead, make windows behave like posix and read all global configuration
files.
The documentation was in a way right that individual config settings in the
global Mercurial.ini would override settings from for example .hgrc.d\*.rc, but
only because the .d files not would be read at all if a Mercurial.ini was
found. The ordering in the documentation is thus changed to match the code.
Cloning can be an expensive operation for servers because the server
generates a bundle from existing repository data at request time. For
a large repository like mozilla-central, this consumes 4+ minutes
of CPU time on the server. It also results in significant network
utilization. Multiplied by hundreds or even thousands of clients and
the ensuing load can result in difficulties scaling the Mercurial server.
Despite generation of bundles being deterministic until the next
changeset is added, the generation of bundles to service a clone request
is not cached. Each clone thus performs redundant work. This is
wasteful.
This patch introduces the "clonebundles" extension and related
client-side functionality to help alleviate this deficiency. The
client-side feature is behind an experimental flag and is not enabled by
default.
It works as follows:
1) Server operator generates a bundle and makes it available on a
server (likely HTTP).
2) Server operator defines the URL of a bundle file in a
.hg/clonebundles.manifest file.
3) Client `hg clone`ing sees the server is advertising bundle URLs.
4) Client fetches and applies the advertised bundle.
5) Client performs equivalent of `hg pull` to fetch changes made since
the bundle was created.
Essentially, the server performs the expensive work of generating a
bundle once and all subsequent clones fetch a static file from
somewhere. Scaling static file serving is a much more manageable
problem than scaling a Python application like Mercurial. Assuming your
repository grows less than 1% per day, the end result is 99+% of CPU
and network load from clones is eliminated, allowing Mercurial servers
to scale more easily. Serving static files also means data can be
transferred to clients as fast as they can consume it, rather than as
fast as servers can generate it. This makes clones faster.
Mozilla has implemented similar functionality of this patch on
hg.mozilla.org using a custom extension. We are hosting bundle files in
Amazon S3 and CloudFront (a CDN) and have successfully offloaded
>1 TB/day in data transfer from hg.mozilla.org, freeing up significant
bandwidth and CPU resources. The positive impact has been stellar and
I believe it has proved its value to be included in Mercurial core. I
feel it is important for the client-side support to be enabled in core
by default because it means that clients will get faster, more reliable
clones and will enable server operators to reduce load without
requiring any client-side configuration changes (assuming clients are
up to date, of course).
The scope of this feature is narrowly and specifically tailored to
cloning, despite "serve pulls from pre-generated bundles" being a valid
and useful feature. I would eventually like for Mercurial servers to
support transferring *all* repository data via statically hosted files.
You could imagine a server that siphons all pushed data to bundle files
and instructs clients to apply a stream of bundles to reconstruct all
repository data. This feature, while useful and powerful, is
significantly more work to implement because it requires the server
component have awareness of discovery and a mapping of which changesets
are in which files. Full, clone bundles, by contrast, are much simpler.
The wire protocol command is named "clonebundles" instead of something
more generic like "staticbundles" to leave the door open for a new, more
powerful and more generic server-side component with minimal backwards
compatibility implications. The name "bundleclone" is used by Mozilla's
extension and would cause problems since there are subtle differences
in Mozilla's extension.
Mozilla's experience with this idea has taught us that some form of
"content negotiation" is required. Not all clients will support all
bundle formats or even URLs (advanced TLS requirements, etc). To ensure
the highest uptake possible, a server needs to advertise multiple
versions of bundles and clients need to be able to choose the most
appropriate from that list one. The "attributes" in each
server-advertised entry facilitate this filtering and sorting. Their
use will become apparent in subsequent patches.
Initial inspiration and credit for the idea of cloning from static files
belongs to Augie Fackler and his "lookaside clone" extension proof of
concept.
This allows the latest class of tag to be found, such as a release candidate or
final build, instead of just the absolute latest.
It doesn't appear that the existing keyword can be given an optional argument.
There is a keyword, function and filter for 'date', so it doesn't seem harmful
to introduce a new function with the same name as an existing keyword. Most
functions are pretty Mercurial agnostic, but there is {revset()} as precedent.
Even though templatekw.getlatesttags() returns a single tuple, one entry of
which is a list, it is simplest to present this as a list of tags instead of a
single item, with each tag having a distance and change count attribute. It is
also closer to how {latesttag} returns a list of tags, and how this function
works when not given a '%' operator.