We already support multiple primitive for listing files, which were
affected by the current changeset.
This patch adds files() which returns files of the current changeset
matching a given pattern or fileset query via the "set:" prefix.
There isn't a formal handshake protocol in the wire protocol. But
clients almost certainly need to perform particular actions before they
can communicate with a server optimally. So document what that is
so people understand what's going on at connection establishment time.
The Mercurial wire protocol is under-documented. This includes a lack
of source docstrings and comments as well as pages on the official
wiki.
This patch adds the beginnings of "internals" documentation on the
wire protocol.
The documentation should have nearly complete coverage on the
lower-level parts of the protocol, such as the different transport
mechanims, how commands and arguments are sent, capabilities, and,
of course, the commands themselves.
As part of writing this documentation, I discovered a number of
deficiencies in the protocol and bugs in the implementation. I've
started sending patches for some of the issues. I hope to send a lot
more.
This patch starts with the scaffolding for a new internals page.
And refactor dispatch.py to use it. As you can see, the resulting code
is much simpler.
I was tempted to inline _runcommand as part of writing this series.
However, a number of extensions wrap _runcommand. So keeping it around
is necessary (extensions can't easily wrap runcommand because it calls
hooks before and after command execution).
V2:
- move from shortest() with minlength 8 to minlength 4
- mention [templates] in config.txt
- better describe the difference between [templatealias] and [templates]
V3:
- choose a better example template
Python 2.7 supports specifying a custom cipher list to TLS sockets.
Advanced users may wish to specify a custom cipher list to increase
security. Or in some cases they may wish to prefer weaker ciphers
in order to increase performance (e.g. when doing stream clones
of very large repositories).
This patch introduces a [hostsecurity] config option for defining
the cipher list. The help documentation states that it is for
advanced users only.
Honestly, I'm a bit on the fence about providing this because
it is a footgun and can be used to decrease security. However,
there are legitimate use cases for it, so I think support should
be provided.
Currently, Mercurial will use TLS 1.0 or newer when connecting to
remote servers, selecting the highest TLS version supported by both
peers. On older Pythons, only TLS 1.0 is available. On newer Pythons,
TLS 1.1 and 1.2 should be available.
Security professionals recommend avoiding TLS 1.0 if possible.
PCI DSS 3.1 "strongly encourages" the use of TLS 1.2.
Known attacks like BEAST and POODLE exist against TLS 1.0 (although
mitigations are available and properly configured servers aren't
vulnerable).
I asked Eric Rescorla - Mozilla's resident crypto expert - whether
Mercurial should drop support for TLS 1.0. His response was
"if you can get away with it." Essentially, a number of servers on
the Internet don't support TLS 1.1+. This is why web browsers
continue to support TLS 1.0 despite desires from security experts.
This patch changes Mercurial's default behavior on modern Python
versions to require TLS 1.1+, thus avoiding known security issues
with TLS 1.0 and making Mercurial more secure by default. Rather
than drop TLS 1.0 support wholesale, we still allow TLS 1.0 to be
used if configured. This is a compromise solution - ideally we'd
disallow TLS 1.0. However, since we're not sure how many Mercurial
servers don't support TLS 1.1+ and we're not sure how much user
inconvenience this change will bring, I think it is prudent to ship
an escape hatch that still allows usage of TLS 1.0. In the default
case our users get better security. In the worst case, they are no
worse off than before this patch.
This patch has no effect when running on Python versions that don't
support TLS 1.1+.
As the added test shows, connecting to a server that doesn't
support TLS 1.1+ will display a warning message with a link to
our wiki, where we can guide people to configure their client to
allow less secure connections.
Currently, Mercurial will use TLS 1.0 or newer when connecting to
remote servers, selecting the highest TLS version supported by both
peers. On older Pythons, only TLS 1.0 is available. On newer Pythons,
TLS 1.1 and 1.2 should be available.
Security-minded people may want to not take any risks running
TLS 1.0 (or even TLS 1.1). This patch gives those people a config
option to explicitly control which TLS versions Mercurial should use.
By providing this option, one can require newer TLS versions
before they are formally deprecated by Mercurial/Python/OpenSSL/etc
and lower their security exposure. This option also provides an
easy mechanism to change protocol policies in Mercurial. If there
is a 0-day and TLS 1.0 is completely broken, we can act quickly
without changing much code.
Because setting the minimum TLS protocol is something you'll likely
want to do globally, this patch introduces a global config option under
[hostsecurity] for that purpose.
wrapserversocket() has been taught a hidden config option to define
the explicit protocol to use. This is queried in this function and
not passed as an argument because I don't want to expose this dangerous
option as part of the Python API. There is a risk someone could footgun
themselves. But the config option is a devel option, has a warning
comment, and I doubt most people are using `hg serve` to run a
production HTTPS server (I would have something not Mercurial/Python
handle TLS). If this is problematic, we can go back to using a
custom extension in tests to coerce the server into bad behavior.
hgweb currently offers limited functionality for "classifying"
repositories. This patch aims to change that.
The web.labels config option list is introduced. Its values
are exposed to the "index" and "summary" templates. Custom
templates can use template features like ifcontains() to e.g.
look for the presence of a specific label and engage specific
behavior. For example, a site operator may wish to assign a
"defunct" label to a repository so the repository is prominently
marked as dead in repository indexes.
This corrects a warning from lintian that we're shipping an executable without
a man page. Since there is a doc string in the text, let's use that for the man
page.
Now that we have a mechanism for declaring path sub-options, we can
start to pile on features!
Many power users have expressed frustration that bare `hg push`
attempts to push all local revisions to the remote. This patch
introduces the "pushrev" path sub-option to control which revisions
are pushed when no "-r" argument is specified.
The value of this sub-option is a revset, naturally.
A future feature addition could potentially introduce a "pushnames"
sub-options that declares the list of names (branches, bookmarks,
topics, etc) to push by default. The entire "what to push by default"
feature should probably be considered before this patch lands.
Recent work has introduced the [hostsecurity] config section for
defining per-host security settings. This patch builds on top
of this foundation and implements the ability to define a per-host
path to a file containing certificates used for verifying the server
certificate. It is logically a per-host web.cacerts setting.
This patch also introduces a warning when both per-host
certificates and fingerprints are defined. These are mutually
exclusive for host verification and I think the user should be
alerted when security settings are ambiguous because, well,
security is important.
Tests validating the new behavior have been added.
I decided against putting "ca" in the option name because a
non-CA certificate can be specified and used to validate the server
certificate (commonly this will be the exact public certificate
used by the server). It's worth noting that the underlying
Python API used is load_verify_locations(cafile=X) and it calls
into OpenSSL's SSL_CTX_load_verify_locations(). Even OpenSSL's
documentation seems to omit that the file can contain a non-CA
certificate if it matches the server's certificate exactly. I
thought a CA certificate was a special kind of x509 certificate.
Perhaps I'm wrong and any x509 certificate can be used as a
CA certificate [as far as OpenSSL is concerned]. In any case,
I thought it best to drop "ca" from the name because this reflects
reality.
smtp.verifycert was accidentally broken by 799db3fe9866. And,
I believe the "loose" value has been broken for longer than that.
The current code refuses to talk to a remote server unless the
CA is trusted or the fingerprint is validated. In other words,
we lost the ability for smtp.verifycert to lower/disable security.
There are special considerations for smtp.verifycert in
sslutil.validatesocket() (the "strict" argument). This violates
the direction sslutil is evolving towards, which has all security
options determined at wrapsocket() time and a unified code path and
configs for determining security options.
Since smtp.verifycert is broken and since we'll soon have new
security defaults and new mechanisms for controlling host security,
this patch formally deprecates smtp.verifycert. With this patch,
the socket security code in mail.py now effectively mirrors code
in url.py and other places we're doing socket security.
For the record, removing smtp.verifycert because it was accidentally
broken is a poor excuse to remove it. However, I would have done this
anyway because smtp.verifycert is a one-off likely used by few people
(users of the patchbomb extension) and I don't think the existence
of this seldom-used one-off in security code can be justified,
especially when you consider that better mechanisms are right around
the corner.
We introduce the [hostsecurity] config section. It holds per-host
security settings.
Currently, the section only contains a "fingerprints" option,
which behaves like [hostfingerprints] but supports specifying the
hashing algorithm.
There is still some follow-up work, such as changing some error
messages.
The post-* family of hooks will not run in case a command fails (i.e.
raises an exception). This makes it inconvenient to hook into events
such as doing something in case of a failed push.
We catch all exceptions to run the failure hook. I am not sure if this
is too aggressive, but tests apparently pass.
A pretty common pattern in templates is adding conditional separators
like so:
{node}{if(bookmarks, " {bookmarks}")}{if(tags, " {tags}")}
With this patch, the above can be simplified to:
{separate(" ", node, bookmarks, tags)}
The function is similar to the already existing join(), but with a few
differences:
* separate() skips empty arguments
* join() expects a single list argument, while separate() expects
each item as a separate argument
* separate() takes the separator first in order to allow a variable
number of arguments after it
Before this patch, when printing help text using `hg help`, or `hg log -h`,
the output will wrap at 78 chars even if the user has a bigger terminal width
and there is no config option to change it, making the experience different
from the commonly used `man` tool.
This patch introduces a new config option `ui.textwidth`, which replaces the
hardcoded number. It's set to 78 by default to maintain compatibility. When
set to 0, `hg help` will behave more like `man`.
Before this patch, the HTTP transport protocol would always zlib
compress certain responses (notably "getbundle" wire protocol commands)
at zlib compression level 6.
zlib can be a massive CPU resource sink for servers. Some server
operators may wish to reduce server-side CPU requirements while
requiring more bandwidth. This is common on corporate intranets, for
example. Others may wish to use more CPU but reduce bandwidth.
This patch introduces a config option to allow server operators
to control the zlib compression level.
On the "mozilla-unified" generaldelta repository, setting this
value to "0" (disable compression) results in server-side CPU
utilization for a `hg clone` going from ~180s to ~124s CPU time on
my i7-6700K. A level of "1" (which increases the transfer size from
~1,074 MB at level 6 to ~1,222 MB) utilizes ~132s CPU time.
This patch subtly changes the behavior of the parsing of "X.Y" values
to not set the "section" variable when rendering a known sub-topic.
Previously, "section" would be the same as the sub-topic name. This
required the sub-topic RST to have a section named the same as the
sub-topic name.
When I made this change, the descriptions from help.internalstable
started being rendered in command line output. This didn't look correct
to me, as it didn't match the formatting of main help pages. I
corrected this by moving the top section to help.internalstable and
changing the section levels of all the "internals" topics.
The end result is that "internals" topics now match the rendering of
main topics on both the CLI and HTML. And, "internals" topics no longer
require a main section matching the name of the topic.
Because parsing "$n" requires a crafted tokenizer, it exists only for backward
compatibility (as documented in revset._tokenizealias.) This patch updates the
examples so that users are encouraged to use symbolic names instead of "$n"s.
I'm going to implement alias expansion in templater, which won't support "$n"
parameters to make my life easier. Templater is more complicated than revset
because tokenizer and parser call each other.
Now template aliases are fully supported in log and formatter templates.
As I said before, aliases are not expanded in map files. This avoids possible
corruption of our stock styles and web templates. This behavior is undocumented
since no map file nor [templates] section are documented at all. Later on,
we might want to add [aliases] section to map files if it appears to be useful.
This patch introduces a new config flag ui.interface to select the interface
for interactive commands. It currently only applies to chunks selection.
The config can be overridden on a per feature basis with the flag
ui.interface.<feature>.
features for the moment can only be 'chunkselector', moving forward we expect
to have 'histedit' and other commands there.
If an incorrect value is given to ui.interface we print a warning and use the
default interface: text. If HGPLAIN is specified we also use the default
interface: text.
Note that we fail quickly if a feature does not handle all the interfaces
that we permit in ui.interface; in future, we could design a fallback path
(e.g. blackpearl to curses, curses to text), but let's leave that until we
need it.
Certificate pinning via [hostfingerprints] is a useful security
feature. Currently, we only support one fingerprint per hostname.
This is simple but it fails in the real world:
* Switching certificates breaks clients until they change the
pinned certificate fingerprint. This incurs client downtime
and can require massive amounts of coordination to perform
certificate changes.
* Some servers operate with multiple certificates on the same
hostname.
This patch adds support for defining multiple certificate
fingerprints per host. This overcomes the deficiencies listed
above. I anticipate the primary use case of this feature will
be to define both the old and new certificate so a certificate
transition can occur with minimal interruption, so this scenario
has been called out in the help documentation.
Adjust the examples (prefix and hostfingerprints) in help/config.txt
https://hg.intevation.de/ is now served with a new certificate that is signed
by a commercial CA, so all nearly all browsers will accept it automatically.
Listing both names, hg.intevation.de and hg.intevation.org, in the section
[hostfingerprints] allows using both without configuring web.cacerts and
without changing existing https URLs in the [paths] section.
The "hg purge" alias as currently described in "hgrc.5" only works, if
the caller's current working directory is identical to the repository's
root directory.
This patch slightly modifies the example by adding an empty pattern as a
file argument to the "hg status" command, thus forcing this command to
list the affected files relative to the current directory.
Before this patch, text blocks changed in this patch are shown as just
continuous text blocks like below in HTML format.
Global configuration like the username setting is typically put into:
%USERPROFILE%\mercurial.ini
$HOME/.hgrc
This patch itemizes these text blocks to increase readability in HTML
format.
Global configuration like the username setting is typically put into:
- %USERPROFILE%\mercurial.ini (on Windows)
- $HOME/.hgrc (on Unix, Plan9)
Like as other platform sensitive container-ed text blocks, this patch
also adds explicit "on PLATFORM" information to each items for
readability in HTML format, even though output of "hg help config" on
command line seems a little redundant. For example, on Unix:
Global configuration like the username setting is typically put into:
- "$HOME/.hgrc" (on Unix, Plan9)
We've discovered an issue with this flag during certain kinds of rebases. When:
(1) we're rebasing while currently on the destination commit, and
(2) an untracked or ignored file F is currently in the working copy, and
(3) the same file F is in a source commit, and
(4) F has different contents in the source commit,
then we'll try to merge the file rather than overwrite it.
An earlier patch I sent honored the options for these situations as well.
Unfortunately, rebases go through the same flow as the old, deprecated 'hg
merge --force'. We'd rather not make any changes to 'hg merge --force'
behavior, and there's no way from this point in the code to figure out whether
we're in 'hg rebase' or 'hg merge --force'.
Pierre-Yves David and I came up with the idea to split the 'force' flag up into
'force' for rebases, and 'forcemerge' for merge. Since this is a very
disruptive change and we're in freeze mode, simply undocument the options for
this release so that our hands aren't tied by BC concerns. We'll redocument
them in the next release.