Commit Graph

521 Commits

Author SHA1 Message Date
Gregory Szorc
a71e4d026c config: document profiling.show{min,max} 2017-06-15 11:04:46 -07:00
Gregory Szorc
2f6dff3311 help: document bundle specifications
I softly formalized the concept of a "bundle specification" a while
ago when I was working on clone bundles and stream clone bundles and
wanted a more robust way to define what exactly is in a bundle file.

The concept has existed for a while. Since it is part of the clone
bundles feature and exposed to the user via the "-t" argument to
`hg bundle`, it is something we need to support for the long haul.

After the 4.1 release, I heard a few people comment that they didn't
realize you could generate zstd bundles with `hg bundle`. I'm
partially to blame for not documenting it in bundle's docstring.

Additionally, I added a hacky, experimental feature for controlling
the compression level of bundles in 054e64c4d837. As the commit
message says, I went with a quick and dirty solution out of time
constraints. Furthermore, I wanted to eventually store this
configuration in the "bundlespec" so it could be made more flexible.

Given:

a) bundlespecs are here to stay
b) we don't have great documentation over what they are, despite being
   a user-facing feature
c) the list of available compression engines and their behavior isn't
   exposed
d) we need an extensible place to modify behavior of compression
   engines

I want to move forward with formalizing bundlespecs as a user-facing
feature. This commit does that by introducing a "bundlespec" help
page. Leaning on the just-added compression engine documentation
and API, the topic also conveniently lists available compression
engines and details about them. This makes features like zstd
bundle compression more discoverable. e.g. you can now
`hg help -k zstd` and it lists the "bundlespec" topic.
2017-04-01 13:42:06 -07:00
Pierre-Yves David
e3a8075346 hook: add hook name information to external hook
While we are here, we can also add the hook name information to external hook.
2017-03-31 11:53:56 +02:00
Pierre-Yves David
833b1335ba hook: provide hook type information to external hook
The python hooks have access to the hook type information. There is not reason
for external hook to not be aware of it too.

For the record my use case is to make sure a hook script is configured for the
right type.
2017-03-31 11:08:11 +02:00
Martin von Zweigbergk
653c126971 help: format `commands` heading correctly
The number of dashes under it needs to match exactly for it to be
rendered as a heading. Without this change, the dashes end up on the
same line as "commands", and "hg help config.commands" does not work.
2017-03-22 16:36:53 -07:00
Martin von Zweigbergk
68e5878038 status: support commands.status.relative config
When the config is set to true, status output becomes relative to the
working directory. This has bugged me since I started using hg and it
turns it is sillily simple to support it (unless I missed something,
of course).

We could also add a --relative flag, but I would personally always
want that on, and I haven't heard any use for having it sometimes on,
so this patch only lets you enable it via config.
2017-03-21 17:50:44 -07:00
Augie Fackler
f42196a5a4 merge with stable 2017-06-13 10:02:34 -04:00
Gregory Szorc
bc8582fc01 streamclone: consider secret changesets (BC) (issue5589)
Previously, a repo containing secret changesets would be served via
stream clone, transferring those secret changesets. While secret
changesets aren't meant to imply strong security (if you really
want to keep them secret, others shouldn't have read access to the
repo), we should at least make an effort to protect secret changesets
when possible.

After this commit, we no longer serve stream clones for repos
containing secret changesets by default. This is backwards
incompatible behavior. In case anyone is relying on the behavior,
we provide a config option to opt into the old behavior.

Note that this defense is only beneficial for remote repos
accessed via the wire protocol: if a client has access to the
files backing a repo, they can get to the raw data and see secret
revisions.
2017-06-09 10:41:13 -07:00
David Soria Parra
0e29dd10bc revset: lookup descendents for negative arguments to ancestor operator
Negative offsets to the `~` operator now search for descendents. The search is
aborted when a node has more than one child as we do not have a definition for
'nth child'. Optionally we can introduce such a notion and take the nth child
ordered by rev number.

The current revset language does provides a short operator for ancestor lookup
but not for descendents. This gives user a simple revset to move to the previous
changeset, e.g. `hg up '.~1'` but not to the 'next' changeset. With this change
userse can now use `.~-1` as a shortcut to move to the next changeset.
This fits better into allowing users to specify revisions via revsets and
avoiding the need for special `hg next` and `hg prev` operations.

The alternative to negative offsets is adding a new operator. We do not have
many operators in ascii left that do not require bash escaping (',', '_', and
'/' come to mind). If we decide that we should add a more convenient short
operator such as ('/', e.g. './1') we can later add it and allow ascendents
lookup via negative numbers.
2017-05-27 10:25:09 -07:00
Gregory Szorc
efbb740737 revlog: skeleton support for version 2 revlogs
There are a number of improvements we want to make to revlogs
that will require a new version - version 2. It is unclear what the
full set of improvements will be or when we'll be done with them.
What I do know is that the process will likely take longer than a
single release, will require input from various stakeholders to
evaluate changes, and will have many contentious debates and
bikeshedding.

It is unrealistic to develop revlog version 2 up front: there
are just too many uncertainties that we won't know until things
are implemented and experiments are run. Some changes will also
be invasive and prone to bit rot, so sitting on dozens of patches
is not practical.

This commit introduces skeleton support for version 2 revlogs in
a way that is flexible and not bound by backwards compatibility
concerns.

An experimental repo requirement for denoting revlog v2 has been
added. The requirement string has a sub-version component to it.
This will allow us to declare multiple requirements in the course
of developing revlog v2. Whenever we change the in-development
revlog v2 format, we can tweak the string, creating a new
requirement and locking out old clients. This will allow us to
make as many backwards incompatible changes and experiments to
revlog v2 as we want. In other words, we can land code and make
meaningful progress towards revlog v2 while still maintaining
extreme format flexibility up until the point we freeze the
format and remove the experimental labels.

To enable the new repo requirement, you must supply an experimental
and undocumented config option. But not just any boolean flag
will do: you need to explicitly use a value that no sane person
should ever type. This is an additional guard against enabling
revlog v2 on an installation it shouldn't be enabled on. The
specific scenario I'm trying to prevent is say a user with a
4.4 client with a frozen format enabling the option but then
downgrading to 4.3 and accidentally creating repos with an
outdated and unsupported repo format. Requiring a "challenge"
string should prevent this.

Because the format is not yet finalized and I don't want to take
any chances, revlog v2's version is currently 0xDEAD. I figure
squatting on a value we're likely never to use as an actual revlog
version to mean "internal testing only" is acceptable. And
"dead" is easily recognized as something meaningful.

There is a bunch of cleanup that is needed before work on revlog
v2 begins in earnest. I plan on doing that work once this patch
is accepted and we're comfortable with the idea of starting down
this path.
2017-05-19 20:29:11 -07:00
Matt Harbison
fd4c90025c help: update the color documentation for Windows 10 ANSI support
It looks like only the initial release of Windows 10 lacked support for this
functionality. [1][2]  Since that build is no longer supported, I didn't bother
getting very specific, to keep the help text less cluttered.

[1] https://github.com/symfony/symfony/issues/17499#issuecomment-243481052
[2] https://en.wikipedia.org/wiki/Windows_10_version_history
2017-05-22 22:32:59 -04:00
Augie Fackler
9a722d4382 merge with stable 2017-06-03 16:33:28 -04:00
Ryan McElroy
158bdd12b9 update: add flag to require update destination
In some mercurial workflows, the default destination for update does not
always work well and can lead to confusing behavior. With this flag enabled,
every update command will require passing an explicit destination, eliminating
this confusion.
2017-03-14 17:43:18 -07:00
Gregory Szorc
7d51a8278d revlog: remove some revlogNG terminology
RevlogNG is not such a good name when it is no longer the
newest revlog version. Since we'll soon have revlog version 2,
let's remove some references to it.
2017-05-19 20:14:31 -07:00
Gregory Szorc
304ebe38d7 help: clarify that colons are allowed in fingerprints values
This was suggested by Lars Rohwedder in issue5559.
2017-05-11 00:02:32 -07:00
Siddharth Agarwal
0c7305054f clone: add a server-side option to disable full getbundles (pull-based clones)
For large enough repositories, pull-based clones take too long, and an attempt
to use them indicates some sort of configuration or other issue or maybe an
outdated Mercurial. Add a config option to disable them.
2017-05-11 10:50:05 -07:00
Martin von Zweigbergk
09d53c160b merge with stable 2017-05-12 11:20:25 -07:00
Augie Fackler
ec67fe37fb merge with stable 2017-05-04 00:26:55 -04:00
Siddharth Agarwal
d28bc68b23 internals: document that "branches" is a legacy wire command
Modern clients use a different discovery mechanism.
2017-05-03 14:07:14 -07:00
Yuya Nishihara
cb8fe16f12 help: fix layout of pre-formatted text 2017-03-09 12:55:48 +09:00
Yuya Nishihara
99a5ad18ca help: fix example of revs() fileset 2017-03-09 11:01:03 +09:00
Augie Fackler
b3e554d062 internals: add some brief documentation about censor 2017-01-23 20:17:24 -05:00
Kim Alvefur
977cd43980 help: align description of 'base rev' with reality [issue5488]
The text about revlogs seems to be wrong about -1 being used to indicate
the start of a delta chain. Attempt to correct this.
2017-02-28 15:19:08 +01:00
Kyle Lippincott
2dc3df2798 help: fix internals.changegroups
Add information about tree manifests, copy edit the text and fix up a few
ambiguities.

The document also contains a few additional fixes from Siddharth Agarwal
<sid0@fb.com>, who used it to build a parser for changegroups in Rust.
2017-03-01 18:37:34 -08:00
Pierre-Yves David
5be5f101f6 fileset: add revs(revs, fileset) to evaluate set in working directory
Unlike other functions, "revs()" does not select files but switches the
evaluation context. This allow to match file with property in another revision
that the one currently evaluated.

This changeset is based on work from Yuya Nishihara.
2017-03-03 12:44:56 +01:00
Dan Villiom Podlaski Christiansen
66c3976319 share: add --relative flag to store a relative path to the source
Storing a relative path the source repository is useful when exporting
repositories over the network or when they're located on external
drives where the mountpoint isn't always fixed.

Currently, Mercurial interprets paths in `.hg/shared` relative to
$PWD. I suspect this is very much unintentional, and you have to
manually edit `.hg/shared` in order to trigger this behaviour.

However, on the off chance that someone might rely on it, I added a
new capability called 'relshared'. In addition, this makes earlier
versions of Mercurial fail with a graceful error.

I should note that I haven't tested this patch on Windows.
2017-02-13 14:05:24 +01:00
Pierre-Yves David
88774bdc76 help: use 'churn' instead of 'color' as an example extension
The 'color' extensions is now deprecated.
2017-02-21 22:53:38 +01:00
Pierre-Yves David
a29094b506 color: update main documentation
Now that the feature no longer lives in the extension, we document it in the
help of the core config. This include the new 'ui.color' option introduced in
the previous changesets.

As a result the color extensions can now be deprecated.

This is a documentation patch only; color is still disabled by default.
2017-02-21 20:04:55 +01:00
Augie Fackler
127b928398 pager: add a config knob to just globally turn off the pager 2017-02-07 17:13:25 -05:00
Augie Fackler
2e832efca3 pager: move most help to a new help topic and deprecate extension 2017-02-07 00:07:53 -05:00
Rodrigo Damazio Bovendorp
8c5ff96dc1 match: adding support for matching files inside a directory
This adds a new "rootfilesin" matcher type which matches files inside a
directory, but not any subdirectories (so it matches non-recursively).
This has the "root" prefix per foozy's plan for other matchers (rootglob,
rootpath, cwdre, etc.).
2017-02-13 15:39:29 -08:00
Rainer Woitok
ee47006291 doc: correct example concerning "hg purge" alias in man page "hgrc.5"
The "hg purge" alias as currently described in "hgrc.5" issues a possibly
confusing error message like

   rm: missing operand
   Try 'rm --help' for more information.

if no files are to be purged at all.

This patch slightly modifies the example by adding a "-f" option to the
"rm" command.
2017-02-17 11:08:36 +01:00
David Demelier
e7d1102f9f hg: allow usage of XDG_CONFIG_HOME/hg/hgrc
Modern applications must use the following paths to store configuration files:

  - $XDG_CONFIG_HOME/hg/hgrc,
  - $HOME/.config/hg/hgrc (if XDG_CONFIG_HOME is unset or not absolute).

For backward compatibility, ~/.hgrc is still created if no hgrc exist using hg
config --edit.

See https://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
2017-02-07 17:33:35 +01:00
FUJIWARA Katsunori
2afd920706 misc: update year in copyright lines
This patch also makes some expected output lines in tests glob-ed for
persistence of them.

BTW, files below aren't yet changed in 2017, but this patch also
updates copyright of them, because:

    - mercurial/help/hg.1.txt

      almost all of "man hg" output comes from online help of hg
      command, and is already changed in 2017

    - mercurial/help/hgignore.5.txt
    - mercurial/help/hgrc.5

      "copyright 2005-201X Matt Mackall" in them mentions about
      copyright of Mercurial itself
2017-02-12 02:23:33 +09:00
Martin von Zweigbergk
ad5f4ef8a6 revlog: give EXTSTORED flag value to narrowhg
Narrowhg has been using "1 << 14" as its revlog flag value for a long
time. We (Google) have many repos with that value in production
already. When the same value was reserved for EXTSTORED, it made those
repos invalid. Upgrading them will be a little painful. We should
clearly have reserved the value for narrowhg a long time ago. Since
the EXTSTORED flag is not yet in any release and Facebook also says
they have not started using it in production, so it should be okay to
change it. This patch gives the current value (1 << 14) back to
narrowhg and gives a new value (1 << 13) to EXTSTORED.
2017-01-17 11:25:02 -08:00
Martin von Zweigbergk
a445384510 help: don't let tools reflow revlog flags list
Before this change, the text about revlog flags was reflowed into a
single paragraph, which made it a bit hard to read. I don't even know
the rules around this, but adding a blank line before each flag seems
to prevent the reflowing.
2017-01-17 11:45:10 -08:00
Martin von Zweigbergk
0ecfe18db3 help: format revlog.txt more closely to result
The rendered text has spaces before each item in the list
2017-01-17 11:29:06 -08:00
Matt Harbison
d3bfb5a06a help: eliminate duplicate text for revset string patterns
There's no reason to duplicate this so many times, and it's likely an instance
will be missed if support for a new pattern is added and documented.  The
stringmatcher is mostly used by revsets, though it is also used for the 'tag'
related templates, and namespace filtering in the journal extension.  So maybe
there's a better place to document it.  `hg help patterns` seems inappropriate,
because that is all file pattern matching.

While here, indicate how to perform case insensitive regex searches.
2017-01-07 23:35:35 -05:00
Martin von Zweigbergk
43fb09f0d0 help: explain that revsets can be used where 1 or 2 revs are wanted
We did not seem to document that one can do things like "hg up :@"
where the last revision of the revset ":@".
2017-01-11 23:13:00 -08:00
Martin von Zweigbergk
1333ea0016 help: explain what the term "revset" means
We refer to revsets in a few places (e.g. in "hg help config"), but we
never explained what they are. Until now.
2017-01-11 22:46:07 -08:00
Martin von Zweigbergk
1840263a8c help: merge revsets.txt into revisions.txt
Selecting single and multiple revisions is closely related, so let's
put it in one place, so users can easily find it. We actually did not
even point to "hg help revsets" from "hg help revisions", but now that
they're on a single page, that won't be necessary.
2017-01-11 11:37:38 -08:00
Martin von Zweigbergk
01bab7fc55 help: use a single paragraph to describe full and abbreviated nodeids
The texts describing 40-digit strings and the abbreviated form are
closely related, so make it a single paragraph.
2017-01-11 11:28:54 -08:00
Gregory Szorc
9849c580fb hgweb: support Content Security Policy
Content-Security-Policy (CSP) is a web security feature that allows
servers to declare what loaded content is allowed to do. For example,
a policy can prevent loading of images, JavaScript, CSS, etc unless
the source of that content is whitelisted (by hostname, URI scheme,
hashes of content, etc). It's a nifty security feature that provides
extra mitigation against some attacks, notably XSS.

Mitigation against these attacks is important for Mercurial because
hgweb renders repository data, which is commonly untrusted. While we
make attempts to escape things, etc, there's the possibility that
malicious data could be injected into the site content. If this happens
today, the full power of the web browser is available to that
malicious content. A restrictive CSP policy (defined by the server
operator and sent in an HTTP header which is outside the control of
malicious content), could restrict browser capabilities and mitigate
security problems posed by malicious data.

CSP works by emitting an HTTP header declaring the policy that browsers
should apply. Ideally, this header would be emitted by a layer above
Mercurial (likely the HTTP server doing the WSGI "proxying"). This
works for some CSP policies, but not all.

For example, policies to allow inline JavaScript may require setting
a "nonce" attribute on <script>. This attribute value must be unique
and non-guessable. And, the value must be present in the HTTP header
and the HTML body. This means that coordinating the value between
Mercurial and another HTTP server could be difficult: it is much
easier to generate and emit the nonce in a central location.

This commit introduces support for emitting a
Content-Security-Policy header from hgweb. A config option defines
the header value. If present, the header is emitted. A special
"%nonce%" syntax in the value triggers generation of a nonce and
inclusion in <script> elements in templates. The inclusion of a
nonce does not occur unless "%nonce%" is present. This makes this
commit completely backwards compatible and the feature opt-in.

The nonce is a type 4 UUID, which is the flavor that is randomly
generated. It has 122 random bits, which should be plenty to satisfy
the guarantees of a nonce.
2017-01-10 23:37:08 -08:00
Gregory Szorc
f71c86b7e9 protocol: send application/mercurial-0.2 responses to capable clients
With this commit, the HTTP transport now parses the X-HgProto-<N>
header to determine what media type and compression engine to use for
responses. So far, we only compress responses that are already being
compressed with zlib today (stream response types to specific
commands). We can expand things to cover additional response types
later.

The practical side-effect of this commit is that non-zlib compression
engines will be used if both ends support them. This means if both
ends have zstd support, zstd - not zlib - will be used to compress
data!

When cloning the mozilla-unified repository between a local HTTP
server and client, the benefits of non-zlib compression are quite
noticeable:

  engine     server CPU (s)   client CPU (s)    bundle size
zlib (l=6)      174.1            283.2         1,148,547,026
zstd (l=1)       99.2            267.3         1,127,513,841
zstd (l=3)      103.1            266.9         1,018,861,363
zstd (l=7)      128.3            269.7           919,190,278
zstd (l=10)     162.0               -            894,547,179
none             95.3            277.2         4,097,566,064

The default zstd compression level is 3. So if you deploy zstd
capable Mercurial to your clients and servers and CPU time on
your server is dominated by "getbundle" requests (clients cloning
and pulling) - and my experience at Mozilla tells me this is often
the case - this commit could drastically reduce your server-side
CPU usage *and* save on bandwidth costs!

Another benefit of this change is that server operators can install
*any* compression engine. While it isn't enabled by default, the
"none" compression engine can now be used to disable wire protocol
compression completely. Previously, commands like "getbundle" always
zlib compressed output, adding considerable overhead to generating
responses. If you are on a high speed network and your server is under
high load, it might be advantageous to trade bandwidth for CPU.
Although, zstd at level 1 doesn't use that much CPU, so I'm not
convinced that disabling compression wholesale is worthwhile. And, my
data seems to indicate a slow down on the client without compression.
I suspect this is due to a lack of buffering resulting in an increase
in socket read() calls and/or the fact we're transferring an extra 3 GB
of data (parsing HTTP chunked transfer and processing extra TCP packets
can add up). This is definitely worth investigating and optimizing. But
since the "none" compressor isn't enabled by default, I'm inclined to
punt on this issue.

This commit introduces tons of tests. Some of these should arguably
have been implemented on previous commits. But it was difficult to
test without the server functionality in place.
2016-12-24 15:29:32 -07:00
Gregory Szorc
a95fb0b61b wireproto: advertise supported media types and compression formats
This commit introduces support for advertising a server's support for
media types and compression formats in accordance with the spec defined
in internals.wireproto.

The bulk of the new code is a helper function in wireproto.py to
obtain a prioritized list of compression engines available to the
wire protocol. While not utilized yet, we implement support
for obtaining the list of compression engines advertised by the
client.

The upcoming HTTP protocol enhancements are a bit lower-level than
existing tests (most existing tests are command centric). So,
this commit establishes a new test file that will be appropriate
for holding tests around the functionality of the HTTP protocol
itself.

Rounding out this change, `hg debuginstall` now prints compression
engines available to the server.
2016-12-24 15:21:46 -07:00
Gregory Szorc
73aa240fe1 util: declare wire protocol support of compression engines
This patch implements a new compression engine API allowing
compression engines to declare support for the wire protocol.

Support is declared by returning a compression format string
identifier that will be added to payloads to signal the compression
type of data that follows and default integer priorities of the
engine.

Accessor methods have been added to the compression engine manager
class to facilitate use.

Note that the "none" and "bz2" engines declare wire protocol support
but aren't enabled by default due to their priorities being 0. It
is essentially free from a coding perspective to support these
compression formats, so we do it in case anyone may derive use from
it.
2016-12-24 13:51:12 -07:00
Gregory Szorc
0d089ecdf9 internals: document compression negotiation
As part of adding zstd support to all of the things, we'll need
to teach the wire protocol to support non-zlib compression formats.

This commit documents how we'll implement that.

To understand how we arrived at this proposal, let's look at how
things are done today.

The wire protocol today doesn't have a unified format. Instead,
there is a limited facility for differentiating replies as successful
or not. And, each command essentially defines its own response format.

A significant deficiency in the current protocol is the lack of
payload framing over the SSH transport. In the HTTP transport,
chunked transfer is used and the end of an HTTP response body (and
the end of a Mercurial command response) can be identified by a 0
length chunk. This is how HTTP chunked transfer works. But in the
SSH transport, there is no such framing, at least for certain
responses (notably the response to "getbundle" requests). Clients
can't simply read until end of stream because the socket is
persistent and reused for multiple requests. Clients need to know
when they've encountered the end of a request but there is nothing
simple for them to key off of to detect this. So what happens is
the client must decode the payload (as opposed to being dumb and
forwarding frames/packets). This means the payload itself needs
to support identifying end of stream. In some cases (bundle2), it
also means the payload can encode "error" or "interrupt" events
telling the client to e.g. abort processing. The lack of framing
on the SSH transport and the transfer of its responsibilities to
e.g. bundle2 is a massive layering violation and a wart on the
protocol architecture. It needs to be fixed someday by inventing a
proper framing protocol.

So about compression.

The client transport abstractions have a "_callcompressable()"
API. This API is called to invoke a remote command that will
send a compressible response. The response is essentially a
"streaming" response (no framing data at the Mercurial layer)
that is fed into a decompressor.

On the HTTP transport, the decompressor is zlib and only zlib.
There is currently no mechanism for the client to specify an
alternate compression format. And, clients don't advertise what
compression formats they support or ask the server to send a
specific compression format. Instead, it is assumed that non-error
responses to "compressible" commands are zlib compressed.

On the SSH transport, there is no compression at the Mercurial
protocol layer. Instead, compression must be handled by SSH
itself (e.g. `ssh -C`) or within the payload data (e.g. bundle
compression).

For the HTTP transport, adding new compression formats is pretty
straightforward. Once you know what decompressor to use, you can
stream data into the decompressor until you reach a 0 size HTTP
chunk, at which point you are at end of stream.

So our wire protocol changes for the HTTP transport are pretty
straightforward: the client and server advertise what compression
formats they support and an appropriate compression format is
chosen. We introduce a new HTTP media type to hold compressed
payloads. The header of the payload defines the compression format
being used. Whoever is on the receiving end can sniff the first few
bytes route to an appropriate decompressor.

Support for multiple compression formats is advertised on both
server and client. The server advertises a "compression" capability
saying which compression formats it supports and in what order they
are preferred. Clients advertise their support for multiple
compression formats and media types via the introduced "X-HgProto"
request header.

Strictly speaking, servers don't need to advertise which compression
formats they support. But doing so allows clients to fail fast if
they don't support any of the formats the server does. This is useful
in situations like sending bundles, where the client may have to
perform expensive computation before sending data to the server.

Rather than simply advertise a list of supported compression formats,
we introduce an additional "httpmediatype" server capability
advertising which media types the server supports. This means servers
are explicit about what formats they exchange. IMO, this is superior
to inferring support from other capabilities (like "compression").

By advertising compression support on each request in the "X-HgProto"
header and media type and direction at the server level, we are able
to gradually transition existing commands/responses to the new media
type and possibly compression. Contrast with the old world, where we
only supported a single media type and the use of compression was
built-in to the semantics of the command on both client and server.
In the new world, if "application/mercurial-0.2" is supported,
compression is supported. It's that simple.

It's worth noting that we explicitly don't use "Accept,"
"Accept-Encoding," "Content-Encoding," or "Transfer-Encoding" for
content negotiation and compression. People knowledgeable of the HTTP
specifications will say that we should use these because that's
what they are designed to be used for. They have a point and I
sympathize with the argument. Earlier versions of this commit even
defined supported media types in the "Accept" header. However, my
years of experience rolling out services leveraging HTTP has taught
me to not trust the HTTP layer, especially if you are going outside
the normal spec (such as using a custom "Content-Encoding" value to
represent zstd streams). I've seen load balancers, proxies, and other
network devices do very bad and unexpected things to HTTP messages
(like insisting zlib compressed content is decoded and then re-encoded
at a different compression level or even stripping compression
completely). I've found that the best way to avoid surprises when
writing protocols on top of HTTP is to use HTTP as a dumb transport as
much as possible to minimize the chances that an "intelligent" agent
between endpoints will muck with your data. While the widespread use of
TLS is mitigating many intermediate network agents interfering with
HTTP, there are still problems at the edges, with e.g. the origin HTTP
server needing to convert HTTP to and from WSGI and buggy or
feature-lacking HTTP client implementations. I've found the best way to
avoid these problems is to avoid using headers like "Content-Encoding"
and to bake as much logic as possible into media types and HTTP message
bodies. The protocol changes in this commit do rely on a custom HTTP
request header and the "Content-Type" headers. But we used them before,
so we shouldn't be increasing our exposure to "bad" HTTP agents.

For the SSH transport, we can't easily implement content negotiation
to determine compression formats because the SSH transport has no
content negotiation capabilities today. And without a framing protocol,
we don't know how much data to feed into a decompressor. So in order
to implement compression support on the SSH transport, we'd need to
invent a mechanism to represent content types and an outer framing
protocol to stream data robustly. While I'm fully capable of doing
that, it is a lot of work and not something that should be undertaken
lightly. My opinion is that if we're going to change the SSH transport
protocol, we should take a long hard look at implementing a grand
unified protocol that attempts to address all the deficiencies with
the existing protocol. While I want this to happen, that would be
massive scope bloat standing in the way of zstd support. So, I've
decided to take the easy solution: the SSH transport will not gain
support for multiple compression formats. Keep in mind it doesn't
support *any* compression today. So essentially nothing is changing
on the SSH front.
2016-12-24 13:56:36 -07:00
Remi Chaintron
66071d6de5 revlog: REVIDX_EXTSTORED flag
This flag will be used by the lfs extension to mark the revision data as stored
externally.
2017-01-05 17:16:51 +00:00
Matt Harbison
1e958d800e help: merge the various operator sections of revsets, filesets and templates
Having sections for specific operator types assumes the user already knows what
type of operators are supported.  By having a common heading, the user can
simply lookup help for "(revsets|filesets|templates).operators".
2017-01-08 12:05:10 -05:00
Matt Harbison
28a570f58c help: apply the section headings from revsets to templates
Unlike filesets, there are a few distinct headings that are not shared with
revsets.  But common names are used where possible.
2017-01-08 02:43:01 -05:00
Matt Harbison
0b96ef6f6b help: apply the section headings from revsets to filesets
This has the nice property of visually breaking up the wall of text.  It also
allows specific smaller sections to be called out.  For example,
`hg help filesets.predicates` now prints just the predicate section.  At the
moment, the revset headings are a superset of the fileset headings, so there is
consistency in how example, predicate and operator help is called out.

The reference to `hg help patterns` was moved to the overview section, so that
it isn't stuck in the examples section.
2017-01-08 02:40:36 -05:00
Sean Farley
7f456ac7c6 config: add docs for ignoring all text below in the editor
This is an example of how to use the new skip-from-there string for ignoring the
diff in a commit message.
2017-01-04 22:32:42 -06:00
Remi Chaintron
0c2f48e6cc documentation: better censor flag documentation 2016-12-22 19:08:38 -05:00
Martin von Zweigbergk
8f2ed099c1 help: make multirevs just an alias for revsets
The multirevs topis seems to be covered well by the revsets topic, so
just make it an alias and remove multirevs.txt.
2016-12-16 09:48:14 -08:00
Remi Chaintron
25e30cf1ed censor: flag internal documentation 2016-11-23 17:36:35 +00:00
Gregory Szorc
247c663431 help: clarify contents of revlog index
The previous wording indicated that field at index 3 was the
size of the decompressed chunk, not the size of the full
revision text.
2016-11-22 18:13:02 -08:00
Gregory Szorc
93a170991f help: fix double word usage
"most" was used twice.

(I fixed a grammar error before timeless spotted it!)
2016-11-09 16:04:44 -08:00
Gregory Szorc
d1e86f5ae3 profiling: make statprof the default profiler (BC)
The statprof sampling profiler runs with significantly less overhead.
Its data is therefore more useful. Furthermore, its default output
shows the hotpath by default, which I've found to be way more useful
than the default profiler's function time table.

There is one behavioral regression with this change worth noting:
the statprof profiler currently doesn't profile individual hgweb
requests like lsprof does. This is because the current implementation
of statprof only profiles the thread that started profiling.

The ability for lsprof to profile individual hgweb requests is
relatively new and likely not widely used. Furthermore, I have plans
to modify statprof to support profiling multiple threads. I expect
that change to go through several iterations. I'm submitting this
patch first so there is more time to test statprof. Perfect is the
enemy of good.
2016-11-04 21:44:25 -07:00
Gregory Szorc
c60af834ba profiling: use vendored statprof and upstream enhancements (BC)
Now that the statprof module is vendored and suitable for use, we
switch our statprof profiler to use it. This required some minor
changes because of drift between the official statprof profiler
and the vendored copy.

We also incorporate Facebook's improvements from the "statprofext"
extension at
https://bitbucket.org/facebook/hg-experimental, notably support for
different display formats.

Because statprof output is different, this is marked as BC. Although
most users likely won't notice since most users don't profile.
2016-11-04 20:50:38 -07:00
FUJIWARA Katsunori
38ad72f729 help: replace selenic.com by mercurial-scm.org in man pages
Source code repository and mailing list services have been already
migrated to mercurial-scm.org domain.
2016-11-01 20:39:36 +09:00
Simon Farnsworth
d5bf3ea399 templater: provide arithmetic operations on integers
The termwidth template keyword is of limited use without some way to ensure
that margins are respected.

Provide a full set of arithmetic operators (four basic operations plus the
mod function, defined to match Python's // for division), so that you can
create termwidth based layouts that match the user's terminal size
2016-10-09 05:51:04 -07:00
Hannes Oldenburg
d853961750 templates: add built-in files() function
We already support multiple primitive for listing files, which were
affected by the current changeset.
This patch adds files() which returns files of the current changeset
matching a given pattern or fileset query via the "set:" prefix.
2016-09-23 08:15:05 +00:00
timeless
73431dfebb help: add sections for revsets 2016-09-21 17:05:27 -04:00
timeless
de3d5e78b4 help: move revsets.## documentation into infix section 2016-09-21 17:23:05 +00:00
Gregory Szorc
57934ad4c1 help: document wire protocol commands 2016-08-22 19:50:21 -07:00
Gregory Szorc
a167c55bc7 help: document wire protocol "handshake" protocol
There isn't a formal handshake protocol in the wire protocol. But
clients almost certainly need to perform particular actions before they
can communicate with a server optimally. So document what that is
so people understand what's going on at connection establishment time.
2016-08-22 19:49:59 -07:00
Gregory Szorc
378142967e help: document wire protocol capabilities
All capabilities from the history of the project are now documented.
2016-08-22 19:48:31 -07:00
Gregory Szorc
b21abff69d help: document wire protocol transport protocols
The HTTP and SSH transport protocols are documented. This
includes how commands and arguments are serialized as well as
response types.
2016-08-22 19:47:34 -07:00
Gregory Szorc
5c136c2209 help: internals topic for wire protocol
The Mercurial wire protocol is under-documented. This includes a lack
of source docstrings and comments as well as pages on the official
wiki.

This patch adds the beginnings of "internals" documentation on the
wire protocol.

The documentation should have nearly complete coverage on the
lower-level parts of the protocol, such as the different transport
mechanims, how commands and arguments are sent, capabilities, and,
of course, the commands themselves.

As part of writing this documentation, I discovered a number of
deficiencies in the protocol and bugs in the implementation. I've
started sending patches for some of the issues. I hope to send a lot
more.

This patch starts with the scaffolding for a new internals page.
2016-08-22 19:46:39 -07:00
Gregory Szorc
9ac5776ef7 profiling: add a context manager that no-ops if profiling isn't enabled
And refactor dispatch.py to use it. As you can see, the resulting code
is much simpler.

I was tempted to inline _runcommand as part of writing this series.
However, a number of extensions wrap _runcommand. So keeping it around
is necessary (extensions can't easily wrap runcommand because it calls
hooks before and after command execution).
2016-08-14 17:51:12 -07:00
Augie Fackler
cb268cbd2f merge with stable 2016-08-15 12:26:02 -04:00
Mathias De Maré
e736d31e70 help: add example of '[templates]' usage
V2:
- move from shortest() with minlength 8 to minlength 4
- mention [templates] in config.txt
- better describe the difference between [templatealias] and [templates]

V3:
- choose a better example template
2016-08-08 16:47:42 +02:00
Anton Shestakov
9573c5450e help: update link to wiki/CommandServer 2016-08-04 10:42:03 +08:00
FUJIWARA Katsunori
7276a9d11f doc: make previous line of certificate example end with "::"
Before this patch, certificate example is formatted just as normal
text.
2016-08-01 06:08:27 +09:00
FUJIWARA Katsunori
9e475c7395 doc: fix incorrect use of rst hg role in help text 2016-08-01 06:08:27 +09:00
Gregory Szorc
a7364efe16 sslutil: support defining cipher list
Python 2.7 supports specifying a custom cipher list to TLS sockets.
Advanced users may wish to specify a custom cipher list to increase
security. Or in some cases they may wish to prefer weaker ciphers
in order to increase performance (e.g. when doing stream clones
of very large repositories).

This patch introduces a [hostsecurity] config option for defining
the cipher list. The help documentation states that it is for
advanced users only.

Honestly, I'm a bit on the fence about providing this because
it is a footgun and can be used to decrease security. However,
there are legitimate use cases for it, so I think support should
be provided.
2016-07-17 10:59:32 -07:00
Gregory Szorc
91047cde9d sslutil: require TLS 1.1+ when supported
Currently, Mercurial will use TLS 1.0 or newer when connecting to
remote servers, selecting the highest TLS version supported by both
peers. On older Pythons, only TLS 1.0 is available. On newer Pythons,
TLS 1.1 and 1.2 should be available.

Security professionals recommend avoiding TLS 1.0 if possible.
PCI DSS 3.1 "strongly encourages" the use of TLS 1.2.

Known attacks like BEAST and POODLE exist against TLS 1.0 (although
mitigations are available and properly configured servers aren't
vulnerable).

I asked Eric Rescorla - Mozilla's resident crypto expert - whether
Mercurial should drop support for TLS 1.0. His response was
"if you can get away with it." Essentially, a number of servers on
the Internet don't support TLS 1.1+. This is why web browsers
continue to support TLS 1.0 despite desires from security experts.

This patch changes Mercurial's default behavior on modern Python
versions to require TLS 1.1+, thus avoiding known security issues
with TLS 1.0 and making Mercurial more secure by default. Rather
than drop TLS 1.0 support wholesale, we still allow TLS 1.0 to be
used if configured. This is a compromise solution - ideally we'd
disallow TLS 1.0. However, since we're not sure how many Mercurial
servers don't support TLS 1.1+ and we're not sure how much user
inconvenience this change will bring, I think it is prudent to ship
an escape hatch that still allows usage of TLS 1.0. In the default
case our users get better security. In the worst case, they are no
worse off than before this patch.

This patch has no effect when running on Python versions that don't
support TLS 1.1+.

As the added test shows, connecting to a server that doesn't
support TLS 1.1+ will display a warning message with a link to
our wiki, where we can guide people to configure their client to
allow less secure connections.
2016-07-13 21:35:54 -07:00
Gregory Szorc
0ed5c4ca7f sslutil: config option to specify TLS protocol version
Currently, Mercurial will use TLS 1.0 or newer when connecting to
remote servers, selecting the highest TLS version supported by both
peers. On older Pythons, only TLS 1.0 is available. On newer Pythons,
TLS 1.1 and 1.2 should be available.

Security-minded people may want to not take any risks running
TLS 1.0 (or even TLS 1.1). This patch gives those people a config
option to explicitly control which TLS versions Mercurial should use.
By providing this option, one can require newer TLS versions
before they are formally deprecated by Mercurial/Python/OpenSSL/etc
and lower their security exposure. This option also provides an
easy mechanism to change protocol policies in Mercurial. If there
is a 0-day and TLS 1.0 is completely broken, we can act quickly
without changing much code.

Because setting the minimum TLS protocol is something you'll likely
want to do globally, this patch introduces a global config option under
[hostsecurity] for that purpose.

wrapserversocket() has been taught a hidden config option to define
the explicit protocol to use. This is queried in this function and
not passed as an argument because I don't want to expose this dangerous
option as part of the Python API. There is a risk someone could footgun
themselves. But the config option is a devel option, has a warning
comment, and I doubt most people are using `hg serve` to run a
production HTTPS server (I would have something not Mercurial/Python
handle TLS). If this is problematic, we can go back to using a
custom extension in tests to coerce the server into bad behavior.
2016-07-14 20:47:22 -07:00
Gregory Szorc
5255c3f24b hgweb: expose list of per-repo labels to templates
hgweb currently offers limited functionality for "classifying"
repositories. This patch aims to change that.

The web.labels config option list is introduced. Its values
are exposed to the "index" and "summary" templates. Custom
templates can use template features like ifcontains() to e.g.
look for the presence of a specific label and engage specific
behavior. For example, a site operator may wish to assign a
"defunct" label to a repository so the repository is prominently
marked as dead in repository indexes.
2016-06-30 18:59:53 -07:00
Matt Mackall
c0d551e8ec merge with stable 2016-07-01 16:02:56 -05:00
Mike Miller
90a873dfc0 help: document that [subpaths] may rewrite relative paths
The subpaths substitution logic first attempts to match the absolute
repository path, then the relative subrepository path if that failed.
2016-06-16 09:15:12 -07:00
Matt Harbison
8d72832828 help: fix the display for hg help internals.revlogs (issue5227)
It previously aborted saying the help section wasn't found.  Credit to Yuya for
figuring out the fix.
2016-05-08 22:28:09 -04:00
Sean Farley
dc865d7f73 hg-ssh: copy doc string to man page
This corrects a warning from lintian that we're shipping an executable without
a man page. Since there is a doc string in the text, let's use that for the man
page.
2016-05-06 23:03:41 -07:00
Sean Farley
bf3cdfa44b revsets: add docs for '%' operator 2016-04-27 14:02:18 -07:00
Gregory Szorc
9c6bc630a3 ui: path option to declare which revisions to push by default
Now that we have a mechanism for declaring path sub-options, we can
start to pile on features!

Many power users have expressed frustration that bare `hg push`
attempts to push all local revisions to the remote. This patch
introduces the "pushrev" path sub-option to control which revisions
are pushed when no "-r" argument is specified.

The value of this sub-option is a revset, naturally.

A future feature addition could potentially introduce a "pushnames"
sub-options that declares the list of names (branches, bookmarks,
topics, etc) to push by default. The entire "what to push by default"
feature should probably be considered before this patch lands.
2016-06-26 07:59:02 -07:00
Gregory Szorc
7a668208b8 sslutil: per-host config option to define certificates
Recent work has introduced the [hostsecurity] config section for
defining per-host security settings. This patch builds on top
of this foundation and implements the ability to define a per-host
path to a file containing certificates used for verifying the server
certificate. It is logically a per-host web.cacerts setting.

This patch also introduces a warning when both per-host
certificates and fingerprints are defined. These are mutually
exclusive for host verification and I think the user should be
alerted when security settings are ambiguous because, well,
security is important.

Tests validating the new behavior have been added.

I decided against putting "ca" in the option name because a
non-CA certificate can be specified and used to validate the server
certificate (commonly this will be the exact public certificate
used by the server). It's worth noting that the underlying
Python API used is load_verify_locations(cafile=X) and it calls
into OpenSSL's SSL_CTX_load_verify_locations(). Even OpenSSL's
documentation seems to omit that the file can contain a non-CA
certificate if it matches the server's certificate exactly. I
thought a CA certificate was a special kind of x509 certificate.
Perhaps I'm wrong and any x509 certificate can be used as a
CA certificate [as far as OpenSSL is concerned]. In any case,
I thought it best to drop "ca" from the name because this reflects
reality.
2016-06-07 20:29:54 -07:00
Gregory Szorc
35166670e2 mail: unsupport smtp.verifycert (BC)
smtp.verifycert was accidentally broken by 799db3fe9866. And,
I believe the "loose" value has been broken for longer than that.
The current code refuses to talk to a remote server unless the
CA is trusted or the fingerprint is validated. In other words,
we lost the ability for smtp.verifycert to lower/disable security.

There are special considerations for smtp.verifycert in
sslutil.validatesocket() (the "strict" argument). This violates
the direction sslutil is evolving towards, which has all security
options determined at wrapsocket() time and a unified code path and
configs for determining security options.

Since smtp.verifycert is broken and since we'll soon have new
security defaults and new mechanisms for controlling host security,
this patch formally deprecates smtp.verifycert. With this patch,
the socket security code in mail.py now effectively mirrors code
in url.py and other places we're doing socket security.

For the record, removing smtp.verifycert because it was accidentally
broken is a poor excuse to remove it. However, I would have done this
anyway because smtp.verifycert is a one-off likely used by few people
(users of the patchbomb extension) and I don't think the existence
of this seldom-used one-off in security code can be justified,
especially when you consider that better mechanisms are right around
the corner.
2016-06-04 11:13:28 -07:00
Gregory Szorc
95fda4d981 sslutil: allow fingerprints to be specified in [hostsecurity]
We introduce the [hostsecurity] config section. It holds per-host
security settings.

Currently, the section only contains a "fingerprints" option,
which behaves like [hostfingerprints] but supports specifying the
hashing algorithm.

There is still some follow-up work, such as changing some error
messages.
2016-05-28 12:37:36 -07:00
Matt Mackall
a24591e84c merge with stable 2016-05-17 11:28:46 -05:00
Jordi Gutiérrez Hermoso
dc2068ce42 dispatch: add fail-* family of hooks
The post-* family of hooks will not run in case a command fails (i.e.
raises an exception). This makes it inconvenient to hook into events
such as doing something in case of a failed push.

We catch all exceptions to run the failure hook. I am not sure if this
is too aggressive, but tests apparently pass.
2016-04-28 10:37:47 -04:00
Martin von Zweigbergk
f553986d8e templater: add separate() template function
A pretty common pattern in templates is adding conditional separators
like so:

  {node}{if(bookmarks, " {bookmarks}")}{if(tags, " {tags}")}

With this patch, the above can be simplified to:

  {separate(" ", node, bookmarks, tags)}

The function is similar to the already existing join(), but with a few
differences:

 * separate() skips empty arguments

 * join() expects a single list argument, while separate() expects
   each item as a separate argument

 * separate() takes the separator first in order to allow a variable
   number of arguments after it
2016-05-03 09:49:54 -07:00
Jun Wu
de167181c6 ui: add new config option for help text width
Before this patch, when printing help text using `hg help`, or `hg log -h`,
the output will wrap at 78 chars even if the user has a bigger terminal width
and there is no config option to change it, making the experience different
from the commonly used `man` tool.

This patch introduces a new config option `ui.textwidth`, which replaces the
hardcoded number. It's set to 78 by default to maintain compatibility. When
set to 0, `hg help` will behave more like `man`.
2016-05-04 18:18:24 +01:00
Sean Farley
dcd80c4774 help: wrap ".orig" in rst quotes
Apparently, .orig. is a macro for man pages so we need to wrap it in quotes to
silence lintian warnings.
2016-04-30 18:40:34 -07:00
Gregory Szorc
8720bcb69f hgweb: config option to control zlib compression level
Before this patch, the HTTP transport protocol would always zlib
compress certain responses (notably "getbundle" wire protocol commands)
at zlib compression level 6.

zlib can be a massive CPU resource sink for servers. Some server
operators may wish to reduce server-side CPU requirements while
requiring more bandwidth. This is common on corporate intranets, for
example. Others may wish to use more CPU but reduce bandwidth.

This patch introduces a config option to allow server operators
to control the zlib compression level.

On the "mozilla-unified" generaldelta repository, setting this
value to "0" (disable compression) results in server-side CPU
utilization for a `hg clone` going from ~180s to ~124s CPU time on
my i7-6700K.  A level of "1" (which increases the transfer size from
~1,074 MB at level 6 to ~1,222 MB) utilizes ~132s CPU time.
2016-08-07 18:09:58 -07:00
Gregory Szorc
efb3c4ff63 help: don't try to render a section on sub-topics
This patch subtly changes the behavior of the parsing of "X.Y" values
to not set the "section" variable when rendering a known sub-topic.
Previously, "section" would be the same as the sub-topic name. This
required the sub-topic RST to have a section named the same as the
sub-topic name.

When I made this change, the descriptions from help.internalstable
started being rendered in command line output. This didn't look correct
to me, as it didn't match the formatting of main help pages. I
corrected this by moving the top section to help.internalstable and
changing the section levels of all the "internals" topics.

The end result is that "internals" topics now match the rendering of
main topics on both the CLI and HTML. And, "internals" topics no longer
require a main section matching the name of the topic.
2016-08-06 17:04:22 -07:00
Yuya Nishihara
be362c5e3b help: avoid using "$n" parameter in revsetalias example
Because parsing "$n" requires a crafted tokenizer, it exists only for backward
compatibility (as documented in revset._tokenizealias.) This patch updates the
examples so that users are encouraged to use symbolic names instead of "$n"s.

I'm going to implement alias expansion in templater, which won't support "$n"
parameters to make my life easier. Templater is more complicated than revset
because tokenizer and parser call each other.
2016-03-26 18:50:56 +09:00
Yuya Nishihara
d21d4d0b82 ui: drop template aliases by HGPLAIN
Otherwise, scripting output could be suffered from user aliases.
2016-03-27 21:05:55 +09:00
Yuya Nishihara
ec53346d72 templater: load and expand aliases by template engine (API) (issue4842)
Now template aliases are fully supported in log and formatter templates.

As I said before, aliases are not expanded in map files. This avoids possible
corruption of our stock styles and web templates. This behavior is undocumented
since no map file nor [templates] section are documented at all. Later on,
we might want to add [aliases] section to map files if it appears to be useful.
2016-03-27 20:59:36 +09:00
Gregory Szorc
12984d10a8 help: remove references to "Python 2.6 or later"
We require Python 2.6. So there is no value to these docs.
2016-04-10 10:58:47 -07:00
Gregory Szorc
db9b5063c1 help: document sharing of revlog header with revision 0
The previous docs were incorrect about there being a discrete header
on revlogs.
2016-03-19 15:17:33 -07:00