Commit Graph

992 Commits

Author SHA1 Message Date
Boris Feld
ac4da5873b configitems: register the 'web.encoding' config 2017-06-30 03:45:44 +02:00
Boris Feld
01eaec495b configitems: register the 'web.description' config 2017-06-30 03:45:43 +02:00
Boris Feld
bb6b276196 configitems: register the 'web.descend' config 2017-06-30 03:45:42 +02:00
Boris Feld
208bc8bf91 configitems: register the 'web.collapse' config 2017-06-30 03:45:38 +02:00
Boris Feld
9ac692bc8c configitems: register the 'web.address' config 2017-06-30 03:45:32 +02:00
Boris Feld
b22253c4ad configitems: register the 'web.accesslog' config 2017-06-30 03:45:31 +02:00
Boris Feld
ac80b45fb8 web: use '_unset' default value for proxy config method
This special value is needed to make sure registered default value are taken in
account.
2017-09-15 19:21:08 +02:00
Yuya Nishihara
a71f259bd2 doctest: bulk-replace string literals with b'' for Python 3
Our code transformer can't rewrite string literals in docstrings, and I
don't want to make the transformer more complex.
2017-09-03 14:32:11 +09:00
Martin von Zweigbergk
cb8b36b8aa cleanup: rename "matchfn" to "match" where obviously a matcher
We usually call matchers either "match" or "m" and reserve "matchfn"
for functions.

Differential Revision: https://phab.mercurial-scm.org/D641
2017-09-05 15:06:45 -07:00
Augie Fackler
e2774d9258 python3: wrap all uses of <exception>.strerror with strtolocal
Our string literals are bytes, and we mostly want to %-format a
strerror into a one of those literals, so this fixes a ton of issues.
2017-08-22 20:03:07 -04:00
Augie Fackler
cc479af38a httppeer: add support for httppostargs when we're sending a file
This is probably only used in the 'unbundle' command, but the code
ended up being cleaner to make it generic and treat *all* httppostargs
with a non-args request body as though they were file-like in
nature. It also means we get test coverage more or less for free. A
previous version of this change didn't use io.BytesIO, and it was a
lot more complicated.

This also fixes a server-side bug, so anyone using httppostargs should
update all of their servers to this revision or later *before* this
gets to their clients, otherwise servers will hang trying to over-read
the POST body.

Differential Revision: https://phab.mercurial-scm.org/D231
2017-07-26 17:58:19 -04:00
Gregory Szorc
64adaa7b62 revset: pass repo when passing ui
The repo instance is currently only used to provide a changeset
lookup function as part of parsing revsets. I /think/ this allows
node fragments to resolve. I'm not sure why we wouldn't want this
to always "just work" if parsing a revset string.

Plus, an upcoming commit will introduce a new consumer that needs a
handle on the repo. So passing it more often will make that code
work more.

Passing a repo instance in all callers of revset.match* results in
a bunch of test changes. Notably, branch and tags caches get
populated as part of evaluating revsets. I'm not sure if this is
desirable. So this patch takes the conservative approach and only
passes the repo if we're passing a ui instance.

Differential Revision: https://phab.mercurial-scm.org/D97
2017-07-15 15:51:57 -07:00
David Demelier
7369cb3896 hgweb: use ui._unset to prevent a warning in configitems 2017-07-03 13:04:35 +02:00
Pierre-Yves David
014dc08cb1 configitems: register the 'server.zliblevel' config 2017-06-30 03:44:16 +02:00
Denis Laxalde
831d4dcf5b hgweb: plug followlines action in annotate view
Add the followlines.js script and corresponding parameters as data attribute
on <tbody class="sourcelines"> element.
Extend CSS rules so that they also match the DOM structure of annotate view.

As previously, only address paper and gitweb styles (other styles do not have
followlines at all).
2017-06-21 17:17:17 +02:00
Yuya Nishihara
e75a42ecc9 dagop: move blockancestors() and blockdescendants() from context
context.py seems not a good place to host these functions.

  % wc -l mercurial/context.py mercurial/dagop.py
    2306 mercurial/context.py
     424 mercurial/dagop.py
    2730 total
2017-02-19 19:37:14 +09:00
Pierre-Yves David
2fdfd87513 profile: drop maybeprofile
It seems sufficiently simple to use "profile(enabled=X)" to not justify having
a dedicated context manager just to read the config.

(I do not have a too strong opinion about this).
2017-06-09 12:29:29 +01:00
Yuya Nishihara
877b7b1d1a help: pass commands module by argument
This removes import cycle.
2017-05-21 16:57:32 +09:00
Jun Wu
58df20579d webcommands: use fctx.isbinary 2017-05-03 18:04:43 -07:00
Denis Laxalde
d141711562 hgweb: do not show "descending" link in followlines UI for filelog heads
When on a filelog head, we are certain that there will be no descendant so the
target of the "descending" link will lead to an empty log result. Do not
display the link in this case.
2017-04-24 10:32:15 +02:00
Matt Harbison
0181beb642 hgwebdir: allow a repository to be hosted at "/"
This can be useful in general, but will also be useful for hosting subrepos,
with the main repo at /.
2017-03-31 23:00:41 -04:00
Denis Laxalde
bd52f5d831 hgweb: handle a "descend" query parameter in filelog command
When this "descend" query parameter is present along with "linerange"
parameter, we get revisions following line range in descending order. The
parameter has no effect without "linerange".
2017-04-10 16:23:41 +02:00
Denis Laxalde
927c1336ab mdiff: add a hunkinrange helper function
This factors out hunk filtering logic by line range that is similar in
mdiff.blocksinrange() and hgweb.webutil.diffs().
2017-04-01 12:24:59 +02:00
Gregory Szorc
64e2de02bd hgweb: extract path traversal checking into standalone function
A common exploit in web applications that access paths is to insert
path separator strings like ".." to try to get the server to serve up
files it shouldn't.

We have code for detecting this in staticfile(). A subsequent commit
will need to perform this test as well. Since this is security code,
let's factor the check so we don't have to reinvent the wheel.
2017-03-31 21:47:26 -07:00
Gregory Szorc
bfa11ec1e0 hgweb: use context manager for file I/O 2017-03-31 22:30:38 -07:00
Denis Laxalde
495eefece3 hgweb: prefix line id by ctx shortnode in filelog when patches are shown
When "patch" query parameter is present in requests to filelog view, line ids
in patches diff are no longer unique in the page since several patches are
shown on the same page. We now prefix line id by changeset shortnode when
several patches are displayed in the same page to have unique line ids
overall.
2017-03-30 21:40:10 +02:00
Denis Laxalde
6af640ec03 hgweb: fix diff hunks filtering by line range in webutil.diffs()
The previous clause for filter out a diff hunk was too restrictive. We need to
consider the following cases (assuming linerange=(lb, ub) and the @s2,l2
hunkrange):

            <-(s2)--------(s2+l2)->
      <-(lb)---(ub)->
               <-(lb)---(ub)->
                           <-(lb)---(ub)->

previously on the first and last situations were considered.

In test-hgweb-filelog.t, add a couple of lines at the beginning of file "b" so
that the line range we will follow does not start at the beginning of file.
This covers the change in aforementioned diff hunk filter clause.
2017-03-29 12:07:07 +02:00
Denis Laxalde
2702418496 hgweb: filter diff hunks when 'linerange' and 'patch' are specified in filelog 2017-03-13 15:17:20 +01:00
Denis Laxalde
69dcb458cd hgweb: add a 'linerange' parameter to webutil.diffs()
This is used to filter out hunks based on their range (with respect to 'node2'
for patch.diffhunks() call, i.e. 'ctx' for webutil.diffs()).

This is the simplest way to filter diff hunks, here done on server side. Later
on, it might be interesting to perform this filtering on client side and
expose a "toggle" action to alternate between full and filtered diff.
2017-03-13 15:15:49 +01:00
Denis Laxalde
996bd4af95 hgweb: handle a "linerange" request parameter in filelog command
We now handle a "linerange" URL query parameter to filter filelog using
a logic similar to followlines() revset.
The URL syntax is: log/<rev>/<file>?linerange=<fromline>:<toline>
As a result, filelog entries only consists of revision changing specified
line range.

The linerange information is propagated to "more"/"less" navigation links but
not to numeric navigation links as this would apparently require a dedicated
"revnav" class.

Only update the "paper" template in this patch.
2017-01-19 17:41:00 +01:00
Denis Laxalde
6b9779860f hgweb: add a "patch" query parameter to filelog command
Add support for a "patch" query parameter in filelog web command similar to
--patch option of `hg log` to display the diff of each changeset in the table
of revisions. The diff text is displayed in a dedicated row of the table that
follows the existing one for each entry and spans over all columns. Only
update "paper" template in this patch.
2017-03-13 10:41:13 +01:00
Denis Laxalde
53e1237344 hgweb: handle "parity" internally in webutil.diffs()
There's apparently no reason to have the "parity" of diff blocks that
webutil.diffs() generates coming from outside the function. So have it
internally managed. We thus now pass a "web" object to webutil.diffs() to get
access to both "repo" and "stripecount" attribute.
2017-03-13 10:40:19 +01:00
Matt Harbison
27ca0a8a5b hgwebdir: add support for explicit index files
This is useful for when repositories are nested in --web-conf, and in the future
with hosted subrepositories.  The previous behavior was only to render an index
at each virtual directory.  There is now an explicit 'index' child for each
virtual directory.  The name was suggested by Yuya, for consistency with the
other method names.

Additionally, there is now an explicit 'index' child for every repository
directory with a nested repository somewhere below it.  This seems more
consistent with each virtual directory hosting an index, and more discoverable
than to only have an index for a directory that directly hosts a nested
repository.  I couldn't figure out how to close the loop and provide one in each
directory without a deeper nested repository, without blocking a committed
'index' file.  Keeping that seems better than rendering an empty index.
2017-03-05 22:22:32 -05:00
Gregory Szorc
5ca0f908bf py3: add __bool__ to every class defining __nonzero__
__nonzero__ was renamed to __bool__ in Python 3. This patch simply
aliases __bool__ to __nonzero__ for every class implementing
__nonzero__.
2017-03-13 12:40:14 -07:00
Pierre-Yves David
50e7f5d5fd hgweb: explicitly tests for None
Changeset 11e325d162fe removed the mutable default value, but did not explicitly
tested for None. Such implicit testing can introduce semantic and performance
issue. We move to an explicit testing for None as recommended by PEP8:

https://www.python.org/dev/peps/pep-0008/#programming-recommendations
2017-03-15 15:11:04 -07:00
Pierre-Yves David
c05c73d498 hgweb: explicitly tests for None in webutil
Changeset 45c7a22dbdc0 removed the mutable default value, but did not explicitly
tested for None. Such implicit testing can introduce semantic and performance
issue. We move to an explicit testing for None as recommended by PEP8:

https://www.python.org/dev/peps/pep-0008/#programming-recommendations
2017-03-15 15:10:09 -07:00
Gregory Szorc
d2e9e46760 hgweb: don't use mutable default argument value 2017-03-12 21:52:17 -07:00
Gregory Szorc
5cc9a634fe hgweb: don't use mutable default argument value 2016-12-26 16:55:47 -07:00
Denis Laxalde
0ed26e5739 hgweb: use patch.diffhunks in webutil.diffs to simplify the algorithm
Function patch.diffhunks yields items for a "block" (i.e. a file) as a whole
so take advantage of this to simplify the algorithm and avoid parsing diff
lines to determine whether we're starting a new "block" or not. Thus we drop
to external block counter and rely on diffhunks iterations instead.
We also take advantage of the fact that patch.diffhunks() yields *lines* of
hunks (instead of a string) to avoid building a list that is ''.join-ed into a
string that is then split.

As lines in 'header' returned by patch.diffhunks() have no trailing new line,
we need to insert it ourselves to match template expectations.
2017-03-06 09:28:33 +01:00
Denis Laxalde
5d10a8a4b5 hgweb: start enumerate at 1 in webutil.diffs's inner function prettyprintlines 2017-03-06 09:44:39 +01:00
Denis Laxalde
be3d757c61 hgweb: explictly pass basectx in webutil.diffs
There's only one case where `basectx` parameter is None (over two usages), so
it's probably not worth handling the special case as it makes code-reading
harder.

Along the way, use ctx.p1() instead of checking for ctx.parents() being empty
which should not occur.
2017-01-17 17:25:48 +01:00
Yuya Nishihara
b2229f5117 revset: split language services to revsetlang module (API)
New revsetlang module hosts parser, tokenizer, and miscellaneous functions
working on parsed tree. It does not include functions for evaluation such as
getset() and match().

  2288 mercurial/revset.py
   684 mercurial/revsetlang.py
  2972 total

get*() functions are aliased since they are common in revset.py.
2017-02-19 18:19:33 +09:00
Yuya Nishihara
d63d83be69 revset: import set classes directly from smartset module
Follows up 97d0be4019ac.
2017-02-19 18:16:09 +09:00
Denis Laxalde
86ca3ec602 hgweb: simplify calculation of first revision in filelog command 2017-01-17 09:19:24 +01:00
Denis Laxalde
8eecb0ced7 hgweb: restore ascending iteration on revs in filelog web command
Follow-up on e082a1597833. Adjust back the "parity" generator's offset to keep
rendering the same.
2017-01-17 09:17:29 +01:00
Denis Laxalde
e0d6f05072 hgweb: build the "entries" list directly in filelog command
There's no apparent reason to have this "entries" generator function that
builds a list and then yields its elements in reverse order and which is only
called to build the "entries" list. So just build the list directly, in
reverse order.

Adjust "parity" generator's offset to keep rendering the same.
2017-01-13 10:22:25 +01:00
Gregory Szorc
9849c580fb hgweb: support Content Security Policy
Content-Security-Policy (CSP) is a web security feature that allows
servers to declare what loaded content is allowed to do. For example,
a policy can prevent loading of images, JavaScript, CSS, etc unless
the source of that content is whitelisted (by hostname, URI scheme,
hashes of content, etc). It's a nifty security feature that provides
extra mitigation against some attacks, notably XSS.

Mitigation against these attacks is important for Mercurial because
hgweb renders repository data, which is commonly untrusted. While we
make attempts to escape things, etc, there's the possibility that
malicious data could be injected into the site content. If this happens
today, the full power of the web browser is available to that
malicious content. A restrictive CSP policy (defined by the server
operator and sent in an HTTP header which is outside the control of
malicious content), could restrict browser capabilities and mitigate
security problems posed by malicious data.

CSP works by emitting an HTTP header declaring the policy that browsers
should apply. Ideally, this header would be emitted by a layer above
Mercurial (likely the HTTP server doing the WSGI "proxying"). This
works for some CSP policies, but not all.

For example, policies to allow inline JavaScript may require setting
a "nonce" attribute on <script>. This attribute value must be unique
and non-guessable. And, the value must be present in the HTTP header
and the HTML body. This means that coordinating the value between
Mercurial and another HTTP server could be difficult: it is much
easier to generate and emit the nonce in a central location.

This commit introduces support for emitting a
Content-Security-Policy header from hgweb. A config option defines
the header value. If present, the header is emitted. A special
"%nonce%" syntax in the value triggers generation of a nonce and
inclusion in <script> elements in templates. The inclusion of a
nonce does not occur unless "%nonce%" is present. This makes this
commit completely backwards compatible and the feature opt-in.

The nonce is a type 4 UUID, which is the flavor that is randomly
generated. It has 122 random bits, which should be plenty to satisfy
the guarantees of a nonce.
2017-01-10 23:37:08 -08:00
Gregory Szorc
f71c86b7e9 protocol: send application/mercurial-0.2 responses to capable clients
With this commit, the HTTP transport now parses the X-HgProto-<N>
header to determine what media type and compression engine to use for
responses. So far, we only compress responses that are already being
compressed with zlib today (stream response types to specific
commands). We can expand things to cover additional response types
later.

The practical side-effect of this commit is that non-zlib compression
engines will be used if both ends support them. This means if both
ends have zstd support, zstd - not zlib - will be used to compress
data!

When cloning the mozilla-unified repository between a local HTTP
server and client, the benefits of non-zlib compression are quite
noticeable:

  engine     server CPU (s)   client CPU (s)    bundle size
zlib (l=6)      174.1            283.2         1,148,547,026
zstd (l=1)       99.2            267.3         1,127,513,841
zstd (l=3)      103.1            266.9         1,018,861,363
zstd (l=7)      128.3            269.7           919,190,278
zstd (l=10)     162.0               -            894,547,179
none             95.3            277.2         4,097,566,064

The default zstd compression level is 3. So if you deploy zstd
capable Mercurial to your clients and servers and CPU time on
your server is dominated by "getbundle" requests (clients cloning
and pulling) - and my experience at Mozilla tells me this is often
the case - this commit could drastically reduce your server-side
CPU usage *and* save on bandwidth costs!

Another benefit of this change is that server operators can install
*any* compression engine. While it isn't enabled by default, the
"none" compression engine can now be used to disable wire protocol
compression completely. Previously, commands like "getbundle" always
zlib compressed output, adding considerable overhead to generating
responses. If you are on a high speed network and your server is under
high load, it might be advantageous to trade bandwidth for CPU.
Although, zstd at level 1 doesn't use that much CPU, so I'm not
convinced that disabling compression wholesale is worthwhile. And, my
data seems to indicate a slow down on the client without compression.
I suspect this is due to a lack of buffering resulting in an increase
in socket read() calls and/or the fact we're transferring an extra 3 GB
of data (parsing HTTP chunked transfer and processing extra TCP packets
can add up). This is definitely worth investigating and optimizing. But
since the "none" compressor isn't enabled by default, I'm inclined to
punt on this issue.

This commit introduces tons of tests. Some of these should arguably
have been implemented on previous commits. But it was difficult to
test without the server functionality in place.
2016-12-24 15:29:32 -07:00
Gregory Szorc
52ab84abd8 httppeer: extract code for HTTP header spanning
A second consumer of HTTP header spanning will soon be introduced.
Factor out the code to do this so it can be reused.
2016-12-24 14:46:02 -07:00
Anton Shestakov
836493ef5e hgweb: use archivespecs for links on repo index page too
Moving archivespecs to the module level allows using it from other modules
(such as hgwebdir_mod), and keeping a reference to it in requestcontext allows
current code to just work.
2017-01-10 23:41:58 +08:00