It seems sufficiently simple to use "profile(enabled=X)" to not justify having
a dedicated context manager just to read the config.
(I do not have a too strong opinion about this).
This is useful for when repositories are nested in --web-conf, and in the future
with hosted subrepositories. The previous behavior was only to render an index
at each virtual directory. There is now an explicit 'index' child for each
virtual directory. The name was suggested by Yuya, for consistency with the
other method names.
Additionally, there is now an explicit 'index' child for every repository
directory with a nested repository somewhere below it. This seems more
consistent with each virtual directory hosting an index, and more discoverable
than to only have an index for a directory that directly hosts a nested
repository. I couldn't figure out how to close the loop and provide one in each
directory without a deeper nested repository, without blocking a committed
'index' file. Keeping that seems better than rendering an empty index.
Content-Security-Policy (CSP) is a web security feature that allows
servers to declare what loaded content is allowed to do. For example,
a policy can prevent loading of images, JavaScript, CSS, etc unless
the source of that content is whitelisted (by hostname, URI scheme,
hashes of content, etc). It's a nifty security feature that provides
extra mitigation against some attacks, notably XSS.
Mitigation against these attacks is important for Mercurial because
hgweb renders repository data, which is commonly untrusted. While we
make attempts to escape things, etc, there's the possibility that
malicious data could be injected into the site content. If this happens
today, the full power of the web browser is available to that
malicious content. A restrictive CSP policy (defined by the server
operator and sent in an HTTP header which is outside the control of
malicious content), could restrict browser capabilities and mitigate
security problems posed by malicious data.
CSP works by emitting an HTTP header declaring the policy that browsers
should apply. Ideally, this header would be emitted by a layer above
Mercurial (likely the HTTP server doing the WSGI "proxying"). This
works for some CSP policies, but not all.
For example, policies to allow inline JavaScript may require setting
a "nonce" attribute on <script>. This attribute value must be unique
and non-guessable. And, the value must be present in the HTTP header
and the HTML body. This means that coordinating the value between
Mercurial and another HTTP server could be difficult: it is much
easier to generate and emit the nonce in a central location.
This commit introduces support for emitting a
Content-Security-Policy header from hgweb. A config option defines
the header value. If present, the header is emitted. A special
"%nonce%" syntax in the value triggers generation of a nonce and
inclusion in <script> elements in templates. The inclusion of a
nonce does not occur unless "%nonce%" is present. This makes this
commit completely backwards compatible and the feature opt-in.
The nonce is a type 4 UUID, which is the flavor that is randomly
generated. It has 122 random bits, which should be plenty to satisfy
the guarantees of a nonce.
Moving archivespecs to the module level allows using it from other modules
(such as hgwebdir_mod), and keeping a reference to it in requestcontext allows
current code to just work.
This allows us to write doctests depending on a ui object, but not on global
configs.
ui.load() is a class method so we can do wsgiui.load(). All ui() calls but
for doctests are replaced with ui.load(). Some of them could be changed to
not load configs later.
Currently, running `hg serve --profile` doesn't yield anything useful:
when the process is terminated the profiling output displays results
from the main thread, which typically spends most of its time in
select.select(). Furthermore, it has no meaningful results from
mercurial.* modules because the threads serving HTTP requests don't
actually get profiled.
This patch teaches the hgweb wsgi applications to profile individual
requests. If profiling is enabled, the profiler kicks in after
HTTP/WSGI environment processing but before Mercurial's main request
processing.
The profile results are printed to the configured profiling output.
If running `hg serve` from a shell, they will be printed to stderr,
just before the HTTP request line is logged. If profiling to a file,
we only write a single profile to the file because the file is not
opened in append mode. We could add support for appending to files
in a future patch if someone wants it.
Per request profiling doesn't work with the statprof profiler because
internally that profiler collects samples from the thread that
*initially* requested profiling be enabled. I have plans to address
this by vendoring Facebook's customized statprof and then improving
it.
hgweb currently offers limited functionality for "classifying"
repositories. This patch aims to change that.
The web.labels config option list is introduced. Its values
are exposed to the "index" and "summary" templates. Custom
templates can use template features like ifcontains() to e.g.
look for the presence of a specific label and engage specific
behavior. For example, a site operator may wish to assign a
"defunct" label to a repository so the repository is prominently
marked as dead in repository indexes.
New frommapfile() function will make it clear when template aliases will be
loaded. They should be applied to command arguments and templates in hgrc,
but not to map files. Otherwise, our stock styles and web templates
(i.e map-file templates) could be modified unintentionally.
Future patches will add "aliases" argument to __init__(), but not to
frommapfile().
The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.
For great justice.
It took longer than I wanted to grok how the various parts of hgweb
worked. So I added some class and method documentation to help whoever
hacks on this next.
hgwebdir refreshes the set of known repositories periodically. This
is necessary because refreshing on every request could add significant
request latency.
More than once I've found myself wanting to tweak this interval at
Mozilla. I've also wanted the ability to always refresh (often when
writing tests for our replication setup).
This patch makes the refresh interval configurable. Negative values
indicate to always refresh. The default is left unchanged.
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".
This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.
This patch was produced by running `2to3 -f except -w -n .`.
Previously, if a subrepo parent had 'web.hidden=True' set, neither the parent
nor child had a repository entry. However, the directory entry for the parent
would be listed (it wouldn't have the fancy 'web.name' if configured), and that
link went to the repo's summary page, effectively making it not hidden.
This simply disables the directory processing if a valid repository is present.
Whether or not the subrepo should be hidden is debatable, but this leaves that
behavior unchanged (i.e. it stays hidden).
Previously, when 'web.name' was set on a subrepo parent and 'web.collapse=True',
the parent repo would show in the list with the configured 'web.name', and a
directory with the parent repo's filesystem name (with a trailing slash) would
also appear. The subrepo(s) would unexpectedly be excluded from the list of
repositories. Clicking the directory entry would go right to the repo page.
Now both the parent and the subrepos show up, without the additional directory
entry.
The configured hgweb paths used '**' for finding the repos in this scenario.
A couple of notes about the tests:
- The area where the subrepo was added has a comment that it tests subrepos,
though none previously existed there. One now does.
- The 'web.descend' option is required for collapse to work. I'm not sure what
the previous expectations were for the test. Nothing changed with it set,
prior to adding the code in this patch. It is however required for this test.
- The only output changes are for the hyperlinks, obviously because of the
'web.name' parameter.
- Without this code change, there would be an additional diff:
--- /usr/local/mercurial/tests/test-hgwebdir.t
+++ /usr/local/mercurial/tests/test-hgwebdir.t.err
@@ -951,7 +951,7 @@
/rcoll/notrepo/e/
/rcoll/notrepo/e/e2/
/rcoll/notrepo/f/
- /rcoll/notrepo/f/f2/
+ /rcoll/notrepo/f/
Test repositories inside intermediate directories
I'm not sure why the fancy name doesn't come out, but it is enough to
demonstrate that the parent is not listed redundantly, and the subrepo isn't
skipped.
Some extensions set configuration settings that showed up in 'hg showconfig
--debug' with 'none' as source. That was confusing.
Instead, they will now tell which extension they come from.
This change tries to be consistent and specify a source everywhere - also where
it perhaps is less relevant.
Until now, repositories did not provide any value for isdirectory in rows
produced for the index output, and thus isdirectory was generally evaluated as
None for each index entry representing a repository.
However, directories (visible when viewed with the descend and collapse
settings enabled) did provide a value of True and this value appeared to
persist in subsequent rows processed by the templater, causing isdirectory
tests in templates to produce incorrect results for index entries appearing
after directories.
This patch asserts the None value for repositories, thus erasing any such
persistent True values.
The web.prefix setting was being ignored when creating the index URL
breadcrumbs.
We only need to fix hgwebdir and not hgweb because hgweb gets the complete URL
request, including the prefix, while hgwebdir gets a "subdir" which does not
include the prefix.
This fix is slightly different of what was suggested on the bug tracker. In
there it was suggested to hide the prefix itself from the breadcrumb. I think
that would be a better solution, but it would require changing all the index
templates and passing the prefix to the template engine, which may be too big
a change for stable during the freeze. For now this fixes the problem, and the
fix could be improved during the next cycle.
The purpose of this change is to make it much easier to navigate up the
repository tree when the hg web server is used to serve more than one
repository.
A "URL breadcrumb" is a path where each of the path items can be clicked to go
to the corresponding path page.
This lets you go up the folder hierarchy very quickly. For example, when showing
the list of repositories in http://myserver/myteams/myprojects, the following
"breadcrumb" will be shown:
Mercurial > myteams > myprojects
Clicking on "myprojects" reloads the page. Clicking on "myteams" goes up one
folder. Clicking on the leftmost "Mercurial" goes to the server root.
This "breadcrumb" also appears on all repository pages. For example on the
summary page of the repository at http://myserver/myteams/myprojects/myrepo the
following will be shown:
Mercurial > myteams > myprojects > myrepo / summary
This change has been applied to all templates that already had a link to the
main repository page (i.e. gitweb, monoblue, paper and coal) plus to the index
page of the spartan template.
In order to make the breadcumb links stand out the some of the template styles
have been customized.
Up until now the templates that show RSS and Atom feeds on the "repository
lists" (i.e. gitweb and monoblue) showed them for all entries, including regular
folders. Clicking on those "folder RSS" links would result in an error page
being shown.
This patch hides those links for regular folders.
The existing help only walked through an example.
Now we first explain the basic rules and then show an example.
The 'collections' example and description only cause confusion and is removed.
Bikeshedded by Patrick Mezard <patrick@mezard.eu>
The descend option in hgweb can be used to display all reachable repositories
within a directory hierarchy if set to True. However, all reachable
repositories, regardless of their depth below the root of the hierarchy, are
then listed at the same level - expanded - in the hgweb interface. This patch
adds support for showing only each level of a directory hierarchy, with
subrepositories being shown alongside their parent repositories only at the
appropriate level (because there is no way to navigate to subrepositories from
within repositories), and the contents of directories hidden - collapsed -
behind a link for each directory. To enable this multi-level navigation, a new
option called collapse must be set to True when the descend option is set to
True.
This change complements the existing web/logourl setting, and lets the user
customize the logo image that is shown on many of the hg server pages.
If this setting is not set, hglogo.png is used.
The introduction of the new URL parsing code has created a startup
time regression. This is mainly due to the use of url.hasscheme() in
the ui class. It ends up importing many libraries that the url module
requires.
This fix helps marginally, but if we can get rid of the urllib import
in the URL parser all together, startup time will go back to normal.
perfstartup time before the URL refactoring (707e4b1e8064):
! wall 0.050692 comb 0.000000 user 0.000000 sys 0.000000 (best of 100)
current startup time (9ad1dce9e7f4):
! wall 0.070685 comb 0.000000 user 0.000000 sys 0.000000 (best of 100)
after this change:
! wall 0.064667 comb 0.000000 user 0.000000 sys 0.000000 (best of 100)