During some experiments of mine, uncompressed cloning could not be
enabled for hgweb.cgi or hgwebdir.cgi, even though the server claimed
to be stream_out capable.
The only workaround was to enable it in the user's .hgrc file.
That is not acceptable when publishing the repos through an HTTP
server, because the CGI runs as a dedicated www user whose home hgrc
file may not be accessible to users publishing their repos through
their userdir.
For such cases we could end up with this typical debug output:
hg --debug clone --uncompressed http://server/hg/project
destination directory: project
sending capabilities command
capabilities: lookup changegroupsubset stream=1
unbundle=HG10GZ,HG10BZ,HG10UN
sending stream_out command
abort: operation forbidden by server
The error lies in the fact that the hgweb object defines new accessors
to the repo configuration that trust it by default (untrusted=True),
while the streamclone.stream_out function uses the usual accessors on
the repo.ui object, which do not trust by default (untrusted=False).
Fix this inconsistency by adding a new parameter to the stream_out
function; hgweb then forces a "trust by default" behavior.
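The shape of the fix, as a hedged sketch (the response codes and the
exact signature in mercurial/streamclone.py may differ):

def stream_out(repo, fileobj, untrusted=False):
    # before the fix this always read the option with untrusted=False,
    # so a [server] uncompressed entry in the published repo's hgrc
    # was ignored when running under hgweb
    if not repo.ui.configbool('server', 'uncompressed',
                              untrusted=untrusted):
        fileobj.write('1\n')  # client sees "operation forbidden by server"
        return
    fileobj.write('0\n')
    # ... stream the repository files ...

hgweb then simply passes untrusted=True when it calls stream_out.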
The problem was that some functions passed in the "defaults" argument
during templater creation use "self.t" directly. This creates the
cycle:
hgweb object
-> templater object
-> defaults dict
-> footer function
-> hgweb object
Instead of completely avoiding the cycle, we break it after using
the templater.
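A simplified, self-contained illustration of the cycle and the break
(not the actual hgweb code; the templater here is a stand-in):

class templater(object):
    def __init__(self, defaults):
        self.defaults = defaults          # templater -> defaults dict

class hgweb(object):
    def run(self):
        def footer(**map):
            return self.t                 # footer -> hgweb (via self)
        # hgweb -> templater -> defaults -> footer -> hgweb
        self.t = templater({'footer': footer})
        try:
            pass                          # render the pages
        finally:
            self.t = None                 # break the cycle after use

hgweb().run()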
This allows the client to display a reasonable message to the user
(e.g. "Permission denied: .hg/lock"), instead of the current
"<url> does not appear to be an hg repository".
Right now, if a pretxnchangegroup hook fails, we send some HTML
error message to the client and the transaction is not rolled back
(issue499).
Catching util.Abort allows us to send a decent message to the client
and for some reason makes the rollback complete.
This patch is not perfect since it doesn't fix the reason why the
transaction wasn't rolled back (maybe some circular references?).
Also, the transaction is aborted only after we've sent the response
back to the client and the "transaction aborted" message ends up in
the logs of the web server.
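The shape of the fix might look like this (a heavily simplified sketch
of hgweb's unbundle handler; apply_bundle is a hypothetical stand-in
for the real work):

from mercurial import util

def do_unbundle(self, req):
    try:
        # applying the changegroup runs the pretxnchangegroup hook;
        # a failing hook raises util.Abort
        apply_bundle(self, req)           # hypothetical helper
    except util.Abort as inst:
        # send a plain-text message the client can show to the user,
        # instead of an HTML error page
        req.write('abort: %s\n' % inst)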
Arguments to capabilities were added before the 0.9.1 release, so there
are no compatibility issues. Mercurial 0.9 didn't support http push.
Using HG10GZ, HG10BZ and HG10UN has the advantage that new bundle types
can be added later and the client doesn't have to try sending them first
and react to errors sent by the server.
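For illustration, a client could pick a bundle type from the advertised
capability like this (a hypothetical sketch, not the actual client code):

# the capabilities line from the debug output above
caps = 'lookup changegroupsubset stream=1 unbundle=HG10GZ,HG10BZ,HG10UN'

def server_bundle_types(caps):
    # extract the comma-separated list after "unbundle="
    for cap in caps.split():
        if cap.startswith('unbundle='):
            return cap[len('unbundle='):].split(',')
    return []

# prefer the first type both sides understand
known = ['HG10GZ', 'HG10BZ', 'HG10UN']
common = [t for t in known if t in server_bundle_types(caps)]
print(common[0] if common else 'HG10UN')   # -> HG10GZ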
- paths from the old style are prefixed by '/'; make cleanpath strip
  them (see the sketch below)
- make manifest() use relative paths; it was the only function using
  '/'-prefixed paths
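An illustrative sketch of the stripping (the real cleanpath in hgweb
also canonicalizes against the repo root):

import posixpath

def cleanpath(path):
    # old-style queries arrive as '/foo/bar'; strip the leading '/'
    # so every caller can work with repo-relative paths
    path = path.lstrip('/')
    # collapse '.' and duplicate slashes as the real helper would
    return posixpath.normpath('/' + path)[1:]

print(cleanpath('/foo//bar'))   # -> 'foo/bar'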
The only exceptions are web.static and web.templates, since they can
be used to get any file that is readable by the user running the CGI
script.
Other options can be (ab)used to increase the use of the cpu
(allow_bz2) or of the bandwidth (server.uncompressed), but they're
trusted anyway.
This is another step to fix issue189, but currently the file revision numbers
are read as changeset revision numbers, so the link will point to the wrong
revision.
This happens e.g. when using the following apache config:
RewriteRule (.*) /hgwebdir/$1 [PT]
instead of the less readable (but more correct):
RewriteRule (.*) /hgwebdir$1 [PT]
rename commands.dodiff to patch.diff.
rename commands.doexport to patch.export.
move some functions from commands to new mercurial.cmdutil module.
turn list of diff options into mdiff.diffopts class.
patch.diff and patch.export now have a clean API for calls from
third-party python code.
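Third-party code might now call the diff machinery roughly like this
(a hedged sketch; the repo path is an example and the real signatures
may differ in detail):

import sys
from mercurial import hg, ui, patch, mdiff

u = ui.ui()
repo = hg.repository(u, '/path/to/repo')   # example path

# diff options are a class now instead of a list
opts = mdiff.diffopts(ignorews=True)

# write the working-directory diff to stdout
patch.diff(repo, fp=sys.stdout, opts=opts)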
With this change, you can set
[web]
stripes=3
to get stripes every three lines (à la fanfold paper), instead of every
line, on source and directory listings. The default behaviour is stripes=1,
which generates output similar to the current one, and you can also turn
stripes off by setting it to 0.
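One way to implement the striping is a small parity generator that
flips every stripecount items; a sketch of the idea (Mercurial's actual
helper may differ):

def paritygen(stripecount):
    # yields 0 stripecount times, then 1 stripecount times, and so on;
    # stripecount = 0 means "no stripes": always yield 0
    if stripecount == 0:
        while True:
            yield 0
    parity = 0
    count = 0
    while True:
        yield parity
        count += 1
        if count == stripecount:
            count = 0
            parity = 1 - parity

gen = paritygen(3)
print([next(gen) for _ in range(8)])   # -> [0, 0, 0, 1, 1, 1, 0, 0]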
existing clone code uses pull to get changes from remote repo. this is
very slow, uses lots of memory and cpu.
new clone code has server write file data straight to client, client
writes file data straight to disk. memory and cpu used are very low,
clone is much faster over lan.
new client can still clone with pull, can still clone from older servers.
new server can still serve older clients.
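Schematically, the server writes each store file with a small header
and the client copies it straight to disk. An illustrative sketch only;
the helpers are hypothetical and the real format in streamclone.py
differs in details such as locking and response codes:

import os

def stream_out(repo, fileobj):
    # hypothetical helpers: store_entries() -> [(name, size), ...],
    # read_store_file(name) -> raw bytes of one store file
    entries = repo.store_entries()
    total = sum(size for name, size in entries)
    fileobj.write('%d %d\n' % (len(entries), total))
    for name, size in entries:
        fileobj.write('%s\0%d\n' % (name, size))   # per-file header
        fileobj.write(repo.read_store_file(name))  # raw file data

def stream_in(remote, destdir):
    # client side: write what arrives straight to disk
    count, total = map(int, remote.readline().split())
    for _ in range(count):
        name, size = remote.readline().rstrip('\n').split('\0')
        with open(os.path.join(destdir, name), 'wb') as f:
            f.write(remote.read(int(size)))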
str.find returns -1 when the substring is not found; -1 evaluates
to True and is a valid index, which can lead to bugs.
Using alternatives when possible makes the code clearer and less
prone to bugs (and __contains__ is faster in microbenchmarks).
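For example, the buggy pattern versus the clearer one:

s = 'abc'

# buggy: str.find returns -1 for "not found", which is truthy,
# and -1 is also a valid index (s[-1] == 'c')
if s.find('z'):
    print('wrongly treated as found')

# clearer, and not prone to the -1 trap
if 'z' in s:
    print('found')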
After the WSGI changes, even if a push over HTTP succeeds, apache
complains about "Premature end of script headers: hgwebdir.cgi" and
returns a "HTTP Error 500: Internal Server Error", making the local hg
abort.
The change to either of the files touched by this patch is enough to fix
this, but I think changing both is a more robust solution.
First, it changes the server to be almost a generic WSGI server.
Second, it changes request.py to have wsgiapplication and
_wsgirequest. wsgiapplication is a class that creates _wsgirequests
when called by a WSGI compliant server. It needs to know whether
or not it should create hgwebdir or hgweb requests.
Lastly, wsgicgi.py is added, and the CGI scripts are altered to
use it to launch wsgiapplications in a WSGI compliant way.
As a side effect, all the keepalive code has been removed from
request.py. This code needs to be moved so that it is exclusively
in server.py.
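A CGI script then boils down to something like this (a sketch following
the names above; the repo path is an example and the exact module
layout may differ):

#!/usr/bin/env python
from mercurial.hgweb.hgweb_mod import hgweb
from mercurial.hgweb.request import wsgiapplication
from mercurial.hgweb import wsgicgi

def make_web_app():
    # tell wsgiapplication to create hgweb (not hgwebdir) requests
    return hgweb('/path/to/repo', 'repository name')   # example path

wsgicgi.launch(wsgiapplication(make_web_app))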
new hgrc entries allow_push, deny_push, push_ssl control push over http.
allow_push list controls push. if empty or not set, no user can push.
if "*", any user (incl. unauthenticated user) can push. if list of user
names, only authenticated users in list can push.
deny_push list examined before allow_push. if "*", no user can push.
if list of user names, no unauthenticated user can push, and no users
in list can push.
push_ssl requires https connection for push. default is true, so password
sniffing cannot be done.
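a typical repo .hg/hgrc might look like this (user names are examples):

[web]
push_ssl = true
allow_push = alice, bob
deny_push = mallory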
only "hg serve" affected yet. http server running cgi script will not
use persistent connections. support for fastcgi will help that.
clients that support keepalive can use one tcp connection for all
commands during clone and pull. this makes latency of binary search
during pull much lower over wan.
if server does not know content-length, it will force connection to
close at end. right fix is to use chunked transfer-encoding but this is
easier and does not hurt performance. only command that is affected is
"changegroup" which is always last command during a pull.