str.find returns -1 when the substring is not found. -1 evaluates
to True and is a valid index, which can lead to bugs.
Using alternatives when possible makes the code clearer and less
prone to bugs (and __contains__ is faster in microbenchmarks).
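A minimal sketch of the pitfall, using made-up strings; the variable names are illustrative and not taken from the changed code:

```python
text = "hello world"

# find() returns -1 on a miss, and -1 is truthy, so a bare truth test
# only fails when the substring sits at index 0.
buggy_hit = bool(text.find("xyz"))      # True, although "xyz" is absent
buggy_miss = bool(text.find("hello"))   # False, although "hello" is present

# -1 is also a valid index: it silently selects the last character.
last_char = text[text.find("xyz")]

# The membership test states the intent directly.
clear_hit = "xyz" in text
clear_present = "hello" in text
```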
on 5.8MB (244,000 lines) text file with similar lines, hash before
this change made diff against empty file take 75 seconds. this change
improves performance to 0.6 seconds. result is that clone of smallish
repo (137MB) with some files like this takes 1 minute instead of 10
minutes.
common case of diff is 10% slower now, probably because of worse cache
locality. but diff does not affect overall performance in common case
(less than 1% of runtime is in diff when it is working ok), so this
tradeoff looks good.
First, it changes the server to be almost a generic WSGI server.
Second, it changes request.py to have wsgiapplication and
_wsgirequest. wsgiapplication is a class that creates _wsgirequests
when called by a WSGI compliant server. It needs to know whether
or not it should create hgwebdir or hgweb requests.
Lastly, wsgicgi.py is added, and the CGI scripts are altered to
use it to launch wsgiapplications in a WSGI compliant way.
As a side effect, all the keepalive code has been removed from
request.py. This code needs to be moved so that it is exclusively
in server.py.
filterfiles was failing to find files for directory arguments if
another file existed that started with the directory name and
sorted earlier. For example, a manifest of ('foo.h', 'foo/foo')
would cause filterfiles('foo') to return nothing. This resolves
issue #294.
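The failure mode can be sketched as follows. This is not the actual filterfiles code; files_under is a hypothetical helper, and it assumes the manifest is a sorted list of paths searched by bisection:

```python
import bisect

def files_under(sorted_files, directory):
    """Return the entries of sorted_files that live under directory."""
    prefix = directory + "/"
    # Bisecting for "foo/" rather than "foo" skips entries such as
    # "foo.h" that share the prefix but sort before the directory.
    start = bisect.bisect_left(sorted_files, prefix)
    found = []
    for name in sorted_files[start:]:
        if not name.startswith(prefix):
            break
        found.append(name)
    return found

manifest = sorted(["foo.h", "foo/foo"])   # "." sorts before "/"
result = files_under(manifest, "foo")
```

Bisecting for the bare name "foo" would land on "foo.h", which does not start with "foo/", so a scan that stops at the first mismatch would return nothing.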
cold cache diff performance has regressed in two ways. localrepo.changes
has optimizations for diffing against the working dir parent that expect
node1 to be None. commands.revpair() usage means that commands.dodiff()
never sends node1 == None. This is fixed in localrepo.changes by checking
against the dirstate parents.
In the non-dirstate parents case, localrepo.changes does a loop comparing
files without first sorting the file names, leading to random access
across the disk.
new hgrc entries allow_push, deny_push, push_ssl control push over http.
allow_push list controls push. if empty or not set, no user can push.
if "*", any user (incl. unauthenticated user) can push. if list of user
names, only authenticated users in list can push.
deny_push list examined before allow_push. if "*", no user can push.
if list of user names, no unauthenticated user can push, and no users
in list can push.
push_ssl requires https connection for push. default is true, so password
sniffing can not be done.
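The allow_push and deny_push rules above can be sketched as a single check. may_push is a hypothetical helper, not the hgweb implementation; it assumes both settings arrive as lists of strings and that an unauthenticated user is represented by None. push_ssl is not modelled here.

```python
def may_push(user, allow_push=(), deny_push=()):
    """user is the authenticated user name, or None if unauthenticated."""
    # deny_push is examined before allow_push.
    if deny_push:
        if "*" in deny_push:
            return False
        # A list of names also shuts out unauthenticated users.
        if user is None or user in deny_push:
            return False
    # Empty or unset allow_push: no user can push.
    if not allow_push:
        return False
    if "*" in allow_push:
        return True
    # Otherwise only authenticated users named in the list can push.
    return user is not None and user in allow_push
```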
On the kernel repo:
$ hg heads -q
            before  after
RevlogNG      1.11   0.52
Revlogv0      0.80   0.69
Since the current code for tags has to find all the heads of the repo,
this also helps there:
$ hg tags
            before  after
RevlogNG      2.35   1.76
Revlogv0      2.04   1.90
This allows one to walk the revision graph using only revision numbers,
which can be faster than using revision hashes, especially for
RevlogNG, where the parents of a revision are stored as revision
numbers.
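As an illustration of walking the graph by revision number, here is a sketch that finds heads from a table of parent revision numbers. The table layout, the NULLREV marker and head_revs are assumptions made for the example, not the revlog API:

```python
NULLREV = -1  # assumed marker for "no parent"

def head_revs(parentrevs):
    """parentrevs[r] is the (p1, p2) pair of parent revision numbers of r."""
    is_head = [True] * len(parentrevs)
    for p1, p2 in parentrevs:
        for p in (p1, p2):
            if p != NULLREV:
                # Any revision that is somebody's parent is not a head.
                is_head[p] = False
    return [r for r, head in enumerate(is_head) if head]

# 0 - 1 - 2
#      \
#       3
graph = [(-1, -1), (0, -1), (1, -1), (1, -1)]
heads = head_revs(graph)
```

No hash lookups are needed: every step is an integer index into a list.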
now all repositories have capabilities slot, tuple with list of names.
if 'unbundle' capability present, repo supports push where client does
not need to lock server. repository classes that have unbundle capability
also have unbundle method.
implemented for ssh now, will be base for push over http.
unbundle protocol acts this way. server tells client what heads it
has during normal negotiate step. client starts unbundle by repeating
server's heads back to it. if server has new heads, abort immediately.
otherwise, transfer changes to server. once data transferred, server
locks and checks heads again. if heads same, changes can be added.
else someone else added heads, and server aborts.
if client wants to force server to add heads, sends special heads list of
'force'.
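The check, transfer, lock and re-check sequence can be sketched with a toy in-memory server. ToyServer and its new_heads argument are stand-ins invented for the example (a real push transfers changesets, not a head list), so read this as an outline of the protocol rather than the ssh implementation:

```python
import threading

class ToyServer:
    def __init__(self, heads):
        self.heads = sorted(heads)
        self.lock = threading.Lock()

    def _stale(self, heads_seen):
        # The special list ['force'] skips the comparison.
        return heads_seen != ["force"] and sorted(heads_seen) != self.heads

    def unbundle(self, heads_seen, new_heads):
        """Return True if changes were added, False if the push aborted."""
        # Cheap check before any data is transferred.
        if self._stale(heads_seen):
            return False
        # (the changes would be transferred here, without the lock held)
        with self.lock:
            # Check again: someone may have pushed during the transfer.
            if self._stale(heads_seen):
                return False
            self.heads = sorted(new_heads)
            return True

server = ToyServer(["a"])
seen = list(server.heads)                     # heads learned while negotiating
first = server.unbundle(seen, ["b"])          # heads match, push accepted
stale = server.unbundle(seen, ["c"])          # server moved on, push aborted
forced = server.unbundle(["force"], ["c"])    # forced push accepted
```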
uses keepalive module from urlgrabber package. tested against "hg serve",
cgi server, and through http proxy. used ethereal to verify that only
one tcp connection used during entire "hg pull" sequence.
if server supports keepalive, this makes latency of "hg pull" much lower.
only "hg serve" affected yet. http server running cgi script will not
use persistent connections. support for fastcgi will help that.
clients that support keepalive can use one tcp connection for all
commands during clone and pull. this makes latency of binary search
during pull much lower over wan.
if server does not know content-length, it will force connection to
close at end. right fix is to use chunked transfer-encoding but this is
easier and does not hurt performance. only command that is affected is
"changegroup" which is always last command during a pull.
Use case: If a remote repo has two heads and I _want_ to merge them, I merge
and push. Meanwhile someone else pushed on top of one of the heads. He won't
get a warning, because he doesn't create a new head. I won't notice that I
didn't close a head, because I don't get a message telling me.
Because older servers don't return any output for unknown commands,
it's tricky to add new commands. The approach is this: we add a
"hello" command that reports any interesting capabilities (and other
things that might be of interest in the future). To detect whether
this new command is supported, we issue both it and our startup
detection command ("between") at the beginning of a connection.
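A rough sketch of the detection idea follows. The reply strings, the "capabilities:" line format and both toy server functions are assumptions made for the example; only the behaviour "unknown commands produce no output" comes from the description above:

```python
def old_server(command):
    # Assumed behaviour: known commands get a reply, unknown ones get nothing.
    replies = {"between": "1\n\n"}
    return replies.get(command, "")

def new_server(command):
    replies = {"hello": "capabilities: unbundle\n", "between": "1\n\n"}
    return replies.get(command, "")

def detect_capabilities(server):
    # Issue both commands up front; "between" answers even on old servers,
    # so an empty "hello" reply just means "no capabilities".
    hello_reply = server("hello")
    between_reply = server("between")
    if not between_reply:
        raise RuntimeError("no response from server")
    caps = []
    for line in hello_reply.splitlines():
        if line.startswith("capabilities:"):
            caps = line.split(":", 1)[1].split()
    return caps

old_caps = detect_capabilities(old_server)
new_caps = detect_capabilities(new_server)
```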
The number of csets and the hooks were wrong (negative number of csets) when
we unbundled a bundle which contained csets we already had.
Remove unused variables.
- add documentation about what the function does, notably
the fact that it updates 'base'
- transform the workflow to a simpler 'if elif elif else'
- do not call remote.branches if not necessary
- some nodes were missing in 'base' (from what I understand,
if the root of a branch is missing but one parent is present,
the parent should be in 'base')
- add a testcase for an incorrect outgoing that is fixed by
this cset
- add a testcase for an empty group bug, it needs fixing
problems fixed:
- https scheme handled properly for real and proxy urls.
- url of form "http://user:password@host:port/path" now ok.
- no-proxy check uses proper host names.
if uisetup function exists in extension, is called before cmdtable examined.
called with ui object as parameter. lets module modify cmdtable before
commands.py sees it.
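The ordering can be sketched like this. load_extension and the in-line demoext module are hypothetical; the only parts taken from the description are the uisetup(ui) hook and the cmdtable attribute:

```python
import types

def load_extension(module, ui, commands_table):
    # Call uisetup(ui) first, if the extension defines it...
    uisetup = getattr(module, "uisetup", None)
    if uisetup is not None:
        uisetup(ui)
    # ...and only then read cmdtable, so changes made by uisetup are seen.
    commands_table.update(getattr(module, "cmdtable", {}))

# Hypothetical extension module built in-line for the demonstration.
ext = types.ModuleType("demoext")
ext.cmdtable = {}

def _uisetup(ui):
    ext.cmdtable["hello"] = "hello command"

ext.uisetup = _uisetup

table = {}
load_extension(ext, ui=object(), commands_table=table)
```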
In the v4l-dvb repo, the manifest revno and the changelog revno are not
in sync. This happened because the same patch was applied to the same
revision in two different branches, resulting in the same manifest text,
with the same parents and so the first revision was reused.
Since hgweb.manifest was assuming the revnos of the manifest and of the
changelog were always the same, clicking on manifest -> bz2 in the
v4l-dvb site would download the wrong revision.
Use the linkrev to go from manifest revision to changelog revision.
This still won't be perfect since the page will still talk about
"manifest for changeset XYZ", where XYZ was the first changeset to have
this manifest, which is not necessarily the same changeset that the user
clicked to get to this page - but at least the contents will be the
same.
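A toy illustration of why the two numberings drift and how a linkrev lookup repairs the mapping. The linkrevs list and changelog_rev_for_manifest are made up for the example and are not the revlog API:

```python
# linkrevs[m] is the changelog revision that first introduced manifest
# revision m. Here changelog revisions 2 and 3 share manifest revision 2,
# so every later manifest revision is one behind the changelog.
linkrevs = [0, 1, 2, 4]

def changelog_rev_for_manifest(manifest_rev):
    return linkrevs[manifest_rev]

naive = 3                                  # assumes both numberings match
correct = changelog_rev_for_manifest(3)    # goes through the linkrev
```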
Furthermore, the installation of packagescan over demandload is moved
to the packagescan module.
I also added a few more comments in the packagescan module to avoid
wrong use of packagescan in the future.
Reason:
mercurial.packagescan acts as a fake mercurial.demandload during a py2exe
run. Unfortunately the import of mercurial.version in setup.py is done
before mercurial.packagescan is installed. This results in a few imports
happening without mercurial.packagescan in charge, and therefore not all
dependent modules are detected when running
mercurial.packagescan.getmodules later, e.g. winerror is missed.
old code read every head of .hgtags. delete and recreate of .hgtags gave
new head, but if error in deleted rev, .hgtags had error messages every
time it was parsed. this was very hard to fix, because deleted revs hard
to get back and update, needed merges too.
new code reads .hgtags on every head. advantage is if parse error
happens with new code, is possible to fix them by editing .hgtags on a
head and committing.
NOTE: new code uses binary search of manifest of each head to be fast,
but still much slower than old code. best thing would be to have delete
record stored in filelog so we never touch manifest. could find live
heads directly from filelog. this is more work than i want now.
new tests check for parse of tags on different heads, and inaccessible
heads created by delete and recreate of .hgtags.
When the gpatch fix for Solaris was introduced in b67447b909f3 the
patch command was "". For some strange reason Windows 2000 is
not happy with those quotes when given in os.popen.