Previously, fnodes had a key and empty dict value for every element in
changedfiles. This is somewhat wasteful. Empty dicts in CPython consume
a lot more memory than you would expect - 280 bytes.
On mozilla-central, which has ~190,000 files/fnodes keys, the previous
loop populating fnodes allocated 91,924 KB of memory, most of that for
the empty dicts.
With this patch in place, our peak RSS during mozilla-central clone
drops:
before: 364,356 KB
after: 326,008 KB
delta: -38,348 KB
When combined with the previous patch, total peak RSS decrease is now
190,116 KB.
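The idea can be illustrated with a minimal sketch. It is not the actual changegroup code: build_fnodes() and the shape of its inputs are hypothetical, chosen only to show entries being created on first use while readers fall back to a default.

```python
import sys


def build_fnodes(changedfiles, filenodes):
    """Hypothetical helper: store an fnodes entry only for files with nodes.

    `filenodes` is assumed to be an iterable of (filename, node, linknode).
    """
    changed = set(changedfiles)
    fnodes = {}
    for fname, node, linknode in filenodes:
        if fname not in changed:
            continue
        entry = fnodes.get(fname)
        if entry is None:
            # The inner dict is created only when there is data to store.
            entry = fnodes[fname] = {}
        entry[node] = linknode
    return fnodes


changedfiles = ['a.txt', 'b.txt', 'c.txt']
fnodes = build_fnodes(changedfiles, [('a.txt', 'n1', 'c1')])

# Readers use .get() with a default instead of relying on pre-filled keys.
for fname in changedfiles:
    print(fname, fnodes.get(fname, {}))

# Per-dict overhead is implementation-specific; measure it locally.
print(sys.getsizeof({}))
```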
The contents of fnodes are only accessed once per key. It is wasteful to
cache the value since nobody will use it.
Before this patch, the caching of unused data in fnodes was effectively
causing a memory leak during the file streaming part of bundle creation.
On mozilla-central (which has ~190,000 entries in fnodes), this patch
has a significant impact on RSS at the end of generate():
before: 516,124 KB
after: 364,356 KB
delta: -151,768 KB
The origin of this code can be traced back to 1f567a607f1f and has been
with us since the 2.7 release.
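A minimal sketch of the access pattern, assuming a hypothetical stream_files() generator rather than the real generate() code: each entry is popped when it is read, so its value can be freed once the consumer is done with it.

```python
def stream_files(changedfiles, fnodes):
    """Hypothetical generator: yield (filename, nodes) once per file."""
    for fname in changedfiles:
        # pop() removes the entry instead of leaving it cached in fnodes.
        linkrevnodes = fnodes.pop(fname, {})
        yield fname, linkrevnodes


fnodes = {'a.txt': {'n1': 'c1'}, 'b.txt': {'n2': 'c2'}}
result = list(stream_files(['a.txt', 'b.txt', 'c.txt'], fnodes))
print(result)
print(fnodes)
```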
lookupmf() is currently defined earlier than when it is needed. Future
patches further refactoring this code will be easier to read when
lookupmf() is in its new home.
The mail module only verifies the smtp ssl certificate if 'verifycert' is enabled
(the default). The 'verifycert' can take three possible values:
- 'strict'
- 'loose'
- any "False" value, e.g. 'false' or '0'
We tested the validity of the third value, but never converted it to an
actual false value, so any "False" setting behaved like 'loose'.
This changeset fixes it.
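The missing conversion can be sketched as follows. normalize_verifycert() and the set of accepted false spellings are assumptions made for illustration; they are not the mail module's actual code.

```python
# Assumed spellings of a "False" value; the real parser may accept others.
_FALSE_STRINGS = {'0', 'no', 'false', 'off', 'never'}


def normalize_verifycert(value):
    """Hypothetical helper: return 'strict', 'loose' or a real False."""
    if value in ('strict', 'loose'):
        return value
    if value.lower() in _FALSE_STRINGS:
        # Validation alone is not enough: convert to an actual false value,
        # otherwise the non-empty string is still truthy for later checks.
        return False
    raise ValueError('invalid verifycert value: %r' % value)


for raw in ('strict', 'loose', 'false', '0'):
    print(raw, '->', normalize_verifycert(raw))
```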
2ec3e28dea6b changed 'sample' from a list to a set. The iteration order is thus
undefined and the yesno indices are not stable.
To solve this, repeat the listification and comment from elsewhere in the code.
Note: the randomness in the discovery protocol can make this problem hard to
reproduce.
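A minimal sketch of the listification, assuming a hypothetical split_sample() helper in place of the real discovery code: the set is turned into a list once, so positions in yesno line up with positions in the sample.

```python
def split_sample(sample, known):
    """Hypothetical helper: split a sample into known and unknown nodes.

    `known` stands in for the remote's answer to the query.
    """
    # Freeze the iteration order once so that indices into yesno are stable.
    sample = list(sample)
    yesno = [node in known for node in sample]
    common = set()
    missing = set()
    for i, node in enumerate(sample):
        if yesno[i]:
            common.add(node)
        else:
            missing.add(node)
    return common, missing


common, missing = split_sample({'a', 'b', 'c'}, {'a', 'c'})
print(sorted(common), sorted(missing))
```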
2ec3e28dea6b made it possible that the initial head check didn't include all
heads. If that is the case, don't use the early exit just because this random
sample happened to be 'all known'.
Note: the randomness in the discovery protocol can make this problem hard to
reproduce.
This keyword remapping was introduced in 236440938a03 as part of converting
generator based iterators into list based iterators, mentioning "undesired
behavior in template" when a generator is exhausted, but doesn't say what and
introduces no tests.
The problem with the remapping was that it corrupted the output for keywords
like 'extras', 'file_copies' and 'file_copies_switch' in templates such as:
$ hg log -r 82a4f5557c6b --template "{file_copies % ' File: {file_copy}\n'}"
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
What was happening was that in the first call to runtemplate() inside runmap(),
'lm' mapped the keyword (e.g. file_copies) to the appropriate showxxx() method.
On each subsequent call to runtemplate() in that loop however, the keyword was
mapped to a list of the first item's pieces, e.g.:
'file_copy': ['mercurial/changelog.py', ' (', 'mercurial/hg.py', ')']
Therefore, the dicts for the second and any subsequent items were not processed
through the corresponding showxxx() method, and the first item's data was
reused.
The 'extras' keyword regressed in 56b014c52204, and 'file_copies' regressed in
4e182fb53989 for other reasons. The common thread of things fixed by this seems
to be a list of dicts being passed to the templatekw._hybrid class.
$ hg extdiff -p cmd -o "name <user@example.com>"
resulted in a shell redirection error (due to the less-than sign),
rather than passing the single option to cmd. This was due to options
not being quoted for passing to the shell, via util.system(). Apply
util.shellquote() to each of the user-specified options (-o) to the
comparison program before they are concatenated and passed to
util.system(). The requested external diff command (-p) and the
files/directories being compared are already quoted correctly.
The discussion at the time of changeset 6654fcb57d92 correctly noted
that this course of action breaks whitespace-separated options specified
for external diff commands in the configuration. The lower part of the
patch corrects this by lexing options read from the configuration file
into separate options rather than reading them all into the first
option.
Update test to cover these conditions.
Related changesets (reverse-chronological):
- 6654fcb57d92 (fix reverted to make configuration file options work)
- c64ec6e8ffa2 (issue fixed but without fix for configuration file)
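Both halves of the fix can be sketched with the standard library. Here shlex.quote() and shlex.split() stand in for util.shellquote() and the configuration lexing, and build_command() is a hypothetical helper, not extdiff's code.

```python
import shlex


def build_command(program, options, paths):
    """Hypothetical helper: quote each part, then join into one shell line."""
    parts = [shlex.quote(program)]
    parts += [shlex.quote(opt) for opt in options]
    parts += [shlex.quote(path) for path in paths]
    return ' '.join(parts)


# Each -o value from the command line is a single option and is quoted whole.
cli = build_command('cmd', ['name <user@example.com>'], ['a', 'b'])

# Options read from the configuration are one whitespace-separated string,
# so they are lexed into separate options before quoting.
cfg_options = shlex.split('-N -p -U 8')
cfg = build_command('diff', cfg_options, ['a', 'b'])

print(cli)
print(cfg)
```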
Further digging on this issue shows that the limit on the sample size used in
discovery never works for heads. Here is a quote from the code itself:
desiredlen = size - len(always)
if desiredlen <= 0:
# This could be bad if there are very many heads, all unknown to the
# server. We're counting on long request support here.
The long request support never landed, and evolution makes the "very many heads,
all unknown to the server" case quite common.
We implement a simple and stupid hard limit on the sample size for all queries.
This should prevent HTTP 414 errors with the current state of the code.
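A minimal sketch of such a hard limit, assuming a hypothetical limit_sample() helper and an arbitrary cap of 200; whether the dropped entries should be picked at random, as here, is also an assumption.

```python
import random


def limit_sample(sample, max_size):
    """Hypothetical helper: return at most max_size items from sample."""
    sample = list(sample)
    if len(sample) <= max_size:
        return sample
    # Keep a random subset so that no particular heads are always dropped.
    return random.sample(sample, max_size)


small = limit_sample(['h1', 'h2', 'h3'], 200)
large = limit_sample(range(1000), 200)
print(len(small), len(large))
```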
History rewriting commands like histedit tend to use temporary
commits. They may schedule hook execution on these temporary commits
for after the lock has been released. But temporary commits are likely
to have been stripped before the lock is released (and the hook run).
Hooks executed for missing revisions lead to various crashes.
We disable hook execution for revisions missing from the repo. This
provides a dirty but simple fix to user issues.
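The guard can be sketched as below. run_pending_hooks(), the set of repository nodes and the (node, hook) pairs are hypothetical stand-ins for the real hook scheduling.

```python
def run_pending_hooks(repo_nodes, pending):
    """Hypothetical helper: run delayed hooks, skipping stripped revisions.

    `pending` is assumed to be a list of (node, hook) pairs scheduled while
    the lock was held.
    """
    ran = []
    for node, hook in pending:
        if node not in repo_nodes:
            # The temporary commit was stripped before the lock was released.
            continue
        hook(node)
        ran.append(node)
    return ran


seen = []
pending = [('tmp1', seen.append), ('final', seen.append)]
ran = run_pending_hooks({'final'}, pending)
print(ran)
```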
On streaming clone, we were priming the local branch cache with the
remote branchmap, without checking which heads were closed.
This fixes an issue introduced in:
changeset: 17740:f8d7aaf86507
user: Tomasz Kleczek <tomasz.kleczek@fb.com>
date: Wed Oct 03 13:19:53 2012 -0700
summary: branchcache: fetch source branchcache during clone (issue3378)
that was exposed in 2.9 by:
changeset: 20192:6c385e85aa05
user: Brodie Rao <brodie@sf.io>
date: Mon Sep 16 01:08:29 2013 -0700
summary: branches: simplify with repo.branchmap().iterbranches()
8a92e6790099 broke bookmarks getting copied during uncompressed clones. Since
most of the pull logic has been moved into exchange.py, let's just call
exchange.pull to fix up the repo with the latest bits after the streaming clone
has bootstrapped the repo. This keeps us from having to duplicate the bookmark
logic.
The matcher variable 'm' in checkstatus() is reset to None on each
call, so the caching of the matcher no longer happens as it was
intended. This seems to be a regression in 6b9fbae54476 (revset: added
lazyset implementation to checkstatus, 2014-01-03).
Fix by moving the cached matcher into the enclosing function so it's
actually cached across calls. This speeds up
hg log -r 'modifies(mercurial/context.py)' >/dev/null
from 7.5s to 4s.
Also see similar fix in 5ff5c5c9e69f (revset: avoid recalculating
filesets, 2014-10-22).
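The caching fix can be sketched in isolation. make_matches() and build_matcher() are hypothetical names; the point is only that the cache lives in the enclosing function, so it is not reset on every call of the inner one.

```python
def make_matches(build_matcher):
    """Hypothetical factory: return a predicate that builds its matcher once."""
    cache = []  # lives in the enclosing scope, so it survives across calls

    def matches(path):
        if not cache:
            cache.append(build_matcher())
        return cache[0](path)

    return matches


builds = []


def build_matcher():
    builds.append(1)  # count how often the (expensive) matcher is built
    return lambda path: path == 'mercurial/context.py'


matches = make_matches(build_matcher)
results = [matches(p) for p in ('mercurial/context.py', 'README', 'setup.py')]
print(results, len(builds))
```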
0cc5c10d5dc7 was not the final version of that patch. It was really slow
because `l not in repo.changelog` iterates revisions up to `l`. Instead,
rev() should utilize spanset.__contains__().
revset #0: rev(210000)
0) wall 0.000039 comb 0.000000 user 0.000000 sys 0.000000 (best of 67978)
1) wall 0.002721 comb 0.000000 user 0.000000 sys 0.000000 (best of 1055)
2) wall 0.000059 comb 0.000000 user 0.000000 sys 0.000000 (best of 45599)
(0: 3.2-rc, 1: 0cc5c10d5dc7, 2: this patch)
Note that the benchmark result described in 0cc5c10d5dc7 is wrong because
it was measured with the initial version of that patch.
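The difference between the two membership tests can be sketched as follows. Both functions are hypothetical stand-ins: the first mimics a containment check that iterates revisions, the second a bounds check in the spirit of spanset.__contains__().

```python
def contains_linear(revs, rev):
    """Hypothetical stand-in: walk the revisions until rev is found."""
    for r in revs:
        if r == rev:
            return True
    return False


def contains_span(start, stop, rev):
    """Hypothetical stand-in: constant-time bounds check on a span."""
    return start <= rev < stop


tip = 250000
checks = [(rev, contains_linear(range(tip), rev), contains_span(0, tip, rev))
          for rev in (0, 210000, 249999, 250000)]
print(checks)
```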
'n' was introduced in Mercurial in 5d1adb6683fa and broke Python 2.4 support in
mysterious ways that only showed failure in test-glog.t. Py_BuildValue failed
because of the unknown format and a TypeError was thrown ... but it never
showed up on the Python side and it happily continued processing with wrong
data.
Quoting https://docs.python.org/2/c-api/arg.html :
n (integer) [Py_ssize_t]
Convert a Python integer or long integer to a C Py_ssize_t.
New in version 2.5.
k (integer) [unsigned long]
Convert a Python integer or long integer to a C unsigned long without
overflow checking.
This will use unsigned long instead of Py_ssize_t. That is not a good solution,
but good is not an option when we have to support Python 2.4.
The old revset had pretty terrible performance on large repositories (12+
seconds). This new revset achieves the same result in only 0.7s. As we improve
the underlying revset APIs we can probably get this revset down to 'only(base,
dest)::', but at the moment that version still takes 2s.
This addresses the bug described in issue4405: when obsolescence markers are
enabled, amending a commit with a file move can lead to the copy information
being lost.
However, the bug is more general and can be reproduced without obsmarkers as
well, as demonstrated by Pierre-Yves and put into the updated test.
Specifically, graph topology divergences between the filelogs and the changelog
can cause copy information to be lost during amends.