We are about to introduce a third option to create an outgoing object:
'missingroots'. This argument will be mutually exclusive with 'commonheads' so
we implement some default value handling in preparation.
This will also help use to make more use of outgoing creation around the code
base.
We are to introduce more code constructing such object in the code base. It will
be more convenient to pass a repository object, all current users already
operate at the repository level anyway. More changes to the contructor argument
are coming in later changeset.
Previously in the worst case we iterated the files in matcher twice and
had a method only for this, which reimplemented logic in subdirmatchers
constructor. So we replaced the method with a subdirmatcher.files() call.
Replacement _pycompatstub designed to be compatible with our demandimporter.
try-except is replaced by version comparison because ImportError will no longer
be raised immediately.
This should be less invasive than mucking builtins.
Since tokenize.untokenize() looks start/end positions of tokens, we calculates
them from the NEWLINE token of the future import.
These functions will be imported automagically by our code transformer.
getattr() and setattr() are widely used in our code. We wouldn't probably
want to rewrite every single call of getattr/setattr. delattr() and hasattr()
aren't that important, but they are functions of the same kind.
It appears that computing index isn't cheap if --rev is specified. That's
why "index" field is available only if --index is specified.
I've named marker.flags() as "flag" because "flags" implies a list or dict
in template world.
Thanks to Piotr Listkiewicz for the initial implementation of this patch.
This will be used to process key-value pairs by formatter. The default
field names and format are derived from the {extras} template keyword.
Tests will be added later.
When doing streaming compression with zlib, zlib appears to emit chunks
with data after ~20-30kb on average is available. In other words, most
calls to compress() return an empty string. On the mozilla-unified repo,
only 48,433 of 921,167 (5.26%) of calls to compress() returned data.
In other words, we were sending hundreds of thousands of empty chunks
via a generator where they touched who knows how many frames (my guess
is millions). Filtering out the empty chunks from the generator
cuts down on overhead.
In addition, we were previously feeding 8kb chunks into zlib
compression. Since this function tends to emit *compressed* data after
20-30kb is available, it would take several calls before data was
produced. We increase the amount of data fed in at a time to 32kb.
This reduces the number of calls to compress() from 921,167 to
115,146. It also reduces the number of output chunks from 48,433 to
31,377. This does increase the average output chunk size by a little.
But I don't think this will matter in most scenarios.
The combination of these 2 changes appears to shave ~6s CPU time
or ~3% from a server serving the mozilla-unified repo.
Currently, running `hg serve --profile` doesn't yield anything useful:
when the process is terminated the profiling output displays results
from the main thread, which typically spends most of its time in
select.select(). Furthermore, it has no meaningful results from
mercurial.* modules because the threads serving HTTP requests don't
actually get profiled.
This patch teaches the hgweb wsgi applications to profile individual
requests. If profiling is enabled, the profiler kicks in after
HTTP/WSGI environment processing but before Mercurial's main request
processing.
The profile results are printed to the configured profiling output.
If running `hg serve` from a shell, they will be printed to stderr,
just before the HTTP request line is logged. If profiling to a file,
we only write a single profile to the file because the file is not
opened in append mode. We could add support for appending to files
in a future patch if someone wants it.
Per request profiling doesn't work with the statprof profiler because
internally that profiler collects samples from the thread that
*initially* requested profiling be enabled. I have plans to address
this by vendoring Facebook's customized statprof and then improving
it.
And refactor dispatch.py to use it. As you can see, the resulting code
is much simpler.
I was tempted to inline _runcommand as part of writing this series.
However, a number of extensions wrap _runcommand. So keeping it around
is necessary (extensions can't easily wrap runcommand because it calls
hooks before and after command execution).
This makes profiling more flexible since we can now call multiple
functions when a profiler is active. But the real reason for this
is to enable a future consumer to profile a function that returns
a generator. We can't do this from the profiling function itself
because functions can either be generators or have return values:
they can't be both. So therefore it isn't possible to have a generic
profiling function that can both consume and re-emit a generator
and return a value.
We do this for other global command arguments. We don't for --profile
for reasons that are unknown to me. Probably because nobody has needed
it.
An upcoming patch will introduce a new consumer of the profiling
code. It doesn't have access to command line arguments. So let's
set the config option during argument processing.
We also remove a check for "options['profile']" because it is now
redundant.
Currently, profiling code lives in dispatch.py, which is a low-level
module centered around command dispatch. Furthermore, dispatch.py
imports a lot of other modules, meaning that importing dispatch.py
to get at profiling functionality would often result in a module import
cycle.
Profiling is a generic activity. It shouldn't be limited to command
dispatch. This patch moves profiling code from dispatch.py to the
new profiling.py. The low-level "run a profiler against a function"
functions have been moved verbatim. The code for determining how to
invoke the profiler has been extracted to its own function.
I decided to create a new module rather than stick this code
elsewhere (such as util.py) because util.py is already quite large.
And, I foresee this file growing larger once Facebook's profiling
enhancements get added to it.
We have a single line function which just lowercase the letters and replaces
"_" with "". Its better to avoid that function call. Moreover we calling this
function around 33 times.
Before, a keyvalue node was processed by the last catch-all condition of
_optimize(). Therefore, topo.firstbranch=expr would bypass tree rewriting
and would crash if an expr wasn't trivial.
V2:
- move from shortest() with minlength 8 to minlength 4
- mention [templates] in config.txt
- better describe the difference between [templatealias] and [templates]
V3:
- choose a better example template
This parameter is slightly confusingly named in wireproto, so it got
mis-specified from the start as 'push' instead of the URL to which we
are pushing. Sigh. I've got a patch for that which I'll mail
separately since it's not really appropriate for stable.
Fixes a regression in bundle2 from bundle1.
In this case, column positioning isn't needed for i18n, too.
Maybe, check-code warning "missing _() in ui message" caused this
useless _() invocation in 6477dd5eeedf.
Before this patch, importing C module on Windows environment causes
infinite recursion call, if py2exe is used with -b2 option.
At importing C module "a.b", extra hooking by zipextimporter of py2exe
causes:
0. assumption before accessing "b" of "a":
- built-in module object is created for "a",
(= "a" is actually imported)
- _demandmod is created for "a.b" as a proxy object, and
(= "a.b" is not yet imported)
- an attribute "b" of "a" is initialized by the latter
1. invocation of __import__ via _hgextimport() in _demandmod._load()
for "a.b" implies _demandimport() for "a.b"
This is unintentional, because _demandmod might be returned by
_hgextimport() instead of built-in module object.
2. _demandimport() at (1) is invoked with not context of "a", but
context of zipextimporter
Just after invocation of _hgextimport() in _demandimport(), an
attribute "b" of the built-in module object for "a" is still
bound to the proxy object for "a.b", because context of "a" isn't
updated by actual importing "a.b". even though the built-in
module object for "a.b" already appears in sys.modules.
Therefore, chainmodules() returns _demandmod for "a.b", which is
gotten from the attribute "b" of "a".
3. processfromitem() on "a.b" causes _demandmod._load() for "a.b"
again
_demandimport() takes context of "a" in this case.
Therefore, attributes below are bound to built-in module object
for "a.b", as expected:
- "b" of built-in module object for "a"
- _module of _demandmod for "a.b"
4. but _demandimport() invoked at (1) returns _demandmod object
because _demandimport() just returns the object returned by
chainmodules() at (3) above.
5. then, _demandmod._load() causes infinite recursion call
_demandimport() returns _demandmod for "a.b", and it is "self" at
_demandmod._load().
To avoid infinite recursion at actual module importing, this patch
uses self._module, if _hgextimport() returns _demandmod itself. If
_demandmod._module isn't yet bound at this point, execution should be
aborted, because actual importing failed.
In this patch, _demandmod._module is examined not on _demandimport()
side, but on _demandmod._load() side, because:
- the former has some exit points
- only the latter uses _hgextimport(), except for _demandimport()
BTW, this issue occurs only in the code path for non .py/.pyc files in
zipextimporter (strictly speaking, in _memimporter) of py2exe.
Even if zipextimporter is enabled, .py/.pyc files are handled by
zipimporter, and it doesn't imply unintentional _demandimport() at
invocation of __import__ via _hgextimport().
Some draconian IT setups lock accounts after a small number of incorrect
password attempts. Mercurial's implementation of the urllib2 authentication was
causing 5 retry attempts with the same credentials, without prompting the user.
The code was attempting to check whether the authorization token had changed,
but unfortunately was reading the misleading 'headers' member of the request
instead of using the 'get_header' accessor.
Modelled on fix for Python issue 8797:
https://bugs.python.org/issue8797https://hg.python.org/cpython/rev/30e8a8f22a2a
Now that we store and display merge labels in user prompts (not just
conflict markets), we should rely on labels to clarify the two sides of a
merge operation (hg merge, hg update, hg rebase etc).
"remote" is not a great name here, as it conflates "remote" as in "remote
server" with "remote" as in "the side of the merge that's further away". In
cases where you're merging the "wrong way" around, remote can even be the
"local" commit that you're merging with something pulled from the remote
server.
Now that we persist the labels, we can consistently use the labels in
prompts for the user without risk of confusion. This changes a huge amount
of command output:
This means that merge prompts like:
no tool found to merge a
keep (l)ocal, take (o)ther, or leave (u)nresolved? u
and
remote changed a which local deleted
use (c)hanged version, leave (d)eleted, or leave (u)nresolved? c
become:
no tool found to merge a
keep (l)ocal [working copy], take (o)ther [destination], or leave (u)nresolved? u
and
remote [source] changed a which local [dest] deleted
use (c)hanged version, leave (d)eleted, or leave (u)nresolved? c
where "working copy" and "destination" were supplied by the command that
requested the merge as labels for conflict markers, and thus should be
human-friendly.
The journal extension had to touch the dirstate internals to be notified about
wd parent change. To make that detection cleaner and reusable let's move it core.
Now the extension can register to be notified about parent changes.
At 9069882b46bf, we had to optimize a parsed tree to resolve x^:y ambiguity.
Since we've moved the resolution of x^:y to parse(), we no longer have to call
optimize(). Therefore, (x:y) can be taken as a single expression, not an odd
range expression x:y.
The "normal" ISO date/time includes a T between date and time. It also
allows dropping the colons and seconds from the timespec. Add new
patterns for these forms as well as tests.
We want to be able to accept ISO 8601 style timezones that don't
include a space separator, so we change the timezone parsing function
to accept a full date string and return both the offset and the
non-timezone portion.
Previously a subrepository "sub" would cause no warnings to
be issued for a file "subnot/a", if it's not present in the
corresponding changeset when calling:
hg cat subnot/a
SSLContext.get_ca_certs() can raise
"ssl.SSLError: unknown error (_ssl.c:636)" on Windows. See
https://bugs.python.org/issue20916 for more info.
We add a try..except that swallows the exception to work around
this bug. If we encounter the bug, we won't print a warning
message about attempting to load CA certificates. This is
unfortunate. But there appears to be little we can do :/
In cbefa73a359814e6784a63f90b78c7afd39bc7d5, I introduced a new bug:
when a symlink points to a folder in commit A and to another folder
in commit B, while updating from A to B, Mercurial will try to use
removedir on this symlink, which will fail. This is a very bad bug,
since it basically renders symlinks to folders unusable in repos.
Added test case fails without a fix and passes with it.