The httplib library is renamed to http.client in python 3. So the
import is conditionalized and a test is added in check-code to warn
to use util.httplib
If the user press 'q' to leave the 'less' pager, it is expected to end the
hg process immediately. We currently rely on SIGPIPE for this behavior. But
SIGPIPE won't arrive if we don't write anything (like doing heavy
computation, reading from network etc). If that happens, the user will feel
that the hg process just hangs.
The patch address the issue by adding a SIGCHLD signal handler and sends
SIGPIPE to the server as soon as the pager exits.
This is also an issue with hg's pager implementation.
Rebuilding translation table (256 size) at each repquote() invocations
is redundant.
For example, this patch decreases user time of command invocation
below from 18.297s to 13.445s (about -27%) on a Linux box. This
command is main part of test-check-code.t.
hg locate | xargs python contrib/check-code.py --warnings --per-file=0
This patch adds "_repquote" prefix to functions and variables factored
out from repquote() to avoid conflict of name in the future.
Before this patch, "missing _() in ui message" rule overlooks
translatable message, which starts with other than alphabet.
To detect "missing _() in ui message" more exactly, this patch
improves the regexp with assumptions below.
- sequence consisting of below might precede "translatable message"
in same string token
- formatting string, which starts with '%'
- escaped character, which starts with 'b' (as replacement of '\\'), or
- characters other than '%', 'b' and 'x' (as replacement of alphabet)
- any string tokens might precede a string token, which contains
"translatable message"
This patch builds an input file, which is used to examine "missing _()
in ui message" detection, before '"$check_code" stringjoin.py' in
test-contrib-check-code.t, because this reduces amount of change churn
in subsequent patch.
This patch also applies "()" instead of "_()" on messages below to
hide false-positives:
- messages for ui.debug() or debug commands/tools
- contrib/debugshell.py
- hgext/win32mbcs.py (ui.write() is used, though)
- mercurial/commands.py
- _debugchangegroup
- debugindex
- debuglocks
- debugrevlog
- debugrevspec
- debugtemplate
- untranslatable messages
- doc/gendoc.py (ReST specific text)
- hgext/hgk.py (permission string)
- hgext/keyword.py (text written into configuration file)
- mercurial/cmdutil.py (formatting strings for JSON)
When auto-completing hg commands, aliases are listed, but not the available
switches for an alias, because `HGPLAIN=1` filters these out. Add a
`HGPLAINEXCEPT=alias` exception to resolve this.
We make heavy use of aliases that drive hg log with custom revsets, sorting and
the -G switch, but want our users to be able to auto-complete any additional
command-line switches.
Before this patch, fromlocalfunc() assumes that "module" attribute of
ast.ImportFrom is None for "from . import a", and Python 2.7.x
satisfies this assumption.
On the other hand, with Python 2.6.x, "module" attribute of
ast.ImportFrom is an empty string for "from . import a", and this
causes failure of test-check-module-imports.t.
Our signal handlers forward signals to the server process, but it will
disappear soon after hgc_close(). So we should unregister handlers before
hgc_close(). Otherwise chg would abort due to kill(perrpid, sig) failure.
The problem is spotted by SIGWINCH while waiting pager termination.
Before this patch, chg will give up when it cannot connect to the new server
within 10 seconds. If the host has high load during that time, 10 seconds
is not enough.
This patch makes it adjustable using the CHGTIMEOUT environment variable.
Before this patch, chg uses the old pager behavior (pre 55f6f7fb60d2), which
executes pager in the main process. The user will see the exit code of the
pager, instead of the hg command.
Like 55f6f7fb60d2, this patch fixes the behavior by executing the pager in
the child process, and wait for it at the end of the main process.
This patch makes repquote() distinguish more characters below, as a
preparation for exact detection in subsequent patch.
- "%" as "%"
- "\\" as "b"(ackslash)
- "*" as "A"(sterisk)
- "+" as "P"(lus)
- "-" as "M"(inus)
Characters other than "%" don't use itself as replacement, because
they are treated as special ones in regexp.
d19c9c93ad10 tried to detect '.. note::' more exactly. But
implementation of it seems not correct, because:
- fromc.find(c) returns -1 for other than "." and ":"
- tochr[-1] returns "q" for such characters, but
- expected result for them is "o"
This patch uses dict to manage replacement instead of replacing
str.find() by str.index(), for improvement/refactoring in subsequent
patches. Examination by fixedmap is placed just after examination for
' ' and '\n', because subsequent patch will integrate the latter into
the former.
This patch also changes regexp for 'string join across lines with no
space' rule, and adds detailed test for it, because d19c9c93ad10 did:
- make repquote() distinguish "." (as "p") and ":" (as "q") from
others (as "o"), but
- not change this regexp without any reason (in commit log, at
least), even though this regexp depends on what "o" means
This patch doesn't focuses on deciding whether "." and/or ":" should
be followed by whitespace or not in translatable messages.
It doesn't make sense that (a) is allowed whereas (b) is disallowed.
a) from mercurial import hg
from mercurial.i18n import _
b) from . import hg
from .i18n import _
Since (b) is banned, we should do the same for (a) for consistency.
a) from mercurial import hg
from mercurial.i18n import _
b) from . import hg
from .i18n import _
This is needed so that launchpad and friends have a unique version number for
each distroseries (trusty, wily, xenial, etc). It was discovered when trying to
upload 3.8 to launchpad.
Debian maintainers already have this and lintian warns us about not
listing 'wish' as a dependency or suggestion so this patch does indeed
just that. The issue, by the way, is that we are shipping hgk (which is
written in tcl/tk) so we should be good citizens and list wish (a meta
package for tcl/tk) as a dependency.
Instead of using bdist_mpkg, we use the modern Apple-provided tools to
build an OS X Installer package directly. This has several advantages:
* Avoids bdist_mpkg which seems to be barely maintained and is hard to
use.
* Creates a single unified .pkg instead of a .mpkg.
* The package we produce is in the modern, single-file format instead of
a directory bundle that we have to zip up for download.
In addition, this way of building the package now correctly:
* Installs the manpages, bringing the `make osx`-generated package in
line with the official Mac packages we publish on the website.
* Installs files with the correct permissions instead of encoding the
UID of the user who happened to build the package.
Thanks to Augie for updating the test expectations.
As we don't use sockdirfd yet, this is the simplest workaround to compile chg
on old Unices where AT_FDCWD does not exist. Foozy pointed out Mac OS X 10.10
is required for AT_FDCWD as well as xxxat() functions.
This bug is hidden by the current bias towards matches at the
beginning of the file. When this bias is tweaked later to address
recursion balancing, the normalization code could cause the next block
to shrink to a negative length, thus creating invalid delta chunks. We
add checks here to disallow that.
This bug requires test cases that are an awkwardly large size for the test
suite, but is very rapidly picked up by the included torture tester.
So far fromlocal recognizes relative imports of the form:
from . import D
from .. import E
It wasn't prepared for recognizing relative imports like:
from ..F import G
The bug was not found so far because all relative imports starting
from the parent was in the list of allowsymbolicimports like:
from ..i18n import
from ..node import
Before this patch, check-py3-compat.py implies unintentional ".pure"
sub-package name at loading pure modules, because module name is
composed by just replacing "/" in the path to actual ".py" file by
".".
This makes pure modules belong to "mercurial.pure" package, and
prevents them from importing a module belonging to "mercurial" package
relatively by "from . import foo" or so.
This is reason why pure modules fail to import another module
relatively only at examination by check-py3-compat.py.
This had the unfortunate side effect of causing the environment to have a
newline due to the fact that some 'cd' outputs the result of the directory
change. So, let's just redirect the meaningless output.
Before this patch, if the user uses chg and ncurses interface, resizing the
terminal window will mess up its content.
This patch fixes the issue by forwarding SIGWINCH to the worker process.
We don't really need to report SyntaxErrors, since in theory
docchecker or a test will catch them, but they happen, and
we can't just have the code crash, so for now, we're reporting
them.
Before this patch, if the server started by chg has exited with code 0 without
creating a connectable unix domain socket at the specified address, chg will
exit with code 0, which is not the correct behavior. It can happen, for
example, CHGHG is set to /bin/true.
This patch addresses the issue by checking the exit code of the server and
printing a new error message if the server exited normally but cannot be
reached.
We check for sockdirfd at freecmdserveropts but not lockfd, which is a bit
strange to people new to the code. Add a comment and an assert to make it
clear that lockfd should be closed earlier.
As part of the series to support long socket paths, we need to add the fd of
the directory to the cmdserveropts structure so we can use basenames instead
of full paths for sockname, redirectsockname, and lockfile.
This is a style fix. I was using tabstop=4 for some early patches, although
I realized we use tabstop=8 later but these early style issues remains. Let's
fix them.
Before this patch, chg always uses color in its debugmsg and abortmsg and
there is no way to turn it off.
This patch adds a global flag to control whether chg should use color or
not and only enables it when stderr is a tty and HGPLAIN is not set.
Before this patch, "connect to" debug message is printed repeatedly because
a previous patch changed how the chg client decides the server is ready to be
connected.
This patch revises the places we print connect debug messages so they are less
repetitive without losing useful information.
On pypy datetime and cProfile are modules written in Python, not in C.
For the purpose of this test, just list them explicitely as builtins,
which silences warnings about them being imported before stdlib modules.
Before this patch, chg will fall back to "hg" if neither CHGHG nor HG are set.
This may have trouble if the "hg" in PATH is not compatible with chg, which
can happen, for example, an old hg is installed in a virtualenv.
Since it's very hard to do a quick hg version check from chg, after discussion
in IRC with smf and marmoute, the quickest solution is to build a package with
a hardcoded absolute hg path in chg. This patch makes it possible by adding a
C macro HGPATH.
Before this patch, if __GNUC__ is not defined, PRINTF_FORMAT_ will not be
defined and will cause compilation error.
This patch solves the issue by making sure PRINTF_FORMAT_ is defined. It
allows chg to be compiled with tcc (http://bellard.org/tcc/) by:
tcc -o chg *.c
All of mercurial.* is now using absolute_import. Most of
mercurial.* is able to ast parse with Python 3. The next big
hurdle is being able to import modules using Python 3.
This patch adds testing of hgext.* and mercurial.* module imports
in Python 3. As the new test output shows, most modules can't
import under Python 3. However, many of the failures are due
to a common problem in a highly imported module (e.g. the bytes vs
str issue in node.py).
Previously, test-check-py3-compat.t parsed Python files with Python 2
and looked for known patterns that are incompatible with Python 3.
Now that we have a mechanism for invoking Python 3 interpreters from
tests, we can expand check-py3-compat.py and its corresponding .t
test to perform an additional AST parse using Python 3.
As the test output shows, we identify a number of new parse failures
on Python 3. There are some redundant warnings for missing parentheses
for the print function. Given the recent influx of patches around
fixing these, the redundancy shouldn't last for too long.
Redirecting stdout to /dev/null has unwanted side effects, namely ui.write
will stop working. This patch removes the redirection code and helps chg to
pass test-bad-extension.t.
If the server has an uncaught exception, it will exit without being able to
write the channel information. In this case, the client is likely to complain
about "failed to read channel", which looks inconsistent with original hg.
This patch silences the error message and makes uncaught exception behavior
more like original hg. It will help chg to pass test-fileset.t.
In some rare cases (next patch), we may want validate to do "unlink" without
forcing the client reconnect. This patch addes a new "reconnect" instruction
and makes "unlink" not to reconnect by default.
Currently the validate command in chgserver expects config can be loaded
without issues but the config can be broken and chg will print a stacktrace
instead of the parsing error, if a chg server is already running.
This patch adds a handler for ParseError in validate and a new instruction
"exit" to make the client exit without abortmsg. A test is also added to make
sure it will behave as expected.
See the previous patch for details. Since the socket will be closed by the
server, handleresponse() will never return:
Traceback (most recent call last):
...
chg: abort: failed to read channel
If chgserver aborts during startup, for example, error.ParseError when parsing
a config file, chg client probably just wants to exit with a same exit code
without printing other unrelated text. This patch changes the text "cmdserver
exited with status" from abortmsg to debugmsg and exits with a same exit code.
It couldn't exclude an empty file containing comments. That's why
hgext/__init__.py had been listed in test-check-py3-compat.t before
"hgext: officially turn 'hgext' into a namespace package".
One problem reported by lintian is "bad-distribution-in-changes-file unstable"
in changelog, but the current changelog for the official package in Ubuntu also
uses that distribution name (unstable), because they import from Debian. This
certainly doesn't stop the build process.
Current pidfile logic will only keep the pid of the newest server, which is
not very useful if we want to kill all servers, and will become outdated if
the server auto exits after being idle for too long.
Besides, the server-side pidfile writing logic runs before chgserver gets
confighash so it's not trivial to append confighash to pidfile basename like
we did for socket file.
This patch removes --pidfile from the command starting chgserver and switches
to an alternative way (unlink socket file) to stop the server.
chgserver now validates and reloads configs automatically. Manually reloading
is no longer necessary. Besides, we are deprecating pid files since the server
will periodically check its ownership of the socket file and exit if it does
not own the socket file any longer, which works more reliable than a pid file.
This patch removes the SIGHUP reload logic from both chg server and client.
The chgserver is designed to load repo config from current directory. "--cwd /"
will prevent chgserver from loading repo config and generate a wrong
confighash, which will result in a redirect loop. This patch removes "--cwd /"
and uses "--daemon-postexec chdir:/" instead.
The custom porting fixers are removed. A comment related to 2to3
has been removed from the import checker.
After this patch, no references to 2to3 remain.
Some users may have hg as a wrapper script which sets sensitive environment
variables (like setting up virtualenv). This will make chg redirect forever
because the environment variables are never considered up to date.
This patch adds a limit (10) for reconnect attempts and warn the user with
a possible solution if the limit is exceeded.
This patch uses the newly added validate method to make sure the server has
loaded the up-to-date config and extensions. If the server cannot validate
itself, the client will receive instructions and follow them to try to reach
another server that is more likely to validate itself. The instructions can
be a redirect (connect to another server address) and/or an unlink (stops an
out-dated server).
It was necessary to go through progress.uisetup() to set up the progressui
wrapper. Since the progress extension has got into the core, progress.assume-tty
is no longer necessary.
Sometimes people may create a symbol link from hg to chg, or write a wrapper
script named hg calling chg. Without $HG and $CHGHG set, this will lead to
chg executes itself causing deadlock. The user will notice chg hangs for some
time and aborts with a timed out message, without knowing the root cause and
how to solve it.
This patch sets a dummy environment variable before executing hg to detect
this situation, and print a fatal message with some possible solutions.
CHGINTERNALMARK is set by chg client to detect the situation that chg is
started by chg. It is temporary and should be dropped to avoid possible
side effects.
There are some known unsupported commands or flags for chg, such as hg serve -d
and hg foo --time. This patch detects these situations and transparently fall
back to the original hg. So the users won't bother remembering what chg can and
cannot do by themselves.
The current detection is not 100% accurate since we do not have an equivalent
command line parser in C. But it tries not to cause false positives that
prevents people from using chg for legit cases. In the future we may want to
implement a more accurate "unsupported" check server-side.
This is a part of the one server per config series. In multiple-server setup,
multiple clients may try to start different servers (on demand) at the same
time. The old lock will not guarantee a client to connect to the server it
just started, and is not crash friendly.
This patch addressed above issues by using flock and does not release the lock
until the client actually connects to the server or times out.
Initially we use --daemon-pipefds to pass file descriptors for synchronization.
Later, in order to support Windows, --daemon-pipefds is changed to accept a
file path to unlink instead. The name is outdated since then.
chg client is designed to use flock, which will be held before starting a
server and until the client actually connects to the server it started. The
unlink synchronization approach is not so helpful in this case.
To address the issues, this patch renames pipefds to postexec and the following
patch will allow the value of --daemon-postexec to be things like
'unlink:/path/to/file' or 'none'.
I've updated the script to reflect changes in Mercurial and to include a much
more through installation guide with configuration examples and details on how
to configure IIS. I've used the script to set up a working server from scratch.
We are going to make chgserver load repo config, remember what it is, and
load repo config again to detect config change. This is the first step that
passes config, repo, cwd options to server. Traceback is passed as well to
cover errors before hitting chgserver.runcommand.
They are like malloc and realloc but will abort the program on error.
A lot of places use {m,re}alloc and check their results. This patch
can simplify them.
This is necessary to suspend/resume long pulls, interactive curses session,
etc.
The implementation is based on emacsclient, but our version doesn't test if
chg process is foreground or not before propagating SIGCONT. This is because
chg isn't always an interactive session. If we copy the SIGTTIN/SIGTTOU
emulation from emacsclient, non-interactive session can't be moved to a
background job.
$ chg pull
^Z
suspended
$ bg %1
[1] continued
[1] suspended (tty input) # wrong
https://github.com/emacs-mirror/emacs/blob/0e96320/lib-src/emacsclient.c#L1094
It seems calling memset() and sigemptyset() is common pattern to initialize
sigaction. And strictly speaking, sigset_t must be initialized by sigemptyset()
or sigfillset(). I saw git and uwsgi do that way, so let's follow them.
If the revset being benchmarked has an exception, the handling code was
encountering an error because the exception did not always have an "output"
attribute (I think it's a python 2.7 thing).
This rule detects "hg extdiff" invocation without -p/--program and
-o/--option.
This patch specifies "-p diff" explicitly in test-extdiff.t to avoid
false positive matching.
Before this patch, check-code examines "magic" pattern (e.g.
'^#!.*python') matching against not contents of a file, but name of
it.
This unintentionally omits code checking against Python source file,
of which filename doesn't end with "*.py" or "*.cgi", even though
contents of it starts with "#!/bin/python" or so.
In this change, 'pre' refers contents of file 'f'.
This is fixing for 'legacy exception syntax; use "as" instead of ","'
check-code rule.
check-code has overlooked these, because files aren't recognized as
one to be checked (this problem is fixed by subsequent patch).
This is fixing for 'missing _() in ui message (use () to hide
false-positives)' check-code rule.
check-code has overlooked this, because a file isn't recognized as one
to be checked (this problem is fixed by subsequent patch).
This is fixing for 'no whitespace around = for named parameters'
check-code rule.
check-code has overlooked this, because a file isn't recognized as one
to be checked (this problem is fixed by subsequent patch).
This is fixing for 'line too long' check-code rule.
check-code has overlooked this, because a file isn't recognized as one
to be checked (this problem is fixed by subsequent patch).
Before this patch, some tests using external "diff" command via
extdiff extension fail on Solaris, because system standard "diff" (=
/usr/bin/diff) on Solaris always formats chunk header in the style
below:
@@ -X.x +Y.y @@
even though "diff" on Linux sometimes omits ".x" and/or ".y" in it.
This patch makes chunk header of external diff glob-ed for portability
of tests, and adds check-code.py rules to detect such diff output in
tests.
This patch also changes "hg diff" output in test-subrepo-git to
simplify detection rules, even though it is certainly portable because
these lines are generated by "git" command.
This patch is a part of making tests using external "diff" portable,
and tests below aren't yet portable even after this patch.
test-largefiles-update.t
test-subrepo-deep-nested-change.t
Before this patch, some tests using external "diff" command via
extdiff extension fail on Solaris, because system standard "diff" (=
/usr/bin/diff) on Solaris doesn't display timezone for timestamp of
each files in diff output.
This patch makes timezone in external diff output glob-ed for
portability of tests, and adds check-code.py a rule to detect such
Before this patch, some tests using external "diff" command via
extdiff extension fail on Solaris, because "-p" (show which C function
each change is in) option isn't supported by system standard "diff" on
Solaris, even though extdiff passes it to external "diff" by default.
Fortunately, this non-portable option isn't important for (current, at
least) tests using external "diff" command via extdiff extension.
This patch omits "-p" for external "diff" command via extdiff
extension for portability of tests, and adds check-code.py a rule to
detect invocation of "diff" with "-p".
Newly added check-code.py rule examines only lines generated by
external "diff" with "-r", because strict examination might
misidentify "hg diff -p" or other complicated lines consisting of
"diff" string as wrong one.
This patch is a part of making tests using external "diff" portable,
and tests below aren't yet portable even after this patch.
test-graft.t
test-largefiles-update.t
test-subrepo-deep-nested-change.t
Before this patch, test-check-config.t fails on Solaris, because
"xargs" doesn't invoke check-config.py with all filenames at once.
"xargs" may invoke specified command multiple times with part of
arguments given from stdin: according to "xargs(1)" man page, this
dividing arguments is system-dependent.
For portability of test-check-config.t, this patch adds "xargs" like
mode to check-config.py and executes it in test-check-config.t without
"xargs".
We're trying to forbid "grep -a" and unintentionally complained even
if the "-a" was part of the filename. Requiring a space before "-a" to
match is probably good enough.
The old code did not understand the difference between the first line of the summary,
and a random line in the summary that happened to include a #, or a
random line in the changes that happened to include it.
7c3798ffdc0c is an example where it fails
For reasons I can't explain (but likely have something to do with a
combination of __import__ inferring default values for arguments and
the demand importer mechanism further assuming defaults), the demand
importer isn't playing well with IPython. Without this patch, we get
a failure "ValueError: Attempted relative import in non-package" when
attempting to import "IPython." The stack has numerous demandimport
calls on it and adding "IPython" to the exclude list in demandimport
isn't enough to make the problem go away, which means the issue is
likely somewhere in the bowells of IPython. It's easier to just disable
the demand importer when importing the debugger.
The goal of commit summary keywords is to help us sort, categorize,
and filter our voluminous commits for our release notes in a way
that's helpful and meaningful to end users. Lately, there have been a
huge number of "keywords" that are neither words nor particularly key.
This patch tries to discourage that by narrowing the allowed
characters to alphanumeric. In particular, it doesn't allow "."
(method, function names, and file extensions) and "/" (filenames). It
also gives a short reminder of what a keyword ought to be.
This addition to the inno installer script means that the windows uninstaller
registry key “DisplayVersion" is set to the application version number and
will show in Add/Remove Programs.
This makes the changes in 68b7b759ebff and 71a3703364df available on Windows.
I'm not setup to make the installer, so someone with experience in this area
should probably give it a look. In looking around to try to figure out how to
build the installer, it looks like the Makefile may need an update to $DOCFILES.
My version of docker (1.8.3) have a different formating for 'docker version'
that broke the build script. We make the version matching more generic in to
work with both version.
A subsequent patch will refactor _chunks() and the calculation of the
offset will no longer occur in that function. Prepare by returning the
offset from _chunkraw().
There are make targets for building mercurial packages for various
distributions using docker. One of the preparation steps before building is to
create inside the docker image a user with the same uid/gid as the current user
on the host system, so that the resulting files have appropriate
ownership/permissions.
It's possible to run `make docker-<distro>` as a user with uid or gid that is
already present in a vanilla docker container of that distibution. For example,
issue4657 is about failing to build fedora packages as a user with uid=999 and
gid=999 because these ids are already used in fedora, and groupadd fails.
useradd would fail too, if the flow ever got to it (and there was a user with
such uid already).
A straightforward (maybe too much) way to fix this is to allow non-unique uid
and gid for the new user and group that get created inside the image. I'm not
sure of the implications of this, but marmoute encouraged me to try and send
this patch for stable.
Previously the -rc in our rc tags got dropped, meaning that those
packages looked newer to the packaging system than the later release
build. This rectifies the issue, though some damage may already have
been done on 3.6-rc builds.
I'm mostly cargo-culting the RPM version format - there don't appear
to be rules for RPM about how to handle this. Hopefully an RPM
enthusiast can fix up what I've done as a followup.
We want to support editors with parameters, eg EDITOR="vim -O" or whatever.
So remove the quotes from around $ED and assume that the editor variable is
properly escaped already.
Now, 'dirstate.write(tr)' delays writing in-memory changes out, if a
transaction is running.
This may cause treating this revision as "the first bad one" at
bisecting in some cases using external hook process inside transaction
scope, because some external hooks and editor process are still
invoked without HG_PENDING and pending changes aren't visible to them.
'dirstate.write()' callers below in localrepo.py explicitly use 'None'
as 'tr', because they can assume that no transaction is running:
- just before starting transaction
- at closing transaction, or
- at unlocking wlock
This fix adds a caret to the start of the regex looking for merge markers. This
avoids the issue arises when you've real merge conflicts in a file that tests
for the existance of merge markers in test output. Editmerge will not open on
the fake/tested merge markers because they'll be indented in.
This adds a test extension to check that the non-normal set contains the
expected entries. It wraps several methods of the dirstate to check that
the non-normal set has the correct values before and after the call. The
extension lives in contrib so that paranoid developers can easily
enable it to make sure that the non-normal set is consistent across more
complex operations than the included tests.
Whenever check-code finds something wrong, the diffs it
generated were fairly hard to read.
The problem is that check-code before this change
would list files that were white listed using
no- check- code but without a glob marker.
Whereas, the test-check-code.t expected output has
no-che?k-code (glob) in order to avoid having itself
flagged as a file to skip.
Thus, in addition to any lines relating to things you
did wrong, all of the white-listed files are listed as
changed.
There is no reason for things to be this painful.
This change makes the output from check-code.py match
the expected output in test-check-code.t
This came up before, but the tests in check-code.py don't find -U (only -u)
and they don't work when the diff is inside a shell function. This fixes
the offending tests and beefs up check-code.py.
Not having this caused warnings on Windows:
mercurial/pure/osutil.py:12: stdlib import follows local import: os
mercurial/pure/osutil.py:13: stdlib import follows local import: socket
mercurial/pure/osutil.py:14: stdlib import follows local import: stat
mercurial/pure/osutil.py:15: stdlib import follows local import: sys
We have a convention of using -c|-m|FILE elsewhere for reading from
revlogs. Use it for `hg perfrevlog`.
While I was here, I also added a docstring to document what this
command does, as "perfrevlog" is ambiguous.
As part of investigating performance improvements to revlog reading,
I needed a mechanism to measure every part of revlog reading so I knew
where time was spent and how effective optimizations were.
This patch implements a perf command for benchmarking the various
stages of reading a single revlog revision.
When executed against a manifest revision at the end of a 30,000+
long delta chain in mozilla-central, the command demonstrates that
~80% of time is spent in zlib decompression.
The old code only partially cleared the caches. Now that we have a
comprehensive method for wiping all caches, let's call it.
This appears to introduce a marginal regression in `hg perfmanifest`
on mozilla-central. This is good because the new result is more
accurate since caches aren't being used.
The /dev/null redirect was causing the following error:
The system cannot find the path specified.
Adjusting HGRCPATH as part of the command line causes the system to try to
execute 'HGRCPATH'.
Once we get a matcher down into manifestmerge, we can make narrowhg
work more easily and potentially let manifest.match().diff() do less
work in manifestmerge.
Python 3 is inevitable. There have been incremental movements towards
converting the code base to be Python 3 compatible. Unfortunately, we
don't have any tests that look for Python 3 compatibility. This patch
changes that.
We introduce a check-py3-compat.py script whose role is to verify
Python 3 compatibility of the files passed in. We add a test that
calls this script with all .py files from the source checkout.
The script currently only verifies that absolute_import and
print_function are used. These are the low hanging fruits for Python
compatbility. Over time, we can include more checks, including
verifying we're able to load each Python file with Python 3. You
have to start somewhere.
Accepting this patch means that all new .py files must have
absolute_import and print_function (if "print" is used) to avoid
a new warning about Python 3 incompatibility. We've already
converted several files to use absolute_import and print_function
is in the same boat, so I don't think this is such a radical
proposition.
Before this patch, import-checker.py didn't know if a name in ImportFrom
statement are module or not. Therefore, it complained the following example
did "direct symbol import from mercurial".
# hgext/foo.py
from mercurial import hg
This patch reuses the dict of local modules to filter out sub-module names.
Because OpenSSL is compiled without SSLv3 support on Debian sid, Python 2.6.9
can't be built without this hack. Python 2.7 is patched appropriately, but
2.6 isn't.
http://bugs.python.org/issue22935
This allows builddeb to handle distributions that are not Debian.
Distributor ID is reported by lsb_release --id, and in case of builddeb it's
usually Debian or Ubuntu.
Debian and Ubuntu releases have both codenames and traditional version numbers.
An entire "branch" of releases is referred to by its codename, and version
numbers (e.g. 8.2, 14.04.3) are used to address individual releases.
Since we use codenames for building .deb packages, let's call the option and
the variable appropriately.
The pull url header can easily grow over 80 chars. The check-commit script was
confusing this with a too long summary line. We update the regular expression to
not match other header.
Many revset consumers construct changectx instances for each returned
result. Add support for benchmarking this to our revset benchmark
script.
In the future, we might want to have some kind of special syntax in
the parsed revset files to engage this mode automatically. This would
enable us to load changectxs for revsets that do that in the code and
would more accurately benchmark what's actually happening. For now,
running all revsets with or without changectxs is sufficient.
Previously, perfrevset called repo.revs(), which only returns integer
revisions. Many revset consumers call repo.set(), which returns
changectx instances. Or they obtain a context manually later.
Since obtaining changectx instances when evaluating revsets is common,
this patch adds support for benchmarking this use case.
While we added an if conditional for every benchmark loop, it
doesn't appear to matter since revset evaluation dwarfs the cost
of a single if.
The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.
For great justice.
default value are common to all call. Using mutable value is a classical source
of bug in Python. We forbid it.
The regexp (Courtesy of Matt Mackall) is only catching such value on the first
line of a definition, but that will be good enough for now.
Leaving the hgk binary in /usr/bin causes some lintian warnings, and
downstream packages poke it in /usr/share/mercurial, so we'll just
stash it in there. Rather than patch hgk.py as part of the Mercurial
install, just drop a config file in /etc/mercurial/hgrc.d that points
to the installed hgk.
Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
$ ls '/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6'/BaseHTTPServer.py*
/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/BaseHTTPServer.pyc
/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/BaseHTTPServer.pyo
This is a much larger commit than I'd like, but I honestly don't see a
good way to break it up and leave things working. Summary:
We now use debian/rules with debhelper to build our debs. This is much
more standard, and means we use dh_python2 to do things like handle
leaving .pyc files out of the built debs.
The resulting package is split into mercurial and mercurial-common,
with the former being the hg stub and all the native .sos, and the
latter being basically everything else.
builddeb and dockerdeb are updated to use the new system. The old way
(using dpkg by hand) breaks with the above changes because
debian/control no longer contains a version string (that's now guessed
from the phony changelog.)
Tests are updated to assert that the right files end up in the right
debs.
Debian now rolls their own official Vagrant base boxes, so use that. At
the same time, we're updating from Debian 7.4 (wheezy) to 8.1 (jessie),
and switching from 32-bit to 64-bit (Debian does not provide 32-bit base
boxes).
Previously:
$ /c/Program\ Files/Mercurial/hg help -k merge-tools
abort: No such file or directory: c:\Program Files\Mercurial\help\scripting.txt
The Inno installer seems OK, but the TortoiseHg required the same fix. That got
queued with an additional change of 'helpFolder.guid' in guids.wxi (probably by
Steve). I'm not sure if that is necessary here too.
This allows getting a Python stack trace at any time on Unix by
hitting Ctrl-\ (or Ctrl-T on BSDs). Useful for debugging mysterious
hangs on the fly. Sample output:
$ hg log -k nosuchmessage
^\
File "/home/mpm/hg/mercurial/revset.py", line 3089, in _iterfilter
if cond(x):
File "/home/mpm/hg/mercurial/util.py", line 415, in f
cache[arg] = func(arg)
File "/home/mpm/hg/mercurial/revset.py", line 1215, in matches
for t in c.files() + [c.user(), c.description()])
File "/home/mpm/hg/mercurial/context.py", line 525, in files
return self._changeset[3]
File "/home/mpm/hg/mercurial/util.py", line 531, in __get__
result = self.func(obj)
File "/home/mpm/hg/mercurial/context.py", line 498, in _changeset
return self._repo.changelog.read(self.rev())
File "/home/mpm/hg/mercurial/changelog.py", line 338, in read
text = self.revision(node)
File "/home/mpm/hg/mercurial/revlog.py", line 1092, in revision
bins = self._chunks(chain)
File "/home/mpm/hg/mercurial/revlog.py", line 1013, in _chunks
ladd(decompress(buffer(data, chunkstart - offset, chunklength)))
File "/home/mpm/hg/mercurial/revlog.py", line 91, in decompress
return _decompress(bin)
----
As of this change, we no longer produce broken debs, but I've already
got followups written that will produce much more standard-looking
packages and test the resulting packages.
On my linux machines multiprocessing appears to defeat the logic in
import-checker to detect stdlib modules. Since we now only use
versions of Python which ship with multiprocessing, let's just
whitelist the module.
I got the following error by rewriting hgweb/webcommands.py to use
absolute_import. It is false-positive because the import line appears in
"help" function:
hgweb/webcommands.py:1297: higher-level import should come first: mercurial
This patch makes the import checker aware of the function scope and apply
rules recursively.
I got the following error by rewriting hgweb/__init__.py to use
absolute_import, which is obviously wrong:
Import cycle: mercurial.hgweb.__init__ -> mercurial.hgweb.__init__
"from foo import bar" should not make a cycle if "foo" is a package and
if "bar" is a module or a package. On the other hand, it should be detected
as a cycle if "bar" is a non-module name. Both cases are doc-tested already,
so this patch does not add new doctest.
This script scans files for lines that look like either ui.config
usage or config variable documentation. It then ensures:
- ui.config calls for each option agree on types and defaults
- every option appears to be mentioned in documentation
It doesn't complain about devel/experimental options and allows
marking options that are not intended to be public.
Since we haven't been able to come up with a good scheme for
documenting config options at point of use, this will help close the
loop of making sure all options that should be documented are.
If mercurial was installed into a directory other than the site-packages,
test-module-imports.t failed as 'mercurial.node' was listed in stdlib_modules:
testpackage/latesymbolimport.py relative import of stdlib module
Instead, we should exclude our packages explicitly.
We can't assume that the site-packages is the only directory that has Python
files but is not handled as a package. For example, we have dist-packages
directory on Debian.
Before this patch, `import-checker.py` exits with non-0 code, if no
error is detected. This is unusual as Unix command.
This change may be a one of preparations for issue4677, because this
can avoid extra explanation about unusual exit code of
`import-checker.py` for third party tool developers.
We introduce a new convention for declaring imports and enforce it via
the import checker script.
The new convention is only active when absolute imports are used, which is
currently nowhere. Keying off "from __future__ import absolute_import" to
engage the new import convention seems like the easiest solution. It is
also beneficial for Mercurial to use this mode because it means less work
and ambiguity for the importer and potentially better performance due to
fewer stat() system calls because the importer won't look for modules in
relative paths unless explicitly asked.
Once all files are converted to use absolute import, we can refactor
this code to again only have a single import convention and we can
require use of absolute import in the style checker.
The rules for the new convention are documented in the docstring of the
added function. Tests have been added to test-module-imports.t. Some
tests are sensitive to newlines and source column position, which makes
docstring testing difficult and/or impossible.
A future patch will formalize the modern import convention. In
preparation for that, introduce a new wrapper function that will invoke
the proper function.
"from . import X" will produce an ImportFrom ast node with .module =
None. This resulted in a run-time error from attempting to concatenate
None with a str.
Another problem with relative imports is that the prefix may be dynamic
based on the "level" attribute of the import. e.g. "from ." has level 1
and "from .." has level 2.
We teach the "fromlocal" function how to cope with relative imports.
Where appropriate, the consumer passes in the level so relative module
names may be resolved properly.
We just rewrote all files to use modern exception syntax. Ban the old
form.
This will detect the "except type, instance" and
"except (type1, type2), instance" forms.
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".
This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.
This patch was produced by running `2to3 -f except -w -n .`.
The canonical way of doing 'roots(X)' is 'X - children(X)'. This is what the
implementation used to be. However, computing children is expensive because it
is unbounded. Any changesets in the repository may be a children of '0' so you
have to look at all changesets in the repository to compute children(0).
Moreover the current revsets implementation for children is not lazy, leading to
bad performance when fetching the first result.
There is a more restricted algorithm to compute roots:
roots(X) = [r for r in X if not parents(r) & X]
This achieve the same result while only looking for parent/children relation in
the X set itself, making the algorithm 'O(len(X))' membership operation.
Another advantages is that it turns the check into a simple filter, preserving
all laziness property of the underlying revsets.
The speed is very significant and some laziness is restored.
-) revset without 'roots(...)' to compare to base line
0) before this change
1) after this change
revset #0: roots((tip~100::) - (tip~100::tip))
plain min last
-) 0.001082 0.000993 0.000790
0) 0.001366 0.001385 0.001339
1) 0.001257 92% 0.001028 74% 0.000821 61%
revset #1: roots((0::) - (0::tip))
plain min last
-) 0.134551 0.144682 0.068453
0) 0.161822 0.171786 0.157683
1) 0.137583 85% 0.146204 85% 0.070012 44%
revset #2: roots(tip~100:)
plain min first last
-) 0.000219 0.000225 0.000231 0.000229
0) 0.000513 0.000529 0.000507 0.000539
1) 0.000463 90% 0.000269 50% 0.000267 52% 0.000463 85%
revset #3: roots(:42)
plain min first last
-) 0.000119 0.000146 0.000146 0.000146
0) 0.000231 0.000254 0.000253 0.000260
1) 0.000216 93% 0.000186 73% 0.000184 72% 0.000244 93%
revset #4: roots(not public())
plain min first
-) 0.000478 0.000502 0.000504
0) 0.000611 0.000639 0.000634
1) 0.000604 0.000560 87% 0.000558
revset #5: roots((0:tip)::)
plain min max first last
-) 0.057795 0.004905 0.058260 0.004908 0.038812
0) 0.132845 0.118931 0.130306 0.114280 0.127742
1) 0.111659 84% 0.005023 4% 0.111658 85% 0.005022 4% 0.092490 72%
revset #6: roots(0::tip)
plain min max first last
-) 0.032971 0.033947 0.033460 0.032350 0.033125
0) 0.083671 0.081953 0.084074 0.080364 0.086069
1) 0.074720 89% 0.035547 43% 0.077025 91% 0.033729 41% 0.083197
revset #7: 42:68 and roots(42:tip)
plain min max first last
-) 0.006827 0.000251 0.006830 0.000254 0.006771
0) 0.000337 0.000353 0.000366 0.000350 0.000366
1) 0.000318 94% 0.000297 84% 0.000353 0.000293 83% 0.000351
revset #8: roots(0:tip)
plain min max first last
-) 0.002119 0.000145 0.000147 0.000147 0.000147
0) 0.047441 0.040660 0.045662 0.040284 0.043435
1) 0.038057 80% 0.000187 0% 0.034919 76% 0.000186 0% 0.035097 80%
revset #0: roots(:42 + tip~42:)
plain min max first last sort
-) 0.000321 0.000317 0.000319 0.000308 0.000369 0.000343
0) 0.000772 0.000751 0.000811 0.000750 0.000802 0.000783
1) 0.000632 81% 0.000369 49% 0.000617 76% 0.000358 47% 0.000601 74% 0.000642 81%
If the computation of a set for each phase (done in C) is available,
we use it directly instead of applying a simple filter. This give a
massive speed-up in the vast majority of cases.
On my mercurial repo with about 15000 out of 40000 draft changesets:
revset: draft()
plain min first last
0) 0.011201 0.019950 0.009844 0.000074
1) 0.000284 2% 0.000312 1% 0.000314 3% 0.000315 x4.3
Bad performance for "last" come from the handling of the 15000 elements set
(memory allocation, filtering hidden changesets (99% of it) etc. compared to
applying the filter only on a handfuld of revisions (the first draft changesets
being close of tip).
This is not seen as an issue since:
* Timing is still pretty good and in line with all the other one,
* Current user of Vanilla Mercurial will not have 1/3 of their repo draft,
This bad effect disappears when phase's set is smaller. (about 200 secrets):
revset: secret()
plain min first last
0) 0.011181 0.022228 0.010851 0.000452
1) 0.000058 0% 0.000084 0% 0.000087 0% 0.000087 19%
Using 'repo[X]' is much slower because it creates a 'changectx' object and goes
though multiple layers of code to do so. It is also error prone if there is
tags, bookmarks, branch or other names that could map to a node hash and take
precedence (user are wicked).
This provides a significant performance boost on repository with a lot of
heads. Benchmark result for a repo with 1181 heads.
revset: head()
plain min last reverse
0) 0.014853 0.014371 0.014350 0.015161
1) 0.001402 9% 0.000975 6% 0.000874 6% 0.001415 9%
revset: head() - public()
plain min last reverse
0) 0.015121 0.014420 0.014560 0.015028
1) 0.001674 11% 0.001109 7% 0.000980 6% 0.001693 11%
revset: draft() and head()
plain min last reverse
0) 0.015976 0.014490 0.014214 0.015892
1) 0.002335 14% 0.001018 7% 0.000887 6% 0.002340 14%
The speed up is visible even when other more costly revset are in use
revset: head() and author("mpm")
plain min last reverse
0) 0.105419 0.090046 0.017169 0.108180
1) 0.090721 86% 0.077602 86% 0.003556 20% 0.093324 86%
This file should gather all revsets ever thought interesting by
anyone. That way one can check the impact of a change when touching
something revset-ish. See inline comments for details.
This file have been refilled with all the entry I could automatically
find from changeset descriptions. I assume we missed some not using
'revsetbenchmarks.py' output.
We rename the file and document its purpose. We'll be introducing another file
gathering revsets useful for benchmark of the predicate themsleves in a coming
changesets.
We remove revset making use of min and max as this is covered by the variants.
We could use variant for roots too, but it is not in the default so keep it
here.
We need more advanced variants in some cases. For example, "The last
rev of the sorted version".
We introduce a syntax for this: `reverse+last` means `last(reverse(REVSET))`.
We now use an 8 char display for timing (from 10), we add some logic to drop
precision if the number grows too large (as we do not care about sub-0 digit
in this case). This allow to pack more variants in a single screen.
The current benchmarks were only testing the whole iteration. This is suboptimal
because some changes are meaningful for things like first result, minimum or
sorting.
We introduce a "variants" feature that let you systematically add some variants
to all revsets tested.
A typical variants value would be 'plain,min,last,sort'. When testing 'all()' it
will also provide testing for:
- all()
- min(all())
- last(all())
- sort(sort)
and output:
plain min last sort
0) 0.034568 0.037857 0.000074 0.034238
1) 0.011358 32% 0.020181 53% 0.000080 108% 0.011405 33%
Using revsets (who hit the API) instead of the internal API add some overhead,
but the overhead should be the same everywhere so it still allow comparison.
This is is more simple to implement and allows comparison with older versions
who do not have the same API.
If the time difference is more than 5% from the previous run, we'll display
relative information. This makes it much simpler to spot performance changes in
a sea of benchmarks.
We mostly only care about total time. Dropping this output give us some room to
display more useful information (like percentage different) in future
changesets.
The file doc was saying something, the code was doing something else, the
argument validation was doing a third thing.
Doc and behavior now comply with the argument defined in the code.
We cannot just ask perfrevset to provide debug output because we usually want
to compare output from old version of Mercurial that do not support it. So, we
are using a regular expression.
(/we now have \d problems/).
This makes the root install folder (on Windows) nice and tidy. The
only files left in the root folder are:
hg.exe
python27.dll
COPYING.rtf
ReadMe.html
the last of which was probably out-of-date 7 years ago
\s is equivalent to the character class [ \t\n\r\f\v]. Using \s+ in
a regular expression against input with multiple lines may match across
multiple lines.
For the regexp in question, "\+\s+" would match "+\n " and similar
sequences, leading to false positives for functions that were included
in diff context, after a modified hunk.
In their infinite wisdom, the Python maintainers stripped bytes of its
% and format() methods for 3.x. They've now added % back to 3.5, but
format() is still missing. Since we don't have any particular need for
it, we should keep avoiding it.
Extension authors (notably at companies using hg) have been
cargo-culting the `testedwith = 'internal'` bit from hg's own
extensions, which then defeats our "file bugs over here" logic in
dispatch. Let's be more aggressive about trying to give extension
authors a hint about what testedwith should say.
The previous patch ensures all module names are recorded in `imports`
as absolute names, so we no longer need to treat modules as ones
imported relatively from the target source if they appear to not be
from the stdlib.
This patch makes `imported_modules()` always yield absolute
`dotted_name_of_path()`-ed name by strict detection with
`fromlocal()`.
This change improves circular detection in some points:
- locally defined modules, of which name collides against one of
standard library, can be examined correctly
For example, circular import related to `commands` is overlooked
before this patch.
- names not useful for circular detection are ignored
Names below are also yielded before this patch:
- module names of standard library (= not locally defined one)
- non-module names (e.g. `node.nullid` of `from node import nullid`)
These redundant names decrease performance of circular detection.
For example, with files at 13dc86d189c9, average loops per file in
`checkmod()` is reduced from 165 to 109.
- `__init__` can be handled correctly in `checkmod()`
For example, current implementation has problems below:
- `from xxx import yyy` doesn't recognize `xxx.__init__` as imported
- `xxx.__init__` imported via `import xxx` is treated as `xxx`,
and circular detection is aborted, because `key` of such
module name is not `xxx` but `xxx.__init__`
- it is easy to enhance for `from . import xxx` style or so (in the
future)
Module name detection in `imported_modules()` can use information
in `ast.ImportFrom` fully.
It is assumed that all locally defined modules are correctly specified
to `import-checker.py` at once.
Strictly speaking, when `from foo.bar.baz import module1` imports
`foo.bar.baz.module1` module, current `imported_modules()` yields only
`foo.bar.baz.__init__`, even though also `foo.__init__` and
`foo.bar.__init__` should be yielded to detect circular import
exactly.
But this limitation is reasonable one for improvement in this patch,
because current `__init__` files in Mercurial seems to be implemented
carefully.
`fromlocalfunc()` uses:
- `modulename` (of the target source) to compose absolute module
name imported relatively from it
It is assumed that `modulename` is an `dotted_name_of_path()`-ed
source file, which may have `.__init__` at the end of it.
This assumption makes composing `prefix` of relative name easy.
- `localmods` to examine whether there is a locally defined (=
Mercurial specific) module matching against the specified name
It is assumed that module names not existing in `localmods` are
ones of Python standard library.
We now have a lock triggered for any transaction. We use it to ensure no-read
are made in read-only mode. We need more that just "no changegroup is added",
since bundle2 allows for more than just changegroup to be exchanged. We still
protect pushkey as it may write data without opening a transaction.
This is a preparation for subsequent patches, which expect that all
locally defined (= mercurial specific) modules are already known
before examinations.
Looping twice for specified modules is a little redundant, but
reasonable cost for improvement in subsequent patches.