The Linux CIFS kernel driver (even in 2.6.36) suffers from a hardlink
count blindness bug (lstat() returning 1 in st_nlink when it is expected
to return >1), which causes repository corruption if Mercurial running
on Linux pushes or commits to a hardlinked repository stored on a Windows
share, if that share is mounted using the CIFS driver.
This patch works around issue1866 and improves the workaround done in
65e082ae3076 to fix issue761, by teaching the opener to lazily execute a
runtime check (new function checknlink) to see if the hardlink count
reported by nlinks() can be trusted.
Since nlinks() is also known to return varying count values (1 or >1)
depending on whether the file is open or not and depending on what client
and server software combination is being used for accessing and serving
the Windows share, we deliberately open the file before calling nlinks() in
order to have a stable precondition. Trying to depend on the precondition
"file closed" would be fragile, as the file could have been opened very
easily somewhere else in the program.
We were not returning the correct result if nullrev was in revs, as we
are checking parent(currentrev) != nullrev before yielding currentrev
test-convert-hg-startrev was wrong: if we start converting from rev -1 and
onwards, all the descendants of -1 (full repo) should be converted.
- Don't call atomictempfile or nlinks() if the path is malformed
(no basename). Let posixfile() raise IOError directly.
- atomictempfile already breaks up hardlinks, no need to poke
at the file with nlinks() if atomictemp.
- No need to copy the file contents to break hardlinks for 'w'rite
modes (w, wb, w+, w+b). Unlinking and recreating the file is faster.
When parentdelta is enabled, we choose the delta that has the minimum
distance to its base. Otherwise, base may be sufficiently far away to
require a full version, resulting in greatly reduced compression.
The current implementation of colwidth was treating 'A'mbiguous
characters as wide, which was incorrect in a non-East Asian context.
As per http://unicode.org/reports/tr11/#Recommendations, we should
instead default to 'narrow' if we don't know better. As character
width is dependent on the particular font used and we have no idea
what fonts are in use, this recommendation applies.
This introduces HGENCODINGAMBIGUOUS to get the old behavior back.
Currently, cmdutil.make_file() will return a freshly made file handle,
except when given a pattern of '-'. If callers would want to close the
handle, they would have to make sure that it's neither sys.stdin or
sys.stdout. Instead, returning a duplicate of either of the two
ensures that make_file() lives up to its name and creates a new
file handle regardless of the input.
This uses the same strategy as progress for pulls, estimating manifests
based on changeset count and estimating file count by files list in
each changeset.
This allows us to provide deterministic progress information during
transfer of bundle data over HTTP. This is required because we
currently buffer the bundle data to local disk prior to transfer since
wsgiref lacks chunked transfer-coding support.
The opener already unlinks the filename before 'w'riting, for both
the symlink and the normal file case. It also now resets the flags
for normal files on 'w'rite, which makes this os.unlink call completely
redundant.
For Windows, removing this extra unlink call helps to avoid tripping
issue2524 (os.unlink followed by a write).
In many common cases where the subrepo is up to date with the remote repo,
hg push will only need to run one git-branch command to see that nothing
needs to be pushed.
As well as simplifying the code, this makes subprocess calls about 25% faster.
Tested on a couple linux boxes.
python -mtimeit -s'import subprocess' 'subprocess.Popen(
"git version", shell=True, stdout=subprocess.PIPE).wait()'
python -mtimeit -s'import subprocess' 'subprocess.Popen(
["git","version"], stdout=subprocess.PIPE).wait()'
226847bf9cab updated copyfile to also copy over atimes and
mtimes. That behavior is specifically to trick editors into thinking
files that hg record has modified haven't changed. We don't really
care about preserving times in the general case.
This continues the strategy of separation between hg pull and hg update in
git subrepos by only dealing with git's branches on an update. This behavior
tries to cover the bare essentials of the semantics of git pull in the subrepo
when the parent repo does hg pull and hg update.
The Python default for this function is -1, indicating both relative
and absolute imports should be used.[1] Previously, we relied on the
Python VM not passing level when such semantics were
requisted. This is not the case for PyPy, however, where a level of -1
is always passed to __import__.
[1] <http://docs.python.org/library/functions.html#__import__>
Previously, branch names were ideally manipulated as UTF-8 strings,
because they were stored as UTF-8 in the dirstate and the changelog
and could not be safely converted to the local encoding and back.
However, only about 80% of branch name code was actually using the
right encoding conventions. This patch uses the localstr addition to
allow working on branch names as local strings, which simplifies
handling so that the previously incorrect code becomes correct.
Avoids calls to git push when the revision is already known to be
in the remote repository. Now, when using a read-only git subrepo,
git will never need to talk to its upstream repository.
This enables minirst to parse and print option lists which have both
long and short options. Before, we could only parse option lists with
long options.
You can now split a list with a comment:
* foo
.. separator
* bar
and the two list items will no longer be run together, that is the
output is
* foo
* bar
instead of
* foo
* bar
With this patch applied, Mercurial will list the hashes of new remote heads
if push --debug aborts because of new remote heads (option -f/--force not set).
Example:
$ hg push --debug repo1
using http://example.org/repo1
http auth: user johndoe, password not set
sending between command
pushing to http://example.org/repo1
sending capabilities command
capabilities: changegroupsubset stream=1 lookup pushkey unbundle=HG10GZ,HG10BZ,HG10UN branchmap
sending heads command
searching for changes
common changesets up to 187dd3f0a37d
sending branchmap command
new remote heads on branch 'default' <- new output line
new remote head 5862c07f53a2 <- new output line
abort: push creates new remote heads on branch 'default'!
(did you forget to merge? use push -f to force)
Compare to without --debug (not changed by this patch, including it here
for reference purposes only):
$ hg push repo1
pushing to http://example.org/repo1
searching for changes
abort: push creates new remote heads on branch 'default'!
(did you forget to merge? use push -f to force)
Motivation for this change:
'hg outgoing' may list a whole lot of benign changesets plus an odd changeset
that will trigger the "new remote heads" abort. It can be hard to spot that
single unwanted changeset (it may be an old forgotten experiment, lingering
in the local repo).
"hg log -r 'heads(outgoing())'" might be useful, but that also lists a head
that may be benign on push.
Inside prepush(), we already know which heads are causing troubles on 'hg push'.
Why not make that info available (at least on --debug)?
This would also be helpful for doing remote support, as the supporter can ask
the user to paste the output of 'hg push --debug' on error and then ask further
questions about the heads listed.
- Handle 'subset' argument
- Stop returning the null rev from p1 and parents, as in the non-dirstate case
- Order parents as in the non-dirstate case (ascending revs)
This patch makes the 'set' argument to revset function parents() optional.
Like p1() and p2(), if no argument is given, returns the parent(s) of the
working directory.
Morally equivalent to 'p1()+p2()', as expected.
This patch makes the 'set' argument to revset functions p1() and p2()
optional. If no argument is given, p1() and p2() return the first or second
parent of the working directory.
If the working directory is not an in-progress merge (no 2nd parent), p2()
returns the empty set. For a checkout of the null changeset, both p1() and
p2() return the empty set.
- Don't call atomictempfile or nlinks() if the path is malformed
(no basename). Let posixfile() raise IOError directly.
- atomictempfile already breaks up hardlinks, no need to poke
at the file with nlinks() if atomictemp.
- No need to copy the file contents to break hardlinks for 'w'rite
modes (w, wb, w+, w+b). Unlinking and recreating the file is faster.
When issuing `hg pull -r REV` in a repo with no common ancestor with the
remote repo, the message 'requesting all changes' is printed, even though only
the changese that are ancestors of REV are actually requested. This can be
confusing for users (see
http://www.selenic.com/pipermail/mercurial/2010-October/035508.html).
This silences the message if (and only if) the '-r' option was passed.
PROTOCOL_SSLv3 on the server side doesn't work everywhere. Sometimes the client
reports "EOF occurred in violation of protocol" (for example on Mac and Solaris).
The more compatible PROTOCOL_SSLv23 is now used instead. It works but is less
"secure" for some OpenSSL versions as it can fall back to weak encryption.
ui.forcemerge is set before calling into merge or resolve commands, then unset
to prevent ui pollution for further operations.
ui.forcemerge takes precedence over HGMERGE, but mimics HGMERGE behavior if the
given --tool is not found by the merge-tools machinery. This makes it possible
to do: hg resolve --tool="python mymerge.py" FILE
With this approach, HGMERGE and ui.merge are not harmed by --tool
pyOpenSSL apparently doesn't work for Python 2.7 and isn't very actively
maintained.
The built-in ssl module seems like a long-term winner, so we now use that with
Python 2.6 and higher.
Makes extensions.load return the module that
it has loaded.
This is done so that callers can get information on this module, which
e.g. can be used for generating docs.
For the boolean operators, the subset optimization works by calculating
the cheaper argument first, and passing the subset to the second
argument to restrict the revision domain. This works well for filtering
predicates.
But parents() don't work like a filter: it may return revisions outside the
specified set. So, combining it with boolean operators may easily yield
incorrect results. For instance, for the following revision graph:
0 -- 1
the expression '0 and parents(1)' should evaluate as follows:
0 and parents(1) ->
0 and 0 ->
0
But since [0] is passed to parents() as a subset, we get instead:
0 and parents(1 and 0) ->
0 and parents([]) ->
0 and [] ->
[]
This also affects children(), p1() and p2(), for the same reasons.
Predicates that call these (like heads()) are also affected.
We work around this issue by ignoring the subset when propagating
the call inside those predicates.
I have made a help topic for merge tools. The text in the topic is
based on the http://mercurial.selenic.com/wiki/MergeProgram page from
the wiki, along with some extra information on the internal merge tools.
HGMERGE has different semantics than ui.merge. HGMERGE should hold the name
on an executable in your path, or an absolute tool path. As such, it's not
safe to simply copy the user's specified --tool value into HGMERGE. Instead,
we disable HGMERGE by setting it to an empty string.
In particular, when extensions add hooks, or add non-ui and non-paths
configuration items during their setups, we really have no reason
to re-"fix" the config dictionaries.
See the Hg Book on why we actually want to detect this case:
http://hgbook.red-bean.com/read/mercurial-in-daily-use.html#id364290
Before:
$ hg up deadbeef
warning: detected divergent renames of X to:
...
After:
$ hg up deadbeef
note: possible conflict - X was renamed multiple times to:
...
No functionality change.
This patch modifies the check for shell aliases to prevent crashing when an invalid
global option is given.
When an invalid global option is given the check will simply return and let the
normal error handling for this case happen.
The https mode failed in super because BaseRequestHandler is an old-style
class.
This introduces the first test of https client/server functionality - and
"hghave ssl". The test is currently only run on Python 2.6.
Without this fix, mod_wsgi and spawning get in a wedged state after
sending a 304 response. Not sending a body fixed that problem. The
header change was discovered by using wsgiref.validate.validator to
check for other errors.
This changes backouts changeset to retain linear history, .e. it is committed
as a child of the working directory parent, not the reverted changeset
parent.
The default behavior was previously to just commit a reverted change as a
child of the backed out changeset - thus creating a new head. Most of
the time, you would use the --merge option, as it does not make sense to
keep this dangling head as is.
The previous behavior could be obtained by using 'hg update --clean .' after a
'hg backout --merge'.
The --merge option itself is not affected by this change. There is also
still an autocommit of the backout if a merge is not needed, i.e. in case
the backout is the parent of the working directory.
Previously we had (pwd = parent of the working directory):
pwd older
backout auto merge
backout --merge auto commit
With the new linear approach:
pwd older
backout auto commit
backout --merge auto commit
auto: commit done by the backout command
merge: backout also already committed but explicit merge and commit needed
commit: user need to commit the update/merge
Comparing sizes is cheaper than comparing file contents, as it does not
involve reading the file on disk or from the filelog.
It is however not always possible: some extensions, or encode filters,
change data when extracting it to the working directory.
The behaviour between http and ssh still differ:
- the "unsynced changes" is seen as a remote output in the http cases
- but it is correctly seen as a push error for ssh
- Mac OS X has problems with filenames starting with '._'
(e.g. '.FOO' -> '._f_o_o' is now encoded as '~2e_f_o_o')
- Explorer of Windows Vista and Windows 7 strip leading spaces of
path elements of filenames when copying trees
Above problems are avoided by encoding the first space (as '~20') or
period (as '~2e') of all path elements.
This introduces a new entry 'dotencode' in .hg/requires, that is,
a new repository filename layout (inside .hg/store).
Newly created repositories require 'dotencode' by default. Specifying
[format]
dotencode = False
in a config file will use the old format instead.
Prior Mercurial versions will abort with the message
abort: requirement 'dotencode' not supported!
when trying to access a local repository that requires 'dotencode'.
New 'dotencode' repositories can be converted to the previous
repository format with
hg --config format.dotencode=0 clone --pull repoA repoB
Python's __import__() function has 'level' as the fourth argument, not the
third. The code path in question probably never worked.
(This was seen trying to run Mercurial in PyPy. Fixing this made it
die somewhere else...)
When using "hg update" to update to a revision on another branch, if
the user has uncommitted changes in the working directory, hg aborts
with the following message:
abort: crosses branches (use 'hg merge' to merge or use 'hg update
-C' to discard changes)
If the user isn't trying to update to tip and they follow the command
examples verbatim, they would end up updating to the wrong revision.
This patch removes the command examples in favor of just telling the
user to either merge or use --clean:
abort: crosses branches (merge branches or use --clean to discard
changes)
hg also aborts if the user tries to use "hg update" to get to tip
(without specifying a revision) and tip is on another branch:
abort: crosses branches (use 'hg merge' or use 'hg update -c')
This message is changed in the same fashion:
abort: crosses branches (merge branches or use --check to force
update)
Previously no '# ' lines came through the parser.
Now only the first '# ' lines are processed, from '# HG changeset patch' and to
the first line not starting with '# '.
hg log -r 'outgoing(..)' ignored #branch in some cases.
This patch fixes it.
The cases where it misbehaved are now covered by the added
test-revset-outgoing.t
Before this change, "hg help heads" said
hg heads [-ac] [-r REV] [REV]...
[...]
If STARTREV is specified, only those heads that are descendants
of STARTREV will be displayed.
[...]
-r --rev REV show only heads which are descendants of REV
[...]
which made little sense since there are two things called REV in the
synopsis and nothing called STARTREV.
A little digging reveals that the "[-r REV]" part of the synopsis was
introduced in 3acdabb0ef1d, changed to "[-r STARTREV]" in
68df96e9e749, and then changed back to "[-r REV]" in 5c78ba00bf0c.
The last change seems to be based on a patch[1] on our mailinglist
that actually *inserted* STARTREV again in the help for the command
line option itself. For some reason, the patch was changed to remove
STARTREV from the synopsis.
This change finally makes the help consistent by putting STARTREV back
into the help in all places where it is needed:
hg heads [-ac] [-r STARTREV] [REV]...
[...]
If STARTREV is specified, only those heads that are descendants
of STARTREV will be displayed.
[...]
-r --rev STARTREV show only heads which are descendants of STARTREV
[...]
This was not possible until 7ba070acb8f7, which introduced the
possibility of naming the meta variables for each option.
[1]: http://mercurial.markmail.org/message/qgc55gd4fam4ogvz
Pythons SSL module verifies that certificates received for HTTPS are valid
according to the specified cacerts, but it doesn't verify that the certificate
is for the host we connect to.
We now explicitly verify that the commonName in the received certificate
matches the requested hostname and is valid for the time being.
This is a minimal patch where we try to fail to the safe side, but we do still
rely on Python's SSL functionality and do not try to implement the standards
fully and correctly. CRLs and subjectAltName are not handled and proxies
haven't been considered.
This change might break connections to some sites if cacerts is specified and
the certificates (by our definition) isn't correct. The workaround is to
disable cacerts which in most cases isn't much worse than it was before with
cacerts.
Most commands expands configured paths when repositories are specified, just as
the urls help says. Clone also expands the destination path. Clone is morally
equivalent to init + push/pull, so init should also expand the destination path
- and that is what this patch makes it do.
There is no really good usecases for this and in most cases it doesn't matter,
but consistency is nice, and otherwise we would have to document the exception.
The content type for both .tar.gz and .tar.bz2 downloads was
application/x-tar, which is correct for .tar files when no
Content-Encoding is present, but is not correct for .tar.gz and .tar.bz2
files unless Content-Encoding is set to gzip or x-bzip2, respectively.
However, setting Content-Encoding causes browsers to undo that encoding
during download, when a .gz or .bz2 file is usually the desired
artifact. Omitting the Content-Encoding header is preferred to avoid
having browsers uncompress non-render-able files.
Additionally, the Content-Disposition line indicates a final desired
filename with .tar.gz or .tar.bz2 extension which makes providing a
Content-Encoding header inappropriate.
With the current configuration browsers (Chrome and Firefox thus far)
are registering the application/x-tar Content-Type and not .tar
extension and appending that extension, yielding filename.tar.gz.tar as
a final on-disk artifact. This was originally reported here:
http://stackoverflow.com/questions/3753659
I've changed the .tar.gz and .tar.bz2 Content-Type values to
application/x-gzip and application/x-bzip2, respectively. Which yields
correctly named download artifacts on Firefox, Chrome, and IE.
When using hg commands over an http repository in a long running process, a
httphandler instance is leaked for each command, because of a loop
handler.parent -> OpenerDirector and OpenerDirector.handlers -> handler which
is not handled by Python's gc. Discussion on #mercurial concluded that removing
the __del__ method solved the problem.