Doing this will allow access to the lines in arbitrary order (because the
result of enumerate() is an iterator), and that will help calculating rowspan
for annotate blocks.
Yuya suggested we add this check to ensure we don't accidentally try
to load user-writable paths on Windows if we change the control
flow of this function later.
I could not find any use of this parameter value. And it arguably makes
understanding of the function more difficult. Setting the parameter default
value to False.
The link is embedded into a div with class="annotate-info" that only shows up
upon hover of the annotate column. To avoid duplicate hover-overs (this new
one and the one coming from link's title), drop "title" attribute from a
element and put it in the annotate-info element.
This is common between chg and vanilla forking server, so move it to
commandserver and unify handle().
It would be debatable whether we really need gc.collect() or not, but that
is beyond the scope of this series. Maybe we can remove gc.collect() once
all resource deallocations are switched to context manager.
HintException is unrelated to the hierarchy of errors. It is an implementation
detail whether a class inherits from HintException or not, a sort of "private
inheritance" in C++.
New Hint isn't an exception class, which prevents catching error by its type:
try:
dosomething()
except error.Hint:
pass
Unfortunately, this passes on PyPy 5.3.1, but not on Python 2, and raises more
detailed TypeError on Python 3.
Many Linux distros and other Nixen have CA certificates in well-defined
locations. Rather than potentially fail to load any CA certificates at
all (which will always result in a certificate verification failure),
we scan for paths to known CA certificate files and load one if seen.
Because a proper Mercurial install will have the path to the CA
certificate file defined at install time, we print a warning that
the install isn't proper and provide a URL with instructions to
correct things.
We only perform path-based fallback on Pythons that don't know
how to call into OpenSSL to load the default verify locations. This
is because we trust that Python/OpenSSL is properly configured
and knows better than Mercurial. So this new code effectively only
runs on Python <2.7.9 (technically Pythons without the modern ssl
module).
Previously, failure to load system certificates on OS X would lead
to a certificate verify failure and that's it. We now print a warning
message with a URL that will contain information on how to configure
certificates on OS X.
As the inline comment states, there is room to improve here. I think
we could try harder to detect Homebrew and MacPorts installed
certificate files, for example. It's worth noting that Homebrew's
openssl package uses `security find-certificate -a -p` during package
installation to export the system keychain root CAs to
etc/openssl/cert.pem. This is something we could consider adding
to setup.py. We could also encourage packagers to do this. For now,
I'd just like to get this warning (which matches Windows behavior)
landed. We should have time to improve things before release.
I should have ran the entire test suite on Python 2.6. Since the
hostname matching tests are implemented in Python (not .t tests),
it didn't uncover this warning. I'm not sure why - warnings should
be printed regardless. This is possibly a bug in the test runner.
But that's for another day...
When reverting interactively, we always backup files before prompting the user
to find out if they actually want to revert them. This can create spurious
*.orig files if a user enters an interactive revert session and then doesn't
revert any files. Instead, we should only backup files that are actually being
touched.
Previously, ETag headers from hgweb weren't correctly formed, because rfc2616
(section 14, header definitions) requires double quotes around the content of
the header. str(web.mtime) didn't do that.
Additionally, strong ETags signify that the resource representations are
byte-for-byte identical. That is, they can be reconstructed from byte ranges if
client so wishes. Considering ETags for all hgweb pages is just mtime of
00changelog.i and doesn't consider of e.g. .hg/hgrc with description, contact
and other fields, it's clearly shouldn't be strong. The W/ prefix marks it as
weak, which still allows caching the whole served file/page, but doesn't allow
byte-range requests.
sslutil contains its own hostname matching logic. CPython has code
for the same intent. However, it is only available to Python 2.7.9+
(or distributions that have backported 2.7.9's ssl module
improvements).
This patch effectively imports CPython's hostname matching code
from its ssl.py into sslutil.py. The hostname matching code itself
is pretty similar. However, the DNS name matching code is much more
robust and spec conformant.
As the test changes show, this changes some behavior around
wildcard handling and IDNA matching. The new behavior allows
wildcards in the middle of words (e.g. 'f*.com' matches 'foo.com')
This is spec compliant according to RFC 6125 Section 6.5.3 item 3.
There is one test where the matcher is more strict. Before,
'*.a.com' matched '.a.com'. Now it doesn't match. Strictly speaking
this is a security vulnerability.
See the inline comment for what's going on here.
There is magic built into the "ssl" module that ships with modern
CPython that knows how to load the system CA certificates on
Windows. Since we're not shipping a CA bundle with Mercurial,
if we're running on legacy CPython there's nothing we can do
to load CAs on Windows, so it makes sense to print a warning.
I don't anticipate many people will see this warning because
the official (presumed popular) Mercurial distributions on
Windows bundle Python and should be distributing a modern Python
capable of loading system CA certs.
The "certifi" Python package provides a distribution of the
Mozilla trusted CA certificates as a Python package. If it is
present, we assume the user intends it to be used and we use
it to provide the default CA certificates when certificates
are otherwise not configured.
It's worth noting that this behavior roughly matches the popular
"requests" package, which also attempts to use "certifi" if
present.
Before, devel.disableloaddefaultcerts only impacted the loading of
default certs via SSLContext. After this patch, the config option also
prevents sslutil._defaultcacerts() from being called.
This config option is meant to be used by tests to force no CA certs
to be loaded. Future patches will enable _defaultcacerts() to have
success more often. Without this change we can't reliably test the
failure to load CA certs. (This patch also likely fixes test failures
on some OS X configurations.)
Future patches will change _defaultcacerts() to do something
on platforms that aren't OS X. Change the comment and logged
message to reflect the future.
This is a fix to an old problem when Mercurial got confused by an
untracked folder with the same name as one of the files in a commit
hg was trying to update to. It is pretty safe to remove this folder if
it is empty. Backing up an empty folder seems to go against Mercurial's
"don't track dirs" philosophy.
hgweb currently offers limited functionality for "classifying"
repositories. This patch aims to change that.
The web.labels config option list is introduced. Its values
are exposed to the "index" and "summary" templates. Custom
templates can use template features like ifcontains() to e.g.
look for the presence of a specific label and engage specific
behavior. For example, a site operator may wish to assign a
"defunct" label to a repository so the repository is prominently
marked as dead in repository indexes.
Now that we extend matches backwards in the inner loop, the final
adjustment has no effect.
(A similar extension for the forward direction is trickier and has
less benefit.)
For very large diffs that have large numbers of identical lines (JSON
dumps) that also have large blocks of identical text, bdiff could become
confused about which block matches which because it can only match
very limited regions. The result is very large diffs for small sets of edits.
The earlier recursion rebalancing fix made this behavior more frequent because
it's now more prone to match block 1 to block 2. One frequent user of
large JSON files reported being unable to pass the resulting diffs
through their code review system.
Prior to this change, bdiff would calculate the length of a match at
(i, j) as 1 + length found at (i-1, j-1). With large number of popular
(ignored) lines, this often meant matches couldn't be extended
backwards at all and thus all matching regions were very small.
Disabling the popularity threshold is not an option because it brings
back quadratic behavior.
Instead, we extend a match backwards until we either found a previously
discovered match or we find a mismatching line. This thus successfully
bridges over any popular lines inside and before a matching region.
The larger regions then significant reduce the probability of confusion.
Usually, the heads will have the same ordering in handlecheckheads. Insisting
on the same ordering is however an unnecessary constraint that in some custom
cases can cause pushes to fail even though the actual heads didn't change. This
caused production issues for us in combination with the current version of
https://bitbucket.org/Unity-Technologies/hgwebcachingproxy/ .
Stripping has only partly worked since f41815302d49 (repair: use cg3
for treemanifests, 2016-01-19): the bundle seems to have been created
correctly, but revlog entries in subdirectory revlogs were not
stripped. This meant that e.g. "hg verify" would fail after stripping
in a tree manifest repo.
To find the revisions to strip, we simply iterate over all directories
in the repo (included in store.datafiles()). This is inefficient for
stripping few commits, but efficient for stripping many commits. To
optimize for stripping few commits, we could instead walk the tree
from the root and find modified subdirectories, just like we do in the
changegroup code. I'm leaving that for another day.
This is to make them wrap-able. chgserver wants to know if an extension
accesses config or environment variables during uisetup and extsetup and
include them in confighash accordingly.
The httplib library is renamed to http.client in python 3. So the
import is conditionalized and a test is added in check-code to warn
to use util.httplib
If no CA certificates are loaded, that is almost certainly a/the
reason certificate verification fails when connecting to a server.
The modern ssl module in Python 2.7.9+ provides an API to access
the list of loaded CA certificates. This patch emits a warning
on modern Python when certificate verification fails and there are
no loaded CA certificates.
There is no way to detect the number of loaded CA certificates
unless the modern ssl module is present. Hence the differences
in test output depending on whether modern ssl is available.
It's worth noting that a test which specifies a CA file still
renders this warning. That is because the certificate it is loading
is a x509 client certificate and not a CA certificate. This
test could be updated if anyone is so inclined.
Before, we would call SSLContext.load_default_certs() when
certificate verification wasn't being used. Since
SSLContext.verify_mode == ssl.CERT_NONE, this would ideally
no-op. However, there is a slim chance the loading of system
certs could cause a failure. Furthermore, this behavior
interfered with a future patch that aims to provide a more
helpful error message when we're unable to load CAs.
The lack of test fallout is hopefully a sign that our
security code and tests are in a relatively good state.
Before, sslcontext.load_verify_locations() would raise a
ssl.SSLError which would be caught further up the stack and converted
to a urlerror. By that time, we lost track of what actually errored.
Trapping the error here gives users a slightly more actionable error
message.
The behavior between Python <2.7.9 and Python 2.7.9+ differs. This
is because our fake SSLContext class installed on <2.7.9 doesn't
actually do anything during load_verify_locations: it defers actions
until wrap_socket() time. Unfortunately, a number of errors can occur
at wrap_socket() time and we're unable to ascertain what the root
cause is. But that shouldn't stop us from providing better error
messages to people running a modern and secure Python version.
Before the error was caught at func() as an unknown identifier, and the
optimizer failed to detect the syntax error. This patch introduces getsymbol()
helper to ensure that a string is not allowed as a function name.
It was mixing tabs and spaces, and not in a good way.
Indent style of other atom entries seems to be 1 space per level, so let's
apply it here as well.
Since content is of type "text" (and is already escaped), using a CDATA section
is not required.
Looks like this was just an artifact of copying things from rss style in
529b23a26574, because other entries in atom style don't use CDATA in such
places.
It was mixing tabs and spaces, and not in a good way.
Indent style of other rss entries seems to be 4 spaces per level, so let's
apply it here as well.
The directory argument (for tree manifests) should belong to to the
--dir argument. I had mistakenly made --dir a flag. One effect of this
was that I had meant for "-m" to be optional, but instead it changed
the behavior of --dir, so with "hg debugdata -m --dir dir1 0", the -m
took over and the "dir1" got treated as a revision in the root
manifest log.
Before 'hg push -B .' on new remote head complained with:
abort: push creates new remote head ...
It was because _nowarnheads was not expanding active bookmark
name, so it didn't add active bookmark "proper" name to no
warn heads list.
When we remove a changeset from the changelog, the phase cache must be
invalidated, otherwise it could refer to changesets that are no longer in the
repo.
To reproduce the failure, I created an extension querying the phase cache after
the strip transaction is over.
To do that, I stripped two commits with a bookmark on one of them to force
another transaction (we open a transaction for moving bookmarks)
after the strip transaction.
Without the fix in this patch, the test leads to a stacktrace showing the issue:
repair.strip(ui, repo, revs, backup)
File "/Users/lcharignon/facebook-hg-rpms/hg-crew/mercurial/repair.py", line 205, in strip
tr.close()
File "/Users/lcharignon/facebook-hg-rpms/hg-crew/mercurial/transaction.py", line 44, in _active
return func(self, *args, **kwds)
File "/Users/lcharignon/facebook-hg-rpms/hg-crew/mercurial/transaction.py", line 490, in close
self._postclosecallback[cat](self)
File "$TESTTMP/crashstrip2.py", line 4, in test
[repo.changelog.node(r) for r in repo.revs("not public()")]
File "/Users/lcharignon/facebook-hg-rpms/hg-crew/mercurial/changelog.py", line 337, in node
return super(changelog, self).node(rev)
File "/Users/lcharignon/facebook-hg-rpms/hg-crew/mercurial/revlog.py", line 377, in node
return self.index[rev][7]
IndexError: revlog index out of range
The situation was encountered in inhibit (evolve's repo) where we would crash
following the volatile set invalidation submitted by Augie in
cbc52a99d057d11790cf5011e877c6f698bf57bf. Before his patch the issue was masked
as we were not accessing the phasecache after stripping a revision.
This bug uncovered another but in histedit (see explanation in issue5235).
I changed the histedit test accordingly to avoid fixing two things at once.
If you have just executable-bit change and amend it twice it will vanish:
* After the first amend the commit will have the proper executable bit set
in manifest but it won't have the the file on the list of files in
changelog.
* The second amend will read the wrong list of files from changelog and it
will copy the manifest entry from parent for this file.
* Voila! The change is lost.
This change repairs the bug in localrepo causing this and adds a test for it.
Before this patch, "hg help topic.section" might show unexpected
section of help topic in some encoding.
It applies str.lower() instead of encoding.lower(str) on translated
message to search section case-insensitively, but some encoding uses
0x41(A) - 0x5a(Z) as the second or later byte of multi-byte character
(for example, ja_JP.cp932), and str.lower() causes unexpected result.
To search section of help topic by translated section name correctly,
this patch replaces str.lower() by encoding.lower(str) for both query
string (in commands.help()) and translated help text (in
minirst.getsections()).
Before this patch, patch.filterpatch() shows meaningless translation
of help message for chunk selection in some encoding.
It applies str.lower() instead of encoding.lower(str) on translated
message, but some encoding uses 0x41(A) - 0x5a(Z) as the second or
later byte of multi-byte character (for example, ja_JP.cp932), and
str.lower() causes unexpected result.
To show lower-ed translated message correctly, this patch replaces
str.lower() by encoding.lower(str).
This corrects a warning from lintian that we're shipping an executable without
a man page. Since there is a doc string in the text, let's use that for the man
page.
The progress bar was being cleared on every write(), regardless of
whether it was currently displayed. This could foul up the display of
any writes that didn't include a linebreak.
In particular, the win32 mode of the color extension was turning
single prompt string writes into two writes, and the resulting
clear/write/clear/write pattern was making the prompt invisible.
We fix this by insisting that we have shown a progress bar and haven't
just cleared it (setting lastprint to 0).
Conveniently, the test suite already had instances of duplicate
clears.. that are now cleared up.
getbundle was requesting the "phase" namespace instead of the "phases"
namespace, which led to the client still requesting the phases
separately after getbundle finished.
In the 'hg pull --rebase', we don't want to pick a rebase destination unrelated
to the pull, we lay down basic infrastructure to allow such restriction on
stable (before 3.8 release) in this case. See issue 5214 for details.
Actual usage and test will be in the next patch.
This effectively backs out changeset 60b56b3206cc.
The http library behind ui.http2=true isn't specifying the hostname.
It is the day before the expected 3.8 release and we don't want to ship
a regression.
I'll try to restore this requirement in the 3.9 release cycle as part
of planned improvements to Mercurial's SSL/TLS interactions.
The atomictemp.close() file attempts to do a rename, which can fail.
Moving the close inside the exception handler fixes it.
This doesn't fit well with the with: pattern, as it's the finalizer
that's failing.
tryread() doesn't handle "is a directory" errors and presumably
others. We might not want to globally swallow such tryread errors, so
we replace with our own try/except handling.
An upcoming test will use directories as a portable stand-in for
various bizarre circumstances that cache read/write code should be
robust to.
Properly shell quote arguments, to avoid printing commands that won't work when
run literally. For example, a date string with timestamp needs to be quoted:
--date '1456953053 28800'
Initializing a subrepo when one doesn't exist is the right thing to do when the
parent is being updated, but in few other cases. Unfortunately, there isn't
enough context in the subrepo module to distinguish this case. This same issue
can be caused with other subrepo aware commands, so there is a general issue
here beyond the scope of this fix.
A simpler attempt I tried was to add an '_updating' boolean to localrepo, and
set/clear it around the call to mergemod.update() in hg.updaterepo(). That
mostly worked, but doesn't handle the case where archive will clone the subrepo
if it is missing. (I vaguely recall that there may be other commands that will
clone if needed like this, but certainly not all do. It seems both handy, and a
bit surprising for what should be a read only operation. It might be nice if
all commands did this consistently, but we probably need Angel's subrepo caching
first, to not make a mess of the working directory.)
I originally handled 'Exception' in order to pick up the Aborts raised in
subrepo.state(), but this turns out to be unnecessary because that is called
once and cached by ctx.sub() when iterating the subrepos.
It was suggested in the bug discussion to skip looking at the subrepo links
unless -S is specified. I don't really like that idea because missing a subrepo
or (less likely, but worse) a corrupt .hgsubstate is a problem of the parent
repo when checking out a revision. The -S option seems like a better fit for
functionality that would recurse into each subrepo and do a full verification.
Ultimately, the default value for 'allowcreate' should probably be flipped, but
since the default behavior was to allow creation, this is less risky for now.
LoadLibrary() changes behavior depending on whether the argument
passed to it contains a period. From the MSDN docs:
If no file name extension is specified in the lpFileName parameter,
the default library extension .dll is appended. However, the file name
string can include a trailing point character (.) to indicate that the
module name has no extension. When no path is specified, the function
searches for loaded modules whose base name matches the base name of
the module to be loaded. If the name matches, the load succeeds.
Otherwise, the function searches for the file.
As the subsequent patch will show, some environments on Windows
define their Python library as e.g. "libpython2.7.dll." The existing
code would pass "libpython2.7" into LoadLibrary(). It would assume
"7" was the file extension and look for a "libpython2.dll" to load.
By passing ".dll" into LoadLibrary(), we force it to search for the
exact basename we want, even if it contains a period.
The old "update across branches if no uncommitted changes" made
it sound like updating across branches (with no uncommitted changes)
was allowed only with this option, which was not true. Also, the option
did not care whether it was linear or across branches. Instead, it
checked that there were no uncommitted changes. Let's explain what it
does instead of trying to suggest what happens without it.
Update makedirs() to ignore EEXIST in case someone else has already created the
directory in question. Previously the ensuredirs() function existed, and was
nearly identical to makedirs() except that it fixed this race. Unfortunately
ensuredirs() was only used in 3 places, and most code uses the racy makedirs()
function. This fixes makedirs() to be non-racy, and replaces calls to
ensuredirs() with makedirs().
In particular, mercurial.scmutil.origpath() used the racy makedirs() code,
which could cause failures during "hg update" as it tried to create backup
directories.
This does slightly change the behavior of call sites using ensuredirs():
previously ensuredirs() would throw EEXIST if the path existed but was a
regular file instead of a directory. It did this by explicitly checking
os.path.isdir() after getting EEXIST. The makedirs() code did not do this and
swallowed all EEXIST errors. I kept the makedirs() behavior, since it seemed
preferable to avoid the extra stat call in the common case where this directory
already exists. If the path does happen to be a file, the caller will almost
certainly fail with an ENOTDIR error shortly afterwards anyway. I checked
the 3 existing call sites of ensuredirs(), and this seems to be the case for
them.
This causes the longest_match search to limit itself to a window of
30000 lines during search (roughly 1MB), thus avoiding a full O(N*M)
search that might occur in repetitive structured inputs. For a
particular class of many MB pathological test cases, this generated
the following timings:
size before after
10x 1.25s 1.24s
100x 57s 33s
1000x >8400s 400s
The times on the right quickly become much faster and appear more linear.
While windowing means deltas are no longer "optimal", the resulting
deltas were within a couple percent of expected size. While we've yet
to have a report of a file with the level of repetition necessary to
hit this case, some JSON/XML database dump scenario is fairly likely
to hit it.
This may also slightly improve the average-case performance for deltas
of large binaries.
For highly structured files like JSON or XML dumps with large numbers
of duplicate lines (eg braces) and isolated matching lines, bdiff
could find large numbers of equally good spans. Because it prefers
earlier matches, this would result in pathologically unbalance
recursion that resulted in quadratic performance.
This patch makes it prefer matches closer to the middle that tend to
balance recursion. This change improves the speed of a pathological
test case from 1100s to 9s.
Included is a smaller test that has a roughly 50x safety margin on the
performance it accepts. It's likely to fail on pure builds because
difflib also has a recursion-balancing problem.
The longest_match code compares all the possible positions in two
files to find the best match. Given a pair of sequences, it
effectively searches a grid like this:
a b b b c . d e . f
0 1 2 3 4 5 6 7 8 9
a 1 - - - - - - - - -
b - 2 1 1 - - - - - -
b - 1 3 2 - - - - - -
b - 1 2 4 - - - - - -
. - - - - - 1 - - 1 -
Here, the 4 in the middle says "the first four lines of the
file match", which it can compute be comparing the fourth lines and
then adding one to the result found when comparing the third lines in
the entry to the upper left.
We generally avoid the quadratic worst case by only looking at lines
that match, which is precomputed. We also avoid quadratic storage by
only keeping a single column vector and then keeping track of the best
match.
Unfortunately, this can get us into trouble with the sequences above.
Because we want to reuse the '3' value when calculating the '4', we
need to be careful not to overwrite it with the '2' we calculate
immediately before. If we scan left to right, top to bottom, we're
going to have a problem: we'll overwrite our 3 before we use it and
calculate a suboptimal best match.
To address this, we can either keep two column vectors and swap
between them (which significantly complicates bookkeeping), or change
our scanning order. If we instead scan from left to right, bottom to
top, we'll avoid ever overwriting values we'll need in the future.
This unfortunately needs several changes to be made simultaneously:
- change the order we build the initial hash chains for the b sequence
- change the sentinel values from INT_MAX to -1
- change the visit order in the longest_match inner loop
- add a tie-breaker preference for earlier matches
This last is needed because we previously had an implicit tie-breaker
from our visitation order that our test suite relies on. Later matches
can also trigger a bug in the normalization code in diff().
This bug is hidden by the current bias towards matches at the
beginning of the file. When this bias is tweaked later to address
recursion balancing, the normalization code could cause the next block
to shrink to a negative length, thus creating invalid delta chunks. We
add checks here to disallow that.
This bug requires test cases that are an awkwardly large size for the test
suite, but is very rapidly picked up by the included torture tester.
This just makes the code harder to read without any performance
advantage. We're going to make the check here more complex, let's make
it simpler first.
When displaying patches from graphical tools where you can browse through
individual files, with diff being called separately on each, recomputing the
limits of file copy history can become rather expensive on large repositories.
Instead, we can compute it once and pass it in for subsequent calls.
match() is the special case of a single element list being passed
to matchany() with the additional error checking that the revset
spec is defined. Change the implementation to remove the redundant
code and have match() call matchany().
I can never remember the differences between the various revset
APIs. I can never remember that scmutil.revrange() is the one I
want to use from user-facing commands.
Add some documentation to clarify this.
While we're here, the argument name for revrange() is changed to
"specs" because that's what it actually is.
Other parts of this expression are already using unicode literals.
We need this to make Python 3 happy and to avoid an implicit
conversion in Python 2.
Now that we have a mechanism for declaring path sub-options, we can
start to pile on features!
Many power users have expressed frustration that bare `hg push`
attempts to push all local revisions to the remote. This patch
introduces the "pushrev" path sub-option to control which revisions
are pushed when no "-r" argument is specified.
The value of this sub-option is a revset, naturally.
A future feature addition could potentially introduce a "pushnames"
sub-options that declares the list of names (branches, bookmarks,
topics, etc) to push by default. The entire "what to push by default"
feature should probably be considered before this patch lands.
As part of developing a subsequent patch I discovered that sub-option
values like "." were getting converted to paths. This is because the
[paths] section is treated specially during config loading.
This patch prevents post-processing sub-options from the [paths]
section.
Previously, when we connected to a server and were unable to verify
its certificate against a trusted certificate authority we would
issue a warning and continue to connect. This is obviously not
great behavior because the x509 certificate model is based upon
trust of specific CAs. Failure to enforce that trust erodes security.
This behavior was defined several years ago when Python did not
support loading the system trusted CA store (Python 2.7.9's
backports of Python 3's improvements to the "ssl" module enabled
this).
This commit changes behavior when connecting to abort if the peer
certificate can't be validated. With an empty/default Mercurial
configuration, the peer certificate can be validated if Python is
able to load the system trusted CA store. Environments able to load
the system trusted CA store include:
* Python 2.7.9+ on most platforms and installations
* Python 2.7 distributions with a modern ssl module (e.g. RHEL7's
patched 2.7.5 package)
* Python shipped on OS X
Environments unable to load the system trusted CA store include:
* Python 2.6
* Python 2.7 on many existing Linux installs (because they don't
ship 2.7.9+ or haven't backported modern ssl module)
* Python 2.7.9+ on some installs where Python is unable to locate
the system CA store (this is hopefully rare)
Users of these Pythongs will need to configure Mercurial to load the
system CA store using web.cacerts. This should ideally be performed
by packagers (by setting web.cacerts in the global/system hgrc file).
Where Mercurial packagers aren't setting this, the linked URL in the
new abort message can contain instructions for users.
In the future, we may want to add more code for finding the system
CA store. For example, many Linux distributions have the CA store
at well-known locations (such as /etc/ssl/certs/ca-certificates.crt
in the case of Ubuntu). This will enable CA loading to "just work"
on more Python configurations and will be best for our users since
they won't have to change anything after upgrading to a Mercurial
with this patch.
We may also want to consider distributing a trusted CA store with
Mercurial. Although we should think long and hard about that because
most systems have a global CA store and Mercurial should almost
certainly use the same store used by everything else on the system.
The ordering of 'x & head()' was broken in 329d82866742 (revset:
improve head revset performance, 2014-03-13). Presumably due to other
optimizations since then, undoing that change to fix the order does
not slow down the simple case of "hg log -r 'head()'" mentioned in
that commit. I see a small slowdown from ~0.16s to about ~0.19s with
'not 0 & head()', but I'd say it's worth it for the correct output.
Before this patch, "missing _() in ui message" rule overlooks
translatable message, which starts with other than alphabet.
To detect "missing _() in ui message" more exactly, this patch
improves the regexp with assumptions below.
- sequence consisting of below might precede "translatable message"
in same string token
- formatting string, which starts with '%'
- escaped character, which starts with 'b' (as replacement of '\\'), or
- characters other than '%', 'b' and 'x' (as replacement of alphabet)
- any string tokens might precede a string token, which contains
"translatable message"
This patch builds an input file, which is used to examine "missing _()
in ui message" detection, before '"$check_code" stringjoin.py' in
test-contrib-check-code.t, because this reduces amount of change churn
in subsequent patch.
This patch also applies "()" instead of "_()" on messages below to
hide false-positives:
- messages for ui.debug() or debug commands/tools
- contrib/debugshell.py
- hgext/win32mbcs.py (ui.write() is used, though)
- mercurial/commands.py
- _debugchangegroup
- debugindex
- debuglocks
- debugrevlog
- debugrevspec
- debugtemplate
- untranslatable messages
- doc/gendoc.py (ReST specific text)
- hgext/hgk.py (permission string)
- hgext/keyword.py (text written into configuration file)
- mercurial/cmdutil.py (formatting strings for JSON)
Before fd1bb7c, if the C index.partialmatch raises RevlogError, the Python
code raises "ambiguous identifier" error immediately, which is efficient.
fd1bb7c took hidden revisions into consideration and forced the slow path
enumerating the changelog to double-check hidden revisions. But it's not
necessary if we know the revlog has no hidden revisions.
This patch adds back the fast path for unfiltered revlogs.
I.e. when a revision blames a block of source lines, only display the
revision link on the first line of the block (this is identified by the
"blockhead" key in annotate context).
This addresses item "Visual grouping of changesets" of the blame improvements
plan (https://www.mercurial-scm.org/wiki/BlamePlan) which states: "Typically
there are block of lines all attributed to the same revision. Instead of
rendering the revision/changeset for every line, we could only render it once
per block."
* Distinguish the /annotate/<revision>/<file>#<linenumber> link when it would
lead to the current page (i.e. <revision> is the current revision) (style it
gray and undecorated). This indicates more clearly that this is a "dead-end"
in blame navigation.
* Display lines changed in current revision in green.
Change summary webcommand to yield each element of the shortlog instead of the
entire list.
This makes generated json more readable since each entry can be formatted
separately, instead of returning all the shortlog content in a single string.
Modify changelistentry structure to also deliver phase and branch data and use
either 'parents' or 'allparents' depending on what is defined in the view, in
order to reuse it in filelog structure.
Before this commit url.opener overwritten stored password
for connection with given url/user even when
new password for given connection was not filled. This
commit makes opener overwrites saved authentication only
when it contains password.
This makes http password database stored in ui object.
It allows reusing authentication information when we
use this database for creating password manager for
the new connection.
So far password manager was keeping authentication information so opening
new connection and creating new password manager made all saved authentication
information lost.
This commit separates password manager and password database to make it
possible to reuse saved authentication information.
This commit violates code checker because it adds add_password method (name
with underscore) to passwordmgr object to provide method required by urllib2.
Before this patch, "from a import b" doesn't delay loading module "b",
if absolute_import is enabled, even though "from . import b" does.
For example:
- it is assumed that extension X has "from P import M" for module M
under package P with absolute_import feature
- if importing module M is already delayed before loading extension
X, loading module M in extension X is delayed until actually
referring
util, cmdutil, scmutil or so of Mercurial itself should be
imported by "from . import M" style before loading extension X
- otherwise, module M is loaded immediately at loading extension X,
even if extension X itself isn't used at that "hg" command invocation
Some minor modules (e.g. filemerge or so) of Mercurial itself
aren't imported by "from . import M" style before loading
extension X. And of course, external libraries aren't, too.
This might cause startup performance problem of hg command, because
many bundled extensions already enable absolute_import feature.
To delay loading module for "from a import b" with absolute_import
feature, this patch does below in "from a (or .a) import b" with
absolute_import case:
1. import root module of "name" by system built-in __import__
(referred as _origimport)
2. recurse down the module chain for hierarchical "name"
This logic can be shared with non absolute_import
case. Therefore, this patch also centralizes it into chainmodules().
3. and fall through to process elements in "fromlist" for the leaf
module of "name"
Processing elements in "fromlist" is executed in the code path
after "if _pypy: .... else: ..." clause. Therefore, this patch
replaces "if _pypy:" with "elif _pypy:" to share it.
At faecf59a4184 introducing original "work around" for "from a import
b" case, elements in "fromlist" were imported with "level=level". But
"level" might be grater than 1 (e.g. level=2 in "from .. import b"
case) at demandimport() invocation, and importing direct sub-module in
"fromlist" with level grater than 1 causes unexpected result.
IMHO, this seems main reason of "errors for unknown reason" described
in faecf59a4184, and we don't have to worry about it, because this
issue was already fixed by 2711f50242cf.
This is reason why this patch removes "errors for unknown reasons"
comment.
To make it easier to patch the wrapped function, make it possible to access the
filecache descriptor directly on the class (rather than have to use
ClassObject.__dict__['attributename']). Returning `self` when the first
argument to `__get__` is `None` makes the descriptor behave the same way
`property` objects do.
When grafting/rebasing, it is common for multiple changesets to make
the same change to a subdirectory. When writing the revlog for the
directory, the revlog code already takes care of not writing the entry
again. In 3eb9fa4180d3 (changegroup: prune subdirectory dirlogs too,
2016-02-12), I added the corresponding code in changegroup (not
sending entries the client already has), but I forgot to avoid sending
the entire changegroup if no nodes remained in the pruned
set. Although that's harmless besides the wasted network traffic, the
receiving side was checking for it (copied from the changegroup code
for handling files). This resulted in the client crashing with:
abort: received dir revlog group is empty
Fix by simply not emitting a changegroup for the directory if there
were no changes is it. This matches how files are handled.
This will allow us to clear in-memory password storage per runcommand().
I've updated commandserver to call resetstate() of both ui and repo.ui because
they may have different states in theory.
Make it noop as before ddf6bfe09ab2. We could change it to an error, but
allowing empty key makes some sense for scripting that builds a key string
programmatically.
In some cases below, copying from backup is used to restore original
contents of a file, which is backuped via addfilegenerator(). If
copying keeps ctime, mtime and size of a file, restoring is
overlooked, and old contents cached before restoring isn't invalidated
as expected.
- failure of transaction (from '.hg/journal.backup.*')
- rollback of previous transaction (from '.hg/undo.backup.*')
To avoid ambiguity of file stat at restoring, this patch invokes
util.copyfile() with checkambig=True.
This patch is a part of "Exact Cache Validation Plan":
https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan
Rollback of previous transaction restores contents of files below by
renaming from 'undo.*' file. If renaming keeps ctime, mtime and size
of a file, restoring is overlooked, and old contents cached before
restoring isn't invalidated as expected.
- .hg/bookmarks
- .hg/phaseroots
To avoid ambiguity of file stat at restoring, this patch invokes
vfs.rename() with checkambig=True.
BTW, .hg/dirstate is also restored at rollback. But it is restored by
dirstate.restorebackup(), and previous patch already made it invoke
vfs.rename() with checkambig=True.
This patch is a part of "Exact Cache Validation Plan":
https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan
File .hg/dirstate is restored by renaming from backup in failure
inside scopes below. If renaming keeps ctime, mtime and size of a
file, restoring is overlooked, and old contents cached before
restoring isn't invalidated as expected.
- dirstateguard scope (from '.hg/dirstate.SUFFIX')
- transaction scope (from '.hg/journal.dirstate')
To avoid ambiguity of file stat at restoring, this patch invokes
vfs.rename() with checkambig=True.
This patch is a part of "Exact Cache Validation Plan":
https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan
Sort revisions in reverse revision order but grouped by topographical branches.
Visualised as a graph, instead of:
o 4
|
| o 3
| |
| o 2
| |
o | 1
|/
o 0
revisions on a 'main' branch are emitted before 'side' branches:
o 4
|
o 1
|
| o 3
| |
| o 2
|/
o 0
where what constitutes a 'main' branch is configurable, so the sort could also
result in:
o 3
|
o 2
|
| o 4
| |
| o 1
|/
o 0
This sort was already available as an experimental option in the graphmod
module, from which it is now removed.
This sort is best used with hg log -G:
$ hg log -G "sort(all(), topo)"
This used to be needed to paper over hashlib not being in all Pythons
we support, but that's not a problem anymore, so we can simplify
things a little bit.
All versions of Python we support or hope to support make the hash
functions available in the same way under the same name, so we may as
well drop the util forwards.