Rename unstable to orphan in all external user-facing output. Only update
user-facing output for the moment, variables names, templates keyword and
potentially configuration would be done in later series.
The renaming is done according to
https://www.mercurial-scm.org/wiki/CEDVocabulary.
Differential Revision: https://phab.mercurial-scm.org/D214
Rename trouble(s) to instability in all external user-facing output. Only
update user-facing output for the moment, variables names, templates keyword
and potentially configuration would be done in later series.
The renaming is done according to
https://www.mercurial-scm.org/wiki/CEDVocabulary.
Differential Revision: https://phab.mercurial-scm.org/D213
When a transaction is started, we must load the hookargs from the
bundleoperation object to the transaction so that they can be used in the
transaction. Also this patch makes sure no more hookargs are added to the
bundleoperation object once the transaction starts.
This is a part of porting fb extension bundle2hooks to core.
Differential Revision: https://phab.mercurial-scm.org/D209
There are extensions like pushrebase, pushvars which run hooks on a server
before taking the lock. Since the lock is not taken, transaction is not there,
so the hookargs can't be stored on the transaction. Adding hooksargs to bundle
operation object will help in running hooks before taking the lock.
This is a part of moving fb's extension bundle2hooks to core.
Differential Revision: https://phab.mercurial-scm.org/D208
This was previously landed as 4bc0c14fb501 but backed out in b63351f6a2 because
it broke hooks mid-rebase and caused conflict resolution data loss in the event
of unexpected exceptions. This new version adds the behavior back but behind a
config flag, since the performance improvement is notable in large repositories.
The old commit message was:
Recently we switched rebases to run the entire rebase inside a single
transaction, which dramatically improved the speed of rebases in repos with
large working copies. Let's also move the dirstate into a single dirstateguard
to get the same benefits. This let's us avoid serializing the dirstate after
each commit.
In a large repo, rebasing 27 commits is sped up by about 20%.
I believe the test changes are because us touching the dirstate gave the
transaction something to actually rollback.
(grafted from 9e3dc3a1638b9754b58a0cb26aaa75d868058109)
(grafted from 7d38b41d2266d9a02a15c64229fae0da5738dcec)
Differential Revision: https://phab.mercurial-scm.org/D135
This allows us to read a customized range of markers, instead of loading all
of them.
The condition of stop is made consistent across C and Python implementation
so we will still read marker when offset=a, stop=a+1.
Previously, commandserver was using select.select. That could have issue if
_sock.fileno() >= FD_SETSIZE (usually 1024), which raises:
ValueError: filedescriptor out of range in select()
We got that in production today, although it's the code opening that many
files to blame, it seems better for commandserver to work in this case.
There are multiple way to "solve" it, like preserving a fd with a small
number and swap it with sock using dup2(). But upgrading to a modern
selector supported by the system seems to be the most correct way.
This library was a backport of the Python 3 "selectors" library. It is
useful to provide a better selector interface for Python2, to address some
issues of the plain old select.select, mentioned in the next patch.
The code [1] was ported using the MIT license, with some minor modifications
to make our test happy:
1. "# no-check-code" was added since it's foreign code.
2. "from __future__ import absolute_import" was added.
3. "from collections import namedtuple, Mapping" changed to avoid direct
symbol import.
[1]: d27dbd2fdc/selectors2.py
# no-check-commit
I was several directories deep in the kernel tree, ran `hg add`, and got the
warning about the size of one of the files. I noticed that it suggested undoing
the add with a specific revert command. The problem is, it would have failed
since the path printed was relative to the repo root instead of cwd. While
here, I just fixed the other messages too. As an added benefit, these messages
now look the same as the verbose/inexact messages for the corresponding command.
I don't think most of these messages are reachable (typically the corresponding
cmdutil function does the check). I wasn't able to figure out why the keyword
tests were failing when using pathto()- I couldn't cause an absolute path to be
used by manipulating the --cwd flag on a regular add. (I did notice that
keyword is adding the file without holding wlock.)
This is done by a script [2] using RedBaron [1], a tool designed for doing
code refactoring. All "default" values are decided by the script and are
strongly consistent with the existing code.
There are 2 changes done manually to fix tests:
[warn] mercurial/exchange.py: experimental.bundle2-output-capture: default needs manual removal
[warn] mercurial/localrepo.py: experimental.hook-track-tags: default needs manual removal
Since RedBaron is not confident about how to indent things [2].
[1]: https://github.com/PyCQA/redbaron
[2]: https://github.com/PyCQA/redbaron/issues/100
[3]:
#!/usr/bin/env python
# codemod_configitems.py - codemod tool to fill configitems
#
# Copyright 2017 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
from __future__ import absolute_import, print_function
import os
import sys
import redbaron
def readpath(path):
with open(path) as f:
return f.read()
def writepath(path, content):
with open(path, 'w') as f:
f.write(content)
_configmethods = {'config', 'configbool', 'configint', 'configbytes',
'configlist', 'configdate'}
def extractstring(rnode):
"""get the string from a RedBaron string or call_argument node"""
while rnode.type != 'string':
rnode = rnode.value
return rnode.value[1:-1] # unquote, "'str'" -> "str"
def uiconfigitems(red):
"""match *.ui.config* pattern, yield (node, method, args, section, name)"""
for node in red.find_all('atomtrailers'):
entry = None
try:
obj = node[-3].value
method = node[-2].value
args = node[-1]
section = args[0].value
name = args[1].value
if (obj in ('ui', 'self') and method in _configmethods
and section.type == 'string' and name.type == 'string'):
entry = (node, method, args, extractstring(section),
extractstring(name))
except Exception:
pass
else:
if entry:
yield entry
def coreconfigitems(red):
"""match coreconfigitem(...) pattern, yield (node, args, section, name)"""
for node in red.find_all('atomtrailers'):
entry = None
try:
args = node[1]
section = args[0].value
name = args[1].value
if (node[0].value == 'coreconfigitem' and section.type == 'string'
and name.type == 'string'):
entry = (node, args, extractstring(section),
extractstring(name))
except Exception:
pass
else:
if entry:
yield entry
def registercoreconfig(cfgred, section, name, defaultrepr):
"""insert coreconfigitem to cfgred AST
section and name are plain string, defaultrepr is a string
"""
# find a place to insert the "coreconfigitem" item
entries = list(coreconfigitems(cfgred))
for node, args, nodesection, nodename in reversed(entries):
if (nodesection, nodename) < (section, name):
# insert after this entry
node.insert_after(
'coreconfigitem(%r, %r,\n'
' default=%s,\n'
')' % (section, name, defaultrepr))
return
def main(argv):
if not argv:
print('Usage: codemod_configitems.py FILES\n'
'For example, FILES could be "{hgext,mercurial}/*/**.py"')
dirname = os.path.dirname
reporoot = dirname(dirname(dirname(os.path.abspath(__file__))))
# register configitems to this destination
cfgpath = os.path.join(reporoot, 'mercurial', 'configitems.py')
cfgred = redbaron.RedBaron(readpath(cfgpath))
# state about what to do
registered = set((s, n) for n, a, s, n in coreconfigitems(cfgred))
toregister = {} # {(section, name): defaultrepr}
coreconfigs = set() # {(section, name)}, whether it's used in core
# first loop: scan all files before taking any action
for i, path in enumerate(argv):
print('(%d/%d) scanning %s' % (i + 1, len(argv), path))
iscore = ('mercurial' in path) and ('hgext' not in path)
red = redbaron.RedBaron(readpath(path))
# find all repo.ui.config* and ui.config* calls, and collect their
# section, name and default value information.
for node, method, args, section, name in uiconfigitems(red):
if section == 'web':
# [web] section has some weirdness, ignore them for now
continue
defaultrepr = None
key = (section, name)
if len(args) == 2:
if key in registered:
continue
if method == 'configlist':
defaultrepr = 'list'
elif method == 'configbool':
defaultrepr = 'False'
else:
defaultrepr = 'None'
elif len(args) >= 3 and (args[2].target is None or
args[2].target.value == 'default'):
# try to understand the "default" value
dnode = args[2].value
if dnode.type == 'name':
if dnode.value in {'None', 'True', 'False'}:
defaultrepr = dnode.value
elif dnode.type == 'string':
defaultrepr = repr(dnode.value[1:-1])
elif dnode.type in ('int', 'float'):
defaultrepr = dnode.value
# inconsistent default
if key in toregister and toregister[key] != defaultrepr:
defaultrepr = None
# interesting to rewrite
if key not in registered:
if defaultrepr is None:
print('[note] %s: %s.%s: unsupported default'
% (path, section, name))
registered.add(key) # skip checking it again
else:
toregister[key] = defaultrepr
if iscore:
coreconfigs.add(key)
# second loop: rewrite files given "toregister" result
for path in argv:
# reconstruct redbaron - trade CPU for memory
red = redbaron.RedBaron(readpath(path))
changed = False
for node, method, args, section, name in uiconfigitems(red):
key = (section, name)
defaultrepr = toregister.get(key)
if defaultrepr is None or key not in coreconfigs:
continue
if len(args) >= 3 and (args[2].target is None or
args[2].target.value == 'default'):
try:
del args[2]
changed = True
except Exception:
# redbaron fails to do the rewrite due to indentation
# see https://github.com/PyCQA/redbaron/issues/100
print('[warn] %s: %s.%s: default needs manual removal'
% (path, section, name))
if key not in registered:
print('registering %s.%s' % (section, name))
registercoreconfig(cfgred, section, name, defaultrepr)
registered.add(key)
if changed:
print('updating %s' % path)
writepath(path, red.dumps())
if toregister:
print('updating configitems.py')
writepath(cfgpath, cfgred.dumps())
if __name__ == "__main__":
sys.exit(main(sys.argv[1:]))
This was only used by the sparse extension's dirstate._ignore
override, which no longer exists.
Differential Revision: https://phab.mercurial-scm.org/D60
This is a Windows only thing. Unfortunately, the socket is closed at this point
(so the certificate is unavailable to check the chain). That means it's printed
out when verification fails as a guess, on the assumption that 1) most of the
time verification won't fail, and 2) sites using expired or certs that are too
new will be rare. Maybe this is an argument for adding more functionality to
debugssl, to test for problems and print certificate info. Or maybe it's an
argument for bundling certificates with the Windows builds. That idea was set
aside when the enhanced SSL code went in last summer, and it looks like there
were issues with using certifi on Windows anyway[1].
This was tested by deleting the certificate out of certmgr.msc > "Third-Party
Root Certification Authorities" > "Certificates", seeing `hg pull` fail (with
the new message), trying this command, and then successfully performing the pull
command.
[1] https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-October/089573.html
This is only useful on Windows, and avoids the need to use Internet Explorer to
build the certificate chain. I can see this being extended in the future to
print information about the certificate(s) to help debug issues on any platform.
Maybe even perform some of the python checks listed on the secure connections
wiki page. But for now, all I need is 1) a command that can be invoked in a
setup script to ensure the certificate is installed, and 2) a command that the
user can run if/when a certificate changes in the future.
It would have been nice to leverage the sslutil library to pick up host specific
settings, but attempting to use sslutil.wrapsocket() failed the
'not sslsocket.cipher()' check in it and aborted.
The output is a little more chatty than some commands, but I've seen the update
take 10+ seconds, and this is only a debug command.
I started a thread[1] on the mailing list awhile ago, but the short version is
that Windows doesn't ship with a full list of certificates[2]. Even if the
server sends the whole chain, if Windows doesn't have the appropriate
certificate pre-installed in its "Third-Party Root Certification Authorities"
store, connections mysteriously fail with:
abort: error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:661)
Windows expects the application to call the methods invoked here as part of the
certificate verification, triggering a call out to Windows update if necessary,
to complete the trust chain. The python bug to add this support[3] hasn't had
any recent activity, and isn't targeting py27 anyway.
The only work around that I could find (besides figuring out the certificate and
walking through the import wizard) is to browse to the site in Internet
Explorer. Opening the page with FireFox or Chrome didn't work. That's a pretty
obscure way to fix a pretty obscure problem. We go to great lengths to
demystify various SSL errors, but this case is clearly lacking. Let's try to
make things easier to diagnose and fix.
When I had trouble figuring out how to get ctypes to work with all of the API
pointers, I found that there are other python projects[4] using this API to
achieve the same thing.
[1] https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-April/096501.html
[2] https://support.microsoft.com/en-us/help/931125/how-to-get-a-root-certificate-update-for-windows
[3] https://bugs.python.org/issue20916
[4] 3b86bce206/source/updateCheck.py (L511)
There is still some use of 'deletedivergent' bookmark here. They will be taken
care of later. The 'deletedivergent' code needs some rework before fitting in
the new world.
We want to track bookmark movement within a transaction. For this we need a
more centralized way to update bookmarks.
For this purpose we introduce a new 'applychanges' method that apply a list of
changes encoded as '(name, node)'. We'll cover all bookmark updating code to
this new method in later changesets and add bookmark move in the transaction
when all will be migrated.
If a matcher doesn't implement visitdir, we should be returning True so that
tree traversals are not prematurely pruned. The old value of False would prevent
tree traversals when using any matcher that didn't implement visitdir.
Differential Revision: https://phab.mercurial-scm.org/D83
Thinking a bit further about list/dict subscript operation (proposed by
issue 5534), I noticed the current data structure, a dict of dicts, might
not be ideal.
For example, if there were "'[' index ']'" and "'.' key" operators,
"{parents[0]}" would return "{p1rev}:{p1node}", and we would probably want to
write "{parents[0].desc}" to get the first element of "{parents % "{desc}"}".
This will basically execute parents[0].makemap()['desc'] in Python.
Given the rule above, "{peerpaths.default.pushurl}" will be translated to
peerpaths['default'].makemap()['pushurl'], which means {peerpaths} should
be a single-level dict and sub-options should be makemap()-ed.
"{peerpaths % "{name} = {url}, {pushurl}, ..."}"
(Well, it could be peerpaths['default']['pushurl'], but in which case,
peerpaths['default'] should be a plain dict, not a hybrid object.)
So, let's mark the current implementation experimental and revisit it later.
find_deepest is used to find the "best" ancestors given a list. In the main
loop it keeps an invariant called 'ninteresting' which is supposed to contain
the number of non-zero entries in the 'interesting' array. This invariant is
incorrectly maintained, however, which leads the the algorithm returning an
empty result for certain graphs. This has been fixed.
Also, the 'interesting' array is supposed to fit 2^ancestors values, but is
incorrectly allocated to twice that size. This has been fixed as well.
The tests in test-ancestor.py compare the Python and C versions of the code,
and report the error correctly, since the Python version works correct. Even
so, I have added an additional test against the expected result, in the event
that both algorithms have an identical error in the future.
This fixes issue5623.
In some case, the default of one value is derived from other value. We add a
way to register them anyway and an associated devel-warning.
The registration is very naive for the moment. We might be able to have a
better way for registering each of these cases but it could be done later.
cg.apply used to returns the added nodes. Callers doesn't have a use for it
anymore, remove the added node and stops recording it in the current
operation.
This information was added in the current release cycle so no extensions
breakage should happens.
updatephases have no use of the 'addednodes' parameter since 44be3dc1fec8.
However caller are still passing it for nothing, remove the parameter and
remove computing of the added nodes in caller.
We adds new computation to find and record the revision affected by the
boundary retraction. This add more complication to the function but this seems
fine since it is only used in a couple of rare and explicit cases (`hg phase
--force` and `hg qimport`).
Having strong tracking of phase changes is worth the effort.
It is useful to detect noop and avoid expensive operations in this case.
We return the information to inform the caller of a possible update. Top level
function might need to react to the phase update (eg: invalidating some
caches, tracking phase change).
We rework the code to call 'registernew' before any other phase advancement.
This make 'changegroup.apply' register correct phase movement for the added
and bundled nodes.
This new function will be used by code that adds new changesets. It ajusts the
phase boundary to make sure added changesets are at least in their target
phase (they end up in an higher phase if their parents are in a higher phase).
Having a dedicated function also simplify the phases tracking. All the new
nodes are passed as argument, so we know that all of them needs to have their
new phase registered. We also know that no other nodes will be affected, so no
extra computation are needed.
This function differ from 'retractboundary' where some nodes might change
phase while some other might not. It can also affect nodes not passed as
parameters.
These simplification also apply to the computation itself. For now we use
'_retractboundary' there by convenience, but we may introduces simpler code
later.
While registering new revisions, we still need to check the actual phases of
the added node because it might be higher than the target phase (eg: target is
draft but parent is secret).
We will migrate users over the next changesets.
At the moment the 'retractboundary' function is called for multiple reasons:
First, actually retracting boundaries. There are only two cases for theses:
'hg phase --force' and 'hg qimport'. This will need extra graph computation to
retrieve the phase changes.
Second, setting the phases of newly added changesets. In this case we already
know all the affected nodes and we just needs to register different
information (old phase is None).
Third, when reducing the set of roots when advancing phase. The phase are
already properly tracked so we do not needs anything else in this case.
To deal with this difference in phase tracking, we extract the core logic into
a private method that all three cases can use.
Makes advanceboundary record the phase movement of affected revisions in
tr.changes['phases'].
The tracking is not usable yet because the 'retractboundary' function can also
affect phases.
We'll improve that in the coming changesets.
When advancing phases, we compute the new roots for the phases above. During
this process, we need to compute all the revisions that change phases (to the
new target phases). Extract these revisions into a separate variable. This
will be useful to record the phase changes in the transaction.
It seems that we were calling retractboundary for each phases to process.
Putting the retractboundary out of the loop reduce the number of calls,
helping tracking the phases changes.
unionmatcher is currently used where only a limited subset of its
functions will be called. Specifically, visitdir() is never
called. The next patch will pass it to dirstate.walk() where it will
matter that visitdir() is correctly implemented, so let's fix
that. Also add the explicitdir etc that will also be assumed by
dirstate.walk() to exist on a matcher.
Differential Revision: https://phab.mercurial-scm.org/D58
The forceincludematcher is simply a unionmatcher of a includematcher
(matching paths recursively) with the given matcher. Since the
forceincludematcher is only used by sparse, move it there.
I don't have a good sparse repo setup to test performance impact on.
Differential Revision: https://phab.mercurial-scm.org/D57
rebase will have similar logic, so let's extract it. Besides, it makes
the histedit code more readable.
We may want to parametrize acceptintervention() by the exception(s)
that should result in transaction close.
Differential Revision: https://phab.mercurial-scm.org/D66
Update the dirstate functions so that the caller supplies the full backup
filename rather than just a prefix and suffix.
The localrepo code was already hard-coding the fact that the backup name must
be (exactly prefix + "dirstate" + suffix): it relied on this in _journalfiles()
and undofiles(). Making the caller responsible for specifying the full backup
name removes the need for the localrepo code to assume that dirstate._filename
is always "dirstate".
Differential Revision: https://phab.mercurial-scm.org/D68
This was meant as a substitute for Python's "with" with multiple
context managers before we moved to Python 2.7. We're now on 2.7, so
we should have no reason to keep ctxmanager. "hg grep --all
ctxmanager" says that it was never used anyway.
Differential Revision: https://phab.mercurial-scm.org/D73
This is the result of running:
python codemod_nestedwith.py **/*.py
where codemod_nestedwith.py looks like this:
#!/usr/bin/env python
# codemod_nestedwith.py - codemod tool to rewrite nested with
#
# Copyright 2017 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
from __future__ import absolute_import, print_function
import sys
import redbaron
def readpath(path):
with open(path) as f:
return f.read()
def writepath(path, content):
with open(path, 'w') as f:
f.write(content)
def main(argv):
if not argv:
print('Usage: codemod_nestedwith.py FILES')
for i, path in enumerate(argv):
print('(%d/%d) scanning %s' % (i + 1, len(argv), path))
changed = False
red = redbaron.RedBaron(readpath(path))
processed = set()
for node in red.find_all('with'):
if node in processed or node.type != 'with':
continue
top = node
child = top[0]
while True:
if len(top) > 1 or child.type != 'with':
break
# estimate line length after merging two "with"s
new = '%swith %s:' % (top.indentation, top.contexts.dumps())
new += ', %s' % child.contexts.dumps()
# only do the rewrite if the end result is within 80 chars
if len(new) > 80:
break
processed.add(child)
top.contexts.extend(child.contexts)
top.value = child.value
top.value.decrease_indentation(4)
child = child[0]
changed = True
if changed:
print('updating %s' % path)
writepath(path, red.dumps())
if __name__ == "__main__":
sys.exit(main(sys.argv[1:]))
Differential Revision: https://phab.mercurial-scm.org/D77
we wrap 'repo.svfs.audit' to check for the store lock when accessing file in
'.hg/store' for writing. This caught a couple of instance where the transaction
was released after the lock, we should probably have a dedicated checker for
that case.
When the appropriate developer warnings are enabled, We wrap 'repo.vfs.audit' to
check for locks when accessing file in '.hg' for writing. Another changeset will
add a 'ward' for the store vfs (svfs).
This check system has caught a handful of locking issues that have been fixed
in previous series (mostly in 4.0). I expect another batch to be caught in third
party extensions.
We introduce two real exceptions from extensions 'blackbox.log' (because a lot of
read-only operations add entry to it), and 'last-email.txt' (because 'hg email'
is currently a read only operation and there is value to keep it this way).
In addition we are currently allowing bisect to operate outside of the lock
because the current code is a bit hard to get properly locked for now. Multiple
clean up have been made but there is still a couple of them to do and the freeze
is coming.
We want to be able to do more precise check when auditing a path depending of
the intend of the file access (eg read versus write). So we now pass the 'mode'
value to 'audit' and update the audit function to accept them.
This will be put to use in the next changeset.
This function already does an excellent job of reading from context objects;
we simply need to change the single write call to eliminate all uses of the
wvfs.
As with past changes, the effect should be a no-op but opens the door to
in-memory merge later by using different context objects.
I ran into this ctypes bug while working with the Crypto API. While this could
be an issue with any Win32 API in theory, the handful of things that we call are
older functions that are unlikely to return COM errors, so I didn't retrofit
this everywhere.
The proposed syntax [1] was originally 'set{n rel}', but it seemed slightly
confusing if template is involved. On the other hand, we want to keep 'set[n]'
for future extension. So this patch introduces 'set#rel[n]' ternary operator.
I chose '#' just because it looks like applying an attribute.
This also adds stubs for 'set[n]' and 'set#rel' operators since these syntax
elements are fundamental for constructing 'set#rel[n]'.
[1]: https://www.mercurial-scm.org/wiki/RevsetOperatorPlan#ideas_from_mpm
It's sometimes useful to show hyperlinks in log output.
"{get(peerpaths, "default")}/rev/{node}"
Since each path may have sub options, "{peerpaths}" is structured as a dict
of dicts, but the inner dict is rendered as if it were a string URL. The
implementation is ad-hoc, so there are some weird behaviors described in
the test. We might need to introduce a proper way of handling a hybrid
scalar object.
This patch adds _hybrid.__getitem__() so d['path']['url'] works.
The keyword is named as "peerpaths" since "paths" seemed too generic in
log context.
The new 'phase-heads' forced all added node to secret before advancing the
boundary to work around the fact changesets were added as draft by default.
This is no longer necessary since the changegroup part can now use the
'targetphase' parameter.
Not doing this retract boundary call has a couple of advantages:
* This makes implementing phases change tracking in the transaction much
simpler since retract boundary can become a rare case.
* Bundling secret changesets is not the norm. Exchange never does that and
even for strip, the use-case is not common.Skipping the retract boundary
will avoid useless work here.
* Sending phase update on push can be simplified since we can rely on the
behavior of 'cg.apply' for most of it.
This means less phases update send for example.
* We no longer needs to track and use the addednodes during unbundling. This
make it possible to have multiple 'changegroup' and 'phase-heads' parts in the
same bundle without them interfering with each others.
The new part has not been part of any release yet so we do not offer backward
compatibility yet. It is important to update this semantic before the 4.3
freeze happens.
If we are bundling secret changeset and the bundle will contain phase, we
request the changegroup to be applied as secret.
It will be useful for next patch as we are now sure that secrets changesets
are applied as secret and not applied as draft then forced to secret.
By default unbundled changesets are drafts. We want to reduce the number of
phases changes during unbundling by giving the possibility to the bundle to
indicate the phase of unbundled changesets.
The longer terms goal is to add phase movement tracking in tr.changes and the
'retractboundary' call is making it more complicated than we want.
Since 26d535788092, the strip bundle includes the phases of the stripping
node. Hence we don't need this special case anymore.
Dropping it will helps make the phase behavior more consistent across all
exchanges medium.
This changeset attempts to solve two issues with the "followlines" UI in
hgweb. First the "followlines" action is currently not easily discoverable
(one has to hover on a line for some time, wait for the invite message to
appear and then perform some action). Second, it gets in the way of natural
line selection, especially in filerevision view.
This changeset introduces an additional markup element (a <button
class="btn-followlines">) alongside each content line of the view. This button
now holds events for line selection that were previously plugged onto content
lines directly. Consequently, there's no more action on content lines, hence
restoring the "natural line selection" behavior (solving the second problem).
These buttons are hidden by default and get displayed upon hover of content
lines; then upon hover of a button itself, a text inviting followlines section
shows up. This solves the first problem (discoverability) as we now have a
clear visual element indicating that "some action could be perform" (i.e. a
button) and that is self-documented.
In followlines.js, all event listeners are now attached to these <button>
elements. The custom "floating tooltip" element is dropped as <button>
elements are now self-documented through a "title" attribute that changes
depending on preceding actions (selection started or not, in particular).
The new <button> element is inserted in followlines.js script (thus only
visible if JavaScript is activated); it contains a "+" and "-" with a
"diff-semantics" style; upon hover, it scales up.
To find the parent element under which to insert the <button> we either rely
on the "data-selectabletag" attribute (which defines the HTML tag of children
of class="sourcelines" element e.g. <span> for filerevision view and <tr> for
annotate view) or use a child of the latter elements if we find an element
with class="followlines-btn-parent" (useful for annotate view, for which we
have to find the <td> in which to insert the <button>).
On noticeable change in CSS concerns the "margin-left" of span:before
pseudo-elements in filelog view that has been increased a bit in order to
leave space for the new button to appear between line number column and
line content one.
Also note the "z-index" addition for "annotate-info" box so that the latter
appears on top of new buttons (instead of getting hidden).
In some respect, the UI similar to line commenting feature that is implemented
in popular code hosting site like GitHub, BitBucket or Kallithea.
Python introduces a reference cycle on dynamically created types
via __mro__, making them very easy to leak. See
https://bugs.python.org/issue17950.
Previously, repo.filtered() created a type on every invocation.
Long-running processes (like `hg convert`) could call this
function thousands of times, leading to a steady memory leak.
Since we're Unable to stop the leak because this is a bug in
Python, the next best thing is to contain it.
This patch adds a cache of of the dynamically generated repoview/filter
types on the localrepo object. Since we only generate each type
once, we cap the amount of memory that can leak to something
reasonable.
After this change, `hg convert` no longer leaks memory on every
revision. The process will likely grow memory usage over time due
to e.g. larger manifests. But there are no leaks.
isfilecached() encapsulates internal implementation of filecache-ed
property.
"name in repo.unfiltered().__dict__" or so can't be used for this
purpose, because corresponded entry in __dict__ might be discarded by
repo.invalidate(), repo.invalidatedirstate() or so (fsmonitor does so,
for example).
This patch makes isfilecached() return not only whether filecache-ed
property is already cached, but also already cached value (or None),
in order to avoid subsequent access to cached object via "repo.NAME",
which prevents main Mercurial procedure after reposetup() from
validating cache.
Currently, sslutil._hostsettings() performs validation that web.cacerts
exists. However, client certificates are passed in to the function
and not all callers may validate them. This includes
httpconnection.readauthforuri(), which loads the [auth] section.
If a missing file is specified, the ssl module will raise a generic
IOException. And, it doesn't even give us the courtesy of telling
us which file is missing! Mercurial then prints a generic
"abort: No such file or directory" (or similar) error, leaving users
to scratch their head as to what file is missing.
This commit introduces explicit validation of all paths passed as
arguments to wrapsocket() and wrapserversocket(). Any missing file
is alerted about explicitly.
We should probably catch missing files earlier - as part of loading
the [auth] section. However, I think the sslutil functions should
check for file presence regardless of what callers do because that's
the only way to be sure that missing files are always detected.
The matchers that were recently moved into core from the sparse
extension override __call__, while the previously existing matchers
override matchfn. Let's switch to the latter for consistency.
When I added prefix() in 559ee9ecae07 (match: introduce boolean
prefix() method, 2014-10-28), we already had always(), isexact(), and
anypats(), so it made sense to write it in terms of them (a prefix
matcher is one that isn't any of the other types). It's only now that
I realize that it's much more natural to define prefix() explicitly
(it's one that uses path: patterns, roughly speaking) and let
anypats() be defined in terms of the others. Remember that these
methods are all used for determining which fast paths are
possible. anypats() simply means that no fast paths are possible (it
could be called complex() instead). Further evidence is that
rootfilesin:some/dir does not have any patterns, but it's still
considered to be an anypats() matcher. That's because anypats() really
just means that it's not a prefix() matcher (and not always() and not
isexact()).
This patch thus changes prefix() to return False by default and
anypats() to return True only if the other three are False. Having
anypats() be True by default also seems like a good thing, because it
means forgetting to override it will lead only to performance bugs,
not correctness bugs.
Since the base class's implementation changes, we're also forced to
update the subclasses. That change exposed and fixed a bug in the
differencematcher: for example when both its two input matchers were
prefix matchers, we would say that the result was also a prefix
matcher, which is incorrect, because e.g "path:dir - path:dir/foo" no
longer matches everything under "dir" (which is what prefix() means).
The m.isexact() and m.prefix() methods are used by callers to
determine whether m.files() can be used for fast paths. It seems safe
to let callers to any fast paths it can that rely on the empty
m.files().
This revset returns all successors, including transit nodes and the source
nodes (to be consistent with existing revsets like "ancestors").
To filter out transit nodes, use `successors(X)-obsolete()`.
To filter out divergent case, use `successors(X)-divergent()-obsolete()`.
The revset could be useful to define rebase destination, like:
`max(successors(BASE)-divergent()-obsolete())`. The `max` is to deal with
splits.
There are other implementations where `successors` returns just one level of
successors, and `allsuccessors` returns everything. I think `successors`
returning all successors by default is more user friendly. We have seen
cases in production where people use 1-level `successors` while they really
want `allsuccessors`. So it seems better to just have one single revset
returning all successors by default to avoid user errors.
In the future we might want to add `depth` keyword argument to it and for
other revsets like `ancestors` etc. Or even build some flexible indexing
syntax [1] to satisfy people having the depth limit requirement.
[1]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-July/101140.html
* Use context manager for wlock
* Rename oldsparsematch to oldmatcher
* Always call parseconfig() because parsing an empty string yields
the same result as the old code
The sparse extension performs a lot of monkeypatching of dirstate
to make it sparse aware. Essentially, various operations need to
take the active sparse config into account. They do this by obtaining
a matcher representing the sparse config and filtering paths through
it.
The monkeypatching is done by stuffing a reference to a repo on
dirstate and calling sparse.matcher() (which takes a repo instance)
during each function call. The reason this function takes a repo
instance is because resolving the sparse config may require resolving
file contents from filelogs, and that requires a repo. (If the
current sparse config references "profile" files, the contents of
those files from the dirstate's parent revisions is resolved.)
I seem to recall people having strong opinions that the dirstate
object not have a reference to a repo. So copying what the sparse
extension does probably won't fly in core. Plus, the dirstate
modifications shouldn't require a full repo: they only need a matcher.
So there's no good reason to stuff a reference to the repo in
dirstate.
This commit exposes a sparse matcher to dirstate via a property that
when looked up will call a function that eventually calls
sparse.matcher(). The repo instance is bound in a closure, so it
isn't exposed to dirstate.
This approach is functionally similar to what the sparse extension does
today, except it hides the repo instance from dirstate. The approach
is not optimal because we have to call a proxy function and
sparse.matcher() on every property lookup. There is room to cache
the matcher instance in dirstate. After all, the matcher only changes
if the dirstate's parents change or if the sparse config changes. It
feels like we should be able to detect both events and update the
matcher when this occurs. But for now we preserve the existing
semantics so we can move the dirstate sparseness bits into core. Once
in core, refactoring becomes a bit easier since it will be clearer how
all these components interact.
The sparse extension has been updated to use the new property.
Because all references to the repo on dirstate have been removed,
the code for setting it has been removed.
This is a pretty straightforward port. Some code cleanup was
performed. But no major changes to the logic were made.
I'm not a huge fan of this function because it does multiple
things. I'd like to get things into core first to facilitate
refactoring later.
Please also note the added inline comment about the oddities
of writeconfig() and the try..except to undo it. This is because
of the hackiness in which the sparse matcher is obtained by
various consumers, notably dirstate. We'll need a massive
refactor to address this. That refactor is effectively blocked
on having the sparse dirstate hacks live in core.
activeprofiles() is a special case of a more generic function.
Furthermore, that generic function is essentially already
implemented inline in the sparse extension.
So, refactor activeprofiles() to a generic activeconfig(). Change
the only consumer of activeprofiles() to use it. And have the
inline implementation in the sparse extension use it.
Well, mostly. The annotation on subrepo functions tacks on a parenthetical to
the abort message, which seems reasonable for a generic mechanism. But now all
messages consistently spell out 'subrepository', and double quote the name of
the repo. I noticed the inconsistency in the change for the last commit.
This simply passes the 'missing' argument down from the context of the parent
repo, so the same rules apply. subrepo.bailifchanged() is hardcoded to care
about missing files, because cmdutil.bailifchanged() is too.
In the end, it looks like this addresses inconsistencies with 'archive',
'identify', blackbox logs, 'merge', and 'update --check'. I wasn't sure how to
implement this in git, so that's left for someone more familiar with it.
Since the identify command adds a '+' for missing files, it's reasonable that
this does too. Perhaps the node field's hex value should be p1+p2 for merges?
This is equivalent to the previous code. But it seems to me that if the user is
going to be prompted that a commit is needed, missing files should be ignored,
but branch and merge changes shouldn't be.
This is equivalent to the previous code, but it seems better to be explicit
about what aspects of dirty are being ignored. Perhaps they shouldn't be, since
the help text says 'followed by a "+" if the working directory has uncommitted
changes'. Both merges and branch changes are committable, even if the files are
unchanged.
Additionally, this will make the `identify` command notice missing subrepo
files, once subrepos are taught to look for missing files.
The regexes for path: and relpath: patterns are the same (since the
paths have already been normalized at the point we create the
regexes).
I don't think the "if pat == '.'" will have any effect relpath:
because relpath: patterns will have the root directory already
normalized to '' by pathutil.canonpath() (unlike path:, for which the
root gets normalized to '.' by util.normpath()).
The regexes are passed to re.match(), which matches against the
beginning of the input, so the '^' doesn't do anything.
Note that unrooted patterns, such as globs and regexes from .hgignore
are instead achieved by adding '.*' to the expression given by the
user. (That's unless the user's expression started with '^', in which
case the '.*' is not added, perhaps to keep the regex cleaner?)
Instead of wrapping committablectx.markcommitted(), we inline
the call into workingctx.markcommitted().
Per smf's review, workingctx is the proper location for this
code, as committablectx is the shared base class for it and
memctx. Since this code touches the working directory, it belongs
in workingctx.
The 'successors' part of the mappings used of be a tuple. This avoid issue from
code consuming the generator "by mistake". For example, an extension inspecting the
mapping content used to be able to iterate over the successors mapping without
consequence.
Since the mapping are small we do not expect any performance impact we use tuple
again for this.
Previously repo.anyrevs only expand aliases in [revsetalias] config. This
patch makes it more flexible to accept a customized dict defining aliases
without having to couple with ui.
revsetlang.expandaliases now has the signature (tree, aliases, warn=None)
which is more consistent with templater.expandaliases. revsetlang.py is now
free from "ui", which seems to be a good thing.
cleanupnodes takes care of bookmark movement, and bookmark movement could
cause bookmark divergent resolution as a side effect. This patch adds such
bookmark divergent resolution logic so future rebase migration will be
easier.
The revset is carefully written to be equivalent to what rebase does today.
Although I think it might make sense to remove divergent bookmarks more
aggressively, for example:
F book@1
|
E book@2
|
| D book
| |
| C
|/
B book@3
|
A
When rebase -s C -d E, "book@1" will be removed, "book@3" will be kept,
and the end result is:
D book
|
C
|
F
|
E book@2 (?)
|
B book@3
|
A
The question is should we keep book@2? The current logic keeps it. If we
choose not to (makes some sense to me), the "deleterevs" revset could be
simplified to "newnode % oldnode".
For now, I just make it compatible with the existing behavior. If we want to
make the "deleterevs" revset simpler, we can always do it in the future.
In some valid usecases, the "mapping" received by scmutil.cleanupnodes have
filtered nodes. Use unfiltered repo to access them correctly.
The added test case will fail with the old cleanupnodes code.
This is important to migrate histedit to use the cleanupnodes API.
Aliases define optional alternatives to existing options. For example the old
option ui.user was deprecated and replaced by ui.username. With this mechanism,
it's even possible to create an alias to an option in a different section.
Add ui.user as alias to ui.username as an example of this concept.
The old alternates principle in ui.config is removed as it was used only for
this option.
If the matching command lives in an in-tree extension (which is all we
scan for), and the user has disabled that extension with
"extensions.<name>=!", we were not finding it, because the path in
_disabledextensions was the empty string. If the user had set
"extensions.<name>=!<valid path>" it would work, so it seems like just
a mistake that it didn't work.
This is a pretty straightforward move of the code.
I converted the "force" argument to a keyword argument.
Like other recent changes, this code is tightly coupled with
working directory update code in merge.py. I suspect the code
will become more tightly coupled over time, possibly even moved
to merge.py. For now, let's get the code in core.
merge.calculateupdates() now filters the update actions through sparse
by default.
The filtering no-ops if sparse isn't enabled or no sparse config
is defined.
The function has been refactored to behave more like a filter
instead of a wrapper of merge.calculateupdates().
We should arguably take sparse into account earlier in
merge.calculateupdates(). This patch preserves the old behavior
of applying sparse at the end of update calculation, which is the
simplest and safest approach.
This was our last method on the custom repo type, meaning we could
remove that custom type and inline the 2 lines of code into
reposetup().
As part of the move, instead of wrapping merge.update() from
the sparse extension, we inline the function call. The ported
function now no-ops if sparse isn't enabled, making it safe to
always call.
The call site in update() may not be the most appropriate. But
it matches the previous behavior, which is the safest thing
to do. It can be improved later.
As part of the move, the function arguments changed so revs are
passed as a list instead of *args. This allows us to use keyword
arguments properly.
Since the plan is to integrate sparse into core and have it
enabled by default, we need to prepare for a sparse matcher
to always be obtained and operated on. As part of the move,
we inserted code that returns an always matcher if sparse
isn't enabled. Some callers in the sparse extension take this
into account and conditionally perform matching depending on
whether the special always matcher is seen. I /think/ this
may have sped up some operations where the extension is
installed but no sparse config is activated.
One thing I'm ensure of in this code is whether os.path.dirname()
is semantically correct. os.posixpath.dirname() (which is
exported as pathutil.dirname) might be a better choise because
all patterns should be using posix directory separators (/)
instead of Windows (\). There's an inline comment that implies
Windows was tested. So hopefully it won't be a problem. We
can improve this in a follow-up. I've added a TODO to track it.
The sparse extension contains some matcher types that are
generic and can exist in core.
As part of the move, the classes now inherit from basematcher.
always(), files(), and isexact() have been dropped because
they match the default implementations in basematcher.
Before, 0 was being used as the default signature value and we cast
the int to a string. We also handled I/O exceptions manually.
The new code uses cfs.tryread() so we always feed data into the
hasher. The empty string does hash and and should be suitable
for input into a cache key.
The changes made the code simple enough that the separate checksum
function could be inlined.
This simplifies the method slightly. It does create a full list of
paths while doing so, but it's not a lot of data anyway (besides, I
would think references to strings are no larger than (references to?)
True).
mergestate.unresolved() is a generator, so it seems better for it to
rely on iteritems() than items(), although it also seems unlikely for
it to make a noticeable difference.
I don't know why applying an empty changegroup should be an error. It
seems harmless. I suspect the check was there to find code that
creates empty changegroups just because that would be wasteful. Let's
use develwarn() for that instead, so we catch any such cases that run
with our test runner, but we still allow others to generate empty
changegroups if they want to.
We have run into this check at Google once or twice and had to work
around it, but I'm changing this not so much because of that, but
because it seems like it shouldn't be an error.
I also changed the message slightly to be more modern ("changelog
group" -> "changegroup") and more generic ("received" -> "applied").
Applying an empty changegroup has been an error since the
beginning. The only exception was strip, which would allow to apply an
empty changegroup from the temporary bundle. However, the emptyok=True
option was only set for bundle1 bundles. In other words, temporary
bundle2 bundles would fail if they were empty.
Bundle2 has now been used enough that it seems safe to say that we
simply don't create bundle2 bundles with empty changegroups. That also
suggests that we never create bundle1 bundles with empty changegroups
(i.e. empty bundle1 bundles, since bundle1 is just a changegroup),
because, AFAICT, the code leading up to the application of the bundle
is the same for bundle1 and bundle2.
Therefore, let's stop passing emptyok=True, so we more clearly get the
same behavior for bundle1 and bundle2.
The "patterns"/"include" in "patternspat"/"includepat" is redundant,
so drop it. Also a "_" prefix since it's "private".
Inline the "pm"/"im" variables.
The code was refactored during the move to be more procedural
instead of using string formatting. This has the benefit of not
writing empty sections, which changed tests.
The sparse extension maintains caches for the sparse files
to a signature and a signature to a matcher. This allows the
sparse matchers to be resolved quickly, which is apparently
something that can occur in loops.
This patch ports the sparse caches to the localrepo class
pretty much as-is. There is potentially room to improve the
caching mechanism. But that can be done as a follow-up.
The default invalidatecaches() now clears the relevant sparse
cache. invalidatesignaturecache() has been moved to sparse.py.
This method is reasonably well-contained and simple to move.
As part of the move, some light formatting was performed.
A "working copy" reference in an error message was changed to
"working directory."
The biggest change was to _refreshoncommit() in sparse.py. It
was previously checking for the existence of an attribute on
the repo instance. Since the moved function now returns empty
data if sparse isn't enabled, we unconditionally call the
new function. However, we do have to protect another method
call in that function. This will all be unhacked eventually.
Currently, the sparse extension sniffs repo instances for
attributes defined by the sparse extension to determine if
sparse is enabled. As we move code away from repo instances,
these checks will be a bit more brittle.
We introduce a module-level variable to track whether sparse is
enabled as a temporary workaround.
One more step towards weaning off methods on repo instances and
moving code to core. While this function is only used once and
is simple, it needs to exist on its own so Facebook can monkeypatch
it to enable simplecache integration.
This patch marks the beginning of moving code from the sparse
extension into core. The goal is to move as much of the
functionality as possible into core, where it will be an
experimental feature. The extension will likely continue to
exist to enable the feature and provide UI elements.
As part of the move, the repo method was converted to a module
function. It doesn't need to exist on repos.
An error message was also updated to reflect that an error isn't
necessarily from the .hg/sparse file. The API should be updated
later to pass in a filename so the error can be more descriptive.
Copyright of the added file was copied from the sparse extension.
In blockdescendants(), we had an assertion when line range of a merge
changeset was not consistent depending on which parent was considered for
computation. For instance, this might occur when file content (in lookup
range) is significantly different between parent branches of the merge as
demonstrated in added tests (where we almost completely rewrite the "baz" file
while also introducing similarities with its content in the other branch we
later merge to).
Now, in such case, we combine line ranges from all parents by storing the
envelope of both line ranges. This is conservative (the line range is
extended, possibly unnecessarily) but at least this should avoid missing
descendants with changes in a range that would fall in that of one parent but
not in another one (the case of "baz: narrow change (2->2+)" changeset in
tests).
Switch the lone call in merge.py to use it.
As with past refactors, the goal is to make wctx hot-swappable with an
in-memory context in the future. This change should be a no-op today.
Now, files (to be truncated) are opened with checkambig=True, only if
localrepository caches it.
Therefore, straightforward "copy if EPERM" is always reasonable to
avoid file stat ambiguity at closing.
This patch makes checkambigatclosing close wrapper copy the target
file, and advance mtime on it after renaming, if EPERM. This can avoid
file stat ambiguity, even if the target file is owned by another (see
issue5418 and issue5584 for detail).
This patch factors main logic out instead of changing
checkambigatclosing._checkambig() directly, in order to reuse it.
Now, transaction can determine whether avoidance of file stat
ambiguity is needed for each files, by blacklist "checkambigfiles".
For similarity to truncation in _playback(), this patch apply
checkambig=True only on limited files in code paths below.
- restoring files by util.copyfile(), in _playback()
(checkambigfiles itself is examined at first, because it as a
keyword argument might be None)
- writing files at finalization of transaction, in _generatefiles()
This patch reduces cost of checking stat at writing out and restoring
files, which aren't filecache-ed.
Advancing mtime by os.utime() fails for EPERM, if the target file is
owned by another. 0d920bcb0fd1 and related changes made some code
paths give advancing mtime up in such case, to fix issue5418.
This causes file stat ambiguity (again), if it is owned by another.
https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan
To avoid file stat ambiguity in such case, especially for
.hg/dirstate, c75c7b3e3284 made vfs.rename() copy the target file, and
advance mtime of renamed one again, if EPERM (see issue5584 for detail).
But straightforward "copy if EPERM" isn't reasonable for truncation of
append-only files at rollbacking, because rollbacking might cost much
for truncation of many filelogs, even though filelogs aren't
filecache-ed.
Therefore, this patch introduces blacklist "checkambigfiles", and
avoids file stat ambiguity only for files specified in this blacklist.
This patch consists of two parts below, which should be applied at
once in order to avoid regression.
- specify 'checkambig=True' at vfs.open(mode='a') in _playback()
according to checkambigfiles
- invoke _playback() with checkambigfiles
- add transaction.__init__() checkambigfiles argument, for _abort()
- make localrepo instantiate transaction with _cachedfiles
- add rollback() checkambigfiles argument, for "hg rollback/recover"
- make localrepo invoke rollback() with _cachedfiles
After this patch, straightforward "copy if EPERM" will be reasonable
at closing the file opened with checkambig=True, because this policy
is applied only on files, which are listed in blacklist
"checkambigfiles".
This information is used to make transaction handle these files
specially, in order to avoid file stat ambiguity of them.
Gathering information about cached files via annotation classes can
avoid overlooking properties newly introduced in the future.
Add a 'successorssets' template that returns the list of all closest known
sucessorssets for a changectx. The elements of the list are changesets.
The "closest successors" are the first locally known revisions encountered
while, walking successors markers. It uses successorsets previously modified
to support the closest argument.
This logic respect repository filtering. So hidden revision will be skipped by
this logic unless --hidden is specified. Since we only display the visible
predecessors, this template will not display anything in most case. It makes a
good candidate for inclusion in the default log output.
I updated the test-obsmarker-template.t test file introduced with the
predecessors template to test successorssets template.
Add a closest argument to successorssets changing the definition of latest
successors.
With "closest=false" (current behavior), latest successors are "leafs" on the
obsmarker graph. They don't have any successor and are known locally.
With "closest=true", latest successors are the closest locally-known
changesets that are visible in the repository or repoview. Closest successors
can be then obsolete, orphan.
This will be used in a later patch to show the closest successor of
changesets with the successorssets template.
We plan to add a new argument to successorsets. But first we need to update
all callers to pass cache argument explicitly to avoid arguments confusion.
Clarify successorssets documentation before we start updating the main function.
This patch serie will introduce the successorssets template, the opposite of
predecessor templates.
Successors will use successorssets function and requires some improvement so
before doing that, we clean up successorssets a bit.
Previously there is a suspicious `if False and delta > 0` which dates back
to the beginning of hgext/record.py (f995f03023c7).
The "trimming context lines" feature could be useful (and is used by the
next patch). So let's enable it. This patch adds a new `maxcontext`
parameter to `recordhunk` and `parsepatch`, changing the `if False`
condition to respect it.
The old `trimcontext` implementation is also wrong - it does not update
`toline` correctly and it does not do the right thing for `before` context.
A doctest was added to guard us from making a similar mistake again.
Since `maxcontext` is set to `None` (unlimited), there is no behavior
change.
There are no remaining users of 'mustaudit' so we can safely drop the API.
External user are unlikely from a quick research so no deprecation is added.
The whole 'mustaudit' API is quite complex compared to its actual usage by its
unique user in stream clone.
Instead we add a "auditpath" parameter to 'vfs.__call_'. The stream clone code
then explicitly open files with path auditing disabled.
The 'mustaudit' API will be cleaned up in the next changeset.