Summary:
Before, we would raise whenever the `usemergedriver` condition was set when merging in-memory,
which equated to "any merge with (cd, dc, or m) actions in a repo with a mergedriver script".
This was done to be as conservative as possible.
However, a better solution is to run the preprocess() script and only raise if any files are
marked to actually be driver-resolved. That way we only restart the merge if we absolutely need
to.
Since some of our preprocess() scripts aren't ready yet, I also added
experimental.inmemory.nomergedriver in a previous change so we can deploy this in a build before the preprocess scripts are good to go.
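A minimal sketch of the new check, under stated assumptions: `RestartMergeOnDisk` and `check_driver_resolved` are hypothetical names for illustration, and the list of driver-resolved paths stands in for whatever preprocess() reports. It only shows the "raise if, and only if, something was actually marked" rule described above.

```python
class RestartMergeOnDisk(Exception):
    """Hypothetical signal that the in-memory merge must be retried on disk."""


def check_driver_resolved(driver_resolved_paths):
    """Raise only when preprocess() actually marked files as driver-resolved.

    driver_resolved_paths: iterable of paths flagged by the (assumed)
    preprocess step. An empty iterable means the in-memory merge can go on.
    """
    flagged = sorted(driver_resolved_paths)
    if flagged:
        raise RestartMergeOnDisk(
            "driver-resolved files: %s" % ", ".join(flagged))
```

With no flagged paths the call returns quietly, so merges that merely contain cd, dc, or m actions are no longer restarted.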
Test Plan: ./run-tests.py
Reviewers: quark, #sourcecontrol
Reviewed By: quark
Subscribers: durham
Differential Revision: https://phabricator.intern.facebook.com/D6668426
Signature: 6668426:1515185050:a640208454caf053f8213b831d0f8e645ebe682c
Summary: Log whichever paths were driver-resolved but not in experimental.inmemorydisallowedpaths, so we can update experimental.inmemorydisallowedpaths and keep the experience of rebasing with IMM and
mergedriver a pleasant one.
Test Plan: .
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.intern.facebook.com/D6656159
In large repositories, updates involving the creation of many files check the
same directories repeatedly in the wctx manifest. Move these checks out to a
separate loop to avoid repeated checks hitting the manifest.
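A rough sketch of the idea, not the actual patch: gather each ancestor directory once, then check that set against the manifest in a separate loop. The function names are hypothetical, and the manifest is modelled as a plain set of tracked file names.

```python
import posixpath


def parent_dirs(path):
    """Yield every ancestor directory of a slash-separated path."""
    d = posixpath.dirname(path)
    while d:
        yield d
        d = posixpath.dirname(d)


def dirs_to_check(created_files):
    """Collect each ancestor directory once across all created files."""
    seen = set()
    for f in created_files:
        for d in parent_dirs(f):
            if d in seen:
                break  # the ancestors of d were added when d was first seen
            seen.add(d)
    return seen


def conflicting_dirs(created_files, manifest):
    """Directories that collide with a file name tracked in the manifest."""
    return sorted(d for d in dirs_to_check(created_files) if d in manifest)
```

The manifest is consulted once per distinct directory rather than once per created file.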
Differential Revision: https://phab.mercurial-scm.org/D1226
As mentioned in D1222, the recent pathconflicts change regresses update
performance in large repositories when many files are being updated.
To mitigate this, we introduce two caches of directories that have
already been found to be either:
- unknown directories that are not aliased by files, and so do not
  need to be checked again to see whether they are files; and
- missing directories, which cannot cause path conflicts and
  cannot contain a file that causes a path conflict.
When checking the paths of a file, testing against these caches means we can
skip tests that involve touching the filesystem.
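The two caches can be sketched roughly as follows. This is an illustration under assumptions: the class and attribute names are hypothetical, and `lstat_kind` is an injected stand-in for the filesystem probe so that its calls can be counted.

```python
class PathConflictChecker(object):
    def __init__(self, lstat_kind):
        # lstat_kind(path) -> "file", "dir", or None when the path is missing.
        self._kind = lstat_kind
        self.unknowndircache = set()  # real directories, not aliased by files
        self.missingdircache = set()  # directories absent from the filesystem

    def conflicts(self, f):
        """Return the first ancestor of f that exists as a file, else None."""
        parts = f.split("/")[:-1]
        for i in range(1, len(parts) + 1):
            d = "/".join(parts[:i])
            if d in self.missingdircache:
                return None  # nothing below a missing directory can conflict
            if d in self.unknowndircache:
                continue  # already known to be a plain directory
            kind = self._kind(d)
            if kind is None:
                self.missingdircache.add(d)
                return None
            if kind == "file":
                return d
            self.unknowndircache.add(d)
        return None
```

After the first file in a directory has been checked, later files in the same directory are answered from the caches without touching the filesystem.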
Differential Revision: https://phab.mercurial-scm.org/D1224
We've found a severe perf regression in `hg update` caused by the path conflict
checking code. The next patch will disable this by default.
Differential Revision: https://phab.mercurial-scm.org/D1222
The short options "-c" and "-C" may be confusing for a novice reading the
documentation. Let's try to be more explicit, also mentioning the equivalent
long options ("--check" and "--clean") in the comments.
fsmonitor can significantly speed up operations on large working
directories. But fsmonitor isn't enabled by default, so naive users
may not realize there is a potential to make Mercurial faster.
This commit introduces a warning to working directory updates when
fsmonitor could be used.
The following conditions must be met:
* Working directory is previously empty
* New working directory adds >= N files (currently 50,000)
* Running on Linux or macOS
* fsmonitor not enabled
* Warning not disabled via config override
Because of the empty working directory restriction, most users will
only see this warning during `hg clone` (assuming very few users
actually do an `hg up null`).
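The conditions above combine into a single predicate. The sketch below is illustrative only: the function name and parameters are hypothetical, and the platform strings follow Python's `sys.platform` values ("linux", "darwin").

```python
def should_warn_fsmonitor(wc_was_empty, files_added, platform,
                          fsmonitor_enabled, warn_enabled=True,
                          threshold=50000):
    """Return True only when every listed condition holds."""
    supported = platform.startswith("linux") or platform == "darwin"
    return bool(wc_was_empty
                and files_added >= threshold
                and supported
                and not fsmonitor_enabled
                and warn_enabled)
```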
The addition of a warning may be considered a BC change. However, clone
has printed warnings before: until recently, for example, Mercurial
printed a warning with the server's certificate fingerprint when it
wasn't explicitly trusted. The warning goes to stderr, so it shouldn't
interfere with scripts parsing meaningful output.
The OS restriction was on the advice of Facebook engineers, who only
feel confident with watchman's stability on the supported platforms.
.. feature::

   Print warning when fsmonitor isn't being used on a large repository
Differential Revision: https://phab.mercurial-scm.org/D894
With in-memory merge, copy information needs to be stored in-memory, not in the
dirstate.
To make this transition easy, move the existing dirstate-based approach to
workingfilectx; that way, other implementations can choose to store it
somewhere else.
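A sketch of why moving the logic onto the file context helps. Both classes are hypothetical stand-ins, and the `markcopied` method name is an assumption made for illustration: callers go through one method, and each implementation decides where the copy record lives.

```python
class DirstateBackedFile(object):
    """Stand-in for the existing behaviour: copy data lives in the dirstate."""

    def __init__(self, dirstate_copies, path):
        self._copies = dirstate_copies  # dict playing the role of the dirstate
        self._path = path

    def markcopied(self, origin):
        self._copies[self._path] = origin


class InMemoryFile(object):
    """Stand-in for an in-memory context: copy data lives in its own cache."""

    def __init__(self, cache, path):
        self._cache = cache
        self._path = path

    def markcopied(self, origin):
        self._cache.setdefault(self._path, {})["copied"] = origin


def record_copy(fctx, origin):
    # Callers no longer need to know where the copy information is stored.
    fctx.markcopied(origin)
```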
Differential Revision: https://phab.mercurial-scm.org/D1106
In future patches, we may halt the merge process based on configuration or
user requests by raising exceptions. We need to ensure that the mergestate
is unconditionally committed even when such an exception is raised.
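The usual way to get this guarantee is a try/finally around the update loop. A minimal sketch, with `FakeMergeState` and `apply_updates` as hypothetical stand-ins for the real objects:

```python
class FakeMergeState(object):
    """Stand-in for the merge state; only tracks whether it was committed."""

    def __init__(self):
        self.committed = False

    def commit(self):
        self.committed = True


def apply_updates(ms, actions):
    """Run each action; commit the merge state even if one of them raises."""
    try:
        for act in actions:
            act()
    finally:
        ms.commit()
```

The exception still propagates to the caller; the finally clause only makes sure the state is written first.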
Depends on D930.
Differential Revision: https://phab.mercurial-scm.org/D931
When merging, check for any path conflicts introduced by the manifest
merge and rename the conflicting file to a safe name.
Differential Revision: https://phab.mercurial-scm.org/D784
When updating to a new revision, check for path conflicts caused by unknown
files in the working directory, and handle these by backing up the file or
directory and replacing it.
Differential Revision: https://phab.mercurial-scm.org/D781
We will need to distinguish between file conflicts and path conflicts. Rename
the conflicts variable so that it will be clearly distinct from pathconflicts,
which will be introduced in a future commit.
Differential Revision: https://phab.mercurial-scm.org/D780
Add a new function which, given a file name, finds the shortest path for which
there is a conflicting file or directory in the working directory.
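One possible shape for such a function, as a simplified sketch: `kind_of` is a hypothetical probe returning "file", "dir", or None, and symlinks and other special cases are ignored here.

```python
def shortest_conflicting_path(f, kind_of):
    """Return the shortest prefix of f that conflicts, or None.

    A prefix conflicts when a file sits where a directory is needed; the
    full path conflicts when a directory sits where the file should go.
    """
    parts = f.split("/")
    for i in range(1, len(parts)):
        prefix = "/".join(parts[:i])
        if kind_of(prefix) == "file":
            return prefix
    if kind_of(f) == "dir":
        return f
    return None
```

Prefixes are tried from shortest to longest, so the first hit is the shortest conflicting path.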
Differential Revision: https://phab.mercurial-scm.org/D779
During batchget, if a target file conflicts with a directory, or if the
directory a target file is in conflicts with a file, backup and remove the
conflicting file or directory before performing the get.
Differential Revision: https://phab.mercurial-scm.org/D778
Add a new merge action to handle a path conflict by renaming the conflicting
file to a safe name.
The rename is just to avoid problems on the filesystem. The conflict is still
considered unresolved until the user marks the original path as resolved.
Differential Revision: https://phab.mercurial-scm.org/D777
Add a new merge action to record path conflicts. A status message is
printed, and the path conflict is added to the merge state.
Differential Revision: https://phab.mercurial-scm.org/D776
Path conflicts that occur during merges are represented by 'pu' (unresolved)
and 'pr' (resolved) records in the merge state. These are stored on disk
in 'P' records.
Differential Revision: https://phab.mercurial-scm.org/D774
This will allow anyone to enable the first in-memory merge milestone
by wrapping merge.update in an extension and creating an overlayworkingctx.
Differential Revision: https://phab.mercurial-scm.org/D682
``recordupdates`` calls into the dirstate which requires the files to be
there, so this is the last possible moment we can flush anything.
Differential Revision: https://phab.mercurial-scm.org/D673
Since we fork to create workers, any changes they queue up will be lost after
the worker terminates, so the easiest solution is to have each worker flush
the writes it accumulates; we are close to the end of the merge in any case.
To prevent duplicated writes, we also have the master process flush before
forking.
In an in-memory merge (M2), we'll instead disable the use of workers.
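The flush ordering can be illustrated without real processes. In this sketch `WriteQueue` is hypothetical and `fork()` only simulates what a real fork does to queued writes: the child starts with a copy of whatever the parent had pending.

```python
class WriteQueue(object):
    def __init__(self, disk):
        self.disk = disk    # shared list standing in for the filesystem
        self.pending = []   # writes queued in this process only

    def write(self, path):
        self.pending.append(path)

    def flush(self):
        self.disk.extend(self.pending)
        self.pending = []

    def fork(self):
        """Simulate fork(): the child inherits a copy of pending writes."""
        child = WriteQueue(self.disk)
        child.pending = list(self.pending)
        return child


def run_workers(master, jobs):
    master.flush()  # flush before forking so children inherit nothing
    for job in jobs:
        worker = master.fork()
        for path in job:
            worker.write(path)
        worker.flush()  # flush before the worker exits, or the writes are lost
```

Without the master flush, every simulated child would inherit the master's pending writes and write them again.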
Differential Revision: https://phab.mercurial-scm.org/D628
In the in-memory merge branch, we'll need to call a function (``flushall``) on
the wctx inside of _xmerge.
This prepares the way so it can be done without hacks like ``fcd.ctx()``.
Differential Revision: https://phab.mercurial-scm.org/D449
This is done by a script [3] using RedBaron [1], a tool designed for doing
code refactoring. All "default" values are decided by the script and are
strongly consistent with the existing code.
There are 2 changes done manually to fix tests, since RedBaron is not
confident about how to indent things [2]:
  [warn] mercurial/exchange.py: experimental.bundle2-output-capture: default needs manual removal
  [warn] mercurial/localrepo.py: experimental.hook-track-tags: default needs manual removal
[1]: https://github.com/PyCQA/redbaron
[2]: https://github.com/PyCQA/redbaron/issues/100
[3]:
#!/usr/bin/env python
# codemod_configitems.py - codemod tool to fill configitems
#
# Copyright 2017 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
from __future__ import absolute_import, print_function

import os
import sys

import redbaron

def readpath(path):
    with open(path) as f:
        return f.read()

def writepath(path, content):
    with open(path, 'w') as f:
        f.write(content)

_configmethods = {'config', 'configbool', 'configint', 'configbytes',
                  'configlist', 'configdate'}

def extractstring(rnode):
    """get the string from a RedBaron string or call_argument node"""
    while rnode.type != 'string':
        rnode = rnode.value
    return rnode.value[1:-1]  # unquote, "'str'" -> "str"

def uiconfigitems(red):
    """match *.ui.config* pattern, yield (node, method, args, section, name)"""
    for node in red.find_all('atomtrailers'):
        entry = None
        try:
            obj = node[-3].value
            method = node[-2].value
            args = node[-1]
            section = args[0].value
            name = args[1].value
            if (obj in ('ui', 'self') and method in _configmethods
                and section.type == 'string' and name.type == 'string'):
                entry = (node, method, args, extractstring(section),
                         extractstring(name))
        except Exception:
            pass
        else:
            if entry:
                yield entry

def coreconfigitems(red):
    """match coreconfigitem(...) pattern, yield (node, args, section, name)"""
    for node in red.find_all('atomtrailers'):
        entry = None
        try:
            args = node[1]
            section = args[0].value
            name = args[1].value
            if (node[0].value == 'coreconfigitem' and section.type == 'string'
                and name.type == 'string'):
                entry = (node, args, extractstring(section),
                         extractstring(name))
        except Exception:
            pass
        else:
            if entry:
                yield entry

def registercoreconfig(cfgred, section, name, defaultrepr):
    """insert coreconfigitem to cfgred AST

    section and name are plain string, defaultrepr is a string
    """
    # find a place to insert the "coreconfigitem" item
    entries = list(coreconfigitems(cfgred))
    for node, args, nodesection, nodename in reversed(entries):
        if (nodesection, nodename) < (section, name):
            # insert after this entry
            node.insert_after(
                'coreconfigitem(%r, %r,\n'
                '    default=%s,\n'
                ')' % (section, name, defaultrepr))
            return

def main(argv):
    if not argv:
        print('Usage: codemod_configitems.py FILES\n'
              'For example, FILES could be "{hgext,mercurial}/*/**.py"')
    dirname = os.path.dirname
    reporoot = dirname(dirname(dirname(os.path.abspath(__file__))))

    # register configitems to this destination
    cfgpath = os.path.join(reporoot, 'mercurial', 'configitems.py')
    cfgred = redbaron.RedBaron(readpath(cfgpath))

    # state about what to do
    registered = set((s, n) for n, a, s, n in coreconfigitems(cfgred))
    toregister = {} # {(section, name): defaultrepr}
    coreconfigs = set() # {(section, name)}, whether it's used in core

    # first loop: scan all files before taking any action
    for i, path in enumerate(argv):
        print('(%d/%d) scanning %s' % (i + 1, len(argv), path))
        iscore = ('mercurial' in path) and ('hgext' not in path)
        red = redbaron.RedBaron(readpath(path))
        # find all repo.ui.config* and ui.config* calls, and collect their
        # section, name and default value information.
        for node, method, args, section, name in uiconfigitems(red):
            if section == 'web':
                # [web] section has some weirdness, ignore them for now
                continue
            defaultrepr = None
            key = (section, name)
            if len(args) == 2:
                if key in registered:
                    continue
                if method == 'configlist':
                    defaultrepr = 'list'
                elif method == 'configbool':
                    defaultrepr = 'False'
                else:
                    defaultrepr = 'None'
            elif len(args) >= 3 and (args[2].target is None or
                                     args[2].target.value == 'default'):
                # try to understand the "default" value
                dnode = args[2].value
                if dnode.type == 'name':
                    if dnode.value in {'None', 'True', 'False'}:
                        defaultrepr = dnode.value
                elif dnode.type == 'string':
                    defaultrepr = repr(dnode.value[1:-1])
                elif dnode.type in ('int', 'float'):
                    defaultrepr = dnode.value
            # inconsistent default
            if key in toregister and toregister[key] != defaultrepr:
                defaultrepr = None
            # interesting to rewrite
            if key not in registered:
                if defaultrepr is None:
                    print('[note] %s: %s.%s: unsupported default'
                          % (path, section, name))
                    registered.add(key) # skip checking it again
                else:
                    toregister[key] = defaultrepr
                    if iscore:
                        coreconfigs.add(key)

    # second loop: rewrite files given "toregister" result
    for path in argv:
        # reconstruct redbaron - trade CPU for memory
        red = redbaron.RedBaron(readpath(path))
        changed = False
        for node, method, args, section, name in uiconfigitems(red):
            key = (section, name)
            defaultrepr = toregister.get(key)
            if defaultrepr is None or key not in coreconfigs:
                continue
            if len(args) >= 3 and (args[2].target is None or
                                   args[2].target.value == 'default'):
                try:
                    del args[2]
                    changed = True
                except Exception:
                    # redbaron fails to do the rewrite due to indentation
                    # see https://github.com/PyCQA/redbaron/issues/100
                    print('[warn] %s: %s.%s: default needs manual removal'
                          % (path, section, name))
            if key not in registered:
                print('registering %s.%s' % (section, name))
                registercoreconfig(cfgred, section, name, defaultrepr)
                registered.add(key)
        if changed:
            print('updating %s' % path)
            writepath(path, red.dumps())

    if toregister:
        print('updating configitems.py')
        writepath(cfgpath, cfgred.dumps())

if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
merge.calculateupdates() now filters the update actions through sparse
by default.
The filtering no-ops if sparse isn't enabled or no sparse config
is defined.
The function has been refactored to behave more like a filter
instead of a wrapper of merge.calculateupdates().
We should arguably take sparse into account earlier in
merge.calculateupdates(). This patch preserves the old behavior
of applying sparse at the end of update calculation, which is the
simplest and safest approach.
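The "filter that no-ops" shape can be sketched as below. This is a deliberate simplification with hypothetical names: the real sparse filtering does more than drop entries, and `matcher` here is just a callable standing in for the sparse matcher.

```python
def filter_actions_by_sparse(actions, sparse_enabled, matcher=None):
    """Filter update actions through sparse at the end of the calculation.

    actions: dict mapping file name to an action.
    Returns the input untouched when sparse is disabled or has no config.
    """
    if not sparse_enabled or matcher is None:
        return actions  # no-op: safe to call unconditionally
    return {f: a for f, a in actions.items() if matcher(f)}
```

Because the disabled case returns its input unchanged, the caller does not need its own sparse check.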
This was our last method on the custom repo type, meaning we could
remove that custom type and inline the 2 lines of code into
reposetup().
As part of the move, instead of wrapping merge.update() from
the sparse extension, we inline the function call. The ported
function now no-ops if sparse isn't enabled, making it safe to
always call.
The call site in update() may not be the most appropriate. But
it matches the previous behavior, which is the safest thing
to do. It can be improved later.
This simplifies the method slightly. It does create a full list of
paths while doing so, but it's not a lot of data anyway (besides, I
would think references to strings are no larger than (references to?)
True).
mergestate.unresolved() is a generator, so it seems better for it to
rely on iteritems() than items(), although it also seems unlikely for
it to make a noticeable difference.
Switch the lone call in merge.py to use it.
As with past refactors, the goal is to make wctx hot-swappable with an
in-memory context in the future. This change should be a no-op today.
This is necessary because some callers in merge.py pass backgroundclose=True
when writing.
As with previous changes in this series, this should be a no-op.