Python 2.6 introduced a new octal syntax: "0oXXX", replacing "0XXX". The
old syntax is not recognized in Python 3 and will result in a parse
error.
Mass rewrite all instances of the old octal syntax to the new syntax.
This patch was generated by `2to3 -f numliterals -w -n .` and the diff
was selectively recorded to exclude changes to "<N>l" syntax conversion,
which will be handled separately.
Previously, if a subrepo was added in ctx2 and then compared to another without
it (ctx1), the subrepo for ctx2 was returned amongst all of the ctx1 based
subrepos, since no subrepo exists in ctx1 to replace it in the 'subpaths' dict.
The two callers of this, basectx.status() and cmdutil.diffordiffstat(), both
compare the yielded subrepo against ctx2, and thus saw no changes when ctx2's
subrepo was returned. The tests here previously didn't mention 's/a' for the
'p1()' case.
This appears to have been a known issue, because some diffordiffstat() comments
mention that the subpath disappeared, and "the best we can do is ignore it". I
originally ran into the issue with some custom convert code to flatten a tree of
subrepos causing hg.putcommit() to abort, but this new behavior seems like the
correct status and diff behavior regardless. (The abort in convert isn't
something users will see, because convert doesn't currently support subrepos in
the official repo.)
This reduces the stack depth from O(n) to O(log(n)). Therefore, repeated
-rREV options will never exceed the Python stack limit.
Currently it depends on private revset._combinesets() function. But at some
point, we'll be able to drop the old-style parser, and revrange() can be
completely rewritten without using _combinesets():
trees = [parse(s) for s in revs]
optimize(('or',) + trees) # combine trees and optimize at once
...
Blockers that prevent eliminating old-style parser:
- nullary ":" operator
- revrange(repo, [intrev, ...]), can be mapped to 'rev(%d)' ?
- revrange(repo, [binnode, ...]), should be banned ?
This will work for any command that creates its matcher via scmutil.match(), but
only the files command is tested here (both workingctx and basectx based tests).
The previous behavior was to completely ignore the files in the subrepo, even
though -S was given.
My first attempt was to teach context.walk() to optionally recurse, but once
that was in place and the complete file list was built up, the predicate test
would fail with 'path in nested repo' when a file in a subrepo was accessed
through the parent context.
There are two slightly surprising behaviors with this functionality. First, any
path provided inside the fileset isn't narrowed when it is passed to the
subrepo. I dont see any clean way to do that in the matcher. Fortunately, the
'subrepo()' fileset is the only one to take a path.
The second surprise is that status predicates are resolved against the subrepo,
not the parent like 'hg status -S' is. I don't see any way to fix that either,
given the path auditor error mentioned above.
The logic to read a requires file resides in scmutil, so it's only logical that
the logic to write the file should be there too.
And now I've typed logic too many times it no longer looks like a word.
Logically.
Just displaying the warning makes it quite hard to recognise the guilty code
quickly and using --traceback for all calls is not very convenient. So we
include the call site with all simple message to help developer to recognise
errors sources.
To eliminate "path prefix" (= "the root of vfs") part from "dirpath"
yielded by "os.walk()" correctly, "path prefix" should have "os.sep"
at the end of own string, but it isn't easy to ensure it, because:
- examination by "path.endswith(os.sep)" isn't portable
Some problematic encodings use 0x5c (= "os.sep" on Windows) as the
tail byte of some multi-byte characters.
- "os.path.join(path, '')" isn't portable
With Python 2.7.9, this invocation doesn't add "os.sep" at the end
of UNC path (see issue4557 for detail).
Python 2.7.9 changed also behavior of "os.path.normpath()" (see *) and
"os.path.splitdrive()" for UNC path.
vfs root normpath splitdrive os.sep required
=============== ============== =================== ============
z:\ z:\ z: + \ no
z:\foo z:\foo z: + \foo yes
z:\foo\ z:\foo z: + \foo yes
[before Python 2.7.9]
\\foo\bar \\foo\bar '' + \\foo\bar yes
\\foo\bar\ \\foo\bar (*) '' + \\foo\bar yes
\\foo\bar\baz \\foo\bar\baz '' + \\foo\bar\baz yes
\\foo\bar\baz\ \\foo\bar\baz '' + \\foo\bar\baz yes
[Python 2.7.9]
\\foo\bar \\foo\bar \\foo\bar + '' yes
\\foo\bar\ \\foo\bar\ (*) \\foo\bar + \ no
\\foo\bar\baz \\foo\bar\baz \\foo\bar + \baz yes
\\foo\bar\baz\ \\foo\bar\baz \\foo\bar + \baz yes
If it is ensured that "normpath()"-ed vfs root is passed to
"splitdrive()", adding "os.sep" is required only when "path" part of
"splitdrive()" result isn't "os.sep" itself. This is just what
"pathutil.nameasprefix()" examines.
This patch applies "os.path.normpath()" on "self.join(None)"
explicitly, because it isn't ensured that vfs root is already
normalized: vfs itself is constructed with "realpath=False" (= avoid
normalizing in "vfs.__init__()") in many code paths.
This normalization should be much cheaper than subsequent file I/O for
directory traversal.
An upcoming patch will establish per-filter tags caches. We'll want
to use the same cache validation logic as the branch cache. Prepare
for that by moving the logic for computing a filtered view hash
to somewhere central.
This duplicates "onerror()" function from "svnsubrepo.remove()" for
equivalence of replacing in subsequent patch.
This "onerror()" function for "shutil.rmtree()" was introduced by
094a056562e7, which avoids failure of removing svn repository on
Windows.
An upcoming commit requires that match.py be able to call scmutil.dirs(), but
when match.py imports scmutil, a dependency cycle is created. This commit
avoids the cycle by moving dirs() and its related finddirs() function from
scmutil to util, which match.py already depends on.
It's unfortunate that workingctx revision is None, which doesn't work well in
arithmetic operation or comparison. This function is trivial but will be used
in several places.
Several commands take the fast path when match.always() is
true. However, when the user passes "." on the command line, that
results in a matcher for which match.always() == False. Let's make it
so such matchers return True, and have an empty list of .files(). This
makes e.g. "hg log ." as fast as "hg log" and "hg revert ." as fast as
"hg revert --all" (when run from repo root).
If a user has a revsetalias defined, it is their explicit wish for
this alias to be parsed as a revset and nothing else. Although the
case of the alias being short enough and only contain the letters a-f
is probably kind of rare, it may still happen.
This change is intended to avoid exposing the implementation detail to
callers. I'm going to extend fullreposet to support "null" revision, so
these mfunc calls will have to use fullreposet() instead of spanset().
The full path is propagated to the original match object since this is often
used directly for printing a file name to the user. This is cleaner than
requiring each caller to join the prefix with the file name prior to calling it,
and will lead to not having to pass the prefix around separately. It is also
consistent with the bad() and abs() methods in terms of the required input. The
uipath() method now inherits this path building property.
There is no visible change in path style for rel() because it ultimately calls
util.pathto(), which returns an os.sep based path. (The previous os.path.join()
was violating the documented usage of util.pathto(), that its third parameter be
'/' separated.) The doctest needed to be normalized to '/' separators to avoid
test differences on Windows, now that a full path is returned for a short
filename.
The test changes are to drop globs that are no longer necessary when printing an
absolute file in a subrepo, as returned from match.uipath(). Previously when
os.path.join() was used to add the prefix, the absolute path to a file in a
subrepo was printed with a mix of '/' and '\'. The absolute path for a file not
in a subrepo, as returned from match.uipath(), is still purely '/' based.
This method has the same behavior as the 'os.path.split' function, but having
it in vfs will allow handling of tricky encoding situations in the future.
In the same patch, we replace the use of 'os.path.split' in the transaction code.
The vfs.join method only works for absolute paths. We need something
that works for relative paths too when transforming filenames. Since
os.path.join may misbehave in tricky encoding situations, encapsulate
the new join method in our vfs abstraction. The default implementation
remains os.path.join, but this opens the door to other VFSes doing
something more intelligent based on their needs.
In the same go, we replace the usage of 'os.path.join' in transaction code.
The recursive addremove operation occurs completely before the first subrepo is
committed. Only hg subrepos support the addremove operation at the moment- svn
and git subrepos will warn and abort the commit.
It looks like a bad path is the only mode of failure for addremove. This
warning is probably useful for the standalone command, but more important for
'commit -A'. That command doesn't currently abort if the addremove fails, but
it will be made to do so prior to adding subrepo support, since not all subrepos
will support addremove. We could just abort here, but it looks like addremove
has always silently ignored bad paths, except for the exit code.
This fixes the previously mentioned issue with 7d5fcea60c78, and undoes its
corresponding test change.
The test change demonstrates the correctness when a file is specified (i.e. the
glob is required on Windows because relative paths use '\' and absolute paths
use '/'). It is admittedly very subtle, but there will be a more robust test in
the addremove -S v3 series.
For "hg addremove 'glob:*.py'", we print any paths added or removed as
relative to the current directory, but when "hg addremove -I
'glob:*.py'" is used, we use the absolute path (relative from the repo
root). It seems like they should be the same, so change it so we use
relative paths in both cases. Continue to use absolute paths when no
patterns are given.
This patch uses "False" as default value of "notindexed" argument,
even though "vfs.makedir()" uses "True" for it, because "os.mkdir()"
doesn't set "_FILE_ATTRIBUTE_NOT_CONTENT_INDEXED" attribute to newly
created directories.
This patch allows "readlines" and "tryreadlines" to take "mode"
argument, because "subrepo" requires to read files not in "rb"
(binary, default for vfs) but in "r" (text) mode in subsequent patch.
After running "hg forget README && hg addremove", README will still be
reported as removed, while "hg forget README && hg add README" adds it
back so it gets reported as clean. It seems like they should behave
the same. Furthermore, it seems like no files should remain untracked
after 'hg addremove && hg commit' (or 'hg commit -A'). For these
reasons, change the behavior of addremove so it does add forgotten
files back.
The problem is with scmutil._interestingfiles(), which reports the
file as removed, so scmutil.addremove() does not add it. Fix by
teaching _interestingfiles() to report forgotten files separately from
removed files and make addremove() add forgotten files back. However,
do not treat forgotten files as sources for rename detection. Note
that since removed and forgotten files are treated the same before
this change, forgotten files were considered sources for rename
detection.
Also update the other caller, marktouched(), in the same way as
addremove().
This helps providing a more consistent user experience on all platforms and
with all packaging.
The exact location of default.d depends on how Mercurial is installed and
whether it is 'frozen'. The exact location should never be relevant to users
and is intentionally not explained in details in the documentation. It will
however always be next to the help and templates files.
Note that setting HGRCPATH also disables these defaults. I don't know if that
should be considered a bug or a feature.
The various status types are currently documented on the
dirstate.status() method. Now that we have a class for the status
types, it makese sense to document the status types there
instead. Only leave the bits related to lookup/unsure in the status()
method documentation.
Callers of various status() methods (on dirstate, context, repo) get a
tuple of 7 elements, where each element is a list of files. This
results in lots of uses of indexes where names would be much more
readable. For example, "status.ignored" seems clearer than "status[4]"
[1]. So, let's introduce a simple named tuple containing the 7 status
fields: modified, added, removed, deleted, unknown, ignored, clean.
This patch introduces the class and updates the status methods to
return instances of it. Later patches will update the callers.
[1] Did you even notice that it should have been "status[5]"?
(tweaked by mpm to introduce the class in scmutil and only change one user)
No real changes.
pattern: 'kind:pat' as specified on the command line
patterns, pats: list of patterns
kind: 'path', 'glob' or 're' or ...
pat: string in the corresponding 'kind' format
kindpats: list of (kind, pat) tuples
Since recent revset changes, revrange now return a smartset. This smart set
probably does not support indexing (_addset does not). This led to crash.
Instead when the smartset is ordered we use the `min` and `max` method of
smart set. Otherwise we turn is into a list and use indexing on it.
The tests have been updated to catch such regression.
Unknown requirements will now be reported as:
abort: repository requires features unknown to this Mercurial: largefiles!
(see http://mercurial.selenic.com/wiki/MissingRequirement for more information)
Some features of this phrasing:
* avoid double ':' in abort message
* make it more clear who requires and knows what
* don't quote the requirement names - it is not something the user entered or
need the exact spelling of ... and it is "identifiers" that are unambiguous
anyway
* remove double hint by removing "(upgrade Mercurial)" comment
* don't mention upgrading Mercurial without mentioning enabling the feature -
instead, just refer to wiki page for both
* don't just talk about "details", talk about "more information"
The wiki page is intended to describe several solution to the requirement issue.
Some of those solutions does not involve upgrading mercurial. That is very
useful for people that can't easily upgrade they Mercurial in some place.
Upcoming patches will allow the file cache to watch over multiple files, and
call the decorated function again if any of the files change.
The particular use case for this is the bookmark store, which needs to be
invalidated if either .hg/bookmarks or .hg/bookmarks.current changes. (This
doesn't currently happen, which is a bug. This bug will also be fixed in
upcoming patches.)
Before this patch, almost all of check code in
"casecollisionauditor.__call__()" is executed, even if specified
filename is already checked, because "f in self._newfiles" is examined
lastly.
In addition to it, adding "fl" to "self._loweredfiles" and "f" to
"self._newfiles" are also redundant in such case.
This patch checks "f in self._newfiles" first, and returns immediately
to avoid execution of check code for efficiency.
This patch also changes paths added to "rejected" list from full path
(referred by "p") to relative one (referred by "f"), when type of
target path is neither file nor symlink.
This change should be reasonable, because the path added to "rejected"
list is relative one, when "OSError" is raised at "lstat()".
Just invoking "os.fstat()" with "file.fileno()" doesn't require non
ANSI file API, because filename is not used for invocation of
"os.fstat()".
But "util.fstat()" should invoke "os.stat()" with "fp.name", if file
object doesn't have "fileno()" method for portability, and "fp.name"
may cause invocation of non ANSI file API.
So, this patch makes the constructor of appender class invoke
"util.fstat()" via vfs, to encapsulate filename handling.
Full walks are only necessary when the caller needs the list of clean files.
addremove by definition doesn't need them.
With this patch and an extension that produces a subset of results for
dirstate.walk when full is False, on a repository with over 200,000 files, hg
addremove drops from 2.8 seconds to 0.5.
Several places use scmutil.addremove as a means to declare that certain files
have been operated on. This is ugly because:
- addremove takes patterns relative to the cwd, not paths relative to the root,
which means extra contortions for callers.
- addremove doesn't make clear what happens to files whose status hasn't
changed.
This new method accepts filenames relative to the repo root, and has a much
clearer contract. It also allows future modifications that do more with files
whose status hasn't changed.
An upcoming patch will refactor some code out into a method called
_findrenames. Having a line saying "copies = _findrenames..." is confusing.
Besides, 'renames' is a more precise name for this local anyway.
Before this patch, vfs constructor applies both "util.expandpath()"
and "os.path.realpath()" on "base" path, if "expand" is True.
This patch splits it into "realpath" and "expandpath", to apply each
functions separately: this splitting can allow to use vfs also where
one of each is not needed.
This is over twice as fast as the Python dirs code. Upcoming changes
will nearly double its speed again.
perfdirs results for a working dir with 170,000 files:
Python 638 msec
C 244
This encapsulates the "multiset of directories" structures that are
currently open-coded (and duplicated) in both the dirstate and
context modules.
This will be used, and optionally replaced by a C implementation,
in upcoming changes.
Now that we no longer sort all the walk results, using iteritems becomes
possible.
This is a relatively minor speedup: on a large repository with 170,000 files,
perfaddremove goes from 2.13 seconds to 2.10.
The only place where the order matters is in printing out added or removed
files. We already sort that set.
On a large repository with 170,000 files, this speeds up perfaddremove from
2.34 seconds to 2.13.
This will let us stop sorting all the results in an upcoming patch.
This also permits future refactorings where the same code consumes
dirstate.walk results but doesn't print anything out.
An argument could be made that printing out results as we go along is more
responsive UI-wise. However, at this point iterating through walk results is
actually faster than sorting them, so once we stop sorting all the results the
argument ceases to be valid.
dirstate.walk only does lstats and never returns stat objects for directories.
On a large repository with 170,000 files, this speeds perfaddremove up from
2.40 seconds to 2.34.