Adds a only() revset that has two forms:
only(<set>) is equivalent to "::<set> - ::(heads() - heads(<set>::))"
only(<include>,<exclude>) is equivalent to "::<include> - ::<exclude>"
On a large repo, this implementation can process/traverse 50,000 revs in 0.7
seconds, versus 4.2 seconds using "::<include> - ::<exclude>".
This is useful for performing histedits on your branch:
hg histedit -r 'first(only(.))'
Or lifting branch foo off of branch bar:
hg rebase -d @ -s 'only(foo, bar)'
Or a variety of other uses.
A missing ancestor expression is any expression of the form (::x - ::y) or
equivalent. Such expressions are remarkably common, and so far have involved
multiple walks down the DAG, followed by a set difference operation.
With this patch, such expressions will be transformed into uses of the fast
algorithm at ancestor.missingancestor.
For a repository with over 600,000 revisions, perfrevset for '::tip - ::-10000'
returns:
Before: ! wall 3.999575 comb 4.000000 user 3.910000 sys 0.090000 (best of 3)
After: ! wall 0.132423 comb 0.130000 user 0.130000 sys 0.000000 (best of 75)
Before this patch, revset predicate "filelog()" uses "match.files()"
to get filename also for the pattern without explicit kind.
But in such case, only canonicalization of relative path is required,
and other initializations of "match" object including regexp
compilation are meaningless.
This patch uses "pathutil.canonpath()" directly for "filelog()"
pattern without explicit kind like "glob:", for efficiency.
This patch also does below as a part of introducing "canonpath()":
- move location of "matchmod.match()" invocation, because "m" is no
more used in "if not matchmod.patkind(pat)" code path
- omit passing "default" argument to "matchmod.match()", because
"pat" should have explicit kind of pattern in this code path
Before this patch, default kind of pattern for revset predicate
"contains()" is treated as the exact file path rooted at the root of
the repository. This decreases usability, because:
- all other predicates taking pattern argument (also "filelog()")
treat such pattern as the path rooted at the current working
directory
- "contains()" doesn't describe this difference in its help
- this difference may confuse users
for example, this prevents revset aliases from sharing same
argument between "contains()" and other predicates
This patch makes default kind of pattern for revset predicate
"contains()" be rooted at the current working directory.
This patch uses "pathutil.canonpath()" instead of creating "match"
object for efficiency.
Some changesets can be wrongly reported as matched by this predicate
due to searching in a string joined with spaces and not individually.
A test case added, which fails without this fix.
Backout 308a153b9120. The changeset prevented closing non-head changesets but
did not provide any rationale or test case and I don't see what value it adds.
Users might have their reasons to commit something anywhere - and close it
immediately.
And contrary to the comment that is removed: The topo heads set is _not_
included in the branch heads set of the current branch. It do not include
closed topological heads.
The change thus prevented closing commits on top of closing commits. A valid
usecase for that is to merge closed heads to reduce the number of topological
heads.
The only existing test coverage for this is the failing double close in
test-revset.t. It was added in dc0e42c06b4e and seems to not be intentional.
Change ancestor to accept 0 or more arguments. The greatest common ancestor of a
single changeset is that changeset. If passed no arguments, the empty list is
returned.
Before this patch, sub expression may return unexpected result, if it
is joined with another expression by "or":
- "^"/parentspec():
"R or R^1" is not equal to "R^1 or R". the former returns only "R".
- "~"/ancestorspec():
"R or R~1" is not equal to "R~1 or R". the former returns only "R".
- ":"/rangeset():
"10 or (10 or 15):" is not equal to "(10 or 15): or 10". the
former returns only 10 and 15 or grater (11 to 14 are not
included).
In "or"-ed expression "A or B", the "subset" passed to evaluation of
"B" doesn't contain revisions gotten from evaluation of "A", for
efficiency.
In the other hand, "stringset()" fails to look corresponding revision
for specified string/symbol up, if "subset" doesn't contain that
revision.
So, predicates looking revisions up indirectly should evaluate sub
expressions of themselves not with passed "subset" but with "entire
revisions in the repository", to prevent "stringset()" from unexpected
failing to look symbols in them up.
But predicates in above example don't so. For example, in the case of
"R or R^1":
1. "R^1" is evaluated with "subset" containing revisions other than
"R", because "R" is already gotten by the former of "or"-ed
expressions
2. "parentspec()" evaluates "R" of "R^1" with such "subset"
3. "stringset()" fails to look "R" up, because "R" is not contained
in "subset"
4. so, evaluation of "R^1" returns no revision
This patch evaluates sub expressions for predicates above with "entire
revisions in the repository".
63af8381691a (released as Mercurial 2.3) introduced the issue that the
revset program started with 40 hexadecimal letters caused unexpected
result at "hg log" execution.
This issue was already fixed by b579dcd15206 (released as 2.3.1), but
there is no test to examine whether this issue is certainly fixed or
not: no test fails even if b579dcd15206 is backed out.
This patch adds test for this issue.
Added test is also confirmed to fail, when it is tested against:
- Mercurial 2.3, or
- Mercurial 2.3.1 or later with backing b579dcd15206 out
The branchpoint() function returns changesets with more than one child.
Eventually I would like to be able to see only branch points and merge
points in a graphical log to see the topology of the repository.
In MSYS, the test fails like this if the hghave exit at the beginning is
removed:
--- C:\Users\adi\hgrepos\hg-main\tests\test-revset.t
+++ C:\Users\adi\hgrepos\hg-main\tests\test-revset.t.err
@@ -58,7 +58,7 @@
$ hg co 3
2 files updated, 0 files merged, 0 files removed, 0 files unresolved
$ hg branch /a/b/c/
- marked working directory as branch /a/b/c/
+ marked working directory as branch a:/b/c/
(branches are permanent and global, did you want a bookmark?)
$ hg ci -Aqm"5 bug"
@@ -252,7 +252,7 @@
2 a-b-c-
3 +a+b+c+
4 -a-b-c-
- 5 /a/b/c/
+ 5 a:/b/c/
6 _a_b_c_
7 .a.b.c.
$ log 'children(ancestor(4,5))'
due to the posix path conversion done by MSYS globally, as explained here
http://www.mingw.org/wiki/Posix_path_conversion
The solution is a bit lame, but it is simple and works: don't use strings that
look like '/a/b', in order not to trigger the path magic done by MSYS.
So, if we can agree not to insist on testing branch names starting with '/',
then this relatively simple patch makes the test pass both on Windows with MSYS
and Linux.
If the string provided to the 'tag' predicate starts with 're:', the rest
of the string will be treated as a regular expression and matched against
all tags in the repository.
There is a slight backwards-compatibility problem for people who actually
have tags that start with 're:'. As a workaround, these tags can be matched
using a 'literal:' prefix.
If no tags match the pattern, an error is raised. This matches the behaviour
of the previous exact-match code.
The alias expansion code it changed from:
1- Get replacement tree
2- Substitute arguments in the replacement tree
3- Expand the replacement tree again
into:
1- Get the replacement tree
2- Expand the replacement tree
3- Expand the arguments
4- Substitute the expanded arguments in the replacement tree
and fixes cases like:
[revsetalias]
level1($1, $2) = $1 or $2
level2($1, $2) = level1($2, $1)
$ hg log -r "level2(level1(1, 2), 3)"
where the original version incorrectly aborted on infinite expansion
error, because it was confusing the expanded aliases with their
arguments.
The current revset alias expansion code works like:
1- Get the replacement tree
2- Substitute the variables in the replacement tree
3- Expand the replacement tree
It makes it easy to substitute alias arguments because the placeholders
are always replaced before the updated replacement tree is expanded
again. Unfortunately, to fix other alias expansion issues, we need to
reorder the sequence and delay the argument substitution. To solve this,
a new "virtual" construct called _aliasarg() is introduced and injected
when parsing the aliases definitions. Only _aliasarg() will be
substituted in the argument expansion phase instead of all regular
matching string. We also check user inputs do not contain unexpected
_aliasarg() instances to avoid argument injections.
The fast path was triggered if the argument was not like "type:value", with
type a known pattern type. This is wrong for several reasons:
- path:value is valid for the fast path
- '*' is interpreted as a glob by default and is not valid for fast path
Fast path detection is now done after the pattern is parsed, and the normalized
path is extracted for direct comparison. All this seems a bit complicated, it
is tempting to drop the fast path completely. Also, the hasfile() revset does
something similar (only check .files()), without a fast path. If the fast path
is really that efficient maybe it should be used there too.
Note that:
$ log 'modifies("set:modified()")'
is different from:
$ log 'modifies("*")'
because of the usual merge ctx.files()/status(ctx.p1(), ctx) differences.
Reported by Steffen Eichenberg <steffen.eichenberg@msg-gillardon.de>
This adds a couple of tests for the revset "matching" keyword:
1. Test that the 2nd parameter is optional
2. Test that all the 1st argument can be a revset and that all the supported
fields of the 2nd argument work.
The revset aliases expansion worked like:
expr = "some revset"
for alias in aliases:
expr = alias.process(expr)
where "process" was replacing the alias with its *unexpanded* substitution,
recursively. So it only worked when aliases were applied in proper dependency
order.
This patch rewrites the expansion process so all aliases are expanded
recursively at every tree level, after parent alias rewriting and variable
expansion.
some problematic encoding (e.g.: cp932) uses ASCII alphabet characters
in byte sequence of multi byte characters.
"str.lower()" on such byte sequence may treat distinct characters as
same one, and cause unexpected log matching.
this patch uses "encoding.lower()" instead of "str.lower()" to
normalize strings for compare.
The existing code seemed to have incorrect assumptions about how parameter
lists are represented by the parser.
Now the match and replace functions have been merged and simplified by using
getlist().
Like keyword(), but does not search in filenames and users.
No grepdesc() or descgrep() added, because it might be bad to introduce
grepfoo() versions of too many string searches.
When limit, last, min and max were evaluated they worked on a reduced set in the
wrong way. Now they work on an unrestricted set (the whole repo) and get
limited later on.
^ (Nth parent) and ~ (Nth first ancestor) are infix operators that match
certain ancestors of the set:
set^0
the set
set^1 (also available as set^)
the first parent of every changeset in set
set^2
the second parent of every changeset in set
set~0
the set
set~1
the first ancestor (i.e. the first parent) of every changeset in set
set~2
the second ancestor (i.e. first parent of first parent) of every changeset
in set
set~N
the Nth ancestor (following first parents only) of every changeset in set;
set~N is equivalent to set^1^1..., with ^1 repeated N times.
This is valuable because now revsets like 'bookmarks() or tip' will
always show tip after bookmarks unless tip was itself a bookmark. This
is a somewhat contrived example, but this behavior is useful for
"where am I" type aliases that use log and revsets.
For the boolean operators, the subset optimization works by calculating
the cheaper argument first, and passing the subset to the second
argument to restrict the revision domain. This works well for filtering
predicates.
But parents() don't work like a filter: it may return revisions outside the
specified set. So, combining it with boolean operators may easily yield
incorrect results. For instance, for the following revision graph:
0 -- 1
the expression '0 and parents(1)' should evaluate as follows:
0 and parents(1) ->
0 and 0 ->
0
But since [0] is passed to parents() as a subset, we get instead:
0 and parents(1 and 0) ->
0 and parents([]) ->
0 and [] ->
[]
This also affects children(), p1() and p2(), for the same reasons.
Predicates that call these (like heads()) are also affected.
We work around this issue by ignoring the subset when propagating
the call inside those predicates.