The large or-expressions we used to build required a substantial
amount of subset filtering in orset() which was inefficient. Instead we build a
single string which we process in one go with a special internal predicate.
Simplifies client logic in multiple places since it encapsulates the
computation of the common and, more importantly, the missing node lists.
This also allows an upcomping patch to communicate precomputed versions of
these lists to clients.
some problematic encoding (e.g.: cp932) uses ASCII alphabet characters
in byte sequence of multi byte characters.
"str.lower()" on such byte sequence may treat distinct characters as
same one, and cause unexpected log matching.
this patch uses "encoding.lower()" instead of "str.lower()" to
normalize strings for compare.
This allows folding external revsets or lists of revsets into a revset
expression. Revsets are pre-parsed for validity so that syntax errors
don't escape.
This patch adds two new revset descriptions:
- 'goods': the list of topologicaly-good csets:
- if good csets are topologically before bad csets, yields '::good'
- else, yields 'good::'
- and conversely for 'bads'
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@anciens.enib.fr>
The 'ignored' changesets are outside the bisection range, but are
changesets that may have an impact on the outcome of the bisection.
For example, in case there's a merge between the good and bad csets,
but the branch-point is out of the bisection range, and the issue
originates from this branch, the branch will not be visited by bisect
and bisect will find that the culprit cset is the merge.
So, the 'ignored' set is equivalent to:
( ( ::bisect(bad) - ::bisect(good) )
| ( ::bisect(good) - ::bisect(bad) ) )
- bisect(range)
- all ancestors of bad csets that are not ancestors of good csets, or
- all ancestors of good csets that are not ancestors of bad csets
- but that are not in the bisection range.
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@anciens.enib.fr>
Use repo.set() wherever possible, instead of locally trying to
reproduce complex graph computations.
'pruned' now means 'all csets that will no longer be visited by the
bisection'. The change is done is this very patch instead of its own
dedicated one becasue the code changes all over the place, and the
previous 'pruned' code was totally rewritten by the cleanup, so it
was easier to just change the behavior at the same time.
The previous series went in too fast for this cleanup pass to be
included, so here it is. ;-)
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@anciens.enib.fr>
The 'untested' set is made of changesets that are in the bisection range
but for which the status is still unknown, and that can later be used to
further decide on the bisection outcome.
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@anciens.enib.fr>
The 'pruned' set is made of changesets that did participate to
the bisection. They are made of
- all good changesets
- all bad changsets
- all skipped changesets, provided they are in the bisection range
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@anciens.enib.fr>
The 'range' set is made of all changesets that make the bisection
range, that is
- csets that are ancestors of bad csets and descendants of good csets
or
- csets that are ancestors of good csets and descendants of bad csets
That is, roughly equivalent of:
bisect(good)::bisect(bad) | bisect(bad)::bisect(good)
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@anciens.enib.fr>
Computing the ranges of csets in the bisection belongs to the hbisect
code. This allows for reusing the status computation from many places,
not only the revset code, but also to later display the bisection status
of a cset...
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@anciens.enib.fr>
Rename the 'bisected' keyword to simply 'bisect'.
Still accept the old name, but no longer advertise it.
As discussed with Matt on IRC.
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@anciens.enib.fr>
Given an operator ^ that's either postfix or infix and an operator :
that's either prefix or infix, the parser can't figure out the right
thing to do. So we rewrite the expression to be sensible in the optimizer.
The existing code seemed to have incorrect assumptions about how parameter
lists are represented by the parser.
Now the match and replace functions have been merged and simplified by using
getlist().
Like keyword(), but does not search in filenames and users.
No grepdesc() or descgrep() added, because it might be bad to introduce
grepfoo() versions of too many string searches.
- Handle 'subset' argument
- Stop returning the null rev from p1 and parents, as in the non-dirstate case
- Order parents as in the non-dirstate case (ascending revs)
This patch makes the 'set' argument to revset function parents() optional.
Like p1() and p2(), if no argument is given, returns the parent(s) of the
working directory.
Morally equivalent to 'p1()+p2()', as expected.
This patch makes the 'set' argument to revset functions p1() and p2()
optional. If no argument is given, p1() and p2() return the first or second
parent of the working directory.
If the working directory is not an in-progress merge (no 2nd parent), p2()
returns the empty set. For a checkout of the null changeset, both p1() and
p2() return the empty set.
For the boolean operators, the subset optimization works by calculating
the cheaper argument first, and passing the subset to the second
argument to restrict the revision domain. This works well for filtering
predicates.
But parents() don't work like a filter: it may return revisions outside the
specified set. So, combining it with boolean operators may easily yield
incorrect results. For instance, for the following revision graph:
0 -- 1
the expression '0 and parents(1)' should evaluate as follows:
0 and parents(1) ->
0 and 0 ->
0
But since [0] is passed to parents() as a subset, we get instead:
0 and parents(1 and 0) ->
0 and parents([]) ->
0 and [] ->
[]
This also affects children(), p1() and p2(), for the same reasons.
Predicates that call these (like heads()) are also affected.
We work around this issue by ignoring the subset when propagating
the call inside those predicates.
hg log -r 'outgoing(..)' ignored #branch in some cases.
This patch fixes it.
The cases where it misbehaved are now covered by the added
test-revset-outgoing.t
This adds support for r'...' and r"..." as string literals. Strings
with the "r" prefix will not have their escape characters interpreted.
This is especially useful for grep(), where, with regular string
literals, \number is interpreted as an octal escape code, and \b is
interpreted as the backspace character (\x08).
A query like
head() and (descendants("bad") and not descendants("fix"))
(testing if repo heads are affected by a bug) will abort with a
RepoLookupError if either badrev or fixrev aren't found inside
the repository, which is not very informative.
The new predicate returns an empty set for lookup errors, so
head() and (descendants(present("bad")) and not descendants(present("fix")))
will behave as wanted even if those revisions are not found.
Rather than dynamically optimize in methods, we pre-optimize the parse tree
directly. This also lets us do some substitution on some of the
symbols like - and ::.
discovery.findoutgoing used to be a useful hook for extensions like
hgsubversion. This patch reintroduces this version of findcommonincoming
which is meant to be used when computing outgoing changesets.
When limit, last, min and max were evaluated they worked on a reduced set in the
wrong way. Now they work on an unrestricted set (the whole repo) and get
limited later on.
This is a long desired cleanup and paves the way for new discovery.
To specify subsets for bundling changes, all code should use the heads
of the desired subset ("heads") and the heads of the common subset
("common") to be excluded from the bundled set. These can be used
revlog.findmissing instead of revlog.nodesbetween.
This fixes an actual bug exposed by the change in test-bundle-r.t
where we try to bundle a changeset while specifying that said changeset
is to be assumed already present in the target. This used to still
bundle the changeset. It no longer does. This is similar to the bugs
fixed by the recent switch to heads/common for incoming/pull.
^ (Nth parent) and ~ (Nth first ancestor) are infix operators that match
certain ancestors of the set:
set^0
the set
set^1 (also available as set^)
the first parent of every changeset in set
set^2
the second parent of every changeset in set
set~0
the set
set~1
the first ancestor (i.e. the first parent) of every changeset in set
set~2
the second ancestor (i.e. first parent of first parent) of every changeset
in set
set~N
the Nth ancestor (following first parents only) of every changeset in set;
set~N is equivalent to set^1^1..., with ^1 repeated N times.
if range(len(repo)) is passed to stringset and x is a valid rev
(checked before) then x is guaranteed to be in subset, we can check
for that by comparing the lengths of the sets
This is valuable because now revsets like 'bookmarks() or tip' will
always show tip after bookmarks unless tip was itself a bookmark. This
is a somewhat contrived example, but this behavior is useful for
"where am I" type aliases that use log and revsets.
# Date 1289564504 -3600
# Node ID b75264c15cc888cf38c3c7b8f619801e3c2589c7
# Parent 89b2e5d940f669e590096c6be70eee61c9172fff
revsets: overload the branch() revset to also take a branch name.
This should only change semantics in the specific case of a tag/branch
conflict where the tag wasn't done on the branch with the same
name. Previously, branch(whatever) would resolve to the branch of the
tag in that case, whereas now it will resolve to the branch of the
name. The previous behaviour, while documented, seemed very
counter-intuitive to me.
An alternate approach would be to introduce a new revset such as
branchname() or namedbranch(). While this would retain backwards
compatibility, the distinction between it and branch() would not be
readily apparent to users. The most intuitive behaviour would be to
have branch(x) require 'x' to be a branch name, and something like
branchof(x) or samebranch(x) do what branch(x) currently
does. Unfortunately, our backwards compatibility guarantees prevent us
from doing that.
Please note that while 'hg tag' guards against shadowing a branch, 'hg
branch' does not. Besides, even if it did, that wouldn't solve the
issue of conversions with such tags and branches...