183 Commits

Author SHA1 Message Date
Martin von Zweigbergk
a9c40085c0 match: allow pats to be None
match.match already interprets "!bool(patterns)" as matching
everything (but includes and excludes still apply). We might as well
allow None, which lets us simplify some callers a bit.

I originally wrote this patch while trying to change
match.match(patterns=[]) to mean to match no patterns. This patch is
one step towards that goal. I'm not sure it'll be worth the effort to
go all the way there, but I think this patch still makes sense on its
own.
2017-06-08 22:18:17 -07:00
Martin von Zweigbergk
2c881818ba match: simplify nevermatcher
Most of it does the same as its superclass, so it can simply be
removed. It also seems to make more sense for it to use relative
paths, as we do for everything except alwaysmatcher, although
nevermatcher.uipath() will probably never get called anyway, so it
won't matter.
2017-06-01 08:31:21 -07:00
Siddharth Agarwal
9202c22f7c match: introduce nevermatcher for when no ignore files are present
c01965ab5195 introduced a deterministic `__repr__` for ignores. However, it
didn't account for when ignore was `util.never`. This broke fsmonitor's ignore
change detection -- with an empty hgignore, it would kick in all the time.

Introduce `nevermatcher` and switch to it. This neatly parallels
`alwaysmatcher`.
2017-06-01 00:40:52 -07:00
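
A minimal sketch of why a matcher object with its own `__repr__` helps here, assuming the change detection compares string representations (the commit message only says that the representation has to be deterministic). The names `never` and `NeverMatcher` below are hypothetical stand-ins, not Mercurial's actual code. In CPython, the default repr of a plain function embeds a memory address, so it can differ from one process to the next.

```python
def never(path):
    """Plain function that matches nothing (hypothetical stand-in)."""
    return False


class NeverMatcher:
    """Hypothetical matcher that matches nothing and has a stable repr."""

    def __call__(self, path):
        return False

    def __repr__(self):
        # No memory address here, so the string is the same in every process.
        return "<nevermatcher>"


# repr(never) looks like "<function never at 0x...>" in CPython, while
# repr(NeverMatcher()) is always "<nevermatcher>".
```

Any code that stores the repr and compares it later sees the same string for every `NeverMatcher` instance, which is the property the commit relies on.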
Martin von Zweigbergk
a00e1e8b0a match: remove special-casing of always-matching patterns in patternmatcher
This moves the optimization for patterns that match everything to the
caller, so we can remove it from patternmatcher.

Note that we need to teach alwaysmatcher to use relative paths now in
cases like "hg files .." from inside mercurial/, because while it
still matches everything, paths should be printed relative to the
working directory.
2017-05-19 13:16:15 -07:00
Martin von Zweigbergk
40160c59ed match: move normalize() call out of matcher constructors
By passing in the result of the normalize() call, we prepare for
moving the special handling of patterns that always match out of the
patternmatcher.

It also lets us remove many of the arguments from the matcher, because
they were passed only to the normalize function (we could have
removed the arguments by binding them to the function instead of
moving the normalize() call out).
2017-05-19 12:47:45 -07:00
Martin von Zweigbergk
cc35e8cc47 match: drop support for empty pattern list in patternmatcher
Since the caller now deals with empty pattern lists, we can drop that
support in the patternmatcher. It now gets the more logical behavior
of matching nothing when no patterns are given (although there is no
in-core caller that will pass no patterns).
2017-05-19 11:58:16 -07:00
Martin von Zweigbergk
b0d04b4dc4 match: optimize visitdir() for when no explicit files are listed
In patternmatcher, we used to say that all directories should be
visited if no explicit files were listed, because the case of empty
_files usually implied that no patterns were given (which in turn
meant that everything should match). However, this made e.g. "hg files
-r .  rootfilesin:."  slower than necessary, because that also ended
up with an empty list in _files. Now that patternmatcher does not
handle includes, the only remaining case where its _files/_fileset
fields will be empty is when it's matching everything. We can
therefore treat the always-case specially and stop treating the empty
_files case specially. This makes the case mentioned above faster on
treemanifest repos.
2017-05-20 23:49:14 -07:00
Martin von Zweigbergk
57f17ff9f2 match: handle everything-matching using new alwaysmatcher
Having a special matcher that always matches seems to make more sense
than making one of the other matchers handle the case. For now, we
just use this new matcher when no patterns were provided.
2017-05-19 11:50:01 -07:00
Martin von Zweigbergk
e63330c2d2 match: add __repr__ for subdirmatcher
Should at least be useful for debugging. Would matter for correctness
too if fsmonitor or Facebook's sparse extension worked with subrepos
(which I don't know if they do).
2017-05-26 13:08:30 -07:00
Yuya Nishihara
477ffb0437 match: define exactmatcher.matchfn statically
This should eliminate the reference cycle, self.matchfn -> self.exact -> self.
2017-05-28 23:54:31 +09:00
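
The cycle mentioned above can be reproduced in a few lines. This is a sketch with hypothetical class names, and it relies on CPython's reference counting: storing a bound method on the instance makes the instance reachable from its own attribute, so it is only reclaimed by the cycle collector.

```python
import gc
import weakref


class CyclicMatcher:
    """Stores a bound method on the instance: self.matchfn -> self.exact -> self."""

    def __init__(self, files):
        self._fileset = set(files)
        self.matchfn = self.exact  # bound method keeps a reference to self

    def exact(self, f):
        return f in self._fileset


class StaticMatcher:
    """Defines matchfn as an ordinary method, so no per-instance cycle exists."""

    def __init__(self, files):
        self._fileset = set(files)

    def matchfn(self, f):
        return f in self._fileset


def freed_by_refcount_alone(factory):
    """Return True if the object dies as soon as its last name is deleted."""
    gc.disable()  # keep the cycle collector out of the experiment
    try:
        obj = factory()
        ref = weakref.ref(obj)
        del obj
        return ref() is None
    finally:
        gc.enable()
        gc.collect()  # clean up any cycle left behind
```

Both classes match the same files; only the lifetime of the instances differs.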
Yuya Nishihara
b7251d7b93 match: remove override of prefix() from differencematcher
It's exactly the same as basematcher.prefix().
2017-05-28 23:51:30 +09:00
Martin von Zweigbergk
882acf90e8 match: remove support for includes from patternmatcher
Includes (and excludes) are now delegated to the includematcher.
2017-05-19 11:44:05 -07:00
Martin von Zweigbergk
19566696b9 match: simplify includematcher a bit
The "include" we have in symbols is redundant and the double negative
in visitdir() can be removed.
2017-05-22 23:31:15 -07:00
Martin von Zweigbergk
66dd6b9e1c match: remove support for non-include patterns from includematcher
The includematcher will always get at least one include pattern and
will never get any non-include patterns, so we can remove most of the
code in it. This patch does mostly straight-forward deletions of
code. We will clean up further later.
2017-05-19 13:36:34 -07:00
Martin von Zweigbergk
1ba59afc49 match: split up main matcher into patternmatcher and includematcher
At this point the includematcher is an exact copy of the main matcher
class. We will specialize and simplify both classes in the following
patches. This initial unmodified copy is just to make the differences
clearer. We also rename the main matcher to "patternmatcher" for
consistency.

I may eventually merge this new includematcher back into the main
matcher, but I think doing it this way makes the intermediate steps
clearer regardless.
2017-05-19 22:36:14 -07:00
Martin von Zweigbergk
ce94f073cb match: remove support for exact matching from main matcher class
Exact matching is now handled by the exactmatcher class.

We can safely remove _files from the __repr__() implementation,
because even though the field is set, the patternspat field is enough
for the representation to be unambiguous (which was not the case when
the matcher could handle exact matches).
2017-05-18 23:39:39 -07:00
Martin von Zweigbergk
6cc2daf5d6 match: handle exact matching using new exactmatcher 2017-05-17 09:26:15 -07:00
Martin von Zweigbergk
7767620115 match: handle includes using new intersectionmatcher 2017-05-12 23:12:05 -07:00
Martin von Zweigbergk
8a54c0d671 match: move entire uipath() implementation to basematcher
Even though most matchers will always want to use the relative path in
uipath(), when we add support for intersecting matcher, we will want
to control which form to use for any kind of matcher without knowing
the type (see next patch), so we need the implementation on the base
class.

Also rename the attribute from "pathrestricted" to "relativeuipath"
since there actually are cases where we match everything but still use
relative paths (like when the user runs "hg files .." from inside
mercurial/).
2017-05-25 14:32:56 -07:00
Martin von Zweigbergk
6f7738b741 match: remove support for excludes from matcher class
The support is now provided by differencematcher() and still available
via the match() function.
2017-05-16 22:15:42 -07:00
Martin von Zweigbergk
8d0d310985 match: handle excludes using new differencematcher
As I've said on earlier patches, I'm hoping to use more composition of
simpler matchers instead of the single complex matcher we currently
have. This extracts a first new matcher that composes two other
matchers. It matches if the first matcher matches but the second does
not. As such, we can use it for excludes, which this patch also
does. We'll remove the now-unnecessary code for excludes in the next
patch.
2017-05-16 16:36:48 -07:00
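
The composition described above can be sketched as follows. The classes are simplified, hypothetical stand-ins (real matchers carry much more state, such as visitdir() and uipath()); the point is only that "matches the first but not the second" is enough to express excludes.

```python
class AlwaysMatcher:
    """Hypothetical matcher that accepts every path."""

    def __call__(self, path):
        return True


class PrefixMatcher:
    """Hypothetical matcher for a directory and everything below it."""

    def __init__(self, prefix):
        self.prefix = prefix

    def __call__(self, path):
        return path == self.prefix or path.startswith(self.prefix + "/")


class DifferenceMatcher:
    """Matches when the first matcher matches and the second does not."""

    def __init__(self, m1, m2):
        self.m1 = m1
        self.m2 = m2

    def __call__(self, path):
        return self.m1(path) and not self.m2(path)


# "everything except tests/" expressed as a difference of two simple matchers
everything_but_tests = DifferenceMatcher(AlwaysMatcher(), PrefixMatcher("tests"))
```

With this shape, neither of the two inner matchers needs to know that excludes exist.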
Martin von Zweigbergk
2724c60601 match: override matchfn() the usual way in subdirmatcher 2017-05-25 09:52:56 -07:00
Martin von Zweigbergk
cb783946fc match: make matchfn a method on the class
This makes it easier to override in subclasses, so they don't have to
assign the attribute with a lambda.
2017-05-25 09:52:49 -07:00
Martin von Zweigbergk
de3c23309e match: fix visitdir for roots of includes
I'm hoping to rewrite the matcher so excludes are handled by
composition of one matcher with another matcher where the second
matcher has only includes. For that to work, we need to make
visitdir() return 'all' for directory 'foo' for a '-I foo' matcher.
2017-05-16 14:31:21 -07:00
Martin von Zweigbergk
605d9dfcea match: make subdirmatcher extend basematcher
This makes the subdirmatcher not depend on the main matcher, giving us
more freedom to modify that (specifically, it will lose its _always
field in a while).
2017-05-17 23:02:42 -07:00
Martin von Zweigbergk
5e75aba9b0 match: make basematcher._files a @propertycache
This will make it easier to override in subclasses (otherwise the
function @propertycache object will be replaced by the
super-constructor call).
2017-05-19 10:17:08 -07:00
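
A sketch of the problem being avoided, using a simplified re-implementation of a caching property (not Mercurial's own `propertycache`) and hypothetical class names. Because the descriptor only defines `__get__`, an attribute assigned in a base-class constructor lands in the instance dictionary and shadows whatever a subclass declared.

```python
class propertycache:
    """Simplified caching property: compute once, then store on the instance."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.func(obj)
        obj.__dict__[self.name] = value  # later lookups bypass the descriptor
        return value


class EagerBase:
    def __init__(self):
        self._files = []  # instance attribute, set by the constructor


class EagerChild(EagerBase):
    @propertycache
    def _files(self):  # never consulted: the constructor's value wins
        return ["a.txt"]


class LazyBase:
    @propertycache
    def _files(self):
        return []


class LazyChild(LazyBase):
    @propertycache
    def _files(self):  # ordinary override, works as expected
        return ["a.txt"]
```

Here `EagerChild()._files` is the empty list from the constructor, while `LazyChild()._files` is the subclass's value.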
Martin von Zweigbergk
c9664eeaa0 match: extract base class for matchers
We will soon start splitting up the current matcher class into more
specialized classes, so we'll want a base class for all the things
that don't vary much between different matchers.
2017-05-17 23:45:13 -07:00
Martin von Zweigbergk
2410b10b5f match: use ProgrammingError where appropriate 2017-05-23 08:49:01 -07:00
Martin von Zweigbergk
243fda7165 match: catch attempts to create case-insensitive exact matchers
Exact matchers are only created internally (as opposed to from user
input) based on a set of files that the caller collected before, so
they should always match the list exactly (i.e. case-sensitively).
2017-05-22 08:49:34 -07:00
Martin von Zweigbergk
ee3be3c6ea match: implement __repr__() and update users (API)
fsmonitor and debugignore currently access matcher fields that I would
consider implementation details, namely patternspat, includepat, and
excludepat. Let's instead implement __repr__() and have the few users
use that instead.

Marked (API) because the fields can now be None.
2017-05-22 11:08:18 -07:00
Martin von Zweigbergk
722dff8abb match: replace icasefsmatch() function by flag to regular match()
match() will soon gain more logic and we don't want to duplicate that
in icasefsmatch(), so merge the two functions instead and use a flag
to get case-insensitive behavior.
2017-05-18 22:20:59 -07:00
Martin von Zweigbergk
8daa488d75 match: delete icasefsmatcher now that it's same as matcher 2017-05-18 16:48:02 -07:00
Martin von Zweigbergk
a7219ec491 match: pass in normalize() function to matchers
This will let us delete icasefsmatcher.
2017-05-18 15:45:50 -07:00
Martin von Zweigbergk
04a6ac40df match: don't print explicitly listed files with wrong case (BC)
On case-insensitive file systems, if file A exists and you try to
remove it (or add, etc.) by specifying a different case, you will see
something like this:

  $ hg rm a
  removing file A

I honestly found this surprising because it seems to me like it was
explicitly listed by the user. Still, there is a comment in the code
describing it, so it is very clearly intentional. The code was added
in d70aa474bd84 (match: add a subclass for dirstate normalizing of the
matched patterns, 2015-04-12).

I'm going to do a lot of refactoring to matchers and the feature
mentioned above is going to get in my way. I'm therefore removing it
for the time being and we can hopefully add it back when I'm done.
2017-05-18 16:05:46 -07:00
Martin von Zweigbergk
6b512ea877 match: move body of _normalize() to a static function
matcher._normalize() no longer depends on any of the matcher's state,
and making it static will enable further refactoring. Note that the
subdirmatcher subclass calls _normalize(), so we can't remove it
completely.
2017-05-18 15:25:16 -07:00
Martin von Zweigbergk
57f69ee76c match: pass 'warn' argument to _normalize() for consistency
No other arguments are passed via the matcher's state, so we should
treat 'warn' consistently. More importantly, this will let us make
it a static function, which will help with further refactoring.
2017-05-18 15:11:04 -07:00
Martin von Zweigbergk
ab0ee38146 match: replace match class by match function (API)
The matcher class is getting hard to understand. It will be easier to
follow if we can break it up into simpler matchers that we then
compose. I'm hoping to have one matcher that accepts regular
(non-include) patterns, one for exact file matches, one that always
matches (and maybe one that never does) and then compose them by
intersection and difference.

This patch takes a simple but important step towards that goal by
making match.match() a function (and renaming the matcher class itself
from "match" to "matcher"). The new function will eventually be
responsible for creating the simple matchers and composing them.

icasefsmatcher similarly gets a factory function (called
"icasefsmatch"). I also moved the other factory functions nearby.
2017-05-12 23:11:41 -07:00
Martin von Zweigbergk
64acd1e09e match: use match.prefix() in subdirmatcher
It seems like the subdirmatcher should be checking if the matcher it's
based on is matching prefixes. It was effectively doing that already
because "prefix() == not always() and not anypats() and not
isexact()", subdirmatcher was checking the first two parts of that
condition and I don't think it will ever be given an "exact" matcher
with its directory name (because exact matchers are for matching
files, not directories). Still, let's switch to using prefix() for
clarity (and because I'm trying to remove code that reaches for
matchers internals).
2017-05-17 22:33:15 -07:00
Martin von Zweigbergk
096fa56147 match: avoid accessing match._pathrestricted from subdirmatcher
Accessing only the public API wherever possible helps us refactor
matchers later.
2017-05-12 16:31:21 -07:00
Martin von Zweigbergk
ddbf56e07e match: override visitdir() the usual way in subdirmatcher
Just override the function instead of replacing it on each instance.
2017-05-18 10:17:57 -07:00
Martin von Zweigbergk
767cd2bb63 match: make _fileroots a @propertycache and rename it to _fileset
The files in the set are not necessarily roots of anything. Making it
a @propertycache will help towards extracting a base class for
matchers.
2017-05-18 09:04:37 -07:00
Martin von Zweigbergk
3bc2187d25 match: remove ispartial()
The function was added in c2498bb6d298 (match: add match.ispartial(),
2015-05-15) for use by narrowhg, but narrowhg never ended up needing
it.
2017-05-17 09:43:50 -07:00
Martin von Zweigbergk
c3406ac3db cleanup: use set literals
We no longer support Python 2.6, so we can now use set literals.
2017-02-10 16:56:29 -08:00
Martin von Zweigbergk
41765f4b1b match: optimize visitdir() for patterns matching only root directory
_rootsanddirs() returns a list of directories to visit
recursively and a list of directories to visit non-recursively. For
patterns such as 'rootfilesin:foo/bar', we clearly need to visit the
directory foo/bar, but we also need to visit its parents. The method
therefore uses util.dirs() to find the parent directories of
'foo/bar'. That method does not include the root directory, but since
we obviously need to visit the root directory, we always added '.' to
the set of directories to visit non-recursively.

The visitdir() method had special handling to consider set(['.']) to
mean that no includes had been specified and would thus visit all
directories. However, when the pattern is 'rootfilesin:.', set(['.'])
is actually the real set of directories to visit and the special
handling of that set meant that all directories got visited instead of
just the root directory.

The fix is simple: add '.' to the set of parent directories in
_rootsanddirs() and stop treating set(['.']) specially. This makes

  hg files -r .  -I rootfilesin:.

in a treemanifest version of the Firefox repo go from 1.5s to 0.26s on
warm disk (and a *much* bigger improvement on cold disk).

Note that the -I is necessary for no good reason. We just haven't
optimized visitdir() for regular (non-include, non-exclude) patterns
yet.
2017-05-05 08:49:07 -07:00
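
The fix described above can be illustrated with a much-reduced model. The helper names are hypothetical and only cover a single 'rootfilesin:' pattern; the real _rootsanddirs() and visitdir() handle many more cases. The idea is that '.' is always part of the computed set, so the set never needs a special meaning.

```python
def dirs_to_visit(rootfilesin_dir):
    """Directories to visit non-recursively for one 'rootfilesin:DIR' pattern.

    Includes DIR itself, all of its parents, and the root directory '.'.
    """
    dirs = {".", rootfilesin_dir}
    d = rootfilesin_dir
    while "/" in d:
        d = d.rsplit("/", 1)[0]  # strip the last path component
        dirs.add(d)
    return dirs


def visitdir(directory, dirs):
    """Visit a directory only if it is in the computed set (no special cases)."""
    return directory in dirs
```

For 'rootfilesin:.' the set is just the root, so a directory such as 'mercurial' is skipped instead of being walked.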
Durham Goode
5566fd666f match: make subinclude construction lazy
The matcher subinclude functionality allows us to have .hgignore files that
include subdirectory hgignore files. Today it parses the entire repo at once,
even if we only need to test a file in one subdirectory. This patch makes the
subinclude tree creation lazy, which speeds up matcher creation significantly in
large repos with very large trees of ignore patterns.
2017-05-03 10:30:57 -07:00
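
A rough sketch of the lazy approach, under simplifying assumptions: each subdirectory has a loader callable that returns the names ignored in that directory, and the class name `LazySubincludes` is hypothetical. The per-directory data is only built the first time a path under that directory is tested.

```python
class LazySubincludes:
    """Builds the per-directory pattern set on first use, then caches it."""

    def __init__(self, loaders):
        # loaders maps a directory prefix to a callable returning ignored names
        self._loaders = loaders
        self._built = {}

    def _patterns_for(self, prefix):
        if prefix not in self._built:
            # The expensive parse happens here, once per directory.
            self._built[prefix] = set(self._loaders[prefix]())
        return self._built[prefix]

    def __call__(self, path):
        for prefix in self._loaders:
            if path.startswith(prefix + "/"):
                relative = path[len(prefix) + 1:]
                return relative in self._patterns_for(prefix)
        return False  # not under any subinclude directory
```

Testing a file in one subdirectory therefore never triggers parsing for the others.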
Pierre-Yves David
be968edfac match: explicitly tests for None
Changeset ba7f2a1cc2d2 removed the mutable default value, but did not explicitly
test for None. Such implicit testing can introduce semantic and performance
issues. We move to explicit testing for None as recommended by PEP8:

https://www.python.org/dev/peps/pep-0008/#programming-recommendations
2017-03-15 15:08:45 -07:00
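
One way the implicit test can change semantics, shown with hypothetical helper functions: an empty list passed in by the caller is falsy, so a truthiness test silently swaps it for a new list, whereas `is None` only replaces a genuinely missing argument.

```python
def add_implicit(item, bucket=None):
    if not bucket:  # also true for an empty list supplied by the caller
        bucket = []
    bucket.append(item)
    return bucket


def add_explicit(item, bucket=None):
    if bucket is None:  # only true when no list was supplied
        bucket = []
    bucket.append(item)
    return bucket
```

With `add_implicit`, a caller who passes its own empty list never sees the appended item; with `add_explicit`, it does.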
Pulkit Goyal
5a7c5d918e match: slice over bytes to get the byteschr instead of ascii value 2017-03-16 08:03:51 +05:30
Pulkit Goyal
0bad6ee6aa match: make regular expression bytes to prevent TypeError 2017-03-16 07:52:47 +05:30
Rishabh Madan
6a6d5ec05c py3: open file in rb mode 2017-03-15 14:51:18 +05:30
Gregory Szorc
98c99b99fa match: don't use mutable default argument value
There shouldn't be a big perf hit creating a new object because
this function is complicated and does things that dwarf the cost
of creating a new PyObject.
2017-03-12 21:53:03 -07:00
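
For context, the general pitfall with mutable defaults looks like this (hypothetical function names, not the code touched by the commit). The default object is created once, when the function is defined, and is then shared by every call that omits the argument.

```python
def collect_shared(item, bucket=[]):
    # The same list object is reused across calls that rely on the default.
    bucket.append(item)
    return bucket


def collect_fresh(item, bucket=None):
    # A new list is created on each call that omits the argument.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Creating the new list per call is the small cost that the commit message argues is negligible.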