dirstate: respect gitignore

Summary:
Use the new gitignore matcher powered by Rust.

The hgignore matcher has some laziness, but is not tree-aware - with N
"hgignore" files loaded, it needs O(N) time to match.  The gitignore matcher
is tree-aware and backed by native code with decent time complexity.
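
To illustrate the difference in lookup cost, here is a minimal sketch. It is not
the Mercurial or Rust implementation: RULES, flat_match and tree_match are
hypothetical names, and patterns are reduced to basename globs. The flat
version examines every loaded rule set, while the tree-aware version only
visits rule sets in the ancestors of the path.

```python
import fnmatch
import posixpath

# Hypothetical data: directory containing an ignore file -> basename globs.
RULES = {
    "": ["*.tmp"],
    "lib": ["*.o"],
    "docs": ["*.html"],
}


def flat_match(path):
    """Examine every loaded rule set; return (ignored, rule sets examined)."""
    name = posixpath.basename(path)
    checked = 0
    ignored = False
    for directory, patterns in RULES.items():
        checked += 1
        inside = directory == "" or path.startswith(directory + "/")
        if inside and any(fnmatch.fnmatch(name, p) for p in patterns):
            ignored = True
    return ignored, checked


def tree_match(path):
    """Examine only rule sets found in the ancestors of path."""
    name = posixpath.basename(path)
    checked = 0
    ignored = False
    directory = posixpath.dirname(path)
    while True:
        patterns = RULES.get(directory)
        if patterns is not None:
            checked += 1
            if any(fnmatch.fnmatch(name, p) for p in patterns):
                ignored = True
        if directory == "":
            break
        directory = posixpath.dirname(directory)
    return ignored, checked
```

With these three rule sets, flat_match always examines all three, while
tree_match examines at most one per ancestor directory of the path.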

We have been maintaining a translation script that collects all gitignores
and generates hgignore files with very long regexps for them. That script
has recently had issues with sparse. This diff allows us to remove those
generated hgignore files from the repo.

Note: fsmonitor state does not contain ignored files, and ignore
invalidation is generally broken in fsmonitor (it only checks the top-level
.hgignore). That means that once a file is ignored, it cannot be "unignored"
just by removing the matching pattern from ".gitignore". The file also has
to be "touched" or similar.

Reviewed By: markbt

Differential Revision: D7319608

fbshipit-source-id: 1763544aedb44676413efb6d14ffd3917ed3b1cd
Jun Wu 2018-03-29 11:01:06 -07:00 committed by Saurabh Singh
parent 5ea461493e
commit 26b5601cf3
5 changed files with 51 additions and 5 deletions


@@ -202,6 +202,7 @@ def populateextmods(localmods):
         'hgext.patchrmdir',
         'hgext.traceprof',
         'mercurial.cext.xdiff',
+        'mercurial.rust.matcher',
     ])
     return newlocalmods


@@ -164,11 +164,13 @@ class dirstate(object):
     @rootcache('.hgignore')
     def _ignore(self):
         files = self._ignorefiles()
+        gitignore = matchmod.gitignorematcher(self._root, '')
         if not files:
-            return matchmod.never(self._root, '')
+            return gitignore
         pats = ['include:%s' % f for f in files]
-        return matchmod.match(self._root, '', [], pats, warn=self._ui.warn)
+        hgignore = matchmod.match(self._root, '', [], pats, warn=self._ui.warn)
+        return matchmod.unionmatcher([gitignore, hgignore])

     @propertycache
     def _slash(self):

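The control flow of the `_ignore` change can be sketched in isolation. This is
a simplified illustration with stand-in classes, not Mercurial's `matchmod`
API: SuffixMatcher, UnionMatcher and build_ignore are hypothetical names. The
point it shows is that the gitignore matcher is always built, and is combined
with the hgignore matcher only when `.hgignore` files exist.

```python
class SuffixMatcher:
    """Hypothetical stand-in for a matcher: matches paths ending in a suffix."""

    def __init__(self, suffix):
        self._suffix = suffix

    def __call__(self, path):
        return path.endswith(self._suffix)


class UnionMatcher:
    """Matches when any of the wrapped matchers matches."""

    def __init__(self, matchers):
        self._matchers = list(matchers)

    def __call__(self, path):
        return any(m(path) for m in self._matchers)


def build_ignore(gitignore, hgignore_files, make_hgignore):
    """Mirror the branch structure of the change with stand-in objects."""
    if not hgignore_files:
        # No .hgignore files: the gitignore matcher is used on its own.
        return gitignore
    return UnionMatcher([gitignore, make_hgignore(hgignore_files)])


gitignore = SuffixMatcher(".tmp")
only_git = build_ignore(gitignore, [], lambda files: SuffixMatcher(".o"))
both = build_ignore(gitignore, [".hgignore"], lambda files: SuffixMatcher(".o"))
```

Here `only_git` is the gitignore stand-in itself, while `both` ignores a path
if either stand-in matches it.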

@@ -12,6 +12,7 @@ import os
 import re

 from .i18n import _
+from .rust import matcher as rustmatcher
 from . import (
     error,
     pathutil,
@@ -370,6 +371,24 @@ class nevermatcher(basematcher):
     def __repr__(self):
         return '<nevermatcher>'

+class gitignorematcher(basematcher):
+    '''Match files specified by ".gitignore"s'''
+
+    def __init__(self, root, cwd, badfn=None):
+        super(gitignorematcher, self).__init__(root, cwd, badfn)
+        self._matcher = rustmatcher.gitignorematcher(root)
+
+    def matchfn(self, f):
+        # XXX: is_dir is set to True here for performance.
+        # It should be set to whether "f" is actually a directory or not.
+        return self._matcher.match_relative(f, True)
+
+    def visitdir(self, dir):
+        return not self._matcher.match_relative(dir, True)
+
+    def __repr__(self):
+        return '<gitignorematcher>'
+
 class patternmatcher(basematcher):

     def __init__(self, root, cwd, kindpats, ctx=None, listsubrepos=False,

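The XXX comment in `matchfn` matters for directory-only patterns. The sketch
below is an assumption-laden illustration, not the Rust matcher: matches_rule
is a hypothetical helper that models only the gitignore rule that a trailing
"/" restricts a pattern to directories. It shows how forcing is_dir to True
could change the answer for a regular file.

```python
import fnmatch


def matches_rule(name, is_dir, rule):
    """Simplified single-rule check: a trailing '/' limits the rule to directories."""
    if rule.endswith("/"):
        return is_dir and fnmatch.fnmatchcase(name, rule[:-1])
    return fnmatch.fnmatchcase(name, rule)


# A regular file that happens to be named "build", checked against "build/":
accurate = matches_rule("build", False, "build/")  # real file type passed in
shortcut = matches_rule("build", True, "build/")   # is_dir forced to True
```

Under this model the accurate call does not match, while the shortcut call
does, so such a file would be treated as ignored.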
tests/test-gitignore.t Normal file

@@ -0,0 +1,23 @@
+  $ newrepo
+  $ cat > .gitignore << EOF
+  > *.tmp
+  > build/
+  > EOF
+  $ mkdir build exp
+  $ cat > build/.gitignore << EOF
+  > !*
+  > EOF
+  $ cat > exp/.gitignore << EOF
+  > !i.tmp
+  > EOF
+  $ touch build/libfoo.so t.tmp Makefile exp/x.tmp exp/i.tmp
+  $ hg status
+  ? .gitignore
+  ? Makefile
+  ? exp/.gitignore
+  ? exp/i.tmp

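The expected `hg status` output in this test follows from three gitignore
rules: the last matching rule in a file wins, a deeper `.gitignore` overrides
a shallower one, and nothing inside an ignored directory can be re-included.
The sketch below models only those rules on basename globs. It is a
simplified model, not the real matcher, and IGNORE_FILES, FILES and
is_ignored are hypothetical names.

```python
import fnmatch

# The .gitignore contents from the test, keyed by containing directory.
IGNORE_FILES = {
    "": ["*.tmp", "build/"],
    "build": ["!*"],
    "exp": ["!i.tmp"],
}

FILES = [
    ".gitignore", "Makefile", "t.tmp",
    "build/.gitignore", "build/libfoo.so",
    "exp/.gitignore", "exp/i.tmp", "exp/x.tmp",
]


def _ancestors(path):
    """Yield the directories containing path, deepest first, ending with ''."""
    parts = path.split("/")[:-1]
    for end in range(len(parts), -1, -1):
        yield "/".join(parts[:end])


def _rule_verdict(path, is_dir):
    """Apply the nearest ignore file that has a matching rule; last rule wins."""
    name = path.rsplit("/", 1)[-1]
    for directory in _ancestors(path):
        verdict = None
        for rule in IGNORE_FILES.get(directory, []):
            negated = rule.startswith("!")
            pattern = rule[1:] if negated else rule
            if pattern.endswith("/"):
                if not is_dir:
                    continue  # directory-only rule, path is a file
                pattern = pattern[:-1]
            if fnmatch.fnmatchcase(name, pattern):
                verdict = not negated
        if verdict is not None:
            return verdict
    return False


def is_ignored(path, is_dir=False):
    """A path is ignored if any parent directory is, or if its own rules say so."""
    parts = path.split("/")
    for end in range(1, len(parts)):
        if _rule_verdict("/".join(parts[:end]), True):
            return True
    return _rule_verdict(path, is_dir)


unknown = sorted(f for f in FILES if not is_ignored(f))
```

In this model `build/` is ignored by the top-level file, so the `!*` in
`build/.gitignore` never takes effect, while `exp/.gitignore` re-includes
only `i.tmp`.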

@@ -1,9 +1,10 @@
   $ hg init ignorerepo
   $ cd ignorerepo

-debugignore with no hgignore should be deterministic:
+gitignore is used when there is no hgignore:
+
   $ hg debugignore
-  <nevermatcher>
+  <gitignorematcher>

 Issue562: .hgignore requires newline at end:

@@ -197,7 +198,7 @@ Test relative ignore path (issue4473):
   A b.o

   $ hg debugignore
-  <includematcher includes='(?:(?:|.*/)[^/]*(?:/|$))'>
+  <unionmatcher matchers=[<gitignorematcher>, <includematcher includes='(?:(?:|.*/)[^/]*(?:/|$))'>]>

   $ hg debugignore b.o
   b.o is ignored