This gives the repository control over which nested repository paths
that should be allowed via the custom path auditor.
Since paths into subrepositories are now allowed, dirstate.walk must
now filter away more paths than before.
When the filesystem cannot handle the executable bit, we currently
ignore it completely when looking for modified files. Similarly, it is
impossible to set or clear the bit when the filesystem ignores it.
This patch makes Mercurial treat symbolic links the same way.
Symlinks are a little different since they manifest themselves as
small files containing a filename (the symlink target). On Windows,
these files show up as regular files, and on Linux and Mac they show
up as real symlinks.
Issue1888 presents a case where the symlink files are better ignored
from the Windows side. A Linux client creates symlinks in a working
copy which is shared over a network between Linux and Windows clients.
The Samba server is helpful and defererences the symlink when the
Windows client looks at it. This means that Mercurial on the Windows
side sees file content instead of a file name in the symlink, and
hence flags the link as modified. Ignoring the change would be much
more helpful, similarly to how Mercurial does not report any changes
when executable bits are ignored in a checkout on Windows.
An initial checkout of a symbolic link on a file system that cannot
handle symbolic links will still result in a regular file containing
the target file name as its content. Sharing such a checkout with a
Linux client will not turn the file into a symlink automatically, but
'hg revert' can fix that. After the revert, the Windows client will
see the correct file content (provided by the Samba server when it
follows the link on the Linux side) and otherwise ignore the change.
Running 'hg perfstatus' 10 times gives these results:
Before: After:
min: 0.544703 min: 0.546549
med: 0.547592 med: 0.548881
avg: 0.549146 avg: 0.548549
max: 0.564112 max: 0.551504
The median time is increased about 0.24%.
The dirstate.granularity configuration parameter was never documented,
it only adds code complexity and it is unneeded.
Adding comments describing forced 'unset' entries.
This change narrows the race guard that was introduced by ffd022830d6d
("dirstate: ignore stat data for files that were updated too recently")
to not discard the _map entry's stat data if the mtime is in the future.
Without this change, status locks files having odd mtimes in the future
into the 'unset' state, causing needless file compares later (admittedly
harmless), but also inflicting highly irritating sticky effects on
tools/plugins that directly read .hg/dirstate (e.g. TortoiseHg).
Bound methods hold a reference to self, so assigning a bound method to
an instance unavoidably creates a cycle. Work around this by choosing
a normalize method at walk time instead. Eliminate default arg while
we're at it.
- merge always and match with patterns
- make always and match with patterns the default
- invert dostep3 to skipstep3
- move dirignore test inside exact case
nothing will ever match on match.never
nothing new will match on match.exact (all found in step 1)
nothing new will match on match.match when
there is no pattern and
there is no direcory in pats
Copy information was saved in a common loop, then refined in a git-only block.
The problem was the latter did filter out renames occuring in the current
patch and irrelevant to commit. In the non-git case, copy records still existed
in the dirstate, referencing removed files, making the commit to fail. Git and
non-git copy handling paths are now separated for simplicity.
Reported by Gary Bernhardt
util module implements two versions of statfiles function
_statfiles calls lstat per file
_statfiles_clustered takes advantage of optimizations in osutil.c, stats all
files in directory at once when new directory is hit and caches the results
util.statfiles dispatches to appropriate version during module loading
The speedup on directory tree with 2k directories and 63k files is about
factor of 1.8 (1.3s -> 0.8s for hg diff - hg startup overhead about .2s)
At this point only Win32 now benefit from this patch.
Rest of OSes use the non clustered implementation.
Normcase already takes care of upper/lower case and /->\ conversions.
What's left for normpath is folding of a/../a sequences but this should
be either done consistently on both non-folding and folding code path
or not at all, otherwise we are introducing inconsistent behavior between the
two that has nothing to do with case folding.
Second argument against it - normpath being pure Python function is very slow -
as much as 50% of time is spend just inside normpath call on my repository.
This patch fixes regression reported in 1286 that causes util.fspath
to be called for every file not in current manifest - including ignored files.
The regression is quite severe - the time for simple hg st goes from 5s to 1m38s
on one of my source trees - which basically renders mercurial useless.
Ignore unknown files if we don't need them (eg in hg diff).
It slows things down a little bit for big trees (kernel repo), since _join()
is called for each file instead of for each directory.
fix issue567
- add fast _finddirs function
- remove recursion from incpath/decpath
- split changepath into addpath/droppath
- change relax arg to check
- move incpathcheck logic into addpath
- move incpath into addpath
- move decpath into droppath
- inline code in self._dirs creation
add _checklink var to dirstate
introduce dirstate.flagfunc
switch users of util.execfunc/linkfunc to flagfunc
change manifestdict.set to take a flags string
change ctx.fileflags to ctx.flags
change gitmode func to a dict
remove util.execfunc/linkfunc
This method returns the normalised form of a path. This is
- the form in the dirstate, if available, or
- the form on disk, if available, or
- the form passed on the command line
normalize() is called on the type-'f' result of statwalk.
This fixes issues 910 and 1092
This should fix the race where
hg commit foo
<change foo without changing its size>
happens in the same second and status is fooled into thinking foo
is clean.
A configuration item is used to determine the timeout, since different
filesystems may have different requirements (I think VFAT needs 3s,
while most Unix filesystems are fine with 1s).
We encode the previous state as a negative file size (AFAICS, previous
versions of hg always have size == 0 when state == 'r').
We save the state of 'm'erged and dirty files, because they're the
two states that indicate that a file has to be committed on a merge
to correctly record per-file history.
With a pattern like '^directory$' in .hgignore, a "hg status directory"
would still walk "directory" and all its subdirs.
This is the first half of a fix for issue886.
Workaround for dir-changed-to-file updates mentioned
in rev c3f3393b9096 doesn't actually work since tests
introduced in mentioned changeset prevented dirstate
updates even if working directory updates succeded.
Make tests more relaxed for dirstate operations
not directly accessible from cli. See also issue660.
While here, move _dirs existance check from _decpath()
to _changepath() for unification.
Allow adding to dirstate files that clash with previously existing
but marked for removal. Protect from reintroducing clashes by revert.
This change doesn't address related issues with update. Current
workaround is to do "clean" update by manually removing conflicting
files/dirs from working directory.
read:
- single call to len(st)
- fewer assignments for position tracking
- don't split apart tuple from unpack
- use a literal for the unpack spec
write:
- localize variables and functions
- avoid copied function call
- use % for string concatenation
- shortcircuit decpath if we haven't built the _dirs map
- increment only for leafnodes of directory tree
(this should make construction more like O(nlog n) than O(n^2))
After a hg merge, we want to include in the commit all the files that we
got from the second parent, so that we have the correct file-level
history. To make them visible to hg commit, we try to mark them as dirty.
Unfortunately, right now we can't really mark them as dirty[1] - the
best we can do is to mark them as needing a full comparison of their
contents, but they will still be considered clean if they happen to be
identical to the version in the first parent.
This changeset extends the dirstate format in a compatible way, so that
we can mark a file as dirty:
Right now we use a negative file size to indicate we don't have valid
stat data for this entry. In practice, this size is always -1.
This patch uses -2 to indicate that the entry is dirty. Older versions
of hg won't choke on this dirstate, but they may happily mark the file
as clean after a full comparison, destroying all of our hard work.
The patch adds a dirstate.normallookup method with the semantics of the
current normaldirty, and changes normaldirty to forcefully mark the
entry as dirty.
This should fix issue522.
[1] - well, we could put them in state 'm', but that state has a
different meaning.
Theoretically, it's possible to forget modified dirstate
parents by doing:
dirstate.invalidate()
dirstate.setparents(p1, p2)
dirstate._map
The final access to _map should call _read(), which will
unconditionally overwrite dirstate._pl.
This doesn't actually happen right now because invalidate
accidentally ends up rebuilding dirstate._map.
Every time util.pathto is called, we have to pass the repo root and the
repo cwd.
dirstate.pathto is a simple convenience function that knows about the
root and the cwd arguments. It's still possible to pass the cwd as an
optimization.
localrepo.pathto is a convenience function that just calls
dirstate.pathto, just like localrepo.getcwd.
dirstate.pathto becomes a single point that converts most (all?) paths
from the internal representation to some OS-specific relative path for
display purposes.
The main problem was that dirstate.getcwd() returned just "",
which was interpreted as "we're at the repo root". It now returns
an absolute path.
The util.pathto function was also changed to deal with the "cwd is
an absolute path" case.
This removes a hack where we appended '/' to a dirname so that:
- it would not appear on the "dc" dict
- it would always be matched by the match function
This was a contorted way of checking if the directory was matched by
some hgignore pattern, and it would still fail with some uses of
--include/--exclude patterns.
Things would still work fine if we removed the check altogether and
just appended things to "work" directly, but then we would end up
walking ignored directories too, which could be quite a bit of work.
This allows further simplification of the match function returned by
util._matcher, and fixes walking the working directory with a
--include pattern that matches only the end of a name.
filterfiles was failing to find files for directory arguments if
another file existed that started with the directory name and
sorted earlier. For example, a manifest of ('foo.h', 'foo/foo')
would cause filterfiles('foo') to return nothing. This resolves
issue #294.
Reference: http://www.selenic.com/mercurial/bts/issue166
If the [ui] section of .hgrc contains keys like "ignore" or
"ignore.something", the values corresponding to these keys are
treated as per-user hgignore files. These hgignore files apply to all
repositories used by that user.
On 11/15/05, Robin Farine <robin.farine@terminus.org> wrote:
> # HG changeset patch
> # User Robin Farine <robin.farine@terminus.org>
> # Node ID ce0a3cc309a8d1e81278ec01a3c61fbb99c691f4
> # Parent feb77e0951e74d75c213e8471f107fdcc124c876
> remove walk warning about nonexistent files
>
> diff -r feb77e0951e7 -r ce0a3cc309a8 mercurial/dirstate.py
> --- a/mercurial/dirstate.py Tue Nov 15 08:42:45 2005 +0100
> +++ b/mercurial/dirstate.py Tue Nov 15 08:59:50 2005 +0100
> @@ -336,9 +336,6 @@
> try:
> st = os.lstat(f)
> except OSError, inst:
> - if ff not in dc: self.ui.warn('%s: %s\n' % (
> - util.pathto(self.getcwd(), ff),
> - inst.strerror))
> continue
> if stat.S_ISDIR(st.st_mode):
> cmp1 = (lambda x, y: cmp(x[1], y[1]))
this break some tests,
a better fix would be to check if ff can be a directory prefix from files in dc
you can try the attached patch.
Benoit
The ''seen'' dictionary stores paths in canonical form,
so the walkhelp must also provide paths in that form,
otherwise the changed files are listed twice.
- add a dirstate.lazyread function that read only if it wasn't read before and
update all callers
- use the atomic keyword from util.opener to atomically write the dirstate
mercurial/dirstate.py
if a file was of unsupported type, it was considered as 'seen' while
walking. this way it was possible to have file in the dirstate not
yielded by the walk function.