Without the change, the error looks like:
warning: Watchman unavailable: "watchman" executable not in PATH (%s),
while executing [Errno 2] No such file or directory
With the change, it now looks like:
warning: Watchman unavailable: "watchman" executable not in PATH
([Errno 2] No such file or directory)
Differential Revision: https://phab.mercurial-scm.org/D322
The initial attempt was to discard cache when appropriate, but it appears
to be error prone. We had to carefully inspect all places where audit() is
called e.g. without actually updating filesystem, before removing files and
directories, etc.
So, this patch disables the cache of audited paths by default, and enables
it only for the following cases:
- short-lived auditor objects
- repo.vfs, repo.svfs, and repo.cachevfs, which are managed directories
and considered sort of append-only (a file/directory would never be
replaced with a symlink)
There would be more cacheable vfs objects (e.g. mq.queue.opener), but I
decided not to inspect all of them in this patch. We can make them cached
later.
Benchmark result:
- using old clone of http://selenic.com/repo/linux-2.6/ (38319 files)
- on tmpfs
- run HGRCPATH=/dev/null hg up -q --time tip && hg up -q null
- try 4 times and take the last three results
original:
real 7.480 secs (user 1.140+22.760 sys 0.150+1.690)
real 8.010 secs (user 1.070+22.280 sys 0.170+2.120)
real 7.470 secs (user 1.120+22.390 sys 0.120+1.910)
clearcache (the other series):
real 7.680 secs (user 1.120+23.420 sys 0.140+1.970)
real 7.670 secs (user 1.110+23.620 sys 0.130+1.810)
real 7.740 secs (user 1.090+23.510 sys 0.160+1.940)
enable cache only for vfs and svfs (this series):
real 8.730 secs (user 1.500+25.190 sys 0.260+2.260)
real 8.750 secs (user 1.490+25.170 sys 0.250+2.340)
real 9.010 secs (user 1.680+25.340 sys 0.280+2.540)
remove cache function at all (for reference):
real 9.620 secs (user 1.440+27.120 sys 0.250+2.980)
real 9.420 secs (user 1.400+26.940 sys 0.320+3.130)
real 9.760 secs (user 1.530+27.270 sys 0.250+2.970)
Before this patch, reposetup() of fsmonitor executes setup procedures
for dirstate, even if it isn't yet instantiated at that time.
On the other hand, dirstate might be already instantiated before
reposetup() intentionally (prefilling by chg, for example, see
69de86112468 for detail). If so, just discarding already instantiated
one in reposetup() causes issue.
To resolve both issues above, this patch executes setup procedures,
only if dirstate is already instantiated.
BTW, this patch removes "del repo.unfiltered().__dict__['dirstate']",
because it is responsibility of the code path, which causes
instantiation of dirstate before reposetup(). After this patch, using
localrepo.isfilecached() should avoid creating the corresponded entry
in repo.unfiltered().__dict__.
Using repo.local() instead of util.safehasattr(repo, 'dirstate') also
avoids executing setup procedures for remote repository (including
statichttprepo).
This is reason why this patch also removes a part of subsequent
comment, and try/except for AttributeError at accessing to repo.wvfs.
Inspired by the dirstate fix in 39954a8760cd, this should fix any race
conditions with the fsmonitor state changing from underneath.
Since we now grab the wlock for any non-invalidate writes, the only situation
this appears to happen in is with a concurrent invalidation. Test that.
This means that the state will not be written if:
(1) either the wlock can't be obtained
(2) something else came along and changed the dirstate while we were in the
middle of a status run.
fsmonitor and debugignore currently access matcher fields that I would
consider implementation details, namely patternspat, includepat, and
excludepat. Let' instead implement __repr__() and have the few users
use that instead.
Marked (API) because the fields can now be None.
Everyone knows that it's supposed to be spelled with two asterisks.
It started failing in 7013de107975 (update: accept --merge to allow
merging across topo branches (issue5125), 2017-02-13) because until
then there was only one argument that was covered by the kwargs, so
*kwargs or **kwargs both worked (or at least that's what I think with
my limited understanding of Python).
The state-enter command may not have been successful; for example, the watchman
client session may have timed out if the user was busy/idle for a long period
during a merge conflict resolution earlier in processing a rebase for a stack
of diffs.
It's cleaner (from the perspective of the watchman logs) to avoid issuing the
state-leave command in these cases.
Test Plan:
ran
`hg rebase --tool :merge -r '(draft() & date(-14)) - master::' -d master`
and didn't observe any errors in the watchman logs or in the output from
`watchman -p -j <<<'["subscribe", "/data/users/wez/fbsource", "wez", {"expression": ["name", ".hg/updatestate"]}]'`
we see some weird things in the watchman logs where the mercurial
process is seemingly confused about which hg.update state it is publishing
through watchman.
On closer examination, we're seeing conflicting pids for the clients involved
and this implies a race.
To resolve this, we extend the wlock around the state-enter/state-leave
events that are emitted to watchman.
Test Plan:
Some manual testing:
In one window, run this, and then checkout a different rev:
```
$ watchman -p -j <<<'["subscribe", "/data/users/wez/fbsource", "wez", {"expression": ["name", ".hg/updatestate"]}]'
{
"version": "4.9.0",
"subscribe": "wez",
"clock": "c:1495034090:814028:1:312576"
}
{
"state-enter": "hg.update",
"version": "4.9.0",
"clock": "c:1495034090:814028:1:312596",
"unilateral": true,
"subscription": "wez",
"metadata": {
"status": "ok",
"distance": 125,
"rev": "a1275d79ffa6c58b53116c8ec401c275ca6c1e2a",
"partial": false
},
"root": "/data/users/wez/fbsource"
}
{
"root": "/data/users/wez/fbsource",
"metadata": {
"status": "ok",
"distance": 125,
"rev": "a1275d79ffa6c58b53116c8ec401c275ca6c1e2a",
"partial": false
},
"subscription": "wez",
"unilateral": true,
"version": "4.9.0",
"clock": "c:1495034090:814028:1:312627",
"state-leave": "hg.update"
}
```
Tailed the watchman log file and looked for invalid state assertion errors,
then ran my `rebase-all` script to update/rebase all of my heads.
Didn't trigger the error condition (but couldn't reliably trigger it previously
anyway), and the output captured above shows that the states are being emitted
correctly.
It seems like fsmonitor/__init__.py was based on a pretty old version
of dirstate.py. Let's copy over the changes from the following two
commits:
1161b515cc4d (match: add isexact() method to hide internals, 2014-10-29)
b1d8372ab1d0 (dirstate: avoid match.files() in walk(), 2015-05-19)
In the future, chg may prefill repo's dirstate filecache so it's valuable
and should be kept. Previously we drop both filecache and property cache for
dirstate during fsmonitor reposetup, this patch changes it to only drop
property cache but keep the filecache.
watchman's paths encoding can differ from filesystem encoding. For example,
on Windows, it's always utf-8.
Before this patch, on Windows, mismatch in path comparison between fsmonitor
state and osutil.statfiles would yield a clean status for added/modified files.
In addition to status reporting wrong results, this leads to files being
discarded from changesets while doing history editing operations such as rebase.
Benchmark:
There is a little overhead at module import:
python -m timeit "import hgext.fsmonitor"
Windows before patch: 1000000 loops, best of 3: 0.563 usec per loop
Windows after patch: 1000000 loops, best of 3: 0.583 usec per loop
Linx before patch: 1000000 loops, best of 3: 0.579 usec per loop
Linux after patch: 1000000 loops, best of 3: 0.588 usec per loop
10000 calls to _watchmantofsencoding:
python -m timeit -s "from hgext.fsmonitor import _watchmantofsencoding, _fixencoding" "fname = '/path/to/file'" "for i in range(10000):" " if _fixencoding: fname = _watchmantofsencoding(fname)"
Windows (_fixencoding is True): 100 loops, best of 3: 19.5 msec per loop
Linux (_fixencoding is False): 100 loops, best of 3: 3.08 msec per loop
Earlier this was left thinking that its part of pywatchman package.
This patch replaces variables os.sep, sys.platform and os.envrion with their
py3 compatible ones.
Update to upstream to version c77452. The refresh includes fixes to improve
windows compatibility.
There is a minor update to 'test-check-py3-compat.t' as c77452 no longer have
the py3 compatibility issues the previous version had.
# no-check-commit
pywatchman.CommandError formats its error message such that
'unable to resolve root' is not a prefix. This change fixes that by
instead just searching for it as a substring.
fsmonitor could write out bad state if interrupted part way through, and
would then crash when it tried to read it back in.
Make both sides of the operation more robust - reading state should fail
cleanly, and we can use atomictemp to write out cleanly as the file is
small. Between the two, we shouldn't crash with an IndexError any more.
I've caught multiple extensions in the wild lying about being
'internal', so it's time to move the goalposts on people. Goalpost
moving will continue until third party extensions stop trying to
defeat the system.
All versions of Python we support or hope to support make the hash
functions available in the same way under the same name, so we may as
well drop the util forwards.
Since (b) is banned, we should do the same for (a) for consistency.
a) from mercurial import hg
from mercurial.i18n import _
b) from . import hg
from .i18n import _
Visual C/C++ 9, which Python 2.7 is compatible with, doesn't have C99
support and thus doesn't contain a stdint.h file.
This changeset adds a custom version of stdint.h, created specifically
for Visual C, and uses it when building with that compiler.
Keeping the codebase in sync with upstream:
Watchman 4.4 introduced an advanced settling feature that allows publishing
tools to notify subscribing tools of the boundaries for important filesystem
operations.
https://facebook.github.io/watchman/docs/cmd/subscribe.html#advanced-settling
has more information about how this feature works.
This diff connects a signal that we're calling `hg.update` to the mercurial
update function so that mercurial can indirectly notify tools (such as IDEs or
build machinery) when it is changing the working copy. This will allow those
tools to pause their normal actions as the files are changing and defer them
until the end of the operation.
In addition to sending the enter/leave signals for the state, we are able to
publish useful metadata along the same channel. In this case we are passing
the following pieces of information:
1. destination revision hash
2. An estimate of the distance between the current state and the target state
3. A success indicator.
4. Whether it is a partial update
The distance is estimate may be useful to tools that wish to change their
strategy after the update has complete. For example, a large update may be
efficient to deal with by walking some internal state in the subscriber rather
than feeding every individual file notification through its normal (small)
delta mechanism.
We estimate the distance by comparing the repository revision number. In some
cases we cannot come up with a number so we report 0. This is ok; we're
offering this for informational purposes only and don't guarantee its accuracy.
The success indicator is only really meaningful when we generate the
state-leave notification; it indicates the overall success of the update.
Extension to plug into a Watchman daemon, speeding up hg status calls by
relying on OS events to tell us what files have changed.
Originally developed at https://bitbucket.org/facebook/hgwatchman
In preparation for the filesystem monitor extension, include the pywatchman
library. The fbmonitor extension relies on this library to communicate with
the Watchman service. The library is BSD licensed and is taken from
https://github.com/facebook/watchman/tree/master/python.
This package has not been updated to mercurial code standards.