Incidentally, this avoids the changelog cache being invalidated each time
it's accessed on a repoview.
On a filtering experiment on a repository the size of mozilla-central,
this makes a significant difference:
Before, running hg log -l 10 --time with about 8k changesets filtered out:
time: real 1.490 secs (user 1.450+0.000 sys 0.040+0.000)
After:
time: real 0.540 secs (user 0.530+0.000 sys 0.010+0.000)
Use the introduced caching infrastructure to cache hidden
changesets. We crosscheck if the content of the cache unless
experimental.verifyhiddencache is set to False. This will be removed
in the future. Without crosschecking the caches speed ups hg status and
other commands:
without caching:
$ time hg status
hg status 0.72s user 0.20s system 100% cpu 0.917 total
with caching
$ time hg status
hg status 0.49s user 0.15s system 100% cpu 0.645 total
Add a caching infrastructure to cache hidden changesets. The cache tries to read
the cache lazily and falls back to recomputing if no wlock can be obtain.
To validate the cache we store a sha of the obstore content and repo heads in
the beginning of the cache which we check every request.
Split up _gethiddenblockers into two categories: (1) "static' blockers
that solely rely on the contents of obstore and are visible children of
hidden changsets. (2) "dynamic" blockers, appearing by having wd parents,
bookmarks or tags pointing to hidden changesets.
We assume that (1) doesn't change often and can be easily cached with a good
invalidation strategy. (2) change often, but barely produce blockers, so we
can recompute them if necessary.
No functionality has changed, since we've only extracted the code into its own
function. Now extensions can wrap _gethiddenblockers to provide their own
blocker without polluting bookmarks or local tags.
For repos with a large number of heads (including hidden heads), a stale tag
cache would cause computehidden to be drastically slower because of a the call
to repo.tags() (which would build the tag cache).
We actually don't need the tag cache for computehidden because we filter out
global tags. This patch replaces the call to repo.tags with readlocaltags so
as to avoid the tag cache.
Previously, only bookmarks would be considered for blocking a changeset from
being hidden. Now, we also consider non-global tags. This is helpful if we have
local tags that might be hard to find once they are hidden, or tag that are
added by extensions (e.g. hggit or remotebranches).
Changeset aad678a92970 moved `subsettable` from `mercurial/repoview.py` to
`mercurial/branchmap.py`. This mean that `filtertable` and `subsettable` are no
longer next to each other. So we add a comment to remind people to update both.
This is a step towards breaking an import cycle between revset and
repoview. Import cycles happened to work in Python 2 with implicit
relative imports, but breaks on Python 3 when we start using explicit
relative imports via 2to3 rewrite rules.
Creating a new changelog object for each access is costly and prevents efficient
caching changelog side. This introduced a x5 performance regression in log
because chunk read from disk were never reused. We were jumping from about 100
disk read to about 20 000.
This changeset introduce a simple cache mechanism that help the last changelog
object created by a repoview. The changelog is reused until the changelog or the
filtering changes.
The cache invalidation is much more complicated than it should be. But MQ test
show a strange cache desync. I was unable to track down the source of this
desync in descent time so I'm not sure if the issue is in MQ or core. However
given the proximity to the 2.5 freeze, I'm choosing the inelegant but safe route
that makes the cache mechanism safer.
If for some reason the phase roots contains nullid, the set of filtered revs
will contains -1. That confuse Mercurial a lot. In particular this corrupt the
branchcache.
Standard code path does not result in nullid phase root. It can only result from
altered `.hg/store/phaseroots` or buggy extension. However better safe than
sorry.
Now that changelog filtering is in place, it's become evident that
naming the filters according to the set of revs _not_ included in the
filtered changelog is confusing. This is especially evident in the
collaborative branch cache scheme.
This changes the names of the filters to reflect the revs that _are_
included:
hidden -> visible
unserved -> served
mutable -> immutable
impactable -> base
repoview.filteredrevs is renamed to filterrevs, so that callers read a
bit more sensibly, e.g.:
filterrevs('visible') # filter revs according to what's visible
In their current state, revset calls can be very costly, as we test
predicates on the entire repository. The "mutable" filter is used
during branch cache loading operation. We need to make it fast.
This change drops revset calls in favor of direct testing of the
phase of a changeset.
Performance test on my Mercurial checkout
- 19857 total changesets,
- 1646 mutable revision
Before:
! mutable
! wall 0.032405
After:
! mutable
! wall 0.001469
Performance test on a Mozilla central checkout:
- 117293 total changesets,
- 1 mutable changeset,
Before:
! mutable
! wall 0.188636
After:
! mutable
! wall 0.000022
In their current state, revset calls can be very costly, as we test
predicates on the entire repository. The "unserved" filter is used
in multiple applications, and in particular in some branch cache
loading operations. We need to make it fast.
This change drops revset calls in favor of direct testing of the
phase of a changeset.
Performance test on my Mercurial checkout
- 19857 total changesets,
- 1584 obsolete changesets,
- 13310 obsolescence markers.
Before:
! unserved
! wall 0.030477
After:
! unserved
! wall 0.011844
Performance test on a Mozilla central checkout:
- 117293 total changesets,
- 1 obsolete changeset,
- 1 obsolescence marker.
Before:
! unserved
! wall 0.111259
After:
! unserved
! wall 0.000084
In their current state, revset calls can be very costlys, as we
test predicates on the entire repository. The hidden filter is very
widely used, and needs to be very fast.
This change drops revset calls in favor of direct revision manipulation.
Performance test on my Mercurial checkout
- 19857 total changesets,
- 1584 obsolete changesets,
- 13310 obsolescence markers.
Before:
! hidden
! wall 0.077553
After this changes:
! hidden
! wall 0.011230
Performance test on a Mozilla central checkout:
- 117293 total changesets,
- 1 obsolete changeset,
- 1 obsolescence marker.
Before:
! hidden
! wall 0.389472
After:
! hidden
! wall 0.000079
There is not good reason for this computation to be handle in a different way
from the other. We are moving the computation of hidden revs in the filter
function. In later changesets, code that access to `repo.hiddenrevs` will be
updated and the property dropped.
The `mutable` filter still have some chance to get invalidated. This will happen
when:
- you garbage collect hidden changeset,
- public phase is moved backward,
- something is changed in the filtering (this could be fixed)
So we introduce an even more stable filtering set: everything with a revision
number egal or higher than the first mutable changeset is filtered.
The only official use of this filter is for branchcache.
It filters all mutable changesets, leaving only public changeset unfiltered.
This filtering set is expected to be much more stable that the previous one as
public changeset are unlikely to disapear.
The only official use of this filter is for branchcache.
This filter exclude all hidden revision. We plan to use this filter to hide
revision instead of manually checking contents of the hidden revisions set.
Recomputing the filtered revisions at every access to changelog is far too
expensive. This changeset introduce a cache for this information. This cache is
hold by the repository (unfiltered repository) and invalidated when necessary.
This cache is not a protected attribute (leading _) because some logic that
invalidate it is not held by the local repo itself.
We add a `filtered` method on repo. This method return an instance of `repoview`
that behaves exactly as the original repository but with a filtered changelog
attribute. Filters are identified by a "name". Planned filter are `unserved`,
`hidden` and `mutable`. Filtering the repository in place what out of question
as it wont not allows multiple thread to share the same repo. It would makes
control of the filtering scope harder too. See the `repoview` docstring for
details.
A mechanism to compute filtered revision is also installed. Some caches will be
installed in later commit.