We need to make sure that if X is in the filecache then it's also in the
filecache owner's __dict__, otherwise it will go out of sync:
repo.X # first access to X, records stat info in
# filecache and updates __dict__
repo._filecache.clear() # removes X from _filecache but it's still in __dict__
repo.invalidate() # iterates over _filecache and removes entries
# from __dict__, but X isn't in _filecache, so
# it's kept in __dict__
repo.X # X is fetched from __dict__, bypassing the filecache
We call invalidate to remove properties from __dict__ because they're
possibly outdated and we'd like to check for a new version. Next time
the property is accessed the filecache mechanism checks the current stat
info with the one recorded at the last time the property was read, if
they're different it recreates the property.
Previously we refreshed the stat info on all properties in the filecache
when the lock is released, including properties that are missing from
__dict__. This is a problem because:
l = repo.lock()
repo.P # stat info S for P is recorded in _filecache
<changes are made to repo.P indirectly, e.g. underlying file is replaced>
# P's new stat info = S'
l.release() # filecache refreshes, records S' as P's stat info
At this point our filecache contains P with stat info S', but P's
version is from S, which is outdated.
The above happens during _rollback and strip. Currently we're wiping the
filecache and forcing everything to reload from scratch which works but
isn't the right solution.
Creation of changectx objects is very slow, and they are not very
useful. We are going to drop them. The first step is to change the
function argument type.
This changeset installs a broad filter on most repos used for
serving. This removes the need to use the `visiblehead`/`visiblebranchmap`
functions, and ensures that changesets we should not serve are in
fact never served.
We do not use filtering on hgweb yet, as there is still a number
of issues to solve there.
The `filteredrevs` function already have a cache mechanism. And this cache in
invalidated at the same time than the current property cache. So we drop the
cache on the property.
The property itself is going to be dropped soon.
There is not good reason for this computation to be handle in a different way
from the other. We are moving the computation of hidden revs in the filter
function. In later changesets, code that access to `repo.hiddenrevs` will be
updated and the property dropped.
Disables this simple optimisation to allow coming more powerfull approach: cache
collaboration.
Our goal is to have branchcache collaborate. This means that unfiltered
branchcache will fallback to some filtered branchcache if invalid. We can't have
the filtered branchcache to use the unfiltered one. That would loop.
When commit is followed by strip (qrefresh), phasecache contains nodes that were
removed from the changelog. Since phasecache is filecached with .hg/store/phaseroots
which doesn't change as a result of stripping, we have to filter it manually.
If we don't write it immediately, the next time it is read from disk the nodes
will be filtered again. That's what happened before, but there's no reason not
to write it immediately.
The change in test-keyword.t is caused by the above.
The `_branchcache` attribute is turned into a dictionary. Key are filter name and
value is a `branchcache` object. Unfiltered version is cached as `None` filter.
The attribute is renamed to `_branchcaches` to avoid confusion with the previous
one. Both old and new contents are dictionary even if their contents are
different. I prefer possible extension code to crash right away instead of just
messing the wrong dictionary.
As all different caches work isolated to each other, this code keeps the
previous behavior of using the unfiltered cache we nothing is filtered. This
is a cheap way to have cache collaborate and nullify potential impact in the
default case.
At this moment, the cache is invalid, and will be thrown away.
Later the strip function will call the `localrepo.destroyed` method
that will update the branchmap cache.
The update function have all necessary data to keep the branchcache key
up to date with its value.
This saves assignment to the cache key that each caller of update had to do by
hand.
The strip case is a bit more complicated to handles from inside the function but
I do not expect any impact.
Value and key of branchcache would benefit from being hold by the same object.
Moreover some logic (update, write, validation) could be move on such object.
The creation of this object is the first step toward this move. The result will
clarify branchcache related code and hide most of the detail in the class
itself. This encapsulation will greatly helps implementation of branchcache for
filtered view of the repo.
The previous code was writing it to a non existent `branchcache` attribute. We
now write is to the proper `_branchcache` attribute and initialize the
`_branchcachetip` at the same time.
We keep writing it to disk, the previous code had this part right.
The `_updatebranchmap` method on repo does not need to be filtered as all
callers are already handling filtering themself.
The fact it is filtered may had even lead to buggy behaviors, but by chances the method
make very sparse use of the repo object.
The current _branchtags method is a remnant of an older, much larger function.
Its only remaining role is to help MQ to alter the version of branchcache we
store on disk. As MQ mutates the repository it ensures persistent cache does not
contain anything it is likely to alter.
This changeset makes explicit the stable vs volatile part of repository and
reduces the MQ specific code to the computation of this limit. The main
_branchtags code now handles this possible limit in all cases.
This will help to extract the branchmap logic from the repository.
The new code of _branchtags is a bit duplicated, but as I expect major
refactoring of this section I'm not keen to setup factorisation function here.
The `hiddenrevs` cache is volatile too (It use content from `obscache`). When
unsure it is invalidated when necessary. In a near future, the cache will
probably be moved to `revsfiltercache`
Recomputing the filtered revisions at every access to changelog is far too
expensive. This changeset introduce a cache for this information. This cache is
hold by the repository (unfiltered repository) and invalidated when necessary.
This cache is not a protected attribute (leading _) because some logic that
invalidate it is not held by the local repo itself.
We add a `filtered` method on repo. This method return an instance of `repoview`
that behaves exactly as the original repository but with a filtered changelog
attribute. Filters are identified by a "name". Planned filter are `unserved`,
`hidden` and `mutable`. Filtering the repository in place what out of question
as it wont not allows multiple thread to share the same repo. It would makes
control of the filtering scope harder too. See the `repoview` docstring for
details.
A mechanism to compute filtered revision is also installed. Some caches will be
installed in later commit.
For a repository with over 400,000 commits, rebasing one revision near tip,
this avoids two treks up the DAG, speeding the operation up by around 1.6
seconds.
With the current implementation, `changelog.nodemap` is not filtered. So some
filtered changeset in common are not filtered by `n in nodemap`. This leads to
crash lower in the stack when the bundle generation try to access those node on
a filtered changelog.
The check pattern only checked for whitespace between keyword and operator.
Now it also warns:
> x = f(),7
missing whitespace after ,
> x = f()+7
missing whitespace in expression
All filecache usage on repo is for logic that should be unfiltered. The
caches should be common to all filtered instances, and computation must
be done unfiltered. A dedicated storecache subclass is created for
this purpose.
Some of the localrepo property caches must be computed unfiltered and
stored globally. Some others must see the filtered version and store data
relative to the current filtering.
This changeset introduces two classes `unfilteredpropertycache`
and `filteredpropertycache` for this purpose. A new function
`hasunfilteredcache` is introduced for unambiguous checking for cached
values on unfiltered repos.
A few tweaks are made to the property cache class to allow overriding
the way the computed value is stored on the object.
Some logic relative to _tagcaches is cleaned up in the process.