We simplify the unstable computation code, skipping the expensive creation of
changectx object. We focus on efficient set operation and revnumber centric
functions.
In my mercurial development repository, this provides a 3x speedup to the
function:
before: 5.319 ms
after: 1.844 ms
repo details:
total changesets: 40886
obsolete changesets: 7756
mutable (not obsolete): 293
unstable: 30
We now display data about the "exclusive markers" in the test dedicated to
relevant and exclusive markers computation and usage. Each output have been
carefully validated
This set will be used to select the obsmarkers to be stripped alongside the
stripped changesets. See the function docstring for details.
More advanced testing is introduced in the next changesets to keep this one
simpler. That extra testing provides more example.
We raise a more precise subclass of Abort with details about the faulty
version. This will be used to detect this case and display some information
in debugbundle.
The markers pruning a node was not directly considered relevant for the pruned
node, only to its parents.
This went unnoticed during obsmarkers exchange because all
ancestors of the pruned node would be included in the computation.
This still affects obsmarkers exchange a bit since "inline" prune markers would
be ignored (see second test case). This went unnoticed, because in such case,
we always push another obsolescence markers for that node.
We add explicit tests covering this case.
(The set of relevant changeset is use in the obsmarkers discovery protocol used
in the evolve experimental extension, the impact will be handled on the
extension side).
Also use the default-date when creating obsmarkers. Currently they are created
with the current date and without any option to force their value.
To test the feature, we remove some of the many 'glob' used to match obsmarker
date in the tests.
Adding markers to the repository might affect the set of obsolete changesets. So we
most remove the "volatile" set who rely in that data. We add two missing
invalidations after merging markers. This was caught by code change in the evolve
extensions tests.
This issues highlight that the current way to do things is a bit fragile,
however we keep things simple for stable.
It seems better to introduce the experiment behind a flag for now as there are
multiple concerns around the feature:
* Storing operation increase the size of obsolescence markers significantly
(+10-20%).
* It performs poorly when exchanging markers (cannot combine command names,
command name might be unknown remotely, etc)
By recording what operation created the obsmarker, we can show very intuitive
messages to the user in various UIs. For instance, log output could have
messages like "Amended as XXX" to show why a commit is old and has an 'x' on it.
@ ac28e3 durham
/ First commit
|
| o d4afe7 durham
| | Second commit
| |
| x 8e9a5d (Amended as ac28e3) durham
|/ First commit
|
I'm going to replace hgimporter with a simpler import function, so we can
access to pure/cext modules by name:
# util.py
base85 = policy.importmod('base85') # select pure.base85 or cext.base85
# cffi/base85.py
from ..pure.base85 import * # may re-export pure.base85 functions
This means we'll have to use policy.importmod() function in place of the
standard import statement, but we wouldn't want to write it every place where
C extension modules are used. So this patch makes util host base85 functions.
This is part of a refactoring that moves some phase query optimization from
revset.py to phases.py. See previous patches for the motivation.
Now we have APIs in phasecache to get the non-public set efficiently, let's
use it directly instead of going through the "not public()" revset language
in "obsolete()" computation.
This patch was meaured using:
for i in 'public()' 'not public()' 'draft()' 'not draft()'; do
hg perfrevset "$i"; hg perfrevset "$i" --hidden;
done
and no noticeable (> 1%) performance difference was observed.
Since one of the original patches was accepted already and people on the
mailing list still have suggestions as to how this should be improved, I'm
implementing those suggestions in the following patches (this and the ones that
might follow).
Previously, if you ran obsolete.createmarkers with a bunch of markers that did
not have successors (like when you do a prune), it encountered a n^2 computation
behavior because the loop would read the changelog (to get ctx.parents()), then
add a marker, in a loop. Adding a marker invalidated the computehidden cache,
and reading the changelog recomputed it.
This resulted in pruning 150 commits taking 150+ seconds in a large repo.
The fix is to break the reading part of the loop to be separate from the writing
part.
This patch makes _computeobsoleteset much faster by looping
over the draft and secrets as opposed to looping over the
successors.
This works because "number of draft and secret" is typically
way smaller(<100) than the number of successor in the repo (~90k in
my checkout of core mercurial as of today). And also because
it is very fast to compute "not public()".
I timed the code with the following setup:
"""
from mercurial import hg, ui, obsolete
ui = ui.ui()
repo = hg.repository(ui, "~/hg")
l = repo.obsstore.successors # This caches the result
"""
With about 90k successors.
k=obsolete._computeobsoleteset(repo) before this patch:
10 loops, best of 3: 33.9 ms per loop
k=obsolete._computeobsoleteset(repo) after this patch:
10000 loops, best of 3: 83.3 µs per loop
We want to gracefully handle the read only case in some case (current target:
advisory obsmarkers parts in bundle2). So we expose the attribute in a clean
way.
The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.
For great justice.
This will allow us to use cached revisions without parsing obsstore.
Because _version isn't determined at __init__, the debugobsconvert command no
longer works correctly. I'll fix it by a separate patch for the evolve
extension.
The next patch will make _all variable propertycached to avoid costly parsing
of obsstore. This means we can't call _addmarkers() to initialize _all.
Because all cached markers depend on _all, it isn't necessary to update these
caches when _all is initially loaded.
We do not return the first successors we found, we returns the latest (non
obsolete (mostly)) one following the obsolete link transitively. We update the
documentation to make this clean.