The `metadata` argument only allow to specify metadata for all new markers. We
extension the format of the `relations` argument to support optional metadata
argument.
The first user of this should be the evolve extension who want to store parent
information of pruned changeset in extra (until we make a second version of the
format)
The `obsstore` class have a `create` method that create new obsolescence marker
from node. There is another function in the same module `createmarkers`. This
other function is higher level and automatically missing meta data (ultimately
calling the first one)
We add a new comment in the docstring of `obsstore.create` highlighting that
people writing new code probably want to use the top level one.
The obsolescence marker exchange code was already extracted during a previous
cycle. We are moving the extracted functio in this module. This function will
read and write data in the `pulloperation` object and I prefer to have all core
function collaborating through this object in the same place.
This changeset is pure code movement only. Code change for direct consumption of
the `pulloperation` object will come later.
The obsolescence marker exchange code was already extracted during a previous
cycle. We are moving the extracted functio in this module. This function will
read and write data in the `pushoperation` object and I prefer to have all core
function collaborating through this object in the same place.
This changeset is pure code movement only. Code change for direct consumption of
the `pushoperation` object will come later.
Reminder: a changeset is said "bumped" if it tries to obsolete a immutable
changeset.
The previous algorithm for computing bumped changeset was:
1) Get all public changesets
2) Find all they successors
3) Search for stuff that are eligible for being "bumped"
(mutable and non obsolete)
The entry size of this algorithm is `O(len(public))` which is mostly the same as
`O(len(repo))`. Even this this approach mean fewer obsolescence marker are
traveled, this is not very scalable.
The new algorithm is:
1) For each potential bumped changesets (non obsolete mutable)
2) iterate over precursors
3) if a precursors is public. changeset is bumped
We travel more obsolescence marker, but the entry size is much smaller since
the amount of potential bumped should remains mostly stable with time `O(1)`.
On some confidential gigantic repo this move bumped computation from 15.19s to
0.46s (×33 speedup…). On "smaller" repo (mercurial, cubicweb's review) no
significant gain were seen. The additional traversal of obsolescence marker is
probably probably counter balance the advantage of it.
Other optimisation could be done in the future (eg: sharing precursors cache
for divergence detection)
Detection of bumped changeset should use `allprecursors(<mutable>)` instead or
`allsuccessors(<immutable>)` so we need the all precursors function to exists.
Before this patch, duplicated obsolescence markers could slip into an
obstore if the bookmark was unknown locally and duplicated in the
incoming obsolescence stream.
Existing duplicate markers will not be automatically removed but
they'll stop propagating. Having a few duplicated markers is harmless
and people have been warned evolution is <blink>experimental</blink>
anyway.
Having a dedicated function will allow us to experiment with other exchange
strategies in an extension. As we have no solid clues about how to do it right,
being able to experiment is vital.
Some transaction tricks are necessary for pull. But nothing too scary.
Having a dedicated function will allows us to experiment with other exchange
strategies in an extension. As we have no solid clues about how to do it right,
being able to experiment is vital.
I intended a more ambitious extraction of push logic, but we are far too
advanced in the release cycle for it.
Obsolescence creates a sparse DAG mostly composed of a lot of small independent
chain of markers. Date is the only simple and "reliable" way to sort them. The
existence of a date is now enforced at creation time as I'm more and more
convinced that date will have a key role in obsolescence markers exchange.
In their current state, revset calls can be very costly, as we test
predicates on the entire repository.
This change drops revset calls in favor of direct testing of the phase
of changesets.
Performance test on my Mercurial checkout
- 19857 total changesets,
- 1584 obsolete changesets,
- 13310 obsolescence markers.
Before:
! extinct
! wall 0.015124
After:
! extinct
! wall 0.009424
Performance test on a Mozilla central checkout:
- 117293 total changesets,
- 1 obsolete changeset,
- 1 obsolescence marker.
Before:
! extinct
! wall 0.032844
After:
! extinct
! wall 0.000066
In their current state, revset calls can be very costly, as we test
predicates on the entire repository.
This change drops a revset call in favor of direct testing of the
phase of changesets.
Performance test on my Mercurial checkout
- 19857 total changesets,
- 1584 obsolete changesets,
- 13310 obsolescence markers.
Before:
! suspended
! wall 0.014319
After:
! suspended
! wall 0.009559
Performance test on a Mozilla central checkout:
- 117293 total changesets,
- 1 obsolete changeset,
- 1 obsolescence marker.
Before:
! suspended
! wall 0.033373
After:
! suspended
! wall 0.000053
In their current state, revset calls can be very costly, as we test
predicates on the entire repository.
This change drops revset call in favor of direct testing of the
phase of changesets.
Performance test on my Mercurial checkout
- 19857 total changesets,
- 1584 obsolete changesets,
- 13310 obsolescence markers.
Before:
! unstable
! wall 0.017366
After this changes:
! unstable
! wall 0.008093
Performance test on a Mozilla central checkout:
- 117293 total changesets,
- 1 obsolete changeset,
- 1 obsolescence marker.
Before:
! unstable
! wall 0.045190
After:
! unstable
! wall 0.000032
In their current state, revset calls can be very costly as we test
predicates on the entire repository. As obsolete computation is used
by the "hidden" filter, it needs to be very fast.
This changet drops the revset call in favor of direct testing of
the phase of a changeset.
Performance test on my Mercurial checkout
- 19857 total changesets,
- 1584 obsolete changesets,
- 13310 obsolescence markers.
Before:
! obsolete
! wall 0.047041
After:
! obsolete
! wall 0.004590
Performance test on a Mozilla central checkout:
- 117293 total changesets,
- 1 obsolete changeset,
- 1 obsolescence marker.
Before:
! obsolete
! wall 0.001539
After:
! obsolete
! wall 0.000017
Recomputing the filtered revisions at every access to changelog is far too
expensive. This changeset introduce a cache for this information. This cache is
hold by the repository (unfiltered repository) and invalidated when necessary.
This cache is not a protected attribute (leading _) because some logic that
invalidate it is not held by the local repo itself.
Divergent changeset are final successors (non obsolete) of a changeset who
compete with another set of final successors for this same changeset.
For example if you have two obsolescence markers A -> B and A -> C, B and C are
both "divergent" because they compete to be the one true successors of A.
Public revision can't be divergent.
This function is used and tested in the next changeset.
Successors set are an important part of obsolescence. It is necessary to detect
and solve divergence situation. This changeset add a core function to compute
them, a debug command to audit them and solid test on the concept.
Check function docstring for details about the concept.
All obsolescence related sets need to be computed on the full unfiltered
version of the repository, in particular because several of them
(obsolete, extinct) are used to compute the hidden revisions.
On a filtered repo, revset predicates related to these sets will be
properly filtered because of revset's own pre-filtering.
The first obsolescence flag is introduced to allows for fixing the "bumped"
changeset situation.
bumpedfix == 1.
Creator of new changesets intended to fix "bumped" situation should not forget
to add this flag to the marker. Otherwise the newly created changeset will be
bumped too. See inlined documentation for details.
The old name was not very good for two reasons:
- caller does not care about "cache",
- set of revision returned may not be obsolete at all.
The new name was suggested by Kevin Bullock.
People were confused by the fact `obstore.precursors` contained marker allowing
to find "precursors" and vice-versa.
This changeset changes their meaning to:
- precursors[x] -> set(markers on precursors edges of x)
- successors[x] -> set(markers on successors edges of x)
Some documentation is added to clarify the situation.
Nullid as successors create multiple issues:
- Nullid revnum is -1, confusing algorithm that use revnum unless you add
special handling in all of them.
- Nullid confuses "divergent" changeset detection and resolution. As you can't
add any successors to Nullid without being in even more troubles
Fortunately, there is no good reason to use nullid as a successor. The only
sensible meaning of "succeed by nullid" is "dropped" and this meaning is already
covered by obsolescence marker with empty successors set.
However, letting some nullid successors to slip in may cause terrible damage in
such algorithm difficult to debug. So I prefer to perform and clear detection of
of such pathological changeset. We could be much smarter by cleaning up nullid
successors on the fly but it would be much for expensive. As core Mercurial does
not create any such changeset, I think it is fine to just abort when suspicious
situation is detected.
Earlier experimental version created such changesets, so there are some out
there. The evolve extension added the necessary logic to clean up its mess.
This function is designed to be used by all code that creates new
obsolete markers in the local repository.
It is not used by debugobsolete because debugobsolete allows the
use of an unknown hash as argument.
This changeset introduces caches on the `obsstore` that keeps track of sets of
revisions meaningful for obsolescence related logics. For now they are:
- obsolete: changesets used as precursors (and not public),
- extinct: obsolete changesets with osbolete descendants only,
- unstable: non obsolete changesets with obsolete ancestors.
The cache is accessed using the `getobscache(repo, '<set-name>')` function which
builds the cache on demand. The `clearobscaches(repo)` function takes care of
clearing the caches if any.
Caches are cleared when one of these events happens:
- a new marker is added,
- a new changeset is added,
- some changesets are made public,
- some public changesets are demoted to draft or secret.
Declaration of more sets is made easy because we will have to handle at least
two other "troubles" (latecomer and conflicting).
Caches are now used by revset and changectx. It is usually not much more
expensive to compute the whole set than to check the property of a few elements.
The performance boost is welcome in case we apply obsolescence logic on a lot of
revisions. This makes the feature usable!
Obsolete markers wide-usage and propagation should be avoided by default until
the obsolete feature is more mature.
This changeset introduce the `_enable` variable and prevent the creation of
obsolete marker if the feature is set to `False` (the default).
More limitation comes in followup changesets.
Obsolete markers are now exchanged in smaller pieces that fit in a http header.
This changes is done to avoid 400 bad request error when exchanging obsolete
puskey over http.
The last key pushed is always hold by the "dump0" key to ensure an easy place to
hook for people who need it.
The function is now able to write the version header as necessary. The function
now yield bytes to be written to a stream.
This should ease later use of this function for wireprotocol based exchanged.
Prepare the public use of the writemarker by wireprotocol function.
This function yield every nodes which succeed to a group of nodes.
The first user will be checkheads who need to know if we push successors for
remote extra heads.
Marker are now written as soon as possible but within a transaction. Using a
transaction ensure a proper behavior on error and rollback compatibility.
Flush logic are not necessary anymore and are dropped from lock release.
With this changeset, the obsstore is open, written and closed for every single
added marker. This is expected to be highly inefficient and batched write should
be implemented "quickly".
Another issue is that every flush of the file will invalidate the obsstore
filecache and trigger a full re instantiation of the repo.obsstore attribute
(including, reading and parsing entry). This is also expected to be highly
inefficient and proper filecache operation should be implemented "quickly" too.
A side benefit of the filecache issue is that repo.obsstore object is properly
invalidated on transaction abortion.
This is the second step toward incremental writing of marker inside a
transaction. The obsstore file is now handled append only.
Header writing have been extracted from _writemarkers.
Because the _writemarkers method have been dropped, the push code
directly reuse the serialised content of local repo `listkeys`. This
is not very pretty, but this part of the protocol still need major
improvement anyway.
This is the first step toward incremental writing of obsolete marker within a
transaction.
For this purpose, obsstore is now given its repo sopener. This make it able to
handles read and write to the obsstore file itself. Most IO logic is removed
from localrepo and handled by obsstore object directly.
An `obsolete` boolean property is added to changeset context. Function to get
obsolete marker object from a changeset context are added to the obsolete
module.