Commit Graph

310 Commits

Author SHA1 Message Date
FUJIWARA Katsunori
387c38ed85 annotate: discard refcount of discarded annotation for memory efficiency
Before this patch, refcount (managed in "needed") of the annotation
result is kept as 1, even if corresponding annotation result is
discarded from "hist", because it isn't decreased and discarded.

In the history tree including merging revision, the most recent common
ancestor of merged revisions is scanned twice. Refcount of such
ancestor never becomes 0, because refcount is started from 1 at the
second scanning.

This prevents annotation results of merging revision in "hist" from
being discarded, and decreases memory efficiency.

This patch discards refcount of the annotation result, when the
corresponding annotation is discarded from "hist".
2013-04-18 19:50:04 +09:00
FUJIWARA Katsunori
3a71637352 annotate: increase refcount of each revisions correctly (issue3841)
Before this patch, refcount (managed in "needed") of parents of each
revisions in "visit" is increased, only when parent is not annotated
yet (examined by "p not in hist").

But this causes less refcount of the revision like "A" in the tree
below ("A" is assumed as the second parent of "C"):

    A --- B --- C
      \       /
       \-----/

Steps of annotation for "C" in this case are shown below:

  1. for "C"
    1.1 increase refcount of "B"
    1.2 increase refcount of "A" (=> 1)
    1.3 defer annotation for "C"

  2. for "A"
    2.1 annotate for "A" (=> put result into "hist[A]")
    2.2 clear "pcache[A]" ("pcache[A] = []")

  3. for "B"
    3.1 not increase refcount of "A", because "A not in hist" is False
    3.2 annotate for "B"
    3.3 decrease refcount of "A" (=> 0)
    3.4 delete "hist[A]", even though "A" is still needed by "C"
    3.5 clear "pcache[B]"

  4. for "C", again
    4.1 not increase refcount of "B", because "B not in hist" is False
    4.2 increase refcount of "A" (=> 1)
    4.3 defer annotation for "C"

  5. for "A", again
    5.1 annotate for "A" (=> put result into "hist[A]", again)
    5.2 clear "pcache[A]"

  6. for "C", once again
    6.1 not increase refcount of "B", because "B not in hist" is False
    6.2 not increase refcount of "A", because "A not in hist" is False
    6.3 annotate for "C"
    6.4 decrease refcount of "A", and delete "hist[A]"
    6.5 decrease refcount of "B", and delete "hist[B]"
    6.6 clear "pcache[C]"

At step (5.1), annotation for "A" mis-recognizes that all lines are
created at "A", because "pcache[A]" already cleared at step (2.2)
prevents from scanning ancestors of "A".

So, annotation for "C" or its descendants loses information about "A"
or its ancestors.

The root cause of this problem is that refcount of "A" is decreased at
step (3.3), even though it isn't increased at step (3.1).

To increase refcount correctly, this patch increases refcount of each
parents of each revisions:

  - regardless of "p not in hist" or not, and
  - only once for each revisions in "visit" (by "not pcached")

In fact, this problem should occur only on legacy repositories in
which a filelog includes the merging between the revision and its
ancestor (as the second parent), because:

  - tree is scanned in depth-first

    without such merging, revisions in "visit" refer different
    revisions as parent each other

  - recent Mercurial doesn't allow such merging

    changelog and manifest can include such merging someway, but
    filelogs can't, because "localrepository._filecommit()" converts
    such merging request to linear history.

This patch tests merging cases below: these cases are from filelog of
"mercurial/commands.py" in the repository of Mercurial itself.

  - both parents are same

        10 --- 11 --- 12
                  \_/

        filelogrev: changesetid:
          10          526aca6bcb38
          11          05098100ff44
          12          2d4f4cfa81d6

  - the second parent is also ancestor of the first one

        37 --- 38 --- 39 --- 40
                  \________/

        filelogrev: changesetid:
          37          033dc4170fe6
          38          5ff1a23ce38c
          39          661a47367859
          40          a2ba99fd026f
2013-03-29 22:57:16 +09:00
FUJIWARA Katsunori
8355d1b7f6 annotate: reuse already calculated annotation
Before this patch, annotation is re-calculated even if it is already
calculated. This may cause unexpected annotation, because already
cleared "pcache" ("pcache[f] = []") prevents from scanning ancestors.

This patch reuses already calculated annotation if it is available.

In fact, "reusable" situation should be seen only on legacy
repositories in which a filelog include the merging between the
revision and its ancestor, because:

  - tree is scanned in depth-first

    without such merging, annotation result should be released soon

  - recent Mercurial doesn't allow such merging

    changelog and manifest can include such merging someway, but
    filelogs can't, because "localrepository._filecommit()" converts
    such merging request to linear history.
2013-03-29 22:57:15 +09:00
Bryan O'Sullivan
4a3a46aff6 ancestor: a new algorithm that is faster for nodes near tip
Instead of walking all the way to the root of the DAG, we generate
a set of candidate GCA revs, then figure out which ones will win
the race to the root (usually without needing to traverse all the
way to the root).

In the common case of nodes that are close to each other in both
revision number and topology, this is usually a big win: it makes
"hg --time debugancestors" up to 9 times faster than the more general
ancestor function when measured on heads of the linux-2.6 hg repo.

Victory is not assured, however. The older function can still win
by a large margin if one node is much closer to the root than the
other, or by a much smaller amount if one is an ancestor of the
other.

For now, we've also got a small paranoid harness function that calls
both ancestor functions on every input and ensures that they give
equivalent answers.

Even without the checker function, the old ancestor function needs
to stay alive for the time being, as its generality is used by
context.filectx.merge.
2013-04-16 10:08:18 -07:00
Mads Kiilerich
a8db98ea05 spelling: fix typos and spelling errors 2013-04-15 01:37:23 +02:00
Bryan O'Sullivan
4a4a5dde94 scmutil: use new dirs class in dirstate and context
The multiset-of-directories code was open coded in each of these
modules; this change gets rid of the duplication.
2013-04-10 15:08:26 -07:00
Bryan O'Sullivan
82ca6ed101 merge with mpm 2013-04-02 08:58:42 -07:00
Takumi IINO
94c0d6fcb6 hgweb: show correct error message for i18n environment
If exception is error.LookupError and running in i18n environment,
below condition is always true.
Because msg is translated and dosen't contain 'manifest'.

    if util.safehasattr(err, 'name') and 'manifest' not in msg:

This patch creates a new exception class and uses it instead of
string match.
2013-02-15 18:07:14 +09:00
Pierre-Yves David
093fc83eab changectx: fix the handling of tip
We can not use `len(repo,changelog)`, it may be a filtered revision. We now use
`repo,changelog.tip()` to fetch this information.

The `tip` command is also fixed and tested

Thanks goes to Idan Kamara for the initial report.
2013-01-22 11:39:14 +01:00
David Schleimer
1dc36ff74c commit: factor out post-commit cleanup into workingctx
This pulls some of the logic for the cleanup that needs to happen
after a commit has been made otu of localrepo.commit and into
workingctx.  This is part of a larger refactoring effort that will
eventually allow us to perform some types of merges in-memory.
2013-02-08 05:36:08 -08:00
Mads Kiilerich
5787baee50 spelling: fix some minor issues found by spell checker 2013-02-10 18:24:29 +01:00
Pierre-Yves David
6c68029a60 clfilter: stronger detection of filtered changeset in changectx.__init__
We previously let some IndexError spill out of this function.

A new tests is added to check the command that spotted the error.
2013-01-16 05:21:11 +01:00
Kevin Bullock
93f9cb7f25 filtering: rename filters to their antonyms
Now that changelog filtering is in place, it's become evident that
naming the filters according to the set of revs _not_ included in the
filtered changelog is confusing. This is especially evident in the
collaborative branch cache scheme.

This changes the names of the filters to reflect the revs that _are_
included:

  hidden -> visible
  unserved -> served
  mutable -> immutable
  impactable -> base

repoview.filteredrevs is renamed to filterrevs, so that callers read a
bit more sensibly, e.g.:

  filterrevs('visible') # filter revs according to what's visible
2013-01-13 01:39:16 -06:00
Mads Kiilerich
2d6545f8b6 subrepos: process subrepos in sorted order
Add sorted() in places found by testing with PYTHONHASHSEED=random and code
inspection.

An alternative to sprinkling sorted() all over would be to change substate to a
custom dict with sorted iterators...
2012-12-12 02:38:14 +01:00
Pierre-Yves David
b7230b8249 context: retrieve hidden from filteredrevs
This prepare the dropping of the repo.hiddenrevs property
2013-01-03 18:51:16 +01:00
Pierre-Yves David
704a17970c clfilter: fallback to unfiltered version when linkrev point to filtered history
On `filectx`, linkrev may point to any revision in the repository. When the
repository is filtered this may lead to `filectx` trying to build `changectx`
for filtered revision. In such case we fallback to creating `changectx` on the
unfiltered version of the reposition. This fallback should not be an issue
because `changectx` from `filectx` are not used in complex operation that
care about filtering. It is complicated to work around the issue in a
clearer way as code raising such `filectx` rarely have access to the
repository directly.

Linkrevs create a lot of issue with filtering. It is stored in revlog entry at
creation time and never changed. Nothing prevent the changeset revision pointed
to become filtered. Several bogus behavior emerge from such situation. Those
bugs are complex to solve and not part of the current effort to install
filtering. This changeset is simple hack that prevent plain crash in favor on
minor misbehavior without visible effect.

This "hack" is longly documented in to code itself to help people that would
look at it in the future.
2012-12-29 00:40:18 +01:00
Pierre-Yves David
0571efafd8 obsolete: introduce a troubles method on context
A troubled changeset may be affected by multiple trouble at the same time. This
new method returns a list of all troubles affecting a changes.
2012-12-17 15:17:54 +01:00
Pierre-Yves David
c1a745d834 obsolete: introduce a troubled method on context
Allows to quickly check if a changeset is affected by any troubles.
(troubles are: unstable, bumped and divergent)
2012-12-17 15:06:15 +01:00
Pierre-Yves David
985f5be6c5 clfilter: ensure context raise RepoLookupError when the revision is filtered
Currently the code path of `changectx(filteredrepo, rev)` call
`filteredrepo.changelog.node(rev)`. When `rev` is filtered this raise an
unhandled `IndexError`. This case now raise a `RepoLookupError` as other
error case do.
2012-12-17 18:09:41 +01:00
Pierre-Yves David
38afff41e0 obsolete: add a divergent method on context
The same we have `unstable` and `bumped`. Convenient method to access troubles
information in general may land later.

This get actual use and testing in the next changesets.
2012-12-12 03:20:49 +01:00
David Schleimer
3dbabdb2fc merge: support calculating merge actions against non-working contexts
This is not currently used.  It is instead a pre-requisite to
performing non-conflicting grafts in memory, which a subsequent patch
will do.
2012-12-04 12:54:18 -08:00
Pierre-Yves David
c2a31b5437 clfilter: prevent unwanted warning about filtered parents as unknown
During changectx __init__ the dirstate's parents MAY be checked. If
the repo is filtered, this check will complain "working directory has
unknown parents" even if the parents are perfectly known.

This may happen when the repo is used for serving and the dirstate has
parents that are secret, as those secret changesets will be filtered.
2012-10-08 17:15:08 +02:00
Pierre-Yves David
b3f5aa66c1 context: add a bumped method to changectx
Same as `unstable()`, returns true if the changeset is bumped.
2012-10-19 00:43:44 +02:00
Pierre-Yves David
bc08c6dbf1 obsolete: rename getobscache into getrevs
The old name was not very good for two reasons:
- caller does not care about "cache",
- set of revision returned may not be obsolete at all.

The new name was suggested by Kevin Bullock.
2012-10-19 00:28:13 +02:00
Sean Farley
26d22253ab phases: add a phase and phasestr method to file context 2012-10-16 17:09:50 -05:00
FUJIWARA Katsunori
f8ea001372 context: add "descendant()" to changectx for efficient descendant examination
This patch adds "descendant()", which uses "revlog.descendant()" for
descendant examination, to changectx.

This implementation is more efficient than "new in old.descendants()"
expression, because:

  - "changectx.descendants()" creates temporary "changectx" objects,
    but "revlog.descendant()" doesn't

    "revlog.descendant()" checks only revision numbers of descendants.

  - "revlog.descendant()" stops scanning, when scanning of all
    revisions less than one of examination target is finished

    this can avoid useless scanning in "not descendant" case.
2012-09-18 21:39:12 +09:00
Pierre-Yves David
785d90eba0 obsolete: introduce caches for all meaningful sets
This changeset introduces caches on the `obsstore` that keeps track of sets of
revisions meaningful for obsolescence related logics. For now they are:

- obsolete: changesets used as precursors (and not public),
- extinct:  obsolete changesets with osbolete descendants only,
- unstable: non obsolete changesets with obsolete ancestors.

The cache is accessed using the `getobscache(repo, '<set-name>')` function which
builds the cache on demand. The `clearobscaches(repo)` function takes care of
clearing the caches if any.

Caches are cleared when one of these events happens:

- a new marker is added,
- a new changeset is added,
- some changesets are made public,
- some public changesets are demoted to draft or secret.

Declaration of more sets is made easy because we will have to handle at least
two other "troubles" (latecomer and conflicting).

Caches are now used by revset and changectx. It is usually not much more
expensive to compute the whole set than to check the property of a few elements.
The performance boost is welcome in case we apply obsolescence logic on a lot of
revisions. This makes the feature usable!
2012-08-28 20:52:04 +02:00
Mads Kiilerich
e973af65d0 improve some comments and docstrings, fixing issues found when spell checking 2012-08-21 02:41:20 +02:00
Mads Kiilerich
2372d51b68 fix wording and not-completely-trivial spelling errors and bad docstrings 2012-08-15 22:39:18 +02:00
Mads Kiilerich
2f4504e446 fix trivial spelling errors 2012-08-15 22:38:42 +02:00
Patrick Mezard
5d1bd7fadf context: simplify workingctx._parents 2012-08-02 17:48:58 +02:00
Pierre-Yves David
edc2b520d9 hidden: move hiddenrevs set on the repository
This set is always accessed through the repo for now. Having this set
carried by the changelog make it complicated to:

- initialize it, computing hidden set may involve revset call
- lazy compute it, (1) only the changelog can detect someone access it,
                   (2) only the repo have enought knowledge to compute it.

In later version I expect he changelog to apply filtering itself and the set to
be carried by changelog again.
2012-07-16 17:44:46 +02:00
Pierre-Yves David
9e13d2931c obsolete: compute extinct changesets
`extinct` changesets are obsolete changesets with obsolete descendants only. They
are of no interest anymore and can be:

- exclude from exchange
- hidden to the user in most situation
- safely garbage collected

This changeset just allows mercurial to detect them.

The implementation is a bit naive, as for unstable changesets. We better use a
simple revset query and a cache, but simple version comes first.
2012-07-06 19:34:09 +02:00
Pierre-Yves David
2444c95546 obsolete: compute unstable changeset
An unstable changeset is a changeset *not* obsolete but with some obsolete
ancestors.

The current logic to decide if a changeset is unstable is naive and very
inefficient. A better solution is to compute the set of unstable changeset with
a simple revset and to cache the result. But this require cache invalidation
logic. Simpler version goes first.
2012-07-06 00:18:09 +02:00
Pierre-Yves David
5793a07f30 obsolete: fix context.obsolete() method
- obsstore attribut name changed.
- public changeset can't be obsolete
2012-07-04 17:26:51 +02:00
Pierre-Yves.David@ens-lyon.org
3c02b8eab1 obsolete: function and method to access some obsolete data
An `obsolete` boolean property is added to changeset context. Function to get
obsolete marker object from a changeset context are added to the obsolete
module.
2012-06-06 01:56:58 +02:00
Matt Mackall
cbbdbdd866 copies: re-include root directory in directory rename detection (issue3511) 2012-06-27 13:41:04 -05:00
Bryan O'Sullivan
141bd09daa revlog: descendants(*revs) becomes descendants(revs) (API)
Once again making the API more rational, as with ancestors.
2012-06-01 12:45:16 -07:00
Bryan O'Sullivan
6ba97b40c1 revlog: ancestors(*revs) becomes ancestors(revs) (API)
Accepting a variable number of arguments as the old API did is
deeply ugly, particularly as it means the API can't be extended
with new arguments.  Partly as a result, we have at least three
different implementations of the same ancestors algorithm (!?).

Most callers were forced to call ancestors(*somelist), adding to
both inefficiency and ugliness.
2012-06-01 12:37:18 -07:00
Matt Mackall
d53510c6a2 merge with stable 2012-05-21 16:35:27 -05:00
Matt Mackall
ca006af287 context: grudging accept longs in constructor 2012-05-21 16:32:50 -05:00
Brodie Rao
ab32f1721d context: add changectx.closesbranch() method
This removes the duplicated code for inspecting the 'close' extra field in
a changeset.
2012-05-13 14:04:06 +02:00
Brodie Rao
d36ae7f264 localrepo: add branchtip() method for faster single-branch lookups
For the PyPy repo with 744 branches and 843 branch heads, this brings
hg log -r default over NFS from:

   CallCount    Recursive    Total(ms)   Inline(ms) module:lineno(function)
        3249            0      1.3222      1.3222   <open>
        3244            0      0.6211      0.6211   <method 'close' of 'file' objects>
        3243            0      0.0800      0.0800   <method 'read' of 'file' objects>
        3241            0      0.0660      0.0660   <method 'seek' of 'file' objects>
        3905            0      0.0476      0.0476   <zlib.decompress>
        3281            0      2.6756      0.0472   mercurial.changelog:182(read)
       +3281            0      2.5256      0.0453   +mercurial.revlog:881(revision)
       +3276            0      0.0389      0.0196   +mercurial.changelog:28(decodeextra)
       +6562            0      0.0123      0.0123   +<method 'split' of 'str' objects>
       +6562            0      0.0408      0.0073   +mercurial.encoding:61(tolocal)
       +3281            0      0.0054      0.0054   +<method 'index' of 'str' objects>
        3241            0      2.2464      0.0456   mercurial.revlog:818(_loadchunk)
       +3241            0      0.6205      0.6205   +<method 'close' of 'file' objects>
       +3241            0      0.0765      0.0765   +<method 'read' of 'file' objects>
       +3241            0      0.0660      0.0660   +<method 'seek' of 'file' objects>
       +3241            0      1.4209      0.0135   +mercurial.store:374(__call__)
       +3241            0      0.0122      0.0107   +mercurial.revlog:810(_addchunk)
        3281            0      2.5256      0.0453   mercurial.revlog:881(revision)
       +3280            0      0.0175      0.0175   +mercurial.revlog:305(rev)
       +3281            0      2.2819      0.0119   +mercurial.revlog:847(_chunkraw)
       +3281            0      0.0603      0.0083   +mercurial.revlog:945(_checkhash)
       +3281            0      0.0051      0.0051   +mercurial.revlog:349(flags)
       +3281            0      0.0040      0.0040   +<mercurial.mpatch.patches>
       13682            0      0.0479      0.0248   <method 'decode' of 'str' objects>
       +7418            0      0.0228      0.0076   +encodings.utf_8:15(decode)
          +1            0      0.0003      0.0000   +encodings:71(search_function)
        3248            0      1.3995      0.0246   mercurial.scmutil:218(__call__)
       +3248            0      1.3222      1.3222   +<open>
       +3248            0      0.0235      0.0184   +os.path:80(split)
       +3248            0      0.0084      0.0068   +mercurial.scmutil:92(__call__)
Time: real 2.750 secs (user 0.680+0.000 sys 0.360+0.000)

down to:

   CallCount    Recursive    Total(ms)   Inline(ms) module:lineno(function)
          55           31      0.0197      0.0163   <__import__>
          +1            0      0.0006      0.0002   +mercurial.context:8(<module>)
          +1            0      0.0042      0.0001   +mercurial.revlog:12(<module>)
          +1            0      0.0002      0.0001   +mercurial.match:8(<module>)
          +1            0      0.0003      0.0001   +mercurial.dirstate:7(<module>)
          +1            0      0.0057      0.0001   +mercurial.changelog:8(<module>)
           1            0      0.0117      0.0032   mercurial.localrepo:525(_readbranchcache)
        +844            0      0.0015      0.0015   +<binascii.unhexlify>
        +845            0      0.0010      0.0010   +<method 'split' of 'str' objects>
        +843            0      0.0045      0.0009   +mercurial.encoding:61(tolocal)
        +843            0      0.0004      0.0004   +<method 'setdefault' of 'dict' objects>
          +1            0      0.0003      0.0003   +<method 'close' of 'file' objects>
           3            0      0.0029      0.0029   <method 'read' of 'file' objects>
           9            0      0.0018      0.0018   <open>
         990            0      0.0017      0.0017   <binascii.unhexlify>
          53            0      0.0016      0.0016   mercurial.demandimport:43(__init__)
         862            0      0.0015      0.0015   <_codecs.utf_8_decode>
         862            0      0.0037      0.0014   <method 'decode' of 'str' objects>
        +862            0      0.0023      0.0008   +encodings.utf_8:15(decode)
         981            0      0.0011      0.0011   <method 'split' of 'str' objects>
         861            0      0.0046      0.0009   mercurial.encoding:61(tolocal)
        +861            0      0.0037      0.0014   +<method 'decode' of 'str' objects>
         862            0      0.0023      0.0008   encodings.utf_8:15(decode)
        +862            0      0.0015      0.0015   +<_codecs.utf_8_decode>
           4            0      0.0008      0.0008   <method 'close' of 'file' objects>
         179          154      0.0202      0.0004   mercurial.demandimport:83(__getattribute__)
         +36           11      0.0199      0.0003   +mercurial.demandimport:55(_load)
         +72            0      0.0001      0.0001   +mercurial.demandimport:83(__getattribute__)
         +36            0      0.0000      0.0000   +<getattr>
           1            0      0.0015      0.0004   mercurial.tags:148(_readtagcache)
Time: real 0.060 secs (user 0.030+0.000 sys 0.010+0.000)
2012-05-13 14:04:04 +02:00
Brodie Rao
d6a6abf2b0 cleanup: eradicate long lines 2012-05-12 15:54:54 +02:00
Patrick Mezard
641ee7d3ba phases: introduce phasecache
The original motivation was changectx.phase() had special logic to
correctly lookup in repo._phaserev, including invalidating it when
necessary. And at other places, repo._phaserev was accessed directly.

This led to the discovery that phases state including _phaseroots,
_phaserev and _dirtyphase was manipulated in localrepository.py,
phases.py, repair.py, etc. phasecache helps encapsulating that.

This patch replaces all phase state in localrepo with phasecache and
adjust related code except for advance/retractboundary() in phases.
These still access to phasecache internals directly. This will be
addressed in a followup.
2012-05-12 00:24:07 +02:00
Idan Kamara
6360e96576 context: fix call to util.safehasattr 2012-05-09 02:46:58 +03:00
Matt Mackall
4d062ed81e context: add copies method with caching 2012-05-06 14:37:51 -05:00
Matt Mackall
e38c9f282e filectx: handle some other simple cases for finding merge ancestor 2012-05-06 14:20:53 -05:00
Matt Mackall
da4217097f filectx: make ancestor require actx
When grafting or rebasing, we need to know the target ancestor.
2012-05-04 17:27:14 -05:00
Patrick Mezard
036643538d update: make --check abort with dirty subrepos
Aka "we could use dirty() but... yeah let's use it"
2012-04-23 12:12:04 +02:00