This makes a big difference to performance.
In a clean working directory containing 170,000 files, performance of
"hg --time diff" improves from 2.38 seconds to 1.69.
In a clean working directory containing 170,000 tracked files, this
improves performance of "hg --time diff" from 1.69 seconds to 1.43.
This idea is due to Siddharth Agarwal.
Files in a subrepo were overwritten on update, but this should only happen
on a clean update (for example, when -C is specified).
Use the overwrite parameter introduced for svn subrepos in e3640daa4703 to
decide whether to merge changes (as update) or remove them (as clean).
The new function hg.updaterepo is introduced to keep all update calls in hg.
test-subrepo.t is extended to test if an untracked file is overwritten
(issue3276). (Update -C is already tested in many places.)
The first two chunks are debugging output which has changed. (Because overwrite
is not always true anymore for subrepos)
All other tests still pass without any change.
Before this patch, case-folding collisions were checked simply by
comparing the manifests of the merged revisions.
So, files could be considered to collide with each other even though
one of them was already deleted on one of the merged branches: in such
a case, the merge deletes it, so no case-folding collision occurs.
This patch checks whether both colliding files still remain after the
merge, and ignores the collision if at least one of them is deleted by
the merge.
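
As a rough illustration of the check described above, the following
Python sketch looks for case-folding collisions only among files that
remain after the merge. The function name, the plain sets standing in
for manifests, and the `deleted` argument are assumptions made for this
example; this is not Mercurial's actual code.

```python
def remaining_collisions(local_files, remote_files, deleted):
    """Return pairs of names that differ only by case and both survive.

    All three arguments are plain sets of file names (hypothetical
    stand-ins for the merged manifests and the files the merge deletes).
    """
    remaining = (set(local_files) | set(remote_files)) - set(deleted)
    seen = {}        # case-folded name -> first name seen with that folding
    collisions = []
    for name in sorted(remaining):
        folded = name.lower()
        if folded in seen:
            collisions.append((seen[folded], name))
        else:
            seen[folded] = name
    return collisions
```

With `README` on one side and `readme` on the other, the sketch reports
a collision unless one of the two names is in `deleted`.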
If one of the colliding files is deleted on one of the merged branches
and changed on the other, the file is considered to still remain after
the merge, even though it may be deleted by the merge if "deleting" it
is chosen in "manifestmerge()".
This avoids a merge failing on a case-folding collision after the
choice between "changing" and "deleting" the file has been made.
This patch adds only tests for "removed remotely" code paths in
"_remains()", because other ones are tested by existing tests in
"test-casecollision-merge.t".
All filecache usage on repo is for logic that should be unfiltered. The
caches should be common to all filtered instances, and computation must
be done unfiltered. A dedicated storecache subclass is created for
this purpose.
Some of the localrepo property caches must be computed unfiltered and
stored globally. Some others must see the filtered version and store data
relative to the current filtering.
This changeset introduces two classes `unfilteredpropertycache`
and `filteredpropertycache` for this purpose. A new function
`hasunfilteredcache` is introduced for unambiguous checking for cached
values on unfiltered repos.
A few tweaks are made to the property cache class to allow overriding
the way the computed value is stored on the object.
Some logic relative to _tagcaches is cleaned up in the process.
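
A minimal sketch of how such a property cache could work is shown
below. It covers only the unfiltered variant and the
`hasunfilteredcache` check. The `cachevalue` hook name and the
`ToyRepo` and `ToyView` classes are assumptions made for this
illustration, not the real localrepo code.

```python
class propertycache:
    """Compute an attribute once, then serve it from the instance dict."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.func(obj)
        self.cachevalue(obj, value)
        return value

    def cachevalue(self, obj, value):
        # Overridable hook: decides where the computed value is stored.
        obj.__dict__[self.name] = value


class unfilteredpropertycache(propertycache):
    """Always compute and store the value on the unfiltered repo."""

    def __get__(self, repo, objtype=None):
        if repo is None:
            return self
        unfi = repo.unfiltered()
        if unfi is repo:
            return super().__get__(unfi, objtype)
        # Filtered view: delegate, so the cache lives on the unfiltered repo.
        return getattr(unfi, self.name)


def hasunfilteredcache(repo, name):
    """Report whether `name` is already cached on the unfiltered repo."""
    return name in vars(repo.unfiltered())


class ToyRepo:
    """Hypothetical stand-in for an unfiltered repository."""

    def __init__(self):
        self.computed = 0

    def unfiltered(self):
        return self

    @unfilteredpropertycache
    def tags(self):
        self.computed += 1
        return {'tip': 3}


class ToyView(ToyRepo):
    """Hypothetical filtered view that shares the caches of its repo."""

    def __init__(self, repo):
        self._unfi = repo

    def unfiltered(self):
        return self._unfi
```

Reading `tags` through a `ToyView` computes the value once on the
underlying `ToyRepo`, and every view then sees the same cached object.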
The logic recently added to `bookmark.validdest` uses data about obsolete
changesets to see if a bookmark destination is valid. Obsolete changesets
are likely to be filtered, so we need to work on an unfiltered repository.
Computation of common changesets during push needs to be done on the
widest set possible. An unfiltered version of the repo is kept for
discovery and various revset calls. The discovery code itself enforces
the filtering of unserved outgoing changesets.
During changectx __init__ the dirstate's parents MAY be checked. If
the repo is filtered, this check will complain "working directory has
unknown parents" even if the parents are perfectly known.
This may happen when the repo is used for serving and the dirstate has
parents that are secret, as those secret changesets will be filtered.
Strip is a "write" operation that needs to be aware of the whole repo's
content before destroying changesets.
Only the low level function is altered. The top level command will still
process its argument filtered (if any filtering is in place).
All obsolescence related sets need to be computed on the full unfiltered
version of the repository, in particular because several of them
(obsolete, extinct) are used to compute the hidden revisions.
On a filtered repo, revset predicates related to these sets will be
properly filtered because of revset's own pre-filtering.
The current branchcache construction is not aware of filtering. We keep
the status quo, ensuring that the branch cache logic is computed as
before: without any filtering.
This decorator ensures the method is run on an unfiltered version of the
repository. See follow-up commit for details.
This decorator is not named `unfiltered` because it would clash with the
`unfiltered` method on `localrepo` itself.
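
A decorator of this kind could look roughly like the sketch below. It
assumes only that repo-like objects expose an `unfiltered()` method.
The `BaseRepo` and `HidingView` classes are hypothetical and exist
just to exercise the decorator.

```python
import functools


def unfilteredmethod(orig):
    """Run the decorated method on the unfiltered version of the repo."""
    @functools.wraps(orig)
    def wrapper(repo, *args, **kwargs):
        return orig(repo.unfiltered(), *args, **kwargs)
    return wrapper


class BaseRepo:
    """Hypothetical unfiltered repository holding a list of revisions."""

    def __init__(self, revs):
        self.revs = list(revs)

    def unfiltered(self):
        return self

    @unfilteredmethod
    def allrevs(self):
        # Always sees every revision, even when called through a view.
        return list(self.revs)


class HidingView(BaseRepo):
    """Hypothetical filtered view that hides some revisions."""

    def __init__(self, repo, hidden):
        self._repo = repo
        self.revs = [r for r in repo.revs if r not in hidden]

    def unfiltered(self):
        return self._repo
```

Calling `allrevs()` on a `HidingView` returns the full revision list of
the underlying `BaseRepo`, while the view's own `revs` stays filtered.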
This commit is part of the changelog level filtering effort. It returns
the main "unfiltered" version of a repo-like object. For localrepo this
means the same localrepo object. But this method will be overwritten
by the filtered versions of a repository to return the core unfiltered
version of the repo.
Introducing this simple method first allows later commits to prepare
for the use of a filtered version of a repository.
A new repo method is added because a lot of users may call it. At the
end of this series of commits, about 40 calls exist in core and hgext.
For changelog level filtering to take effect it needs to be used for
every iteration.
This changeset removes usage of `range` and `xrange` that survived the first
pass.
During merge of branches, it is useful to compare merge results against
the two parents. This change adds this support to hgweb. To specify
which parent to compare to, use rev/12300:12345 where 12300 is a
parent changeset number. Two links are added to changeset web page so
that one can choose which parent to compare to.
scmutil.checknewlabel takes a repo object as its first argument.
When the call to this function was added in 4d438984605c, the
first argument was mistakenly set to 'None'.
When committing to a repo with lots of files (>170000),
manifest.py:addlistdelta takes some time because it's editing a large
array many times. Changing it to build a new array instead of editing
the old one saves around 0.04 seconds on a 1.64 second commit. A 2.5%
gain.
The gain here is pretty minor, but it was blatantly at the top of the
profiler report and the fix is straightforward.
I tested it by comparing the arrays produced by the new and old logic
while running all of the tests.
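
The general idea of building a new array instead of repeatedly editing
the old one can be sketched as follows. The function name and the
(start, end, content) delta shape are assumptions made for this
example; the real addlistdelta may differ in its details.

```python
import array


def apply_deltas(data, deltas):
    """Build a new byte array from `data` with `deltas` applied.

    `deltas` is assumed to be a sorted list of non-overlapping
    (start, end, content) tuples, where data[start:end] is replaced
    by the bytes in `content`.
    """
    out = array.array('B')
    last = 0
    for start, end, content in deltas:
        out.extend(data[last:start])   # copy the untouched region once
        out.frombytes(content)         # append the replacement bytes
        last = end
    out.extend(data[last:])            # copy the tail
    return out
```

Each untouched region is copied exactly once, whereas editing a large
array in place shifts its tail on every insertion or deletion.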
If a template iterator is implemented with a generator, the iterator is
exhausted after we use it. This leads to undesired behavior in templates.
This change converts a generator-based iterator to a list-based iterator
when the template engine first detects a generator-based iterator. All
future uses of the iterator will use the list instead.
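
The underlying problem and the fix can be reproduced in a few lines of
plain Python. The `materialize` helper is a hypothetical name used for
illustration; it is not the template engine's actual function.

```python
import types


def materialize(value):
    """Return a list if `value` is a generator, else return it unchanged."""
    if isinstance(value, types.GeneratorType):
        return list(value)
    return value


def files():
    yield 'a.txt'
    yield 'b.txt'


gen = files()
first_pass = list(gen)
second_pass = list(gen)    # the generator is already exhausted here

stable = materialize(files())   # a list can be iterated any number of times
```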
We often need to perform rev iteration in reverse order. This
changeset makes it possible to do so, in order to avoid costly reverse
or reversed() calls later.
This also speeds up other commands that use findmissing, like
incoming and merge --preview. With a large linear repository (>400000
commits) and with one incoming changeset, incoming is sped up from
around 4-4.5 seconds to under 3.
One of the major reasons rebase is slow in large repositories is
the computation of the detach set: the set of ancestors of the
changesets to rebase not in the destination parent. This is currently
done via a revset that does two walks all the way to the root of
the DAG. Instead of doing that, to find ancestors of a set <revs>
not in another set <common> we walk up the tree in reverse revision
number order, maintaining sets of nodes visited from <revs>, <common>
or both.
For the common case where the sets are close both topologically and
in revision number (relative to repository size), this has been
found to speed up rebase by around 15-20%. When the nodes are farther
apart and the DAG is highly branching, it is harder to say which
would win.
Here's how long computing the detach set takes in a linear repository
with over 400000 changesets, rebasing near tip:
Rebasing across 4 changesets
Revset method: 2.2s
New algorithm: 0.00015s
Rebasing across 250 changesets
Revset method: 2.2s
New algorithm: 0.00069s
Rebasing across 10000 changesets
Revset method: 2.4s
New algorithm: 0.019s
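
The walk described above can be sketched in Python as follows. This is
an illustration under simplifying assumptions, not the code from the
patch: the DAG is a hypothetical dict mapping each revision number to
a tuple of its parents, parents always have lower revision numbers
than their children, and roots have an empty tuple.

```python
def missing_ancestors(parents, revs, common):
    """Return ancestors of `revs` (inclusive) not ancestors of `common`.

    `parents` maps rev -> tuple of parent revs (hypothetical DAG shape).
    The result is sorted by increasing revision number.
    """
    revsvisit = set(revs)
    basesvisit = set(common)
    bothvisit = revsvisit & basesvisit
    revsvisit -= bothvisit
    basesvisit -= bothvisit
    if not revsvisit:
        return []

    missing = []
    start = max(revsvisit | basesvisit | bothvisit)
    for curr in range(start, -1, -1):
        if not revsvisit:
            break    # nothing reachable only from <revs> is left to visit
        if curr in bothvisit:
            bothvisit.discard(curr)
            for p in parents[curr]:
                revsvisit.discard(p)
                basesvisit.discard(p)
                bothvisit.add(p)
        elif curr in revsvisit:
            missing.append(curr)
            revsvisit.discard(curr)
            for p in parents[curr]:
                if p in basesvisit or p in bothvisit:
                    basesvisit.discard(p)
                    bothvisit.add(p)
                else:
                    revsvisit.add(p)
        elif curr in basesvisit:
            basesvisit.discard(curr)
            for p in parents[curr]:
                if p in revsvisit or p in bothvisit:
                    revsvisit.discard(p)
                    bothvisit.add(p)
                else:
                    basesvisit.add(p)
    missing.reverse()
    return missing
```

Because the loop stops as soon as no revision reachable only from
<revs> remains, the work is bounded by the distance between the two
sets rather than by the distance to the root of the DAG.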
The bisect command does not have an option to limit itself only to
subdirectories, but it's possible to use revsets for the --skip option
for the same effect. Given the relative obscurity of revsets, it helps
to have this as another example for bisect.
In an upcoming patch, we will add index information to all git diffs, not
only binary diffs, so this code needs to be moved to a more appropriate
place.
Also, since this information is used for patch headers, it makes more
sense to be in the patch module, along with other patch-related metadata.
addmodehdr is a header helper, same as diffline, so it doesn't
need to be a top-level function and can be nested under trydiff.
In upcoming patches we will generalize this approach for
all headers.
Make diffline more readable by using strings with placeholders rather
than appending to a list from many ifs, which made it difficult to
understand the actual output format.
Before, quiet mode produced no diffline header for mercurial as a
side effect of not populating "revs". This was a weird side effect,
and we will always need revs for git index header that will be added
in upcoming patches, so now we just check ui.quiet from diffline
directly.
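
A placeholder-based diffline might look like the sketch below. The
signature, with explicit `git` and `quiet` flags instead of reading
diff options and ui.quiet, is an assumption made so that the example
is self-contained.

```python
def diffline(a, b, revs, git=False, quiet=False):
    """Return the 'diff ...' header line for one file (sketch only)."""
    if git:
        return 'diff --git a/%s b/%s\n' % (a, b)
    if quiet:
        # Quiet mode produces no header for the non-git format.
        return ''
    if revs:
        revinfo = ' '.join('-r %s' % rev for rev in revs)
        return 'diff %s %s\n' % (revinfo, a)
    return 'diff %s\n' % a
```

Each branch shows the complete output format on a single line, which
is the readability gain described above.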
diffline is not part of diff computation, so it makes more sense
to place it with other header generation in patch module.
In upcoming patches we will generalize this approach for
all headers added in the patch, including the git index
header.
diffline was called from trydiff for binary diffs and from unidiff
for text diffs. In this patch we unify those calls into one.
diffline is also a header, not part of diff mechanisms, so it makes
sense to remove that responsibility from the mdiff module. In
upcoming patches we will move diffline to patch module and
keep grouping responsibilities.
b85diff generates a binary diff, so we move this code to mdiff module
along with unidiff for text diffs. All diffing mechanisms will be in the
same place.
In an upcoming patch we will remove the responsibility to print the
index header from b85diff and move it back to patch, since it's
a patch metadata header, not part of the diff generation.
Currently, when obtaining an archive snapshot of a repository via the
web interface, subrepositories are not included in the snapshot. I
introduce an option, archivesubrepos, which allows this.
'*' causes the resulting RE to match 0 or more repetitions of the preceding RE:
>>> bool(re.search('.*', ''))
True
This causes an infinite loop because currently we're only checking if there was
a match without looking at where we are in the searched string.
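
The failure mode and one way to guard against it can be sketched as
follows. The `find_all` helper is hypothetical and is not the code
touched by this change; it only shows that a search loop has to
advance past zero-width matches to be sure of terminating.

```python
import re


def find_all(pattern, text):
    """Collect match spans, always making progress through `text`."""
    regex = re.compile(pattern)
    spans = []
    pos = 0
    while pos <= len(text):
        m = regex.search(text, pos)
        if m is None:
            break
        spans.append(m.span())
        if m.end() > m.start():
            pos = m.end()
        else:
            # Zero-width match: step forward by hand, otherwise the
            # next search would match at the same position forever.
            pos = m.end() + 1
    return spans
```

Without the `else` branch, a pattern such as '.*' applied to an empty
string would keep matching at position 0 and the loop would never end.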
Bookmarks persistence still showed a fair amount of its legacy as a
monkeypatching extension. This encapsulates all bookmarks
serialization and parsing in a single class, and offers a single
location where other bookmarks storage engines can be substituted
in. As a result, many files no longer import the bookmarks module,
which strikes me as an encapsulation win.
This doesn't do anything to the current bookmark state yet, but I'm
hoping to put that in the bmstore class as well.
The update to this code was minimal when the `allsuccessors` argument was
changed from a list to a set. As this code is getting my attention again, I
realised we can drastically simplify this part of the code by issuing a
single call to `allsuccessors`.
bundle() revset expression returns all changes that are present
in the bundle file (no matter whether they are in the repo or not).
Bundle file should be specified via -R option.
ui contains repo specific configuration, so do not use it when there is a repo.
But pass it to hg.peer when there is no repo. Then it only contains global
configuration.
Do not pass ui because it contains the configuration of the repo. It is the
same object as repo.ui.
When a repo is passed to hg.peer, the global configuration is read from
repo.baseui.
Before this patch, repository-local configuration was not isolated
between repositories in a subrepo tree, because the "localrepository"
object for each subrepository was created with the "ui" instance of
its parent.
So, local configuration of the parent or higher repositories was
also visible in children or lower ones.
This patch uses "baseui" instead of "ui" to create repository object:
the former contains only global configuration.
This patch also copies the 'ui.commitsubrepos' configuration so that
commits recurse in the subrepo tree, because it may be set in
"repo.ui" rather than in "repo.baseui".
The message "updating bookmark @ failed!" in test-bookmarks-pushpull.t
is correct, because the changeset that the @ bookmark points to is not
pushed to the target repository.
Before this change a bookmark named "default" or a branch named "@" would
cause the wrong changeset to be checked out.
The change in output of test-hardlinks.t is due to the fact that no unneeded
tag lookups for the tags "@" or "default" happen, therefore the cache file is
not created.
The `%ln` revset substitution does not accept unknown nodes. We prune unknown
nodes from the potential successors before computing descendants.
This has no impact on the result of this function:
- descendants of unknown changesets are unknown,
- all successors of unknown changesets are already returned by the call that
returned those same unknown changesets,
- unknown changesets are never a valid destination for a bookmark.
These slashes are a hangover from issue3612, fixed in d5787cfaa7cf.
Although the bugfix in that commit is correct, the test it adds
does not replicate the conditions for the bug correctly.