Before this patch, "hg summary --large" shows how many largefiles are
changed or added in outgoing revisions only in the point of the view
of filenames.
For example, according to the number of outgoing largefiles shown in
"hg summary" output, users should expect that the former below costs
much more to upload outgoing largefiles than the latter.
- outgoing revisions add a hundred largefiles, but all of them refer
the same data entity
in this case, only one data entity is outgoing, even though "hg
summary" says that a hundred largefiles are outgoing.
- a hundred outgoing revisions change only one largefile with
distinct data
in this case, a hundred data entities are outgoing, even though
"hg summary" says that only one largefile is outgoing.
But the latter costs much more than the former, in fact.
This patch shows also how many data entities are outgoing at "hg
summary" by counting number of unique hash values for outgoing
largefiles.
This patch introduces "_getoutgoings" to centralize the logic
(de-duplication, too) into it for convenience of subsequent patches,
even though it is not required in "hg summary" case.
This patch adds tests for summary/outgoing improved in subsequent
patches, to reduce amount of diffs in each patches.
This patch adds new revisions below:
- revision #2 adds new largefiles, but they contain as same data as
one already existing
this causes that multiple standins refer the same data entity
- revision #3, #4 and #5 change the already existing largefile
this causes that multiple data entities are outgoing for the standin.
#5 can be used to check de-duplication of "(hash, filename)" pair.
When the matcher is exact, there's no reason to iterate over the entire
manifest. It's much more efficient to iterate over the list of files instead.
For a repository with approximately 300,000 files, this speeds up
hg log -l10 --patch --follow for a frequently modified file from 16.5 seconds
to 10.5 seconds.
The arguments to log --patch --follow are expected to be exact paths.
This will be used to make manifest filtering for these cases more efficient in
upcoming patches.
Previously, the 'patch' code for hg log --patch --follow would try to resolve
patterns relative to the repository root rather than the current working
directory. Fix that by using match.files instead of pats, as done elsewhere
nearby.
Before this patch, all operations applied on ".gitmodules" at git
source revisions are treated as modification, even if they are
actually removal of it.
If removal of ".gitmodules" is treated as modification unexpectedly,
"hg convert" is aborted by the exception raised in
"retrievegitmodules()" for ".gitmodules" at the git source revision
removing it, because that revision doesn't have any information of
".gitmodules".
This patch detects removal of ".gitmodules" at git source revisions
correctly.
If ".gitmodules" is removed at the git source revision, this patch
records "hex(nullid)" as the contents hash value for ".hgsub" and
".hgsubstate" at the destination revision.
This patch makes "getfile()" raise IOError also for ".hgstatus" and
".hgsubstate" if the contents hash value is "hex(nullid)", and this
tells removal of ".hgstatus" and ".hgsubstate" at the destination
revision to "localrepository.commitctx()" correctly.
For files other than ".hgstatus" and ".hgsubstate", checking the
contents hash value in "getfile()" may be redundant, because
"catfile()" for them also does so.
But this patch chooses writing it only once at the beginning of
"getfile()", to avoid writing same code twice both for ".hgsub" and
".hgsubstate" separately.
This separation makes it easier to extend/hook building commit text
from the specified context.
This patch uses 'committext' instead of 'edittext' for names of newly
added variable and function, because the former is more purpose
specific than the latter, even though 'edittext' in 'buildcommittext'
is left as it is to reduce amount of diff.
Before this patch, filemerge slices byte sequence directly to trim
conflict markers, but this may cause:
- splitting at intermediate multi-byte sequence
- incorrect calculation of column width (length of byte sequence is
different from columns in display in many cases)
This patch uses 'util.ellipsis' to trim custom conflict markers
correctly, even if multi-byte characters are used in them.
Before this patch, with careless configuration (missing '|firstline'
filtering for '{desc}' keyword, for example), '[ui]
mergemarkertemplate' can make conflict markers multiple lines.
For ordinary users, advantage of allowing '[ui] mergemarkertemplate'
to generate multiple lines for customizing seems to be less than
advantage of disallowing it for safety.
This patch uses only the first line of the conflict marker generated
from '[ui] mergemarkertemplate' configuration for safety.
Before this patch, 'progress' extension applies 'len' on byte sequence
to get column width of it, but it causes incorrect result, when length
of byte sequence and columns in display are different from each other
in multi-byte characters.
This patch uses 'encoding.colwidth' to get column width of items in
output line correctly, even if it contains multi-byte characters.
Before this patch, 'progress' extension trims items in output line by
directly slicing byte sequence, but it may split at intermediate
multi-byte sequence.
This patch uses 'encoding.trim' to trim items in output line
correctly, even if it contains multi-byte characters.
Before this patch, 'progress' extension applies 'len' on byte sequence
to get column width of it, but it causes incorrect result, when length
of byte sequence and columns in display are different from each other
in multi-byte characters.
This patch uses 'encoding.colwidth' to get column width of output line
correctly, even if it contains multi-byte characters.
Before this patch, 'progress' extension trims output line by directly
slicing byte sequence, but it may split at intermediate multi-byte
sequence.
This patch uses 'encoding.trim' to trim output line correctly, even if
it contains multi-byte characters.
"rm -f loop.pyc" before changing "loop.py" in "test-progress.t"
ensures that re-compilation of "loop.py", even if "loop.py" and
"loop.pyc" have same timestamp in seconds.
Before this patch, trimming description of each changesets in histedit
may split at intermediate multi-byte sequence.
This patch uses 'util.ellipsis' to trim description of each changesets
instead of directly slicing byte sequence.
Even though 'util.ellipsis' adds '...' as ellipsis when specified
string is trimmed (= this changes result of trimming), this patch uses
it, because:
- it can be used without any additional 'import', and
- ellipsis seems to be better than just trimming, for usability
Before this patch, 'util.ellipsis' tried to avoid splitting at
intermediate multi-byte sequence, but its implementation was incorrect.
Internal function '_ellipsis' trims specified unicode sequence not at
most maxlength 'columns in display', but at most maxlength number of
'unicode characters'.
def _ellipsis(text, maxlength):
if len(text) <= maxlength:
return text, False
else:
return "%s..." % (text[:maxlength - 3]), True
In many encodings, number of unicode characters can be different from
columns in display.
This patch replaces 'ellipsis' implementation by 'encoding.trim',
which can trim string at most maxlength columns in display correctly,
even though specified string contains multi-byte characters.
'_ellipsis' is removed in this patch, because it is referred only from
'ellipsis'.
Newly added 'trim' is used to trim multi-byte characters at most
specified columns correctly: directly slicing byte sequence should be
replaced with 'encoding.trim', because the former may split at
intermediate multi-byte sequence.
Slicing unicode sequence ('uslice') and concatenation with ellipsis
('concat') are defined as function, to make enhancement in subsequent
patch easier.
This option had very limited utility and counterintuitive behavior and
collided unfortunately with the much later -B option.
Normally we would no-op such a feature so as to avoid annoying existing
scripts. However, we have to weigh that against the silent misbehavior
that results when users mistakenly intended to use -B: because -b
takes no arg, the bookmark gets interpreted as a normal revision, and
gets stripped without removing the associated bookmark, while also not
backing up the revision in question. A no-op behavior or warning would only
remove the latter half of the misadventure.
The only users I can find of this feature were using it in error and
have since stopped. The few (if any) remaining users of this feature
would be better served by --no-backup.
In the context of standalone Hg receiving a set of incoming changes, it makes
sense for the Bugzilla module to cache basic setup to avoid reconnecting
to Bugzilla for each change. After processing the changes, Hg will exit
and so the connection is short-lived.
But this doesn't work too well when used from a long-lived environment
such as hgweb or Kallithea where, for example, the connection can time out.
So take the simple approach, abandon the cache and do the basic setup on
each call. This fixes current problems with Kallithea.
Prior to this, doing "hg rebase -s @foo -d @" would delete @, which is
obviously wrong: a primary bookmark should never be automatically deleted.
This change blocks the deletion, but doesn't yet properly clean up the
divergence: @ should replace @foo.
Previously, a glob pattern of the form 'foo/**/bar' would match 'foo/a/bar' but
not 'foo/bar'. That was because the '**' in 'foo/**/bar' would be translated to
'.*', making the final regex pattern 'foo/.*/bar'. That pattern doesn't match
the string 'foo/bar'.
This is a bug because the '**/' glob matches the empty string in standard Unix
shells like bash and zsh.
Fix that by making the ending '/' optional if an empty string can be matched.
The defect was that copies were always duplicated against the target
revision, rather than the first parent of the revision being
rebased. This produced nominally correct results if changes were
rebased one at a time (or with --collapse), but was wrong if we
rebased a sequence of changesets which contained a sequence of copies.
This fixes a crash that may happen when using mercurial 3.0.x.
The _gethiddenblockers function assumed that the output of tags.readlocaltags()
was a dict mapping tags to of valid nodes. However this was not necessarily the
case. When a repository had obsolete revisions and had local tag pointing to a
non existing revision was found, many mercurial commands would crash.
This revision fixes the problem by removing any tags from the output of
tags.readlocaltags() which point to invalid nodes.
We may want to add a warning when this happens (although it might be
annoying to get that warning for every command, possibly even more than once per
command).
A test for this problem has been added to test-obsolete.t. Without this fix the
test would output:
$ hg tags
abort: 00changelog.i@3816541e5485: no node!
[255]
Instead of:
$ hg tags
tiptag 2:3816541e5485
tip 2:3816541e5485
visible 0:193e9254ce7e
This will make resolve use correct locking and thus make it more safe.
Resolve is usually a long running command spending a lot of time waiting for
user input on hard problems. It is thus a real world scenario to start multiple
resolves at once or run other commands (such as up -C and merge) while resolve
is running. Proper locking prevents that.