Changeset d79feb65f3ee added advertising of supported changegroup version
through the new 'b2x:changegroup' capability. However, this capability is not
new and has been around since 3.1 with an empty value. This makes new clients
unable to push to 3.2 servers through bundle2 as they cannot find a common
changegroup version to use from an empty list.
Treating empty 'b2x:changegroup' value as old client fixes it.
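The fallback can be sketched as follows. This is a minimal illustration, not the actual bundle2 code: the helper name 'supportedcgversions', the plain-dict representation of capabilities, and the '02' version string are assumptions made for the example.

```python
def supportedcgversions(caps):
    """Return the changegroup versions a peer advertises.

    Hypothetical helper: 'caps' maps capability names to lists of values.
    An empty 'b2x:changegroup' value is what older peers send, so it is
    read as "only the original '01' format is supported".
    """
    versions = caps.get('b2x:changegroup', [])
    if not versions:  # empty value: treat the peer as an old one
        return ['01']
    return list(versions)


oldpeer = {'b2x:changegroup': []}
newpeer = {'b2x:changegroup': ['01', '02']}
common = set(supportedcgversions(oldpeer)) & set(supportedcgversions(newpeer))
```

With the fallback in place the two peers share '01' instead of ending up with an empty intersection.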
Instead of calling 'cl.finalize()' by hand (possibly at a bogus time) we
register it in the transaction during 'delayupdate' and rely on 'tr.close()' to
call it at the right time.
The new 'addfinalize' method allows people to register a callback to
be triggered when the transaction is closed. This aims to get rid of
explicit calls to 'changelog.finalize'. This also obsoletes the
'onclose' function but removing it is not in the scope of this series.
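A toy model of the idea, assuming a simplified transaction with a nesting counter. The class below is not Mercurial's transaction and the zero-argument callback signature is an assumption; it only shows that registered callbacks run once, on the outermost close.

```python
class toytransaction(object):
    """Illustrative stand-in for a nestable transaction."""

    def __init__(self):
        self._count = 1        # nesting depth
        self._finalizers = []  # callbacks to run on the real close

    def nest(self):
        self._count += 1
        return self

    def addfinalize(self, callback):
        self._finalizers.append(callback)

    def close(self):
        self._count -= 1
        if self._count:        # still nested: closing is a no-op
            return
        for callback in self._finalizers:
            callback()


calls = []
tr = toytransaction()
tr.addfinalize(lambda: calls.append('changelog finalized'))
inner = tr.nest()
inner.close()                  # nested close: nothing runs yet
callsafternested = list(calls)
tr.close()                     # outermost close: the callback fires once
```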
The 'delayupdate' method now takes a transaction object and registers its
'_writepending' method for execution in 'transaction.writepending()'. The hook can then
use 'transaction.writepending()' directly.
At some point this will allow the addition of other file creation
during writepending.
The contents of the transaction must be flushed to disk before running
a hook. But it must be flushed to a special file so that the normal
reader does not use it. This logic is currently in the changelog only.
We add some facility to register such operations in the transaction
itself.
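As a rough sketch of such a facility (the class and the 'addpending' name are hypothetical here, chosen for illustration), the transaction keeps a list of writers and runs all of them when pending data must be made visible to a hook:

```python
class toytransaction(object):
    """Illustrative stand-in; only the pending-writer registry is modeled."""

    def __init__(self):
        self._pendingcallbacks = []

    def addpending(self, callback):
        # callback() must write its pending data to a special file and
        # return True if it wrote anything
        self._pendingcallbacks.append(callback)

    def writepending(self):
        """Run every registered writer; report whether any data is pending."""
        anypending = False
        for callback in self._pendingcallbacks:
            # call first, then combine, so no writer is skipped
            anypending = bool(callback()) or anypending
        return anypending


written = []


def changelogwriter():
    written.append('00changelog.i.a')
    return True


tr = toytransaction()
tr.addpending(changelogwriter)
tr.addpending(lambda: False)   # a writer with nothing to flush
haspending = tr.writepending()
```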
The current way we use the 'delayupdate' mechanism is wrong. We call
'delayupdate' right after the transaction retrieval, then we call 'finalize'
right before calling 'tr.close()'. The 'finalize' call will -always- result in a
flush to disk, making the data available to all readers. But the 'tr.close()' may
be a no-op if the transaction is nested. This would result in data:
1) exposed to readers too early,
2) rolled back by another part of the transaction after such exposure.
So we need to end up in a situation where we call 'finalize' a single time when
the transaction actually closes. For this purpose we need to be able to call
'delayupdate' and '_writepending' multiple times and 'finalize' once. This was
not possible with the previous state of the code.
This changeset refactors the code to make this possible. We buffer data in memory
as much as possible and fall back to writing to a ".a" file after the first call
to '_writepending'.
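The buffering strategy can be illustrated with a toy model. Everything below is an assumption made for the example: the 'delayedlog' class is hypothetical, and a dict of strings stands in for the filesystem. It is not the changelog implementation.

```python
class delayedlog(object):
    """Toy append-only log with delayed visibility."""

    def __init__(self, files, name):
        self._files = files    # dict acting as a fake filesystem
        self._name = name
        self._buffer = []      # data kept in memory
        self._divert = False   # True once data goes to name + '.a'

    def append(self, data):
        if self._divert:
            self._files[self._name + '.a'] += data
        else:
            self._buffer.append(data)

    def _writepending(self):
        """Expose pending data in the '.a' file; may be called repeatedly."""
        if not self._divert and self._buffer:
            real = self._files.get(self._name, '')
            self._files[self._name + '.a'] = real + ''.join(self._buffer)
            self._buffer = []
            self._divert = True
        return self._divert

    def finalize(self):
        """Make the data visible to normal readers; called once at close."""
        if self._divert:
            self._files[self._name] = self._files.pop(self._name + '.a')
        else:
            real = self._files.get(self._name, '')
            self._files[self._name] = real + ''.join(self._buffer)
        self._buffer = []
        self._divert = False


files = {'00changelog.i': 'A'}
log = delayedlog(files, '00changelog.i')
log.append('B')        # buffered in memory only
log._writepending()    # pending data now lives in the '.a' file
log.append('C')        # later writes go straight to the '.a' file
log._writepending()    # a second call is harmless
pendingview = dict(files)
log.finalize()
```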
2ec3e28dea6b changed 'sample' from a list to a set. The iteration order is thus
undefined and the yesno indices are not stable.
To solve this, repeat the listification and comment from elsewhere in the code.
Note: the randomness in the discovery protocol can make this problem hard to
reproduce.
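A minimal sketch of the listification, assuming a hypothetical 'querysample' helper and a callable standing in for the remote query. The point is only that one list fixes the order, so answer i always refers to sample[i].

```python
def querysample(sample, remoteknown):
    """Pair each sampled node with the remote's yes/no answer.

    'remoteknown' is a stand-in for the remote query: it takes a list of
    nodes and returns a list of booleans in the same order.
    """
    sample = list(sample)          # pin a single iteration order
    yesno = remoteknown(sample)
    return dict(zip(sample, yesno))


remotenodes = set(['a', 'c'])
result = querysample(set(['a', 'b', 'c', 'd']),
                     lambda nodes: [n in remotenodes for n in nodes])
```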
2ec3e28dea6b made it possible that the initial head check didn't include all
heads. If that is the case, don't use the early exit just because this random
sample happened to be 'all known'.
Note: the randomness in the discovery protocol can make this problem hard to
reproduce.
The status for missing_content2_content2-untracked doesn't get
reported at all. Since the file does exist in the working copy, it
should be reported as unknown. Document that in the test.
Start using the generate-working-copy-states.py script that's shared
with test-revert.t, instead of creating the states manually in the
test. This adds several states that are currently missing. We will
start checking those states later.
This keyword remapping was introduced in 236440938a03 as part of converting
generator based iterators into list based iterators, mentioning "undesired
behavior in template" when a generator is exhausted, but doesn't say what and
introduces no tests.
The problem with the remapping was that it corrupted the output for keywords
like 'extras', 'file_copies' and 'file_copies_switch' in templates such as:
$ hg log -r 82a4f5557c6b --template "{file_copies % ' File: {file_copy}\n'}"
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
What was happening was that in the first call to runtemplate() inside runmap(),
'lm' mapped the keyword (e.g. file_copies) to the appropriate showxxx() method.
On each subsequent call to runtemplate() in that loop however, the keyword was
mapped to a list of the first item's pieces, e.g.:
'file_copy': ['mercurial/changelog.py', ' (', 'mercurial/hg.py', ')']
Therefore, the dict for the second and any subsequent items were not processed
through the corresponding showxxx() method, and the first item's data was
reused.
The 'extras' keyword regressed in 56b014c52204, and 'file_copies' regressed in
4e182fb53989 for other reasons. The common thread of things fixed by this seems
to be when a list of dicts is passed to the templatekw._hybrid class.
To prepare for using generate-working-copy-states.py for generating
the files and their content, let's start by renaming the files
according to the naming scheme used by that script.
With the recent change in naming of the generated files, it becomes
much easier to generate the files by iterating over all the possible
states than over the state transitions.
Putting "lambda *msg, **opts: None" (= avoid printing messages always)
into "_lfstatuswriters" while rebasing makes explicit passing
"printmessage = False" for "updatelfiles()" useless.
This patch also removes setting/unsetting "repo._isrebasing" in
"overriderebase", because there is no code path referring to it.
This patch makes "updatelfiles()" get the appropriate function to write
largefiles-specific status messages via "getstatuswriter()".
This patch introduces None as "print messages if needed", because True
(forcibly writing) and False (forcibly ignoring) are already used for
"printmessage" of "updatelfiles".
A subsequent patch will move the "avoid printing messages only while
automated committing" decision from the caller of "updatelfiles()" into
"getstatuswriter()".
"lfutil.getstatuswriter" is the utility to get appropriate function to
write largefiles specific status out from "repo._lfstatuswriters".
This patch uses a "stack" that always holds at least one element instead
of a flag like "_isXXXXing", because:
- the former works correctly even when customizations are nested, and
- guaranteeing at least one element makes an empty check unnecessary
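The stack approach can be sketched like this. The 'fakerepo' class and the one-argument 'getstatuswriter' are simplifications assumed for the example, not the largefiles code itself.

```python
class fakerepo(object):
    """Illustrative stand-in for a repository object."""

    def __init__(self, write):
        # the bottom element is always present, so no empty check is needed
        self._lfstatuswriters = [write]


def getstatuswriter(repo):
    """Return the innermost (most recently pushed) status writer."""
    return repo._lfstatuswriters[-1]


out = []
repo = fakerepo(lambda *msg, **opts: out.append(msg))

getstatuswriter(repo)('getting changed largefiles\n')

# a customization (e.g. while rebasing) pushes a silent writer ...
repo._lfstatuswriters.append(lambda *msg, **opts: None)
try:
    getstatuswriter(repo)('this message is dropped\n')
finally:
    # ... and pops it again, restoring whatever was active before
    repo._lfstatuswriters.pop()

getstatuswriter(repo)('1 largefiles updated\n')
```

Because each customization only pushes and pops its own element, nested customizations restore the previous writer correctly, which a single boolean flag cannot do.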
Before this patch, "hg rebase --continue" may record incorrect
standins, because the largefiles extension always avoids updating standins
while rebasing, even though largefiles in the working directory may be
modified manually at the 1st commit of "hg rebase --continue".
On the other hand, updating standins should be avoided at
subsequent commits for efficiency reasons.
To update standins only at the 1st commit of "hg rebase --continue",
this patch introduces the stateful callable object
"automatedcommithook", which updates standins by
"lfutil.updatestandinsbymatch()" only at the 1st commit of resuming.
Even after this patch, "repo._isrebasing = True" is still needed to
avoid some status reports while updating largefiles in
"lfcommands.updatelfiles()".
This is the reason why this patch removes not "repo._isrebasing = True" in
"overriderebase" but the check of "getattr(repo, "_isrebasing",
False)" in "updatestandinsbymatch".
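A sketch of such a stateful callable follows. Injecting the update function through the constructor is an assumption made to keep the example self-contained; the description above has the hook call "lfutil.updatestandinsbymatch()" directly.

```python
class automatedcommithook(object):
    """Update standins only at the first commit after resuming."""

    def __init__(self, resuming, updatestandins):
        self.resuming = resuming
        self._updatestandins = updatestandins  # injected for illustration

    def __call__(self, repo, match):
        if self.resuming:
            self.resuming = False  # skip the update at subsequent commits
            return self._updatestandins(repo, match)
        return match


updates = []


def fakeupdate(repo, match):
    updates.append(match)
    return match


hook = automatedcommithook(True, fakeupdate)
first = hook(None, 'match-1')    # 1st commit of the resumed rebase
second = hook(None, 'match-2')   # later commits skip the update
```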
This change allows customizing pre-commit procedures according
to conditions.
This patch uses a "stack" that always holds at least one element instead
of a flag like "_isXXXXing", because:
- the former works correctly even when customizations are nested, and
- guaranteeing at least one element makes an empty check unnecessary
This patch factors out the procedures that update standins before
committing. This is one of the preparations for skipping such
procedures depending on the invocation context.
For example, resuming automated committing (e.g. "hg rebase
--continue") should update standins at the 1st commit, because
largefiles in the working directory may be modified manually. On
the other hand, it should avoid updating standins at subsequent
commits for efficiency reasons.
For simplicity, this patch just moves the procedures mechanically, with
only the replacements below:
- "self" => "repo"
- "lfutil." => (none)
- "orig" invocation => returning "match"
Using "fstandin" instead of "standin" as the name of the local variable for
the loop below is the only special care, because the latter shadows
the function of the same name in "lfutil.py".
[before]
    for standin in standins:
        lfile = lfutil.splitstandin(standin)
        if lfdirstate[lfile] != 'r':
            lfutil.updatestandin(self, standin)
[after]
    for fstandin in standins:
        lfile = splitstandin(fstandin)
        if lfdirstate[lfile] != 'r':
            updatestandin(repo, fstandin)
Before this patch, procedures to update lfdirstate for post-committing
are scattered in "lfilesrepo.commit". In the case of "hg commit" with
patterns for target files ("Case 2"), lfdirstate is updated BEFORE
real committing.
This patch factors out procedures to update lfdirstate for
post-committing into "lfutil.markcommitted", and makes it callable via
"markcommitted" of the context passed to "lfilesrepo.commitctx".
"markcommitted" of the context is called, only when it is committed
successfully.
Passing original "markcommitted" of the context is meaningless in this
patch, but required in subsequent one to prepare something before
invocation of it.
This patch removes the "--rebase" specific code path for "hg pull" in
"overridepull", because the previous patch makes it meaningless: now,
"rebase.rebase" ("orig" invocation in this patch) can
update/commit largefiles safely without "repo._isrebasing = True".
As a side effect of removing "rebase.rebase" invocation in
"overridepull", this patch removes "nothing to rebase ..." message in
"test-largefiles.t", which is shown only when rebase extension is
enabled AFTER largefiles:
before this patch:
1. "dispatch" invokes "pullrebase" of rebase as "hg pull" at
first, because rebase wraps "hg pull" later
2. "pullrebase" invokes "overridepull" of largefiles as "orig",
even though rebase assumes that "orig" is "pull" of commands
3. "overridepull" executes "pull" and "rebase" directly
3.1 "pull" pulls changesets and creates new head "X"
3.2 "rebase" rebases current working parent "Y" on "X"
4. "overridepull" returns to "pullrebase"
5. "pullrebase" tries to rebase, but there is nothing to be done,
because "Y" is already rebased on "X". Then, it shows "nothing
to rebase ..."
after this patch:
1. "dispatch" invokes "pullrebase" of rebase as "hg pull"
2. "pullrebase" invokes "overridepull" of largefiles as "orig"
3. "overridepull" executes "pull" as "orig"
4. "overridepull" returns to "pullrebase"
5. revision "Y" is not yet rebased, so "pullrebase" doesn't show
"nothing to rebase ..."
As another side effect of removing "rebase.rebase" invocation, this
patch fixes issue3861, which occurs only when rebase extension is
enabled BEFORE largefiles:
before this patch:
1. "dispatch" invokes "overridepull" of largefiles at first,
because largefiles wraps "hg pull" later
2. "overridepull" executes "pull" and "rebase" explicitly
2.1 "pull" pulls changesets and creates new head "X"
2.2 "rebase" rebases current working parent, but fails because
no revision is checked out in issue3861 case
3. "overridepull" returns to "dispatch" with exit code 1 returned
from "rebase" at (2.2)
4. "hg pull" terminates with exit code 1 unexpectedly
after this patch:
1. "dispatch" invokes "overridepull" of largefiles at first
2. "overridepull" invokes "pullrebase" of rebase as "orig"
3. "pullrebase" invokes "pull" as "orig"
4. "pullrebase" invokes "rebase", and it fails
5. "pullrebase" returns to "overridepull" with exit code 0
(because "pullrebase" ignores result of "pull" and "rebase")
6. "overridepull" returns to "dispatch" with exit code 0 returned
from "pullrebase" at (5)
7. "hg pull" terminates with exit code 0
Before this patch, largefiles extension wraps only "rebase" in the
command table by "extensions.wrapcommand". But there are some
functions using "rebase.rebase" directly.
Without special care for them, largefiles extension can't work
correctly with such functions. In addition to it, "special care" often
becomes complicated and awkward. For example:
- "unshelve" can't get correct result of "rebase.rebase", because of
lack of special care
- special care for "hg pull --rebase" causes issue3861
This patch wraps "rebase.rebase" for functions using it directly.
For simplicity, this patch keeps 'special care for "hg pull --rebase"'.
It is removed in the subsequent patch.
cg2 supports generaldelta in changegroups, to be used in bundle2.
Since generaldelta is handled directly in cg2, reordering is switched
off by default.
When using bundle2, we find the common subset of supported changegroup-packers
and we pick the max of them. This allows the use of generaldelta-aware changegroups through
bundle2.
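The selection rule can be sketched as below. The function name 'pickcgversion' and the version strings are assumptions for the example; comparing the strings with max() only works while all versions have the same number of digits.

```python
def pickcgversion(localversions, remoteversions):
    """Pick the highest changegroup version supported by both sides."""
    common = set(localversions) & set(remoteversions)
    if not common:
        raise ValueError('no common changegroup version')
    return max(common)


withnewpeer = pickcgversion(['01', '02'], ['02', '01'])
witholdpeer = pickcgversion(['01', '02'], ['01'])
```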
When using bundle2, we find the common subset of supported changegroup-packers
and we pick the max of them. This allows the use of generaldelta-aware changegroups
through bundle2.
The commands getchangegroup, getlocalchangegroup and getsubset now each
have a version ending in -raw. The raw versions return the chunk generator
from the changegroup packer directly, without wrapping it in a chunkbuffer
and unpacker. This avoids extra chunkbuffers in the bundle2 code path.
Also, the raw versions can be extended to support alternative packers
in the future, to be used from bundle2.
The current output is mostly a wall of text. This makes it hard to
actually check something for people with lazy eyes. We use labels and
colors to make it more joyful (and get the patch summaries to stand
out). The colors have been arbitrarily chosen. They can be changed
later if someone has a more scientific choice.
We use a `formatter` object in the perf extensions. This allows the use of
formatted output like json. To avoid adding logic to create a formatter and pass
it around to the timer function in every command, we add a `gettimer` function
in charge of returning a `timer` function as simple as before but embedding an
appropriate formatter.
This new `gettimer` function also returns the formatter as it needs to be
explicitly closed at the end of the command.
example output:
$ hg --config ui.formatjson=True perfvolatilesets visible obsolete
[
 {
  "comb": 0.02,
  "count": 126,
  "sys": 0.0,
  "title": "obsolete",
  "user": 0.02,
  "wall": 0.0199398994446
 },
 {
  "comb": 0.02,
  "count": 117,
  "sys": 0.0,
  "title": "visible",
  "user": 0.02,
  "wall": 0.0250301361084
 }
]
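The shape of this helper can be sketched as follows. The 'listformatter' class, its 'record' and 'close' methods, and the argument-less 'gettimer' are stand-ins invented for the example; they do not reproduce the ui formatter API or the real perf extension signature.

```python
import time


class listformatter(object):
    """Toy formatter that collects one dict per timed item."""

    def __init__(self):
        self.items = []
        self.closed = False

    def record(self, **fields):
        self.items.append(fields)

    def close(self):
        self.closed = True


def gettimer():
    """Return a simple timer function plus the formatter it writes to."""
    fm = listformatter()

    def timer(func, title=None):
        start = time.time()
        result = func()
        fm.record(title=title, wall=time.time() - start)
        return result

    # the caller gets the formatter back because it must close it
    return timer, fm


timer, fm = gettimer()
total = timer(lambda: sum(range(1000)), title='sum')
fm.close()
```

Commands keep calling a plain `timer(func, title=...)` and only need one extra line at the end to close the formatter.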
This will let the bundle2 client and server detect what packer they should be using.
This detection part is not done. I expect it to be done with the addition of the
second packer (with generaldelta support).
We only have "01" right now, but we should get general delta in soon.
Bundle2 is expected to make use of this to advertise and select the right packer
to use on both sides.
Calling 'baseset(repo.changelog)' builds a list for all revisions in
the repo. And we already have the lazy and efficient 'fullreposet'
class for this purpose.
This gives us the usual benefits of the fullreposet but it is less visible
because the matching process itself is very expensive:
revset) matching(100)
before) wall 6.413281 comb 6.420000 user 5.910000 sys 0.510000 (best of 3)
after) wall 6.173608 comb 6.170000 user 5.750000 sys 0.420000 (best of 3)
However, for some complex inputs, this provides a massive speedup:
revset) matching(parents(100))
before) wall 23.890740 comb 23.890000 user 23.450000 sys 0.440000 (best of 3)
after) wall 6.382280 comb 6.390000 user 5.930000 sys 0.460000 (best of 3)
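Why a lazy "everything" set helps can be shown with a toy class. The 'fullset' class below is a hypothetical stand-in and not the 'fullreposet' implementation: membership is a range check and intersection only walks the other operand, so no list of all revisions is ever built.

```python
class fullset(object):
    """Toy set of every revision in range(length), never materialized."""

    def __init__(self, length):
        self._length = length

    def __contains__(self, rev):
        return 0 <= rev < self._length

    def __len__(self):
        return self._length

    def __and__(self, other):
        # intersecting with "everything" only needs to filter the other side
        return [rev for rev in other if rev in self]


revs = fullset(10 ** 7)        # no ten-million-element list is created
parents = [99, 98]
subset = revs & parents
```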
Calling 'baseset(repo.changelog)' builds a list for all revisions in
the repo. And we already have the lazy and efficient 'fullreposet'
class for this purpose.
This gives us the usual benefits of the fullreposet:
revset) 100^1
before) wall 0.002694 comb 0.000000 user 0.000000 sys 0.000000 (best of 897)
after) wall 0.000997 comb 0.000000 user 0.000000 sys 0.000000 (best of 2324)
revset) parents(100)^1
before) wall 0.003832 comb 0.000000 user 0.000000 sys 0.000000 (best of 587)
after) wall 0.001034 comb 0.000000 user 0.000000 sys 0.000000 (best of 2309)
revset) (100^1)^1
before) wall 0.005616 comb 0.000000 user 0.000000 sys 0.000000 (best of 405)
after) wall 0.001030 comb 0.000000 user 0.000000 sys 0.000000 (best of 2258)
Calling 'baseset(repo.changelog)' builds a list for all revisions in the
repo. And we already have the lazy and efficient 'fullreposet' class
for this purpose.
This gives us the usual benefits of the fullreposet:
revset) children(tip~100)
before) wall 0.007469 comb 0.010000 user 0.010000 sys 0.000000 (best of 338)
after) wall 0.003356 comb 0.000000 user 0.000000 sys 0.000000 (best of 755)
Calling 'baseset(repo.changelog)' builds a list for all revisions in
the repo. And we already have the lazy and efficient 'fullreposet'
class for this purpose.
This gives us the usual benefits of the fullreposet:
revset) 100~5
before) wall 0.002712 comb 0.000000 user 0.000000 sys 0.000000 (best of 918)
after) wall 0.000996 comb 0.000000 user 0.000000 sys 0.000000 (best of 2493)
revset) parents(100)~5
before) wall 0.003812 comb 0.010000 user 0.010000 sys 0.000000 (best of 667)
after) wall 0.001038 comb 0.000000 user 0.000000 sys 0.000000 (best of 2361)
revset) (100~5)~5
before) wall 0.005614 comb 0.000000 user 0.000000 sys 0.000000 (best of 446)
after) wall 0.001035 comb 0.000000 user 0.000000 sys 0.000000 (best of 2424)
Calling 'baseset(repo.changelog)' builds a list for all revisions in
the repo. And we already have the lazy and efficient 'fullreposet'
class for this purpose.
This gives us the usual benefits of the fullreposet:
revset) 10:100
before) wall 0.002774 comb 0.000000 user 0.000000 sys 0.000000 (best of 797)
after) wall 0.001977 comb 0.000000 user 0.000000 sys 0.000000 (best of 1244)
revset) parents(10):parents(100)
before) wall 0.005054 comb 0.000000 user 0.000000 sys 0.000000 (best of 481)
after) wall 0.002060 comb 0.000000 user 0.000000 sys 0.000000 (best of 1056)
When a file is missing in the 'parent' version and is tracked but
missing in the working directory, which happens by the 'missing' or
'removed' types, and the 'clean' type in the working directory, the
file does not exist in the working directory (unlike what would happen had the
'deleted' type been used). Thus, the *_missing_missing_tracked files are not
actually tracked and they end up testing the same state as
*_missing_missing_untracked. To make them tracked, add a temporary
file, just like we do for the delete case. For simplicity's sake,
let's make sure the gen-revert-cases.py script always puts a file in
the working directory, whether or not it's going to be deleted.
Future patches will change how the output of 'gen-revert-cases.py
filelist' is generated, so now we want the order to depend on just the
filename again.
This is the main patch in a series. See motivation in earlier patch.
In this patch, we actually change the names of the generated
files. For example, the file that is currently called missing_clean
becomes missing_missing_missing-tracked and it's clearer that it
should be tracked. It turns out that since the state was not
previously clear, it ended up testing an untracked state, which was
the same as for missing_clean. We'll fix this in a later patch.
Let's also change the content from (base,parent,wc) to
(content1,content2,content3) to make them all the same length so they
line up when displayed.
The next patch will change the names of the files produced by the
script in test-revert. In order to reduce the size and increase the
clarity of the next patch, make the order produced by the internal
'gen-revert-cases.py filelist' command independent of the filenames.