See previous patch descriptions for the motivation.
The tests reflect the current state of the world -- as we add support we'll see
changes in the test output.
In an upcoming patch we'll enable support as an option to 'hg diff' as well.
The tests reflect the current state of the world -- as we add support we'll see
changes in the test output.
This is a very silly case and not particularly likely to happen in the wild,
but it turns out we can hit it in a couple of places. As we tune the storage
parameters we're likely to hit more such cases.
The affected test cases all have smaller revlogs now.
Before this patch, while "hg convert", largefiles avoids copying
largefiles in the working directory into the store area by combination
of setting "repo._isconverting" in "mercurialsink{before|after}" and
checking it in "copytostoreabsolute".
This avoiding is needed while "hg convert", because converting doesn't
update largefiles in the working directory.
But this implementation is not efficient, because:
- invocation in "markcommitted" can easily ensure updating
largefiles in the working directory
"markcommitted" is invoked only when new revision is committed via
"commit" of "localrepository" (= with files in the working
directory). On the other hand, "commitctx" may be invoked directly
for in-memory committing.
- committing without updating the working directory (e.g. "import
--bypass") also needs this kind of avoiding
For efficiency of this kind of avoiding, this patch does:
- move "copyalltostore" invocation into "markcommitted"
- remove meaningless procedures below:
- hooking "mercurialsink{before|after}" to (un)set "repo._isconverting"
- checking "repo._isconverting" in "copytostoreabsolute"
This patch invokes "copyalltostore" also in "_commitcontext", because
"_commitcontext" expects that largefiles in the working directory are
copied into store area after "commitctx". In this case, the working
directory is used as a kind of temporary area to write largefiles out,
even though converted revisions are committed via "commitctx" (without
updating normal files).
Before this patch, "hg transplant --continue" may record incorrect
standins, because largefiles extension always avoid updating standins
while transplanting, even though largefiles in the working directory
may be modified manually at the 1st commit of "hg transplant --continue".
But, on the other hand, updating standins should be avoided at
subsequent commits for efficiency reason.
To update standins only at the 1st commit of "hg transplant
--continue", this patch uses "automatedcommithook", which updates
standins by "lfutil.updatestandinsbymatch()" only at the 1st commit of
resuming.
Even after this patch, "repo._istransplanting = True" is still needed
to avoid some status report while updating largefiles in
"lfcommands.updatelfiles()".
This is reason why this patch omits not "repo._istransplanting = True"
in "overriderebase" but examination of "getattr(repo,
"_istransplanting", False)" in "updatestandinsbymatch".
At "hg transplant --merge REV", largefiles newly coming from the 2nd
parent (= REV) are marked as "a"(dded) by "patch.patch()", and have to
be marked as "n"(ormal) after commit.
But until changeset 978713c45992, such largefiles were still marked as
"a" unexpectedly even after commit, because no additional entry is
added to filelog of such largefiles and they aren't listed in
"repo[newnode].files()" in this case: "newnode" is one of newly
committed changeset (= result of "repo.commit()").
"updatelfiles" invocation in "overridetransplant" shadows this problem
by forcibly synchronizing lfdirstate to dirstate.
Now, "updatelfiles" invocation in "overridetransplant" is redundant,
because changeset 978713c45992 made "markcommitted" use "ctx.files()"
to get targets of "synclfdirstate" instead of "repo[newnode].files()".
The new rebase revset didn't check for the case when there are no common
ancestors. Now it does. The new behavior should be the same as the pre-3.2
behavior. Added a test.
simplejson produces slightly different output from the built-in json
module, specifically:
* It uses 0.000 instead of 0.0000
* It likes to put a trailing space after a comma
This change works around both of those variations.
A matcher is required when enabling the subrepo option on archival.archive(),
because that calls match.narrowmatcher(), which accesses fields on the object.
It's therefore probably a bad idea to default the matcher to None on archive(),
but that's a fix for default.
This keyword remapping was introduced in 236440938a03 as part of converting
generator based iterators into list based iterators, mentioning "undesired
behavior in template" when a generator is exhausted, but doesn't say what and
introduces no tests.
The problem with the remapping was that it corrupted the output for keywords
like 'extras', 'file_copies' and 'file_copies_switch' in templates such as:
$ hg log -r 82a4f5557c6b --template "{file_copies % ' File: {file_copy}\n'}"
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
File: mercurial/changelog.py (mercurial/hg.py)
What was happening was that in the first call to runtemplate() inside runmap(),
'lm' mapped the keyword (e.g. file_copies) to the appropriate showxxx() method.
On each subsequent call to runtemplate() in that loop however, the keyword was
mapped to a list of the first item's pieces, e.g.:
'file_copy': ['mercurial/changelog.py', ' (', 'mercurial/hg.py', ')']
Therefore, the dict for the second and any subsequent items were not processed
through the corresponding showxxx() method, and the first item's data was
reused.
The 'extras' keyword regressed in 56b014c52204, and 'file_copies' regressed in
4e182fb53989 for other reasons. The common thread of things fixed by this seems
to be when a list of dicts are passed to the templatekw._hybrid class.
After running "hg forget README && hg addremove", README will still be
reported as removed, while "hg forget README && hg add README" adds it
back so it gets reported as clean. It seems like they should behave
the same. Furthermore, it seems like no files should remain untracked
after 'hg addremove && hg commit' (or 'hg commit -A'). For these
reasons, change the behavior of addremove so it does add forgotten
files back.
The problem is with scmutil._interestingfiles(), which reports the
file as removed, so scmutil.addremove() does not add it. Fix by
teaching _interestingfiles() to report forgotten files separately from
removed files and make addremove() add forgotten files back. However,
do not treat forgotten files as sources for rename detection. Note
that since removed and forgotten files are treated the same before
this change, forgotten files were considered sources for rename
detection.
Also update the other caller, marktouched(), in the same way as
addremove().
I accidentally did 'hg forget .' and tried to undo the operation with
'hg add .'. I expected the files to be reported as either modified or
clean, but they were still reported as removed. It turns out that
forgotten files are only added back if they are listed explicitly, as
shown by the following two invocations. This makes it hard to recover
from the mistake of forgetting a lot of files.
$ hg forget README && hg add README && hg status -A README
C README
$ hg forget README && hg add . && hg status -A README
R README
The problem lies in cmdutil.add(). That method checks that the file
isn't already tracked before adding it, but it does so by checking the
dirstate, which does have an entry for forgotten files (state 'r'). We
should instead be checking whether the file exists in the
workingctx. The workingctx is also what we later call add() on, and
that method takes care of transforming the add() into a normallookup()
on the dirstate.
Since we're changing repo.dirstate into wctx, let's also change
repo.walk into wctx.walk for consistency (repo.walk calls wctx.walk,
so we're simply inlining the call).
The current heuristic for deciding between storing delta and full texts
is based on ratio of (sizeofdeltas)/(sizeoffulltext).
In some cases (for example a manifest for ahuge repo) this approach
can result in extremely long delta chains (~30,000) which are very slow to
read. (In the case of a manifest ~500ms are added to every hg command because of that).
This commit introduces "revlog.maxchainlength" configuration variable that will
limit delta chain length.
The randomness in the discovery protocol made this problem hard to reproduce.
The test mocks random.sample to make sure we hit the problem every time. The
set iteration order also made the output unstable ... but with the issue fixed,
it is stable.
We have tests for the status across from '.^' to the working copy. It
makes sense to have the similar tests for the inter-revision status
between '.^' and '.' and for the dirstate status in the same
place.
The initial commit was there when we had a group of tests that
compared against an empty base, but since those tests no longer exist,
we can drop the empty commit.
It's getting a little hard to read the ~30 calls to 'hg status' with
one per file. Instead, let's use one glob for each expected
status. For example, modified files can be listed with
'glob:content1_*_content[23]-tracked'. That also nicely becomes an
explanation for why each status is expected.
The second group of tests in test-status-rev compare to an empty
revision. The first group of tests that compare to the first commit
should be testing all the same states with the missing_* files, so
drop the second group of tests.
The post-transaction hooks run after the lock release (because hooks may want to
touch the repository), but they must only run if the transaction is successfully
closed.
We use the new 'addpostclose' method on transaction to register a callback
installing this post-lock-release call.
The status for missing_content2_content2-untracked doesn't get
reported at all. Since the file does exist in the working copy, it
should reported as unknown. Document that in the test.
Start using the generate-working-copy-states.py script that's shared
with test-revert.t, instead of creating the states manually in the
test. This adds several states that are currently missing. We will
start checking those states later.
To prepare for using generate-working-copy-states.py for generating
the files and their content, let's start by renaming the files
according to the naming scheme used by that script.
With the recent change in naming of the generated files, it becomes
much easier to generate the files by iterating over all the possible
states than over the state transitions.
Before this patch, "hg rebase --continue" may record incorrect
standins, because largefiles extension always avoid updating standins
while rebasing, even though largefiles in the working directory may be
modified manually at the 1st commit of "hg rebase --continue".
But, on the other hand, updating standins should be avoided at
subsequent commits for efficiency reason.
To update standins only at the 1st commit of "hg rebase --continue",
this patch introduces state-full callable object
"automatedcommithook", which updates standins by
"lfutil.updatestandinsbymatch()" only at the 1st commit of resuming.
Even after this patch, "repo._isrebasing = True" is still needed to
avoid some status report while updating largefiles in
"lfcommands.updatelfiles()".
This is reason why this patch omits not "repo._isrebasing = True" in
"overriderebase" but examination of "getattr(repo, "_isrebasing",
False)" in "updatestandinsbymatch".
This patch removes "--rebase" specific code path for "hg pull" in
"overridepull", because previous patch makes it meaningless: now,
"rebase.rebase" ("orig" invocation in this patch) can
update/commit largefiles safely without "repo._isrebasing = True".
As a side effect of removing "rebase.rebase" invocation in
"overridepull", this patch removes "nothing to rebase ..." message in
"test-largefiles.t", which is shown only when rebase extension is
enabled AFTER largefiles:
before this patch:
1. "dispatch" invokes "pullrebase" of rebase as "hg pull" at
first, because rebase wraps "hg pull" later
2. "pullrebase" invokes "overridepull" of largefiles as "orig",
even though rebase assumes that "orig" is "pull" of commands
3. "overridepull" executes "pull" and "rebase" directly
3.1 "pull" pulls changesets and creates new head "X"
3.2 "rebase" rebases current working parent "Y" on "X"
4. "overridepull" returns to "pullrebase"
5. "pullrebase" tries to rebase, but there is nothing to be done,
because "Y" is already rebased on "X". then, it shows "nothing
to rebase ..."
after this patch:
1. "dispatch" invokes "pullrebase" of rebase as "hg pull"
2. "pullrebase" invokes "overridepull" of largefiles as "orig"
3. "overridepull" executes "pull" as "orig"
4. "overridepull" returns to "pullrebase"
5. revision "Y" is not yet rebased, so "pullrebase" doesn't shows
"nothing to rebase ..."
As another side effect of removing "rebase.rebase" invocation, this
patch fixes issue3861, which occurs only when rebase extension is
enabled BEFORE largefiles:
before this patch:
1. "dispatch" invokes "overridepull" of largefiles at first,
because largefiles wrap "hg pull" later
2. "overridepull" executes "pull" and "rebase" explicitly
2.1 "pull" pulls changesets and creates new head "X"
2.2 "rebase" rebases current working parent, but fails because
no revision is checked out in issue3861 case
3. "overridepull" returns to "dispatch" with exit code 1 returned
from "rebase" at (2.2)
4. "hg pull" terminates with exit code 1 unexpectedly
after this patch:
1. "dispatch" invokes "overridepull" of largefiles at first
2. "overridepull" invokes "pullrebase" of rebase as "orig"
3. "pullrebase" invokes "pull" as "orig"
4. "pullrebase" invokes "rebase", and it fails
5. "pullrebase" returns to "overridepull" with exit code 0
(because "pullrebase" ignores result of "pull" and "rebase")
6. "overridepull" returns to "dispatch" with exit code 0 returned
from "rebase" at (5)
7. "hg pull" terminates with exit code 0
When a file is missing in the 'parent' version and is tracked but
missing in the working directory, which happens by the 'missing' or
'removed' types, and the 'clean' type in the working directory, the
file does not exist in the working directory (unlike it would had the
'deleted' type been used). Thus, the *_missing_missing_tracked are not
actually tracked and they end up testing the same state as
*_missing_missing_untracked. To make them tracked, add a temporary
file, just like we do for the delete case. For simplicity's sake,
let's make sure the gen-revert-cases.py script always puts a file in
the working directory, whether or not it's going to be deleted.
Future patches will change how the output of 'gen-revert-cases.py
filelist' is generated, so now we want the order to depend on just the
filename again.
This is the main patch in a series. See motivation in earlier patch.
In this patch, we actually change the names of the generated
files. For example, the file that is currently called missing_clean
becomes missing_missing_missing-tracked and it's clearer that it
should be tracked. It turns out that since the state was not
previously clear, it ended up testing an untracked state, which was
the same as for missing_clean. We'll fix this in a later patch.
Let's also change the content from (base,parent,wc) to
(content1,content2,content3) to make them all the same length so they
line up when displayed.
The next patch will change the names of the files produced by the
script in test-revert. In order to reduce the size and increase the
clarity of the next patch, make the order produced by the internal
'gen-revert-cases.py filelist' command independent of the filenames.
By putting the file content rather than keys in the 'combination'
list, we restrict the knowledge of 'ctxcontent' and 'wccontent' to the
loop generating the combinations. That will make it easier to replace
the generation code.
The 'wccontent' variable has eight different states, four of them
tracked, and the other four untracked (at least when the file existed
in the parent revision). Among these eight states, 'removed' sticks
out by lacking the 'untracked-' prefix despite resulting in an
untracked state. To make the symmetry clearer, and to prepare for
future patches, rename 'removed' to 'untracked-deleted', which is
exactly what it is.
Note that, unlike 'remove', 'deleted' is configured in
gen-revert-cases.py to have content in the working directory and that
that content is instead expected to be removed in the test script.
However, no changes are needed to the test script, since it already
contains 'hg forget *untracked*' and 'rm *deleted*', which together
have the same effect as 'hg remove'.
See additional motivation in earlier patch.
The tests for removed_deleted and removed_removed test the same state
as removed_clean and removed_untracked-clean, respectively. Drop the
duplicate tests.
See additional motivation in earlier patch.
The tests for added_revert and added_untracked-revert test the same
state as added_deleted and added_removed, respectively. Drop the
duplicate tests.
See additional motivation in earlier patch.
This is the first step in a series that aims to put the state, not the
state transitions, in the filenames of the files generated by the
gen-revert-cases.py script. The possible state of a file in a revision
and in the working copy is only whether it exists and what its content
is (the tests don't care check flags). In the dirstate, the only state
is whether it's tracked or not. With the new naming, the file that is
currently called modified_untracked-clean now becomes
content1_content2_content2-untracked, for example.
By putting these states in the filename, it becomes easier to see that
we're not missing or duplicating any state, and to check that the
state is what we think it is. For example, the file that is currently
called missing_clean becomes missing_missing_missing-tracked and it's
clearer that it should be tracked.
Putting the content in the filename will also make the tests of file
content (e.g. "cat ../content-parent.txt") very obvious.
When we put the state in the filename, the filenames clearly need to
be unique. However, it turns out that some states are currently tested
multiple times. The 'revert' transition in the script means to take
the content from the grandparent. If the parent is the same as the
grandparent, there is no change compared to the parent, which is
exactly what 'clean' means. Avoid testing the same state twice.