When creating patches modifying binary files using "git format-patch",
git creates 'literal' and 'delta' hunks. Mercurial currently supports
'literal' hunks only, which makes it impossible to import patches with
'delta' hunks.
This changeset adds support for 'delta' hunks. It is a reimplementation
of patch-delta.c from git :
http://git.kernel.org/cgit/git/git.git/tree/patch-delta.c
The tests in test-annotate.t and test-import-git.t that relied on trailing
space in a file created by a here string is now masked by a literal 'EOL'
string that is removed.
The test had a long chain of commits depending on execbits in some of first
commits. The hashes are checked throughout the file, so there was no elegant
way to make it pass both with and without execbits.
We now rollback the execbits-or-not commits and make a stable change instead.
The hash chain is thus updated once but is now a bit more stable. The test
coverage should be unaltered.
The only place where an trailing CR could be meaningful is in the "diff --git"
line as part of a filename, and existing code already rule out this
possibility. Extend the CR/LF filtering to the whole binary hunk.
$ hg import --no-commit ../mercurial_1915035238540490516.patch
applying ../mercurial_1915035238540490516.patch
abort: could not extract binary data
Becomes:
abort: could not extract "binary2" binary data
Before, import was terminating with a traceback. Now it says:
$ hg import --no-commit ../bad.patch
applying ../bad.patch
abort: could not decode binary patch: bad base85 character at position 66
Git patches are parsed in two phases: 1) extract metadata, 2) parse actual
deltas and merge them with the previous metadata. We do this to avoid
dependency issues like "modify a; copy a to b", where "b" must be copied from
the unmodified "a".
Issue3384 is caused by flaky code I wrote to synchronize the patch metadata
with the emitted hunk:
if (gitpatches and
(gitpatches[-1][0] == afile or gitpatches[-1][1] == bfile)):
gp = gitpatches.pop()[2]
With a patch like:
diff --git a/a b/c
copy from a
copy to c
--- a/a
+++ b/c
@@ -1,1 +1,2 @@
a
+a
@@ -2,1 +2,2 @@
a
+a
diff --git a/a b/a
--- a/a
+++ b/a
@@ -1,1 +1,2 @@
a
+b
the first hunk of the first block is matched with the metadata for the block
"diff --git a/a b/c", then the second hunk of the first block is matched with
the metadata of the second block "diff --git a/a b/a", because of the "or" in
the code paste above. Turning the "or" into an "and" is not enough as we have
to deal with /dev/null cases for each file.
We I remove this broken piece of code:
# copy/rename + modify should modify target, not source
if gp.op in ('COPY', 'DELETE', 'RENAME', 'ADD') or gp.mode:
afile = bfile
because "afile = bfile" set "afile" to stuff like "b/file" instead of "a/file",
and because this only happens for git patches, which afile/bfile are ignored
anyway by applydiff().
v2:
- Avoid a traceback on git metadata desynchronization
There is no reason to discard copy sources from the set of files considered by
addremove(). It was done to handle the case where a first patch would create
'a' and a second one would move 'a' to 'b'. If these patches were applied with
--no-commit, 'a' would first be marked as added, then unlinked and dropped from
the dirstate but still passed to addremove(). A better fix is thus to exclude
removed files which ends being dropped from the dirstate instead of removed.
Reported by Jason Harris <jason@jasonfharris.com>