Commit Graph

427 Commits

Author SHA1 Message Date
Nicolas Vigier
a9bf6787d2 patch: add support for git delta hunks
When creating patches modifying binary files using "git format-patch",
git creates 'literal' and 'delta' hunks. Mercurial currently supports
'literal' hunks only, which makes it impossible to import patches with
'delta' hunks.

This changeset adds support for 'delta' hunks. It is a reimplementation
of patch-delta.c from git :
http://git.kernel.org/cgit/git/git.git/tree/patch-delta.c
2013-11-27 18:39:00 +01:00
Augie Fackler
fb91efd733 makememctx: move from patch to context to break import cycle 2013-11-06 22:09:15 -05:00
Johan Bjork
6055bcdec2 patch: ensure valid git diffs if source/destination file is missing (issue4046)
This is arguably a workaround, a better fix may be in the repo to
ensure that it won't list a file 'modified' unless there is a file
context for the previous version.
2013-10-13 08:38:30 -04:00
Augie Fackler
cdf1bbd9e1 merge with stable 2013-10-07 17:47:55 -04:00
Johan Bjork
bbce7855f8 patch: Fix nullid for binary git diffs (issue4054)
The index for an empty file in git is not 0, but sha-1("blobl 0\0").
2013-10-07 17:47:19 -04:00
Matt Mackall
aa456e3925 import: cut commit messages at --- unconditionally (issue2148)
We used to do this based on X-mailer: mentioning git, but git doesn't
put X-mailer in its git-format-patch output.
2013-07-27 19:31:14 -05:00
Augie Fackler
67877b90cc python2.4: fix imports of sub-packages of the email package
These all have an obvious comment so if/when we finally ditch Python
2.4 we can eradicate them easily.
2013-09-24 15:10:32 -04:00
Augie Fackler
897a535360 patch: correct import of email module 2013-09-20 10:15:51 -04:00
Siddharth Agarwal
4340781945 patch: use scmutil.marktouched instead of scmutil.addremove
addremove required paths relative to the cwd, which meant a lot of extra code
that transformed paths into relative ones. That code is now gone as well.
2013-04-04 13:45:21 -07:00
Sean Farley
0a28dbdae7 patch: match 'diff --git a/' instead of 'diff --git'
This reduces the likelihood of a traceback when trying to email a
patch that happens to have 'diff --git' at the beginning of a line
in the description, as this patch did:

http://markmail.org/message/wxpgowxd7ucxygwe
2013-03-22 17:27:06 -05:00
Johan Bjork
7b64b304ec diff: fix binary file removals in git mode.
With the previous version, a binary file removal diff generated with
2013-03-04 22:34:11 +00:00
Mads Kiilerich
275333d6c9 util: fold ENOENT check into unlinkpath, controlled by new ignoremissing flag
Refactor a common pattern.
2012-12-28 11:55:57 +01:00
Mads Kiilerich
ac8e1fc147 check-code: there must also be whitespace between ')' and operator
The check pattern only checked for whitespace between keyword and operator.

Now it also warns:
 >     x = f(),7
 missing whitespace after ,
 >     x = f()+7
 missing whitespace in expression
2012-12-09 23:33:16 +01:00
Bryan O'Sullivan
0dc2115da0 subrepo: use posixpath when diffing, for consistent paths
This fixes a Windows failure in test-subrepo-recursion.t introduced
by c9345d017402.
2012-11-27 14:58:00 -08:00
Guillermo Pérez
2c1f380e8a diff: move index header generation to patch
In an upcoming patch, we will add index information to all git diffs, not
only binary diffs, so this code needs to be moved to a more appropriate
place.

Also, since this information is used for patch headers, it makes more
sense to be in the patch module, along with other patch-related metadata.
2012-11-15 15:16:41 -08:00
Guillermo Pérez
f655c9c8c8 patch: make _addmodehdr a function under trydiff
addmodehdr is a header helper, same as diffline, so it doesn't
need to be a top-level function and can be nested under trydiff.

In upcoming patches we will generalize this approach for
all headers.
2012-11-15 15:06:32 -08:00
Guillermo Pérez
e12d2e6289 diff: rewrite diffline
Make diffline more readable, using strings with placeholders
rather than appending to a list from many ifs that makes
difficult to understand the actual output format.
2012-11-15 13:57:03 -08:00
Guillermo Pérez
4339c3bf86 diff: swap and simplify diffline args
Args order swapped, since a, b are more important than revs
(which is only used on non-git format), and change to read opts
from context.
2012-11-15 13:52:51 -08:00
Guillermo Pérez
565ed512ac diff: change how quiet mode supresses diffline
Before, quiet mode produced no diffline header for mercurial as a
side effect of not populating "revs". This was a weird side effect,
and we will always need revs for git index header that will be added
in upcoming patches, so now we just check ui.quiet from diffline
directly.
2012-11-15 13:49:04 -08:00
Guillermo Pérez
33e8ab6c12 diff: move diffline to patch module
diffline is not part of diff computation, so it makes more sense
to place it with other header generation in patch module.

In upcoming patches we will generalize this approach for
all headers added in the patch, including the git index
header.
2012-11-15 12:19:03 -08:00
Guillermo Pérez
9482817c96 diff: unify calls to diffline
diffline was called from trydiff for binary diffs and from unidiff
for text diffs. In this patch we unify those calls into one.

diffline is also a header, not part of diff mechanisms, so it makes
sense to remove that responsibility from the mdiff module. In
upcoming patches we will move diffline to patch module and
keep grouping responsibilities.
2012-11-15 12:16:08 -08:00
Guillermo Pérez
d54d996bfb diff: move b85diff to mdiff module
b85diff generates a binary diff, so we move this code to mdiff module
along with unidiff for text diffs. All diffing mechanisms will be in the
same place.

In an upcoming patch we will remove the responsibility to print the
index header from b85diff and move it back to patch, since it's
a patch metadata header, not part of the diff generation.
2012-11-06 14:04:05 -08:00
Mads Kiilerich
688b9e2048 check-code: indent 4 spaces in py files 2012-07-31 03:30:42 +02:00
Bryan O'Sullivan
abdf4a8227 util: subclass deque for Python 2.4 backwards compatibility
It turns out that Python 2.4's deque type is lacking a remove method.
We can't implement remove in terms of find, because it doesn't have
find either.
2012-06-01 17:05:31 -07:00
Matt Mackall
da90115a02 merge with stable 2012-06-01 15:14:29 -05:00
Patrick Mezard
5315858d38 patch: keep patching after missing copy source (issue3480)
When applying a patch renaming/copying 'a' to 'b' on a revision where
'a' does not exist, the patching process would abort immediately,
without processing the remaining hunks and without reporting it. This
patch makes the patching no longer abort and possible hunks applied on
the copied/renamed file be written in reject files.
2012-06-01 17:37:56 +02:00
Bryan O'Sullivan
bef5b61512 cleanup: use the deque type where appropriate
There have been quite a few places where we pop elements off the
front of a list.  This can turn O(n) algorithms into something more
like O(n**2).  Python has provided a deque type that can do this
efficiently since at least 2.4.

As an example of the difference a deque can make, it improves
perfancestors performance on a Linux repo from 0.50 seconds to 0.36.
2012-05-15 10:46:23 -07:00
Brodie Rao
7f47d4e347 check-code: ignore naked excepts with a "re-raise" comment
This also promotes the naked except check from a warning to an error.
2012-05-13 13:18:06 +02:00
Brodie Rao
92158e04de cleanup: "raise SomeException()" -> "raise SomeException" 2012-05-12 16:00:58 +02:00
Brodie Rao
a7ef0a0cc5 cleanup: "not x in y" -> "x not in y" 2012-05-12 16:00:57 +02:00
Brodie Rao
d6a6abf2b0 cleanup: eradicate long lines 2012-05-12 15:54:54 +02:00
Matt Mackall
8420ca3aee merge with stable 2012-05-12 12:23:49 +02:00
Yuya Nishihara
9228db24a3 patch: fix segfault against unified diffs which start line is zero
Since f7e538c3b7ba, if a chunk starts with "@@ -0,1", oldstart turns into
a negative value. Because diffhelpers.testhunk() doesn't expect negative bstart,
it bypasses "alen > blen - bstart" condition and segfaults at
"PyList_GET_ITEM(b, i + bstart)".
2012-05-12 16:10:01 +09:00
Patrick Mezard
601035e58a patch: clarify binary hunk parsing loop 2012-04-29 11:19:51 +02:00
Patrick Mezard
816bc359f4 patch: be more tolerant with EOLs in binary diffs (issue2870)
The only place where an trailing CR could be meaningful is in the "diff --git"
line as part of a filename, and existing code already rule out this
possibility. Extend the CR/LF filtering to the whole binary hunk.
2012-04-26 21:44:02 +02:00
Patrick Mezard
293cc1e994 patch: include file name in binary patch error messages
$ hg import --no-commit ../mercurial_1915035238540490516.patch
 applying ../mercurial_1915035238540490516.patch
 abort: could not extract binary data

Becomes:

 abort: could not extract "binary2" binary data
2012-04-26 21:44:00 +02:00
Patrick Mezard
4076709ad5 patch: display a nice error for invalid base85 data
Before, import was terminating with a traceback. Now it says:

 $ hg import --no-commit ../bad.patch
 applying ../bad.patch
 abort: could not decode binary patch: bad base85 character at position 66
2012-04-21 19:58:18 +02:00
Patrick Mezard
b5270209ca patch: fix patch hunk/metdata synchronization (issue3384)
Git patches are parsed in two phases: 1) extract metadata, 2) parse actual
deltas and merge them with the previous metadata. We do this to avoid
dependency issues like "modify a; copy a to b", where "b" must be copied from
the unmodified "a".

Issue3384 is caused by flaky code I wrote to synchronize the patch metadata
with the emitted hunk:

 if (gitpatches and
     (gitpatches[-1][0] == afile or gitpatches[-1][1] == bfile)):
     gp = gitpatches.pop()[2]

With a patch like:

 diff --git a/a b/c
 copy from a
 copy to c
 --- a/a
 +++ b/c
 @@ -1,1 +1,2 @@
  a
 +a
 @@ -2,1 +2,2 @@
  a
 +a
 diff --git a/a b/a
 --- a/a
 +++ b/a
 @@ -1,1 +1,2 @@
  a
 +b

the first hunk of the first block is matched with the metadata for the block
"diff --git a/a b/c", then the second hunk of the first block is matched with
the metadata of the second block "diff --git a/a b/a", because of the "or" in
the code paste above. Turning the "or" into an "and" is not enough as we have
to deal with /dev/null cases for each file.

We I remove this broken piece of code:

 # copy/rename + modify should modify target, not source
 if gp.op in ('COPY', 'DELETE', 'RENAME', 'ADD') or gp.mode:
     afile = bfile

because "afile = bfile" set "afile" to stuff like "b/file" instead of "a/file",
and because this only happens for git patches, which afile/bfile are ignored
anyway by applydiff().

v2:
- Avoid a traceback on git metadata desynchronization
2012-04-21 21:40:25 +02:00
Patrick Mezard
41dcbfe34f patch: be more tolerant with "Parent" header (issue3356)
Here is how export and mq write the "Parent" header:

  mq:     # Parent XXXXX
  export: # Parent  XXXXX

then import expects exactly 2 spaces while mq tolerates one or more. So "hg
import --exact" truncates mq generated patches header by one character and
fails. This patch aligns import "Parent" header parsing on mq one. I do not
expect spaces in parent references anytime soon.

Reported by Stefan Ring <stefanrin@gmail.com>
2012-04-20 19:11:54 +02:00
Patrick Mezard
83515678f4 patch: remove useless variable assignment 2012-04-05 19:23:04 +02:00
Patrick Mezard
20e5c736c0 patch: fuzz more aggressively to match patch(1) behaviour
The previous code was assuming a default context of 3 lines. When fuzzing, it
would take this value in account to reduce the amount of removed line from
hunks top or bottom. For instance, if a hunk has only 2 lines of bottom
context, fuzzing with fuzz=1 would do nothing and with fuzz=2 it would remove
one of those lines. A hunk with one line of bottom context could not be fuzzed
at all.  patch(1) has apparently no such restrictions and takes the fuzz level
at face value.

- test-import.t: fuzz/offset changes at the beginning of file are explained by
  the new fuzzing behaviour and match patch(1) ones. Patching locations are
  different but those of my patch(1) do not make a lot of sense right now
  (patched output are the same)

- test-import-bypass.t: more agressive fuzzing makes a patching supposed to
  fail because of context, succeed. Change the diff to avoid this.

- test-mq-merge.t: more agressive fuzzing would allow the merged patch to apply
  with fuzz, but fortunately we disallow this behaviour. The new output is
  kept.

I have not enough experience with patch(1) fuzzing to know whether aligning our
implementation on it is a good or bad idea. Until now, it has been the
implementation reference. For instance, "qpush" tolerates fuzz (test-mq-merge.t
runs the special case of pushing merge revisions where fuzzing is forbidden).
2012-02-13 17:22:35 +01:00
Patrick Mezard
39da9af001 patch: fix fuzzing of hunks without previous lines (issue3264)
When applying hunks such as:

  @@ -2,1 +2,2 @@
   context
  +change

fuzzing would empty the "old" block and make patchfile.apply() traceback.
Instead, we apply the new block at specified location without testing.

The "bottom hunk" test was removed as patch(1) has no problem applying hunk
with no context in the middle of a file.
2012-02-13 16:47:31 +01:00
Patrick Mezard
1be0e99fa3 patch: make hunk.fuzzit() compute the fuzzed start locations
- It moves hunks processing weirdness where it belongs
- It helps reusing said weirdness whenever fuzzit() is called, like during the
  actual hunk fuzzing.
2012-02-13 13:51:38 +01:00
Patrick Mezard
e783da0202 patch: fuzz old and new lines at the same time
In theory, the fuzzed offsets for old and new lines should be exactly the same
as they are based on hunk parsing. Make it true in practice.
2012-02-13 13:21:00 +01:00
Patrick Mezard
a7eea871c8 import: handle git renames and --similarity (issue3187)
There is no reason to discard copy sources from the set of files considered by
addremove(). It was done to handle the case where a first patch would create
'a' and a second one would move 'a' to 'b'. If these patches were applied with
--no-commit, 'a' would first be marked as added, then unlinked and dropped from
the dirstate but still passed to addremove(). A better fix is thus to exclude
removed files which ends being dropped from the dirstate instead of removed.

Reported by Jason Harris <jason@jasonfharris.com>
2012-02-16 13:03:42 +01:00
Jesus Espino Garcia
9146d99b1d patch: a little bit more robust line counting on diff --stat (issue3183) 2012-01-21 23:50:58 +01:00
Matt Mackall
9bfa890ee6 copies: split the copies api for "normal" and merge cases (API) 2012-01-04 15:48:02 -06:00
Matt Mackall
3d60dfdd1c merge with stable 2011-11-30 17:15:39 -06:00
Benoit Allard
f3be9c5304 diff: '\ No newline at end of file' is also not part of the header
Diff containing '\ No newline at end of file' were colorized incorrectly.
2011-11-29 19:51:35 +01:00
Nicolas Venegas
2582d1e3d9 mdiff/patch: fix bad hunk handling for unified diffs with zero context
Prior to this patch "hg diff -U0", i.e., zero lines of context, would
output hunk headers with a start line one greater than what GNU patch
and git output. Guido van Rossum documents the unified diff format[1]
as having a start line value "one lower than one would expect" for
zero length hunks.

Comparing the behaviour of the three systems prior to this patch in
transforming

  c1
  c3

to

  c1
  c2
  c3

- GNU "diff -U0" reports the hunk as "@@ -1,0 +2 @@"
- "git diff -U0" reports the hunk as "@@ -1,0 +2 @@"
- "hg diff -U0" reports the hunk as "@@ -2,0 +2,1 @@"

After this patch, "hg diff -U0" reports "@@ -1,0 +2,1 @@".

Since "hg export --config diff.unified=0" outputs zero-context unified
diffs, "hg import" has also been updated to account for start lines
one less than expected for zero length hunk ranges.

[1]: http://www.artima.com/weblogs/viewpost.jsp?thread=164293
2011-11-09 16:55:59 -08:00