Commit Graph

399 Commits

Author SHA1 Message Date
Brodie Rao
92158e04de cleanup: "raise SomeException()" -> "raise SomeException" 2012-05-12 16:00:58 +02:00
Brodie Rao
a7ef0a0cc5 cleanup: "not x in y" -> "x not in y" 2012-05-12 16:00:57 +02:00
Brodie Rao
d6a6abf2b0 cleanup: eradicate long lines 2012-05-12 15:54:54 +02:00
Matt Mackall
8420ca3aee merge with stable 2012-05-12 12:23:49 +02:00
Yuya Nishihara
9228db24a3 patch: fix segfault against unified diffs whose start line is zero
Since f7e538c3b7ba, if a chunk starts with "@@ -0,1", oldstart turns into
a negative value. Because diffhelpers.testhunk() doesn't expect negative bstart,
it bypasses "alen > blen - bstart" condition and segfaults at
"PyList_GET_ITEM(b, i + bstart)".
2012-05-12 16:10:01 +09:00
Patrick Mezard
601035e58a patch: clarify binary hunk parsing loop 2012-04-29 11:19:51 +02:00
Patrick Mezard
816bc359f4 patch: be more tolerant with EOLs in binary diffs (issue2870)
The only place where a trailing CR could be meaningful is in the "diff --git"
line as part of a filename, and existing code already rules out this
possibility. Extend the CR/LF filtering to the whole binary hunk.
2012-04-26 21:44:02 +02:00
Patrick Mezard
293cc1e994 patch: include file name in binary patch error messages
$ hg import --no-commit ../mercurial_1915035238540490516.patch
 applying ../mercurial_1915035238540490516.patch
 abort: could not extract binary data

Becomes:

 abort: could not extract "binary2" binary data
2012-04-26 21:44:00 +02:00
Patrick Mezard
4076709ad5 patch: display a nice error for invalid base85 data
Before, import was terminating with a traceback. Now it says:

 $ hg import --no-commit ../bad.patch
 applying ../bad.patch
 abort: could not decode binary patch: bad base85 character at position 66
2012-04-21 19:58:18 +02:00
Patrick Mezard
b5270209ca patch: fix patch hunk/metadata synchronization (issue3384)
Git patches are parsed in two phases: 1) extract metadata, 2) parse actual
deltas and merge them with the previous metadata. We do this to avoid
dependency issues like "modify a; copy a to b", where "b" must be copied from
the unmodified "a".

Issue3384 is caused by flaky code I wrote to synchronize the patch metadata
with the emitted hunk:

 if (gitpatches and
     (gitpatches[-1][0] == afile or gitpatches[-1][1] == bfile)):
     gp = gitpatches.pop()[2]

With a patch like:

 diff --git a/a b/c
 copy from a
 copy to c
 --- a/a
 +++ b/c
 @@ -1,1 +1,2 @@
  a
 +a
 @@ -2,1 +2,2 @@
  a
 +a
 diff --git a/a b/a
 --- a/a
 +++ b/a
 @@ -1,1 +1,2 @@
  a
 +b

the first hunk of the first block is matched with the metadata for the block
"diff --git a/a b/c", then the second hunk of the first block is matched with
the metadata of the second block "diff --git a/a b/a", because of the "or" in
the code pasted above. Turning the "or" into an "and" is not enough as we have
to deal with /dev/null cases for each file.

I also remove this broken piece of code:

 # copy/rename + modify should modify target, not source
 if gp.op in ('COPY', 'DELETE', 'RENAME', 'ADD') or gp.mode:
     afile = bfile

because "afile = bfile" sets "afile" to stuff like "b/file" instead of "a/file",
and because this only happens for git patches, whose afile/bfile are ignored
anyway by applydiff().

v2:
- Avoid a traceback on git metadata desynchronization
2012-04-21 21:40:25 +02:00
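The desynchronization in issue3384 can be reproduced in isolation. The sketch below wraps the quoted condition in a hypothetical helper, match_or, and assumes the metadata queue holds (afile, bfile, metadata) tuples consumed from the end, as the indexing in the quoted code suggests. The metadata strings are placeholders.

```python
def match_or(gitpatches, afile, bfile):
    """The flawed check quoted above: pop when either file name matches."""
    if (gitpatches and
        (gitpatches[-1][0] == afile or gitpatches[-1][1] == bfile)):
        return gitpatches.pop()[2]
    return None


# Entries are popped from the end, so the first block of the patch is last.
queue = [
    ("a/a", "b/a", "meta for a -> a"),  # second block: plain change to "a"
    ("a/a", "b/c", "meta for a -> c"),  # first block: copy "a" to "c"
]

# Both hunks of the first block carry the same file names.
first = match_or(queue, "a/a", "b/c")
second = match_or(queue, "a/a", "b/c")
```

The first call correctly takes the copy metadata. The second call should find nothing to pop, but because "a/a" matches the source name of the remaining entry, it consumes the metadata that belongs to the second block.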
Patrick Mezard
41dcbfe34f patch: be more tolerant with "Parent" header (issue3356)
Here is how export and mq write the "Parent" header:

  mq:     # Parent XXXXX
  export: # Parent  XXXXX

then import expects exactly 2 spaces while mq tolerates one or more. So "hg
import --exact" truncates the header of mq-generated patches by one character
and fails. This patch aligns import "Parent" header parsing with that of mq. I
do not expect spaces in parent references anytime soon.

Reported by Stefan Ring <stefanrin@gmail.com>
2012-04-20 19:11:54 +02:00
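The difference between the two parsing strategies can be shown with a short sketch. Both helpers below are hypothetical and are not the code from patch.py or mq: parse_parent_fixed models a parser that assumes exactly two spaces, and parse_parent models one that tolerates one or more.

```python
import re

# Hypothetical pattern: one or more spaces between "# Parent" and the node.
PARENT_RE = re.compile(r"^# Parent +(\S+)$")


def parse_parent(line):
    """Tolerant parsing: accept any positive number of spaces."""
    m = PARENT_RE.match(line)
    return m.group(1) if m else None


def parse_parent_fixed(line):
    """Strict parsing: skip a fixed 10-character prefix ("# Parent  ")."""
    if line.startswith("# Parent "):
        return line[10:]
    return None
```

With an mq-style header such as "# Parent abcde", the fixed-width version drops the first character of the node and returns "bcde", while the tolerant version returns "abcde" for both header styles.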
Patrick Mezard
83515678f4 patch: remove useless variable assignment 2012-04-05 19:23:04 +02:00
Patrick Mezard
20e5c736c0 patch: fuzz more aggressively to match patch(1) behaviour
The previous code was assuming a default context of 3 lines. When fuzzing, it
would take this value into account to reduce the number of lines removed from
hunks top or bottom. For instance, if a hunk has only 2 lines of bottom
context, fuzzing with fuzz=1 would do nothing and with fuzz=2 it would remove
one of those lines. A hunk with one line of bottom context could not be fuzzed
at all.  patch(1) has apparently no such restrictions and takes the fuzz level
at face value.

- test-import.t: fuzz/offset changes at the beginning of file are explained by
  the new fuzzing behaviour and match patch(1) ones. Patching locations are
  different but those of my patch(1) do not make a lot of sense right now
  (patched outputs are the same)

- test-import-bypass.t: more aggressive fuzzing makes a patch that was supposed
  to fail because of context succeed. Change the diff to avoid this.

- test-mq-merge.t: more aggressive fuzzing would allow the merged patch to apply
  with fuzz, but fortunately we disallow this behaviour. The new output is
  kept.

I do not have enough experience with patch(1) fuzzing to know whether aligning our
implementation on it is a good or bad idea. Until now, it has been the
implementation reference. For instance, "qpush" tolerates fuzz (test-mq-merge.t
runs the special case of pushing merge revisions where fuzzing is forbidden).
2012-02-13 17:22:35 +01:00
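One possible reading of the fuzzing change, expressed as arithmetic on the number of context lines trimmed from one end of a hunk. Both functions are hypothetical reconstructions from the description above, not the code in patch.py. The "assumed" default of 3 comes from the commit message.

```python
def trimmed_old(context, fuzz, assumed=3):
    """Lines removed under the previous behaviour.

    Context lines missing relative to the assumed default count against
    the fuzz level before anything is actually removed.
    """
    missing = max(0, assumed - context)
    return max(0, fuzz - missing)


def trimmed_new(context, fuzz):
    """Lines removed when the fuzz level is taken at face value."""
    return min(fuzz, context)
```

For a hunk with 2 lines of bottom context, trimmed_old gives 0 for fuzz=1 and 1 for fuzz=2, matching the example in the message, and a hunk with a single context line is never trimmed. trimmed_new removes one line in each of those cases.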
Patrick Mezard
39da9af001 patch: fix fuzzing of hunks without previous lines (issue3264)
When applying hunks such as:

  @@ -2,1 +2,2 @@
   context
  +change

fuzzing would empty the "old" block and make patchfile.apply() traceback.
Instead, we apply the new block at specified location without testing.

The "bottom hunk" test was removed as patch(1) has no problem applying hunk
with no context in the middle of a file.
2012-02-13 16:47:31 +01:00
Patrick Mezard
1be0e99fa3 patch: make hunk.fuzzit() compute the fuzzed start locations
- It moves hunks processing weirdness where it belongs
- It helps reusing said weirdness whenever fuzzit() is called, like during the
  actual hunk fuzzing.
2012-02-13 13:51:38 +01:00
Patrick Mezard
e783da0202 patch: fuzz old and new lines at the same time
In theory, the fuzzed offsets for old and new lines should be exactly the same
as they are based on hunk parsing. Make it true in practice.
2012-02-13 13:21:00 +01:00
Patrick Mezard
a7eea871c8 import: handle git renames and --similarity (issue3187)
There is no reason to discard copy sources from the set of files considered by
addremove(). It was done to handle the case where a first patch would create
'a' and a second one would move 'a' to 'b'. If these patches were applied with
--no-commit, 'a' would first be marked as added, then unlinked and dropped from
the dirstate but still passed to addremove(). A better fix is thus to exclude
removed files which end up being dropped from the dirstate instead of removed.

Reported by Jason Harris <jason@jasonfharris.com>
2012-02-16 13:03:42 +01:00
Jesus Espino Garcia
9146d99b1d patch: a little bit more robust line counting on diff --stat (issue3183) 2012-01-21 23:50:58 +01:00
Matt Mackall
9bfa890ee6 copies: split the copies api for "normal" and merge cases (API) 2012-01-04 15:48:02 -06:00
Matt Mackall
3d60dfdd1c merge with stable 2011-11-30 17:15:39 -06:00
Benoit Allard
f3be9c5304 diff: '\ No newline at end of file' is also not part of the header
Diffs containing '\ No newline at end of file' were colorized incorrectly.
2011-11-29 19:51:35 +01:00
Nicolas Venegas
2582d1e3d9 mdiff/patch: fix bad hunk handling for unified diffs with zero context
Prior to this patch "hg diff -U0", i.e., zero lines of context, would
output hunk headers with a start line one greater than what GNU patch
and git output. Guido van Rossum documents the unified diff format[1]
as having a start line value "one lower than one would expect" for
zero length hunks.

Comparing the behaviour of the three systems prior to this patch in
transforming

  c1
  c3

to

  c1
  c2
  c3

- GNU "diff -U0" reports the hunk as "@@ -1,0 +2 @@"
- "git diff -U0" reports the hunk as "@@ -1,0 +2 @@"
- "hg diff -U0" reports the hunk as "@@ -2,0 +2,1 @@"

After this patch, "hg diff -U0" reports "@@ -1,0 +2,1 @@".

Since "hg export --config diff.unified=0" outputs zero-context unified
diffs, "hg import" has also been updated to account for start lines
one less than expected for zero length hunk ranges.

[1]: http://www.artima.com/weblogs/viewpost.jsp?thread=164293
2011-11-09 16:55:59 -08:00
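The start-line convention for zero-length ranges can be sketched as follows. The helper names and the use of 0-based start indices are assumptions made for illustration; this is not the mdiff implementation.

```python
def hunk_range(start, length):
    """Format one side of a hunk header from a 0-based start index."""
    if length == 0:
        # An empty range names the line *before* the insertion point,
        # which is one lower than the usual 1-based start.
        return "%d,0" % start
    return "%d,%d" % (start + 1, length)


def hunk_header(astart, alen, bstart, blen):
    """Build the "@@ -a,b +c,d @@" line for a hunk."""
    return "@@ -%s +%s @@" % (hunk_range(astart, alen),
                              hunk_range(bstart, blen))


# Inserting "c2" between "c1" and "c3": the old side is empty at index 1,
# the new side is one line at index 1.
header = hunk_header(1, 0, 1, 1)
```

For the example in the commit message this produces "@@ -1,0 +2,1 @@", the post-patch "hg diff -U0" output. Always using start + 1 would give the earlier "-2,0" form.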
Patrick Mezard
cc3315778f annotate: support diff whitespace filtering flags (issue3030)
splitblock() was added to handle blocks returned by bdiff.blocks() which differ
only by blank lines but are not made only of blank lines. I do not know exactly
how it could happen but mdiff.blocks() threshold behaviour makes me think it
can if those blocks are made of very popular lines mixed with popular blank
lines. If it is proven to be wrong, the function can be dropped.

The first implementation made annotate share diff configuration entries. But it
looks like users will use -w/b for annotate but not for diff, on both the
command line and hgweb. Since the latter cannot use command line entries, we
introduce a new [annotate] section duplicating the diff whitespace options.
2011-11-18 12:04:31 +01:00
Patrick Mezard
31705153e3 patch: simplify hunk extents parsing
Do not capture unwanted groups in regexps
2011-11-14 18:16:01 +01:00
Patrick Mezard
1e1f58f358 diffstat: be more picky when marking file as 'binary' (issue2816)
The 'Bin' marker was added to every changed file for which we could not find
any diff changes. This included binary files but also copy/renames and mode
changes. Since Mercurial regular diff format emits a 'Binary file XXX has
changed' line when fed with binary files, we use that and the usual git marker
to tell them from other cases. In particular, new empty files are no longer
reported as binary.

Still, this fix is not complete since copy/renames/mode changes are now
reported as '0' lines changes, instead of 'Bin'.
2011-10-24 13:41:19 +02:00
Kirill Elagin
a331869573 diff: enhance highlighting with color (issue3034)
Now the highlighter knows whether it is in the header of the patch or not.
2011-10-05 09:20:38 +03:00
Matt Mackall
e538620d00 merge with stable 2011-09-27 18:50:18 -05:00
Steffen Daode Nurpmeso
f5ba5be1a8 patch: correctly handle non-tabular Subject: line
The content of continued Subject: lines was until now joined via
str.replace('\n\t', ' '), which does not handle continuation via
spaces.  So expand the regular expression instead to
handle all allowed forms of mail header line continuation.
2011-09-27 18:41:09 -05:00
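A minimal sketch of the difference, assuming a regular expression along these lines (the exact pattern used in the change is not shown in the message, and the name unfold_subject is hypothetical). Mail header continuation lines may start with spaces or tabs, so the literal '\n\t' replacement misses space-indented continuations.

```python
import re

# Assumed pattern: a line break followed by any run of spaces or tabs.
_CONTINUATION_RE = re.compile(r"\r?\n[ \t]+")


def unfold_subject(subject):
    """Join continued header lines into a single line."""
    return _CONTINUATION_RE.sub(" ", subject)


tab_folded = "patch: a long\n\tsubject"
space_folded = "patch: a long\n  subject"
```

Here tab_folded.replace('\n\t', ' ') works, but the same call leaves space_folded unchanged, whereas unfold_subject() handles both.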
Dan Villiom Podlaski Christiansen
c5a1d45b09 patch: handle 'gitpatches' being empty, but not none 2011-09-11 18:49:41 +02:00
Matt Mackall
af293bcf51 merge with stable 2011-09-12 23:02:45 -05:00
Wagner Bruna
56657ac28b patch: fix parsing patch files containing CRs not followed by LFs
Since 75e6e3c16f9c, patch lines containing a CR not followed by a LF
would be incorrectly split, causing a failure to apply the patch.
2011-07-04 19:53:39 -03:00
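One way such a mis-split can arise in Python is through str.splitlines(), which treats a lone CR as a line boundary. Whether the referenced changeset used exactly that call is an assumption; the sketch only illustrates the general pitfall and an LF-only alternative, with split_on_lf being a hypothetical helper name.

```python
def split_on_lf(data):
    """Split on LF only, keeping line endings; lone CRs stay in the line."""
    pieces = data.split("\n")
    # Every piece except the last was terminated by an LF.
    lines = [piece + "\n" for piece in pieces[:-1]]
    if pieces[-1]:
        lines.append(pieces[-1])
    return lines


sample = "+a\rb\n+c\n"
naive = sample.splitlines(True)   # also breaks at the lone CR
careful = split_on_lf(sample)     # breaks at LF only
```

With the naive split, the single patch line "+a\rb" becomes two lines, and the second one no longer starts with a valid hunk marker.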
Thomas Arendsen Hein
692a53d202 classes: fix class style problems found by 06e968819ac9
This makes test-wireprotocol.py work on Python 2.4
2011-06-29 15:00:00 +02:00
Peter Arrenbrecht
97a6442c24 patch: fix typo in variable name 2011-06-20 09:30:03 +02:00
Patrick Mezard
86e3700676 patch: make filestore store data in memory and fallback to fs 2011-06-17 20:33:02 +02:00
Patrick Mezard
fd8786d770 import: add --bypass option
This feature is more a way to test patching without a working directory than
something people asked about. Adding a --rev option to specify the parent patch
revision would make it a little more useful.

What this change introduces is patch.repobackend class which lets patches be
applied against repository revisions. The caller must supply a filestore object
to receive patched content, which can be turned into a memctx with
patch.makememctx() helper.
2011-06-14 23:26:35 +02:00
Patrick Mezard
9a9794b348 patch: extend filestore to store an optional copy source
This will help wrapping filestores in memctx.
2011-06-14 23:24:34 +02:00
Patrick Mezard
3680e66462 patch: generalize the use of patchmeta in applydiff()
- Add patchmeta.copy() and emit copies from iterhunks. Modifying patchmeta
  instances in applydiff() makes things simpler.
- Rename selectfile() into makepatchmeta(). It is responsible for creating
  patchmeta for regular patches.
- Pass patchmeta objects to patchfile() directly

patchmeta instances were associated with git patches, for regular patches we
had to pass additional variables to tell the patch intent to patchfile().
Instead, we generate patchmeta for regular patches and pass them. This will
also help with patch filtering by matcher objects.
2011-06-11 14:17:25 +02:00
Patrick Mezard
c72a5b0080 patch: stop updating changed files set in applydiff()
This information is more correctly returned by backends.

The extra updated file removed from test-mq-merge.t output came from changes
from git patches being counted before being really applied in some cases.
2011-06-11 14:14:13 +02:00
Patrick Mezard
6aaca90508 patch: turn patch() touched files dict into a set 2011-06-11 14:14:11 +02:00
Patrick Mezard
7acb34dda5 patch: do not ignore hunk of files marked as 'deleted'
git 'deleted' flag was processed unconditionally and the file removed even if
the related hunk was not matching.
2011-06-05 22:26:01 +02:00
Patrick Mezard
f3d2dd1ca2 patch: fix patchmeta/hunk synchronization in iterhunks()
Synchronizing on bfile does not work on file removal where bfile is /dev/null.
We match items on afile or bfile instead. The incorrect code makes iterhunks()
emit patchmeta and hunks separately in some cases. This is currently hidden
by applydiff() being too tolerant when processing patchmeta, and will be fixed
later.
2011-06-05 22:24:19 +02:00
Patrick Mezard
fc7fde3949 patch: remove unnecessary exists() call in selectfile() 2011-06-05 22:24:11 +02:00
Patrick Mezard
62588fedca patch: remove redundant islink() call 2011-06-05 13:27:06 +02:00
Martin Geisler
af8a35e078 check-code: flag 0/1 used as constant Boolean expression 2011-06-01 12:38:46 +02:00
Patrick Mezard
ab82a700d3 patch: do not patch unknown files (issue752) 2011-05-27 21:50:11 +02:00
Patrick Mezard
d4b7db6294 patch: use temporary files to handle intermediate copies
git patches may require copies to be handled out-of-order. For instance, take
the following sequence:

  * modify a
  * copy a into b

Here, we have to generate b from a before its modification. To do so,
applydiff() was scanning for copy metadata and performing the copies before
processing the other changes in-order. While smart and efficient, this approach
complicates things by handling file copies and file creations at different
places and times. While a new file must not exist before being patched, a copied
file already exists before applying the first hunk.

Instead of copying the files at their final destination before patching, we
store them in a temporary file location and retrieve them when patching. The
filestore always stores file content in real files but nothing prevents adding
a cache layer. The filestore class was kept separate from fsbackend for at
least two reasons:

- This class is likely to be reused as a temporary result store for a future
  repository patching call (entries just have to be extended to contain copy
  sources).

- Delegating this role to backends might be more efficient in a repository
  backend case: the source files are already available in the repository itself
  and do not need to be copied again. It also means that third-party backends
  would have to implement two other methods. If we ever decide to merge the
  filestore feature into backend, a minimalistic approach would be to compose
  with filestore directly. Keep in mind this copy overhead only applies for
  copy/rename sources, and may even be reduced to copy sources which have to be
  handled ahead of time.
2011-05-27 21:50:10 +02:00
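The idea of stashing copy sources before patching can be sketched with plain dictionaries. This is a simplified model under assumptions: the real filestore writes to temporary files, and the operation tuples and the apply_ops name below are hypothetical.

```python
def apply_ops(files, ops):
    """Apply ("modify", name, content) and ("copy", src, dst) operations.

    Copy sources are saved in a side store before anything is modified,
    so a later copy sees the original content of its source.
    """
    files = dict(files)  # work on a copy, leave the input untouched
    store = {}
    for op in ops:
        if op[0] == "copy":
            src = op[1]
            # Keep the pristine content of every copy source.
            store.setdefault(src, files[src])
    for op in ops:
        if op[0] == "modify":
            _, name, content = op
            files[name] = content
        elif op[0] == "copy":
            _, src, dst = op
            # Retrieve the saved original rather than the patched file.
            files[dst] = store[src]
    return files


result = apply_ops({"a": "old"}, [("modify", "a", "new"), ("copy", "a", "b")])
```

In this example "b" receives the original content of "a" even though "a" is modified first, which is the ordering problem the commit describes.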
Patrick Mezard
e6f284be06 patch: refactor file creation/removal detection
The patcher has to know if a file is being created or removed to check if the
target already exists, or to actually unlink the file when a hunk emptying it
is applied. This was done by embedding the creation/removal information in the
first (and only) hunk attached to the file.

There are two problems with this approach:

- creation/removal is really a property of the file being patched and not its
  hunk.

- for regular patches, file creation cannot be deduced at parsing time: there
  are cases where the *stripped* file paths must be compared. Modifying hunks
  after their creation is clumsy and prevents further refactorings related to
  copies handling.

Instead, we delegate this job to selectfile() which has all the relevant
information, and remove the hunk createfile() and rmfile() methods.
2011-05-27 21:50:09 +02:00
Steven Brown
b13eee65a4 patch: restore the previous output of 'diff --stat'
Restore the previous diffstat behaviour of scaling by the maximum number of
changes to a single file. Changeset 7bb0e22a7988 modified the diffstat to be
scaled by the total number of changes. This seems to have been unintentional.
2011-05-26 22:51:02 +08:00
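The two scaling choices can be compared with a small sketch. The scale helper, the graph width of 20 and the sample change counts are assumptions chosen for illustration and are not taken from the diffstat code.

```python
def scale(count, reference, graphwidth):
    """Scale a change count to the graph width relative to `reference`."""
    if reference <= graphwidth:
        return count
    # Keep at least one character for any file with changes.
    return max(count * graphwidth // reference, int(bool(count)))


changes = [100, 10]      # hypothetical per-file change counts
width = 20               # hypothetical graph width

by_max = [scale(c, max(changes), width) for c in changes]
by_total = [scale(c, sum(changes), width) for c in changes]
```

Scaling by the largest single-file count lets the biggest file fill the whole graph width, while scaling by the total shrinks every bar as more files are added to the diff.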
Matt Mackall
c6e850b04b context: make forget work like commands.forget
Switch users of wctx.delete(..., False) to forget.
2011-05-26 17:15:35 -05:00
Patrick Mezard
32250cd067 patch: remove EOL support from linereader class
This was only used when reading patched files which is now done by backends.
2011-05-24 14:21:04 +02:00