626 Commits

Author SHA1 Message Date
Ryan McElroy
edae55b64a patch: avoid repeated binary checks if all files in a patch are text
Summary: This backports upstream rev 079b27b5a869. It saves 3-4% on diffs.

Reviewed By: quark-zju

Differential Revision: D6951334

fbshipit-source-id: 889851e9638e2eeb43549af31e25d75632eccc2b
2018-04-13 21:51:09 -07:00
Jun Wu
f1c575a099 flake8: enable F821 check
Summary:
This check is useful and detects real errors (e.g. fbconduit). Unfortunately,
`arc lint` runs it under both py2 and py3, so many py2 builtins still
trigger warnings.

I didn't find a clean way to disable the py3 check, so this diff tries to fix the warnings.
For `xrange`, the change was done by a script:

```
import sys
import redbaron

headertypes = {'comment', 'endl', 'from_import', 'import', 'string',
               'assignment', 'atomtrailers'}

xrangefix = '''try:
    xrange(0)
except NameError:
    xrange = range

'''

def isxrange(x):
    try:
        return x[0].value == 'xrange'
    except Exception:
        return False

def main(argv):
    for i, path in enumerate(argv):
        print('(%d/%d) scanning %s' % (i + 1, len(argv), path))
        content = open(path).read()
        try:
            red = redbaron.RedBaron(content)
        except Exception:
            print('  warning: failed to parse')
            continue
        hasxrange = red.find('atomtrailersnode', value=isxrange)
        hasxrangefix = 'xrange = range' in content
        if hasxrangefix or not hasxrange:
            print('  no need to change')
            continue

        # find a place to insert the compatibility statement
        changed = False
        for node in red:
            if node.type in headertypes:
                continue
            # node.insert_before is an easier API, but it has bugs changing
            # other "finally" and "except" positions. So do the insert
            # manually.
            # # node.insert_before(xrangefix)
            line = node.absolute_bounding_box.top_left.line - 1
            lines = content.splitlines(1)
            content = ''.join(lines[:line]) + xrangefix + ''.join(lines[line:])
            changed = True
            break

        if changed:
            # "content" is faster than "red.dumps()"
            open(path, 'w').write(content)
            print('  updated')

if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```

For other py2 builtins that do not have a py3 equivalent, some `# noqa`
were added as a workaround for now.

Reviewed By: DurhamG

Differential Revision: D6934535

fbshipit-source-id: 546b62830af144bc8b46788d2e0fd00496838939
2018-04-13 21:51:09 -07:00
Jun Wu
2946a1c198 codemod: use single blank line
Summary: This makes test-check-code cleaner.

Reviewed By: ryanmce

Differential Revision: D6937934

fbshipit-source-id: 8f92bc32f75b9792ac67db77bb3a8756b37fa941
2018-04-13 21:51:08 -07:00
Yuya Nishihara
2c190381f2 patch: do not break up multibyte character when highlighting word
This changes {\W} to {\W - any 8bit characters} so that multibyte sequences
are taken as words. Since we don't know the encoding of user content, this
is the most sensible definition of a non-word.
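
The splitting rule described above can be sketched as follows. This is a minimal illustration rather than the code from the change: the `splitwords` helper and the exact character class are assumptions, chosen so that every byte at or above 0x80 counts as part of a word.

```python
import re

# Assumption: a "non-word" byte is an ASCII byte that is not alphanumeric or
# an underscore. Bytes in the range 0x80-0xff are treated as word bytes, so a
# multibyte sequence is never cut in the middle.
_nonword = re.compile(br'([^a-zA-Z0-9_\x80-\xff])')

def splitwords(line):
    """Split a bytes line into word and non-word tokens, dropping empties."""
    return [tok for tok in _nonword.split(line) if tok]

# UTF-8 encoding of "naive cafe" with accented characters
sample = u'na\u00efve caf\u00e9'.encode('utf-8')
tokens = splitwords(sample)
```

Joining the tokens gives back the original line, so the highlighter can label each token without changing the content.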
2017-12-11 22:38:31 +09:00
Pulkit Goyal
87c0c860ea py3: handle keyword arguments correctly in patch.py
Differential Revision: https://phab.mercurial-scm.org/D1639
2017-12-10 04:48:00 +05:30
Matthieu Laneuville
eb58359793 patch: move part of tabsplitter logic in _inlinediff
It cannot be entirely moved within _inlinediff as long as worddiff is
experimental (when turned off, matches is always an empty dict).
2017-12-08 17:20:11 +09:00
Matthieu Laneuville
b5b163535f patch: catch unexpected case in _inlinediff
If operation is neither 'diff.inserted' nor 'diff.deleted', label and token won't
be defined. This patch explicitly catches that case.
2017-12-08 16:54:59 +09:00
Matthieu Laneuville
9b54f9af3b patch: reverse _inlinediff output for consistency
2017-12-08 16:47:18 +09:00
Matthieu Laneuville
14498f2bf5 patch: add within-line color diff capacity
The `diff' command usually writes deletions in red and insertions in green. This
patch adds within-line colors, to highlight which part of the lines differ.
Lines to compare are decided based on their similarity ratio, as computed by
difflib SequenceMatcher, with an arbitrary threshold (0.7) to decide at which
point two lines are considered entirely different (therefore no inline-diff
required).

The current implementation is kept behind an experimental flag in order to test
the effect on performance. In order to activate it, set inline-color-diff to
true in [experimental].
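
The line-pairing decision described above could look roughly like this. It is a sketch under assumptions: the helper name `shouldinline` is hypothetical, and whether the real comparison is strict or inclusive at the threshold is not stated in the message.

```python
import difflib

THRESHOLD = 0.7  # the arbitrary cut-off mentioned in the message

def shouldinline(old, new, threshold=THRESHOLD):
    """Return True if two lines are similar enough for a within-line diff."""
    ratio = difflib.SequenceMatcher(None, old, new).ratio()
    # Assumption: a strict comparison; the original may use >= instead.
    return ratio > threshold

similar = shouldinline('foo = bar(1)', 'foo = bar(2)')
different = shouldinline('abc', 'xyz')
```

Lines below the threshold are treated as unrelated and keep the plain red/green colouring.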
2017-10-26 00:13:38 +09:00
Gregory Szorc
32815ee95d py3: define __next__ in patch.py
This is needed to appease Python 3's iterator protocol.
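
For illustration, an iterator class that works on both Python 2 and Python 3 can define `__next__` and alias `next` to it. The `linereader` class below is a hypothetical example, not the class touched by this change.

```python
class linereader(object):
    """Hypothetical iterator wrapper, shown only to illustrate the protocol."""

    def __init__(self, lines):
        self._lines = iter(lines)

    def __iter__(self):
        return self

    def __next__(self):
        # Python 3 looks up __next__ when next(obj) or a for loop is used.
        return next(self._lines)

    # Python 2 looks up next() instead, so keep both spellings.
    next = __next__

collected = list(linereader(['a', 'b']))
```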

This is crasher #5 in Python 3.

Differential Revision: https://phab.mercurial-scm.org/D1480
2017-11-20 23:13:09 -08:00
Martin von Zweigbergk
13e75f668f patch: accept prefix argument to changedfiles() helper
I'd like to call the function from an extension, passing both "strip"
and "prefix", but it currently only accepts "strip". The only in-tree
caller seems to be mq.py, which doesn't even pass "strip".

Differential Revision: https://phab.mercurial-scm.org/D1413
2017-11-14 10:26:36 -08:00
Denis Laxalde
e400287a20 revert: do not reverse hunks in interactive when REV is not parent (issue5096)
And introduce a new "apply" operation verb for this case as suggested in
issue5096. This replaces the no longer used "revert" operation.

In interactive revert, when reverting to something other than the parent
revision, display an "apply this change" message with a diff that is not
reversed.

The rationale is that `hg revert -i -r REV` will show hunks of the diff from
the working directory to REV and prompt the user to select them for applying
(to working directory). This contradicts 79cc693b4406 in which it was
decided to have the "direction" of prompted hunks reversed. Later on
[1], there was a broad consensus (but no decision) towards the "as to
be applied direction". Now that --interactive is no longer experimental
(97d754ba45c4), it's time to switch and thus we drop no longer used
"experimental.revertalternateinteractivemode" configuration option.

[1]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-November/090142.html


.. feature::

  When interactive revert is run against a revision other than the working
  directory parent, the diff shown is the diff to *apply* to the working directory,
  rather than the diff to *discard* from the working copy. This is in line with
  related user experiences with `git` and appears to be less confusing with
  `ui.interface=curses`.
2017-11-03 14:47:37 +01:00
Yuya Nishihara
57fc1b2b62 patch: improve heuristics to not take the word "diff" as header (issue1879)
The word "diff" is likely to appear in a commit message. Let's make it less
likely by requiring "diff -" for "diff -r" or "diff --git".
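
A minimal sketch of the tightened check, assuming a hypothetical helper named `isdiffheader` (the real heuristic lives in a larger regular expression that is not shown here):

```python
def isdiffheader(line):
    """Return True only for lines such as 'diff -r ...' or 'diff --git ...'."""
    # Requiring the dash rejects prose that merely starts with the word "diff".
    return line.startswith('diff -')

header_r = isdiffheader('diff -r 0123abcd file.txt')
header_git = isdiffheader('diff --git a/file.txt b/file.txt')
prose = isdiffheader('diff between the two versions is small')
```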
2017-10-21 16:50:57 +09:00
Denis Laxalde
6a72a6393c log: add an assertion about fctx not being None in patch.diff()
As noted in the comment, this should not happen as removed files (the cause of
fctx2 being None) are caught earlier.
2017-10-19 15:06:33 +02:00
Denis Laxalde
356644ac1a diff: pass a diff hunks filter function from changeset_printer to patch.diff()
We add a 'hunksfilterfn' keyword argument in all functions of the call
stack from changeset_printer.show() to patch.diff(). This is a callable
that will be used to filter out hunks by line range and will be used in
the "-L/--line-range" option of "hg log" command introduced in the
following changesets.
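
As a rough sketch of what such a callable might look like: the factory name `makelinerangefilter`, the half-open line range, and the `(hunkrange, hunklines)` shape with `hunkrange = (s1, l1, s2, l2)` are all assumptions made for illustration, not details taken from this change.

```python
def makelinerangefilter(start, end):
    """Build a filter keeping hunks that touch lines [start, end) of the new file."""
    def hunksfilter(hunks):
        for hunkrange, hunklines in hunks:
            # Assumed layout: start/length in the old file, then in the new file.
            s1, l1, s2, l2 = hunkrange
            if s2 < end and s2 + l2 > start:
                yield hunkrange, hunklines
    return hunksfilter

hunks = [
    ((1, 3, 1, 3), [b' a\n']),
    ((40, 2, 42, 2), [b'+b\n']),
]
keep = makelinerangefilter(10, 50)
kept = [hunkrange for hunkrange, _lines in keep(hunks)]
```

Passing a callable rather than a line range keeps the diff code unaware of how the selection is made.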
2017-10-06 14:45:17 +02:00
Denis Laxalde
f37bd71289 diff: also yield file context objects in patch.trydiff() (API)
And retrieve them in patch.diffhunks(). We'll use these in forthcoming
changesets to filter diff hunks by line range.
2017-10-05 21:20:08 +02:00
Jun Wu
0ca89d02a2 patch: do not cache translated messages (API)
Previously the code cached `i18n._` results in module variables. That causes
issues after an encoding change. Instead of invalidating them manually, we
now just recalculate the translated messages every time `filterpatch` gets
called.

This makes test-commit-interactive.t pass regardless of whether chg or
demandimport is used or not.

.. api: `patch.messages` now lives in `patch.getmessages()`.

   Extensions adding new messages should now wrap the `patch.getmessages`
   method instead of changing `patch.messages` directly.
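
   The difference between caching and recomputing can be sketched with a toy
   translation function. Everything here is a stand-in: `_catalog`, `_` and the
   message text are hypothetical, and only the name `getmessages` comes from
   the note above.

```python
_catalog = {}  # hypothetical stand-in for the active translation catalog

def _(message):
    """Toy translation lookup standing in for i18n._."""
    return _catalog.get(message, message)

# Computed once at import time: goes stale if the catalog changes later.
cachedmessages = {'record': _('record this change?')}

def getmessages():
    # Recomputed on every call, so it reflects the current catalog.
    return {'record': _('record this change?')}

# Simulate an encoding or language change after import.
_catalog['record this change?'] = 'enregistrer ce changement ?'
stale = cachedmessages['record']
fresh = getmessages()['record']
```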

Differential Revision: https://phab.mercurial-scm.org/D959
2017-10-05 13:38:48 -07:00
Denis Laxalde
5b704efceb patch: rename "header" variable into "hdr" in diff()
The "header" variable was hiding the eponymous class, hence preventing its
usage.
2017-09-26 18:17:47 +02:00
Boris Feld
2063677b32 configitems: register the 'diff.*' config
All the configs were already using a unified function with a forced default, so
registering them all at once seems safe.
2017-10-08 21:47:14 +02:00
Augie Fackler
67c46dd503 patch: remove superfluous pass statements
2017-09-30 07:45:07 -04:00
Yuya Nishihara
012701ee30 py3: fix doctests in patch.py to be compatible with Python 3
We were lucky that parsepatch() could concatenate a character slice as if
it were a list of chunks.
2017-09-17 12:23:16 +09:00
Yuya Nishihara
57f81f3f7c py3: stop using bytes[n] in patch.py
2017-09-17 12:20:35 +09:00
Yuya Nishihara
b772b7f536 error: move patch.PatchError so it can easily implement __bytes__ (API)
2017-09-03 16:45:33 +09:00
Yuya Nishihara
dcc07e5503 doctest: use print_function and convert bytes to unicode where needed
2017-09-03 14:56:31 +09:00
Yuya Nishihara
a71f259bd2 doctest: bulk-replace string literals with b'' for Python 3
Our code transformer can't rewrite string literals in docstrings, and I
don't want to make the transformer more complex.
2017-09-03 14:32:11 +09:00
Yuya Nishihara
70990906a5 py3: replace bytes[n] with bytes[n:n + 1] in patch.py where needed
2017-09-03 16:19:20 +09:00
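
The reason for the slice form: on Python 3, indexing a bytes object returns an integer, while slicing returns a one-byte bytes object on both Python 2 and 3. A small sketch, with `firstbyte` as a hypothetical helper name:

```python
def firstbyte(data):
    """Return the first byte as a bytes object (empty if data is empty)."""
    # data[0] would be an int such as 43 on Python 3; the slice stays bytes.
    return data[0:1]

marker = firstbyte(b'+added line')
empty = firstbyte(b'')
```

The slice also avoids an IndexError on empty input, which plain indexing would raise.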
Yuya Nishihara
02f6cbfc54 py3: fix type of regex literals in patch.py
2017-09-03 16:12:15 +09:00
Pulkit Goyal
5caf86603b patch: take messages out of the function so that extensions can add entries
Extensions may want interactive support for more operations, or in
particular may want to show more verbs. So this patch moves the messages out
of the function so that extensions can add verbs to them. The curses ones are
also not inside any function, so extensions can add more actions and verbs there.

Differential Revision: https://phab.mercurial-scm.org/D567
2017-08-30 18:19:14 +05:30
David Soria Parra
6d9f90fa8d mdiff: add a --ignore-space-at-eol option
Add an option that only ignores whitespace at EOL. The name of the option is
the same as Git's.

.. feature::

   Added `--ignore-space-at-eol` diff option to ignore whitespace differences
   at line endings.
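
   One way to picture the comparison, as a sketch only: strip trailing
   whitespace before each newline and compare the results. The helper names
   are hypothetical, and a final line without a newline is deliberately left
   untouched in this simplified version.

```python
import re

# Spaces, tabs, carriage returns and form feeds directly before a newline.
_eolspace = re.compile(br'[ \t\r\f]+\n')

def stripeolspace(text):
    """Remove whitespace at the end of each newline-terminated line."""
    return _eolspace.sub(b'\n', text)

def sameignoringeolspace(a, b):
    return stripeolspace(a) == stripeolspace(b)

trailing_only = sameignoringeolspace(b'a \nb\t\n', b'a\nb\n')
leading_differs = sameignoringeolspace(b' a\n', b'a\n')
```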

Differential Revision: https://phab.mercurial-scm.org/D422
2017-08-29 18:20:50 -07:00
Jun Wu
45a4782018 record: fix revert -i for lines without newline (issue5651)
This is a regression caused by 10c1efcbeb1e. Code prior to 10c1efcbeb1e
seems to miss the "\ No newline at end of file" line.

Differential Revision: https://phab.mercurial-scm.org/D528
2017-08-27 13:39:17 -07:00
Augie Fackler
2ebd830d1d patch: update copying of dict keys and values to work on Python 3
2017-07-24 14:42:55 -04:00
Jun Wu
d30e8ac7be patch: use devel.all-warnings to replace devel.all
It appears to be a misspelling in patch.py.
2017-07-12 15:24:47 -07:00
Jun Wu
266a8360a7 patch: make parsepatch optionally trim context lines
Previously there was a suspicious `if False and delta > 0` which dates back
to the beginning of hgext/record.py (f995f03023c7).

The "trimming context lines" feature could be useful (and is used by the
next patch). So let's enable it. This patch adds a new `maxcontext`
parameter to `recordhunk` and `parsepatch`, changing the `if False`
condition to respect it.

The old `trimcontext` implementation is also wrong - it does not update
`toline` correctly and it does not do the right thing for `before` context.
A doctest was added to guard us from making a similar mistake again.

Since `maxcontext` is set to `None` (unlimited), there is no behavior
change.
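
The bookkeeping involved in trimming can be sketched as below. This is not the `recordhunk` implementation: the function signature is hypothetical and a hunk is reduced to its start lines plus lists of leading and trailing context. It shows why dropping leading context must advance both start lines, while dropping trailing context does not.

```python
def trimcontext(fromline, toline, before, after, maxcontext=None):
    """Limit context lines around a change to at most maxcontext per side."""
    if maxcontext is None:
        # Unlimited context: nothing to trim.
        return fromline, toline, before, after
    # Dropping leading context moves the hunk start down in both files.
    drop = max(len(before) - maxcontext, 0)
    before = before[drop:]
    # Trailing context is cut from the end, so the start lines are unchanged.
    after = after[:maxcontext]
    return fromline + drop, toline + drop, before, after

trimmed = trimcontext(10, 12, ['a', 'b', 'c', 'd'], ['e', 'f', 'g'], 2)
untouched = trimcontext(10, 12, ['a'], ['e'], None)
```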
2017-07-04 16:41:28 -07:00
Pierre-Yves David
97921d9b37 configitems: register the 'patch.eol' config
2017-06-30 03:43:35 +02:00
Martin von Zweigbergk
d1edc85314 patch: remove unused fsbackend._join()
The function lost its last caller in ae209b610844 (patch: replace
functions in fsbackend to use vfs, 2014-06-05) when the callers
started relying on the opener to do the join.
2017-06-29 23:04:47 -07:00
Martin von Zweigbergk
436a49bcd7 patch: add close() to abstractbackend
patchbackend() seems to call it on an arbitrary backend, so it seems
to be part of the API. Since all subclasses do something in their
close() methods, I decided to let this one raise an exception rather
than just pass.
2017-06-30 09:07:24 -07:00
Pulkit Goyal
0d776078c8 py3: use '%d' to convert integers to bytes
2017-06-26 17:22:45 +05:30
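
On Python 3.5 and later, bytes objects support %-formatting, so `%d` turns an integer into its ASCII digits without going through `str`. A small sketch; the `hunkheader` helper is hypothetical:

```python
def hunkheader(fromline, fromlen, toline, tolen):
    """Format a unified-diff hunk header as bytes."""
    # b'%d' % n works on Python 2 and on Python 3.5+, whereas bytes(n) on
    # Python 3 would create n zero bytes instead of the digits.
    return b'@@ -%d,%d +%d,%d @@\n' % (fromline, fromlen, toline, tolen)

header = hunkheader(1, 3, 1, 4)
```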
Pulkit Goyal
d7e69a39e2 py3: add b'' to make the regex pattern bytes
2017-06-25 03:11:55 +05:30
Pierre-Yves David
414fa65e17 configitems: register 'patch.fuzz' as first example for 'configint'
This exercises the default value handling in 'configint'.
2017-06-17 13:17:10 +02:00
Jun Wu
5e22630fbf patch: rewrite reversehunks (issue5337)
The old reversehunks code accesses "crecord.uihunk._hunk", which is the raw
recordhunk without crecord selection information, therefore "revert -i"
cannot revert individual lines, aka. issue5337.

The patch rewrites related logic to return the right reverse hunk for
revert. Namely,

 1. "fromline" and "toline" are correctly swapped [1]
 2. crecord.uihunk generates a correct reverse hunk [2]

Besides, reversehunks(hunks) will no longer modify its input "hunks", which
is more expected.

[1]: To explain why "fromline" and "toline" need to be swapped, take the
     following example:

  $ cat > a <<EOF
  > 1
  > 2
  > 3
  > 4
  > EOF

  $ cat > b <<EOF
  > 2
  > 3
  > 5
  > EOF

  $ diff a b
  1d0   <---- "1" is "fromline" and "0" is "toline"
  < 1         and they are swapped if diff from the reversed direction
  4c3             |
  < 4             |
  ---             |
  > 5             |
                  |
  $ diff b a      |
  0a1   <---------+
  > 1
  3c4   <---- also "4c3" gets swapped to "3c4"
  < 5
  ---
  > 4

[2]: This is a bit tricky.

For example, given a file which is empty in working parent but has 3 lines
in working copy, and the user selection:

    select hunk to discard
    [x] +1
    [ ] +2
    [x] +3

The user intent is to drop "1" and "3" in working copy but keep "2", so the
reverse patch would be something like:

        -1
         2 (2 is a "context line")
        -3

We cannot just take all selected lines and swap "-" and "+", which will be:

        -1
        -3

That patch won't apply because of "2". So the correct way is to insert "2"
as a "context line" by inserting it first then deleting it:

        -2
        +2

Therefore, the correct revert patch is:

        -1
        -2
        +2
        -3

It could be reordered to look more like a common diff hunk:

        -1
        -2
        -3
        +2

Note: It's possible to return multiple hunks so there won't be lines like
"-2", "+2". But the current implementation is much simpler.

For deletions, like the working parent has "1\n2\n3\n" and it was changed to
empty in working copy:

    select hunk to discard
    [x] -1
    [ ] -2
    [x] -3

The user intent is to drop the deletion of 1 and 3 (in other words, keep
those lines), but still delete "2".

The reverse patch is meant to be applied to working copy which is empty.
So the patch would be:

        +1
        +3

That is to say, there is no need to specially handle the unselected "2" as in
the insertion case above.
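
The two cases above can be sketched in a few lines. This is an illustration, not the rewritten `reversehunks`: the function name and the parallel `selected` list are hypothetical, and an unselected addition is emitted directly as a context line rather than as the "-2", "+2" pair the message describes.

```python
def reverselines(lines, selected):
    """Build reverse-patch lines from '+x' / '-x' lines and a selection mask."""
    out = []
    for line, picked in zip(lines, selected):
        sign, text = line[0], line[1:]
        if sign == '+':
            # The line exists in the working copy: remove it if selected,
            # otherwise keep it as context so the patch still applies.
            out.append(('-' if picked else ' ') + text)
        elif sign == '-' and picked:
            # The line is missing from the working copy: add it back.
            out.append('+' + text)
        # Unselected deletions are absent from the working copy and stay
        # absent, so nothing is emitted for them.
    return out

insertion_case = reverselines(['+1', '+2', '+3'], [True, False, True])
deletion_case = reverselines(['-1', '-2', '-3'], [True, False, True])
```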
2017-06-20 23:22:38 -07:00
Yuya Nishihara
6130be9a6c diffhelpers: switch to policy importer
# no-check-commit
2016-08-13 12:15:49 +09:00
Andrew Zwicky
faec617779 diffstat: properly count lines starting in '--' or '++' (issue5479)
Lines that start in '--' or '++' were previously not counted
as deletions or additions in diffstat, resulting in incorrect
addition/deletion counts.  The bug was present if the start
of the line, combined with the diff character resulted
in '---' or '+++'.

diffstatdata will now track, for each file, if it has moved
past the header section by looking for a line beginning with
'@@'.  Once that has happened, lines beginning with '-'
or '+' will be counted for deletions and additions.  Once a
line beginning with 'diff' is found, the process starts over.
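
The state machine described above can be sketched as follows. It is a simplified illustration, not `diffstatdata` itself: the function name is hypothetical and the file name is naively taken from the last token of the `diff` line.

```python
def countchanges(lines):
    """Return [filename, additions, deletions] entries for a unified diff."""
    results = []
    inheader = False
    for line in lines:
        if line.startswith('diff'):
            # A new file starts; its '---' / '+++' header lines follow.
            results.append([line.split()[-1], 0, 0])
            inheader = True
        elif not results:
            continue
        elif line.startswith('@@'):
            # First hunk marker: the header section is over.
            inheader = False
        elif not inheader:
            if line.startswith('+'):
                results[-1][1] += 1
            elif line.startswith('-'):
                results[-1][2] += 1
    return results

sample = [
    'diff --git a/f b/f',
    '--- a/f',
    '+++ b/f',
    '@@ -1,2 +1,2 @@',
    '--- old text',   # deletion of a line whose content starts with '--'
    '+++ new text',   # addition of a line whose content starts with '++'
    ' context',
]
stats = countchanges(sample)
```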
2017-05-17 20:51:17 -05:00
Yuya Nishihara
ab046506ef base85: proxy through util module
I'm going to replace hgimporter with a simpler import function, so we can
access pure/cext modules by name:

  # util.py
  base85 = policy.importmod('base85')  # select pure.base85 or cext.base85

  # cffi/base85.py
  from ..pure.base85 import *  # may re-export pure.base85 functions

This means we'll have to use policy.importmod() function in place of the
standard import statement, but we wouldn't want to write it every place where
C extension modules are used. So this patch makes util host base85 functions.
2017-04-26 21:56:47 +09:00
Jun Wu
9147494fab diff: add a fast path to avoid loading binary contents
When diffing binary contents, with certain configs, we can show
"Binary file <name> has changed" without actual content.

That allows a fast path where we could avoid providing actual binary
contents. Note: in that case we still need to test if two contents are the
same, that's done by using "filectx.cmp", which could have its own fast
path.
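
A toy model of the fast path, under stated assumptions: `fakefctx` is a hypothetical stand-in for a file context, `describe` is not a real Mercurial function, and `cmp` is assumed to return True when the contents differ, as a stand-in for a comparison that does not need to load the data.

```python
class fakefctx(object):
    """Hypothetical file context that counts how often data() is called."""

    def __init__(self, data, binary):
        self._data = data
        self._binary = binary
        self.dataloads = 0

    def isbinary(self):
        return self._binary

    def data(self):
        self.dataloads += 1
        return self._data

    def cmp(self, other):
        # Assumption: True means "different". A real implementation could
        # compare hashes or sizes without loading the content.
        return self._data != other._data

def describe(fctx1, fctx2, nobinary=True, text=False):
    if nobinary and not text and (fctx1.isbinary() or fctx2.isbinary()):
        # Fast path: no call to data() is needed for the message.
        return 'Binary file has changed' if fctx1.cmp(fctx2) else ''
    return 'text diff of %d and %d bytes' % (len(fctx1.data()),
                                             len(fctx2.data()))

old = fakefctx(b'\0a', True)
new = fakefctx(b'\0b', True)
message = describe(old, new)
```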
2017-05-03 23:50:41 -07:00
Jun Wu
49fe7ea4f6 diff: correct binary testing logic
This seems to be more correct given the table drawn in the previous patch.

Namely, "losedatafn" and "opts.git" are removed, "not opts.text" is added.

  - losedatafn: diff output (binary) should not be affected by "losedatafn"
  - opts.git: binary testing is helpful for detecting a fast path in the
    next patch. The fast path can also be used if opts.git is False
  - opts.text: if it's set, we should treat the content as non-binary
2017-05-05 17:20:32 -07:00
Jun Wu
2b452ab2f5 diff: draw a table about binary diff behaviors
The table should make it easier to reason about future changes.
2017-05-05 16:48:58 -07:00
Jun Wu
119973f0ba diff: use fctx.size() to test empty
fctx.size() could have a fast path that does not require loading content.
2017-05-03 22:20:44 -07:00
Jun Wu
8b31c4d99f diff: use fctx.isbinary() to test binary
The end goal is to avoid calling fctx.data() when unnecessary. For example,
if diff.nobinary=1 and files are binary, the expected behavior is to print
"Binary file has changed". That could avoid reading fctx.data() sometimes.

This is mainly to enable an external LFS extension to skip expensive binary
file loading sometimes (read: most of the time with diff.nobinary=1 and
diff.text=0), without any behavior changes to mercurial (i.e. whether a file
is LFS or not does not change any behavior, LFS could be 100% transparent to
users).
2017-05-03 22:16:54 -07:00
Boris Feld
46290fc257 record: update help message to use operation instead of "record" (issue5432)
Update the hunk selector help message to use the operation name instead
of using "record" for all operations. Extract the help message in the same way
as the other single-line and multi-line messages.
Update tests to make sure that both "revert" and "discard" variants are tested.
2017-04-24 17:13:24 +02:00
Alexander Fomin
b6338c907a diff: add --binary option for git mode diffs
This patch adds a --binary option to `hg diff` and `hg export` to allow more
control over when binary diffs are displayed in Git mode, as well as some
tests to verify that it behaves correctly (issue5510).
2017-04-05 15:31:08 -07:00