In the body of the loop in trydiff(), there are conditions like:
addedset or (f in modifiedset and to is None)
The second half of that expression is to account for the fact that
merge-in additions appear as additions. By instead fixing up the sets
of modified and added files to compensate for this fact, we can
simplify the body of the loop. It also fixes one case where the
addedset was checked without the additional check (the "have we
already reported a copy above?" case in the code, also see fixed test
case).
The similar condition with 'removedset' in it seems to have served no
purpose even before this change, so it could have been simplified even
before.
Note that there is a comment saying "ctx2 date may be dynamic". The
comment was introduced in 8ed3d2a60500 (patch: use contexts for diff,
2006-12-25), but it seems like it stopped being dynamic in that very
changeset -- before that changeset, the date seems to have been the
file's mtime, but after the changeset, it seems to be the changeset's
timestamp (current time for workingctx). Since no one seems to have
missed the "dynamicness", let's simplify and extract a date2 for
symmetry with date1.
This only has a noticeable effect on diffs touching A LOT of
files. For example, it takes
hg diff -r FIREFOX_AURORA_30_BASE -r FIREFOX_AURORA_35_BASE
from 1m55.465s to 1m32.354s. That diff has over 50k files.
These aren't exactly format-breaking features -- just ones for which patches
applied to a repo will produce incorrect commits, In any case, some commands
like record and annotate only care about this feature.
Not all callers are interested in all diffopts -- for example, commands like
record (which use diff internally) break when diffopts like noprefix are
enabled. This function will allow us to add flags that callers can use to
enable only the features they're interested in.
In an upcoming patch we'll enable support as an option to 'hg diff' as well.
The tests reflect the current state of the world -- as we add support we'll see
changes in the test output.
Upcoming patches will add an option that will almost certainly break diff
output parsers when enabled. Add support for forcing an option to something in
plain mode, as a fallback. Options passed in via the CLI are not affected,
though -- it is assumed that any script passing the option in explicitly knows
what it is doing.
The following patch splits up changed lines along tabs (using
re.findall), and gives them a "diff.tab" label. This can be used by
the color extension for colorising tabs, like it does right now with
trailing whitespace.
I also provide corresponding tests.
The internal API used IOError to indicate that a file should be marked as
removed.
There is some correlation between IOError (especially with ENOENT) and files
that should be removed, but using IOErrors to represent file removal internally
required some hacks.
Instead, use the value None to indicate that the file not is present.
Before, spurious IO errors could cause commits that silently removed files.
They will now be reported like all other IO errors so the root cause can be
fixed.
Future patches will allow patch.diff to take a basectx so node1 (and node2)
could make hexfunc error out. Instead, we'll call the node function on the
context object directly.
The `hg import` command gains a `--partial` flag. When specified, a commit will
always be created from a patch import. Any hunk that fails to apply will
create .rej file, same as what `hg qimport` would do. This change is mainly
aimed at preserving changeset metadata when applying a patch, something very
important for reviewers.
In case of failure with `--partial`, `hg import` returns 1 and the following
message is displayed:
patch applied partially
(fix the .rej files and run `hg commit --amend`)
When multiple patches are imported, we stop at the first one with failed hunks.
In the future, someone may feel brave enough to tackle a --continue flag to
import.
Before this patch, "contrib/check-code.py" can't detect these
problems, because the regexp pattern to detect "% inside _()" doesn't
suppose the case that format string consists of multiple string
components concatenated implicitly or explicitly,
This patch does below for that regexp pattern to detect "% inside _()"
problems in such case.
- put "+" into separator part ("[ \t\n]") for explicit concatenation
("...." + "...." style)
- enclose "component and separator" part by "(?:....)+" for
concatenation itself ("...." "...." or "...." + "....")
When creating patches modifying binary files using "git format-patch",
git creates 'literal' and 'delta' hunks. Mercurial currently supports
'literal' hunks only, which makes it impossible to import patches with
'delta' hunks.
This changeset adds support for 'delta' hunks. It is a reimplementation
of patch-delta.c from git :
http://git.kernel.org/cgit/git/git.git/tree/patch-delta.c
This is arguably a workaround, a better fix may be in the repo to
ensure that it won't list a file 'modified' unless there is a file
context for the previous version.
addremove required paths relative to the cwd, which meant a lot of extra code
that transformed paths into relative ones. That code is now gone as well.
This reduces the likelihood of a traceback when trying to email a
patch that happens to have 'diff --git' at the beginning of a line
in the description, as this patch did:
http://markmail.org/message/wxpgowxd7ucxygwe
The check pattern only checked for whitespace between keyword and operator.
Now it also warns:
> x = f(),7
missing whitespace after ,
> x = f()+7
missing whitespace in expression
In an upcoming patch, we will add index information to all git diffs, not
only binary diffs, so this code needs to be moved to a more appropriate
place.
Also, since this information is used for patch headers, it makes more
sense to be in the patch module, along with other patch-related metadata.
addmodehdr is a header helper, same as diffline, so it doesn't
need to be a top-level function and can be nested under trydiff.
In upcoming patches we will generalize this approach for
all headers.
Make diffline more readable, using strings with placeholders
rather than appending to a list from many ifs that makes
difficult to understand the actual output format.
Before, quiet mode produced no diffline header for mercurial as a
side effect of not populating "revs". This was a weird side effect,
and we will always need revs for git index header that will be added
in upcoming patches, so now we just check ui.quiet from diffline
directly.
diffline is not part of diff computation, so it makes more sense
to place it with other header generation in patch module.
In upcoming patches we will generalize this approach for
all headers added in the patch, including the git index
header.
diffline was called from trydiff for binary diffs and from unidiff
for text diffs. In this patch we unify those calls into one.
diffline is also a header, not part of diff mechanisms, so it makes
sense to remove that responsibility from the mdiff module. In
upcoming patches we will move diffline to patch module and
keep grouping responsibilities.
b85diff generates a binary diff, so we move this code to mdiff module
along with unidiff for text diffs. All diffing mechanisms will be in the
same place.
In an upcoming patch we will remove the responsibility to print the
index header from b85diff and move it back to patch, since it's
a patch metadata header, not part of the diff generation.
When applying a patch renaming/copying 'a' to 'b' on a revision where
'a' does not exist, the patching process would abort immediately,
without processing the remaining hunks and without reporting it. This
patch makes the patching no longer abort and possible hunks applied on
the copied/renamed file be written in reject files.
There have been quite a few places where we pop elements off the
front of a list. This can turn O(n) algorithms into something more
like O(n**2). Python has provided a deque type that can do this
efficiently since at least 2.4.
As an example of the difference a deque can make, it improves
perfancestors performance on a Linux repo from 0.50 seconds to 0.36.