Commit Graph

42924 Commits

Author SHA1 Message Date
Jun Wu
6542d0ebf4 indexedlog: add comment about index file format
Summary: Document the format. Actual implementation in later diffs.

Reviewed By: DurhamG

Differential Revision: D7190575

fbshipit-source-id: 243992fd052ca7a9688d54d20694e65daebb9660
2018-04-13 21:51:25 -07:00
Jun Wu
015a4ac5d6 indexedlog: port base16 iterator from radixbuf
Summary:
The append-only index is too different so it's cleaner to cherry-pick code
from radixbuf, instead of modifying radixbuf which would break code
depending on it.

Started by picking the base16 iterator part.

`rustc-test` does not work with buck, and seems to be in an unmaintained
state, so benchmark tests are migrated to criterion.

Reviewed By: DurhamG

Differential Revision: D7189143

fbshipit-source-id: 459a79b4cf16f35d2ff86f11a5980ba1fc627951
2018-04-13 21:51:25 -07:00
Jun Wu
d2c457a6e2 indexedlog: integrity check utility on an append-only file
Summary:
Filesystems are hard. Append-only sounds like a safe way to write files, but it
only really helps with process crashes. If the OS crashes, it's possible that
other parts of the file get corrupted. For source control, data integrity checks
are important, so bytes not logically touched by appending also need to be
checked.

Implement a `ChecksumTable` which adds integrity check ability to append-only
files. It's intended to be used by future append-only indexes.
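
As an illustration of the idea only (not the actual `ChecksumTable` code, which is in Rust), a minimal Python sketch could checksum fixed-size chunks and re-verify old chunks after an append. The chunk size, the function names and the use of CRC32 are assumptions made for this example.

```python
import zlib

CHUNK_SIZE = 4  # hypothetical, tiny on purpose so the example is easy to follow


def chunk_checksums(data, chunk_size=CHUNK_SIZE):
    """Checksum every complete chunk; a trailing partial chunk is skipped
    because later appends may still extend it."""
    full = len(data) - len(data) % chunk_size
    return [zlib.crc32(data[i:i + chunk_size]) for i in range(0, full, chunk_size)]


def verify(data, recorded, chunk_size=CHUNK_SIZE):
    """Return True if every previously recorded chunk is still intact."""
    current = chunk_checksums(data, chunk_size)
    return current[:len(recorded)] == recorded


old = b"abcdefgh"
table = chunk_checksums(old)           # recorded before the append
ok = verify(old + b"ijkl", table)      # a pure append keeps old chunks intact
bad = verify(b"abXdefghijkl", table)   # a byte outside the appended region changed
```

The point of the sketch is that verification covers bytes the append never logically touched, which is what an OS crash could silently damage.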

Reviewed By: DurhamG

Differential Revision: D7108433

fbshipit-source-id: 16daf6b8d04bba464f1ee9221716beba69c1d47b
2018-04-13 21:51:24 -07:00
Jun Wu
0518016553 indexedlog: initial boilerplate
Summary:
First step of a storage-related building block that is in Rust. The goal is
to use it to replace revlog, obsstore and packfiles.

Extern crates that are likely useful are added to reduce future churn.

Reviewed By: DurhamG

Differential Revision: D7108434

fbshipit-source-id: 97ebd9ba69547d876dcecc05e604acdf9088877e
2018-04-13 21:51:24 -07:00
Ryan McElroy
89ebed5996 morestatus: show conflict info whenever merge state exists
Summary:
Previously, we could be in a merge state (as from `hg update --merge`)
and morestatus did not show any information about the conflicts. Now we will
show conflict info whenever there is a merge state.

Reviewed By: phillco

Differential Revision: D7149411

fbshipit-source-id: e4e03036f3a11bda3edc3628d503a8b3aea412be
2018-04-13 21:51:24 -07:00
Durham Goode
6f734efacd hg: add basic caching to flat-to-tree conversion
Summary:
When converting an incoming bundle, rebuilding the flat text every time
is very expensive. Since we're usually converting a series of manifests that
build upon each other, let's cache the previous flat texts.

Reviewed By: quark-zju

Differential Revision: D7126948

fbshipit-source-id: 9d0671c0b1cd6a63a4acecc614b255c4214328bb
2018-04-13 21:51:24 -07:00
Durham Goode
dbd7c7241a hg: convert flat manifest changegroups into trees
Summary:
Previously, in treeonly mode we would ignore any flat manifests that
were received in changegroups (via bundles or pulls, etc). This ended up causing
data loss in practice when people applied old bundles from before the
treemanifest conversion. Instead of just dropping those manifests, let's convert
them on the fly. This may be expensive, but it's better than losing the data.

A future diff may add caching to reuse flat text to speed up applying multiple
deltas.

Reviewed By: quark-zju

Differential Revision: D7083038

fbshipit-source-id: d2e350325d7e9005c8ddd5462034040274f790ff
2018-04-13 21:51:24 -07:00
Stanislau Hlebik
a82197d560 use getcmdanddefaultopts in rage
Summary:
It makes sure that adding new options to the commands won't break them. For
example, rage sparse output was broken, and this diff fixes it.

Note that this changes the behavior of rage: if, say, the smartlog extension is not enabled on the client, then there will be no output in the rage. Previously that wasn't the case. I don't think that's a big problem, because:
a) The extensions are the most common ones and enabled everywhere
b) If they are disabled, then we'll immediately see a problem in hg rage.

However I need to change a test to make it pass. I need to add extensions and change grep output, because `hg rage --preview` has a few lines with `blackbox` in it.

Differential Revision: D7193297

fbshipit-source-id: dde2752ebc7dd3e3edea5c44576d0986f7d18744
2018-04-13 21:51:24 -07:00
Stanislau Hlebik
87a6499b5f add getcmdanddefaultopts
Summary:
Add a function that returns command and all the default options already
initialized. It should be used by commands that call other commands. For
example, calling pull inside of update, calling log inside of show etc.
getcmdanddefaultopts has important benefits:
1) It returns "wrapped" command i.e. command with all the overrides applied. On
the other hand, commands.pull doesn't return it.
2) It correctly initializes options to their default value and correctly
changes their name - replace '-' with '_'.
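
Point 2 can be sketched roughly as follows. This is not the real `getcmdanddefaultopts`; `default_opts` and the sample option table are hypothetical, and the sketch only assumes that each option entry has the long name and the default value as its second and third elements.

```python
import copy


def default_opts(option_table):
    """Build keyword arguments from option entries shaped like
    (short, long, default, help, ...). Hypothetical helper for illustration."""
    opts = {}
    for entry in option_table:
        long_name, default = entry[1], entry[2]
        # '-' is not valid in a Python keyword name, so use '_' instead.
        # Copy the default so callers never share a mutable list.
        opts[long_name.replace('-', '_')] = copy.copy(default)
    return opts


# Made-up option table for the example.
sample_options = [
    ('u', 'update', False, 'update to new branch head'),
    ('', 'dry-run', False, 'do not perform actions'),
    ('r', 'rev', [], 'a remote changeset intended to be added'),
]
opts = default_opts(sample_options)
```

A caller would then invoke the wrapped command with `**opts`, so newly added options automatically receive their defaults.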

Reviewed By: ryanmce

Differential Revision: D7193296

fbshipit-source-id: e8673bd4e16aad6156498660f2a7ed788ed2cac3
2018-04-13 21:51:24 -07:00
Stanislau Hlebik
f4cba47550 fix infinitepush import lint in __init__.py
Differential Revision: D7211380

fbshipit-source-id: d1fdfd51998bb2e28feb2e8ff5456314331f1e38
2018-04-13 21:51:24 -07:00
Phil Cohen
c234b91ee0 rebase: check for other unfinished states if using IMM
Summary:
For some reason I thought we could defer both of these checks if IMM, but we can only
really skip the local changes check.

Reviewed By: quark-zju

Differential Revision: D7208076

fbshipit-source-id: 10d1ed50b7d7eadcf66cef4d11185690ccd8d07b
2018-04-13 21:51:24 -07:00
Kostia Balytskyi
0ef59877cd hg: some portability fixes to py-cdatapack.h
Summary:
1. Variable Length Arrays are not supported by MSVC, but since this is C++ code, we can just use heap allocation
2. Replacing `inet` with portability version

Depends on D7196403

Reviewed By: quark-zju

Differential Revision: D7196605

fbshipit-source-id: a0d88b6e06f255ef648c0b35a99b42ba3bee538a
2018-04-13 21:51:24 -07:00
Kostia Balytskyi
e91017d5d0 compatibility: fix core mpatch.h and compat.h
Summary:
This both fixes semantics and makes `compat.h` a bit more readable.
This became necessary because we migrated from external `compat.h` to `mercurial/compat.h` in D7064623.

Reviewed By: DurhamG

Differential Revision: D7196403

fbshipit-source-id: 0005cc2f4e58951adfe8f7f795067da728ad64ae
2018-04-13 21:51:24 -07:00
Durham Goode
7c43ca7c6b hg: fix tests on OSX
Summary: sed -i without arguments doesn't work on OSX.

Reviewed By: farnz

Differential Revision: D7195193

fbshipit-source-id: a8eead927c94404a37ce5df956de82d29bc1b6a8
2018-04-13 21:51:24 -07:00
Stanislau Hlebik
1b816bd86e revert bisect change
Summary:
In D7001328 we've added a new feature that skips commits if there are no
changes relative to the sparse checkout. Unfortunately that causes lots of
treepacks downloads and makes bisect unusable. Let's revert the change.

Reviewed By: ryanmce, farnz

Differential Revision: D7182016

fbshipit-source-id: 274b29ca6a7b4c3faf83883b64f5ad3b0289873e
2018-04-13 21:51:23 -07:00
Ryan Prince
573a8eb9cc fixing xdiff build on windows
Summary: fixing xdiff build on windows

Reviewed By: quark-zju

Differential Revision: D7189839

fbshipit-source-id: ef05219d911af44f3546bc51fb74539d06b443b5
2018-04-13 21:51:23 -07:00
Jun Wu
d64b27a888 hgsubversion: use list for obsolete relations
Summary:
This is an alternative fix. But it's generally better to avoid using a
generator in this case.

Reviewed By: DurhamG

Differential Revision: D7189763

fbshipit-source-id: 0697f2b80e8ba0a4da7c538e0701a150386410e5
2018-04-13 21:51:23 -07:00
Jun Wu
bf8d6fd2f6 obsolete: covert obsolete relation to a list before processing
Summary: Some code passes it as a generator.
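
A small sketch of why this matters: a generator can only be consumed once, so any code that walks the relations twice silently sees nothing on the second pass. `count_twice` is a made-up function for the example, not code from the diff.

```python
def count_twice(relations):
    # Materialize once so that both passes below see every item.
    relations = list(relations)
    first = sum(1 for _ in relations)
    second = sum(1 for _ in relations)
    return first, second


# A generator of (old, new) pairs, similar in spirit to obsolete relations.
gen = ((i, i + 1) for i in range(3))
result = count_twice(gen)

# Without the list() call, the second pass would be empty:
raw = ((i, i + 1) for i in range(3))
without_list = (sum(1 for _ in raw), sum(1 for _ in raw))
```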

Reviewed By: DurhamG

Differential Revision: D7189765

fbshipit-source-id: d3447c355e4be66aad362687b2423a4776dbdccb
2018-04-13 21:51:23 -07:00
Phil Cohen
eb25ddd617 hg: remove code for rage --oncall
Summary: The option has been gone from the command decorator, so it's been inaccessible, in addition to being rarely used, unsupported on laptops, and highly facebook-specific.

Reviewed By: ryanmce

Differential Revision: D7142733

fbshipit-source-id: ee4c833f170e8b8036624ca28cf286e8a0b0cf2d
2018-04-13 21:51:23 -07:00
Mateusz Moneta
4cfb665650 Update markers during hg pullbackup
Summary:
Before this change `hg pullbackup` did not set correct markers on commits.

This change makes it possible to see which changes have already landed even when we are restoring a repository from backup.
Before the change, `fbclone` + `hg pullbackup` of a repo with the `C1` commit landed would result in:
```
o  o C2
|    |
o  o C1
|  /
|
o
```
after:
```
o  o C2
|    |
o  x C1
|  /
|
o
```

Reviewed By: StanislavGlebik

Differential Revision: D7032572

fbshipit-source-id: ffee3c7cc23c24a3df9a89c999c9dd2de226dbff
2018-04-13 21:51:23 -07:00
Jun Wu
af8ecd5f80 obsolete: copy obsmarkers from old commits automatically
Summary:
Rewriting a set of commits where there are replacement relationships among the
commits does not have an optimal UX today. For example, `rebase -s A -d Z` or
`metaedit A` in the graph below. B1, B2, C will all be replaced. But the new B1
and B2 replacements won't have the B1 -> B2 relationship, and the "new B1"
appears to be revived.

```
  o C
  |
  x  B1 (amended as B2)
  |
  | o B2
  |/
  o  A    o  Z
```

One solution is to avoid rebasing `obsolete()::`, as implemented in D7067121
for metaedit. That would result in

```
  o C
  |
  x  B1 (amended as B2) o new B2
  |                     |
  x  A                  o new A
```

The stack of A, B1, C is forced to break into two parts. This is fine for
power users. But n00b users would wonder why C is left behind. Per discussion
with simpkins at an internal post about the metaedit case, we think a more
linear history is more user-friendly. That is:

```
  o new C
  |
  x  new B1 (amended as *new* B2)
  |
  | o new B2
  |/
  o new A
```

The stack stays in the same shape.

This diff implements the "copying obsmarkers" behavior at the "createmarkers"
level so everything using that API would get the feature for free, including
metaedit and rebase.

D7067121 is reverted since the new UX is preferred. The test added is for the
`metaedit` command; changes to rebase will be added in a later patch.

Differential Revision: D7121487

fbshipit-source-id: fd3c8a96ab434b131fb86d9882ccbdff8f63f05e
2018-04-13 21:51:23 -07:00
Martijn Pieters
216613fa86 sparse: treat semicolons as comments too
Summary:
Sparse profile files are INI files, and semicolons are the traditional comment line starter.

There are already profiles that use the semicolon as a comment, see diffusion/FBS/browse/master/tools/scm/sparse/fbobjc/sandcastle and diffusion/FBS/browse/master/tools/scm/sparse/fbandroid/sandcastle
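
The comment handling can be pictured with a tiny line filter. This is a sketch under the assumption that comments are whole lines starting with `#` or `;`; `profile_lines` and the sample profile are hypothetical, and the real sparse parser may differ.

```python
def profile_lines(text):
    """Yield the meaningful lines of a sparse-profile-like file,
    skipping blank lines and '#' or ';' comment lines."""
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(('#', ';')):
            continue
        yield stripped


sample = "# hash comment\n; semicolon comment\n[include]\npath/a\n\n[exclude]\npath/a/tmp\n"
kept = list(profile_lines(sample))
```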

Reviewed By: farnz

Differential Revision: D7181613

fbshipit-source-id: a42171f6bd6213147c6363f8f359f885af38b8af
2018-04-13 21:51:23 -07:00
Durham Goode
bf42a6c236 hg: prevent treemanifest server from trying to prefetch nodes
Summary:
Recently we unified the client and server code paths a bit, which
can cause the treemanifest server to attempt to do a prefetch (which doesn't
make sense since it has nowhere to prefetch from). It ends up throwing an Abort
error about not having a remote server configured. The fix is to make the
prefetch path smarter about when it's run on the server and to throw a standard
MissingNodesError instead. That kind of error is already handled in the hybrid
repository case and we just eat it and serve the flat manifests like normal.

Once we move to treeonly mode, that error handler will re-raise the exception so
real issues with missing nodes won't be hidden.

Reviewed By: phillco

Differential Revision: D7182283

fbshipit-source-id: 15ed6549d9d7da1fee0570e1fa10338545ed92b1
2018-04-13 21:51:23 -07:00
Durham Goode
bc5366b741 hg: add test for treemanifest and infinitepush rebundling
Summary:
There's a bug where infinitepush attempting to rebundle a bundle that
does not contain trees causes an exception because the server attempts to
prefetch those trees (which fails because there's nowhere to prefetch from).

This diff just adds a test for that case. The next diff will fix it.

Reviewed By: phillco

Differential Revision: D7182284

fbshipit-source-id: a3fbb576cf3318c81b18943e0f0d466aa65e54fb
2018-04-13 21:51:23 -07:00
Xinjie Lei
ef5e9eed07 Remove "bookprevnode" and "pushbackbookmarks"
Summary:
All of the repos that use commit cloud have remotenames extension enabled, so bookprevnode and pushbackbookmarks parameters are not used. Local bookmarks won't be updated after a push.

We remove "bookprevnode" and "pushbackbookmarks" and functions related to them.

Reviewed By: StanislavGlebik

Differential Revision: D7122411

fbshipit-source-id: 0c6b3bc3f41f5b03d4bb2bc297ae35d77c90fedf
2018-04-13 21:51:22 -07:00
Durham Goode
147d85a46e hg: backout automatic conversion of flat manifests to trees
Summary:
The feature that automatically converted flat manifests to trees is
dependent on the hg server not sending flat manifests to treeonly clients
(otherwise it's very, very slow). Since the server rpms got reverted, we need to
backout these changes until the server issues are fixed.

Reviewed By: farnz

Differential Revision: D7181025

fbshipit-source-id: 1e4aad04d15909a3ce4f69313419e50c14bc8c19
2018-04-13 21:51:22 -07:00
Durham Goode
53b29c4a80 hg: add basic caching to flat-to-tree conversion
Summary:
When converting an incoming bundle, rebuilding the flat text every time
is very expensive. Since we're usually converting a series of manifests that
build upon each other, let's cache the previous flat texts.

Reviewed By: quark-zju

Differential Revision: D7126948

fbshipit-source-id: d31442f71b5a13f5afcd54b019c9bbc85f6f889e
2018-04-13 21:51:22 -07:00
Durham Goode
4d44789410 hg: convert flat manifest changegroups into trees
Summary:
Previously, in treeonly mode we would ignore any flat manifests that
were received in changegroups (via bundles or pulls, etc). This ended up causing
data loss in practice when people applied old bundles from before the
treemanifest conversion. Instead of just dropping those manifests, let's convert
them on the fly. This may be expensive, but it's better than losing the data.

A future diff may add caching to reuse flat text to speed up applying multiple
deltas.

Reviewed By: quark-zju

Differential Revision: D7083038

fbshipit-source-id: 4912ec5ea5097163cede00158df821f116d92c9b
2018-04-13 21:51:22 -07:00
Jun Wu
3b8120083a changegroup: add a config to tweak deltabase selection
Summary:
This diff adds a config option to tweak deltabase in changegroup. It has 3
options:

  - Always null - always use "null" as delta base, effectively make
    everything full text
  - No external - delta bases cannot be a revision outside the changegroup
  - Default - the current behavior: delta bases can be anything that client
    thinks the server should have.

This gives Mononoke more time to bake delta related logic, as we can
choose "always null" first, then incrementally increase the complexity.
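
The three modes can be sketched as a small selection function. The mode strings and the function name below are made up for illustration (the real config values are not given here), and falling back to the null revision in the "no external" case is an assumption; an implementation could instead pick another base inside the changegroup.

```python
NULLREV = -1  # Mercurial's conventional null revision number


def choose_deltabase(mode, candidate, revs_in_changegroup):
    """Pick a delta base for one revision (illustrative only).

    mode: 'always-null', 'no-external' or 'default' (hypothetical names)
    candidate: the base that the default logic would have chosen
    revs_in_changegroup: set of revisions being sent
    """
    if mode == 'always-null':
        return NULLREV  # effectively sends a full text
    if mode == 'no-external' and candidate not in revs_in_changegroup:
        return NULLREV  # assumed fallback when the base is outside the changegroup
    return candidate


sending = {5, 6}
results = [
    choose_deltabase('always-null', 5, sending),
    choose_deltabase('no-external', 3, sending),
    choose_deltabase('no-external', 5, sending),
    choose_deltabase('default', 3, sending),
]
```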

Reviewed By: phillco

Differential Revision: D7158585

fbshipit-source-id: 5f6d9a78d1108093e8d08b9f296568f4f7e7471b
2018-04-13 21:51:22 -07:00
Durham Goode
26f64a8aa2 hg: get rid of len(revlog) requirement in verify
Summary:
Verify had some logic that checked the length of the changelog and
manifest to decide if either existed. This allowed for simplifying certain error
messages (like not reporting all the broken changelog manifest pointers if the
manifest was simply gone, and just reporting the manifest was gone).
Unfortunately, in future changelog and manifest implementations len() will be an
expensive function, so let's just get rid of that optimization.

This fixes hg verify in a treeonly repository.

Reviewed By: quark-zju

Differential Revision: D7127168

fbshipit-source-id: 8ddc3dfe3c3c913efd4b7af5fc9715a3e48b60a1
2018-04-13 21:51:22 -07:00
Durham Goode
09c987f22f hg: don't send public trees during pull/push
Summary:
Currently if you push or pull a bunch of commits between peers we will
include all the trees as part of the push. If the source repo doesn't have all
the necessary trees, it will go to the server to get them. Since the other
machine can just as easily go to the server (and probably won't need most of
those trees anyway), let's just have the source client send all draft trees and
skip the public commits.

Reviewed By: phillco

Differential Revision: D7141623

fbshipit-source-id: 6d33ae9d4c9cc32bf6dfa76f733c87c06890d719
2018-04-13 21:51:22 -07:00
Durham Goode
31756a594d hg: update generatemanifests to respect sendtrees enum
Summary:
As part of unifying all our pre-pull/push prefetches, let's move the
changegroup-building prefetch into the cansendtrees function. In a future diff
we'll change this logic to not send trees for public commits in a peer to peer
push/pull.

Reviewed By: mjpieters

Differential Revision: D7141625

fbshipit-source-id: 0253fa32993666f3e03c10c98163d8d60370a97c
2018-04-13 21:51:22 -07:00
Durham Goode
71bccf633c hg: move cansendtrees to shallowbundle
Summary:
A future diff will make it so we can send only draft trees instead of
all trees. To prepare for this, let's move the cansendtrees logic to
shallowbundle (since it will be used by both shallowbundle and by treemanifest)
and change it to return an enum.

Reviewed By: quark-zju

Differential Revision: D7141624

fbshipit-source-id: 34c78b0d1cdb6f8d86a99fb74665e80b2af12c5c
2018-04-13 21:51:22 -07:00
Durham Goode
d9979c9928 hg: update test to show what trees were downloaded
Summary:
A future diff is going to change what trees are sent during a peer to
peer push. Let's update this test so we can see the actual changes in the next
diff.

Reviewed By: singhsrb

Differential Revision: D7141626

fbshipit-source-id: 75e61e9c417d86c48ed1762d6ab67bd4204f67c7
2018-04-13 21:51:22 -07:00
Kostia Balytskyi
174c81d3c9 hg: remove -r option from sed call as it is not supported by OSX
Summary:
`-r` seems to be unneeded, as tests pass without it.

Also, it does not look like the regex itself uses anything not mentioned in `man re_format` on OSX, so we can just use the non-extended re.

Reviewed By: StanislavGlebik

Differential Revision: D7167503

fbshipit-source-id: 3c5c520e9bf2627523cabc771226fc37dc2e9171
2018-04-13 21:51:22 -07:00
Martijn Pieters
15383f0bd3 sparse: use manifest matcher when discovering profiles
Summary:
This echoes the change in D7056650; there is no need to special-case
treemanifests here; delegation to the manifest .match method allows the
manifest to apply optimisations when available.

Differential Revision: D7100363

fbshipit-source-id: 66a35850a132f804efb407712d2e4db737c10cff
2018-04-13 21:51:22 -07:00
Martijn Pieters
3b4b3d8cbb sparse: make use of matchers for faster path listing
Summary:
The current code iterates over all files in the manifest, filtering against a prefix.

But a manifest supports using a matcher directly, and efficient implementations like the treemanifest will prune the tree to a much smaller subset rapidly based on the path in a matcher. Switching to using a matcher dramatically improves --cwd-list performance in fbsource, when treemanifests are available.
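
To illustrate why pruning helps, here is a toy sketch that models a tree manifest as nested dicts and lists files under a directory by descending only into that subtree. It does not use the real manifest or matcher API; `list_under` and the sample tree are hypothetical.

```python
def list_under(tree, prefix):
    """Yield file paths below `prefix` in a nested-dict tree.

    Only the subtree for `prefix` is visited; unrelated directories are
    never walked, unlike a scan over every file in a flat manifest.
    """
    parts = [p for p in prefix.split('/') if p]
    node = tree
    for part in parts:
        node = node.get(part)
        if not isinstance(node, dict):
            return  # prefix does not name a directory
    stack = [('/'.join(parts), node)]
    while stack:
        base, current = stack.pop()
        for name, child in current.items():
            path = f'{base}/{name}' if base else name
            if isinstance(child, dict):
                stack.append((path, child))
            else:
                yield path


tree = {'a': {'b': {'f1': 'x'}, 'f2': 'y'}, 'c': {'f3': 'z'}}
under_a = sorted(list_under(tree, 'a'))
```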

Reviewed By: quark-zju

Differential Revision: D7056650

fbshipit-source-id: 2bf62ea93680323a49c9282266118805881d7b02
2018-04-13 21:51:21 -07:00
Ryan McElroy
cc75e9e988 remotefilelog: inform user of progress while finding missing objects
Summary:
During big `hg update` calls, the user will often see no progress
initially while we discover which blobs we need to download. Let's fix that.

Reviewed By: quark-zju

Differential Revision: D6903134

fbshipit-source-id: 35b174120b6dce412dd337b6b93c9f5b4233522d
2018-04-13 21:51:21 -07:00
Ryan McElroy
883f750613 merge: show progress when updating dirstate after update
Summary:
For large updates, the dirstate update can take a while. Let's show
progress so the user understands what is happening and how long to wait.

Reviewed By: quark-zju

Differential Revision: D6903133

fbshipit-source-id: f7f6c3c14e1d3221a383da4a6e311aa12a8d3a98
2018-04-13 21:51:21 -07:00
Phil Cohen
eb1a1616ae pasterage: fix crash on printing advice
Summary: `advice` should default to '' if not set, because concatenating `None` and a string will crash. Also set the config everywhere.

Reviewed By: markbt, quark-zju

Differential Revision: D7151898

fbshipit-source-id: 8243267c379da13e293f8e4b2d3cd1976bafbf9d
2018-04-13 21:51:21 -07:00
Aida Getoeva
93252b7fd2 Enable batch mode for SSH during background pushbackup
Summary:
Added passing of the BatchMode option to the SSH call only when pushbackup runs in the background.
Also fixed dummyssh to skip options before the hostname, and added a unit test.
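
A rough sketch of the behaviour described: `BatchMode=yes` is a standard OpenSSH option that disables interactive prompts, which suits a background process with nobody to answer them. The `ssh_command` helper and its arguments are hypothetical and not taken from the extension.

```python
def ssh_command(host, remote_cmd, background=False):
    """Build an ssh argument list, adding BatchMode only for background runs."""
    args = ['ssh']
    if background:
        # No terminal is attached, so fail fast instead of prompting.
        args += ['-o', 'BatchMode=yes']
    args += [host, remote_cmd]
    return args


foreground = ssh_command('example-host', 'hg -R repo serve --stdio')
backgrounded = ssh_command('example-host', 'hg -R repo serve --stdio', background=True)
```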

Reviewed By: quark-zju

Differential Revision: D7119123

fbshipit-source-id: 2c8e66fee44cca5b23389cba8e21e3a0b237268e
2018-04-13 21:51:21 -07:00
Mark Thomas
629b2a9ec4 smartlog: always show latest public ancestor of draft commits
Summary:
Smartlog is supposed to show the latest public ancestor of all draft commits,
however this doesn't always happen.

The reason is a boundary error in the test for finding public commits.  If the
latest public ancestor is also the common ancestor (fairly normal), then it
will be excluded.

Reviewed By: quark-zju

Differential Revision: D7140139

fbshipit-source-id: 6999f7ad14f86653ebe4d4f6543b9c7533871cf2
2018-04-13 21:51:21 -07:00
Jun Wu
b7eb2e64e3 mdiff: use xdiff for diff calculation
Summary:
Let's switch to xdiff for its better diff quality and performance!

The test changes demonstrate xdiff's better diff quality.

Reviewed By: ryanmce

Differential Revision: D7135206

fbshipit-source-id: 1775df6fc0f763df074b4f52779835d6ef0f3a4e
2018-04-13 21:51:21 -07:00
Jun Wu
1ce8dceb3a tests: add tests for upcoming xdiff change
Summary:
The next diff is going to switch bdiff to xdiff. This diff adds related tests
so we can clearly see how xdiff improves the diff quality.

It also solves [issue5091](https://bz.mercurial-scm.org/5091) because xdiff
will shift hunks up and down to group them together. So that was also added as
a test. Although in a more complex case where the hunks are separated by some
common lines (ex. "Y"), xdiff won't help either.

Reviewed By: ryanmce

Differential Revision: D7147444

fbshipit-source-id: 3605290b5dfdfc7b8b004b38c7f7ee9534915380
2018-04-13 21:51:21 -07:00
Jun Wu
81e68a9a57 xdiff: decrease indent heuristic overhead
Summary:
Add a "boring" threshold to limit the search range of the indention heuristic,
so the performance of the diff algorithm is mostly unaffected by turning on
indention heuristic.

Reviewed By: ryanmce

Differential Revision: D7145002

fbshipit-source-id: 024ec685f96aa617fb7da141f38fa4e12c4c0fc9
2018-04-13 21:51:21 -07:00
Jun Wu
2bfb1f6996 xdiff: enable indent heuristic
Summary:
Enable the indent heuristic feature, since it provides nice visual
improvements for a wide range of cases. See the added test, and [1].

The only downside is it can slow things down. In a crafted case, this could
make `--indent-heuristic` several times slower than `--no-indent-heuristic`.

```
open('a', 'w').write(" \n" * 1000000)
open('b', 'w').write(" \n" * 1000001)
```

```
git diff --no-indent-heuristic a b  0.21s user 0.03s system 100% cpu 0.239 total
git diff --indent-heuristic a b     0.77s user 0.02s system 99% cpu 0.785 total
```

[1]: 433860f3d0

Reviewed By: ryanmce

Differential Revision: D7135452

fbshipit-source-id: 019b7e89225f288bba0a1d042591b13b5419ad0e
2018-04-13 21:51:21 -07:00
Jun Wu
884aac4596 xdiff: add a python wrapper
Summary:
Implement a `mercurial.cext.xdiff` module that exposes the xdiff algorithm.

`xdiff.blocks` should be a drop-in replacement for `bdiff.blocks`.

In theory we can change the pure C version of `bdiff.c` directly. However
that means we lose bdiff entirely. It seems more flexible to have both at
the same time so they can be easily switched via Python code. Hence the
Python module approach.

Reviewed By: ryanmce

Differential Revision: D7135205

fbshipit-source-id: 48cd3b5be7fd5ef41b64eab6c76a5c8a6ce99e05
2018-04-13 21:51:21 -07:00
Jun Wu
511ec41260 xdiff: add a bdiff hunk mode
Summary:
xdiff generated hunks for the differences (ex. question marks in the
`@@ -?,?  +?,? @@` part from `diff --git` output). However, bdiff generates
matched hunks instead.

This patch adds a `XDL_EMIT_BDIFFHUNK` flag used by the output function
`xdl_call_hunk_func`.  Once set, xdiff will generate bdiff-like hunks
instead. That makes it easier to use xdiff as a drop-in replacement of bdiff.

Note that since `bdiff('', '')` returns `[(0, 0, 0, 0)]`, the shortcut path
`if (xscr)` is removed. I have checked functions called with `xscr` argument
(`xdl_mark_ignorable`, `xdl_call_hunk_func`, `xdl_emit_diff`,
`xdl_free_script`) work just fine with `xscr = NULL`.
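
The relationship between the two hunk styles can be sketched in Python. This is not the C implementation: `matched_blocks` is a hypothetical helper that takes difference hunks as 0-based `(a_start, a_count, b_start, b_count)` tuples and emits bdiff-style `(a1, a2, b1, b2)` matching blocks. It assumes the list ends with an empty sentinel block at the end of both inputs, which is consistent with `bdiff('', '')` returning `[(0, 0, 0, 0)]`.

```python
def matched_blocks(diff_hunks, len_a, len_b):
    """Convert difference hunks into matching blocks (illustrative only)."""
    blocks = []
    a_pos = b_pos = 0
    for a_start, a_count, b_start, b_count in diff_hunks:
        if a_start > a_pos:
            # Lines between the previous hunk and this one are unchanged.
            blocks.append((a_pos, a_start, b_pos, b_start))
        a_pos = a_start + a_count
        b_pos = b_start + b_count
    if a_pos < len_a:
        # Unchanged tail after the last difference.
        blocks.append((a_pos, len_a, b_pos, len_b))
    blocks.append((len_a, len_a, len_b, len_b))  # assumed end sentinel
    return blocks


# Two 5-line inputs where only line index 2 differs.
one_change = matched_blocks([(2, 1, 2, 1)], 5, 5)
# Two empty inputs: no difference hunks at all, yet one block is still emitted,
# which is why an "if there are no differences, return early" shortcut would not fit.
empty = matched_blocks([], 0, 0)
```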

Reviewed By: ryanmce

Differential Revision: D7135207

fbshipit-source-id: cfb8c363e586841c06c94af283c7f014ba65fcc0
2018-04-13 21:51:21 -07:00
Jun Wu
3dc0156874 xdiff: add a binary utility that runs xdiff
Summary:
Add a simple binary that runs xdiff in a minimal way. This is mainly for
exposing xdiff logic so it can be used in command line for testing purpose.

It also serves as an example of how to use xdiff.

Reviewed By: ryanmce

Differential Revision: D7133531

fbshipit-source-id: ceb608f5754b61eaa95804730b3c89643ff1837b
2018-04-13 21:51:20 -07:00
Jun Wu
56a738fce4 xdiff: remove patience and histogram diff algorithms
Summary:
Patience diff is the normal diff algorithm, plus some greediness that
unconditionally matches common unique lines.  That means it is easy to
construct cases that make it generate suboptimal results, like:

```
open('a', 'w').write('\n'.join(list('a' + 'x' * 300 + 'u' + 'x' * 700 + 'a\n')))
open('b', 'w').write('\n'.join(list('b' + 'x' * 700 + 'u' + 'x' * 300 + 'b\n')))
```

Patience diff has been advertised as being able to generate better results for
some C code changes. However, the more scientific way to do that is the
indention heuristic [1].

Since patience diff could generate suboptimal results more easily and its
"better" diff feature could be replaced by the new indention heuristic, let's
just remove it and its variant histogram diff to simplify the code.

[1]: 433860f3d0

Reviewed By: ryanmce

Differential Revision: D7124711

fbshipit-source-id: 127e8de6c75d0262687a1b60814813e660aae3da
2018-04-13 21:51:20 -07:00