Summary:
During big `hg update` calls, the user will often see no progress
initially while we discover which blobs we need to download. Let's fix that.
Reviewed By: quark-zju
Differential Revision: D6903134
fbshipit-source-id: 35b174120b6dce412dd337b6b93c9f5b4233522d
Summary:
For large updates, the dirstate update can take a while. Let's show
progress so the user understands what is happening and how long to wait.
Reviewed By: quark-zju
Differential Revision: D6903133
fbshipit-source-id: f7f6c3c14e1d3221a383da4a6e311aa12a8d3a98
Summary: `advice` should default to '' if not set, because concatenating `None` and a string will crash. Also set the config everywhere.
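A minimal illustration of the failure mode, using a plain dict as a stand-in for the config lookup. The `config` dict and both helper functions are hypothetical names for illustration, not the actual hg code:

```python
def build_message(config, base="abort: something went wrong"):
    # Default to '' so the concatenation below is always str + str.
    advice = config.get("advice") or ""
    return base + advice


def build_message_unsafe(config, base="abort: something went wrong"):
    # Without a default, a missing key yields None, and str + None
    # raises TypeError.
    return base + config.get("advice")
```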
Reviewed By: markbt, quark-zju
Differential Revision: D7151898
fbshipit-source-id: 8243267c379da13e293f8e4b2d3cd1976bafbf9d
Summary:
Added passing of the BatchMode option to the SSH call only when pushbackup runs in the background.
Also fixed dummyssh to skip options before the hostname, and added a unit test.
Reviewed By: quark-zju
Differential Revision: D7119123
fbshipit-source-id: 2c8e66fee44cca5b23389cba8e21e3a0b237268e
Summary:
Smartlog is supposed to show the latest public ancestor of all draft commits,
however this doesn't always happen.
The reason is a boundary error in the test for finding public commits. If the
latest public ancestor is also the common ancestor (fairly normal), then it
will be excluded.
Reviewed By: quark-zju
Differential Revision: D7140139
fbshipit-source-id: 6999f7ad14f86653ebe4d4f6543b9c7533871cf2
Summary:
Let's switch to xdiff for its better diff quality and performance!
The test changes demonstrate xdiff's better diff quality.
Reviewed By: ryanmce
Differential Revision: D7135206
fbshipit-source-id: 1775df6fc0f763df074b4f52779835d6ef0f3a4e
Summary:
The next test is going to switch bdiff to xdiff. This diff adds related tests
so we can clearly see how xdiff improves the diff quality.
It also solves [issue5091](https://bz.mercurial-scm.org/5091) because xdiff
will shift hunks up and down to group them together. So that was also added as
a test. Although in a more complex case where the hunks are separated by some
common lines (ex. "Y"), xdiff won't help either.
Reviewed By: ryanmce
Differential Revision: D7147444
fbshipit-source-id: 3605290b5dfdfc7b8b004b38c7f7ee9534915380
Summary:
Add a "boring" threshold to limit the search range of the indention heuristic,
so the performance of the diff algorithm is mostly unaffected by turning on
indention heuristic.
Reviewed By: ryanmce
Differential Revision: D7145002
fbshipit-source-id: 024ec685f96aa617fb7da141f38fa4e12c4c0fc9
Summary:
Enable the indent heuristic feature, since it provides nice visual
improvements for a wide range of cases. See the added test, and [1].
The only downside is it can slow things down. In a crafted case, this could
make `--indent-heuristic` several times slower than `--no-indent-heuristic`.
```
open('a', 'w').write(" \n" * 1000000)
open('b', 'w').write(" \n" * 1000001)
```
```
git diff --no-indent-heuristic a b 0.21s user 0.03s system 100% cpu 0.239 total
git diff --indent-heuristic a b 0.77s user 0.02s system 99% cpu 0.785 total
```
[1]: 433860f3d0
Reviewed By: ryanmce
Differential Revision: D7135452
fbshipit-source-id: 019b7e89225f288bba0a1d042591b13b5419ad0e
Summary:
Implement a `mercurial.cext.xdiff` module that exposes the xdiff algorithm.
`xdiff.blocks` should be a drop-in replacement for `bdiff.blocks`.
In theory we can change the pure C version of `bdiff.c` directly. However
that means we lose bdiff entirely. It seems more flexible to have both at
the same time so they can be easily switched via Python code. Hence the
Python module approach.
Reviewed By: ryanmce
Differential Revision: D7135205
fbshipit-source-id: 48cd3b5be7fd5ef41b64eab6c76a5c8a6ce99e05
Summary:
xdiff generates hunks for the differences (ex. the question marks in the
`@@ -?,? +?,? @@` part of `diff --git` output). However, bdiff generates
matched hunks instead.
This patch adds a `XDL_EMIT_BDIFFHUNK` flag used by the output function
`xdl_call_hunk_func`. Once set, xdiff will generate bdiff-like hunks
instead. That makes it easier to use xdiff as a drop-in replacement of bdiff.
Note that since `bdiff('', '')` returns `[(0, 0, 0, 0)]`, the shortcut path
`if (xscr)` is removed. I have checked functions called with `xscr` argument
(`xdl_mark_ignorable`, `xdl_call_hunk_func`, `xdl_emit_diff`,
`xdl_free_script`) work just fine with `xscr = NULL`.
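The distinction between "difference hunks" and "matched hunks" can be sketched with Python's standard `difflib`, which is neither bdiff nor xdiff. The `(a1, a2, b1, b2)` tuple layout and the two helper names are assumptions made for this illustration:

```python
from difflib import SequenceMatcher


def matched_hunks(a, b):
    """Return (a1, a2, b1, b2) tuples for the regions that match."""
    sm = SequenceMatcher(None, a, b, autojunk=False)
    return [(m.a, m.a + m.size, m.b, m.b + m.size)
            for m in sm.get_matching_blocks()]


def difference_hunks(a, b):
    """Return (a1, a2, b1, b2) tuples for the regions that differ."""
    sm = SequenceMatcher(None, a, b, autojunk=False)
    return [(i1, i2, j1, j2)
            for tag, i1, i2, j1, j2 in sm.get_opcodes()
            if tag != "equal"]
```

difflib always appends a zero-length sentinel match, so two empty inputs yield one empty matched hunk and no difference hunks at all. That is why a shortcut taken when there are no differences would skip the output that a matched-hunk consumer expects.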
Reviewed By: ryanmce
Differential Revision: D7135207
fbshipit-source-id: cfb8c363e586841c06c94af283c7f014ba65fcc0
Summary:
Add a simple binary that runs xdiff in a minimal way. This is mainly for
exposing xdiff logic so it can be used from the command line for testing purposes.
It also serves as an example of how to use xdiff.
Reviewed By: ryanmce
Differential Revision: D7133531
fbshipit-source-id: ceb608f5754b61eaa95804730b3c89643ff1837b
Summary:
Patience diff is the normal diff algorithm, plus some greediness that
unconditionally matches common unique lines. That means it is easy to
construct cases that make it generate suboptimal results, like:
```
open('a', 'w').write('\n'.join(list('a' + 'x' * 300 + 'u' + 'x' * 700 + 'a\n')))
open('b', 'w').write('\n'.join(list('b' + 'x' * 700 + 'u' + 'x' * 300 + 'b\n')))
```
Patience diff has been advertised as being able to generate better results for
some C code changes. However, the more scientific way to do that is the
indention heuristic [1].
Since patience diff could generate suboptimal results more easily and its
"better" diff feature could be replaced by the new indention heuristic, let's
just remove it and its variant histogram diff to simplify the code.
[1]: 433860f3d0
Reviewed By: ryanmce
Differential Revision: D7124711
fbshipit-source-id: 127e8de6c75d0262687a1b60814813e660aae3da
Summary:
Vendor git's xdiff library from git commit
d7c6c2369ac21141b7c6cceaebc6414ec3da14ad using GPL2+ license.
There is another recent user report that hg diff generates a suboptimal
result. It seems the fix to issue4074 isn't good enough. I crafted some
other interesting cases, and hg diff barely has any advantage compared with
gnu diffutils or git diff.
| testcase | gnu diffutils | hg diff        | git diff      |
|          | lines    time |   lines   time | lines    time |
| patience |     6    0.00 |     602   0.08 |     6    0.00 |
| random   | 91772    0.90 |  109462   0.70 | 91772    0.24 |
| json     |     2    0.03 | 1264814   1.81 |     2    0.29 |
"lines" means the size of the output, i.e. the count of "+/-" lines. "time"
means seconds needed to do the calculation. For both, smaller is better.
"hg diff" counts Python startup overhead.
Git and GNU diffutils generate optimal results. For the "json" case, git can
have an optimization that does a scan for common prefix and suffix first,
and match them if the length is greater than half of the text. See
https://neil.fraser.name/news/2006/03/12/. That would make git the fastest
for all above cases.
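A rough sketch of the prefix/suffix scan described above, operating on lists of lines. The function names are hypothetical, and reading "greater than half of the text" as "more than half of the longer input" is an assumption:

```python
def common_affix_lengths(a, b):
    """Return (prefix_len, suffix_len) of the lines shared by a and b."""
    limit = min(len(a), len(b))
    prefix = 0
    while prefix < limit and a[prefix] == b[prefix]:
        prefix += 1
    suffix = 0
    # Do not let the suffix overlap the prefix that was already matched.
    while suffix < limit - prefix and a[-1 - suffix] == b[-1 - suffix]:
        suffix += 1
    return prefix, suffix


def worth_trimming(a, b):
    """Trim only if the shared ends cover more than half of the longer input."""
    prefix, suffix = common_affix_lengths(a, b)
    return (prefix + suffix) * 2 > max(len(a), len(b))
```

For a single line moved near the end of a very large file, almost everything falls into the shared prefix, so the expensive diff only has to look at a handful of lines.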
About testcases:
patience:
Aiming for the weakness of the greedy "patience diff" algorithm. Using
git's patience diff option would also get a suboptimal result. Generated using
the Python script:
```
open('a', 'w').write('\n'.join(list('a' + 'x' * 300 + 'u' + 'x' * 700 + 'a\n')))
open('b', 'w').write('\n'.join(list('b' + 'x' * 700 + 'u' + 'x' * 300 + 'b\n')))
```
random:
Generated using the script in `test-issue4074.t`. It practically makes the
algorithm suffer. Impressively, git wins in both performance and diff
quality.
json:
The recent user reported case. It's a single line movement near the end of a
very large (800K lines) JSON file.
Reviewed By: ryanmce
Differential Revision: D7124455
fbshipit-source-id: 832651115da770f9d2ed5fdff2e200453c0013f8
Summary:
verify.skipmanifests was added to let us skip the parts of verify that
were expensive in a lazy-manifest world. We also need to skip .hgsubstate
verification since it requires the manifests.
Reviewed By: singhsrb
Differential Revision: D7127353
fbshipit-source-id: 377fffb8556f7578da3e51c3da53f97554fb5d74
Summary: This change minimally addresses the issue that `debugrebuilddirstate` can crash if the dirstate file is very corrupt.
Reviewed By: markbt
Differential Revision: D7028370
fbshipit-source-id: 72fc7a2900a8bc1bb5f062454530b4fc4c806f09
Summary:
This logic exists for the following case: if because of some circumstances (user error, NFS/Samba weirdness) a symlink file becomes corrupted and we check this file in while preserving the 'l' flag in the manifest, the revision becomes un-checkoutable on POSIX filesystems (according to the original explanations anyway).
This option basically adds an ability to take this risk. I am planning to enable this option for `ovrsource`, since it already has very few symlinks so the probability of corrupting something is pretty low.
Differential Revision: D7127333
fbshipit-source-id: f85d9f3aef676afca641280c9b7f0ecfb87b9fab
Summary: They help make tests easier to write.
Reviewed By: phillco
Differential Revision: D7121645
fbshipit-source-id: 9c7181d45c4e28155eb68f355cf1c4cfc077d191
Summary:
`.t` tests have some highly repetitive logic, like creating a repo, etc.
This patch adds a common shell file that defines frequently used functions.
For now, `newrepo` and `enable` are added. The latter can be used to
enable a feature (ex. obsstore), or an extension.
`test-fb-hgext-fbamend-next.t` and `test-fb-hgext-absorb.t` are migrated
to use the new shell functions.
Reviewed By: phillco
Differential Revision: D7121485
fbshipit-source-id: 167fcc20e4e30864199b6c5af0958b80bfb68817
Summary:
Don't think this is required or used anymore and reveals information
about the structure of our project if we open source our mercurial.
Reviewed By: quark-zju
Differential Revision: D7128203
fbshipit-source-id: 4cdfa008631d08321a4d5a1c8f18cef429c35077
Summary:
Add "-l" flag that means:
Dump only bin files that are not sent to dewey-lfs.
Return hashes for files uploaded to dewey-lfs.
`-l --lfs Provide sha256 for lfs files instead of dumping`
The hashes of these files will then be used by Phabricator to download from dewey.
Related [post](https://fb.facebook.com/groups/614495118724031/permalink/850507588456115/)
Differential Revision: D7000198
fbshipit-source-id: fa4ba81a021884dccb0471154f5a490284ea0e59
Summary:
- Print some problem-solving advice after the pasterage. (Some of this was/is the advice in the pinned post)
- Use a spinner when generating the paste, and when posting it
- Use `subprocess` to capture the output of `arc paste`. That way we can colorize the link, and we don't have to print "receiving paste from stdin", which could confuse the user.
- Make the link more prominent
Reviewed By: mjpieters
Differential Revision: D7108933
fbshipit-source-id: d4f90af8c602f7553bd2f427e661c7b1e045eb50
Summary:
The breakage we had on branch importer was related to filetransaction trying to
close a file that didn't exist. We're still not sure why this happens yet, but
the workaround was to use mercurial's transaction and to force not using workers,
which is this change.
Differential Revision: D7108127
fbshipit-source-id: 71fa63824984bfb91de3b732166f7bae496187ad
Summary:
The addmemtree function was originally created to allow making
pending trees readable (i.e. trees that have been written but not committed).
Now that all writes are immediately readable, we don't need addmemtree anymore.
Reviewed By: quark-zju
Differential Revision: D7101314
fbshipit-source-id: f9ecabf366ba7bc59abee42d264e17ab66b7f0dd
Summary: dispatch.lazyaliasentry didn't work with slices, i.e. lazyaliasentry[:2] failed. This diff fixes it.
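How the original failed is not stated here. As a generic sketch of a lazily computed, tuple-like object that accepts both integer and slice indexes, assuming a hypothetical `LazyEntry` class rather than the real `lazyaliasentry`:

```python
class LazyEntry(object):
    """Hypothetical tuple-like wrapper that computes its fields on first use."""

    def __init__(self, loader):
        self._loader = loader
        self._fields = None

    def _load(self):
        if self._fields is None:
            self._fields = tuple(self._loader())
        return self._fields

    def __getitem__(self, index):
        # Delegating to the tuple handles both ints and slice objects.
        return self._load()[index]

    def __len__(self):
        return len(self._load())
```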
Reviewed By: farnz
Differential Revision: D7109779
fbshipit-source-id: c704cd44fea0944ae4be68df36d32df98b7fc09b
Summary:
This allows us to decode VLQ integers at a given offset for anything that
implements `AsRef<[u8]>`, instead of having to couple with a `&mut Read`
interface. The main benefit is getting rid of `mut`: the old `VLQDecode`
interface has to use `&mut Read` because reading has the side effect of
changing the internal position counter.
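The idea of decoding at an explicit offset, with no mutable reader state, can be sketched in Python. The function name and the byte layout (7 payload bits per byte, least significant group first, high bit set on every byte except the last) are assumptions for this illustration, not a statement about the Rust code:

```python
def read_vlq_at(buf, offset):
    """Decode one VLQ integer from a bytes-like buf starting at offset.

    Returns (value, bytes_consumed). Raises ValueError if the buffer ends
    before the final byte of the integer.
    """
    value = 0
    shift = 0
    pos = offset
    while pos < len(buf):
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos - offset
        shift += 7
    raise ValueError("truncated VLQ integer")
```

Because the caller passes the offset in and gets the consumed length back, the buffer itself is never modified and can be shared freely.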
Reviewed By: markbt
Differential Revision: D7093998
fbshipit-source-id: 20cb14e38c828462c34f32245d0f0f512028b647
Summary:
I'm going to add more ways to do VLQ parsing (ex. reading from a `&[u8]`
instead of a `Read` which has to be mutable). So let's add a benchmark to
compare the `&[u8]` version with the `Read` version.
Reviewed By: DurhamG
Differential Revision: D7092960
fbshipit-source-id: e1189de10396516c732dc73b45b7690a1718f1c0
Summary:
Previously pushrebase would only send changegroups using the cg1
format. remotefilelog will soon require cg2 (and it results in better deltas
anyway), so let's change pushrebase to allow using cg2.
Initially it is off by default. We will change it to be on by default once the
server has been upgraded to handle the received part.
Reviewed By: mjpieters
Differential Revision: D7108732
fbshipit-source-id: ff4ad3a3fc2801aec4876db30c8130ce743b2e6a
Summary:
When background prefetch is enabled, let's use the base parameter to
limit how many files we download. This makes the operation O(files changed), and
on windows results in a significant speed up.
Reviewed By: mjpieters
Differential Revision: D7108838
fbshipit-source-id: a46b8a7d897ee204b9a4c1f1c65d875dbd3e9bc7
Summary:
There is no need to strip path separators from the normalised path; normpath
will never leave any.
We also ensure that the path prefix we test ends in a path separator, to avoid
matching on sibling paths that happen to share a prefix.
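A small sketch of the sibling-prefix pitfall. It uses `posixpath` so the behavior is the same on every platform, and `is_under` is a hypothetical helper name:

```python
import posixpath


def is_under(path, root):
    """Return True if path is root itself or lies inside root."""
    path = posixpath.normpath(path)
    root = posixpath.normpath(root)
    if path == root:
        return True
    # Append the separator so "/repo" does not match "/repo-old/file".
    prefix = root if root.endswith("/") else root + "/"
    return path.startswith(prefix)
```

A bare `path.startswith(root)` would accept `/repo-old/file` for the root `/repo`; the trailing separator rules that out.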
Reviewed By: ryanmce
Differential Revision: D7056649
fbshipit-source-id: 10b78a78ba44fbc8d9c05fb7ffd0ffd1c1496a67
Summary:
Historically treeonly clients were ignoring flat manifests sent in
bundles, but with a recent change they will now try to recreate those manifests
when they receive such a bundle (so we don't lose any data when a user does 'hg
unbundle' on a flat-only bundle). This means we need to stop sending flat
manifests from the server to treeonly clients.
Differential Revision: D7083028
fbshipit-source-id: e4580b00a8be96fbef0ee624529c58f41cfa2752
Summary:
Previously we were only wrapping the changegroup packers for the
clients. In a future diff we want to use the shallow changegroup packers to
govern when to send trees from the server, so we need them enabled for the
server.
It turns out they were mostly enabled for the server already. While we weren't
replacing the classes in changegroup._packermap (which is what they get
instantiated from), we were replacing them via the interposeclass decorator.
This meant that the server would construct a cg3packer, but it would have a
shallowcg1packer in the class hierarchy, so most of the code was running
already. So this should be a pretty low risk change. In theory.
Differential Revision: D7083042
fbshipit-source-id: 5ce44a9ceda4d7d4bd126f52a01e45e6e1e7de40
Summary:
Eventually all the clients using the tree manifest will operate in
treeonly mode. In treeonly mode, we only read from tree manifests, so
manifest-related operations would fail for commits which only have a
flat manifest. Examples of such commits are old draft commits which were created
before tree manifests existed. One way to resolve this is to
automatically convert any flat manifests we come across to tree manifests. This
commit does that.
Differential Revision: D7083033
fbshipit-source-id: 092fc7852ffc6d1b4130b5a1fc8d9e124cef4fcb
Summary:
We were already hacking the ui object onto the manifestlog via the
treemanifest extension, but it didn't cover all cases. Let's just put it on the
manifestlog via the constructor and get rid of the hack. An upcoming change
requires this.
Reviewed By: phillco
Differential Revision: D7083032
fbshipit-source-id: 4c577cb80193a9c4799853d75a71c26719348e8c
Summary:
Previously we added the mutable packs to the union stores directly when
the mutable packs were first created. This meant we didn't have fine-grained
control over where the mutable pack sat in the store order. Since we want the
mutable packs to be before the remote store and before the upcoming ondemand
store, we need to insert it into the list at store initialization time.
We can't insert the mutable packs themselves, since they are opened and closed
all the time, but we can insert a proxy class that knows how to find the current
mutable pack on the manifestlog.
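The proxy idea can be sketched as follows. Every class, attribute, and function name here (`FakeManifestLog`, `mutablepack`, `MutablePackProxy`, `DictStore`, `union_get`) is a hypothetical stand-in; the real store API is not shown in this message:

```python
class FakeManifestLog(object):
    """Hypothetical stand-in for the object that owns the mutable pack."""

    def __init__(self):
        self.mutablepack = None  # replaced whenever a pack is opened or closed


class MutablePackProxy(object):
    """Store-list entry that looks up the current mutable pack on each read."""

    def __init__(self, manifestlog):
        self._manifestlog = manifestlog

    def get(self, key):
        pack = self._manifestlog.mutablepack
        if pack is None or key not in pack:
            raise KeyError(key)
        return pack[key]


class DictStore(object):
    """Stand-in for any other store in the list, e.g. a remote store."""

    def __init__(self, data):
        self._data = data

    def get(self, key):
        return self._data[key]


def union_get(stores, key):
    """Return the value from the first store, in list order, that has key."""
    for store in stores:
        try:
            return store.get(key)
        except KeyError:
            continue
    raise KeyError(key)
```

The proxy is inserted once, at a fixed position ahead of the remote store, and keeps working as the underlying pack is swapped in and out.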
Differential Revision: D7083045
fbshipit-source-id: c08a877783c6bb6b95beac4e40544880a6bd3a8f
Summary:
In some cases the generating store could result in infinite recursion
if the generate function didn't actually produce the desired value. Let's add a
context manager to guard against this.
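A minimal sketch of such a guard, assuming a hypothetical `GeneratingStore` that calls a user-supplied `generate(store, key)` on a miss and then retries the read. The names and the retry structure are illustrative, not the actual store code:

```python
import contextlib


class GeneratingStore(object):
    """Hypothetical store that generates missing values, then retries."""

    def __init__(self, generate):
        self._generate = generate
        self._data = {}
        self._generating = set()

    @contextlib.contextmanager
    def _guard(self, key):
        # Refuse to re-enter generation for a key already in progress.
        if key in self._generating:
            raise KeyError(key)
        self._generating.add(key)
        try:
            yield
        finally:
            self._generating.discard(key)

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        if key in self._data:
            return self._data[key]
        with self._guard(key):
            self._generate(self, key)
            # Retry. If generate() produced nothing, the nested call
            # hits the guard and raises instead of recursing forever.
            return self.get(key)
```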
Reviewed By: mjpieters
Differential Revision: D7083030
fbshipit-source-id: 5e0037addbf2ba9fa9d4e222291cc4543da5c659
Summary:
Adds the mutable stores to the union store when they are created. This
will allow reads to access data that has been written but not finalized. This
will be important in later patches for converting a series of trees as part of a
single transaction.
Differential Revision: D7083031
fbshipit-source-id: 25ddbb1dbd29ad6b4164733b6d893b9c69d9d65e
Summary:
This makes the mutable history pack implement the history store read
api so we can add it to the union store and read the contents of things that
have been written but not yet committed.
The mutablehistorypack fileentries variable has been changed to contain a dict
instead of a list so we can access it quickly during reads. The list is from a
legacy requirement where we used to maintain the order that the writer wrote in.
We no longer do that (instead we topologically sort what they've given us), so
switching from a list to a dict should be fine.
Differential Revision: D7083036
fbshipit-source-id: ae511db60ab6432059714a2271c175dc9683b8e1
Summary:
Now that _writeclientmanifest is basically just calling
manifestlog.add, let's get rid of it.
Differential Revision: D7083029
fbshipit-source-id: eee18cefd5a6ae3d95bba58b419364fc9fdb15b3
Summary:
As part of moving the manifest mutable pack logic to be used optionally
without transactions, let's move the lifetime logic
out to the repo.transaction() function instead of being part of the primary manifest
write logic.
Differential Revision: D7083035
fbshipit-source-id: 9947d78db61a41896cc8bdfaaa20504ebb03a125
Summary:
Previously the mutable packs were kept on the transaction and had a
lifetime that corresponded with it. In a future patch we want to enable mutable
packs that span for longer than the lifetime of the transaction, so let's move
the mutable pack maintenance on the manifestlog. For now the lifetime is still
maintained by the transaction, but a future diff will change that as well (and
will get rid of _writeclientmanifest entirely).
Differential Revision: D7083034
fbshipit-source-id: 3735eadfc18e5dd1015bfb82dbf5b9e9e6965cdf
Summary:
As part of a series of diffs that will remove the need for addmemtree,
let's remove it from the _writeclientmanifest function so we can replace
_writeclientmanifest with a cleaner interface that doesn't call addmemtree.
Reviewed By: ryanmce
Differential Revision: D7083037
fbshipit-source-id: 51885ec547df5aa21e66afe36eb1f3224c3eae66
Summary:
We now have a unified place for writing trees to packs, so let's move
the fastmanifest code to use it.
Reviewed By: ryanmce
Differential Revision: D7083043
fbshipit-source-id: f5fc312b7614906b917fc7ca10866705fbd47aac
Summary:
A future diff will be removing the fastmanifest tree write helper. To
simplify that transition, let's move all the fastmanifest specific logic out of
the fastmanifest helper function.
Reviewed By: ryanmce
Differential Revision: D7083054
fbshipit-source-id: bf023efb857af2511b4ed7ae7ef069ee15575f08