Summary: Document the format. Actual implementation in later diffs.
Reviewed By: DurhamG
Differential Revision: D7190575
fbshipit-source-id: 243992fd052ca7a9688d54d20694e65daebb9660
Summary:
The append-only index is too different so it's cleaner to cherry-pick code
from radixbuf, instead of modifying radixbuf which would break code
depending on it.
Started by picking the base16 iterator part.
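As a rough illustration of what a base16 iterator does (the real code is Rust and is not shown here), the sketch below walks a byte string and yields its 4-bit digits. The function name and the high-nibble-first order are assumptions made for this example.

```python
def base16_iter(data):
    """Yield the base16 digits (0..15) of a byte string, high nibble first."""
    for byte in bytearray(data):
        yield byte >> 4      # upper 4 bits
        yield byte & 0x0F    # lower 4 bits


print(list(base16_iter(b"\x12\xef")))
```

A radix-tree style index can consume such digits one at a time to pick one of 16 children per level.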
`rustc-test` does not work with buck, and seems to be in an unmaintained
state, so benchmark tests are migrated to criterion.
Reviewed By: DurhamG
Differential Revision: D7189143
fbshipit-source-id: 459a79b4cf16f35d2ff86f11a5980ba1fc627951
Summary:
Filesystems are hard. Append-only sounds like a safe way to write files, but it
only really helps with process crashes. If the OS crashes, it's possible that
other parts of the file get corrupted. For source control, data integrity checks
are important, so bytes not logically touched by appending also need to be
checked.
Implement a `ChecksumTable` which adds integrity check ability to append-only
files. It's intended to be used by future append-only indexes.
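The idea can be sketched as per-chunk checksums that are verified for whatever byte range a reader is about to trust. This is only an illustration under assumptions: the chunk size, the use of CRC32, and the function names below are hypothetical and do not describe the actual `ChecksumTable` layout.

```python
import zlib


def chunk_checksums(data, chunk_size):
    """Return one CRC32 per fixed-size chunk of data (last chunk may be short)."""
    return [zlib.crc32(data[i:i + chunk_size]) & 0xFFFFFFFF
            for i in range(0, len(data), chunk_size)]


def verify_range(data, sums, chunk_size, start, end):
    """Return True if every chunk overlapping [start, end) matches its checksum."""
    first = start // chunk_size
    last = (end + chunk_size - 1) // chunk_size
    for idx in range(first, last):
        chunk = data[idx * chunk_size:(idx + 1) * chunk_size]
        if idx >= len(sums) or zlib.crc32(chunk) & 0xFFFFFFFF != sums[idx]:
            return False
    return True


data = b"abcdefghij"
sums = chunk_checksums(data, 4)
corrupted = b"abXdefghij"  # a byte that no append touched has changed
print(verify_range(corrupted, sums, 4, 0, 4))   # range covering the damage
print(verify_range(corrupted, sums, 4, 4, 10))  # range that avoids it
```

Checking only the chunks that overlap the requested range keeps reads cheap while still catching corruption in old, supposedly untouched bytes.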
Reviewed By: DurhamG
Differential Revision: D7108433
fbshipit-source-id: 16daf6b8d04bba464f1ee9221716beba69c1d47b
Summary:
First step of a storage-related building block that is in Rust. The goal is
to use it to replace revlog, obsstore and packfiles.
External crates that are likely to be useful are added to reduce future churn.
Reviewed By: DurhamG
Differential Revision: D7108434
fbshipit-source-id: 97ebd9ba69547d876dcecc05e604acdf9088877e
Summary:
Previously, we could be in a merge state (as from `hg update --merge`)
and morestatus did not show any information about the conflicts. Now we will
show conflict info whenever there is a merge state.
Reviewed By: phillco
Differential Revision: D7149411
fbshipit-source-id: e4e03036f3a11bda3edc3628d503a8b3aea412be
Summary:
When converting an incoming bundle, rebuilding the flat text every time
is very expensive. Since we're usually converting a series of manifests that
build upon each other, let's cache the previous flat texts.
Reviewed By: quark-zju
Differential Revision: D7126948
fbshipit-source-id: 9d0671c0b1cd6a63a4acecc614b255c4214328bb
Summary:
Previously, in treeonly mode we would ignore any flat manifests that
were received in changegroups (via bundles or pulls, etc). This ended up causing
data loss in practice when people applied old bundles from before the
treemanifest conversion. Instead of just dropping those manifests, let's convert
them on the fly. This may be expensive, but it's better than losing the data.
A future diff may add caching to reuse flat text to speed up applying multiple
deltas.
Reviewed By: quark-zju
Differential Revision: D7083038
fbshipit-source-id: d2e350325d7e9005c8ddd5462034040274f790ff
Summary:
It makes sure that adding new options to the commands won't break them. For
example, rage sparse output was broken, and this diff fixes it.
Note that this changes the behavior of rage: if, say, the smartlog extension is not enabled on the client, then there will be no output for it in the rage. Previously that wasn't the case. I think that's not a big problem, because:
a) The extensions are the most common ones and enabled everywhere
b) If they are disabled, then we'll immediately see a problem in hg rage.
However, I need to change a test to make it pass: I need to add extensions and change the grep output, because `hg rage --preview` has a few lines with `blackbox` in it.
Differential Revision: D7193297
fbshipit-source-id: dde2752ebc7dd3e3edea5c44576d0986f7d18744
Summary:
Add a function that returns command and all the default options already
initialized. It should be used by commands that call other commands. For
example, calling pull inside of update, calling log inside of show etc.
getcmdanddefaultopts has important benefits:
1) It returns "wrapped" command i.e. command with all the overrides applied. On
the other hand, commands.pull doesn't return it.
2) It correctly initializes options to their default value and correctly
changes their name - replace '-' with '_'.
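Point 2 can be illustrated with a small sketch. It assumes options are given as `(short, long, default, help[, label])` tuples, as in a Mercurial command table; the helper name `default_opts` is hypothetical and this is not the body of getcmdanddefaultopts.

```python
def default_opts(options):
    """Build a keyword-argument dict of defaults from option tuples.

    Each option is assumed to be (short, long, default, help[, label]).
    """
    opts = {}
    for opt in options:
        name, default = opt[1], opt[2]
        if isinstance(default, list):
            # Copy list defaults so callers never share mutable state.
            default = list(default)
        # Option names use '-', Python keyword arguments use '_'.
        opts[name.replace("-", "_")] = default
    return opts


table = [
    ("r", "rev", [], "revision"),
    ("", "dry-run", None, "do not perform actions"),
]
print(default_opts(table))
```

A caller could then override only the options it cares about and pass the dict as `**opts` to the wrapped command.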
Reviewed By: ryanmce
Differential Revision: D7193296
fbshipit-source-id: e8673bd4e16aad6156498660f2a7ed788ed2cac3
Summary:
For some reason I thought we could defer both of these checks if IMM, but we can only
really skip the local changes check.
Reviewed By: quark-zju
Differential Revision: D7208076
fbshipit-source-id: 10d1ed50b7d7eadcf66cef4d11185690ccd8d07b
Summary:
1. Variable Length Arrays are not supported by MSVC, but since this is C++ code, we can just use heap allocation
2. Replace `inet` with the portable version
Depends on D7196403
Reviewed By: quark-zju
Differential Revision: D7196605
fbshipit-source-id: a0d88b6e06f255ef648c0b35a99b42ba3bee538a
Summary:
This both fixes semantics and makes `compat.h` a bit more readable.
This became necessary because we migrated from the external `compat.h` to `mercurial/compat.h` in D7064623.
Reviewed By: DurhamG
Differential Revision: D7196403
fbshipit-source-id: 0005cc2f4e58951adfe8f7f795067da728ad64ae
Summary: sed -i without arguments doesn't work on OSX.
Reviewed By: farnz
Differential Revision: D7195193
fbshipit-source-id: a8eead927c94404a37ce5df956de82d29bc1b6a8
Summary:
In D7001328 we added a new feature that skips commits if there are no
changes relative to the sparse checkout. Unfortunately that causes lots of
treepacks downloads and makes bisect unusable. Let's revert the change.
Reviewed By: ryanmce, farnz
Differential Revision: D7182016
fbshipit-source-id: 274b29ca6a7b4c3faf83883b64f5ad3b0289873e
Summary:
This is an alternative fix. But it's generally better to avoid using
a generator in this case.
Reviewed By: DurhamG
Differential Revision: D7189763
fbshipit-source-id: 0697f2b80e8ba0a4da7c538e0701a150386410e5
Summary: The option has been gone from the command decorator, so it has been inaccessible, in addition to being rarely used, unsupported on laptops, and highly Facebook-specific.
Reviewed By: ryanmce
Differential Revision: D7142733
fbshipit-source-id: ee4c833f170e8b8036624ca28cf286e8a0b0cf2d
Summary:
Before this change, `hg pullbackup` did not set correct markers on commits.
This change makes it possible to see which changes have already landed even when restoring a repository from backup.
Before the change, `fbclone` + `hg pullbackup` of a repo with the `C1` commit landed would result in:
```
o o C2
| |
o o C1
| /
|
o
```
after:
```
o o C2
| |
o x C1
| /
|
o
```
Reviewed By: StanislavGlebik
Differential Revision: D7032572
fbshipit-source-id: ffee3c7cc23c24a3df9a89c999c9dd2de226dbff
Summary:
Rewriting a set of commits where there are replacement relationships among the
commits does not have an optimal UX today. For example, `rebase -s A -d Z` or
`metaedit A` in the below graph. B1, B2, C will all be replaced. But the new B1
and B2 replacement won't have the B1 -> B2 relationship, and the "new B1"
appears to be revived.
```
o C
|
x B1 (amended as B2)
|
| o B2
|/
o A o Z
```
One solution is to avoid rebasing `obsolete()::`, as implemented in D7067121
for metaedit. That would result in
```
o C
|
x B1 (amended as B2) o new B2
| |
x A o new A
```
The stack of A, B1, C is forced to break into two parts. This is fine for
power users. But n00b users would wonder why C is left behind. Per discussion
with simpkins at an internal post about the metaedit case, we think a more
linear history is more user-friendly. That is:
```
o new C
|
x new B1 (amended as *new* B2)
|
| o new B2
|/
o new A
```
The stack stays in the same shape.
This diff implements the "copying obsmarkers" behavior at the "createmarkers"
level so everything using that API would get the feature for free, including
metaedit and rebase.
D7067121 is reverted since the new UX is preferred. The test added is for the
`metaedit` command; changes to rebase will be added in a later patch.
Differential Revision: D7121487
fbshipit-source-id: fd3c8a96ab434b131fb86d9882ccbdff8f63f05e
Summary:
Sparse profile files are INI files, and semicolons are the traditional comment line starter.
There are already profiles that use the semicolon as a comment, see diffusion/FBS/browse/master/tools/scm/sparse/fbobjc/sandcastle and diffusion/FBS/browse/master/tools/scm/sparse/fbandroid/sandcastle
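A minimal sketch of the intended behavior, assuming `#` is the comment starter that was already accepted and that comments occupy whole lines; the helper name is hypothetical and the real sparse profile parser is not shown here.

```python
def profile_lines(text):
    """Return the non-empty, non-comment lines of a sparse profile."""
    lines = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blanks and whole-line comments starting with '#' or ';'.
        if not stripped or stripped.startswith(("#", ";")):
            continue
        lines.append(stripped)
    return lines


print(profile_lines("; generated profile\n# note\n[include]\npath/to/dir\n"))
```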
Reviewed By: farnz
Differential Revision: D7181613
fbshipit-source-id: a42171f6bd6213147c6363f8f359f885af38b8af
Summary:
Recently we unified the client and server code paths a bit, which
can cause the treemanifest server to attempt to do a prefetch (which doesn't
make sense since it has no where to prefetch from). It ends up throwing an Abort
error about not having a remote server configured. The fix is to make the
prefetch path smarter about when it's run on the server and to throw a standard
MissingNodesError instead. That kind of error is already handled in the hybrid
repository case and we just eat it and serve the flat manifests like normal.
Once we move to treeonly mode, that error handler will re-raise the exception so
real issues with missing nodes won't be hidden.
Reviewed By: phillco
Differential Revision: D7182283
fbshipit-source-id: 15ed6549d9d7da1fee0570e1fa10338545ed92b1
Summary:
There's a bug where infinitepush attempting to rebundle a bundle that
does not contain trees causes an exception because the server attempts to
prefetch those trees (which fails because there's nowhere to prefetch from).
This diff just adds a test for that case. The next diff will fix it.
Reviewed By: phillco
Differential Revision: D7182284
fbshipit-source-id: a3fbb576cf3318c81b18943e0f0d466aa65e54fb
Summary:
All of the repos that use commit cloud have remotenames extension enabled, so bookprevnode and pushbackbookmarks parameters are not used. Local bookmarks won't be updated after a push.
We remove "bookprevnode" and "pushbackbookmarks" and functions related to them.
Reviewed By: StanislavGlebik
Differential Revision: D7122411
fbshipit-source-id: 0c6b3bc3f41f5b03d4bb2bc297ae35d77c90fedf
Summary:
The feature that automatically converted flat manifests to trees is
dependent on the hg server not sending flat manifests to treeonly clients
(otherwise it's very, very slow). Since the server rpms got reverted, we need to
back out these changes until the server issues are fixed.
Reviewed By: farnz
Differential Revision: D7181025
fbshipit-source-id: 1e4aad04d15909a3ce4f69313419e50c14bc8c19
Summary:
When converting an incoming bundle, rebuilding the flat text every time
is very expensive. Since we're usually converting a series of manifests that
build upon each other, let's cache the previous flat texts.
Reviewed By: quark-zju
Differential Revision: D7126948
fbshipit-source-id: d31442f71b5a13f5afcd54b019c9bbc85f6f889e
Summary:
Previously, in treeonly mode we would ignore any flat manifests that
were received in changegroups (via bundles or pulls, etc). This ended up causing
data loss in practice when people applied old bundles from before the
treemanifest conversion. Instead of just dropping those manifests, let's convert
them on the fly. This may be expensive, but it's better than losing the data.
A future diff may add caching to reuse flat text to speed up applying multiple
deltas.
Reviewed By: quark-zju
Differential Revision: D7083038
fbshipit-source-id: 4912ec5ea5097163cede00158df821f116d92c9b
Summary:
This diff adds a config option to tweak deltabase in changegroup. It has 3
options:
- Always null - always use "null" as delta base, effectively making
everything full text
- No external - delta bases cannot be a revision outside the changegroup
- Default - the current behavior: delta bases can be anything that client
thinks the server should have.
This gives Mononoke more time to bake delta related logic, as we can
choose "always null" first, then incrementally increase the complexity.
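The three policies can be sketched as a single decision function. The policy strings, the function name, and the fallback to the null revision when an external base is rejected are assumptions for illustration; the actual config name and values are not given in this message.

```python
NULLREV = -1  # stand-in for the "null" revision


def choose_deltabase(policy, candidate, revs_in_changegroup):
    """Pick the delta base for a revision under the given policy."""
    if policy == "always-null":
        # Everything is sent as full text.
        return NULLREV
    if policy == "no-external" and candidate not in revs_in_changegroup:
        # Assumed fallback: send full text rather than an outside base.
        return NULLREV
    # Default: trust whatever base the sender picked.
    return candidate


revs = {5, 6, 7}
print(choose_deltabase("no-external", 3, revs))
print(choose_deltabase("default", 3, revs))
```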
Reviewed By: phillco
Differential Revision: D7158585
fbshipit-source-id: 5f6d9a78d1108093e8d08b9f296568f4f7e7471b
Summary:
Verify had some logic that checked the length of the changelog and
manifest to decide if either existed. This allowed for simplifying certain error
messages (like not reporting all the broken changelog manifest pointers if the
manifest was simply gone, and just reporting the manifest was gone).
Unfortunately, in future changelog and manifest implementations len() will be an
expensive function, so let's just get rid of that optimization.
This fixes hg verify in a treeonly repository.
Reviewed By: quark-zju
Differential Revision: D7127168
fbshipit-source-id: 8ddc3dfe3c3c913efd4b7af5fc9715a3e48b60a1
Summary:
Currently if you push or pull a bunch of commits between peers we will
include all the trees as part of the push. If the source repo doesn't have all
the necessary trees, it will go to the server to get them. Since the other
machine can just as easily go to the server (and probably won't need most of
those trees anyway), let's just have the source client send all draft trees and
skip the public commits.
Reviewed By: phillco
Differential Revision: D7141623
fbshipit-source-id: 6d33ae9d4c9cc32bf6dfa76f733c87c06890d719
Summary:
As part of unifying all our pre-pull/push prefetches, let's move the
changegroup-building prefetch into the cansendtrees function. In a future diff
we'll change this logic to not send trees for public commits in a peer to peer
push/pull.
Reviewed By: mjpieters
Differential Revision: D7141625
fbshipit-source-id: 0253fa32993666f3e03c10c98163d8d60370a97c
Summary:
A future diff will make it so we can send only draft trees instead of
all trees. To prepare for this, let's move the cansendtrees logic to
shallowbundle (since it will be used by both shallowbundle and by treemanifest)
and change it to return an enum.
Reviewed By: quark-zju
Differential Revision: D7141624
fbshipit-source-id: 34c78b0d1cdb6f8d86a99fb74665e80b2af12c5c
Summary:
A future diff is going to change what trees are sent during a peer to
peer push. Let's update this test so we can see the actual changes in the next
diff.
Reviewed By: singhsrb
Differential Revision: D7141626
fbshipit-source-id: 75e61e9c417d86c48ed1762d6ab67bd4204f67c7
Summary:
`-r` seems to be unneeded, as tests pass without it.
Also, it does not look like the regex itself uses anything not mentioned in `man re_format` on OSX, so we can just use the non-extended re.
Reviewed By: StanislavGlebik
Differential Revision: D7167503
fbshipit-source-id: 3c5c520e9bf2627523cabc771226fc37dc2e9171
Summary:
This echoes the change in D7056650; there is no need to special-case
treemanifests here; delegation to the manifest .match method allows the
manifest to apply optimisations when available.
Differential Revision: D7100363
fbshipit-source-id: 66a35850a132f804efb407712d2e4db737c10cff
Summary:
The current code iterates over all files in the manifest, filtering against a prefix.
But a manifest supports using a matcher directly, and efficient implementations like the treemanifest will prune the tree to a much smaller subset rapidly based on the path in a matcher. Switching to using a matcher dramatically improves --cwd-list performance in fbsource, when treemanifests are available.
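The difference can be shown with a toy model: a flat scan tests every file, while a tree walk skips whole directories that cannot contain a match. The nested-dict tree and both function names are assumptions for this sketch; they are not the manifest or matcher API.

```python
def walk_flat(files, prefix):
    """Filter every file path against the prefix (touches all entries)."""
    return sorted(f for f in files if f.startswith(prefix))


def walk_tree(tree, prefix, path=""):
    """Walk a nested dict (dict = directory), pruning unrelated directories."""
    results = []
    for name, node in sorted(tree.items()):
        full = path + name
        if isinstance(node, dict):
            subdir = full + "/"
            # Descend only if this directory is inside the prefix
            # or lies on the way to it.
            if subdir.startswith(prefix) or prefix.startswith(subdir):
                results.extend(walk_tree(node, prefix, subdir))
        elif full.startswith(prefix):
            results.append(full)
    return results


tree = {"a": {"b": {"x.txt": 1, "y.txt": 1}, "c": {"z.txt": 1}}, "d": {"w.txt": 1}}
print(walk_tree(tree, "a/b/"))
```

In this toy tree the walk never opens `a/c/` or `d/`, which is the kind of pruning that pays off in a very large repository.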
Reviewed By: quark-zju
Differential Revision: D7056650
fbshipit-source-id: 2bf62ea93680323a49c9282266118805881d7b02
Summary:
During big `hg update` calls, the user will often see no progress
initially while we discover which blobs we need to download. Let's fix that.
Reviewed By: quark-zju
Differential Revision: D6903134
fbshipit-source-id: 35b174120b6dce412dd337b6b93c9f5b4233522d
Summary:
For large updates, the dirstate update can take a while. Let's show
progress so the user understands what is happening and how long to wait.
Reviewed By: quark-zju
Differential Revision: D6903133
fbshipit-source-id: f7f6c3c14e1d3221a383da4a6e311aa12a8d3a98
Summary: `advice` should default to '' if not set, because concatenating `None` and a string will crash. Also set the config everywhere.
Reviewed By: markbt, quark-zju
Differential Revision: D7151898
fbshipit-source-id: 8243267c379da13e293f8e4b2d3cd1976bafbf9d
Summary:
Pass the BatchMode option to the SSH call only when pushbackup runs in the background.
Also fixed dummyssh to skip options before the hostname, and added a unit test.
Reviewed By: quark-zju
Differential Revision: D7119123
fbshipit-source-id: 2c8e66fee44cca5b23389cba8e21e3a0b237268e
Summary:
Smartlog is supposed to show the latest public ancestor of all draft commits,
however this doesn't always happen.
The reason is a boundary error in the test for finding public commits. If the
latest public ancestor is also the common ancestor (fairly normal), then it
will be excluded.
Reviewed By: quark-zju
Differential Revision: D7140139
fbshipit-source-id: 6999f7ad14f86653ebe4d4f6543b9c7533871cf2
Summary:
Let's switch to xdiff for its better diff quality and performance!
The test changes demonstrate xdiff's better diff quality.
Reviewed By: ryanmce
Differential Revision: D7135206
fbshipit-source-id: 1775df6fc0f763df074b4f52779835d6ef0f3a4e
Summary:
The next test is going to switch bdiff to xdiff. This diff adds related tests
so we can clearly see how xdiff improves the diff quality.
It also solves [issue5091](https://bz.mercurial-scm.org/5091) because xdiff
will shift hunks up and down to group them together. So that was also added as
a test. Although in a more complex case where the hunks are separated by some
common lines (ex. "Y"), xdiff won't help either.
Reviewed By: ryanmce
Differential Revision: D7147444
fbshipit-source-id: 3605290b5dfdfc7b8b004b38c7f7ee9534915380
Summary:
Add a "boring" threshold to limit the search range of the indention heuristic,
so the performance of the diff algorithm is mostly unaffected by turning on
indention heuristic.
Reviewed By: ryanmce
Differential Revision: D7145002
fbshipit-source-id: 024ec685f96aa617fb7da141f38fa4e12c4c0fc9
Summary:
Enable the indent heuristic feature, since it provides nice visual
improvements for a wide range of cases. See the added test, and [1].
The only downside is it can slow things down. In a crafted case, this could
make `--indent-heuristic` several times slower than `--no-indent-heuristic`.
```
open('a', 'w').write(" \n" * 1000000)
open('b', 'w').write(" \n" * 1000001)
```
```
git diff --no-indent-heuristic a b 0.21s user 0.03s system 100% cpu 0.239 total
git diff --indent-heuristic a b 0.77s user 0.02s system 99% cpu 0.785 total
```
[1]: 433860f3d0
Reviewed By: ryanmce
Differential Revision: D7135452
fbshipit-source-id: 019b7e89225f288bba0a1d042591b13b5419ad0e
Summary:
Implement a `mercurial.cext.xdiff` module that exposes the xdiff algorithm.
`xdiff.blocks` should be a drop-in replacement for `bdiff.blocks`.
In theory we can change the pure C version of `bdiff.c` directly. However
that means we lose bdiff entirely. It seems more flexible to have both at
the same time so they can be easily switched via Python code. Hence the
Python module approach.
Reviewed By: ryanmce
Differential Revision: D7135205
fbshipit-source-id: 48cd3b5be7fd5ef41b64eab6c76a5c8a6ce99e05
Summary:
xdiff generates hunks for the differences (ex. question marks in the
`@@ -?,? +?,? @@` part from `diff --git` output). However, bdiff generates
matched hunks instead.
This patch adds a `XDL_EMIT_BDIFFHUNK` flag used by the output function
`xdl_call_hunk_func`. Once set, xdiff will generate bdiff-like hunks
instead. That makes it easier to use xdiff as a drop-in replacement of bdiff.
Note that since `bdiff('', '')` returns `[(0, 0, 0, 0)]`, the shortcut path
`if (xscr)` is removed. I have checked functions called with `xscr` argument
(`xdl_mark_ignorable`, `xdl_call_hunk_func`, `xdl_emit_diff`,
`xdl_free_script`) work just fine with `xscr = NULL`.
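The relationship between the two hunk styles can be sketched in Python. The trailing zero-length block is inferred from the `bdiff('', '')` result quoted above; skipping empty matches and the function name are assumptions, and this is not the C code added to xdiff.

```python
def matched_blocks(diff_hunks, alen, blen):
    """Convert difference hunks into bdiff-style matching blocks.

    diff_hunks: sorted (a1, a2, b1, b2) tuples meaning a[a1:a2] was
    replaced by b[b1:b2]. alen and blen are the line counts of a and b.
    """
    blocks = []
    apos = bpos = 0
    for a1, a2, b1, b2 in diff_hunks:
        if a1 > apos:
            # Lines between the previous hunk and this one are unchanged.
            blocks.append((apos, a1, bpos, b1))
        apos, bpos = a2, b2
    if alen > apos:
        blocks.append((apos, alen, bpos, blen))
    # Always end with a zero-length block at the end of both inputs.
    blocks.append((alen, alen, blen, blen))
    return blocks


# a = [a, b, c, d], b = [a, x, c, d]: only line 1 differs.
print(matched_blocks([(1, 2, 1, 2)], 4, 4))
print(matched_blocks([], 0, 0))
```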
Reviewed By: ryanmce
Differential Revision: D7135207
fbshipit-source-id: cfb8c363e586841c06c94af283c7f014ba65fcc0
Summary:
Add a simple binary that runs xdiff in a minimal way. This is mainly for
exposing xdiff logic so it can be used from the command line for testing purposes.
It also serves as an example of how to use xdiff.
Reviewed By: ryanmce
Differential Revision: D7133531
fbshipit-source-id: ceb608f5754b61eaa95804730b3c89643ff1837b
Summary:
Patience diff is the normal diff algorithm, plus some greediness that
unconditionally matches common unique lines. That means it is easy to
construct cases that make it generate suboptimal results, like:
```
open('a', 'w').write('\n'.join(list('a' + 'x' * 300 + 'u' + 'x' * 700 + 'a\n')))
open('b', 'w').write('\n'.join(list('b' + 'x' * 700 + 'u' + 'x' * 300 + 'b\n')))
```
Patience diff has been advertised as being able to generate better results for
some C code changes. However, the more scientific way to do that is the
indention heuristic [1].
Since patience diff could generate suboptimal results more easily and its
"better" diff feature could be replaced by the new indention heuristic, let's
just remove it and its variant histogram diff to simplify the code.
[1]: 433860f3d0
Reviewed By: ryanmce
Differential Revision: D7124711
fbshipit-source-id: 127e8de6c75d0262687a1b60814813e660aae3da