Summary:
When we upgraded from structopt 0.2 to 0.3 in D17138630, we had to put a no_version on all of our structopt data structures or else structopt would fail to compile with the error:
```
`CARGO_PKG_VERSION` environment variable is not defined, use `version = "version"` to
set it manually or `no_version` to not set it at all
```
Fbcode isn't using Cargo so there wasn't a CARGO_PKG_VERSION associated with each cli.
I raised with the maintainers in [TeXitoi/structopt#243](https://github.com/TeXitoi/structopt/issues/243) that requiring no_version attributes everywhere is annoying, and it's been fixed in structopt 0.3.1. This diff updates to 0.3.1 and removes 42 no_version attributes from fbcode.
Reviewed By: bolinfest
Differential Revision: D17231406
fbshipit-source-id: 2bab2864afbf23b34ea5d73884462455d863c139
Summary:
Add benchmark for the newly added mincode serialization. An example run shows:
serialize by cbor 32.573 ms 9.701 MB
deserialize by cbor 79.471 ms
serialize by cbor-packed 28.574 ms 7.801 MB
deserialize by cbor-packed 73.337 ms
serialize by bincode 25.175 ms 6.789 MB
deserialize by bincode 23.193 ms
serialize by mincode 19.687 ms 5.389 MB
deserialize by mincode 24.852 ms
serialize by handwritten 2.939 ms 5.389 MB
deserialize by handwritten 7.963 ms
serialize by abomonation 1.752 ms 10.389 MB
deserialize by abomonation 6.060 ms
Interesting facts:
- mincode serialization is actually faster than bincode (it would appear
slower if the vec was not preallocated).
- mincode is much slower than handwritten. This is partially caused by serde
not having a native "fixed array" type, so it cannot do `write_all(&[u8; 20])`
but has to `write_u8(x)` 20 times (which translates to
`Vec::extend_from_slice` 20 times).
Regardless, the main reason I think mincode is compelling is its compactness
and relatively good performance. Although handwritten is the fastest, the
mincode performance is fine in the commit storage usecase, since mincode is
probably not the bottleneck.
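The fixed-array point above can be illustrated with a minimal sketch. The two helper names are hypothetical and are not part of mincode or the benchmark; they only contrast one bulk write with twenty single-byte writes into a preallocated `Vec<u8>`.

```rust
use std::io::{self, Write};

/// Hypothetical helper: writes a 20-byte hash with a single call,
/// which is what handwritten code can do.
fn write_hash_bulk(out: &mut Vec<u8>, hash: &[u8; 20]) -> io::Result<()> {
    out.write_all(hash)
}

/// Hypothetical helper: writes the same hash one byte at a time, roughly
/// what a serializer does when it sees the array element by element.
fn write_hash_bytewise(out: &mut Vec<u8>, hash: &[u8; 20]) -> io::Result<()> {
    for &byte in hash.iter() {
        // Each call appends a 1-byte slice to the Vec.
        out.write_all(&[byte])?;
    }
    Ok(())
}
```

Both functions produce identical bytes; the difference is only in the number of calls made on the output buffer.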
Reviewed By: kulshrax
Differential Revision: D17087352
fbshipit-source-id: 820ff8538d3ab9ebef2eda0a40cad126e26db622
Summary:
The existing API is a bit harder to use - the caller must provide a
`&mut Vec<u8>`. Rename that to `serialize_into` and provide a `serialize` that
creates the `Vec<u8>` automatically. This is more consistent with the `bincode`
API interface.
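A rough sketch of the resulting API shape, assuming a toy `Encode` trait in place of serde and assuming the buffer comes first in `serialize_into` (the real signatures may differ):

```rust
/// Hypothetical stand-in for a serializable value; the real crate uses serde.
trait Encode {
    fn encode(&self, out: &mut Vec<u8>);
}

impl Encode for u8 {
    fn encode(&self, out: &mut Vec<u8>) {
        out.push(*self);
    }
}

impl<T: Encode> Encode for [T] {
    fn encode(&self, out: &mut Vec<u8>) {
        for item in self {
            item.encode(out);
        }
    }
}

/// Appends the encoding of `value` to a caller-provided buffer.
fn serialize_into<T: Encode + ?Sized>(out: &mut Vec<u8>, value: &T) {
    value.encode(out);
}

/// Convenience wrapper: allocates the buffer and returns it.
fn serialize<T: Encode + ?Sized>(value: &T) -> Vec<u8> {
    let mut out = Vec::new();
    serialize_into(&mut out, value);
    out
}
```

Callers that want to reuse or preallocate a buffer keep using `serialize_into`; everyone else can call `serialize`.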
Reviewed By: kulshrax
Differential Revision: D17087349
fbshipit-source-id: 0307b85818ab0f7549a264b41a08e53f7573372e
Summary: Add a basic test to check that mincode works at all.
Reviewed By: kulshrax
Differential Revision: D17087350
fbshipit-source-id: efa5d18e579688521e91fe28ee67f1d7f7c15558
Summary:
This completes the mincode implementation.
I later learned wez has published a very similar crate at
https://github.com/wez/varbincode. Great minds think alike.
Benchmark with varbincode shows this crate is a little bit faster
(varbincode is not in tp2 so we cannot easily add it to benchmark now).
Reviewed By: kulshrax
Differential Revision: D17087351
fbshipit-source-id: 179392029e7b34d7cb92f00b7926ee6be4722d09
Summary:
Initialize the mincode library using serde-rs/bench. See the previous diff for
context.
Reviewed By: kulshrax
Differential Revision: D17087354
fbshipit-source-id: 6e9b6376b8aaec61e85a76320933b63d0ba1b698
Summary:
I'd like to have a mincode-like serialization format that is really simple:
vlqencoding for integers and use bincode-like strategy for the rest.
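For context, a minimal sketch of variable-length integer encoding. It assumes 7-bit groups, least significant group first, with the high bit marking continuation; the actual `vlqencoding` crate may use a different byte order or API, and the function names here are hypothetical.

```rust
/// Encodes `value` in 7-bit groups, least significant group first
/// (an assumption). The high bit of a byte is set when more bytes follow.
fn write_vlq(out: &mut Vec<u8>, mut value: u64) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Decodes one integer from the front of `input`. Returns the value and the
/// number of bytes consumed, or `None` if the input is truncated or the
/// value does not fit in a `u64`.
fn read_vlq(input: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in input.iter().enumerate() {
        let shift = 7 * i as u32;
        if shift >= 64 {
            return None;
        }
        let group = (byte & 0x7f) as u64;
        // At shift 63 only the lowest bit of the group still fits.
        if shift == 63 && group > 1 {
            return None;
        }
        value |= group << shift;
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```

Small integers take one byte, which is where the size advantage over fixed-width integers comes from.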
There is a mincode crate, but it does not compile due to extra dependencies and
an out-of-date serde. It has extra features like bitpack and bitvec, which are less
interesting in our commit serialization use case.
The serde API needs a lot of boilerplate since it supports 20+ types. To make
it easier, I'm importing dtolnay's [serde-rs/bench](9718508c30) as a minimal
starting point that we can customize. serde-rs/bench is similar to bincode but
much smaller:
bincode:
333 config.rs
486 de/mod.rs
185 de/read.rs
115 error.rs
173 internal.rs
173 lib.rs
788 ser/mod.rs
2253 total
mincode:
57 bitpack.rs
84 bitvec.rs
95 float.rs
90 lib.rs
434 refbox.rs
103 rustc_serialize/mod.rs
427 rustc_serialize/reader.rs
460 rustc_serialize/writer.rs
123 serde/mod.rs
661 serde/reader.rs
682 serde/writer.rs
3216 total
serde-rs/bench:
407 de.rs
60 error.rs
25 lib.rs
442 ser.rs
934 total
The following (unrelated) files are excluded in this import:
README.md
benches/bincode.rs
tests/bincode.rs
.gitignore
.travis.yml
The code does not compile internally due to mismatched `bincode` version. It
will be fixed in the next diff.
Reviewed By: dtolnay
Differential Revision: D17087353
fbshipit-source-id: 26329b36e1488c0c6149287f1f2dcd89acd15b0b
Summary:
The `dag` crate provides low-level primitives for DAG queries. To be able to
replace the revlog-based changelog, we need to store actual commit data for
draft commits. That requires some serialization logic.
Until now, I have been writing serialization logic by hand for raw performance.
Given that the commit object might be a bit more complex, it seems like
a good idea to try a serialization framework. This diff adds benchmarks
for common choices to get a sense of how expensive serialization is.
A pre-allocated Vec was used to avoid benchmarking `realloc` performance.
A sample run outputs:
serialize by cbor 30.838 ms 9.701 MB
deserialize by cbor 78.422 ms
serialize by cbor-packed 29.122 ms 7.801 MB
deserialize by cbor-packed 72.826 ms
serialize by bincode 23.120 ms 6.789 MB
deserialize by bincode 23.184 ms
serialize by handwritten 2.797 ms 5.389 MB
deserialize by handwritten 8.238 ms
serialize by abomonation 1.644 ms 10.389 MB
deserialize by abomonation 6.033 ms
The `handwritten` format is designed to be compact, similar to mincode [1].
Some interesting facts:
- bincode is more compact than cbor-packed/rmps-packed, although bincode does
not have compact integers. This is likely because cbor/msgpack needs to
encode "type" information and bincode doesn't.
- Handwritten is pretty close to abomonation (basically memcpy).
- Handwritten (mincode) is the most compact format.
[1]: https://github.com/Boscop/mincode
Reviewed By: kulshrax
Differential Revision: D17087348
fbshipit-source-id: 52252aae0d4e44bb326b4dc52f3be767ac04d26c
Summary:
This can be done by using SpanSet::iter. But the `min`, `max` code is easier to
read.
Benchmark does not show significant change.
Reviewed By: markbt
Differential Revision: D17065952
fbshipit-source-id: 9ed8352ceb25499143931bd890e694de475ee9d2
Summary: This will be used in a later change.
Reviewed By: markbt
Differential Revision: D17065955
fbshipit-source-id: 562ee050cf308f3740de123e77f81a29c35089bd
Summary:
Previously, `SpanSet::from_spans` was one of the few ways to get a non-compact
SpanSet (e.g. one containing spans that can be merged). This diff fixes that.
There is no usage of `SpanSet::from_spans` for large sets, so the benchmark
does not change.
Since checking for overlapped spans now takes extra work anyway, overlapped
spans are now merged automatically. This makes `SpanSet::from_spans` panic-free.
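A minimal sketch of the merge step, using a hypothetical inclusive `Span` type sorted in ascending order. The real `dag` types and ordering may differ, and the sketch assumes `low <= high` for every input span.

```rust
/// Hypothetical inclusive span `low..=high`; not the real `dag` type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Span {
    low: u64,
    high: u64,
}

/// Sorts spans by `low` and merges any that overlap or touch,
/// so the result is compact. Never panics on overlapping input.
fn merge_spans(mut spans: Vec<Span>) -> Vec<Span> {
    spans.sort_by_key(|s| s.low);
    let mut merged: Vec<Span> = Vec::with_capacity(spans.len());
    for span in spans {
        if let Some(last) = merged.last_mut() {
            // Overlapping or adjacent: extend the previous span.
            if span.low <= last.high.saturating_add(1) {
                last.high = last.high.max(span.high);
                continue;
            }
        }
        merged.push(span);
    }
    merged
}
```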
Reviewed By: markbt
Differential Revision: D17065954
fbshipit-source-id: 38947852d9a5a06cee1eabb05b81b410bff755f9
Summary:
There are cases where we already know the spans are sorted. Provide an API for that.
This affects some of the future changes.
Reviewed By: markbt
Differential Revision: D17065953
fbshipit-source-id: 350844d7c1770d4a9c70a5d7f41d6b20d17d9757
Summary:
Implement a friendlier SpanSet::Debug and use it in tests.
This makes the test code shorter and easier to read.
Reviewed By: markbt
Differential Revision: D16988045
fbshipit-source-id: 3ac51dd5b525de03c406f7fd138d96b2b2e8e5b0
Summary: Also remove debug commands that are strictly related to remote lfs.
Reviewed By: markbt
Differential Revision: D17184902
fbshipit-source-id: da38a2150212500bab62191ddcfab0990276605e
Summary:
Instead of using the lfs remote storage, we chose to send the snapshot metadata via bundles.
Snapshot metadata consists of the actual metadata blob plus several other blobs (untracked files etc).
If we have several snapshot revisions in a single bundle, the blobs could repeat.
So we store each blob as a separate entry in a binary stream, keeping its id and contents.
Here we introduce a new bundle part type `"b2x:snapshotmetadataparttype"`.
```
1 byte of version info
[ # a list of binary entries, each corresponds to a separate file
# (either a metadata file itself or a related -- externally stored -- file)
<oid><length><data>
:oid: is a 64char string with the hash of the file
:length: is an unsigned int with length of the data
:data: is binary data of length <length>, the actual file contents
]
```
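The layout above could be read and written roughly as follows. This is only a sketch: it assumes `length` is a 4-byte big-endian unsigned integer and that the oid is 64 ASCII characters, neither of which is fixed by the description, and the function names are hypothetical.

```rust
use std::convert::{TryFrom, TryInto};

const OID_LEN: usize = 64;

/// Serializes entries as: 1 version byte, then `<oid><length><data>` repeated.
/// Assumption: `length` is a 4-byte big-endian unsigned integer.
fn encode_part(version: u8, entries: &[(String, Vec<u8>)]) -> Option<Vec<u8>> {
    let mut out = vec![version];
    for (oid, data) in entries {
        if oid.len() != OID_LEN {
            return None;
        }
        let len = u32::try_from(data.len()).ok()?;
        out.extend_from_slice(oid.as_bytes());
        out.extend_from_slice(&len.to_be_bytes());
        out.extend_from_slice(data);
    }
    Some(out)
}

/// Parses the same layout back; returns `None` on truncated or invalid input.
fn decode_part(input: &[u8]) -> Option<(u8, Vec<(String, Vec<u8>)>)> {
    let (&version, mut input) = input.split_first()?;
    let mut entries = Vec::new();
    while !input.is_empty() {
        if input.len() < OID_LEN + 4 {
            return None;
        }
        let (oid, rest) = input.split_at(OID_LEN);
        let (len_bytes, rest) = rest.split_at(4);
        let len_bytes: [u8; 4] = len_bytes.try_into().ok()?;
        let len = u32::from_be_bytes(len_bytes) as usize;
        if rest.len() < len {
            return None;
        }
        let (data, rest) = rest.split_at(len);
        entries.push((String::from_utf8(oid.to_vec()).ok()?, data.to_vec()));
        input = rest;
    }
    Some((version, entries))
}
```

Because each entry carries its own oid, a receiver can skip blobs it has already seen when several snapshots in one bundle share them.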
The exact serialization format is still under discussion.
The current state is in [the quip doc](https://fb.quip.com/R5OVAzabX8oo).
Reviewed By: markbt
Differential Revision: D17184222
fbshipit-source-id: 90f833ec71556e90d513e3be3f3efa7f870b037d
Summary: That way it'll be easier to pack it into a blob.
Reviewed By: markbt
Differential Revision: D17183018
fbshipit-source-id: 44e21103f201aafb6f417a5b5a7b3d4735f32039
Summary: In the next diff I will replace remote lfs with bundle2
Reviewed By: markbt
Differential Revision: D17132405
fbshipit-source-id: a0dfff3ebad067abb0231cf31de08ae62affe7ce
Summary:
The `debugmutation` command uses the unfiltered repo to resolve the
user-provided revs. It shouldn't do this unless the user passes `--hidden`.
Reviewed By: mitrandir77
Differential Revision: D17156722
fbshipit-source-id: 5ab7704acc598cf8b7c1640a3096ba0ce6ac73e9
Summary:
Update the debugmutation format to collapse long chains. Add
`hg debugmutation -s` to follow the successor relationship (printing what the
commit became) instead of the predecessor relationship.
Reviewed By: mitrandir77
Differential Revision: D17156463
fbshipit-source-id: 44a68692e3bdea6a2fe2ef9f53b533199136eab1
Summary:
If treemanifest finds there are too many shared packs (more than
`packs.maxpackfilecount`) then it will purge them. This is a shame if there is
currently a repack in progress, as it will purge the packfiles from underneath
the repack, deleting lots of cache data that will be imminently repacked.
Skip the purge if there is a repack ongoing.
Reviewed By: mitrandir77
Differential Revision: D17155854
fbshipit-source-id: 20d46f29c252e508177b1fde08ca7a69841dcd7e
Summary:
Instead of hardcoding `--target <target>` and `$2` in the script,
let's just use environment variables so `target` and other things have explicit
names.
Explicitly set `REAL_CWD` so the script can learn the current directory before
it gets reset to the repo root.
Reviewed By: xavierd
Differential Revision: D17213186
fbshipit-source-id: 6a4fc4cf2cbf6e2c623400bc6bc13f7758a46c49
Summary: Let's consolidate these 2 to allow easy switching to the Rust based one.
Reviewed By: quark-zju
Differential Revision: D17187154
fbshipit-source-id: 5ccadabac2e2e4b684ca44917f1502e9a05d41d6
Summary:
While a corrupted packfile can be safely removed from the shared hgcache, the
same isn't true for a local packfile. When building the packstore, let's allow
the behavior on a corrupted packfile to be chosen. This is deliberately made an
explicit argument to the DataPackStore and HistoryPackStore constructors so the
caller has to take this into account.
Reviewed By: quark-zju
Differential Revision: D17187155
fbshipit-source-id: 658fce401f8902a74cfd92780013d1b96e20a590
Summary:
This makes metaedit support `-M / --reuse-message`, which I found handy when
rewriting prototype commits to formal commits.
Reviewed By: xavierd
Differential Revision: D17168991
fbshipit-source-id: fa768a2916ea3ef4db4c31a48989d10897379e92
Summary:
This just moves `-M / --reuse-message` handling from the `commit` command
to a lower layer, making it more reusable.
Reviewed By: xavierd
Differential Revision: D17168992
fbshipit-source-id: 4fe7e93ceae45eff281214dcff03ef3f9ee0c898
Summary:
Change `cmdutil.logmessage` to take a `repo` instead of `ui`. This makes the
next change easier.
Reviewed By: xavierd
Differential Revision: D17168990
fbshipit-source-id: 47c1707e5a9dbf06d07452b4c400903453992379
Summary:
I have seen multiple user complaints about slow hg commands that turned out to
be fsmonitor scanning the whole working copy. Print a warning in those cases.
Hopefully this can reduce our oncall burden a bit.
Reviewed By: xavierd
Differential Revision: D17170520
fbshipit-source-id: 8fd5721d123853136c84229d936c3e0c999f3d87
Summary:
I found this very slow. Log it so we can have some ideas about how long it
takes for others.
Reviewed By: xavierd
Differential Revision: D17066510
fbshipit-source-id: 3f9de9b816bcd2d062beb44bc03ea4114d829596
Summary:
`findrecenttrees` can take very long in my case - it only tests 1280 trees in
30 seconds. Log it so we can get some ideas about how long it takes.
As we're here, teach `util.timefunction` to figure out the `ui` object
automatically.
Reviewed By: xavierd
Differential Revision: D17066278
fbshipit-source-id: 7e59c8683359a7ce8d4e87fde92af36b95d37b2f
Summary:
Make sigtrace use smarttraceback so it prints more context.
As we're here, also make it print to stderr so we don't need to find the
traceback from /tmp.
Reviewed By: xavierd
Differential Revision: D17066277
fbshipit-source-id: 9a1000803fed27a71ec381b8ddbd76400dae99c9
Summary: It will be used by snapshot extension too.
Reviewed By: markbt
Differential Revision: D17132134
fbshipit-source-id: 6c9fc285e0f1eb445bfa0abe0b6f4de4a1bd1db0
Summary:
Blob vfs will now be in core Mercurial. It will be used by the snapshot
extension too.
Reviewed By: quark-zju
Differential Revision: D17112671
fbshipit-source-id: e721749d27db37f55bb9eb6af3ea042e8036ddfa
Summary: This brings the Rust based PackStore on par with the Python implementation.
Reviewed By: quark-zju
Differential Revision: D17082190
fbshipit-source-id: 8cf925c3d6136c0ba586e7578a2b90b7f39192e9
Summary:
Instead of adding more arguments to the PackStore constructor, let's use the
Builder pattern.
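A minimal sketch of what such a builder could look like. All names and fields here are hypothetical illustrations (including the 10 second default), not the actual PackStore API; the corruption policy is kept as a required argument so the caller has to choose it explicitly.

```rust
use std::path::{Path, PathBuf};
use std::time::Duration;

/// Hypothetical policy for corrupted packfiles.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CorruptionPolicy {
    Ignore,
    Remove,
}

/// Hypothetical store; the fields are illustrative only.
#[derive(Debug)]
struct PackStore {
    pack_dir: PathBuf,
    scan_frequency: Duration,
    corruption_policy: CorruptionPolicy,
}

struct PackStoreBuilder {
    pack_dir: PathBuf,
    scan_frequency: Duration,
    corruption_policy: CorruptionPolicy,
}

impl PackStoreBuilder {
    /// Required settings go through `new`; optional ones get defaults.
    fn new(pack_dir: impl AsRef<Path>, corruption_policy: CorruptionPolicy) -> Self {
        PackStoreBuilder {
            pack_dir: pack_dir.as_ref().to_path_buf(),
            corruption_policy,
            // Arbitrary default chosen for this sketch.
            scan_frequency: Duration::from_secs(10),
        }
    }

    /// Optional setting: overrides the default scan frequency.
    fn scan_frequency(mut self, scan_frequency: Duration) -> Self {
        self.scan_frequency = scan_frequency;
        self
    }

    fn build(self) -> PackStore {
        PackStore {
            pack_dir: self.pack_dir,
            scan_frequency: self.scan_frequency,
            corruption_policy: self.corruption_policy,
        }
    }
}
```

New optional settings then become new builder methods rather than extra constructor arguments, so existing call sites stay unchanged.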
Reviewed By: quark-zju
Differential Revision: D17082191
fbshipit-source-id: b9e31251c0fa942e76339d568049c1b75226cb88
Summary:
When created, the PackStore is empty as no disk scan is performed. As
get_missing is one of the first APIs to be called on the store when a prefetch
is needed, that means that get_missing always pretends that everything is
missing, which isn't true.
Since we don't want to always rescan on get_missing, let's respect the scanning
frequency and then figure out what's missing.
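A minimal sketch of that behaviour, assuming a hypothetical store where the disk scan is modelled as a closure returning the set of known keys. The struct and method names are illustrative, not the actual implementation.

```rust
use std::collections::HashSet;
use std::time::{Duration, Instant};

/// Hypothetical store; `scan` stands in for the real disk scan.
struct PackStore<F: FnMut() -> HashSet<String>> {
    scan: F,
    scan_frequency: Duration,
    last_scan: Option<Instant>,
    known: HashSet<String>,
}

impl<F: FnMut() -> HashSet<String>> PackStore<F> {
    fn new(scan: F, scan_frequency: Duration) -> Self {
        PackStore {
            scan,
            scan_frequency,
            last_scan: None,
            known: HashSet::new(),
        }
    }

    /// Rescans on first use, or when the last scan is older than
    /// `scan_frequency`.
    fn rescan_if_due(&mut self) {
        let due = match self.last_scan {
            None => true,
            Some(at) => at.elapsed() >= self.scan_frequency,
        };
        if due {
            self.known = (self.scan)();
            self.last_scan = Some(Instant::now());
        }
    }

    /// Returns the keys that are not present after an (optional) rescan.
    fn get_missing(&mut self, keys: &[&str]) -> Vec<String> {
        self.rescan_if_due();
        keys.iter()
            .filter(|key| !self.known.contains(**key))
            .map(|key| (*key).to_owned())
            .collect()
    }
}
```

The first call always scans, so a freshly created store no longer reports everything as missing, while repeated calls within the scan window reuse the cached view.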
Reviewed By: quark-zju
Differential Revision: D16794077
fbshipit-source-id: 460fda8f118c1b36e5d5b29472e8a06ef0754ec9
Summary:
This allows `failure` to figure out a traceback with meaningful numbers.
For example, D16905460 v5 regressed on `hg --version`:
./hg --version
abort: invalid arguments
(use '--help' to get help)
Its backtrace without debug info looks like:
0: failure::backtrace::internal::InternalBacktrace::new::h3a5ad050e0746635 (0x561926a6f9b0)
1: <failure::backtrace::Backtrace as core::default::Default>::default::he69621ea9abcbf0e (0x561926a6fb70)
2: <clidispatch::global_flags::HgGlobalOpts as core::convert::TryFrom<cliparser::parser::ParseOutput>>::try_from::h4f631f28df465c27 (0x561926a44da6)
3: clidispatch::dispatch::dispatch::ha7ec35fc3ba053da (0x561926a27371)
4: hgcommands::run::run_command::h129879a16627d8ea (0x5619269dd85d)
5: hgmain::main::h1dcaed9591328b6f (0x5619269d8dc5)
`TryFrom` was used twice in `dispatch` and it's unclear which one is causing the issue.
Unfortunately this increases hg from 4MB (1MB gzipped) to 30MB (7MB gzipped).
Ideally we can limit debuginfo to only our crates, not 3rd party ones (ex.
regex, failure). The feature exists in cargo nightly [1].
[1]: https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#profile-overrides
Reviewed By: xavierd
Differential Revision: D17097798
fbshipit-source-id: 20da9c7621e66ead4f60365769650db3319f68e1
Summary:
The Rust manifest does not allow for directories to be added if files with
the same name exist.
Reviewed By: quark-zju
Differential Revision: D17143550
fbshipit-source-id: fe2533b6f0c049d7b22f2fbb49f3e04959aea39c
Summary:
It is somewhat difficult to fetch the raw entry on the p1 side in the Rust
Manifests. These entries are used to write deltas to revlogs or to datapacks.
Reviewed By: xavierd
Differential Revision: D17143551
fbshipit-source-id: 6624116324664354d199d5f6ac55712c8ed29b9d
Summary:
The Rust code is almost at parity with the Python code, let's expose it to
Python.
Reviewed By: quark-zju
Differential Revision: D16794076
fbshipit-source-id: faf1da775b4e57328be62a06d0065c7becf1b9f4
Summary:
In some rare situations, the packs directory may not exist. The PackStore
shouldn't fail in these cases; it should just act as if nothing is present in
it.
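One way to get that behaviour, sketched with a hypothetical `list_packs` helper (not the actual PackStore code): map a `NotFound` error from `read_dir` to an empty listing and propagate everything else.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: lists the entries of `dir`, treating a missing
/// directory as empty. Other I/O errors are still reported.
fn list_packs(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let entries = match fs::read_dir(dir) {
        Ok(entries) => entries,
        Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(Vec::new()),
        Err(err) => return Err(err),
    };
    let mut packs = Vec::new();
    for entry in entries {
        packs.push(entry?.path());
    }
    Ok(packs)
}
```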
Reviewed By: quark-zju
Differential Revision: D16794079
fbshipit-source-id: e33bd55e0f378b9be58831a85ec822221af7a9bc
Summary:
The LruStore keeps the packfiles in a somewhat ordered manner to reduce the
cost of finding an object in the stores. For now, its implementation is very
basic as it just moves the store where data was found to the front.
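A minimal move-to-front sketch, assuming a hypothetical generic `LruStore` backed by a `VecDeque`. The real type and its lookup API may look different.

```rust
use std::collections::VecDeque;

/// Hypothetical move-to-front container of stores.
struct LruStore<S> {
    stores: VecDeque<S>,
}

impl<S> LruStore<S> {
    fn new(stores: impl IntoIterator<Item = S>) -> Self {
        LruStore {
            stores: stores.into_iter().collect(),
        }
    }

    /// Runs `lookup` on each store, front to back. On the first hit,
    /// moves that store to the front and returns the result.
    fn find<T>(&mut self, mut lookup: impl FnMut(&S) -> Option<T>) -> Option<T> {
        for index in 0..self.stores.len() {
            if let Some(found) = lookup(&self.stores[index]) {
                if let Some(store) = self.stores.remove(index) {
                    self.stores.push_front(store);
                }
                return Some(found);
            }
        }
        None
    }
}
```

Stores that were hit recently are probed first on the next lookup, which helps when consecutive lookups tend to land in the same packfile.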
Reviewed By: quark-zju
Differential Revision: D16746800
fbshipit-source-id: 67375e6ab8a4d9e54da9a9bb4af5d95061446e6f
Summary:
In general, mutation tracking doesn't care about divergence. However, in the
case of rebase, it doesn't make sense to allow divergence to occur if we can
avoid it by omitting some of the commits to rebase.
This makes rebase behaviour more like old obsmarker-based behaviour. This
breaks a test for mutation copying markers, so update those to use metaedit,
which has the copying behaviour for both obsmarkers and mutation. At some
point we should make rebase behave better in these cases.
Reviewed By: quark-zju
Differential Revision: D17136480
fbshipit-source-id: 9e465b7fc8bda33e7a746e4df68410713e2be37e
Summary:
Convert the test case `test-amend-nextrebase` to use new mutation and visibility tracking.
Doing so reveals a bug where `hg next --rebase` can rebase obsolete commits.
Reviewed By: quark-zju
Differential Revision: D17136483
fbshipit-source-id: dcda88d1e8c1f435d6211cf5b76791c5a76ee343
Summary:
Add a new test for `hg next --rebase` for when a predecessor of the commit
being rebased is also visible. The predecessor should not be rebased.
Reviewed By: quark-zju
Differential Revision: D17136482
fbshipit-source-id: fa2c91ebc14c72f6a8c13c4549447809090489b3