Summary:
Hex encoding and decoding are in the hot path for gettreepack requests; for
large fetches (for example, clones) we can easily perform these conversions
tens or hundreds of thousands of times.
Switch to an SSE/AVX2-optimized hex encoder/decoder, which yields a 10x
performance improvement for decoding. Encoding was already using a relatively
optimized implementation, so the gain there is expected to be much smaller.
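For illustration only (this is not the Mononoke code), a scalar hex decoder looks roughly like the sketch below: one table lookup per input byte. The SIMD version replaces this per-byte work with wide SSE/AVX2 loads that process 16 or 32 bytes per instruction.

```rust
// Scalar baseline for hex decoding: one lookup per input byte.
// A SIMD implementation processes 16/32 bytes per instruction instead.
fn hex_decode(src: &[u8]) -> Option<Vec<u8>> {
    if src.len() % 2 != 0 {
        return None; // hex strings must have an even number of digits
    }
    fn val(b: u8) -> Option<u8> {
        match b {
            b'0'..=b'9' => Some(b - b'0'),
            b'a'..=b'f' => Some(b - b'a' + 10),
            b'A'..=b'F' => Some(b - b'A' + 10),
            _ => None,
        }
    }
    src.chunks(2)
        .map(|pair| Some(val(pair[0])? << 4 | val(pair[1])?))
        .collect()
}

fn main() {
    assert_eq!(hex_decode(b"28734f"), Some(vec![0x28, 0x73, 0x4f]));
    println!("ok");
}
```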
Reviewed By: farnz
Differential Revision: D15940209
fbshipit-source-id: 28734f45f7508a94b110e25c01e1baa955ebd4e4
Summary:
If a file has two parent filenodes and one of them is an ancestor of the
other, we want to keep only the descendant filenode as a parent.
(NOTE) This diff doesn't fix all the corner cases. In the integration tests the
file that has two parents is modified in a merge commit; note that if it
weren't modified in the merge commit, Mercurial would produce a hash different
from Mononoke's. I'll investigate why that happens.
(NOTE) We had incorrect hashes in our test fixtures - this diff fixes them as well
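A hypothetical sketch of the rule above: given two parents and an ancestry check (the `is_ancestor` closure here stands in for a real check against the repo), keep only the descendant when one parent is an ancestor of the other.

```rust
// Hypothetical sketch (not the Mononoke code): dedup a file's two parent
// filenodes when one is an ancestor of the other, keeping the descendant.
fn dedup_parents(p1: u64, p2: u64, is_ancestor: impl Fn(u64, u64) -> bool) -> Vec<u64> {
    if is_ancestor(p1, p2) {
        vec![p2] // p1 is an ancestor of p2: keep the descendant p2
    } else if is_ancestor(p2, p1) {
        vec![p1] // p2 is an ancestor of p1: keep the descendant p1
    } else {
        vec![p1, p2] // unrelated parents: keep both
    }
}

fn main() {
    // Toy ancestry relation: a smaller id is an ancestor of a larger id.
    let anc = |a: u64, b: u64| a < b;
    assert_eq!(dedup_parents(1, 5, anc), vec![5]);
    println!("ok");
}
```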
Reviewed By: farnz
Differential Revision: D15896735
fbshipit-source-id: ea31071bc69fab02935887c665f6d03b64d5c572
Summary:
The problem was in using `file_changes()` of a bonsai object: if a file
replaces a directory, it returns only the added file, not the removed
directory.
In addition, `changed_entry_stream` didn't return an entry if only its mode changed (i.e. a file became executable, or a file became a symlink). This diff fixes that as well.
Let's use the same method of computing changed files instead of `file_changes()`.
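A minimal illustration of the corner case (hypothetical names, not the Mononoke API): when a path like `dir` becomes a file, every file previously under `dir/` must be reported as removed, even though a naive listing of added files only shows the new file `dir`.

```rust
use std::collections::BTreeSet;

// Hypothetical sketch: given the old path set and a newly added file that
// replaces a directory, list the paths that must be reported as removed.
fn removed_by_file_over_dir(old_paths: &BTreeSet<String>, added_file: &str) -> Vec<String> {
    let prefix = format!("{}/", added_file);
    old_paths
        .iter()
        .filter(|p| p.starts_with(&prefix)) // everything under the old dir
        .cloned()
        .collect()
}

fn main() {
    let old: BTreeSet<String> = ["dir/a", "dir/b", "other"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    // "dir" is now a file, so "dir/a" and "dir/b" are removals.
    assert_eq!(removed_by_file_over_dir(&old, "dir"), vec!["dir/a", "dir/b"]);
    println!("ok");
}
```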
Differential Revision: D14279470
fbshipit-source-id: 976b0abd93646f7d68137c83cb07a8564922ce17
Summary: The `Copy` trait means that something is so cheap to copy that you don't even need an explicit `.clone()`. Just as it doesn't make much sense to pass `&i64`, it doesn't make much sense to pass `&<something that is Copy>`, so I have removed all occurrences of passing one of our hashes (which are `Copy`) by reference.
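To illustrate the point with a toy type (a stand-in for the repo's hash types, not the actual code):

```rust
// A small Copy type, like a fixed-size hash: passing it by value is as
// cheap as passing a reference, so `&HashId` adds indirection for nothing.
#[derive(Copy, Clone, PartialEq, Debug)]
struct HashId([u8; 20]);

// Prefer this...
fn by_value(id: HashId) -> u8 {
    id.0[0]
}

// ...over this; it works, but the extra `&` buys nothing for a Copy type.
fn by_ref(id: &HashId) -> u8 {
    id.0[0]
}

fn main() {
    let id = HashId([7; 20]);
    assert_eq!(by_value(id), by_ref(&id));
    // `id` is still usable after by_value: it was copied, not moved.
    assert_eq!(id.0[0], 7);
    println!("ok");
}
```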
Reviewed By: fanzeyi
Differential Revision: D13974622
fbshipit-source-id: 89efc1c1e29269cc2e77dcb124964265c344f519
Summary:
Let's split the reachability index crate. The main goal is to reduce
compilation time: crates like revsets will now depend only on the trait
definitions, not on the actual implementations (skiplist or genbfs).
Reviewed By: lukaspiatkowski
Differential Revision: D13878403
fbshipit-source-id: 022eca50ac4bc7416e9fe5f3104f0a9a65195b26
Summary:
This crate depends on BlobRepo and will be used only in tests. For the rest of
the revsets we'll be able to get rid of BlobRepo dependency.
Reviewed By: lukaspiatkowski
Differential Revision: D13878389
fbshipit-source-id: bf5c5861882b18397842ff5f779999a52b963c2b
Summary:
Some crates, namely revsets and reachabilityindex, currently depend on
blobrepo, while all they need is the ability to fetch commits.
Moving changeset_fetcher out removes this dependency, which may make builds
faster.
Reviewed By: lukaspiatkowski
Differential Revision: D13878369
fbshipit-source-id: 9ee8973a9170557a4dede5404dd374aa4a000405
Summary: This hint is passed around in many places, so this change reduces the amount of code.
Reviewed By: StanislavGlebik
Differential Revision: D13802159
fbshipit-source-id: 891eef00c236b2241571e24c50dc82b9862872cc
Summary: This diff is created to separate the lint formatting work from the rest of the code changes in D13632296
Reviewed By: lukaspiatkowski
Differential Revision: D13691680
fbshipit-source-id: 8e12016534d2e6066d803b51b5f12cbf6e89a822
Summary:
All revsets should use bonsai changesets, not hg changesets.
This diff replaces usages of SingleNodeHash with SingleChangesetId.
It doesn't remove all of the usages, but it removes most of them.
Reviewed By: aslpavel
Differential Revision: D13467116
fbshipit-source-id: 92c5b8f63f07e13af642a8cdb91fc77c46cdd595
Summary:
It's a test function, and passing an additional parameter is annoying. Let's
just create a mock context.
Reviewed By: ikostia
Differential Revision: D13467118
fbshipit-source-id: fd27893d80f6b0ba59c2b7e5083d4ec7727a0e89
Summary: PhantomData is only used for test builds.
Reviewed By: StanislavGlebik
Differential Revision: D13460298
fbshipit-source-id: e712e468a4dacd6ddad3b6159c3020d49e87306f
Summary:
Made changes to the SetDifferenceNodeStream struct and associated member functions.
Also ran :RustFmt on quickcheck.rs and setdifferencenodestream.rs.
Reviewed By: StanislavGlebik
Differential Revision: D13448043
fbshipit-source-id: c38567ad8fb94d55b463b28abf4bd78987a9c68a
Summary:
Refactored UnionNodeStream to use ChangesetId.
Modified all other classes using it.
Modified tests.
Depends on: D13275956
Reviewed By: StanislavGlebik
Differential Revision: D13326684
fbshipit-source-id: fd34f52739cf3e2876aa7a30c5060b3b3410f413
Summary:
Refactored IntersectNodeStream to use ChangesetId.
Modified all other classes using it.
Modified tests (some still need to be modified, see `TODO Hrenic`).
Reviewed By: StanislavGlebik
Differential Revision: D13275956
fbshipit-source-id: 90d3c191e161be8a9ce1841de0e04ce60438764b
Summary:
Previously the `max_gen()` function did a linear scan through all the keys.
Let's use the `UniqueHeap` data structure to track the maximum generation
number instead.
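A minimal sketch of the idea (the real `UniqueHeap` in Mononoke may differ): a max-heap paired with a set for deduplication gives O(1) `max_gen()` and O(log n) insertion, replacing the linear scan over all keys.

```rust
use std::collections::{BinaryHeap, HashSet};

// Illustrative "unique heap": a duplicate-free max-heap of generation
// numbers, so the maximum is available without scanning every key.
#[derive(Default)]
struct UniqueHeap {
    heap: BinaryHeap<u64>,
    seen: HashSet<u64>,
}

impl UniqueHeap {
    fn push(&mut self, gen: u64) {
        // Only push values we haven't seen, keeping the heap duplicate-free.
        if self.seen.insert(gen) {
            self.heap.push(gen);
        }
    }

    fn max_gen(&self) -> Option<u64> {
        self.heap.peek().copied() // O(1) instead of a linear scan
    }
}

fn main() {
    let mut h = UniqueHeap::default();
    for g in [3, 7, 7, 2] {
        h.push(g);
    }
    assert_eq!(h.max_gen(), Some(7));
    println!("ok");
}
```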
Reviewed By: lukaspiatkowski
Differential Revision: D13275471
fbshipit-source-id: 21b026c54d4bc08b26a96102d2b77c58a981930f
Summary:
Recursion can easily become too deep and overflow the stack. Let's use
`loop_fn` instead
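A synchronous analogue of the change (the actual diff uses futures' `loop_fn`, which plays the same role for asynchronous code): a deeply recursive computation can blow the stack, so drive it with an explicit loop instead.

```rust
// Recursive version: each step adds a stack frame, so a large `n`
// can overflow the stack.
fn sum_recursive(n: u64) -> u64 {
    if n == 0 { 0 } else { n + sum_recursive(n - 1) }
}

// Iterative version: constant stack usage. `break acc` corresponds to
// loop_fn's Loop::Break, and falling through corresponds to Loop::Continue.
fn sum_loop(n: u64) -> u64 {
    let mut acc = 0;
    let mut i = n;
    loop {
        if i == 0 {
            break acc;
        }
        acc += i;
        i -= 1;
    }
}

fn main() {
    assert_eq!(sum_loop(10), sum_recursive(10));
    // Safe iteratively even for inputs that would overflow the stack
    // if computed recursively.
    assert_eq!(sum_loop(1_000_000), 500_000_500_000);
    println!("ok");
}
```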
Reviewed By: lukaspiatkowski
Differential Revision: D13169015
fbshipit-source-id: bf5cf151e83fd4bd785ff4b81a93858e7e2dcfde
Summary:
NodeFrontier is a hashset that stores commits together with their generation
numbers.
We'll use it to figure out which nodes the client already has (`exclude` nodes).
Before this change we used `GroupedByGenenerationStream`, but it doesn't allow
us to skip some commits; with NodeFrontier we can.
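An illustrative sketch of the structure (hypothetical shape, not the Mononoke type): commits grouped by generation number, with the ability to drop individual commits before processing, which a fixed stream grouped by generation doesn't offer.

```rust
use std::collections::{HashMap, HashSet};

// Illustrative NodeFrontier: commits keyed by generation number.
#[derive(Default)]
struct NodeFrontier {
    gen_map: HashMap<u64, HashSet<&'static str>>,
}

impl NodeFrontier {
    fn insert(&mut self, node: &'static str, gen: u64) {
        self.gen_map.entry(gen).or_default().insert(node);
    }

    // Skip a commit the client already has; this is the operation a
    // stream grouped by generation cannot do.
    fn remove(&mut self, node: &'static str, gen: u64) {
        if let Some(set) = self.gen_map.get_mut(&gen) {
            set.remove(node);
            if set.is_empty() {
                self.gen_map.remove(&gen);
            }
        }
    }

    fn max_gen(&self) -> Option<u64> {
        self.gen_map.keys().copied().max()
    }
}

fn main() {
    let mut f = NodeFrontier::default();
    f.insert("a", 3);
    f.insert("b", 5);
    f.remove("b", 5); // client already has "b": skip it
    assert_eq!(f.max_gen(), Some(3));
    println!("ok");
}
```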
Reviewed By: lukaspiatkowski
Differential Revision: D13122521
fbshipit-source-id: 08eddb71e49b16b879f65bc9b8b177dc5dbcc034
Summary:
Added a `get_stats()` method returning `HashMap<String, Box<Any>>` for all ChangesetFetchers.
The CachingChangesetFetcher now reports `cache.misses: usize`, `cache.hits: usize`, `fetches.from.blobstore: usize` and `max.latency: Duration`.
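A minimal sketch of what such a heterogeneous stats map looks like (illustrative values, not the real fetcher): counters stored as `usize`, latency as a `Duration`, all behind `Box<dyn Any>`, with callers downcasting back to the concrete type.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::time::Duration;

// Sketch of a get_stats()-style map mixing value types behind Box<dyn Any>.
fn get_stats() -> HashMap<String, Box<dyn Any>> {
    let mut stats: HashMap<String, Box<dyn Any>> = HashMap::new();
    stats.insert("cache.hits".to_string(), Box::new(42usize));
    stats.insert("cache.misses".to_string(), Box::new(3usize));
    stats.insert("max.latency".to_string(), Box::new(Duration::from_millis(17)));
    stats
}

fn main() {
    let stats = get_stats();
    // Callers downcast back to the concrete type they expect.
    let hits = stats["cache.hits"].downcast_ref::<usize>().copied();
    assert_eq!(hits, Some(42));
    // A wrong type yields None rather than a panic.
    assert!(stats["max.latency"].downcast_ref::<usize>().is_none());
    println!("ok");
}
```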
Reviewed By: StanislavGlebik
Differential Revision: D10852637
fbshipit-source-id: 34114fd94c47aa26ea525fcc4ff76ad60827bc71
Summary:
Mercurial stores the executable bit as part of the manifest, so if a changeset only changes that attribute of a file, hg reuses the file hash, but Mononoke has been creating an additional filenode. This change handles that special case. Note that this kind of reuse only happens if the file has only one parent [P60183653](P60183653)
Some of our fixture repos were affected, hence these hashes were replaced with updated ones:
```
396c60c14337b31ffd0b6aa58a026224713dc07d => a5ab070634ab9cbdfc92404b3ec648f7e29547bc
339ec3d2a986d55c5ac4670cca68cf36b8dc0b82 => c10443fa4198c6abad76dc6c69c1417b2e821508
b47ca72355a0af2c749d45a5689fd5bcce9898c7 => 6d0c1c30df4acb4e64cb4c4868d4c974097da055
```
Reviewed By: farnz
Differential Revision: D10357440
fbshipit-source-id: cdd56130925635577345b08d8ed0ae6e229a82a7
Summary:
The main reason is to make startup faster, because ChangesetFetcher can use
bulk caches in the same way getbundle does.
Reviewed By: farnz
Differential Revision: D10032435
fbshipit-source-id: 717114339edf31865b498893d75695968447bb43
Summary:
Revsets should use ChangesetId instead of NodeHash. This diff cleans up the
ancestors revset.
Reviewed By: farnz
Differential Revision: D10032436
fbshipit-source-id: 2c7d170738826154e3b606e9e29a739a34b1840e
Summary:
High-level goal: we want to make certain big getbundle requests faster. To do
that we'd store blobs of commits that are close to each other in the blobstore
and fetch them only when we have too many cache misses. All this logic will be
hidden in the ChangesetFetcher trait implementation. A ChangesetFetcher will be
created per request (hence the factory).
Reviewed By: farnz
Differential Revision: D9869659
fbshipit-source-id: 9e3ace3188b3c13f83ef1bd61b668d4f22103f74
Summary: Use `ChangesetId` in `DifferenceOfUnionsOfAncestorsNodeStream` instead of `HgNodeHash`. This avoids several bonsai lookups of parent nodes.
Reviewed By: StanislavGlebik
Differential Revision: D9631341
fbshipit-source-id: 1d1be7857bf4e84f9bf5ded70c28ede9fd3a2663
Summary:
Use .chain_err() where appropriate to give context to errors coming up from
below. This requires the outer errors to be proper Fail-implementing errors (or
failure::Error), so leave the string wrappers as Context.
Reviewed By: lukaspiatkowski
Differential Revision: D9439058
fbshipit-source-id: 58e08e6b046268332079905cb456ab3e43f5bfcd
Summary: Cleans things up a bit, especially when matching Context/Chain.
Reviewed By: lukaspiatkowski
Differential Revision: D9439062
fbshipit-source-id: cde8727437f58b288bed9dfacb864bdcd7dea45c
Summary: Now pushrebasing stacks as well. Again, still no conflict checks.
Reviewed By: aslpavel
Differential Revision: D9359807
fbshipit-source-id: 9f6e7a05b45fb80b40faaaaa4fe2434b7a591a7c
Summary:
Revsets must use ChangesetId, not HgNodeHash. I'm going to use
`RangeNodeStream` in pushrebase so I thought it was a good time to change it
Reviewed By: farnz
Differential Revision: D9338827
fbshipit-source-id: 50bbe8f73dba3526d70d3f816ddd93507db99be5
Summary:
Alas, the diff is huge. One thing it does is change Changesets to use
ChangesetId, which is actually quite straightforward. But in order to do this
we need to adapt our test fixtures to also use bonsai changesets. Modifying the
existing test fixtures to work with bonsai changesets is very tricky; besides,
the existing fixtures are a big pile of tech debt anyway, so I used this chance
to get rid of them.
Now test fixtures use the `generate_new_fixtures` binary to generate actual
Rust code that creates a BlobRepo. This Rust code creates a bonsai changeset,
which is later converted to an hg changeset.
In many cases this results in the same hg hashes as the old test fixtures.
However, there are a couple of cases where the hashes differ:
1) In the case of a merge we generate different hashes because of a different
changed-file list (lukaspiatkowski, aslpavel, is this expected?). This is the
case for test fixtures like merge_even, merge_uneven and so on.
2) Old test fixtures used flat manifest hashes, while new test fixtures are
treemanifest only.
Reviewed By: jsgf
Differential Revision: D9132296
fbshipit-source-id: 5c4effd8d56dfc0bca13c924683c19665e7bed31
Summary:
Back out "[mononoke] Switch to cachelib for blob caching"
Original commit changeset: 2549d85dfcba
Back out "[mononoke] Remove unused asyncmemo imports"
Original commit changeset: e34f8c34a3f6
Back out "mononoke: fix blobimport"
Original commit changeset: b540201b93f1
Reviewed By: StanislavGlebik
Differential Revision: D8989404
fbshipit-source-id: e4e7c629cb4dcf196aa56eb07a53a45f6008eb4e
Summary: These files import the asyncmemo crate but do not use it. Remove it.
Reviewed By: StanislavGlebik
Differential Revision: D8951887
fbshipit-source-id: e34f8c34a3f652156b63d795023d67260242a58e
Summary:
Deleted the RepoGenCache structure, associated file, and public exports.
Also deleted the containing repoinfo crate, as nothing else was using it now.
Deleted some existing references to it in experimental code which weren't caught in the test plan but were blocking this from landing.
Reviewed By: StanislavGlebik
Differential Revision: D8787103
fbshipit-source-id: 0b90c758ea8175cb0f3ec74c371592b9ca5b192e
Summary: Removed RepoGenCache from RangeNodeStream structure, methods, and tests. Last step is to delete RepoGenCache structure.
Reviewed By: StanislavGlebik
Differential Revision: D8786943
fbshipit-source-id: 70fa5b82da548d11126a289eb2cb11aaace18463
Summary: Completely removed references to RepoGenCache in the test helper function assert_node_sequence, and updated all usages of it in the revset tests.
Reviewed By: StanislavGlebik
Differential Revision: D8786191
fbshipit-source-id: f1504e7a95e5555c86a2f02cb98ce1f28e374eab