Commit Graph

44 Commits

Author SHA1 Message Date
David Budischek
ff5c92b653 Add performance counters to analyse is_ancestor
Summary:
`is_ancestor` was inexplicably slow, so I added a few perf counters to it. The slowness should be fixed with D14540318.

This will only add the most important perf counters, as I have seen some slowdown when adding more.

I also changed the param naming from maybe_ancestor to ancestor. The "maybe" is implied by the method name is_ancestor. This change also brings skiplist in line with apiserver.

What will we log:
Ancestor/Descendant Generation
skip/noskip iterations
skipped generations

Reviewed By: StanislavGlebik

Differential Revision: D14085659

fbshipit-source-id: e0c7d34ed218e94c2c5a39f98d3c0b3d4265214f
2019-03-22 06:30:32 -07:00
David Budischek
2a93fe345c Block non fastforward bookmark moves
Summary:
This is a hook in Mercurial; in Mononoke it will be part of the implementation. By default all non-fast-forward pushes are blocked, except when using the NON_FAST_FORWARD pushvar (--non-forward-move is also needed to circumvent client-side restrictions). Additionally, certain bookmarks (e.g. master) shouldn't be movable in a non-fast-forward manner at all. This can be done by setting the block_non_fast_forward field in the config.

Pushrebase can only move the bookmark that is actually being pushrebased, so we do not need to check whether it is a fast-forward move (it always is).
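
The rule described in this summary can be sketched as a small decision function. This is only an illustration of the stated policy, not the Mononoke implementation: the `BookmarkMove` struct, its field names and `is_move_allowed` are hypothetical, and the fast-forward flag is assumed to come from an ancestry check done elsewhere.

```rust
/// Hypothetical inputs to the check; none of these names come from Mononoke.
pub struct BookmarkMove {
    /// True if the old bookmark position is an ancestor of the new one.
    pub is_fast_forward: bool,
    /// Per-bookmark config flag (e.g. set for master).
    pub block_non_fast_forward: bool,
    /// True if the push carried the NON_FAST_FORWARD pushvar.
    pub non_fast_forward_pushvar: bool,
}

/// Returns true if the bookmark move should be accepted.
pub fn is_move_allowed(m: &BookmarkMove) -> bool {
    if m.is_fast_forward {
        // Fast-forward moves are always fine.
        return true;
    }
    if m.block_non_fast_forward {
        // The config forbids non-fast-forward moves; the pushvar cannot override it.
        return false;
    }
    // Otherwise a non-fast-forward move needs the explicit pushvar.
    m.non_fast_forward_pushvar
}
```

The config check comes before the pushvar check so that a protected bookmark stays protected even when the pushvar is set.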

Reviewed By: StanislavGlebik

Differential Revision: D14405696

fbshipit-source-id: 782b49c26a753918418e02c06dcfab76e3394dc1
2019-03-18 04:12:09 -07:00
Stanislau Hlebik
e6c8ef00aa mononoke: fix computing of changed files
Summary:
The problem was in using `file_changes()` of a bonsai object. If a file
replaces a directory, then it just returns an added file, but not a removed
directory.

However `changed_entry_stream` didn't return an entry if only its mode was changed (i.e. file became executable, or file became a symlink). This diff fixes it as well.

Let's use the same method of computing changed files instead of `file_changes()`.

Differential Revision: D14279470

fbshipit-source-id: 976b0abd93646f7d68137c83cb07a8564922ce17
2019-03-08 06:28:49 -08:00
Kostia Balytskyi
e561682ecd mononoke: rename crates to contain underscores instead of dashes
Summary: Let's not use dashes in crate names.

Reviewed By: StanislavGlebik

Differential Revision: D14341596

fbshipit-source-id: 85a7ded60cf2e326997ac70ee47a29116af97590
2019-03-06 07:18:28 -08:00
David Budischek
405d45dbbb Deprecate unused GenBFS
Summary: Use SkiplistIndex instead for better performance.

Reviewed By: StanislavGlebik

Differential Revision: D14004167

fbshipit-source-id: 85019cbb83357d5c582040f56a31faf345d77f4c
2019-02-11 01:34:48 -08:00
Lukas Piatkowski
515a2909eb mononoke hashes: remove usages of borrows of hashes which are Copy
Summary: The Copy trait means that something is so cheap to copy that you don't even need to explicitly do `.clone()` on it. As it doesn't make much sense to pass &i64, it also doesn't make much sense to pass &<Something that is Copy>, so I have removed all the occurrences of passing a reference to one of our hashes that are Copy.
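
A minimal sketch of the before/after shape of such a signature change. `NodeHash`, `from_byte` and both helper functions are made-up stand-ins for illustration, not the real Mononoke hash types.

```rust
/// Hypothetical 20-byte hash; small enough that copying it is cheap.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct NodeHash([u8; 20]);

impl NodeHash {
    /// Builds a hash filled with a single byte (for illustration only).
    pub fn from_byte(b: u8) -> Self {
        NodeHash([b; 20])
    }
}

/// Before: takes a borrow, so every caller has to write `&hash`.
pub fn first_byte_by_ref(hash: &NodeHash) -> u8 {
    hash.0[0]
}

/// After: takes the value; the caller keeps its own copy because the type is Copy.
pub fn first_byte(hash: NodeHash) -> u8 {
    hash.0[0]
}
```

Because `NodeHash` is `Copy`, a caller can pass the same value to `first_byte` several times without calling `.clone()`.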

Reviewed By: fanzeyi

Differential Revision: D13974622

fbshipit-source-id: 89efc1c1e29269cc2e77dcb124964265c344f519
2019-02-06 15:11:35 -08:00
Zeyi Fan
ef0272abca Convert thrift to failure
Summary: Migrate `rust-thrift` from `error-chain` to `failure`.

Reviewed By: Imxset21

Differential Revision: D13853559

fbshipit-source-id: 4bab8c544588265c85dee2f668ed47b8e97d321e
2019-02-06 10:34:10 -08:00
Stanislau Hlebik
a6f6f28564 mononoke: split reachability index
Summary:
Let's split reachability index crate. The main goal is to reduce compilation
time. Now crates like revsets will only depend on traits definition but not on
the actual implementation (skiplist or genbfs).

Reviewed By: lukaspiatkowski

Differential Revision: D13878403

fbshipit-source-id: 022eca50ac4bc7416e9fe5f3104f0a9a65195b26
2019-01-31 00:41:48 -08:00
Stanislau Hlebik
3d06332693 mononoke: move changeset_fetcher out of blobrepo into separate crate
Summary:
Some crates, namely revsets and reachabilityindex, currently depend on
blobrepo, while all they need is the ability to fetch commits.

By moving changeset_fetcher out of blobrepo, this dependency will be removed. That may
make builds faster.

Reviewed By: lukaspiatkowski

Differential Revision: D13878369

fbshipit-source-id: 9ee8973a9170557a4dede5404dd374aa4a000405
2019-01-31 00:41:48 -08:00
Stanislau Hlebik
6bcddad1a6 mononoke: blobrepo errors & blob-changeset to separate crates
Summary:
Hook manager depends on blob changeset, but it needs nothing else from
BlobRepo. Let's move it to a separate crate. That also requires moving errors to
a separate crate, and that will also help with removing the dependency of
skiplist on BlobRepo.

Reviewed By: lukaspiatkowski

Differential Revision: D13878207

fbshipit-source-id: 479716497f5af2da0265340cea1d44de47a3e03a
2019-01-31 00:41:48 -08:00
Liubov Dmitrieva
abe70d4324 move Send + Sync to the trait for LeastCommonAncestorsHint
Summary: This hint is passed to many places, so it reduces the code.

Reviewed By: StanislavGlebik

Differential Revision: D13802159

fbshipit-source-id: 891eef00c236b2241571e24c50dc82b9862872cc
2019-01-24 07:59:46 -08:00
Stanislau Hlebik
5dfab1ed2b mononoke: make sure we don't have stackoverflows in reachability_query
Summary:
reachability_query uses recursion so we can run out of stack. However getting
rid of recursion is easy because it was already implemented for `lca_hint`
method.

I also used this diff as an opportunity to rename `src_hash` and `dst_hash` to
`maybe_descendant_hash` and `maybe_ancestor_hash` respectively to make code
clearer.

Reviewed By: lukaspiatkowski

Differential Revision: D13650422

fbshipit-source-id: 0e52ae592992208a03691b1a5c24021a4fb94313
2019-01-14 08:56:46 -08:00
Stanislau Hlebik
6fc5c58810 mononoke: rustfmt
Summary: Format skiplist.rs file before we make changes to it.

Reviewed By: aslpavel

Differential Revision: D13650420

fbshipit-source-id: 394e07d94b57814fc9b7b345ffc81e06f95446d7
2019-01-14 06:10:30 -08:00
Jeremy Fitzhardinge
d0521d19c1 tp2/rust: update tools and crates to Rust 1.31.0
Reviewed By: HarveyHunt

Differential Revision: D13376931

fbshipit-source-id: 354ab35fe2b2b1220f7ab6da74cb3c13ef7c5952
2018-12-07 08:39:51 -08:00
Lukas Piatkowski
02e79837a4 mononoke: pass CoreContext down to bookmarks
Reviewed By: StanislavGlebik

Differential Revision: D13302943

fbshipit-source-id: 356ec3cd3c47f843a5869edb7079d4cbd0ee33aa
2018-12-04 01:16:32 -08:00
Stanislau Hlebik
bd7246dcf7 mononoke: use common traits from NodeFrontier
Summary: Address aslpavel's comments from D13275471

Reviewed By: aslpavel

Differential Revision: D13301218

fbshipit-source-id: 5e51dd3b19a6cd41ae9b4071ec3edb2f0c81999c
2018-12-03 05:09:02 -08:00
Lukas Piatkowski
14636545aa mononoke: pass CoreContext down to changesets
Reviewed By: StanislavGlebik

Differential Revision: D13277448

fbshipit-source-id: 6e9a8dac77af8ab991005d14f654e315c234fe44
2018-11-30 10:14:22 -08:00
Lukas Piatkowski
08db0a35eb mononoke: pass CoreContext down to bonsai-hg-mapping
Reviewed By: aslpavel

Differential Revision: D13277450

fbshipit-source-id: 97cfbd917b321727bb4d960c91a784787660eb5b
2018-11-30 10:14:22 -08:00
Stanislau Hlebik
c6f21b1cc8 mononoke: efficient search of max generation in NodeFrontier
Summary:
Previously the `max_gen()` function did a linear scan through all the keys.
Let's use the `UniqueHeap` data structure to track the maximum generation
number.
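
To illustrate the idea of keeping the maximum generation cheap to look up, here is a small sketch. It does not use `UniqueHeap` (whose API is not shown in this log); it swaps in a standard `BTreeMap` keyed by generation, and the `u32` node ids, `u64` generations and method names other than `max_gen` are assumptions.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical frontier: nodes grouped by generation number.
/// A BTreeMap keeps the keys ordered, so the maximum is found without a scan.
#[derive(Default)]
pub struct NodeFrontier {
    by_gen: BTreeMap<u64, HashSet<u32>>,
}

impl NodeFrontier {
    /// Adds a node with its generation number.
    pub fn insert(&mut self, node: u32, gen: u64) {
        self.by_gen.entry(gen).or_default().insert(node);
    }

    /// Largest generation currently in the frontier, if any.
    pub fn max_gen(&self) -> Option<u64> {
        self.by_gen.keys().next_back().copied()
    }

    /// Removes and returns all nodes at the largest generation.
    pub fn remove_max_gen(&mut self) -> Option<(u64, HashSet<u32>)> {
        let gen = self.max_gen()?;
        self.by_gen.remove(&gen).map(|nodes| (gen, nodes))
    }
}
```

With this layout `max_gen` is logarithmic in the number of distinct generations rather than linear in the number of nodes.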

Reviewed By: lukaspiatkowski

Differential Revision: D13275471

fbshipit-source-id: 21b026c54d4bc08b26a96102d2b77c58a981930f
2018-11-30 04:34:02 -08:00
Stanislau Hlebik
3ae6daeb0c mononoke: use skiplist in getbundle
Reviewed By: jsgf, lukaspiatkowski

Differential Revision: D13169016

fbshipit-source-id: 59c567663680a99dd977e087f38374e77c72afd6
2018-11-29 08:19:30 -08:00
Stanislau Hlebik
52f12e92ac mononoke: remove recursion from SkiplistIndex lca_hint
Summary:
Recursion can easily become too deep and overflow the stack. Let's use
`loop_fn` instead

Reviewed By: lukaspiatkowski

Differential Revision: D13169015

fbshipit-source-id: bf5cf151e83fd4bd785ff4b81a93858e7e2dcfde
2018-11-29 08:19:30 -08:00
Stanislau Hlebik
af03216ee2 mononoke: admin command to build and read skiplist indexes
Summary:
Let's add a command that builds and reads skiplist indexes. These indexes will
be used by the getbundle wireproto request to decrease latency and CPU usage.

Note that we are saving only the longest "jump" from the skiplist. This is done
in order to save space.

Reviewed By: jsgf

Differential Revision: D13169018

fbshipit-source-id: 4d654284b0c0d8a579444816781419ba6ad86baa
2018-11-29 08:19:30 -08:00
Stanislau Hlebik
a48e88dc28 mononoke: remove lazy skiplist building
Summary:
Previously, if a commit didn't have a skiplist index, it'd be built lazily.
So if a getbundle request asks for a very old commit, then skiplist index entries
for too many commits might be built and stored in memory.

Instead of lazily building the index, let's add a constructor that accepts a
skiplist graph. That'd allow us to store the skiplist graph persistently and
create it on demand.

Reviewed By: jsgf

Differential Revision: D13169019

fbshipit-source-id: 813bdd7e6cc90312d97275a7db5e266d97b545e0
2018-11-29 08:19:30 -08:00
Stanislau Hlebik
e704bd24fe mononoke: use NodeFrontier in getbundle
Summary:
NodeFrontier is a hashset that stores commits and their generation numbers.
We'll use it to figure out what nodes client already has (`exclude` nodes).
Before this change we used `GroupedByGenenerationStream`, but it doesn't allow
us to skip some commits. We can skip commits with NodeFrontier though.

Reviewed By: lukaspiatkowski

Differential Revision: D13122521

fbshipit-source-id: 08eddb71e49b16b879f65bc9b8b177dc5dbcc034
2018-11-29 08:19:30 -08:00
Stanislau Hlebik
0f5153e1df mononoke: skiplist thrift serialization
Summary:
We'd like to save skiplist data in blobstore so that we won't have to recompute
it. Let's use thrift serialization for it

Reviewed By: lukaspiatkowski

Differential Revision: D13122386

fbshipit-source-id: f44cdcf38fa6a219df9217906a872e60c319e391
2018-11-29 08:19:30 -08:00
Stanislau Hlebik
402f03056d mononoke: use ChangesetFetcher in skiplist code
Summary:
Let's make it use the same ChangesetFetcher as getbundle already does. It will
be used in the next diffs.

Reviewed By: lukaspiatkowski

Differential Revision: D13122344

fbshipit-source-id: 37eba612935a209098a245f4be0af3bc18c5787e
2018-11-29 08:19:29 -08:00
Stanislau Hlebik
631556ed0d mononoke: migrate skiplist to bonsai changesets
Summary:
Most of our revsets are already migrated, let's migrate skiplists as well since
we want to use them in getbundle requests.

Reviewed By: lukaspiatkowski

Differential Revision: D13083910

fbshipit-source-id: 4c3bc40ccff95c3231c76b9e920af5db31b80d01
2018-11-29 08:19:29 -08:00
Pavel Aslanov
38c5145e9b handle change only in executable bit same way as Hg
Summary:
Mercurial stores the executable bit as part of the manifest, so if a changeset only changes that attribute of a file, Hg reuses the file hash. Mononoke, however, has been creating an additional file node. This change handles that special case. Note that this kind of reuse only happens if the file has only one parent [P60183653](P60183653)

Some of our fixture repos were affected, hence these hashes were replaced with updated ones:
```
396c60c14337b31ffd0b6aa58a026224713dc07d => a5ab070634ab9cbdfc92404b3ec648f7e29547bc
339ec3d2a986d55c5ac4670cca68cf36b8dc0b82 => c10443fa4198c6abad76dc6c69c1417b2e821508
b47ca72355a0af2c749d45a5689fd5bcce9898c7 => 6d0c1c30df4acb4e64cb4c4868d4c974097da055
```

Reviewed By: farnz

Differential Revision: D10357440

fbshipit-source-id: cdd56130925635577345b08d8ed0ae6e229a82a7
2018-10-15 02:16:50 -07:00
Matthew Dippel
8d8d5b4b8f Refactor SkiplistIndex to handle large depth indexing
Summary:
The previous implementation of `lazy_index` would hit max recursion depth, due to long chains of `Future`s all calling poll on each other when the `Future` representing indexing of a node is waiting for the `Future` of the parent, all the way down. This modification avoids this by:
* Doing a BFS down to the desired depth, remembering all the nodes seen and returning them in topological order (oldest to newest). This is done in a `loop_fn` and so doesn't have the long chain of futures polling each other.
* Synchronously indexing the list of discovered nodes.
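
The two steps above can be sketched roughly as follows. This is a synchronous illustration only: the real code runs the traversal inside `loop_fn` over futures, while this sketch uses a plain loop over in-memory maps. The function name, the `u32` node ids and the use of generation numbers to obtain the oldest-to-newest order are assumptions.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Collects the not-yet-indexed nodes within `max_depth` of `start`
/// and returns them oldest first, so parents are indexed before children.
pub fn nodes_to_index(
    parents: &HashMap<u32, Vec<u32>>,
    gens: &HashMap<u32, u64>,
    indexed: &HashSet<u32>,
    start: u32,
    max_depth: usize,
) -> Vec<u32> {
    let mut seen = HashSet::new();
    let mut found = Vec::new();
    let mut queue = VecDeque::new();
    if max_depth > 0 && !indexed.contains(&start) {
        seen.insert(start);
        queue.push_back((start, 0usize));
    }
    // Iterative BFS: no recursion, so the depth of the graph cannot overflow the stack.
    while let Some((node, depth)) = queue.pop_front() {
        found.push(node);
        if depth + 1 >= max_depth {
            continue; // do not go deeper than requested
        }
        if let Some(ps) = parents.get(&node) {
            for &p in ps {
                // Stop at nodes that are already indexed or already queued.
                if !indexed.contains(&p) && seen.insert(p) {
                    queue.push_back((p, depth + 1));
                }
            }
        }
    }
    // A parent always has a smaller generation than its child,
    // so sorting by generation yields a topological order.
    found.sort_by_key(|n| gens[n]);
    found
}
```

The returned list can then be indexed in order with ordinary synchronous code, since every node's parents are either earlier in the list or were indexed before.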

Reviewed By: StanislavGlebik

Differential Revision: D9228578

fbshipit-source-id: 0f472d13ee5a0a33472700d1fea29bd7a9938011
2018-08-13 17:36:13 -07:00
Matthew Dippel
24fd74c71c impl LeastCommonAncestorHint for SkiplistIndex
Summary: Implements the functionality to use SkiplistIndex to quickly find ancestors of a given set of nodes with generation <= a given parameter. This will be used to help speed up "A % B" revset operations.

Reviewed By: StanislavGlebik

Differential Revision: D9120502

fbshipit-source-id: 5e51057ab23ec6bdaf727e3e71870aad6ab27e30
2018-08-13 17:36:13 -07:00
Matthew Dippel
9e002a21d2 Trait definition for indexes which can compute an "LCA hint"
Summary:
Added a definition for a trait `LeastCommonAncestorHint`, for indexes which can compute an advanced frontier from a starting set of nodes.
Any implementation of this trait will need a method which takes a set of `HgNodeHash` "nodes", and a `Generation` value "gen", and returns a set of nodes "C" which satisfies:
* Max generation in "C" is <= "gen"
* Any ancestor of "nodes" with generation <= gen is also an ancestor of "C"
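
As a reference point for the two conditions above, here is a naive sketch that satisfies them by walking parent edges one generation at a time, with no index at all. It is not the `LeastCommonAncestorHint` trait itself; the function name, the in-memory maps and the `u32` node ids (in place of `HgNodeHash`) are assumptions, and a node is counted as an ancestor of itself.

```rust
use std::collections::{BTreeMap, HashMap, HashSet};

/// Advances `nodes` towards the roots until every node in the result
/// has generation <= `gen`.
pub fn lca_hint_naive(
    parents: &HashMap<u32, Vec<u32>>,
    gens: &HashMap<u32, u64>,
    nodes: &[u32],
    gen: u64,
) -> HashSet<u32> {
    // Frontier grouped by generation so the highest group is easy to find.
    let mut frontier: BTreeMap<u64, HashSet<u32>> = BTreeMap::new();
    for &n in nodes {
        frontier.entry(gens[&n]).or_default().insert(n);
    }
    loop {
        // Stop once the highest generation in the frontier is <= gen.
        let max = match frontier.keys().next_back() {
            Some(&g) if g > gen => g,
            _ => break,
        };
        // Replace every node at the highest generation with its parents.
        for n in frontier.remove(&max).unwrap_or_default() {
            for &p in &parents[&n] {
                frontier.entry(gens[&p]).or_default().insert(p);
            }
        }
    }
    frontier.into_iter().flat_map(|(_, ns)| ns).collect()
}
```

An index such as a skiplist can return a valid set "C" with far fewer steps by jumping over many generations at once; the contract it has to meet is the same.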

The plan is to implement this trait for `SkiplistIndex` and introduce it into parts of the revset code.

To elaborate, the current way `DifferenceOfUnionOfAncestorsNodeStream` operates, it assumes that the nodes to exclude can be used as a stream, producing (Generation, HashSet<HgNodeHash>) pairs when polled. In order to use this work to improve that, this assumption will have to be removed, because we will want to advance the exclude_nodes based on the current maximum generation of nodes to include that we know about. Then, the only structure that needs to be replaced in `DifferenceOfUnionOfAncestorsNodeStream` is exclude_nodes. It is used in the following way:
* When trying to determine if an include node should be yielded, the exclude_stream is peeked at. If its generation is equal to that of the node being checked, the node is looked up in the exclude_stream node set. Else if its generation is less than that of the node being checked, the node is yielded by default.
* If the generation of the peeked exclude_stream is larger than that of the node being checked, it cannot yet be determined if that node should be yielded or not. And so the exclude_stream is "polled", which forces it to move to the next generation value.

Replacing this structure with a `NodeFrontier`, and the code that uses it so that it can use `LeastCommonAncestorHint` to move the exclude_node frontier forward, should suffice to introduce this work into `DifferenceOfUnionOfAncestorsNodeStream`.

Reviewed By: jsgf

Differential Revision: D9120503

fbshipit-source-id: 317af81a1e335e66cf72603899aa06f28c85b027
2018-08-10 15:37:05 -07:00
Matthew Dippel
c04e4d49e6 Remove 'mut' from self signature in ReachabilityIndex signature.
Summary: The 'mut' qualifier wasn't needed for structs implementing `ReachabilityIndex`, and will get in the way when incorporating this work into the Mononoke server / API server.

Reviewed By: StanislavGlebik

Differential Revision: D9142238

fbshipit-source-id: 4853b468bf04493289fb017bf56b3a1753f29dcd
2018-08-06 11:06:28 -07:00
Stanislau Hlebik
9abd29d4c3 mononoke: use ChangesetId in Changesets
Summary:
Alas, the diff is huge. One thing is changing Changesets to use ChangesetId.
This is actually quite straightforward. But in order to do this we need to
adapt our test fixtures to also use bonsai changesets. Modifying existing test
fixtures to work with bonsai changesets is very tricky. Besides, existing test
fixtures are a big pile of tech debt anyway, so I used this chance to get rid of
them.

Now test fixtures use the `generate_new_fixtures` binary to generate actual Rust
code that creates a BlobRepo. This Rust code creates a bonsai changeset, that
is converted to an hg changeset later.
In many cases it results in the same hg hashes as in old test fixtures.
However, there are a couple of cases where the hashes are different:
1) In the case of merges we generate different hashes because of a different
changed file list (lukaspiatkowski, aslpavel, is it expected?). This is the case for test
fixtures like merge_even, merge_uneven and so on.
2) Old test fixtures used flat manifest hashes while new test fixtures are tree
manifest only.

Reviewed By: jsgf

Differential Revision: D9132296

fbshipit-source-id: 5c4effd8d56dfc0bca13c924683c19665e7bed31
2018-08-06 10:36:43 -07:00
Matthew Dippel
5ad21c71fc impl ReachabilityIndex for SkiplistIndex
Summary:
Implemented the methods which query for reachability on the SkiplistIndex, and added calls to the appropriate tests on the test fixtures. All the functionality for this exists in the new method `query_reachability_with_generation_hints`. This method operates as follows:
* If there are skip list edges, traverse the one that goes the closest to the generation number of dst without passing it, then recurse for reachability from the new node.
* If there are only parent edges, recurse on each parent, and return the logical or of the answers.
* If the node is unindexed, perform `lazy_index_node` on it, and take the difference in generation numbers between the src and dst nodes as the maximum depth to index to.
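
The first two cases above can be sketched on a fully indexed in-memory graph. This is an illustration under assumptions, not `query_reachability_with_generation_hints`: it is synchronous, uses an explicit stack instead of recursion, omits the lazy-indexing case, and the `Skiplist` type, its methods, the power-of-two skip distances and the `u32` node ids are all made up for the example.

```rust
use std::collections::{HashMap, HashSet};

enum Edges {
    /// (target, target generation) pairs, shortest jump first (assumed distances 1, 2, 4, ...).
    Skip(Vec<(u32, u64)>),
    /// Plain parent edges, used for merge nodes and roots.
    Parents(Vec<u32>),
}

pub struct Skiplist {
    gens: HashMap<u32, u64>,
    edges: HashMap<u32, Edges>,
}

impl Skiplist {
    pub fn new() -> Self {
        Skiplist { gens: HashMap::new(), edges: HashMap::new() }
    }

    /// Adds a node whose parents were all added before it.
    pub fn add(&mut self, id: u32, parents: &[u32]) {
        let gen = parents.iter().map(|p| self.gens[p]).max().unwrap_or(0) + 1;
        let edges = if parents.len() == 1 {
            let mut skips = vec![(parents[0], self.gens[&parents[0]])];
            // The jump of length 2^(k+1) is the k-th jump of the k-th target.
            loop {
                let k = skips.len() - 1;
                let (prev, _) = skips[k];
                match self.edges.get(&prev) {
                    Some(Edges::Skip(v)) if v.len() > k => skips.push(v[k]),
                    _ => break,
                }
            }
            Edges::Skip(skips)
        } else {
            Edges::Parents(parents.to_vec())
        };
        self.gens.insert(id, gen);
        self.edges.insert(id, edges);
    }

    /// Returns true if `ancestor` is reachable from `descendant`.
    pub fn is_ancestor(&self, ancestor: u32, descendant: u32) -> bool {
        let dst_gen = self.gens[&ancestor];
        let mut seen = HashSet::new();
        let mut stack = vec![descendant];
        while let Some(node) = stack.pop() {
            if node == ancestor {
                return true;
            }
            // Nodes at or below the destination generation cannot reach it.
            if self.gens[&node] <= dst_gen || !seen.insert(node) {
                continue;
            }
            match &self.edges[&node] {
                // Take the longest jump that does not pass the destination generation.
                Edges::Skip(skips) => {
                    if let Some(&(next, _)) = skips.iter().rev().find(|(_, g)| *g >= dst_gen) {
                        stack.push(next);
                    }
                }
                // Merge node: try every parent (the logical "or" of the answers).
                Edges::Parents(ps) => stack.extend(ps.iter().copied()),
            }
        }
        false
    }
}
```

Taking the longest jump is safe here because skip edges only ever cross single-parent nodes, so every ancestor at or below the target's generation is still reachable from the target.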

Reviewed By: StanislavGlebik

Differential Revision: D9009060

fbshipit-source-id: ec3d372bd3ce99d9d9853972f76dc4a39e316b19
2018-07-31 12:21:46 -07:00
Matthew Dippel
7f11783f84 Moving / deleting some struct elements in SkiplistIndex and SkiplistEdgeMapping
Summary:
* Moved the field `skip_edges_per_node` from `SkiplistIndex` to `SkiplistEdgeMapping`.
* Removed the `Arc<BlobRepo>` field from `SkiplistIndex`.

The first change was to avoid having to pass around the `skip_edges_per_node` value into all the closures. Since these closures include an `Arc<SkiplistEdgeMapping>`, they can just get the value from that.
The second change is to be more in line with the `ReachabilityIndex` trait that I defined and to fit more easily into how I wrote the tests which take a generic `ReachabilityIndex` object. One other possibility, if I decide that the object should hold onto a reference to the repo, and not receive it as input at query time, is to add another trait method `use_repo`, which would start using a given `Arc<BlobRepo>`. A single instance of `ReachabilityIndex` shouldn't be used for different repos, and this could be a way to check that at compile time. But right now I'm assuming it's passed at query time.

Reviewed By: StanislavGlebik

Differential Revision: D9009059

fbshipit-source-id: 9d63acf8a690816b3a17c84108a6feaa98ebe2ba
2018-07-30 18:06:26 -07:00
Matthew Dippel
743149a8c2 fetch_generation_and_join helper function, and refactoring of helper functions to use impl Future
Summary:
Added a helper function which attempts to fetch the Generation number for a node hash, and if successful, returns the (HgNodeHash, Generation) tuple. I found myself needing this in several places while working on the SkiplistIndex, and will probably continue to need it, so I gave it its own method.

Refactored existing helper methods to use `impl Future` instead of `BoxFuture`, since it was pointed out that this could be a performance issue in lower level helpers. This didn't change any functionality or require refactoring elsewhere, it just involved changing the signatures and removing any calls to `boxify()`.

Reviewed By: StanislavGlebik

Differential Revision: D9009057

fbshipit-source-id: 6b30bb92513213787fc92bdca1507ed117f6e715
2018-07-30 18:06:25 -07:00
Matthew Dippel
98fecf7f7c Refactor ReachabilityIndex tests to accept a closure instead of a trait object
Summary:
The tests for ReachabilityIndex now accept a closure that creates a trait object, instead of a trait object directly.
I was having problems later in this stack where I couldn't pass a SkiplistIndex object into these tests, because it held `Arc`s and hence wasn't `UnwindSafe`. Modifying the tests to take a closure that creates the index object inside of the tokio async unit test fixes this.

Reviewed By: StanislavGlebik

Differential Revision: D9009053

fbshipit-source-id: de5e698b4d27aec1ab47fa7fda73da8ad46aab95
2018-07-30 18:06:25 -07:00
Matthew Dippel
bf8a175125 lazy_index_node method for SkiplistIndex
Summary:
Implemented the method lazy_index_node, which takes a starting node and a maximum depth value, and returns a future representing the computation to index all nodes beneath the start, up until previously indexed nodes, or nodes at depth at least the maximum desired depth. This will allow for partial indexing during warmup and during queries, in an environment where indexing the entire set of nodes reachable from the current master will be prohibitively expensive. This also includes a few synchronous helper methods, such as computing the skip list edges assuming a reasonable number of nodes below the desired node have been indexed.

This also includes tests to make sure that the behavior is as expected, for example confirming that the nodes you would expect to be indexed are actually indexed after a lazy_index_node call. The real test will be if this is consistent for reachability queries. It's possible that reachability query tests, and in particular how these methods are used by reachability queries, will expose some bugs. These tests will be introduced when I add the actual reachability query to this index. But I feel confident that the methods in this diff will require at most minimal modification going forward.

Reviewed By: StanislavGlebik

Differential Revision: D8959089

fbshipit-source-id: a025034dfac11215a412114de70a0233d7598f30
2018-07-30 15:06:04 -07:00
Matthew Dippel
bef6bcfaad Moved GenerationBFS tests to a separate tests.rs file so that they can be reused.
Summary: The tests for GenerationBFS only test the ReachabilityIndex interface, so they should be reusable for future indexes. I made them generic and moved them to a separate module, so that they can be reused later on.

Reviewed By: StanislavGlebik

Differential Revision: D8919884

fbshipit-source-id: f5c4668f71e0ee51b72cc1e7e760eda0e0afef4b
2018-07-26 10:09:31 -07:00
Matthew Dippel
8504b1f00d Moved some functionality from GenerationBFS into a shared helpers.rs file.
Summary:
Moved three types of functionality to a shared 'helpers' file so that they can be used by other indexes:
* Getting the Generation number of a changeset. The BlobRepo method currently returns an Option<Generation> as the success type, so putting the combinator calls to get the underlying Generation or map to an Error as a separate method will help keep the code more readable, and allow this logic to be reused in other parts.
* Checking if a node exists in the repo. This wraps the changeset_exists method from BlobRepo and returns an error if it was false or an error itself, else success with a void item. This just helps with code readability, so it will be obvious if the result of a future is being used, or if its success is just a prereq for the rest of the operations.
* Convert a collection of HgChangesetId to a collection of (HgNodeHash, Generation). Again, will help with code readability in more complicated functions, since the combinators of this method are, in my opinion, cluttering up the other methods using this functionality and making them more difficult to follow.

Reviewed By: StanislavGlebik

Differential Revision: D8919874

fbshipit-source-id: fc6cdf6e3a1f0dfa73c74ec94f0abac4a7860794
2018-07-23 17:08:05 -07:00
Matthew Dippel
faf514d8a6 Initial structure for a SkiplistIndex that can be put into the API server.
Summary:
The initial structure for storing skip list edges for the commit graph, to support faster reachability queries.

The plan for this index is to have two main methods:
- lazy_index, which takes a starting node, and a max_depth value, and indexes all nodes within distance max_depth of the start, or until it reaches a node which has already been indexed.
- query_with_indexing, which attempts to answer a reachability query, while indexing nodes it comes across with lazy_index if necessary.

Initialization of the index will start with a warm up by calling lazy_index on master, with a choice of how deep to warm up. 240K seems to be a good value, because the average query we've seen has distance <= 80K, and it is still a reasonable warm up time (4 minutes to obtain all database information if you assume 1ms per get_parents call, but even less if this information has already been cached by another warm up).

Then, when a query comes in, it would perform as follows:
- Pursue as many skip list edges as possible, assuming we are starting at an indexed node, until we reach a node which we can't pursue skip edges from, either because it is unindexed, or because it is a merge node.
- Once at a node like this:
    - If it is unindexed, perform a lazy_index call, where the depth parameter could either be a magic number, or it could be related to the query. For example, the difference between the unindexed node's generation and the destination generation. Then restart the query from this node.
    - If it is indexed but doesn't have skip edges, then a future must be spawned for each parent, as another query_with_indexing call. Then the query waits for all futures to complete, and returns the or of their results. This gives us the benefit of doing the indexing work.

The most noteworthy part of this initial structure is the choice of chashmap. This is a good choice for this structure because:
- After a node's skip edges are inserted, they are immutable. Thus we have low contention for writes on the same key.
- It separates locks on key buckets. So one thread can perform a reachability query while other threads are updating the index for unrelated commits in parallel.
- Even if two threads try to update the same key, for example if two different queries are simultaneously filling out the index, it doesn't
  matter which write wins out. In our case, the skip edges are deterministic, so the writes will be identical. Even if they were randomized,
  either write choice will maintain consistency of the index.

Reviewed By: StanislavGlebik

Differential Revision: D8919861

fbshipit-source-id: b6d0d61fb5484d406fec269d15b3e6c4eb9dac4a
2018-07-23 09:23:07 -07:00
Matthew Dippel
58a67bc780 Finer grained errors for GenerationBFS
Summary: Added an errors.rs which defines some specific types of errors that can occur during the BFS, and now properly maps them during the BFS process.

Reviewed By: StanislavGlebik

Differential Revision: D8862779

fbshipit-source-id: d7df1681cff7f7ae98e3b59b7d06993a55f6f707
2018-07-23 09:23:07 -07:00
Matthew Dippel
d38177f72b initial GenerationBfs impl of ReachabilityIndex
Summary: Implementation of reachability queries doing a BFS that stops exploration when the generation numbers seen are smaller than the destination's.
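
A minimal sketch of a generation-bounded BFS of this kind, on an in-memory graph. The `Graph` type, its methods and the `u32` node ids are assumptions for illustration; the real implementation works on BlobRepo objects and futures.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical in-memory commit graph: node id -> (generation, parents).
pub struct Graph {
    nodes: HashMap<u32, (u64, Vec<u32>)>,
}

impl Graph {
    pub fn new() -> Self {
        Graph { nodes: HashMap::new() }
    }

    /// Adds a node whose parents were added before it.
    /// Generation is 1 for roots, else 1 + the largest parent generation.
    pub fn add(&mut self, id: u32, parents: &[u32]) {
        let gen = parents.iter().map(|p| self.nodes[p].0).max().unwrap_or(0) + 1;
        self.nodes.insert(id, (gen, parents.to_vec()));
    }

    /// Returns true if `ancestor` is reachable from `descendant` via parent edges.
    pub fn is_ancestor(&self, ancestor: u32, descendant: u32) -> bool {
        let target_gen = self.nodes[&ancestor].0;
        let mut seen = HashSet::new();
        seen.insert(descendant);
        let mut queue = VecDeque::from(vec![descendant]);
        while let Some(node) = queue.pop_front() {
            if node == ancestor {
                return true;
            }
            let (gen, parents) = &self.nodes[&node];
            // Every ancestor has a strictly smaller generation, so a node at or
            // below the target's generation cannot lead to the target.
            if *gen <= target_gen {
                continue;
            }
            for &p in parents {
                if seen.insert(p) {
                    queue.push_back(p);
                }
            }
        }
        false
    }
}
```

The generation cutoff is what keeps the search from walking all the way to the root when the answer is negative.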

Reviewed By: StanislavGlebik

Differential Revision: D8808875

fbshipit-source-id: 5856dede27a22417add3882d748c4ce67e8f190d
2018-07-23 09:23:07 -07:00
Matthew Dippel
d2cc91b79f Initial package for reachabilityindex
Summary:
Creating the package for reachabilityindex, where methods that support efficient reachability queries on BlobRepo objects will live.

Includes the initial definition of the trait ReachabilityIndex which methods supporting reachability queries on BlobRepo objects will implement.

Reviewed By: StanislavGlebik

Differential Revision: D8792977

fbshipit-source-id: 293a810518e0a1fa260ccbc38902484e56ef2038
2018-07-11 09:55:43 -07:00