181 Commits

Author SHA1 Message Date
Thomas Orozco
097e4ad00c mononoke: remove tokio-compat (i.e. use tokio 0.2 exclusively)
Summary:
The earlier diffs in this stack have removed all our dependencies on the Tokio
0.1 runtime environment (so, basically, `tokio-executor` and `tokio-timer`), so
we don't need this anymore.

We do still have some deps on `tokio-io`, but this is just traits + helpers,
so this doesn't actually prevent us from removing the 0.1 runtime!

Note that we still have a few transitive dependencies on Tokio 0.1:

- async-unit uses tokio-compat
- hg depends on tokio-compat too, and we depend on it in tests

This isn't the end of the world though, we can live with that :)

Reviewed By: ahornby

Differential Revision: D26544410

fbshipit-source-id: 24789be2402c3f48220dcaad110e8246ef02ecd8
2021-02-22 09:22:42 -08:00
Lukas Piatkowski
cd0b6d50e2 autocargo v1: changes to match autocargo v2 generation results.
Summary:
The changes (and fixes) needed were:
- Ignore rules that are not rust_library or thrift_library (previously only rust_bindgen_library was ignored, so binary and test dependencies were incorrectly added to Cargo.toml)
- Thrift package name to match escaping logic of `tools/build_defs/fbcode_macros/build_defs/lib/thrift/rust.bzl`
- Rearrange some attributes, like features, authors, edition etc.
- Authors to use " instead of '
- Features to be sorted
- Sort all dependencies as one instead of grouping third party and fbcode dependencies together
- Manually format certain entries from third-party/rust/Cargo.toml, since V2 formats third party dependency entries and V1 just takes them as is.

Reviewed By: zertosh

Differential Revision: D26544150

fbshipit-source-id: 19d98985bd6c3ac901ad40cff38ee1ced547e8eb
2021-02-19 11:03:55 -08:00
Lukas Piatkowski
87ddbe2f74 autocargo v1: update autocargo field format to allow transition to autocargo v2
Summary:
Autocargo V2 will use a more structured format for autocargo field
with the help of `cargo_toml` crate it will be easy to deserialize and handle
it.

Also the "include" field is apparently obsolete as it is used for cargo-publish (see https://doc.rust-lang.org/cargo/reference/manifest.html#the-exclude-and-include-fields). From what I know this might often be wrong, especially if someone tries to publish a package from fbcode, then the private facebook folders might be shipped. Let's just not set it and in the new system one will be able to set it explicitly via autocargo parameter on a rule.

Reviewed By: ahornby

Differential Revision: D26339606

fbshipit-source-id: 510a01a4dd80b3efe58a14553b752009d516d651
2021-02-12 23:28:25 -08:00
Thomas Orozco
2a21e4fb17 third-party/rust: update Tokio to 0.2.25 + add a patch to disable coop scheduling
Summary:
See the patch & motivation here:

818f943db3

Reviewed By: StanislavGlebik

Differential Revision: D26399890

fbshipit-source-id: e184a3f6c1dd03cb4cdb7ea18073c3392d7ce355
2021-02-12 04:56:23 -08:00
Stefan Filip
93c1231c55 segmented_changelog: update hash_to_location to gracefully handle unknown hashes
Summary:
One of the primary use cases for hash_to_location is translating user provided
hashes. It is then perfectly valid for the hashes that are provided to not
exist.  Where we would previously return an error for the full request if a
hash was invalid, we now omit the hash from the response.
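
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the Mononoke implementation: the `Location` struct, the use of plain `String` hashes and the in-memory `HashMap` are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a graph location: a known descendant plus a distance.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Location {
    pub descendant: String,
    pub distance: u64,
}

/// Translate every hash found in `known`; hashes that are not found are
/// left out of the result instead of failing the whole request.
pub fn hash_to_location(
    known: &HashMap<String, Location>,
    hashes: &[&str],
) -> HashMap<String, Location> {
    hashes
        .iter()
        .filter_map(|hash| {
            known
                .get(*hash)
                .map(|location| ((*hash).to_string(), location.clone()))
        })
        .collect()
}
```

A caller can then tell which hashes were unknown by checking which keys are missing from the returned map.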

Reviewed By: quark-zju

Differential Revision: D26389472

fbshipit-source-id: c59529d43f44bed7cdb2af0e9babc96160e0c4a7
2021-02-11 12:17:35 -08:00
Stefan Filip
9c6b9af8e0 segmented_changelog: add SegmentedChangelog::changeset_id_to_location
Summary:
Get the graph location of a given commit identifier.

The client using segmented changelog will have only a set of identifiers for
the commits in the graph. The client needs a way to translate user input to
data that it has locally.  For example, when checking out an older commit by
hash the client will have to retrieve a location to understand the place in the
graph of the commit.

Reviewed By: quark-zju

Differential Revision: D26289623

fbshipit-source-id: 4192d91a4cce707419fb52168c5fdff53ac3a9d0
2021-02-10 10:19:03 -08:00
Stefan Filip
f256b5b752 segmented_changelog: add IdMap::find_many_vertexes
Summary: Batch variation for `find_vertex`. Useful for batching hash to location.

Reviewed By: quark-zju

Differential Revision: D26289618

fbshipit-source-id: b5e642d77715651f44acf64b5eb07529301a493f
2021-02-10 10:19:03 -08:00
Stefan Filip
8117a681e8 segmented_changelog: add guardrail to location_to_changeset_id
Summary:
If `location_to_changeset_id` somehow does not get a vector with one element
it will panic and our server will stop immediately. Not great for a server.
Chances are low but we already return `anyhow::Result` so no big pain here.
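
A guardrail of this kind might look like the sketch below. The helper name `exactly_one` is hypothetical, and a plain `String` error is used here instead of `anyhow::Result` so that the example depends only on the standard library.

```rust
/// Return the only element of `items`, or an error describing what was
/// received instead. Never panics, whatever the length of the vector.
pub fn exactly_one<T>(mut items: Vec<T>) -> Result<T, String> {
    if items.len() != 1 {
        return Err(format!(
            "expected exactly 1 element, got {}",
            items.len()
        ));
    }
    // The length was checked above, so `pop` yields `Some` here; the
    // fallback error only exists to avoid an `unwrap`.
    items
        .pop()
        .ok_or_else(|| "expected exactly 1 element, got 0".to_string())
}
```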

Reviewed By: quark-zju

Differential Revision: D26180417

fbshipit-source-id: 6986f3fdd0b34f7c2606162bc35aacb9857ea04c
2021-02-09 11:31:31 -08:00
Stefan Filip
78bc732d5e segmented_changelog: use dag_types::Location for location
Summary:
We had individual fields for location components. This change will make it
easier for people to read through the code. This completes the integration
of the Location struct.

Reviewed By: quark-zju

Differential Revision: D26162272

fbshipit-source-id: 76259578430bac88317afb1935f63e06b6e8284e
2021-02-09 11:31:31 -08:00
Stefan Filip
fe4e0be42e commit: use dag_types::Location for location_to_hash
Summary:
This is removing `edenapi::CommitLocation` in order to use
`dag_types::Location`.

First, `edenapi::CommitLocation` has a `count` field and `dag_types::Location`
does not. I find `count` to be difficult to attach to a more general structure.
In practice `edenapi::CommitLocation` is replaced by `CommitLocationToHashRequest`.
On top of the request we have the batch object: `CommitLocationToHashRequestBatch`.

Second, `edenapi::CommitLocation` did not have Wire types, the new structures do.

Reviewed By: quark-zju

Differential Revision: D26159865

fbshipit-source-id: f2508e123e11988726868c7f439a2ed186afce5c
2021-02-09 11:31:30 -08:00
Stefan Filip
65794e7c5b segmented_changelog: update SegmentedChangelogBuilder to always consume self
Summary:
In the public API at least. A public method will consume the builder. If some
code wants to call multiple methods using the same configuration, they can
safely clone the builder to get a second instance.

SegmentedChangelogBuilder needs to pass references internally to build
individual components otherwise it would have to clone itself excessively.
This pattern leaked towards public methods too. Some tests use this builder too
and they use some crate public methods that need to be defined using
references. I don't know if we should remove that dependency. Anyway, the
Builder is hopefully easier to use now.
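
The consume-and-clone pattern can be illustrated with a heavily simplified builder. Everything here (`ChangelogBuilder`, `Changelog`, the two fields) is hypothetical and only meant to show the shape of the API, not the real `SegmentedChangelogBuilder`.

```rust
/// Hypothetical, simplified builder; the fields are illustrative only.
#[derive(Clone, Debug, Default)]
pub struct ChangelogBuilder {
    repo_id: Option<u32>,
    cache_enabled: bool,
}

/// Hypothetical product of the builder.
#[derive(Debug, PartialEq, Eq)]
pub struct Changelog {
    pub repo_id: u32,
    pub cache_enabled: bool,
}

impl ChangelogBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Every public method takes `self` by value and hands it back.
    pub fn with_repo_id(mut self, repo_id: u32) -> Self {
        self.repo_id = Some(repo_id);
        self
    }

    pub fn with_cache(mut self, enabled: bool) -> Self {
        self.cache_enabled = enabled;
        self
    }

    /// Consumes the builder; callers that need it again clone it first.
    pub fn build(self) -> Result<Changelog, String> {
        let repo_id = self
            .repo_id
            .ok_or_else(|| "repo_id is required".to_string())?;
        Ok(Changelog {
            repo_id,
            cache_enabled: self.cache_enabled,
        })
    }
}
```

With this shape, `base.clone().with_cache(true).build()` followed by `base.build()` yields two instances from one configuration, and the compiler rejects any use of `base` after the second call.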

Reviewed By: quark-zju

Differential Revision: D26152066

fbshipit-source-id: 63285e200d8e9fde06fede03773b7d4c02e9cea7
2021-02-01 11:44:03 -08:00
Stefan Filip
5bf8012412 segmented_changelog: add caching to IdMap
Summary:
Caching for the IdMap to speed things up.
Values for a key do not change. The IdMap is versioned for a given Repository.
We use the version of the IdMap in the generation of the cache keys. We set the
"site version" to be the IdMap version.

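One way to picture versioned cache keys is the sketch below. It embeds the version directly in the key string, which is an assumption made for illustration; the commit itself talks about setting the cache "site version", and the key layout and function name here are hypothetical.

```rust
/// Build a cache key that embeds the IdMap version, so that entries written
/// for one version are never read back for another version.
pub fn idmap_cache_key(repo_id: u32, idmap_version: u64, vertex: u64) -> String {
    format!(
        "segmented_changelog.idmap.repo{}.v{}.vertex{}",
        repo_id, idmap_version, vertex
    )
}
```

Because values for a key never change within a version, entries keyed this way need no invalidation; bumping the version simply makes old entries unreachable.
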
Reviewed By: krallin

Differential Revision: D26121498

fbshipit-source-id: 7e82e40b818d1132a7e86f4cd7365dd38056348e
2021-01-29 16:41:42 -08:00
Thomas Orozco
6c6f698e99 mononoke/segmented_changelog: fix leader fallbacks being the wrong way around
Summary:
We have fallback logic to go to the leader if the data we want is missing in
the replica, but right now it's backwards so we go to the leader to find data
we actually *did* find in the replica (and we don't go to the leader for
missing data).
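
The intended direction of the fallback can be sketched as below. The two `HashMap`s standing in for the replica and the leader, the `Source` enum and the `lookup` function are all hypothetical and exist only for this illustration.

```rust
use std::collections::HashMap;

/// Which store answered the query (illustrative only).
#[derive(Debug, PartialEq, Eq)]
pub enum Source {
    Replica,
    Leader,
}

/// Read from the replica first and go to the leader only when the replica
/// does not have the entry.
pub fn lookup(
    replica: &HashMap<u64, String>,
    leader: &HashMap<u64, String>,
    key: u64,
) -> Option<(String, Source)> {
    if let Some(value) = replica.get(&key) {
        // Found in the replica: no need to bother the leader.
        return Some((value.clone(), Source::Replica));
    }
    // Missing in the replica: this is the only case that hits the leader.
    leader.get(&key).map(|value| (value.clone(), Source::Leader))
}
```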

Reviewed By: sfilipco

Differential Revision: D26103898

fbshipit-source-id: 535abab2a3093165f1d55359d102a7a7cb542a9c
2021-01-27 12:29:06 -08:00
Daniel Xu
5715e58fce Add version specification to internal dependencies
Summary:
Lots of generated code in this diff. Only code change was in
`common/rust/cargo_from_buck/lib/cargo_generator.py`.

Path/git-only dependencies (ie `mydep = { path = "../foo/bar" }`) are not
publishable to crates.io. However, we are allowed to specify both a path/git
_and_ a version. When building locally, the path/git is chosen. When publishing,
the version on crates.io is chosen.

See https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#multiple-locations .

Note that I understand that not all autocargo projects are published on crates.io (yet).
The point of this diff is to allow projects to slowly start getting uploaded.
The end goal is autocargo generated `Cargo.toml`s that can be `cargo publish`ed
without further modification.

Reviewed By: lukaspiatkowski

Differential Revision: D26028982

fbshipit-source-id: f7b4c9d4f4dd004727202bd98ab10e201a21e88c
2021-01-25 22:10:24 -08:00
Thomas Orozco
4dd3461824 third-party/rust: update Tokio 0.2.x to 0.2.24 & futures 1.x to 1.30
Summary:
When we tried to update to Tokio 0.2.14, we hit lots of hangs. Those were due
to incompatibilities between Tokio 0.2.14 and Futures 1.29. We fixed some of
the bugs (and others had been fixed and were pending a release), and Futures
1.30 has now been released, which unblocks our update.

This diff updates Tokio accordingly (the previous diff in the stack fixes an
incompatibility).

The underlying motivation here is to ease the transition to Tokio 1.0.
Ultimately we'll be pulling in those changes one way or another, so let's
get started on this incremental first step.

Reviewed By: farnz

Differential Revision: D25952428

fbshipit-source-id: b753195a1ffb404e0b0975eb7002d6d67ba100c2
2021-01-25 08:06:55 -08:00
Radu Szasz
5fb5d23ec8 Make tokio-0.2 include test-util feature
Summary:
This feature is useful for testing time-dependent stuff (e.g. it
allows you to stop/forward time). It's already included in the buck build.

Reviewed By: SkyterX

Differential Revision: D25946732

fbshipit-source-id: 5e7b69967a45e6deaddaac34ba78b42d2f2ad90e
2021-01-18 10:38:08 -08:00
Alex Hornby
ce85f95e55 mononoke: add choice of direction to bulkops
Summary: When scrubbing repos it is preferable to scrub newest data first.  This diff adds Direction::NewestFirst to bulkops for use in scrubbing and updates existing call sites to Direction::OldestFirst so as not to change behaviour
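
A direction switch of this kind could be sketched as follows. Only the `Direction::OldestFirst` and `Direction::NewestFirst` names come from the summary; the `chunk_bounds` helper and the idea of walking an inclusive id range in fixed-size chunks are assumptions made for the example.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Direction {
    OldestFirst,
    NewestFirst,
}

/// Split the inclusive id range `[low, high]` into chunks of at most `step`
/// ids and return them in the requested order.
pub fn chunk_bounds(low: u64, high: u64, step: u64, direction: Direction) -> Vec<(u64, u64)> {
    assert!(step > 0, "step must be positive");
    let mut chunks = Vec::new();
    let mut start = low;
    while start <= high {
        let end = high.min(start.saturating_add(step - 1));
        chunks.push((start, end));
        if end == high {
            // Stop here so that `end + 1` cannot overflow.
            break;
        }
        start = end + 1;
    }
    if direction == Direction::NewestFirst {
        chunks.reverse();
    }
    chunks
}
```

Passing `Direction::OldestFirst` at existing call sites keeps their current order, which matches the intent of not changing behaviour.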

Reviewed By: StanislavGlebik

Differential Revision: D25742279

fbshipit-source-id: 363a4854b14e9aa970b2c1ec491dcaccac7a6ec9
2021-01-11 10:55:39 -08:00
Alex Hornby
48ec577119 mononoke: remove ChangesetBulkFetch trait
Summary: There is only one implementation of the trait so remove it and use that impl directly.  Removing the trait  makes it simpler to work on bulkops in the rest of this stack.

Reviewed By: farnz

Differential Revision: D25804021

fbshipit-source-id: 22fe797cf87656932d383ae236f2f867e788a832
2021-01-07 08:18:50 -08:00
Stefan Filip
65054f2044 segmented_changelog: add comments around IdMap insert expectations
Summary:
Comments for why we don't need a lock when updating the SqlIdMap with multiple
writers. Structure can definitely be improved but I'll live with this for a
short time.

No fundamental change in logic. I added extra checks to the insert function and
changed from an optimistic insert race logic to a pessimistic version. I
explain in the comments that it's to have an easier time reasoning about what
happens and that theoretically doesn't matter.

Reviewed By: quark-zju

Differential Revision: D25606290

fbshipit-source-id: ea21915fc797fe759b3fe481e8ad9e8cb594fb6a
2020-12-23 16:51:08 -08:00
Stefan Filip
5f6d1a2c61 edenapi: add full_idmap_clone endpoint
Summary:
The end goal is to have clients using a sparse IdMap. There is still some work
to get there though. In the mean time we can test repositories that don't use
any revlogs. The current expectations for those repositories are that they have
a full idmap locally.

Reviewed By: quark-zju

Differential Revision: D25075341

fbshipit-source-id: 52ab881fc9c64d0d13944e9619c087e0d4fb547c
2020-12-08 18:30:24 -08:00
Stefan Filip
3afaeb858a segmented_changelog: add SegmentedChangelog::full_idmap_clone_data
Summary:
The client dag cannot currently be instantiated with a sparse idmap (aka
universal commit idmap). It should be usable with a full idmap.  To test
repositories that use segmented changelog exclusively we add the capability of
cloning the full idmap.

I currently see StreamCloneData as an experiment. I am open to suggestions
around what structure we should have for the regular long term clone endpoint.
That said, I am leaning towards converting clone_data to return
StreamCloneData.  Overall, Segmented Changelog has a few knobs that influence
how big the IdMap ends up being so the code that is more flexible will be more
useful long term.  To add to that, we transform data higher in the stack using
streaming and this data does similar fetching, it seems that we should have a
stream idmap exposed by clone_data.

Reviewed By: quark-zju

Differential Revision: D24966338

fbshipit-source-id: 019b363568e3191280bd5ac09fc15062711e5523
2020-12-08 18:30:24 -08:00
Pavel Aslanov
337bab2744 convert to new type futures
Summary: Convert `ChangesetFetcher` to new type futures

Reviewed By: StanislavGlebik

Differential Revision: D25244213

fbshipit-source-id: 4207386d81397a930a566db008019bb8f31bf602
2020-12-02 15:40:12 -08:00
Stefan Filip
4b9dc9074f segmented_changelog: measure runs/failures/duration for updates
Summary: Basic observability for how the segmented changelog update process is performing.

Reviewed By: krallin

Differential Revision: D25108739

fbshipit-source-id: b1f406eb0c862464b186f933d126e0f3a08144e4
2020-12-01 17:29:23 -08:00
Stefan Filip
b2aac949cf cmds: update segmented-changelog-tailer to run on a list of repos
Summary:
The update of the segmented changelog is light weight enough that we can
consider all repositories sharing a common tailer process. With all
repositories sharing a single tailer the maintenance burden will be lower.

Things that I am particularly unsure about are: tailer configuration setup and
tailer structure. With regards to setup, I am not sure if this is more or less
than what production servers do to instantiate. With regards to structure, I
think that it makes a lot of sense to have a function that takes a single repo
name as parameter but the configuration setup has an influence on the details.
I am also unsure how important it is to parallelize the instantiation of the
blobrepos.

Finally, it is worth mentioning that the SegmentedChangelogTailer waits for
`delay` after an update finishes rather than on a period. The benefit is that
we don't have large updates taking down a process because we schedule the same
large repo update too many times. The drawback is that scheduling gets messed
up over time and multiple repo updates can end up starting at the same time.
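
The scheduling trade-off can be made concrete with a small calculation. The two helper functions below are hypothetical and only compute start times in arbitrary time units; they are not how the tailer itself is written.

```rust
/// Start times when each run begins `delay` after the previous run finished.
pub fn starts_with_delay_after_finish(durations: &[u64], delay: u64) -> Vec<u64> {
    let mut starts = Vec::with_capacity(durations.len());
    let mut now: u64 = 0;
    for duration in durations {
        starts.push(now);
        // The next run waits for this one to finish, then sleeps `delay`.
        now += *duration + delay;
    }
    starts
}

/// Start times on a fixed period, regardless of how long each run takes.
pub fn starts_on_fixed_period(runs: usize, period: u64) -> Vec<u64> {
    (0..runs as u64).map(|i| i * period).collect()
}
```

With run durations of 5, 30 and 5 units and a delay of 10, the first function never starts a run before the previous one has finished, while a fixed period of 10 would start the third run while the 30-unit run is still in progress. The price is that start times drift and depend on how long earlier runs took.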

Reviewed By: farnz

Differential Revision: D25100839

fbshipit-source-id: 5fff9f87ba4dc44a17c4a7aaa715d0698b04f5c3
2020-12-01 17:29:23 -08:00
Kostia Balytskyi
e4dab84619 scuba: turn ScubaSampleBuilderExt into a wrapper struct
Summary:
This diff prepares the Mononoke codebase for composition-based extendability of
`ScubaSampleBuilder`. Specifically, in the near future I will add:
- new methods for verbose scuba logging
- new data field (`ObservabilityContext`) to check if verbose logging should
  be enabled or disabled

The higher-level goal here is to be able to enable/disable verbose Scuba
logging (either overall or for certain slices of logs, like for a certain
session id) in real time, without restarting Mononoke. To do so, I plan to
expose the aforementioned verbose logging methods, which will run a check
against the stored `ObservabilityContext` and make a decision of whether the
logging is enabled or not. `ObservabilityContext` will of course hide
implementation details from the renamed `ScubaSampleBuilderExt`, and just provide a yes/no
answer based on the current config and sample fields.

At the moment this should be a completely harmless change.

Reviewed By: krallin

Differential Revision: D25211089

fbshipit-source-id: ea03dda82fadb7fc91a2433e12e220582ede5fb8
2020-11-30 21:26:24 -08:00
Pavel Aslanov
4a0cb69c4e convert BlobRepo::{changeset_exists_by_bonsai, get_changeset_parents_by_bonsai} to new futures
Summary: convert `BlobRepo::{changeset_exists_by_bonsai, get_changeset_parents_by_bonsai}` to new futures

Reviewed By: ahornby

Differential Revision: D25195811

fbshipit-source-id: 0238440aa0757af6362effe09f1771c939bda030
2020-11-27 11:11:18 -08:00
Lukas Piatkowski
fa1a195fd0 mononoke/blobstore: pass CoreContext via borrowed instead of owned value
Summary: Follow up after removing 'static from blobstore.

Reviewed By: StanislavGlebik

Differential Revision: D25182106

fbshipit-source-id: e13a7a31d71b4674425123268e655ae66127f1b7
2020-11-27 03:31:07 -08:00
Stefan Filip
3ffb223968 config: add SegmentedChangelog that downloads dag for functionality
Summary:
Under this configuration SegmentedChangelog Dags (IdDag + IdMap) are always
downloaded from saves. There is no real state kept in memory.

It's a simple configuration and somewhat flexible with tweaks to blobstore
caching.

Reviewed By: krallin

Differential Revision: D24808330

fbshipit-source-id: 450011657c4d384b5b42e881af8a1bd008d2e005
2020-11-11 22:53:38 -08:00
Stefan Filip
3446a65526 segmented_changelog: add SegmentedChangelog::clone_data
Summary:
Constructs and returns `CloneData<ChangesetId>`. This object can then be used
to bootstrap a client dag that speaks bonsai commits.
Short term we are going to be using this data in the Mercurial client which
doesn't use bonsai. Hg MononokeRepo will convert it.

Long term we may decide that we want to download cached artifacts for
CloneData.  I don't see an issue getting there, I see this as a valid path
forward that cuts down on the configuration required to get to the cached
artifacts.  All that said, I think that using whatever dag is available in
memory would be a viable production option.

Reviewed By: krallin

Differential Revision: D24717915

fbshipit-source-id: 656924abb4bbfa1a11431000b6ca6ed2491cdc74
2020-11-11 22:53:37 -08:00
Stefan Filip
59d8ccf690 segmented_changelog: add SegmentedChangelogManager
Summary: The SegmentedChangelogManager abstracts saving and loading Dags. This is currently used in the tailer and seeder processes. It will also be used to load dags while the server is running.

Reviewed By: krallin

Differential Revision: D24717925

fbshipit-source-id: 30dff7dfc957f455be6cf733b20449c804511b43
2020-11-11 22:53:37 -08:00
Mark Juggurnauth-Thomas
0eb32649a6 segmented_changelog: resync autocargo
Summary: The `dag` dependency now generates a `for-tests` feature requirement.

Reviewed By: sfilipco, krallin

Differential Revision: D24888944

fbshipit-source-id: 6da646d71ae99118dcdc33673565056462a4c8ad
2020-11-11 09:31:42 -08:00
Stefan Filip
18a6d2aef3 segmented_changelog: update sql query for last idmap entry
Summary:
MySQL doesn't like that the idmap table is renamed to `inner`. For good reason,
inner is a keyword, best to rename it.

Reviewed By: ahornby

Differential Revision: D24568914

fbshipit-source-id: 7a3790e835931b29658c7652cc89069c6b9b5bab
2020-10-29 17:40:19 -07:00
Stefan Filip
1089012b05 segmented_changelog: add SegmentedChangelogBuilder::with_blobrepo
Summary:
I avoided this function because it interacts in weird ways with dependencies.
At this point I am no longer concerned about that and it can help us simplify
some code.
Looking ahead I think that we will refactor things into having fewer
dependencies.

Reviewed By: krallin

Differential Revision: D24555935

fbshipit-source-id: 994b25d90da491bb5cc593b6c33085790c4fb322
2020-10-29 17:40:19 -07:00
Stefan Filip
2391173a3f segmented_changelog: add segmented changelog tailer command
Summary:
The command reads the last SegmentedChangelog that was saved for a repository
and updates head to match a given bookmark (master).

Right now this is just a command that works on one repository. Follow up
changes will look at deployment options and handling multiple repositories.

Reviewed By: krallin

Differential Revision: D24516438

fbshipit-source-id: 8f04f9426c2f2d7748c5363d2dbdf9f3acb79ddd
2020-10-29 17:40:19 -07:00
Stefan Filip
07200876bb segmented_changelog: account for iddag lag in incremental build
Summary:
I initially saw the incremental build as something that would be run in places
that had IdMap and IdDag stored side by side in process. I am now considering
using incremental build in the tailing process to keep Segmented Changelog
artifacts up to date.

Since we update the IdMap before we update the IdDag, it is likely that we
will have runs that only update the IdMap and fail to update IdDags. This diff
adds a mechanism for the IdDag to catch up.

Reviewed By: krallin

Differential Revision: D24516440

fbshipit-source-id: 3a99248451d806ae20a0ba96199a34a8a35edaa4
2020-10-29 17:40:19 -07:00
Stefan Filip
a30217fe1b segmented_changelog: context and debug derives for easier debugging
Summary:
Nice to have things for debugging.

This isn't an exhaustive list of places that we could add context too. I'll
probably look to complete the list after the current changes are done.

Reviewed By: krallin

Differential Revision: D24516437

fbshipit-source-id: 7f29e7afde5a5918aea419181d786f48da9b8b14
2020-10-29 17:40:19 -07:00
Stefan Filip
7f274cf1ff segmented_changelog: style updates to segmented changelog seeder
Summary: Style.

Reviewed By: krallin

Differential Revision: D24516439

fbshipit-source-id: 11582b25e86b20c3e27a4ac4c299119f0b3c72a0
2020-10-29 17:40:19 -07:00
Stefan Filip
d1229b7fad segmented_changelog: update iddag store key to blake2
Summary:
The general goal is to align segmented changelog blobstore usage with the
general pattern in Mononoke.

Reviewed By: quark-zju

Differential Revision: D24605796

fbshipit-source-id: 808985609f74ebc45f3fcc57583e55f3af9bce1d
2020-10-29 17:40:18 -07:00
Stefan Filip
2207e27ce0 segmented_changelog: replace sql log tables with scuba
Summary:
From an OSS perspective, I think that the log tables have a place. However for
daily use perspective, next to scuba they don't add much except retention and
instead feel more heavy weight to manage. This change probably simplifies
things and makes the Segmented Changelog component easier to maintain.

Reviewed By: krallin

Differential Revision: D24213548

fbshipit-source-id: 48a4ea57e3f3911c3bf82b0cc51f118d72119e19
2020-10-09 11:17:03 -07:00
Stanislau Hlebik
5251028e61 mononoke: fix build
Summary: looks like we got land time conflict

Reviewed By: krallin

Differential Revision: D24196362

fbshipit-source-id: 27da83a2f86cc7fe5f59fe583d4b719f69df0248
2020-10-08 12:23:19 -07:00
Stefan Filip
fa0c15ab87 cmds: add segmented_changelog seeder
Summary:
Mononoke command for running the SegmentedChangelogSeeder for an existing
repository. The result is going to be a new IdMap version in the metadata
store and a new IdDag stored in the blobstore resulting in a brand new
SegmentedChangelog bundle.

Reviewed By: krallin

Differential Revision: D24096963

fbshipit-source-id: 1eaf78392d66542d9674a99ad0a741f24bc2cb1b
2020-10-08 09:43:47 -07:00
Stefan Filip
aeae90f1ee segmented_changelog: add SegmentedChangelogSeeder
Summary:
The SegmentedChangelogSeeder has the role of constructing a new IdMap for a
given repository. That would happen when a repository is onboarded or when
algorithm improvements are made.

This change comes with small refactoring. We had the Dag which did a bit of
everything. Now the on_demand_update and the seeder functionalities are in
their separate files. The tests from `dag.rs` were moved to the `tests.rs` and
updated to use the seeder and on_demand_update structures.

`SegmentedChangelogSeeder::run` is the main logic added in this diff.

Reviewed By: quark-zju

Differential Revision: D24096965

fbshipit-source-id: 0f655e8c226ca0051f3e925342e92b1e7979aab2
2020-10-08 09:43:47 -07:00
Stefan Filip
225c4083da segmented_changelog: add IdDagSaveStore
Summary:
The IdDagStore provides the ability to save and later load prebuilt instances
of the IdDag.
This is going to be used in the clone API where we send one of these blobs to
the client. It is also going to be used by servers starting up.
Right now the serialization is naive, relying on serde::Serialize. The key
schema would provide the means for evolving the serialization format in cases
where we would require breaking changes.

Reviewed By: quark-zju

Differential Revision: D24096967

fbshipit-source-id: 2c883e5e82c05bec03c429c3c2a2d545170a8c05
2020-10-08 09:43:46 -07:00
Stefan Filip
6883e90d30 segmented_changelog: add IdMap, IdDag, Bundle version stores
Summary:
This IdMapVersionStore determines which is the latest IdMapVersion that commit
"tailing" processes should use when building new Dag bundles.  The "seed"
process will update the versions of the IdMap. The plan for the "seed" process
is to write a new IdMap version to Sql then update the store with a new entry.
New "tailer" processes will then start to use the newly built IdMapVersion.
The tailing processes will build fresh IdDags for general consumption.
These IdDags will be used by the clone operation. These dags will also be used
by server instances spinning up.
DagBundles specify (id dag version, id map version). This pair specifies a
prebuilt Segmented Changelog that is ready to be loaded.

Reviewed By: quark-zju

Differential Revision: D24096968

fbshipit-source-id: 413f49ed185a770a73afd17dfbc952901ab53b42
2020-10-08 09:43:46 -07:00
David Tolnay
0cb8a052f5 Update formatter to rustfmt 2.0
Reviewed By: zertosh

Differential Revision: D23591021

fbshipit-source-id: e664aa2fdd3aaa457796a59080be6b94f604a112
2020-09-09 07:52:33 -07:00
Stefan Filip
3f0b08e46f segmented_changelog: add version field to IdMap
Summary:
The version is going to be used to seamlessly upgrade the IdMap. We can
generate the IdMap in a variety of ways. Naturally, algorithms for generating
the IdMap may change, so we want a mechanism for updating the shared IdMap.

A generated IdDag is going to require a specific IdMap version. To be more
precise, the IdDag is going to specify which version of IdMap it has to be
interpreted with.

Reviewed By: quark-zju

Differential Revision: D23501158

fbshipit-source-id: 370e6d9f87c433645d2a6b3336b139bea456c1a0
2020-09-03 16:33:20 -07:00
Stefan Filip
58a4821fe3 segmented_changelog: add IdMap trait with SqlIdMap implementation
Summary:
Separate the operational bits of the IdMap from the core SegmentedChangelog
requirements.

I debated whether it makes sense to add repo_id to SqlIdMap. Given the current
architecture I don't see a reason not to do it. On the contrary separating
two objects felt convoluted.

Reviewed By: quark-zju

Differential Revision: D23501160

fbshipit-source-id: dab076ab65286d625d2b33476569da99c7b733d9
2020-09-03 16:33:20 -07:00
Stefan Filip
f3c353edbc segmented_changelog: change idmap module from file to directory
Summary:
Planning to add a trait for core idmap functionality (that's just translating
cs_id to vertex and back). The current IdMap will then be an implementation of
that trait.

Reviewed By: quark-zju

Differential Revision: D23501159

fbshipit-source-id: 34e3b26744e4b5465cd108cca362c38070317920
2020-09-03 16:33:20 -07:00
Stefan Filip
e57b1f9265 segmented_changelog: add on-demand updating dag implementation
Summary:
The Segmented Changelog must be built somewhere. One of the simplest deployments
involves the on-demand update of the graph. When a commit that wasn't yet
processed is encountered, we send it to processing along with all of its
ancestors.
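
As a toy illustration of the on-demand idea, the sketch below assigns ids to a commit and to every ancestor that has not been seen yet. The `OnDemandIdMap` type, the string commit names and the parent map are hypothetical, and the real id assignment in Segmented Changelog is more involved than this.

```rust
use std::collections::HashMap;

/// Toy id map that is filled in lazily (illustrative only).
pub struct OnDemandIdMap {
    ids: HashMap<String, u64>,
    next_id: u64,
}

impl OnDemandIdMap {
    pub fn new() -> Self {
        OnDemandIdMap { ids: HashMap::new(), next_id: 0 }
    }

    pub fn get(&self, commit: &str) -> Option<u64> {
        self.ids.get(commit).copied()
    }

    /// Make sure `head` and all of its ancestors have ids. Parents are
    /// processed before children, so a parent always has the smaller id.
    pub fn ensure(&mut self, parents: &HashMap<String, Vec<String>>, head: &str) -> u64 {
        // Each stack entry is (commit, have its parents been scheduled?).
        let mut stack = vec![(head.to_string(), false)];
        while let Some((commit, parents_done)) = stack.pop() {
            if self.ids.contains_key(&commit) {
                continue; // already processed earlier
            }
            if parents_done {
                self.ids.insert(commit, self.next_id);
                self.next_id += 1;
            } else {
                stack.push((commit.clone(), true));
                for parent in parents.get(&commit).into_iter().flatten() {
                    if !self.ids.contains_key(parent) {
                        stack.push((parent.clone(), false));
                    }
                }
            }
        }
        self.ids[head]
    }
}
```

Calling `ensure` a second time for a commit that is already known does no extra work and returns the id assigned earlier.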

At this time not much attention was paid to the distinction of master commit
versus non-master commit. For now the expectation is that only commits from
master will exercise this code path. The current expectation is that clients
will only call location-to-hash using commits from master.
Let me know if there is an easy way to check if a commit is part of master.
Later changes will invest more in handling non-master commits.

Reviewed By: aslpavel

Differential Revision: D23456218

fbshipit-source-id: 28c70f589cdd13d08b83928c1968372b758c81ad
2020-09-02 17:20:42 -07:00
Stefan Filip
d50e09a41d segmented_changelog: add SegmentedChangelogBuilder
Summary:
This builder implements SqlConstruct and SqlConstructFromMetadataDatabaseConfig
to make handling the Sql connection for IdMap consistent with what happens in
Mononoke in general.

Reviewed By: aslpavel

Differential Revision: D23456219

fbshipit-source-id: 6998afbbfaf1e0690a40be6e706aca1a3b47829f
2020-09-02 17:20:42 -07:00
Stefan Filip
66706d77c5 segmented_changelog: add SegmentedChangelog trait
Summary:
The trait provides two methods for location to hash translation. The first
returns a single hash and is existing functionality. The second returns a
list of hashes and represents new functionality. This diff also adds this
functionality to the Dag structure which is currently the only real
implementation for SegmentedChangelog.
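
The two translation methods can be pictured with a toy trait over a purely linear history. The trait name `LocationToHash`, the method signatures, the `LinearHistory` type and the string error type are all assumptions for this sketch; it also assumes that a location is a (descendant, distance) pair and that the list variant walks further back from that location.

```rust
/// Toy version of a location-to-hash interface (names are hypothetical).
pub trait LocationToHash {
    /// The ancestor of `descendant` that is `distance` steps back.
    fn location_to_hash(&self, descendant: &str, distance: usize) -> Result<String, String>;

    /// `count` commits, starting at the location and walking further back.
    fn location_to_many_hashes(
        &self,
        descendant: &str,
        distance: usize,
        count: usize,
    ) -> Result<Vec<String>, String>;
}

/// A history with no merges, stored oldest commit first.
pub struct LinearHistory {
    pub commits: Vec<String>,
}

impl LocationToHash for LinearHistory {
    fn location_to_hash(&self, descendant: &str, distance: usize) -> Result<String, String> {
        let mut hashes = self.location_to_many_hashes(descendant, distance, 1)?;
        hashes.pop().ok_or_else(|| "empty result".to_string())
    }

    fn location_to_many_hashes(
        &self,
        descendant: &str,
        distance: usize,
        count: usize,
    ) -> Result<Vec<String>, String> {
        let index = self
            .commits
            .iter()
            .position(|commit| commit.as_str() == descendant)
            .ok_or_else(|| format!("unknown commit {}", descendant))?;
        let first = index
            .checked_sub(distance)
            .ok_or_else(|| format!("distance {} goes past the root", distance))?;
        if count > first + 1 {
            return Err(format!(
                "cannot return {} commits, only {} available",
                count,
                first + 1
            ));
        }
        Ok((0..count).map(|step| self.commits[first - step].clone()).collect())
    }
}
```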

Reviewed By: aslpavel

Differential Revision: D23456215

fbshipit-source-id: 0c2ca91672cf23129342c585f98446c0ebbdf7ef
2020-09-02 17:20:41 -07:00
Jun Wu
6fd7a2e582 dag: use concrete error types
Summary:
This is more complex than previous libraries, mainly because `dag` defines APIs
(traits) used by other code, which might raise error types that `dag` itself
is not interested in. `BackendError::Other(anyhow::Error)` is currently used to
capture types that do not fit in `dag`'s predefined error types.

Reviewed By: sfilipco

Differential Revision: D22883865

fbshipit-source-id: 3699e14775f335620eec28faa9a05c3cc750e1d1
2020-08-06 12:31:57 -07:00
Jun Wu
8d0f48c4da dag: rename some anyhow::Result to dag::Result
Summary:
Prefix some `Result` with `dag::Result`. Since `dag::Result` is just
`anyhow::Result` for now, this does not change anything but makes
it more compatible with upcoming changes.

Reviewed By: sfilipco

Differential Revision: D22883864

fbshipit-source-id: 95a26897ed026f1bb8000b7caddeb461dcaad0e7
2020-08-06 12:31:57 -07:00
Stefan Filip
2f3e569120 mononoke_api: add segmented changelog location to hash translation
Summary:
This functionality is going to be used in EdenApi. The translation is required
to unblock removing the changelog from the local copy of the repositories.
However the functionality is not going to be turned on in production just yet.

Reviewed By: kulshrax

Differential Revision: D22869062

fbshipit-source-id: 03a5a4ccc01dddf06ef3fb3a4266d2bfeaaa8bd2
2020-08-04 11:22:39 -07:00
Arun Kulshreshtha
5f0181f48c Regenerate all Cargo.tomls after upgrade to futures 0.3.5
Summary: D22381744 updated the version of `futures` in third-party/rust to 0.3.5, but did not regenerate the autocargo-managed Cargo.toml files in the repo. Although this is a semver-compatible change (and therefore should not break anything), it means that affected projects would see changes to all of their Cargo.toml files the next time they ran `cargo autocargo`.

Reviewed By: dtolnay

Differential Revision: D22403809

fbshipit-source-id: eb1fdbaf69c99549309da0f67c9bebcb69c1131b
2020-07-06 20:49:43 -07:00
Stefan Filip
422c84b659 mononoke: monitor replication lag in segmented_changelog::IdMap
Summary:
The (re)construction process for the IdMap will generate millions of rows
to be inserted in our database. We want to throttle the inserts so that
the database doesn't topple over.
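
Throttling on replication lag might be structured roughly like the sketch below. The function name, the closures for reading the lag, waiting and inserting, and the batch size and threshold parameters are all hypothetical; in practice the lag would come from the database and the wait would be a sleep.

```rust
/// Insert `rows` in batches, pausing before a batch for as long as the
/// reported replication lag is above `max_lag_secs`.
/// Returns how many times the function had to wait.
pub fn insert_throttled(
    rows: &[u64],
    batch_size: usize,
    max_lag_secs: u64,
    mut replication_lag_secs: impl FnMut() -> u64,
    mut wait: impl FnMut(),
    mut insert_batch: impl FnMut(&[u64]),
) -> usize {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut waits = 0;
    for batch in rows.chunks(batch_size) {
        // Re-check the lag after every wait; proceed once replicas caught up.
        while replication_lag_secs() > max_lag_secs {
            wait();
            waits += 1;
        }
        insert_batch(batch);
    }
    waits
}
```

Taking the lag probe and the wait as closures keeps the loop testable without a database or real sleeping.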

Reviewed By: ikostia

Differential Revision: D22104349

fbshipit-source-id: 73b7c2bab12ae0cd836080bcf1eb64586116e70f
2020-07-01 18:18:55 -07:00
Jeremy Fitzhardinge
35b292ce9d eden: manual dependency fixes
Summary:
Tooling can't handle named_deps yet, but it can warn about them

P133451794

Reviewed By: StanislavGlebik

Differential Revision: D22083499

fbshipit-source-id: 46de533c19b13b2469e912165c1577ddb63d15cd
2020-06-17 17:55:04 -07:00
Jeremy Fitzhardinge
1b4edb5567 eden: remove unused Rust dependencies
Summary:
Remove unused dependencies for Rust targets.

This failed to remove the dependencies in eden/scm/edenscmnative/bindings
because of the extra macro layer.

Manual edits (named_deps) and misc output in P133451794

Reviewed By: dtolnay

Differential Revision: D22083498

fbshipit-source-id: 170bbaf3c6d767e52e86152d0f34bf6daa198283
2020-06-17 17:55:03 -07:00
Lukas Piatkowski
07e0427eb4 mononoke: make segmented_changelog OSS buildable
Reviewed By: krallin

Differential Revision: D21980974

fbshipit-source-id: 60e8f0e78b2bd654e5109b39c85f08ee5b3e9490
2020-06-12 06:54:32 -07:00
Jeremy Fitzhardinge
d03da109c7 mononoke: fix some little lints
Reviewed By: StanislavGlebik

Differential Revision: D22004427

fbshipit-source-id: 1418f91a3322ef3a3ff620c4e17745fc765dc971
2020-06-12 01:33:20 -07:00
Stefan Filip
c06de91150 mononoke: add tracking for SQL connection usage
Summary: Incrementing CoreContext::perf_counters().

Reviewed By: krallin

Differential Revision: D21875162

fbshipit-source-id: 0187034c3004efb86d74ce23ddf61a2b469efc61
2020-06-11 12:38:49 -07:00
Stefan Filip
754c1d00c0 mononoke: add counters to segmented changelog calls
Summary: Title.

Reviewed By: krallin

Differential Revision: D21875163

fbshipit-source-id: e7c50e30d10a4d192edb082c0cabe290c4e6d29a
2020-06-11 12:38:49 -07:00
Stefan Filip
0bb787d892 mononoke: add batch query capability to segmented changelog IdMap
Summary:
Useful for keeping the insert logic working nicely even when we double insert
or have other inconsistencies.

Reviewed By: krallin

Differential Revision: D21853048

fbshipit-source-id: 958cbc0435f330749c5aae1ea0f41ecd03b01468
2020-06-11 12:38:48 -07:00
Mark Thomas
76bff4c32d segmented_changelog: fix warnings
Reviewed By: sfilipco

Differential Revision: D21998893

fbshipit-source-id: f0b8d82ec4937263fbe0db2e626a3847118e5b08
2020-06-11 10:55:31 -07:00
Stefan Filip
da6a88756e mononoke: add IdMap::insert_many
Summary: Bulk insertion to the SQL backed IdMap.

Reviewed By: StanislavGlebik

Differential Revision: D21655847

fbshipit-source-id: 937cb910edbe399fed4e6b0c4013e18378cea82f
2020-06-02 19:17:53 -07:00
Stefan Filip
134a7a425c mononoke: update SegmentedChangelog building to pick up where it left off
Summary:
Instead of always building from scratch, continue assigning Vertexes and
Segments from the last commit that was processed.
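
A rough sketch of resuming from the last processed commit, assuming a linear history and an in-memory map. `ToyIdMap` and `build_incremental` are hypothetical; the real builder also assigns Segments and handles merges, which this omits.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for the SQL-backed IdMap:
/// vertex number -> commit identifier.
#[derive(Default)]
struct ToyIdMap {
    entries: BTreeMap<u64, String>,
}

impl ToyIdMap {
    /// Highest assigned vertex and its commit, if anything was assigned.
    fn get_last_entry(&self) -> Option<(u64, &str)> {
        self.entries
            .iter()
            .next_back()
            .map(|(vertex, cs)| (*vertex, cs.as_str()))
    }
}

/// Assign vertexes to the commits in `history` (oldest first) that come
/// after the last processed commit. Returns the number of newly assigned
/// commits, or `None` if the last processed commit is not in `history`.
fn build_incremental(idmap: &mut ToyIdMap, history: &[&str]) -> Option<usize> {
    let (next_vertex, start) = match idmap.get_last_entry() {
        // Resume right after the last commit that already has a vertex.
        Some((vertex, cs)) => (vertex + 1, history.iter().position(|c| *c == cs)? + 1),
        // Nothing assigned yet: build from scratch.
        None => (0, 0),
    };
    let new_commits = &history[start..];
    for (offset, cs) in new_commits.iter().enumerate() {
        idmap.entries.insert(next_vertex + offset as u64, cs.to_string());
    }
    Some(new_commits.len())
}
```

Calling `build_incremental` twice with a growing history only assigns vertexes to the commits added since the first call.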

Reviewed By: StanislavGlebik

Differential Revision: D21634699

fbshipit-source-id: 9f8b890dcf65c59a66651343f0ccc1487efc2394
2020-06-02 19:17:53 -07:00
Stefan Filip
e44681e307 mononoke: add IdMap::get_last_entry
Summary: Useful for determining where an incremental building step left off.

Reviewed By: StanislavGlebik

Differential Revision: D21634698

fbshipit-source-id: e9b0473003c529d5c934754f1ece23df69c4be66
2020-05-26 07:37:11 -07:00
Stefan Filip
cafbbcd9c8 mononoke: add MemIdMap
Summary:
This structure has similar functionality to the IdMap that is backed by SQL.
It is probably going to be useful for caching in the case of batch operations.

Reviewed By: quark-zju

Differential Revision: D21601820

fbshipit-source-id: 9c3ebc3e9dc92a59ce0908fc241bd2b97da88dca
2020-05-21 17:19:16 -07:00
Stefan Filip
b9a93d49e5 mononoke: bulk construction of the segmented changelog Dag
Summary:
`Dag::build_all_graph` will load the whole graph for a given repository and
construct the segmented changelog from it.

Reviewed By: StanislavGlebik

Differential Revision: D21538029

fbshipit-source-id: b4ba846bb2870ba73257bed6128b8e198a0aab3e
2020-05-21 17:19:15 -07:00
Thomas Orozco
4852b9a9ff mononoke/segmented_changelog: remove a warning
Summary: What it says in the title

Reviewed By: StanislavGlebik

Differential Revision: D21549635

fbshipit-source-id: 75939ebbfb317a9beaa9acd1fc1a7c6f41b0f88f
2020-05-13 10:47:10 -07:00
Stefan Filip
aae5b96b8d segmented_changelog: add support for multiple repositories
Summary:
How is this Dag structure going to be used? This is probably the interesting
question for this diff.
On one side the structure could maintain a view of all the repositories and
manage the DAGs for all repositories in a central place. On the other side the
`Dag` is just an instance of a Changelog and Mononoke manages repositories that
each have a `Dag`. I went with the former pattern as it seems to me to be more
in line with the general architecture of Mononoke.

We can see the Dag being another part of the BlobRepo in the future. We will
want to avoid depending on the BlobRepo for actual functionality to avoid
cyclic dependencies. Currently the BlobRepo is used in construction for
convenience but that will have to change in the future.

Reviewed By: StanislavGlebik

Differential Revision: D21418367

fbshipit-source-id: 7c133eac0f38084615c2b9ba1466de626d2ffcbe
2020-05-11 09:12:08 -07:00
Stefan Filip
bbe605a47f Update Segmented Changelog IdMap storage to SQL
Summary:
The transformation is pretty direct. I didn't add additional functionality
to the IdMap and I did not update the construction algorithm yet. The querying
methods on IdMap were updated to async and then there are the SQL interaction
details.

In follow up changes I want to update the construction algorithm and add support
for multiple repositories.

I am not happy with the names of the columns or naming in general in this code.
Open to suggestions. One idea could be matching the client nomenclature as much
as possible.

Reviewed By: StanislavGlebik

Differential Revision: D20929576

fbshipit-source-id: 12104892faa69f37c141e8baf54d5fb24fc5df6b
2020-05-08 07:35:16 -07:00
Stefan Filip
ea89b541e1 segmented_changelog: add Dag struct and location_to_name functionality
Summary:
The IdDag provides graph algorithms using Segments.
The IdMap allows converting from the SegmentedChangelogId domain to the
ChangesetId domain.
The Dag struct wraps IdDag and IdMap in order to provide graph algorithms using
the common application level identifiers for commits (ChangesetId).

The construction of the Dag is currently mocked with something that can only be
used in a test environment (unit tests but also integration tests).

This diff also implements a location_to_name function. This is the most
important new functionality that segmented changelog clients require. It
recovers the hash of a commit for which the client only has a segmented
changelog Id. The current assumption is that clients have identifiers for all
merge commit parents so the path to a known commit always follows a set
of first parents.

The IdMap queries will have to be changed to async in the future, but IdDag
queries we expect to stay sync.
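
The wrapping described above can be sketched as follows. This is an illustration under simplifying assumptions, not the code from this diff: every `Toy*` type, the `linear_dag` helper and the exact `location_to_name` signature are hypothetical, the graph stores only first-parent edges, and everything is synchronous and in memory.

```rust
use std::collections::HashMap;

/// Hypothetical stand-ins for the integer id domain and for ChangesetId.
type Vertex = u64;
type ChangesetId = &'static str;

/// Toy graph over integer ids; only first-parent edges are stored.
struct ToyIdDag {
    first_parent: HashMap<Vertex, Vertex>,
}

impl ToyIdDag {
    /// Follow `n` first-parent edges starting at `vertex`.
    fn first_ancestor_nth(&self, vertex: Vertex, n: u64) -> Option<Vertex> {
        let mut current = vertex;
        for _ in 0..n {
            current = *self.first_parent.get(&current)?;
        }
        Some(current)
    }
}

/// Toy bidirectional mapping between the two id domains.
struct ToyIdMap {
    to_cs: HashMap<Vertex, ChangesetId>,
    to_vertex: HashMap<ChangesetId, Vertex>,
}

/// Wraps the graph and the map so callers only see ChangesetIds.
struct ToyDag {
    iddag: ToyIdDag,
    idmap: ToyIdMap,
}

impl ToyDag {
    /// Resolve "the `distance`-th first-parent ancestor of `known`".
    fn location_to_name(&self, known: ChangesetId, distance: u64) -> Option<ChangesetId> {
        let start = *self.idmap.to_vertex.get(&known)?;
        let target = self.iddag.first_ancestor_nth(start, distance)?;
        self.idmap.to_cs.get(&target).copied()
    }
}

/// Build a linear history where `names[i]` is the first parent of `names[i + 1]`.
fn linear_dag(names: &[ChangesetId]) -> ToyDag {
    let mut first_parent = HashMap::new();
    let mut to_cs = HashMap::new();
    let mut to_vertex = HashMap::new();
    for (i, name) in names.iter().enumerate() {
        let vertex = i as Vertex;
        to_cs.insert(vertex, *name);
        to_vertex.insert(*name, vertex);
        if i > 0 {
            first_parent.insert(vertex, vertex - 1);
        }
    }
    ToyDag {
        iddag: ToyIdDag { first_parent },
        idmap: ToyIdMap { to_cs, to_vertex },
    }
}
```

With `linear_dag(&["a", "b", "c", "d"])`, asking for the ancestor two first-parent steps above `"d"` resolves to `"b"`; an unknown commit or a distance past the root yields `None`.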

Reviewed By: quark-zju

Differential Revision: D20635577

fbshipit-source-id: 4f9bd8dd4a5bd9b0de55f51086f3434ff507963c
2020-03-27 13:48:52 -07:00
Stefan Filip
a853c7a92b segmented_changelog: use [fbinit::compat_test] for idmap tests
Summary: Modernizing the codebase.

Reviewed By: krallin

Differential Revision: D20655252

fbshipit-source-id: c97fd46f1a224ca74606f4b42d5fa6b1a00c8ea8
2020-03-27 13:48:52 -07:00
Stefan Filip
c0019225b4 segmented_changelog: move idmap to its own file
Summary:
Making room for new components. Individual files will be easier to read than
all code in lib.rs.

Reviewed By: krallin

Differential Revision: D20635579

fbshipit-source-id: 3966b03658b039e9d46e400a00fc50416d60780b
2020-03-27 13:48:52 -07:00
Stefan Filip
031c6b2fb4 segmented_changelog: add dependency to scm/lib/dag
Summary:
It's going to be useful to share certain structures between client and server.
Looking ahead, the plan is to share the segment graph along with all the
algorithms implemented for it.

Reviewed By: StanislavGlebik

Differential Revision: D20550951

fbshipit-source-id: f498a6b0cba1bcdd35fc9720125b223d7e891a44
2020-03-24 13:58:07 -07:00
David Tolnay
e988a88be9 rust: Rename futures_preview:: to futures::
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/

This codemod replaces *all* dependencies on `//common/rust/renamed:futures-preview` with `fbsource//third-party/rust:futures-preview` and their uses in Rust code from `futures_preview::` to `futures::`.

This does not introduce any collisions with `futures::` meaning 0.1 futures because D20168958 previously renamed all of those to `futures_old::` in crates that depend on *both* 0.1 and 0.3 futures.

Codemod performed by:

```
rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
    -x \
    buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:futures-preview, 1))" \
| xargs sed -i 's,\bfutures_preview::,futures::,'

rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| xargs sed -i 's,//common/rust/renamed:futures-preview,fbsource//third-party/rust:futures-preview,'
```

Reviewed By: k21

Differential Revision: D20213432

fbshipit-source-id: 07ee643d350c5817cda1f43684d55084f8ac68a6
2020-03-03 11:01:20 -08:00
David Tolnay
fe65402e46 rust: Move futures-old rdeps to renamed futures-old
Summary:
In targets that depend on *both* 0.1 and 0.3 futures, this codemod renames the 0.1 dependency to be exposed as futures_old::. This is in preparation for flipping the 0.3 dependencies from futures_preview:: to plain futures::.

rs changes performed by:

```
rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
    -x \
    buck query "labels(srcs,
        rdeps(%Ss, fbsource//third-party/rust:futures-old, 1)
        intersect
        rdeps(%Ss, //common/rust/renamed:futures-preview, 1)
    )" \
| xargs sed -i 's/\bfutures::/futures_old::/'
```

Reviewed By: jsgf

Differential Revision: D20168958

fbshipit-source-id: d2c099f9170c427e542975bc22fd96138a7725b0
2020-03-02 21:02:50 -08:00
Thomas Orozco
16384599a8 mononoke (+ rust/shed/async_unit): update async_unit to expect async fn's
Summary:
This allows code that is being exercised under async_unit to call into code
that expects a Tokio 0.2 environment (e.g. 0.2 timers).

Unfortunately, this requires turning off LSAN for the async_unit tests, since
it looks like LSAN and Tokio 0.2 don't work very well together, resulting in
LSAN reporting leaked memory for some TLS structures that were initialized by
tokio-preview (regardless of whether the Runtime is being dropped):
https://fb.workplace.com/groups/rust.language/permalink/3249964938385432/

Considering async_unit is effectively only used in Mononoke, and Mononoke
already turns off LSAN in tests for precisely this reason ... it's probably
reasonable to do the same here.

The main body of changes here is also about updating the majority of our
tests to stop calling wait(), and use this new async unit everywhere. This is
effectively a pretty big batch conversion of all of our tests to use async fns
instead of the former approaches. I've also updated a substantial number of
utility functions to be async fns.

A few notable changes here:

- Some pushrebase tests were pretty flaky — the race they look for isn't
  deterministic. I added some actual waiting (using pushrebase hooks) to make
  it more deterministic.  This is kinda copy pasted from the globalrev hook
  (where I had introduced this first), but this will do for now.
- The multiplexblob tests don't work at all with new futures, because they call
  `poll()` all over the place. I've updated them to new futures, which required
  a bit of reworking.
- I took out a couple tests in async unit that were broken anyway.

Reviewed By: StanislavGlebik

Differential Revision: D19902539

fbshipit-source-id: 352b4a531ef5fa855114c1dd8bb4d70ed967dd55
2020-02-18 01:55:00 -08:00
Lukasz Piatkowski
542d1f93d3 Manual synchronization of fbcode/eden and facebookexperimental/eden
Summary:
This commit manually synchronizes the internal move of
fbcode/scm/mononoke under fbcode/eden/mononoke which couldn't be
performed by ShipIt automatically.

Reviewed By: StanislavGlebik

Differential Revision: D19722832

fbshipit-source-id: 52fbc8bc42a8940b39872dfb8b00ce9c0f6b0800
2020-02-11 11:42:43 +01:00
Lukasz Piatkowski
e8d62b64d5 mononoke: move the codebase under eden/ directory
fbshipit-source-id: 43a0252cb3ec42aa365f20d1b6faa4d24d74c9b8
2020-02-06 13:46:04 +01:00