sapling/eden/mononoke/cmds/segmented_changelog_tailer.rs

/*
 * Copyright (c) Facebook, Inc. and its affiliates.
 *
 * This software may be used and distributed according to the terms of the
 * GNU General Public License version 2.
 */
#![deny(warnings)]

use std::sync::Arc;
use std::time::Duration;

use anyhow::{format_err, Context, Error};
use blobrepo::BlobRepo;
use blobstore_factory::{make_metadata_sql_factory, ReadOnlyStorage};
use bookmarks::{BookmarkName, Bookmarks};
use clap::Arg;
use cmdlib::{
    args::{self, MononokeMatches},
    helpers,
};
use context::{CoreContext, SessionContainer};
use fbinit::FacebookInit;
use futures::future::join_all;
use metaconfig_types::MetadataDatabaseConfig;
use segmented_changelog::{SegmentedChangelogSqlConnections, SegmentedChangelogTailer};
use slog::{error, info};
use sql_ext::facebook::MyAdmin;
use sql_ext::replication::{NoReplicaLagMonitor, ReplicaLagMonitor};

const ONCE_ARG: &str = "once";
const REPO_ARG: &str = "repo";
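
// Keeps each repo's segmented changelog assets up to date by incrementally
// building new versions from the commits under the tracked bookmark.
//
// Example invocation (the repo name is hypothetical):
//
//     segmented_changelog_tailer --repo fbsource --once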
#[fbinit::main]
fn main(fb: FacebookInit) -> Result<(), Error> {
    let app = args::MononokeAppBuilder::new("Updates segmented changelog assets.")
        .with_scuba_logging_args()
        .with_advanced_args_hidden()
        .with_fb303_args()
        .build()
        .about("Builds a new version of segmented changelog.")
        .arg(
            Arg::with_name(REPO_ARG)
                .long(REPO_ARG)
                .takes_value(true)
                .required(true)
                .multiple(true)
                .help("Name of the repository to tail"),
        )
        .arg(
            Arg::with_name(ONCE_ARG)
                .long(ONCE_ARG)
                .takes_value(false)
                .required(false)
                .help("When set, the tailer will perform a single incremental build run."),
        );
    let matches = app.get_matches(fb)?;

    let logger = matches.logger();
    let session = SessionContainer::new_with_defaults(fb);
    let ctx = session.new_context(logger.clone(), matches.scuba_sample_builder());
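
    // block_execute drives the async `run` future on the cmdlib-managed
    // runtime and sets up service monitoring; the service name falls back to
    // "segmented_changelog_tailer" when TW_JOB_NAME is not set.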
    helpers::block_execute(
        run(ctx, &matches),
        fb,
        &std::env::var("TW_JOB_NAME").unwrap_or_else(|_| "segmented_changelog_tailer".to_string()),
        logger,
        &matches,
        cmdlib::monitoring::AliveService,
    )
}
async fn run<'a>(ctx: CoreContext, matches: &'a MononokeMatches<'a>) -> Result<(), Error> {
    let reponames: Vec<_> = matches
        .values_of(REPO_ARG)
        .ok_or_else(|| format_err!("--{} argument is required", REPO_ARG))?
        .map(ToString::to_string)
        .collect();
    if reponames.is_empty() {
        error!(ctx.logger(), "At least one repo must be specified");
        return Ok(());
    }
    let config_store = matches.config_store();
    let mysql_options = matches.mysql_options();
    let configs = args::load_repo_configs(config_store, matches)?;
    let readonly_storage = ReadOnlyStorage(false);

    let mut tasks = Vec::new();
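
    // Set up one tailer per requested repository. In `--once` mode each tailer
    // runs a single incremental build inline; otherwise its update loop is
    // pushed onto `tasks` and all of the loops are joined at the end.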
    for (index, reponame) in reponames.into_iter().enumerate() {
        let config = configs
            .repos
            .get(&reponame)
            .ok_or_else(|| format_err!("unknown repository: {}", reponame))?;
        let repo_id = config.repoid;
        let bookmark_name = &config.segmented_changelog_config.master_bookmark;
        let track_bookmark = BookmarkName::new(bookmark_name).with_context(|| {
            format!(
                "error parsing the name of the bookmark to track: {}",
                bookmark_name,
            )
        })?;

        info!(
            ctx.logger(),
            "repo name '{}' translates to id {}", reponame, repo_id
        );

        let storage_config = config.storage_config.clone();
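
        // Only a remote (MySQL) metadata config has a primary database address
        // whose replication lag is worth monitoring; local (SQLite) setups
        // have no replicas.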
        let db_address = match &storage_config.metadata {
            MetadataDatabaseConfig::Local(_) => None,
            MetadataDatabaseConfig::Remote(remote_config) => {
                Some(remote_config.primary.db_address.clone())
            }
        };
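        // With a remote shard we watch its replication lag through MyAdmin so
        // the tailer can back off when replicas fall behind; locally a no-op
        // monitor suffices.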
        let replica_lag_monitor: Arc<dyn ReplicaLagMonitor> = match db_address {
            None => Arc::new(NoReplicaLagMonitor()),
            Some(address) => {
                let my_admin = MyAdmin::new(ctx.fb).context("building myadmin client")?;
                Arc::new(my_admin.single_shard_lag_monitor(address))
            }
        };
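
        // The segmented changelog tables live in the repo's metadata database,
        // so their connections are opened through the metadata SQL factory.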
        let sql_factory = make_metadata_sql_factory(
            ctx.fb,
            storage_config.metadata,
            mysql_options.clone(),
            readonly_storage,
        )
        .await
        .with_context(|| format!("repo {}: constructing metadata sql factory", repo_id))?;

        let segmented_changelog_sql_connections = sql_factory
            .open::<SegmentedChangelogSqlConnections>()
            .with_context(|| {
                format!(
                    "repo {}: error constructing segmented changelog sql connections",
                    repo_id
                )
            })?;

        // This is a bit awkward from a dependency point of view, but it is the
        // pragmatic choice. The BlobRepo may already have a SegmentedChangelog
        // attached to it, and that doesn't hurt us in any way. Reconstructing
        // the dependencies for SegmentedChangelog without BlobRepo, on the
        // other hand, is likely to cause more maintenance problems.
        let blobrepo: BlobRepo =
            args::open_repo_with_repo_id(ctx.fb, ctx.logger(), repo_id, matches).await?;
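
        // The tailer takes the pieces it needs from the BlobRepo: a changeset
        // fetcher, the blobstore, and the bookmarks store, plus the bookmark
        // whose history it should follow.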
        let segmented_changelog_tailer = SegmentedChangelogTailer::new(
            repo_id,
            segmented_changelog_sql_connections,
            replica_lag_monitor,
            blobrepo.get_changeset_fetcher(),
            Arc::new(blobrepo.get_blobstore()),
            Arc::clone(blobrepo.bookmarks()) as Arc<dyn Bookmarks>,
            track_bookmark,
            None,
        );

        info!(
            ctx.logger(),
            "repo {}: SegmentedChangelogTailer initialized", repo_id
        );
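
        // In `--once` mode, perform a single incremental build and move on;
        // otherwise, if an update period is configured, schedule a
        // long-running update loop for this repo.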
        if matches.is_present(ONCE_ARG) {
            segmented_changelog_tailer
                .once(&ctx)
                .await
                .with_context(|| format!("repo {}: incrementally building repo", repo_id))?;
            info!(
                ctx.logger(),
                "repo {}: SegmentedChangelogTailer is done", repo_id,
            );
        } else if let Some(period) = config.segmented_changelog_config.tailer_update_period {
            // Spread out update operations: each subsequent repo starts its
            // update loop seven seconds after the previous one.
            let wait_to_start = Duration::from_secs(7 * index as u64);
            let ctx = ctx.clone();
            tasks.push(async move {
                tokio::time::sleep(wait_to_start).await;
                segmented_changelog_tailer.run(&ctx, period).await;
            });
        }
    }

    join_all(tasks).await;

    Ok(())
}