Summary:
In the hg sync job, we need to load the ancestors of all bookmarks known to
the server we are pushing to, and for e.g. fbsource that might be > 10K
bookmarks. If we fetch those one by one (because of e.g. a cold cache), that
will take a very long time.
Unfortunately, we don't currently have a way of batching access to changesets,
so for now let's mitigate by buffering the fetches.
Reviewed By: ikostia, HarveyHunt
Differential Revision: D21860228
fbshipit-source-id: 90977a9e00689c1df5ae53d149c267de9b2f973e
Summary:
When you call poll() from within an async context (as opposed to awaiting),
awkward things happen, like this:
```
thread 'validation::test::slow_ready_validates' panicked at 'no Task is currently running',
```
So, let's stop doing that by instead checking that our stream eventually
yields. We can't count the polls this way, but that feels a bit immaterial so
that's probably OK. Also, let's clean up the code a little bit by removing a
bunch of conditional compilation.
Reviewed By: ahornby
Differential Revision: D21862830
fbshipit-source-id: 3eb49575d940ca85f59c49295dd6b0dcfb2e5d15
Summary:
We are going to start using tunables in Mononoke in the next diffs, and the
name clash between "tunables" and "newfilenodes::tunables" makes it confusing.
Let's rename newfilenodes::tunables to sql_timeout_knobs
Reviewed By: krallin
Differential Revision: D21879093
fbshipit-source-id: ab0bae4be3c319dcb6afeecdd1c13df395e79e3b
Summary: We need an option to rebuild skiplists from scratch in case of corruption.
Reviewed By: farnz
Differential Revision: D21863639
fbshipit-source-id: 56d8360a6c2c38aeb35f534758f5cde410fef421
Summary: The `secure_utils` crate from common/rust/secure_utils was moved to rust-shed; the remaining crates in that folder are being refactored here into a single crate, `identity_ext`, for clarity.
Reviewed By: StanislavGlebik
Differential Revision: D21549861
fbshipit-source-id: 4da6566a09ba7a772e8062632f9d7520af2e09e6
Summary:
In the next diffs we'll make it possible to disable filenodes in Mononoke. See
D21787848 and attached task for more details, but TL;DR is that if xdb is down
we still want to serve "hg update" traffic.
If filenodes are disabled we obviously can't generate filenodes for new
commits. So one option would be to just return an error from the
FilenodesOnlyPublic::derive(...) call. But that would mean that any attempt to
derive would fail, and e.g. Mononoke servers wouldn't be able to start up
(see https://fburl.com/diffusion/roau028d). We could change callers to always
handle errors from FilenodesOnlyPublic, but I think that would be harder to
enforce and easier to forget.
So this diff changes FilenodesOnlyPublic to be an enum, and
FilenodesOnlyPublic::Disabled is returned immediately if filenodes are
disabled. For callers this means they can't rely on filenodes being present
in the db even after FilenodesOnlyPublic was derived. That's the point of the
whole stack, and the next diffs will update the callers to properly deal with
missing filenodes.
One caveat is that when we re-enable filenodes back we might need to derive
them for a lot of commits.
I don't expect this to happen often (i.e. if xdb is down then we probably can't
commit anyway), but if it somehow did, we should be a bit more careful with
re-enabling filenodes after the problem is fixed. For example, we can first
derive all the filenodes locally by e.g. running backfill_derived_data, and
only after that has finished successfully can we re-enable them.
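A rough sketch of the shape of the change (the real derived-data trait is more involved; `filenodes_enabled` stands in for the tunable):

```rust
#[derive(Debug, PartialEq)]
enum FilenodesOnlyPublic {
    // Filenodes were derived and written to the db as usual.
    Present,
    // Filenodes are disabled: derivation "succeeds" without writing anything,
    // so callers must not assume filenodes exist in the db.
    Disabled,
}

fn derive(filenodes_enabled: bool) -> FilenodesOnlyPublic {
    if !filenodes_enabled {
        // Returned immediately, before any db writes are attempted.
        return FilenodesOnlyPublic::Disabled;
    }
    // ... generate and store filenodes here ...
    FilenodesOnlyPublic::Present
}

fn main() {
    assert_eq!(derive(false), FilenodesOnlyPublic::Disabled);
    assert_eq!(derive(true), FilenodesOnlyPublic::Present);
}
```

The key design point is that `Disabled` is a successful result, not an error, so servers can still start up and derivation pipelines keep working.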
Reviewed By: krallin
Differential Revision: D21840328
fbshipit-source-id: ce9594d4a21110a5cb392c3049ccaede064c1e66
Summary:
Instead of always building from scratch, continue assigning Vertexes and
Segments from the last commit that was processed.
Reviewed By: StanislavGlebik
Differential Revision: D21634699
fbshipit-source-id: 9f8b890dcf65c59a66651343f0ccc1487efc2394
Summary: Previously, `read_res` was called `data_util` and only dealt with EdenAPI data responses. Support for history responses was added later as a `history` subcommand. For consistency, let's move the top-level commands for data responses underneath a new `data` subcommand. When support for additional response types is added in the future, those can also go under their own subcommands.
Reviewed By: quark-zju
Differential Revision: D21825197
fbshipit-source-id: f5cb759a68324e7d0f98e3448bd5d1cba6417bad
Summary: Give this tool a more descriptive name. (It reads EdenAPI responses, so `read_res` seemed fitting.)
Reviewed By: quark-zju
Differential Revision: D21796964
fbshipit-source-id: 8a4ee365aa3bcf115fc7a3452406ed96b4a25edc
Summary:
Clean up some of the conversion functions by renaming variables that are
keywords in other languages, and simplifying error handling code.
Differential Revision: D21839019
fbshipit-source-id: d8945a14a230caa744040e134203a908ad9cef20
Summary: `ErrorKind` is not meaningful, and is an artifact of older-style error handling crates. A better name is `ConfigurationError`.
Reviewed By: krallin
Differential Revision: D21837271
fbshipit-source-id: 709d9e2ab7f18dd2f7cb2489f24e91612bc378db
Summary:
Replace the use of `RepoConfigs::read*` associated functions with free
functions. These didn't really need to be associated functions (and in the
case of the common and storage configs, really didn't belong there either).
Reviewed By: krallin
Differential Revision: D21837270
fbshipit-source-id: 2dc73a880ed66e11ea484b88b749582ebdf8a73f
Summary:
Refactor parsing of repo config using a new `Convert` trait to allow
definition of each part of parsing separately.
The wireproto logging args require access to the storage definitions, so they
need to be parsed by their own special function for now.
Differential Revision: D21837269
fbshipit-source-id: 7ab0e3f4b3b8549aaefb45201388c3dfc7633ef7
Summary:
Refactor parsing of storage config using a new `Convert` trait to allow
definition of each part of parsing separately.
Differential Revision: D21766761
fbshipit-source-id: 7e224e9d322a3a16a64f5ebba2243bbe6341c8f0
Summary:
Refactor parsing of commit sync config using a new `Convert` trait to allow
definition of each part of parsing separately.
Differential Revision: D21766760
fbshipit-source-id: 3c95d70788753316d3c1f36280e7d6dbb52a9710
Summary:
We'd like to serve read traffic even if filenodes are disabled. Let's add a
tunable that can control it.
Reviewed By: HarveyHunt
Differential Revision: D21839672
fbshipit-source-id: 4ec4dd16b9e6e3ffb1ada0d812e1153e1a33a268
Summary: It was replaced with a parameter
Reviewed By: HarveyHunt
Differential Revision: D21839397
fbshipit-source-id: e75900b3da80985cd762659993b8b285411fe928
Summary:
DeferredDerivedMapping was added to make deriving a stack of commits faster - it does so by postponing updates to
the derived data mapping (e.g. writing to a blobstore) until the whole stack is derived.
While it probably makes derivation a bit faster, we now think it's better to remove it. A few reasons:
1) It's confusing to understand, and it has already caused us UBNs before.
2) It increases write amplification - because we release the lease before we write to the blobstore, writers will try to rederive the same commit a few times. That caused us a UBN today.
Reviewed By: farnz
Differential Revision: D20113854
fbshipit-source-id: 169e05febcd382334bf4da209a20aace0b7c2333
Summary:
See D21765065 for more context. TL;DR is that we want to control
lfs rollout from client side to make sure we don't put lfs pointers in the
shared memcache
Reviewed By: xavierd
Differential Revision: D21822159
fbshipit-source-id: daea6078d95eb4e9c040d353a20bcdf1b6ae07b1
Summary: This cache didn't improve performance, and it is a fair amount of extra code. Just rip it out, and improve higher caching layers if needed.
Reviewed By: StanislavGlebik
Differential Revision: D21818412
fbshipit-source-id: 3ca58bff180c6da91d451754b67b23e61b736059
Summary:
We can kill a single shard's replication with the weight of puts; by tracking lag once a second, we can slow down to a point where no shard overloads.
This is insufficient by itself, as it slows all shards down equally (whereas we only want to slow down the laggy shard), but it does avoid high replication lag.
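The idea can be sketched as a simple throttle; the threshold and backoff policy below are illustrative, not the tuned values:

```rust
use std::time::Duration;

// Maximum replication lag we are willing to tolerate (illustrative value).
const MAX_LAG: Duration = Duration::from_secs(5);

// Given the lag sampled (at most once a second), decide how long to pause
// puts before the next batch. Zero means full speed ahead.
fn backoff_for(lag: Duration) -> Duration {
    if lag <= MAX_LAG {
        Duration::from_secs(0)
    } else {
        // Naive policy: wait out the excess lag before writing more.
        lag - MAX_LAG
    }
}

fn main() {
    assert_eq!(backoff_for(Duration::from_secs(2)), Duration::from_secs(0));
    assert_eq!(backoff_for(Duration::from_secs(9)), Duration::from_secs(4));
}
```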
Reviewed By: StanislavGlebik
Differential Revision: D21720833
fbshipit-source-id: a819f641206641e80f8edde92006fb08cdcf36a9
Summary:
Our `CommitSyncConfig` struct now contains a `version_name` field, which is intended to be used as an identifier of an individual version of the `commitsyncmap` in use. We want to record this value in the `synced_commit_mapping` table, so that later it is possible to attribute a commit sync map to a given commit sync.
This is part of a broader goal of adding support for sync map versioning. The broader goal is important, as it allows us to move faster (and troubleshoot better) when sync maps need changing.
Note that when a commit is preserved across repos, we set `version_name` to `NULL`, as it makes no sense to attribute commit sync maps to those cases.
Reviewed By: farnz
Differential Revision: D21765408
fbshipit-source-id: 11a77cc4d926e4a4d72322b51675cb78eabcedee
Summary:
The motivation for the whole stack:
At the moment if mysql is down then Mononoke is down as well, both for writes
and for reads. However we can relatively easily improve the situation.
During hg update client sends getpack() requests to fetch files, and currently
for each file fetch we also fetch file's linknode. However hg client knows how
to deal with null linknodes [1], so when mysql is unavailable we can disable
filenode fetching completely and just return null linknodes. So the goal of this stack is to
add a knob (i.e. a tunable) that can turn filenode fetches on and off, and make
sure the rest of the code deals nicely with this situation.
Now, about this diff. In order to force callers to deal with the fact that
filenodes might be unavailable, I suggest adding a special type of result, which
(in later diffs) will be returned by every filenodes method.
This diff just introduces FilenodeResult and converts the BlobRepo filenode
methods to return it. The reason why I converted the BlobRepo methods first
is to limit the scope of changes while still showing what the callers' code
will look like after FilenodeResult is introduced, and to get people's
thoughts on whether it's reasonable or not.
Another important change I'd like to introduce in the next diffs is modifying FilenodesOnlyPublic
derived data to return success if filenodes knob is off. If we don't do that
then any attempt to derive filenodes might fail which in turn would lead to the
same problem we have right now - people won't be able to do hg update/hg
pull/etc if mysql is down.
[1] null linknodes might make some client side operations slower (e.g. hg rebase/log/blame),
so we should use them only in sev-like situations
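The result type could look roughly like this (names follow the diff's description, but the exact shape is a sketch, as is the 20-byte linknode representation):

```rust
// Every filenodes method returns this instead of a bare value, so callers
// are forced to handle the "filenodes disabled" case explicitly.
#[derive(Debug, PartialEq)]
enum FilenodeResult<T> {
    Present(T),
    Disabled,
}

// A caller that maps Disabled to a null linknode instead of failing.
fn linknode_or_null(res: FilenodeResult<[u8; 20]>) -> [u8; 20] {
    match res {
        FilenodeResult::Present(linknode) => linknode,
        // The null linknode that the hg client knows how to handle.
        FilenodeResult::Disabled => [0u8; 20],
    }
}

fn main() {
    assert_eq!(linknode_or_null(FilenodeResult::Disabled), [0u8; 20]);
    assert_eq!(linknode_or_null(FilenodeResult::Present([1u8; 20])), [1u8; 20]);
}
```

Unlike a plain `Result`, there is no error to propagate: `Disabled` is a legitimate answer the caller must render (here, as a null linknode).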
Reviewed By: krallin
Differential Revision: D21787848
fbshipit-source-id: ad48d5556e995af09295fa43118ff8e3c2b0e53e
Summary:
`flush()` takes a timeout, which is in milliseconds. However, that's not super
obvious, so that means whoever is using this has to re-discover it. Let's make
it explicit, and use a `Duration` argument instead.
(In fact, some places even got it wrong, notably common/rust/cli_usage)
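A before/after sketch of the signature change (function bodies are illustrative):

```rust
use std::time::Duration;

// Before: the unit is implicit, so `flush_ms(5)` silently means 5 milliseconds
// and nothing stops a caller from passing seconds by mistake.
fn flush_ms(timeout_ms: u64) -> u64 {
    timeout_ms
}

// After: the unit travels with the type, so callers can't get it wrong.
fn flush(timeout: Duration) -> u64 {
    timeout.as_millis() as u64
}

fn main() {
    assert_eq!(flush_ms(5), 5); // was this 5s or 5ms? unclear at the call site
    assert_eq!(flush(Duration::from_secs(5)), 5000);
}
```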
Reviewed By: farnz
Differential Revision: D21783991
fbshipit-source-id: 6e3ac7c22b5c3297b41d5d61373ee077f12d5dd4
Summary: Update the JSON format for history requests to use an array rather than an object to represent keys, for the same reason as D21412989. (Namely, that it's possible for two keys to share the same path, making the path unsuitable for use as a field name in a JSON object.)
Reviewed By: xavierd
Differential Revision: D21782763
fbshipit-source-id: eb04013795d1279ecbf00a8a0be106318695bd05
Summary:
This diff adds support for the `version_name` field, coming from the
`commitsyncmap` config stored in configerator.
Note: ATM this field is optional in the thrift config, but once we get past
the initial deployment stage, I expect it to always be present. This is why
in `CommitSyncConfig` I make it a `String` (with a default value of `""`) rather
than an `Option<String>`. The code that will be writing this value into
`synced_commit_mapping` should never care whether it's present, since
every mapping should always have a `version_name`.
Reviewed By: StanislavGlebik
Differential Revision: D21764641
fbshipit-source-id: 35a7f487acf0562b309fd6d1b6473e3b8024722d
Summary: Reverting as D20763778 is a suspect in causing thrift_server_overload T67609407
Reviewed By: StanislavGlebik
Differential Revision: D21762641
fbshipit-source-id: 545b448afc0954271a2e29d1d3b48fdb959e3d3d
Summary:
Change the signature of `CreateCommitContext::as_file` and its associated
functions so that content is `impl Into<String>`, rather than
`impl AsRef<str>`. The content will immediately be converted to a `String`
anyway, so we can avoid a string copy if the caller already has a string that
can be moved.
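A minimal sketch of the difference (the real `CreateCommitContext::as_file` takes more parameters; these helpers are made up for illustration):

```rust
// With `AsRef<str>`, we always copy, even if the caller owns a String.
fn add_file_as_ref(content: impl AsRef<str>) -> String {
    content.as_ref().to_string()
}

// With `Into<String>`, an owned String is moved, not copied; &str callers
// still work, paying for one allocation as before.
fn add_file_into(content: impl Into<String>) -> String {
    content.into()
}

fn main() {
    let owned = String::from("file content");
    // Moved into the function: no extra allocation.
    assert_eq!(add_file_into(owned), "file content");
    // Borrowed callers are unaffected.
    assert_eq!(add_file_into("literal"), "literal");
    assert_eq!(add_file_as_ref("literal"), "literal");
}
```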
Reviewed By: krallin
Differential Revision: D21743429
fbshipit-source-id: d54914386439489fe4e47e37ff9a75c52b1a0443
Summary:
Add support for drawdag in Mononoke unit tests. Tests can use ASCII DAGs to construct
commit graphs, and can optionally customize the content of each commit.
Reviewed By: krallin
Differential Revision: D21743431
fbshipit-source-id: 9e6a52d1efe67ef4a5519ed7783f953fef7358f1
Summary:
The parser currently uses pattern destructuring for `RawInfinitepushParams`. This will break
if new fields are added to this structure. Instead, use field access like the other raw
params parsers.
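An illustrative sketch of why field access is more change-tolerant (the field names are made up, not the real `RawInfinitepushParams`):

```rust
struct RawInfinitepushParams {
    allow_writes: bool,
    namespace_pattern: Option<String>,
}

// Exhaustive destructuring stops compiling as soon as a new field is added:
//     let RawInfinitepushParams { allow_writes } = raw; // error: missing field
// Plain field access keeps working, matching the other raw params parsers.
fn parse(raw: &RawInfinitepushParams) -> (bool, bool) {
    (raw.allow_writes, raw.namespace_pattern.is_some())
}

fn main() {
    let raw = RawInfinitepushParams {
        allow_writes: true,
        namespace_pattern: None,
    };
    assert_eq!(parse(&raw), (true, false));
}
```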
Reviewed By: mitrandir77
Differential Revision: D21742558
fbshipit-source-id: 6bfbb080a5e5cdbb02519855472f4df80f9d7453
Summary:
It was used only once for testing push redirection. We no longer need it, so
I'd like to delete it to remove this old code and also to make it easier to
support ManualMove bookmarks.
Differential Revision: D21745630
fbshipit-source-id: 362952d95edb923cc4b60359321b563c1e4961de
Summary: Useful for determining where an incremental building step left off.
Reviewed By: StanislavGlebik
Differential Revision: D21634698
fbshipit-source-id: e9b0473003c529d5c934754f1ece23df69c4be66
Summary:
This diff extends the integration test for the forward filler to execute queue operations, as well as the core business logic.
It also adds a test for the reverse filler, which does the same, but in the other direction.
Reviewed By: krallin
Differential Revision: D21628705
fbshipit-source-id: fb4ee0ecacc990d073425f3f37f794f74c057ea2
Summary:
This diff finally introduces the continuous reverse filler. Specifically, it adds a cli (and underlying wiring) to operate the filler logic on the `reversefillerqueue` table.
To achieve this:
- the filler class is turned into a base class with two subclasses for the forward and reverse fillers
- the main file is renamed from `forwardfiller.py` to `filler.py`, to better reflect its independence from direction.
Reviewed By: krallin
Differential Revision: D21628259
fbshipit-source-id: 5676a162a62f0dc6fe80e6300b72d30370fc80b4
Summary: Add devdb support to the integration test runner so that one can more easily inspect mysql test data; this also makes it easier to run tests with a locally modified schema.
Reviewed By: HarveyHunt
Differential Revision: D21645234
fbshipit-source-id: ec75d70ef59f04548c7346a122298567dd09c264
Summary:
At first glance people will assume that changesets are returned in the same
order that they were added to the database, or at least that commits are
returned in a deterministic fashion. That didn't happen, because both the
changeset ids and the changeset entries were received in no particular order.
This diff updates the function to return results in the order they were added
to the database.
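A sketch of the reordering, assuming each row carries an auto-increment insertion id (the types here are simplified stand-ins):

```rust
// Fetched rows may come back in arbitrary order; sort by the auto-increment
// id assigned at insert time so results are deterministic.
fn in_insertion_order(mut rows: Vec<(u64, String)>) -> Vec<String> {
    rows.sort_by_key(|&(insertion_id, _)| insertion_id);
    rows.into_iter().map(|(_, changeset)| changeset).collect()
}

fn main() {
    let rows = vec![
        (3, "c".to_string()),
        (1, "a".to_string()),
        (2, "b".to_string()),
    ];
    assert_eq!(in_insertion_order(rows), vec!["a", "b", "c"]);
}
```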
Reviewed By: krallin
Differential Revision: D21676663
fbshipit-source-id: 912e6bea0532796b1d8e44e47d832c0420d97bc1
Summary:
This structure has similar functionality to the IdMap that is backed by SQL.
It is probably going to be useful for caching in the case of batch operations.
Reviewed By: quark-zju
Differential Revision: D21601820
fbshipit-source-id: 9c3ebc3e9dc92a59ce0908fc241bd2b97da88dca
Summary:
`Dag::build_all_graph` will load the whole graph for a given repository and
construct the segmented changelog from it.
Reviewed By: StanislavGlebik
Differential Revision: D21538029
fbshipit-source-id: b4ba846bb2870ba73257bed6128b8e198a0aab3e
Summary:
Change from CHashMap to DashMap for the walk state tracking.
DashMap uses slightly less memory in testing, and is slightly quicker in walk rate (number of graph nodes processed per second).
Reviewed By: HarveyHunt
Differential Revision: D21662210
fbshipit-source-id: ea0601df17e0e596fd59b67d9d01d0dc4e90799b
Summary:
CacheBlob logs hit and miss statistics to Scuba. Let's add the same for
ODS.
Reviewed By: krallin
Differential Revision: D21640922
fbshipit-source-id: 8f7d17f048bf53bdc6cd8bda0384a51cae7b6a30
Summary:
CountedBlobstore is a Blobstore that wraps another blobstore in order
to report metrics about it, such as the number of gets or errors. It's commonly
used to wrap a CacheBlob blobstore, which itself is a caching wrapper over an
inner blobstore.
CountedBlobstore exposes metrics that are supposed to track the number of hits
or misses when fetching blobs. However, these metrics don't make sense as the
CountedBlobstore has no view into cache activity. These metrics actually report
the number of requests and the number of missing blobs rather than hits and
misses.
Remove these misleading counters.
Reviewed By: krallin
Differential Revision: D21640923
fbshipit-source-id: 07b9fc9864c70991415c2b84f35d631b702c17d1