Summary:
This diff migrates the add_filenodes method to return FilenodeResult.
That means that all filenodes methods now return FilenodeResult, so it's time
to remove the TODOs from derived_data filenodes.
Note that I had to change the test "derive_disabled_filenodes" a bit.
Previously the FilenodesOnlyPublic::mapping::get() method immediately returned
FilenodesOnlyPublic::Disabled, while now it returns None if the hg changeset is not
derived. This is an expected change in behaviour, so I just updated the test to
try to derive FilenodesOnlyPublic first, which in turn triggers generation of the hg changeset.
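For readers unfamiliar with the pattern, here is a minimal sketch of what a write method returning a FilenodeResult could look like. The variant names, the `map` helper and the `ExampleFilenodes` store are assumptions made for illustration, not the definitions from this diff.

```rust
/// Assumed shape of the result type: either a value, or a marker saying
/// that filenodes are disabled for this repo.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FilenodeResult<T> {
    Present(T),
    Disabled,
}

impl<T> FilenodeResult<T> {
    /// Transform the payload, leaving `Disabled` untouched.
    pub fn map<U>(self, f: impl FnOnce(T) -> U) -> FilenodeResult<U> {
        match self {
            FilenodeResult::Present(t) => FilenodeResult::Present(f(t)),
            FilenodeResult::Disabled => FilenodeResult::Disabled,
        }
    }
}

/// Hypothetical in-memory stand-in for a filenodes store.
pub struct ExampleFilenodes {
    enabled: bool,
    stored: Vec<String>,
}

impl ExampleFilenodes {
    pub fn new(enabled: bool) -> Self {
        ExampleFilenodes { enabled, stored: Vec::new() }
    }

    /// Returns how many filenodes were written, or `Disabled` without
    /// touching the store when filenodes are switched off.
    pub fn add_filenodes(&mut self, batch: Vec<String>) -> FilenodeResult<usize> {
        if !self.enabled {
            return FilenodeResult::Disabled;
        }
        let written = batch.len();
        self.stored.extend(batch);
        FilenodeResult::Present(written)
    }
}
```

Callers then have to match on the result explicitly, which is what makes the remaining TODOs removable.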
Reviewed By: ahornby
Differential Revision: D21904401
fbshipit-source-id: f6f4cd14e6cdce5a4b95d8f3f9acff305ae6fa88
Summary:
Similar to get_all_filenodes_maybe_stale(), make this method return
FilenodeResult so that callers can tell when filenodes are disabled.
Note: this diff adds one TODO in fetch_root_filenode, which will be removed
together with other TODOs in the next diff.
Reviewed By: ahornby
Differential Revision: D21904399
fbshipit-source-id: 1569579699c02eb07021f8143aa652aa192d23bc
Summary:
Let's return FilenodeResult from get_all_filenodes_maybe_stale and change
callers to deal with that.
The change is straightforward with the exception of `file_history.rs`.
get_all_filenodes_maybe_stale() is used here to prefetch a lot of filenodes in one
go. This diff changes it to return an empty vec in case filenodes are disabled.
Unfortunately this is not a great solution: since the prefetched files are empty,
get_file_history_using_prefetched() falls back to fetching filenodes
sequentially from the blobstore. That might be too slow, and the next diffs in
the stack will address this problem.
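A rough sketch of the fallback described above, under simplified assumptions: ids are plain integers, filenodes are strings, and `prefetched_or_empty` and `history_using_prefetched` are hypothetical names rather than the functions in `file_history.rs`.

```rust
use std::collections::HashMap;

/// Assumed shape of the result type.
pub enum FilenodeResult<T> {
    Present(T),
    Disabled,
}

/// Collapse a disabled result into an empty prefetch set.
pub fn prefetched_or_empty(res: FilenodeResult<Vec<(u32, String)>>) -> HashMap<u32, String> {
    match res {
        FilenodeResult::Present(entries) => entries.into_iter().collect(),
        FilenodeResult::Disabled => HashMap::new(),
    }
}

/// Serve each id from the prefetched map when possible and fall back to a
/// per-item fetch otherwise. Also returns the number of fallback fetches.
pub fn history_using_prefetched(
    ids: &[u32],
    prefetched: &HashMap<u32, String>,
    mut slow_fetch: impl FnMut(u32) -> String,
) -> (Vec<String>, usize) {
    let mut slow_fetches = 0usize;
    let history: Vec<String> = ids
        .iter()
        .map(|id| match prefetched.get(id) {
            Some(entry) => entry.clone(),
            None => {
                slow_fetches += 1;
                slow_fetch(*id)
            }
        })
        .collect();
    (history, slow_fetches)
}
```

With filenodes disabled every lookup takes the slow path, which is the cost this summary warns about.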
Reviewed By: krallin
Differential Revision: D21881082
fbshipit-source-id: a86dfd48a92182381ab56994f6b0f4b14651ea14
Summary: Add the mutation store to blobrepo.
Reviewed By: krallin
Differential Revision: D20871336
fbshipit-source-id: 777cba6c2bdcfb16b711dbad61fc6d0d2f337117
Summary:
We had accumulated lots of unused dependencies, and had several test_deps in deps instead. Clean this all up to reduce build times and speed up autocargo processing.
Net removal is of around 500 unneeded dependency lines, which represented false dependencies; by removing them, we should get more parallelism in dev builds, and less overbuilding in CI.
Reviewed By: krallin, StanislavGlebik
Differential Revision: D20999762
fbshipit-source-id: 4db3772cbc3fb2af09a16601bc075ae8ed6f0c75
Summary:
Our phases caching wasn't great. If you tried to ask for a draft commit then
we'd call the mark_reachable_as_public method, but this method was bypassing
caches.
The reason we had this problem was that we had caching at a higher level
than necessary. We had the SqlPhases struct, which was "smarter" (i.e. it had the
logic for traversing ancestors of public heads and marking these ancestors as
public), and SqlPhasesStore, which just did sql access. Previously we had our
caching layer on top of SqlPhases, meaning that when SqlPhases called
`mark_reachable_as_public` it couldn't use caches anymore.
This diff fixes it by moving caching one layer lower - now we have a cache
right on top of SqlPhasesStore. Because of this change we no longer need
CachingPhases, and it was removed. Also, the `ephemeral_derive` logic was
simplified a bit.
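To illustrate the layering, here is a minimal sketch with the cache placed directly on top of the low-level store. All types here (`PhasesStore`, `SqlStore`, `CachingStore`, `Phases`) are simplified stand-ins assumed for the example. Changesets are plain integers, and the "reachable from heads" set is passed in instead of being computed.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashSet;

/// Hypothetical low-level store interface.
pub trait PhasesStore {
    fn is_public(&self, cs: u64) -> bool;
    fn add_public(&self, cs: &[u64]);
}

/// Stand-in for the SQL-backed store; counts reads so cache hits are visible.
pub struct SqlStore {
    pub rows: RefCell<HashSet<u64>>,
    pub reads: Cell<usize>,
}

impl SqlStore {
    pub fn new() -> Self {
        SqlStore { rows: RefCell::new(HashSet::new()), reads: Cell::new(0) }
    }
}

impl PhasesStore for SqlStore {
    fn is_public(&self, cs: u64) -> bool {
        self.reads.set(self.reads.get() + 1);
        self.rows.borrow().contains(&cs)
    }
    fn add_public(&self, cs: &[u64]) {
        self.rows.borrow_mut().extend(cs.iter().copied());
    }
}

/// Cache sitting right on top of the store. Only "public" is cached, since a
/// draft changeset can become public later.
pub struct CachingStore<S> {
    pub inner: S,
    pub known_public: RefCell<HashSet<u64>>,
}

impl<S: PhasesStore> CachingStore<S> {
    pub fn new(inner: S) -> Self {
        CachingStore { inner, known_public: RefCell::new(HashSet::new()) }
    }
}

impl<S: PhasesStore> PhasesStore for CachingStore<S> {
    fn is_public(&self, cs: u64) -> bool {
        if self.known_public.borrow().contains(&cs) {
            return true;
        }
        let public = self.inner.is_public(cs);
        if public {
            self.known_public.borrow_mut().insert(cs);
        }
        public
    }
    fn add_public(&self, cs: &[u64]) {
        self.inner.add_public(cs);
        self.known_public.borrow_mut().extend(cs.iter().copied());
    }
}

/// The "smarter" layer: every store access, including marking reachable
/// changesets as public, now goes through the cache.
pub struct Phases<S> {
    pub store: S,
}

impl<S: PhasesStore> Phases<S> {
    pub fn new(store: S) -> Self {
        Phases { store }
    }

    pub fn is_public(&self, cs: u64, reachable_from_heads: &[u64]) -> bool {
        if self.store.is_public(cs) {
            return true;
        }
        if reachable_from_heads.contains(&cs) {
            self.store.add_public(reachable_from_heads);
            return true;
        }
        false
    }
}
```

Because the higher layer only talks to the `PhasesStore` trait, it cannot accidentally bypass the cache.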
Reviewed By: krallin
Differential Revision: D20834740
fbshipit-source-id: 908b7e17d6588ce85771dedf51fcddcd2fabf00e
Summary:
Migrate the configuration of sql data managers from the old configuration using `sql_ext::SqlConstructors` to the new configuration using `sql_construct::SqlConstruct`.
In the old configuration, sharded filenodes were included in the configuration of remote databases, even when that made no sense:
```
[storage.db.remote]
db_address = "main_database"
sharded_filenodes = { shard_map = "sharded_database", shard_num = 100 }
[storage.blobstore.multiplexed]
queue_db = { remote = {
db_address = "queue_database",
sharded_filenodes = { shard_map = "valid_config_but_meaningless", shard_num = 100 }
} }
```
This change separates out:
* **DatabaseConfig**, which describes a single local or remote connection to a database, used in configuration like the queue database.
* **MetadataDatabaseConfig**, which describes the multiple databases used for repo metadata.
**MetadataDatabaseConfig** is either:
* **Local**, which is a local sqlite database, the same as for **DatabaseConfig**; or
* **Remote**, which contains:
  * `primary`, the database used for main metadata.
  * `filenodes`, the database used for filenodes, which may be sharded or unsharded.
More fields can be added to **RemoteMetadataDatabaseConfig** when we want to add new databases.
New configuration looks like:
```
[storage.metadata.remote]
primary = { db_address = "main_database" }
filenodes = { sharded = { shard_map = "sharded_database", shard_num = 100 } }
[storage.blobstore.multiplexed]
queue_db = { remote = { db_address = "queue_database" } }
```
The `sql_construct` crate facilitates this by providing the following traits:
* **SqlConstruct** defines the basic rules for construction, and allows construction based on a local sqlite database.
* **SqlShardedConstruct** defines the basic rules for construction based on sharded databases.
* **FbSqlConstruct** and **FbShardedSqlConstruct** allow construction based on unsharded and sharded remote databases on Facebook infra.
* **SqlConstructFromDatabaseConfig** allows construction based on the database defined in **DatabaseConfig**.
* **SqlConstructFromMetadataDatabaseConfig** allows construction based on the appropriate database defined in **MetadataDatabaseConfig**.
* **SqlShardableConstructFromMetadataDatabaseConfig** allows construction based on the appropriate shardable databases defined in **MetadataDatabaseConfig**.
Sql database managers should implement:
* **SqlConstruct** in order to define how to construct an unsharded instance from a single set of `SqlConnections`.
* **SqlShardedConstruct**, if they are shardable, in order to define how to construct a sharded instance.
* If the database is part of the repository metadata database config, either of:
  * **SqlConstructFromMetadataDatabaseConfig** if they are not shardable. By default they will use the primary metadata database, but this can be overridden by implementing `remote_database_config`.
  * **SqlShardableConstructFromMetadataDatabaseConfig** if they are shardable. They must implement `remote_database_config` to specify where to get the sharded or unsharded configuration from.
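As a rough illustration of how these pieces fit together, here is a simplified sketch of the config types and of the default `remote_database_config` behaviour for an unsharded manager. The method signatures, the `SqlConnections` stand-in, the `with_metadata_database_config` helper and `ExampleStore` are assumptions for the example. The real traits take more parameters and open real connections.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct RemoteDatabaseConfig {
    pub db_address: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum ShardableRemoteDatabaseConfig {
    Unsharded(RemoteDatabaseConfig),
    Sharded { shard_map: String, shard_num: usize },
}

pub struct RemoteMetadataDatabaseConfig {
    pub primary: RemoteDatabaseConfig,
    pub filenodes: ShardableRemoteDatabaseConfig,
}

pub enum MetadataDatabaseConfig {
    Local { path: String },
    Remote(RemoteMetadataDatabaseConfig),
}

/// Stand-in for a set of SQL connections; only records what it points at.
pub struct SqlConnections {
    pub target: String,
}

pub trait SqlConstruct: Sized {
    fn from_sql_connections(connections: SqlConnections) -> Self;
}

pub trait SqlConstructFromMetadataDatabaseConfig: SqlConstruct {
    /// Defaults to the primary metadata database; implementors can override.
    fn remote_database_config(remote: &RemoteMetadataDatabaseConfig) -> &RemoteDatabaseConfig {
        &remote.primary
    }

    fn with_metadata_database_config(config: &MetadataDatabaseConfig) -> Self {
        let target = match config {
            MetadataDatabaseConfig::Local { path } => format!("sqlite:{}", path),
            MetadataDatabaseConfig::Remote(remote) => {
                Self::remote_database_config(remote).db_address.clone()
            }
        };
        Self::from_sql_connections(SqlConnections { target })
    }
}

/// Hypothetical unsharded manager that lives in the primary database.
pub struct ExampleStore {
    pub connections: SqlConnections,
}

impl SqlConstruct for ExampleStore {
    fn from_sql_connections(connections: SqlConnections) -> Self {
        ExampleStore { connections }
    }
}

impl SqlConstructFromMetadataDatabaseConfig for ExampleStore {}
```

A manager that lives in a different database would override `remote_database_config` instead of relying on the default.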
Reviewed By: StanislavGlebik
Differential Revision: D20734883
fbshipit-source-id: bb2f4cb3806edad2bbd54a47558a164e3190c5d1
Summary:
A lot of callsites want to know repo name. Currently they need to pass it from
the place where repo was initialized, and that's quite awkward, and in some
places even impossible (i.e. in derived data, where I want to log reponame).
This diff adds reponame to BlobRepo.
Reviewed By: krallin
Differential Revision: D20363065
fbshipit-source-id: 5e2eb611fb9d58f8f78638574fdcb32234e5ca0d
Summary:
This updates microwave to also support changesets, in addition to filenodes.
Those create a non-trivial amount of SQL load when we warm up the cache (due to
sequential reads), which we can eliminate by loading them through microwave.
They're also a bottleneck when manifests are loaded already.
Note: as part of this, I've updated the Microwave wrapper methods to panic if
we try to access a method that isn't instrumented. Since we'd be running
the Microwave builder in the background, this feels OK (because then we'd find
out if we call them during cache warmup unexpectedly).
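The "panic on non-instrumented methods" behaviour can be sketched as follows. The `Changesets` trait here is a tiny hypothetical stand-in (the real trait is async and has more methods), and `RecordingChangesets` is an assumed name for the wrapper.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Tiny hypothetical stand-in for a changesets store.
pub trait Changesets {
    fn get(&self, cs_id: u64) -> Option<String>;
    fn add(&mut self, cs_id: u64, entry: String);
}

#[derive(Default)]
pub struct InMemoryChangesets {
    rows: HashMap<u64, String>,
}

impl Changesets for InMemoryChangesets {
    fn get(&self, cs_id: u64) -> Option<String> {
        self.rows.get(&cs_id).cloned()
    }
    fn add(&mut self, cs_id: u64, entry: String) {
        self.rows.insert(cs_id, entry);
    }
}

/// Wrapper that records every entry loaded through `get`.
pub struct RecordingChangesets<C> {
    inner: C,
    recorded: RefCell<Vec<(u64, String)>>,
}

impl<C: Changesets> RecordingChangesets<C> {
    pub fn new(inner: C) -> Self {
        RecordingChangesets { inner, recorded: RefCell::new(Vec::new()) }
    }

    /// Everything that was loaded, ready to be written to a snapshot.
    pub fn into_recorded(self) -> Vec<(u64, String)> {
        self.recorded.into_inner()
    }
}

impl<C: Changesets> Changesets for RecordingChangesets<C> {
    fn get(&self, cs_id: u64) -> Option<String> {
        let found = self.inner.get(cs_id);
        if let Some(entry) = &found {
            self.recorded.borrow_mut().push((cs_id, entry.clone()));
        }
        found
    }

    /// Not instrumented: fail loudly rather than silently miss entries.
    fn add(&mut self, _cs_id: u64, _entry: String) {
        unimplemented!("add is not instrumented by this wrapper")
    }
}
```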
Reviewed By: farnz
Differential Revision: D20221463
fbshipit-source-id: 317023677af4180007001fcaccc203681b7c95b7
Summary:
We didn't use DelayBlob at all, however we use DelayedBlobstore in benchmark
lib. DelayedBlobstore seems to have more useful options, so let's remove
DelayBlob and use DelayedBlobstore instead.
Reviewed By: farnz
Differential Revision: D20245865
fbshipit-source-id: bd694a0e178367014adc2776185450693f87475d
Summary:
This introduces a new binary and library (microwave: it makes warmup
faster..!) that can be used to accelerate cache warmup. The idea is that the
microwave binary will run cache warmup, capture things that are loaded
during cache warmup, and commit those to a file.
We can then use that file when starting up a host to get a head start on cache
warmup by injecting all those entries into our local cache before actually
starting cache warmup.
Currently, this only supports filenodes, but that's already a pretty good
improvement. Changesets should be easy to add as well. Blobs might require a
bit more work.
Reviewed By: StanislavGlebik
Differential Revision: D20219905
fbshipit-source-id: 82bb13ca487f82ca53b4a68a90ac5893895a96e9
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/
This codemod replaces *all* dependencies on `//common/rust/renamed:futures-preview` with `fbsource//third-party/rust:futures-preview` and their uses in Rust code from `futures_preview::` to `futures::`.
This does not introduce any collisions with `futures::` meaning 0.1 futures because D20168958 previously renamed all of those to `futures_old::` in crates that depend on *both* 0.1 and 0.3 futures.
Codemod performed by:
```
rg \
--files-with-matches \
--type-add buck:TARGETS \
--type buck \
--glob '!/experimental' \
--regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
-x \
buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:futures-preview, 1))" \
| xargs sed -i 's,\bfutures_preview::,futures::,'
rg \
--files-with-matches \
--type-add buck:TARGETS \
--type buck \
--glob '!/experimental' \
--regexp '(_|\b)rust(_|\b)' \
| xargs sed -i 's,//common/rust/renamed:futures-preview,fbsource//third-party/rust:futures-preview,'
```
Reviewed By: k21
Differential Revision: D20213432
fbshipit-source-id: 07ee643d350c5817cda1f43684d55084f8ac68a6
Summary:
In targets that depend on *both* 0.1 and 0.3 futures, this codemod renames the 0.1 dependency to be exposed as futures_old::. This is in preparation for flipping the 0.3 dependencies from futures_preview:: to plain futures::.
rs changes performed by:
```
rg \
--files-with-matches \
--type-add buck:TARGETS \
--type buck \
--glob '!/experimental' \
--regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
-x \
buck query "labels(srcs,
rdeps(%Ss, fbsource//third-party/rust:futures-old, 1)
intersect
rdeps(%Ss, //common/rust/renamed:futures-preview, 1)
)" \
| xargs sed -i 's/\bfutures::/futures_old::/'
```
Reviewed By: jsgf
Differential Revision: D20168958
fbshipit-source-id: d2c099f9170c427e542975bc22fd96138a7725b0
Summary:
This updates our filenodes implementation to use different types for writing
(`PreparedFilenode`) and reading (`FilenodeInfo`).
The bottom line is that this avoids a bunch of cloning of paths on the read
path, which doesn't need to return the path to the caller, since the caller
already knows it! We can also take it out of Memcache, since we don't need
Memcache to tell us the path for a blob we could only possibly have found by
having the path to begin with.
This does update our filenodes serialization format. I bumped MC_CODEVER
accordingly.
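A simplified sketch of the split between the write type and the read type. The field layout and the in-memory `ExampleFilenodes` store are assumptions for illustration. Ids are plain integers and paths are strings.

```rust
use std::collections::HashMap;

/// Read-side type: no path, since the caller already has it.
#[derive(Debug, Clone, PartialEq)]
pub struct FilenodeInfo {
    pub filenode: u64,
    pub p1: Option<u64>,
    pub p2: Option<u64>,
    pub linknode: u64,
}

/// Write-side type: the info plus the path it belongs to.
pub struct PreparedFilenode {
    pub path: String,
    pub info: FilenodeInfo,
}

/// Hypothetical in-memory store keyed by path, then by filenode id.
#[derive(Default)]
pub struct ExampleFilenodes {
    by_path: HashMap<String, HashMap<u64, FilenodeInfo>>,
}

impl ExampleFilenodes {
    /// Writing consumes the path once, as the map key.
    pub fn add(&mut self, prepared: PreparedFilenode) {
        let PreparedFilenode { path, info } = prepared;
        self.by_path.entry(path).or_default().insert(info.filenode, info);
    }

    /// Reading borrows the caller's path and never clones it.
    pub fn get(&self, path: &str, filenode: u64) -> Option<&FilenodeInfo> {
        self.by_path.get(path).and_then(|infos| infos.get(&filenode))
    }
}
```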
Reviewed By: StanislavGlebik
Differential Revision: D19905400
fbshipit-source-id: 6037802c1773de564cade8e264d36087382ee15a
Summary:
This removes the old sqlfilenodes implementation, since we're now using the new
one. There's also a bit of cruft here and there we can get rid of.
Reviewed By: StanislavGlebik
Differential Revision: D19905395
fbshipit-source-id: 2526b6d65eeb981f5aedda9951b44b389ecec29d
Summary:
The API expects a stream of filenodes to insert, but we actually never used
that ability. Instead, every single callsite has a `Vec`, which it converts to
a stream and passes in.
I'd like to change this for two reasons:
- It's unnecessary
- It makes the code more complex on the Filenodes implementation side, and less
efficient, since we need to `chunk()` there in small chunks, which might not
all be in the same shard. If we get the entire `Vec` at once, we can chunk on a
per-shard basis (this happens later in this stack).
Besides, if we end up having a stream and wanting the old behavior, we can
always call `chunk()` on the stream and call `add_filenodes` on each batch (which
is actually nicer because if you have a futures 0.2 stream that isn't static,
you can do this, but you can't turn it into a `BoxStream`!).
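To make the per-shard chunking idea concrete, here is a small sketch assuming filenodes are identified by their path string. The `shard_for` function is a deliberately naive placeholder, not the real sharding scheme, and `chunk_per_shard` is a hypothetical name.

```rust
use std::collections::BTreeMap;

/// Placeholder shard function: sum of the path bytes modulo the shard count.
pub fn shard_for(path: &str, shard_count: usize) -> usize {
    path.bytes().map(|b| b as usize).sum::<usize>() % shard_count
}

/// Group the whole batch by shard first, then split each group into chunks,
/// so that every chunk targets exactly one shard.
pub fn chunk_per_shard(
    filenodes: Vec<String>,
    shard_count: usize,
    chunk_size: usize,
) -> Vec<(usize, Vec<String>)> {
    assert!(shard_count > 0 && chunk_size > 0);
    let mut by_shard: BTreeMap<usize, Vec<String>> = BTreeMap::new();
    for filenode in filenodes {
        by_shard
            .entry(shard_for(&filenode, shard_count))
            .or_default()
            .push(filenode);
    }
    let mut chunks = Vec::new();
    for (shard, items) in by_shard {
        for chunk in items.chunks(chunk_size) {
            chunks.push((shard, chunk.to_vec()));
        }
    }
    chunks
}
```

Chunking a stream before knowing the shard of each item cannot give this guarantee, which is why taking the full `Vec` helps.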
Reviewed By: StanislavGlebik
Differential Revision: D19902537
fbshipit-source-id: a4c030c4a51afbb6e9db133b32464009eed197af
Summary:
Nearly all of the Mononoke SQL stores are instantiated once per repo, but they don't store the `RepositoryId` anywhere, so every method takes it as an argument. Because providing the repo_id on every call is not ergonomic, we tend to add methods to blob_repo that just call the right method with the right repo_id in one of the underlying stores (see `get_bonsai_from_globalrev` on blobrepo for example).
Because my reviewers [pushed back](https://our.intern.facebook.com/intern/diff/D19972871/?transaction_id=196961774880671&dest_fbid=1282141621983439) when I tried to do the same for bonsai_git_mapping, I decided to do it properly by adding the repo_id to BonsaiGitMapping.
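A minimal sketch of the idea, assuming a shared table keyed by repo and git hash. The `Table` alias, the string-typed ids and the method name `get_bonsai_from_git_sha1` are illustrative assumptions, not the actual API.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct RepositoryId(pub i32);

/// Stand-in for a SQL table shared by all repos.
pub type Table = HashMap<(RepositoryId, String), String>;

/// The mapping remembers which repo it was built for, so callers never pass
/// a repo_id on individual calls.
pub struct BonsaiGitMapping<'a> {
    repo_id: RepositoryId,
    table: &'a Table,
}

impl<'a> BonsaiGitMapping<'a> {
    pub fn new(repo_id: RepositoryId, table: &'a Table) -> Self {
        BonsaiGitMapping { repo_id, table }
    }

    pub fn get_bonsai_from_git_sha1(&self, git_sha1: &str) -> Option<&'a str> {
        self.table
            .get(&(self.repo_id, git_sha1.to_string()))
            .map(|bcs_id| bcs_id.as_str())
    }
}
```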
Reviewed By: krallin
Differential Revision: D20029485
fbshipit-source-id: 7585c3bf9cc8fa3cbe59ab1e87938f567c09278a
Summary:
By having it in blobrepo we can ensure that all parts of mononoke can access it
easily.
Reviewed By: StanislavGlebik
Differential Revision: D19949474
fbshipit-source-id: ac3831d61177c4ef0ad7db248f2a0cc5edb933b1
Summary:
Currently if derivation of a particular derived data type is disabled, but a
client makes a request that requires that derived data type, we will fail with
an internal error.
This is not ideal, as internal errors should indicate something is wrong, but
in this case Mononoke is behaving correctly as configured.
Convert these errors to a new `DeriveError` type, and plumb this back up to
the SCS server. The SCS server converts these to a new `RequestError`
variant: `NOT_AVAILABLE`.
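The error plumbing can be sketched roughly like this. The enum shapes, the `derive` and `handle_request` functions and the error message are assumptions for illustration. The real `RequestError` is a thrift type.

```rust
#[derive(Debug, PartialEq)]
pub enum DeriveError {
    /// Derivation of this type is switched off in config.
    Disabled(String),
    /// Anything that indicates a real problem.
    Internal(String),
}

#[derive(Debug, PartialEq)]
pub enum RequestErrorKind {
    NotAvailable,
    InternalError,
}

#[derive(Debug, PartialEq)]
pub struct RequestError {
    pub kind: RequestErrorKind,
    pub reason: String,
}

impl From<DeriveError> for RequestError {
    fn from(err: DeriveError) -> Self {
        match err {
            DeriveError::Disabled(ty) => RequestError {
                kind: RequestErrorKind::NotAvailable,
                reason: format!("derivation of {} is disabled", ty),
            },
            DeriveError::Internal(reason) => RequestError {
                kind: RequestErrorKind::InternalError,
                reason,
            },
        }
    }
}

/// Hypothetical derivation entry point.
pub fn derive(enabled_types: &[&str], ty: &str) -> Result<String, DeriveError> {
    if enabled_types.contains(&ty) {
        Ok(format!("derived {}", ty))
    } else {
        Err(DeriveError::Disabled(ty.to_string()))
    }
}

/// Hypothetical server handler: `?` converts DeriveError into RequestError.
pub fn handle_request(enabled_types: &[&str], ty: &str) -> Result<String, RequestError> {
    Ok(derive(enabled_types, ty)?)
}
```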
Reviewed By: krallin
Differential Revision: D19943548
fbshipit-source-id: 964ad0aec3ab294e4bce789e6f38de224bed54fa
Summary:
- Push .compat down from main into the run function and switch to the 0.3 timed function
Note: A possible next step is to push .compat down into derive_fn and get rid of BoxFuture in run's signature.
Reviewed By: ikostia
Differential Revision: D19943392
fbshipit-source-id: 65bd84492855d3e2e560299a586af6dd4fe9c3ea
Summary: remove the need to pass mapping to `::derive` method
Reviewed By: StanislavGlebik
Differential Revision: D19856560
fbshipit-source-id: 219af827ea7e077a4c3e678a85c51dc0e3822d79
Summary:
This commit manually synchronizes the internal move of
fbcode/scm/mononoke under fbcode/eden/mononoke which couldn't be
performed by ShipIt automatically.
Reviewed By: StanislavGlebik
Differential Revision: D19722832
fbshipit-source-id: 52fbc8bc42a8940b39872dfb8b00ce9c0f6b0800
Summary:
See D19787960 for more details on why we need to do this.
This diff just adds a struct to BlobRepo.
Reviewed By: HarveyHunt
Differential Revision: D19788395
fbshipit-source-id: d609638432db3061f17aaa6272315f0c2efe9328