Summary:
This diff makes the MySQL FFI client the default option for a MySQL connection. This means that if no arguments are provided, the MySQL FFI client is used. The `--use-mysql-client` option is still accepted, as it is used in the configs, and will be removed a bit later.
I also removed raw connections as a way to connect to MySQL from Mononoke, as they are no longer used, although I had to keep some of the `sql_ext` API for now because other projects rely on it.
(I talked to the teams and they are willing to switch to the new client as well. I'm helping where it's possible to replace these raw xdb conns.)
Reviewed By: krallin
Differential Revision: D27925435
fbshipit-source-id: 4f08eef07df676a4e6be58b6e351be3e3d3e8ab7
Summary:
Right now, we can't have defaults in our tunables, because some tests clobber
them. Let's start updating tunables instead of replacing them.
NOTE: I was planning to use this in my next diff, but I ended up not needing
it. That said, it feels like a generally positive improvement, so I figured
I'd keep it.
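A minimal sketch of the difference, using a plain HashMap with made-up tunable names rather than the real Mononoke tunables API: replacing clobbers the defaults, while updating preserves anything the test doesn't mention.

```rust
use std::collections::HashMap;

/// Hypothetical tunables container; names and types are illustrative.
#[derive(Debug, Clone)]
pub struct Tunables {
    values: HashMap<String, i64>,
}

impl Tunables {
    pub fn with_defaults() -> Self {
        let mut values = HashMap::new();
        values.insert("max_commits".to_string(), 100);
        values.insert("timeout_ms".to_string(), 5_000);
        Tunables { values }
    }

    /// Replacing clobbers the defaults: any key not mentioned is lost.
    pub fn replace(&mut self, new: HashMap<String, i64>) {
        self.values = new;
    }

    /// Updating keeps the defaults and only overrides the given keys.
    pub fn update(&mut self, overrides: HashMap<String, i64>) {
        self.values.extend(overrides);
    }

    pub fn get(&self, key: &str) -> Option<i64> {
        self.values.get(key).copied()
    }
}

fn main() {
    let mut updated = Tunables::with_defaults();
    updated.update(HashMap::from([("timeout_ms".to_string(), 1_000)]));
    // The "max_commits" default survives the update.
    println!("update: {:?}", updated.get("max_commits"));

    let mut replaced = Tunables::with_defaults();
    replaced.replace(HashMap::from([("timeout_ms".to_string(), 1_000)]));
    // The "max_commits" default is clobbered by the replace.
    println!("replace: {:?}", replaced.get("max_commits"));
}
```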
Reviewed By: StanislavGlebik
Differential Revision: D27915402
fbshipit-source-id: feeb868d99565a375e4e9352520f05493be94a63
Summary:
Phases calculation can be expensive on the server, so it should be a perf win to disable it when it's not needed.
It shouldn't be needed if narrow heads are enabled.
Reviewed By: quark-zju
Differential Revision: D27908691
fbshipit-source-id: 7000fb23f9332d58c2c488ffbef14d73af4ac532
Summary:
I'd like to make this a tunable, and having some tests turn it off makes that
more complicated. We can now actually flush this, so it also seems a bit
unnecessary. Let's remove.
Reviewed By: StanislavGlebik
Differential Revision: D27911049
fbshipit-source-id: 4634f9f3243132c53c228bd7476002412bace873
Summary:
Integration test showing that MismatchedHeads errors are sent over the wire to
the client.
Reviewed By: quark-zju
Differential Revision: D27798555
fbshipit-source-id: b14a213e9055486bf07ecbb4b5453385df701b48
Summary:
We want to distinguish between no location and failure to compute location.
It is useful to know on the client side if we failed to compute the location of
some hash or we are not sending it because we don't have a location for it.
We have different options for hash-to-location in particular, but the problem is
a general one. Does it make sense for us to skip sending errors at the EdenApi
handler layer? I don't think so; I think that's an old hack. We should be
evolving the EdenApi handler layer to handle application errors.
This is still at the exploration phase. I haven't fixed the client side to
handle the errors.
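One way to model that distinction, sketched with made-up types rather than the actual EdenApi wire format: a per-hash outcome that separates "no location exists" from "computing the location failed".

```rust
/// Illustrative per-hash outcome; not the real EdenApi response types.
#[derive(Debug, PartialEq)]
pub enum LocationOutcome {
    Found(String), // the location, simplified to a string here
    NotFound,      // the server genuinely has no location for this hash
    Error(String), // computing the location failed
}

/// The client can now react differently to the two "no data" cases.
pub fn client_action(outcome: &LocationOutcome) -> &'static str {
    match outcome {
        LocationOutcome::Found(_) => "use the location",
        LocationOutcome::NotFound => "skip: no location exists",
        LocationOutcome::Error(_) => "surface or retry the failure",
    }
}

fn main() {
    let outcomes = [
        LocationOutcome::Found("master~2".to_string()),
        LocationOutcome::NotFound,
        LocationOutcome::Error("timeout".to_string()),
    ];
    for o in &outcomes {
        println!("{:?} -> {}", o, client_action(o));
    }
}
```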
Reviewed By: quark-zju, kulshrax
Differential Revision: D27549926
fbshipit-source-id: 7094160fe997af0e69c8d006b9731ea7dfb3e3f8
Summary:
This is a baseline packer, to let us experiment with packing.
It chooses the dictionary that gives us the smallest size, but does nothing intelligent around key handling.
Note that at this point, packing is not always a win - e.g. the alias blobs are 432 bytes in total, but the resulting pack is 1528 bytes.
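The "choose whichever dictionary compresses smallest" baseline can be sketched like this, with plain byte buffers standing in for real zstd-dictionary outputs (all names and sizes are made up):

```rust
/// Pick the candidate with the smallest encoded size. Each candidate is
/// (dictionary name, bytes that dictionary produced) -- hypothetical data.
pub fn smallest_pack<'a>(candidates: &'a [(&'a str, Vec<u8>)]) -> Option<(&'a str, usize)> {
    candidates
        .iter()
        .min_by_key(|(_, bytes)| bytes.len())
        .map(|(name, bytes)| (*name, bytes.len()))
}

fn main() {
    // Mirrors the caveat above: the best pack (1528 bytes) can still be
    // larger than the raw blobs (432 bytes), so packing isn't always a win.
    let candidates = vec![("dict_a", vec![0u8; 1528]), ("dict_b", vec![0u8; 2048])];
    let raw_total = 432;
    if let Some((name, size)) = smallest_pack(&candidates) {
        println!("best: {} at {} bytes (raw total: {})", name, size, raw_total);
    }
}
```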
Reviewed By: ahornby
Differential Revision: D27795486
fbshipit-source-id: 05620aadbf7b680cbfcb0932778f5849eaab8a48
Summary:
To prevent bonsai changeset divergence between the prod and backup repos, copy
bonsais from the prod repo directly during the hg sync job push.
See more details about motivation in D27824210
Reviewed By: ikostia
Differential Revision: D27852341
fbshipit-source-id: 93e0b1891008858eb99d5e692e4dd60c2e23f446
Summary:
Basically every single Mononoke binary starts with the same preamble:
- Init mononoke
- Init caching
- Init logging
- Init tunables
Some of them forget to do it, some don't, etc. This is a mess.
To make things messier, our initialization consists of a bunch of lazy statics
interacting with each other (init logging & init configerator are intertwined
because configerator wants a logger, but dynamic observability wants a logger
too), and methods you must only call once.
This diff attempts to clean this up by moving all this initialization into the
construction of MononokeMatches. I didn't change all the accessor methods
(though I did update those that would otherwise return things instantiated at
startup).
I'm planning to do a bit more on top of this, as my actual goal here is to make
it easier to thread arguments from MononokeMatches to RepoFactory, and to do so
I'd like to just pass my MononokeEnvironment as an input to RepoFactory.
Reviewed By: HarveyHunt
Differential Revision: D27767698
fbshipit-source-id: 00d66b07b8c69f072b92d3d3919393300dd7a392
Summary:
We actually require tunables in our binaries, but some of our tests have
historically not initialized them, because the underlying binaries don't
load tunables (so they get defaults).
I'd like to remove the footgun of binaries not initializing tunables, but to do
this I need tunables to be everywhere, which is what this does.
Reviewed By: StanislavGlebik
Differential Revision: D27791723
fbshipit-source-id: 13551a999ecebb8e35aef55c0e2c0df0dac20d43
Summary: Changing generic anyhow::Error to ErrorKind so there is no need to downcast when we want to match on errors.
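A sketch of the motivation (variant names here are made up, not the crate's real ErrorKind): with a concrete error type the caller matches directly, whereas with anyhow::Error it would first have to `downcast_ref` and handle the miss.

```rust
use std::fmt;

/// Illustrative typed error; the real variants depend on the crate in question.
#[derive(Debug)]
pub enum ErrorKind {
    NotFound(String),
    Internal(String),
}

impl fmt::Display for ErrorKind {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ErrorKind::NotFound(what) => write!(f, "not found: {}", what),
            ErrorKind::Internal(why) => write!(f, "internal error: {}", why),
        }
    }
}

impl std::error::Error for ErrorKind {}

/// Matching is direct -- no downcast_ref::<ErrorKind>() needed.
pub fn status_code(err: &ErrorKind) -> u16 {
    match err {
        ErrorKind::NotFound(_) => 404,
        ErrorKind::Internal(_) => 500,
    }
}

fn main() {
    let err = ErrorKind::NotFound("bookmark main".to_string());
    println!("{} -> {}", err, status_code(&err));
}
```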
Reviewed By: krallin
Differential Revision: D27742374
fbshipit-source-id: ba4c1779d5919eb989dadf5f457d893a3618fffc
Summary:
NOTE: The revisionstore LFS tests talk to prod Mononoke LFS, so the test here
will fail until that is deployed.
Like it says in the title, this adds support for downloading content in chunks
instead of all in one go. The underlying goal here is to let us do better load
balancing server side. This earlier diff in this stack explains that in more
detail: D27188610 (820b538001).
Reviewed By: quark-zju
Differential Revision: D27191594
fbshipit-source-id: 19b1be9bddc6b32b1fabfe4ab5738dfcf71b1a85
Summary:
If the caller asks us for a range that extends past the end of our file, we'd
rather give them an error instead of silently returning the file.
This actually revealed that one of the tests needed work :)
Note that right now we'll just end up categorizing this as 500. I'd like to
rework the errors we emit in the Filestore, but that's a somewhat bigger
undertaking so I'd like to do it separately.
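A sketch of the check (the function and error shape are illustrative): validate the half-open byte range against the file size up front, instead of clamping silently.

```rust
/// Reject ranges extending past the end, rather than silently truncating.
/// Treats the range as half-open: [start, end).
pub fn validate_range(file_size: u64, start: u64, end: u64) -> Result<(u64, u64), String> {
    if start > end || end > file_size {
        Err(format!(
            "invalid range {}..{} for file of size {}",
            start, end, file_size
        ))
    } else {
        Ok((start, end))
    }
}

fn main() {
    println!("{:?}", validate_range(100, 0, 100)); // in bounds
    println!("{:?}", validate_range(100, 0, 150)); // extends past the end
}
```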
Reviewed By: quark-zju
Differential Revision: D27193353
fbshipit-source-id: 922d68859401eb343cffd201057ad06e4b653aad
Summary:
The backupbookmarks part was used for infinitepush backup bookmarks, which have
been deprecated. Stop sending the part entirely unless
`commitcloud.pushbackupbookmarks` is set.
Reviewed By: StanislavGlebik
Differential Revision: D27710099
fbshipit-source-id: 1eb404f106f5a8d9df6d73e11f60f89c1fa10400
Summary:
The hashes that are passed in as parameters to the hash-to-location function
may not be hashes that actually exist. This change updates the code so that
we don't return an error when an unknown hash is passed in. The unknown
hash will be skipped in the list of results.
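A sketch of the skip behavior, with a plain set standing in for the server's commit store (all names hypothetical): unknown hashes are filtered out of the results instead of failing the whole request.

```rust
use std::collections::HashSet;

/// Return results only for hashes the server knows; unknown ones are skipped.
pub fn resolve_known(requested: &[&str], known: &HashSet<&str>) -> Vec<String> {
    requested
        .iter()
        .filter(|h| known.contains(*h))
        .map(|h| h.to_string())
        .collect()
}

fn main() {
    let known: HashSet<&str> = ["aaa", "bbb"].into_iter().collect();
    // "zzz" does not exist; it is simply absent from the results.
    println!("{:?}", resolve_known(&["aaa", "zzz", "bbb"], &known));
}
```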
Reviewed By: quark-zju
Differential Revision: D27526758
fbshipit-source-id: 8bf9b7a134a6a8a4f78fa0df276f847d922472f5
Summary:
We want to handle the case where the client has multiple heads for master. For
example, when master is moved backwards (or when it gets moved on the client by
force). This updates the request object for HashToLocation to send over all the
master heads.
When the server builds non-master commits, we will want to send over non-master
heads too. We may consider having one big list of heads but I think that we
would still want to distinguish the non-master commit case in order to optimize
that use-case.
Reviewed By: quark-zju
Differential Revision: D27521778
fbshipit-source-id: cc83119b47ee90f902c186528186ad57bf023804
Summary: Currently, no matter which VIP clients connect through, they get a response with `"href": "https://monooke-lfs.internal.tfbnw.net"`, which prevents us from enabling the c2p VIP in corp.
Reviewed By: krallin
Differential Revision: D27331945
fbshipit-source-id: f215cce2f64a2a38accd6d55d5100d8d364ce77b
Summary:
Add support for returning unhydrated draft commits if requested by the client, using a config option 'wantsunhydratedcommits'.
This is needed to support slowly enabling it for some clients, like OnDemand.
Reviewed By: StanislavGlebik
Differential Revision: D27621442
fbshipit-source-id: 672129c8bfcbcdb4cee3ba1b092dac16c0b1877d
Summary:
We already log the file count, but file sizes are another useful piece of
information.
I evaluated two options: either do as I did in this diff, or change the ScribeToScuba
logging python script to query scs to get file sizes. I opted for option #1,
mainly because scs doesn't have a method to query file sizes for many files at once, and
querying them one by one might be too expensive. We could add a method like that, but
that's a bit of a bigger change than I'd like.
Reviewed By: andll
Differential Revision: D27620507
fbshipit-source-id: 2618e60845bc293535b190d4da85df5667a7ab60
Summary: Reduce the number of manual steps needed to restart a manual scrub by checkpointing how far it has got to a file.
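A minimal sketch of the checkpointing idea, assuming the checkpoint is just the last processed key written to a file (the format and key shape here are made up):

```rust
use std::fs;
use std::path::Path;

/// Persist the last scrubbed key so a restarted scrub can resume from it.
pub fn save_checkpoint(path: &Path, last_key: &str) -> std::io::Result<()> {
    fs::write(path, last_key)
}

/// Returns None if there is no usable checkpoint (start from the beginning).
pub fn load_checkpoint(path: &Path) -> Option<String> {
    fs::read_to_string(path).ok().filter(|s| !s.is_empty())
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join("scrub_checkpoint_demo");
    save_checkpoint(&path, "repo0001.content.blake2.abc")?;
    println!("resume from: {:?}", load_checkpoint(&path));
    fs::remove_file(&path)
}
```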
Differential Revision: D27588450
fbshipit-source-id: cb0eda7d6ff57f3bb18a6669d38f5114ca9196b0
Summary:
I'd like to just get rid of that library since it's one more place where we
specify the Tokio version and that's a little annoying with the Tokio 1.x
update. Besides, this library is largely obsoleted by `#[fbinit::test]` and
`#[tokio::test]`.
Reviewed By: farnz
Differential Revision: D27619147
fbshipit-source-id: 4a316b81d882ea83c43bed05e873cabd2100b758
Summary:
Some manual scrub runs can take a long time. Provide progress feedback logging.
Includes a --quiet option for when progress reporting is not required.
Reviewed By: farnz
Differential Revision: D27588449
fbshipit-source-id: 00840cdf2022358bc10398f08b3bbf3eeec2b299
Summary:
This warmer hasn't been used, and we aren't going to use it going forward.
Let's remove it.
Reviewed By: krallin
Differential Revision: D27533802
fbshipit-source-id: deaaf954ed535789ab6b5dc89b8da9967c40e84f
Summary:
We weren't validating any hashes in the walker, so it was entirely possible to
have corrupt content in the repo. Let's start hash validation by validating hg
filenodes.
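The shape of the check, sketched without the real hashing: recompute the content id of the fetched bytes and compare it against the id they were fetched under. Real hg filenodes hash content plus parents with SHA-1; std's DefaultHasher stands in here only to keep the sketch dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in content id; real filenodes use SHA-1 over content and parents.
pub fn content_id(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Validate that stored bytes still hash to the id they were stored under.
pub fn validate(expected: u64, bytes: &[u8]) -> Result<(), String> {
    let actual = content_id(bytes);
    if actual == expected {
        Ok(())
    } else {
        Err(format!("hash mismatch: expected {}, got {}", expected, actual))
    }
}

fn main() {
    let id = content_id(b"file contents");
    println!("intact: {:?}", validate(id, b"file contents"));
    println!("corrupt: {:?}", validate(id, b"tampered contents"));
}
```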
Reviewed By: ahornby
Differential Revision: D27459516
fbshipit-source-id: 495d59436773d76fcd2ed572e1e724b01012be93
Summary:
Currently we treat a missing blob just like any other error, and that makes it hard to
have an alarm on blobs that are missing, because any transient error might make
the alarm go off.
Let's instead differentiate between a blob being missing and failing to fetch a
blob because of any other error (e.g. Manifold is unavailable). It turned out this
is not too difficult to do, because a lot of the types implement the Loadable trait,
which has a Missing variant in its errors.
side-note:
It looks like we have 3 steps which treat missing blobs as not being an
error. These steps are:
1) fastlog_dir_step
2) fastlog_file_step
3) hg_filenode_step
I think this is wrong, and I plan to fix it in the next diff
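The classification the alarm needs can be sketched like this (variant names are illustrative, echoing the Loadable Missing variant mentioned above):

```rust
/// Illustrative fetch error split into "truly missing" vs "transient".
#[derive(Debug)]
pub enum FetchError {
    Missing(String),   // blob key that does not exist in the store
    Transient(String), // e.g. the blobstore was temporarily unavailable
}

/// Only genuinely missing blobs should trip the alarm.
pub fn should_alarm(err: &FetchError) -> bool {
    matches!(err, FetchError::Missing(_))
}

fn main() {
    let errors = vec![
        FetchError::Missing("content.blake2.abc".to_string()),
        FetchError::Transient("manifold unavailable".to_string()),
    ];
    let missing = errors.iter().filter(|e| should_alarm(e)).count();
    println!("missing blobs: {} of {}", missing, errors.len());
}
```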
Reviewed By: ahornby
Differential Revision: D27400280
fbshipit-source-id: e79fff25c41e4d03d77b72b410d6d2f0822c28fd
Summary:
Add a way to disable revnum resolution for non-automation (non-HGPLAIN)
use cases. Automation (e.g. nuclide-core) still uses revision numbers.
Some tests still use revnums. Put them in a list so this feature
does not break them.
Reviewed By: DurhamG
Differential Revision: D27144492
fbshipit-source-id: fba97fc90c7942e53914c29354938178b2637f44
Summary: Use the EdenApi bookmark request and response types in the bookmarks EdenApi endpoint. Serialize the response as CBOR.
Reviewed By: kulshrax
Differential Revision: D27174122
fbshipit-source-id: 6bc7295c25bd355db4625da3c1f8c68349e7b0b7
Summary: We have a number of small repos that don't have much traffic or load, so they can be combined into one single tailer job.
Reviewed By: StanislavGlebik
Differential Revision: D27361196
fbshipit-source-id: 3326a7445cf4f0f0cb27fe0f4545cf8ee3357ff2
Summary: Use the test factory for test fixtures that are used in many existing tests.
Reviewed By: StanislavGlebik
Differential Revision: D27169432
fbshipit-source-id: bb3cbfa95b330cf6572d1009c507271a7b008dec
Summary:
Convert `BlobRepo` to a `facet::container`. This will allow it to be built
from an appropriate facet factory.
This only changes the definition of the structure: we still use
`blobrepo_factory` to construct it. The main difference is in the types
of the attributes, which change from `Arc<dyn Trait>` to
`Arc<dyn Trait + Send + Sync + 'static>`, specified by the `ArcTrait` alias
generated by the `#[facet::facet]` macro.
Reviewed By: StanislavGlebik
Differential Revision: D27169437
fbshipit-source-id: 3496b6ee2f0d1e72a36c9e9eb9bd3d0bb7beba8b
Summary: We have support for backup-repo-id, but tw blobimport doesn't have a repo id and only has a source repo name to use. Let's add support similar to the other repo-id/source-repo-id options.
Reviewed By: StanislavGlebik
Differential Revision: D27325583
fbshipit-source-id: 44b5ec7f99005355b8eaa4c066cb7168ec858049
Summary:
This will let us better load balance across our clients by having them send
requests that are a fixed size. Indeed, right now, load imbalance between
our hosts seems to be largely attributable to the fact that some hosts are
responsible for a few 150MB+ blobs, whereas others have none of those.
The motivation here is to let clients query us by range. Then, we can just have
clients fetch e.g. ~40MB at a time, and route those to different hosts, thus
eliminating said load imbalance.
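The range-splitting a client would do can be sketched as follows (the ~40MB figure comes from the summary; the function name is made up):

```rust
/// Split `total` bytes into half-open [start, end) ranges of at most
/// `chunk` bytes each, so every request is a similar, routable size.
pub fn split_ranges(total: u64, chunk: u64) -> Vec<(u64, u64)> {
    assert!(chunk > 0, "chunk size must be positive");
    let mut ranges = Vec::new();
    let mut start = 0;
    while start < total {
        let end = (start + chunk).min(total);
        ranges.push((start, end));
        start = end;
    }
    ranges
}

fn main() {
    // A 150MB blob fetched in ~40MB ranges: four requests instead of one.
    let mb = 1024 * 1024;
    for (start, end) in split_ranges(150 * mb, 40 * mb) {
        println!("Range: bytes={}-{}", start, end - 1);
    }
}
```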
Reviewed By: StanislavGlebik
Differential Revision: D27188610
fbshipit-source-id: b9bbe18904cdd88ec7bab2b76e096fd3585c7b78
Summary:
The Git client was updated and it started outputting a hint that breaks our tests. It has no meaning in tests, so just silence it.
```
➜ fbcode git init
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint: git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint: git branch -m <name>
Initialized empty Git repository in /data/users/mzr/fbsource/fbcode/.git/
```
Reviewed By: StanislavGlebik
Differential Revision: D27232853
fbshipit-source-id: 683ebebb36049adb758e7c26f843f12159a45301
Summary:
While Mononoke should support pushing large commits, it's not clear if we need
(or want) to support pushing a lot of commits. At the very least, pushing lots of commits at the same
time can create problems with derivation, but there could be more places that might break.
Besides, there's usually an easy way to work around it, i.e. squash commits or push in small batches.
If we ever need to import a lot of commits, we use tools like blobimport/repo_import,
so blocking pushes of lots of commits should be fine.
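The guard itself is simple; a sketch with an illustrative limit and error message (the real check would live in the push path and read its limit from config):

```rust
/// Reject pushes that carry more than `limit` commits.
pub fn check_push_size(num_commits: usize, limit: usize) -> Result<(), String> {
    if num_commits > limit {
        Err(format!(
            "pushing {} commits exceeds the limit of {}; squash or push in smaller batches",
            num_commits, limit
        ))
    } else {
        Ok(())
    }
}

fn main() {
    println!("{:?}", check_push_size(10, 1000));
    println!("{:?}", check_push_size(5000, 1000));
}
```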
Reviewed By: farnz
Differential Revision: D27237798
fbshipit-source-id: 435e61110f145b06a177d6e492905fccd38d83da
Summary: Update the logs so that it's clearer what is going on.
Reviewed By: quark-zju
Differential Revision: D27145099
fbshipit-source-id: 11ec7b467157d07dd41893dc82f251a1c555365f
Summary: We have new config fields available that can specify default compression level, let's read and use them.
Reviewed By: StanislavGlebik
Differential Revision: D27127455
fbshipit-source-id: 27935fd58da5f1150c9caf56d9601c37f2ae3581
Summary:
Better Git<->Bonsai conflict reporting.
It now prints the conflicting mapping, so one can see if there is a 2xbcs_id -> same git_sha1 or 2xgit_sha1 -> same bcs_id conflict. This is useful when trying to sort out why this issue appeared in the first place.
Reviewed By: ahornby, krallin
Differential Revision: D27058044
fbshipit-source-id: 9db7a210c1b0be3e0e90aa78b561293d6cf29c26
Summary:
Segmented Changelog distinguishes commits into two groups: MASTER and
NON_MASTER. The MASTER group is assumed to be big, and special attention is paid
to it. Algorithms optimize for efficiency on MASTER.
The current state of the segmented_changelog crate in Mononoke is that it does
not assign NON_MASTER commits. It doesn't need to right now. We want to
establish a baseline with the MASTER group. It was, however, possible for the
on-demand update dag to assign commits that were not in the master group to the
master group, because no explicit checks were performed. That could lead to
surprising behavior.
At a high level, the update logic that we want is: 1. assign the master
bookmark changeset to the MASTER group, 2. assign other commits that we need to
operate on to the NON_MASTER group. For now we need 1, we will implement 2
later.
Reviewed By: krallin
Differential Revision: D27070083
fbshipit-source-id: 922bcde3641ca25512000cd1a912c5b399bdff4b
Summary:
Pull in SegmentedChangelogConfig and build a SegmentedChangelog instance.
This ties the config to the object that we build on the servers.
This separates the instantiation of the sql connections from building any kind of
segmented changelog structure. The primary reason is that there may be multiple
objects that get instantiated, and for that it is useful to be able to pass
this object around.
Reviewed By: krallin
Differential Revision: D26708175
fbshipit-source-id: 90bc22eb9046703556381399442117d13b832392
Summary:
Seeding fbsource, I found that loading the commits from sql took longer than I
was expecting: around 90 minutes where I was expecting around 10 minutes.
I added more logging to validate that commits were actively being loaded rather
than something being stuck.
Reviewed By: krallin
Differential Revision: D27084739
fbshipit-source-id: 07972707425ecccd4458eec849c63d6d9ccd923d
Summary:
The way I plan to use the new streaming_changelog in prod is by running it
periodically (say, every 15 mins or so). However some repos won't get many
commits in the last 15 mins (in fact, they might get just 1 or 2).
And even for high-commit-rate repos, most of the time the last chunk
will not be a full chunk (i.e. it will be less than --max-data-chunk-size).
If we were just uploading the last chunk regardless of its size, then the size of the
streaming changelog database table would just keep growing by 1 entry every
15 mins, even if it's completely unnecessary. Instead, I suggest adding an option
to not upload the last chunk if it's not necessary.
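The chunking rule can be sketched like this (the entry type and names are illustrative): with the option on, a trailing partial chunk is held back for a later run.

```rust
/// Split entries into chunks of `chunk_size`; optionally drop a trailing
/// partial chunk so it can be uploaded once it fills up. Panics if
/// `chunk_size` is zero.
pub fn upload_chunks(entries: &[u64], chunk_size: usize, skip_partial: bool) -> Vec<Vec<u64>> {
    let mut chunks: Vec<Vec<u64>> =
        entries.chunks(chunk_size).map(|c| c.to_vec()).collect();
    let drop_last = skip_partial && chunks.last().map_or(false, |c| c.len() < chunk_size);
    if drop_last {
        chunks.pop();
    }
    chunks
}

fn main() {
    // Five revs, chunks of two: the lone trailing rev is deferred.
    println!("{:?}", upload_chunks(&[1, 2, 3, 4, 5], 2, true));
    println!("{:?}", upload_chunks(&[1, 2, 3, 4, 5], 2, false));
}
```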
Reviewed By: farnz
Differential Revision: D27045681
fbshipit-source-id: 2d0fed3094944c4ed921f36943b881af394d9c17
Summary:
This command can be used to update already existing streaming changelog.
It takes a newly cloned changelog and updates the new streaming changelog
chunks in the database.
The biggest difference from the "create" command is that we first need to figure
out what's already uploaded to the streaming changelog. For that, two new methods
were added to SqlStreamingChunksFetcher.
Reviewed By: farnz
Differential Revision: D27045386
fbshipit-source-id: 36fc9387f621e1ec8ad3eb4fbb767ab431a9d0bb