Summary:
Facebook
We need them since we are going to sync ovrsource commits into fbsource
Reviewed By: krallin
Differential Revision: D23701667
fbshipit-source-id: 61db00c7205d5d4047a4040992e7195f769005d3
Summary: Noticed there had been an upstream 3.11.10 release with a fix for a performance regression in 3.11.9; the PR was https://github.com/xacrimon/dashmap/issues/100
Reviewed By: krallin
Differential Revision: D23690797
fbshipit-source-id: aff3951e5b7dbb7c21d6259e357d06654c6a6923
Summary:
This just makes it more convenient to use in circumstances where functions expect
`impl LCAHint` and we have an `Arc<dyn LCAHint>`.
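The forwarding impl can be sketched roughly like this; `LcaHint` and its method here are hypothetical stand-ins for the real trait, not the actual API:

```rust
use std::sync::Arc;

// Hypothetical simplified stand-in for the real LCAHint trait.
trait LcaHint {
    fn generation(&self, name: &str) -> u64;
}

// Forward the trait through the smart pointer, so that functions
// taking `impl LcaHint` also accept an `Arc<dyn LcaHint>`.
impl LcaHint for Arc<dyn LcaHint + Send + Sync> {
    fn generation(&self, name: &str) -> u64 {
        (**self).generation(name)
    }
}

// A toy implementation used to exercise the forwarding impl.
struct Fixed(u64);
impl LcaHint for Fixed {
    fn generation(&self, _name: &str) -> u64 {
        self.0
    }
}

// A function that expects `impl LCAHint`-style arguments.
fn use_hint(hint: impl LcaHint) -> u64 {
    hint.generation("master")
}

fn main() {
    let shared: Arc<dyn LcaHint + Send + Sync> = Arc::new(Fixed(7));
    // Without the forwarding impl, this call would not compile.
    assert_eq!(use_hint(shared), 7);
}
```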
Reviewed By: farnz
Differential Revision: D23650380
fbshipit-source-id: dacbfcafe6f3806317a81ed4677af190027f010b
Summary: Previously, `SignalStream` would assume that an error would always end the `Stream`, and would therefore stop and report the amount of transferred data upon encountering any error. This isn't always the desired behavior, as it is possible for `TryStream`s to return mid-stream errors without immediately ending. `SignalStream` should allow for this kind of usage.
Reviewed By: farnz
Differential Revision: D23643337
fbshipit-source-id: 2c7ffd9d02c05bc09c6ec0e282c0b2cca166e079
Summary:
The current getbundle implementation works this way: it accepts two lists of
changesets, a list of `heads` and a list of `common`. getbundle needs to return
enough changesets to the client so that the client ends up with `heads` and all
of their ancestors. `common` is a hint to the server about which nodes the
client already has and does not need sent.
The current implementation is inefficient when any of the heads has a low
generation number while the excludes have high generation numbers. In that case
it needs to find all ancestors of the excludes whose generation is equal to or
lower than the generation of the low-generation head.
This diff is a hacky attempt to improve that. It splits the heads into heads
with high generation numbers and heads with low generation numbers, sends a
separate getbundle request for each group, and then combines the results.
Example
```
O <- head 1
|
O <- exclude 1
|
... O <- head 2 <- low generation number
| |
O O <- exclude 2
```
If we have a request like `(heads: [head1, head2], excludes: [exclude1, exclude2])`, then this optimization splits it into two requests, `(heads: [head1], excludes: [exclude1, exclude2])` and `(heads: [head2], excludes: [exclude2])`, and combines the results. Note that this might result in overfetching, i.e. returning many more commits to the client than it requested, which can make a request much slower. To combat that, I suggest adding a tunable `getbundle_low_gen_optimization_max_traversal_limit` that prevents overfetching.
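The splitting step can be sketched as follows. `Head`, the `LOW_GEN_THRESHOLD` constant, and the request shape are illustrative assumptions, not Mononoke's real types:

```rust
// Illustrative sketch of the low-generation split described above.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Head {
    id: u64,
    gen: u64, // generation number: distance from a root commit
}

// Hypothetical cutoff between "high" and "low" generation numbers.
const LOW_GEN_THRESHOLD: u64 = 1_000;

// Partition the requested heads so that high-gen and low-gen heads
// can be fetched with separate getbundle requests, whose results
// are then combined.
fn split_heads(heads: &[Head]) -> (Vec<Head>, Vec<Head>) {
    heads
        .iter()
        .copied()
        .partition(|h| h.gen >= LOW_GEN_THRESHOLD)
}

fn main() {
    let heads = [Head { id: 1, gen: 50_000 }, Head { id: 2, gen: 3 }];
    let (high_gen, low_gen) = split_heads(&heads);
    assert_eq!(high_gen, vec![Head { id: 1, gen: 50_000 }]);
    assert_eq!(low_gen, vec![Head { id: 2, gen: 3 }]);
}
```

In the example above, the low-gen request would then carry only `exclude2`, avoiding the expensive ancestor traversal of the high-gen excludes.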
Reviewed By: krallin
Differential Revision: D23599866
fbshipit-source-id: fcd773eb6a0fb4e8d2128b819f7a13595aca65fa
Summary:
This completely converts the Mercurial changeset to be an instance of derived data:
- Custom lease logic is removed
- Custom changeset traversal logic is removed
The naming scheme of keys for leases has been changed to conform with other derived data types. This might cause a temporary spike of CPU usage during rollout.
Reviewed By: farnz
Differential Revision: D23575777
fbshipit-source-id: 8eb878b2b0a57312c69f865f4c5395d98df7141c
Summary:
The shed/sql library is used mainly to communicate with MySQL and to provide a nice abstraction layer over MySQL (used in production) and SQLite (used in integration tests). The library provides an interface that, on the MySQL side, was backed by raw connections and by MyRouter.
This diff introduces a new backend: the new MySQL client for Rust.
The new backend is exposed as a third variant of the current model: sqlite, mysql (raw connections and MyRouter) and mysql2 (the new client). The main reason for this is that the current shed/sql interface for MySQL
(1) heavily depends on the mysql_async crate, (2) introduces much more complexity than the new client needs, and (3) will likely be refactored and cleaned up later, with the old pieces deprecated.
So, rather than overcomplicate things by implementing the existing interface for the new MySQL client, I simplified matters by adding it as a third backend option.
Reviewed By: farnz
Differential Revision: D22458189
fbshipit-source-id: 4a484b5201a38cc017023c4086e9f57544de68b8
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/56
In addition to adding the flag from the commit title, the "test-scs*" tests are now excluded via a pattern rather than listed out. This will help in the future when more tests using scs_server are added, as authors won't have to add them to the exclusion list.
Reviewed By: farnz
Differential Revision: D23647298
fbshipit-source-id: f5c263b9f68c59f4abf9f672c7fe73b63fa74102
Summary:
lookup has special support for requests that start with _gitlookup: it maps an
hg commit to the corresponding git commit and vice versa. Let's support that in Mononoke as well.
Reviewed By: krallin
Differential Revision: D23646778
fbshipit-source-id: fcde58500b5956201e718b0a609fc3fee1bbdd28
Summary: This diff refactors the tool to use changeset ids instead of changesets, since the tool's functionality only ever uses the ids. For process recovery (next steps) it's preferable to save changeset ids, since they are simpler to store. If we kept using changesets in the tool, we would need to load them from the saved ids only to use just the ids throughout the process; that loading step would be redundant, so we should simply use the ids.
Reviewed By: krallin
Differential Revision: D23624639
fbshipit-source-id: d9b558ebb46c0670bd09556783060f12d3a279ed
Summary: Add a Scuba handler for the EdenAPI server, which will allow the server to log custom columns to Scuba via `ScubaMiddleware`. For now, the only application specific columns are `repo` and `method`.
Reviewed By: krallin
Differential Revision: D23619437
fbshipit-source-id: f08aaf9c84657b4d92f1a1cfe5f0cb5ccd408e5e
Summary: Add `OdsMiddleware` to the EdenAPI server to log various aggregate request stats to ODS. This middleware is directly based on the `OdsMiddleware` from the LFS server (which unfortunately can't be easily generalized due to the way in which the `stats` crate works).
Reviewed By: krallin
Differential Revision: D23619075
fbshipit-source-id: d361c73d18e0d1cb57347fd24c43bdb68fb7819d
Summary:
Add a new `HandlerInfo` struct that can be inserted into the request's `State` by each handler to provide information about the handler to post-request middleware. (This includes things like which handler ran, which repo was queried, etc.)
A note on design: I did consider adding these fields to `RequestContext` (similar to how it's done in the LFS server), but that proved somewhat problematic in that it would require using interior mutability to avoid forcing each handler to borrow the `RequestContext` mutably (and therefore prevent it from borrowing other state data). The current approach seems preferable to adding a `Mutex` inside of `RequestContext`.
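The idea can be sketched with a toy type-indexed map; this is a simplified stand-in for Gotham's `State`, and the `HandlerInfo` fields shown are illustrative:

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

// Minimal stand-in for Gotham's `State`: a map indexed by type.
#[derive(Default)]
struct State(HashMap<TypeId, Box<dyn Any>>);

impl State {
    fn put<T: Any>(&mut self, v: T) {
        self.0.insert(TypeId::of::<T>(), Box::new(v));
    }
    fn try_borrow<T: Any>(&self) -> Option<&T> {
        self.0.get(&TypeId::of::<T>()).and_then(|b| b.downcast_ref())
    }
}

// Hypothetical HandlerInfo mirroring the idea in this diff.
struct HandlerInfo {
    repo: String,
    method: &'static str,
}

fn handler(state: &mut State) {
    // The handler records what it did, with no interior mutability
    // or `Mutex` needed inside a shared RequestContext.
    state.put(HandlerInfo { repo: "fbsource".into(), method: "files" });
}

fn post_request_middleware(state: &State) -> Option<String> {
    state
        .try_borrow::<HandlerInfo>()
        .map(|i| format!("{} {}", i.repo, i.method))
}

fn main() {
    let mut state = State::default();
    handler(&mut state);
    assert_eq!(
        post_request_middleware(&state).as_deref(),
        Some("fbsource files")
    );
}
```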
Reviewed By: krallin
Differential Revision: D23619077
fbshipit-source-id: 911806b60126d41e2132d1ca62ba863c929d4dc9
Summary: Once we start to move the bookmark for the large repo commits, small repo commits should also start to appear for the dependent systems (e.g. Phabricator) through back-syncing. This diff adds this functionality to see if the commits have been recognised by the tools.
Reviewed By: StanislavGlebik
Differential Revision: D23566994
fbshipit-source-id: 2f6f3b9099bb864fec6a488064abe1abe7f06813
Summary: Useful when looking into blobstore corruption - you can compare all the blobstore versions by manually fetching the keys.
Reviewed By: krallin
Differential Revision: D23604436
fbshipit-source-id: 7b56947b0188536499514bae6615c6e81b9106c3
Summary: Going to add more features, so simplify by asyncifying first
Reviewed By: krallin
Differential Revision: D23604437
fbshipit-source-id: 52b2b372e4d3fbf1d59168c6c11311d9edf4ff0f
Summary: When we're scrubbing blobstores, it's not actually a success state if a scrub fails to write. Report this back to the caller - no-one will usually be scrubbing unless they expect repair writes to succeed, and a failure is a sign that we need to investigate further
Reviewed By: mitrandir77
Differential Revision: D23601541
fbshipit-source-id: d328935af9999c944719a6b863d0c86b28c54f59
Summary:
One test was fixed earlier by switching macOS to use a modern version of bash; the other is fixed here by installing "nmap" and using "ncat" from within it on both Linux and Mac.
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/50
Reviewed By: krallin
Differential Revision: D23599695
Pulled By: lukaspiatkowski
fbshipit-source-id: e2736cee62e82d1e9da6eaf16ef0f2c65d3d8930
Summary:
Fixes a few issues with Mononoke tests in Python 3.
1. We need to use different APIs to account for the unicode vs bytes difference
for path hash encoding.
2. We need to set the language environment for tests that create utf8 file
paths.
3. We need the redaction message and marker to be bytes. Oddly this test still
fails with jq CLI errors, but it makes it past the original error.
Reviewed By: quark-zju
Differential Revision: D23582976
fbshipit-source-id: 44959903aedc5dc9c492ec09a17b9c8e3bdf9457
Summary:
typed_hash only implements Serialize. Because of this, if we serialize a struct that contains e.g. changeset ids, we can't deserialize it later. This diff adds a Deserialize implementation for typed hashes.
Implementation is similar to HgNodeHash's: https://fburl.com/diffusion/r3df5iga
Reviewed By: krallin
Differential Revision: D23598925
fbshipit-source-id: 4d48b75eb8a01028e6e2d9bcc1ae20051a97b7fb
Summary:
We are using an older version of tokio, which spawns as many threads as there
are physical cores instead of logical cores. This was fixed in
https://github.com/tokio-rs/tokio/issues/2269, but we can't pick it up yet
because we are waiting for another fix to be released -
https://github.com/rust-lang/futures-rs/pull/2154.
For now, let's hardcode the thread count in Mononoke.
Reviewed By: krallin
Differential Revision: D23599140
fbshipit-source-id: 80685651a7a29ba8938d9aa59770f191f7c42b8b
Summary:
This change moves the logic associated with Mercurial changeset derivation into the `mercurial_derived_data` crate.
NOTE: it is not converted to the derived data infrastructure at this point; this is a preparation step for actually doing that.
Reviewed By: farnz
Differential Revision: D23573610
fbshipit-source-id: 6e8cbf7d53ab5dbd39d5bf5e06c3f0fc5a8305c8
Summary:
The Mac integration test workflow already installs a modern curl that fixes https://github.com/curl/curl/issues/4801, but it does so after "hg" is built, so "hg" uses the system curl libraries, which fail when used with a certificate not present in the keychain.
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/53
Reviewed By: krallin
Differential Revision: D23597285
Pulled By: lukaspiatkowski
fbshipit-source-id: a7b8b6ae55ce338bfb9946a852cbb6b929e73203
Summary:
There are blobs that fail to scrub and terminate the process early for a variety of reasons; when this is running as a background task, it'd be nice to get the remaining keys scrubbed so that you don't have a large number of keys to fix up later.
Instead of simply outputting to stdout, write keys to one of three files in the format accepted on stdin:
1. Success; you can use `sort` and `comm -3` to remove these keys from the input data, ensuring that you can continue scrubbing.
2. Missing; you can look at these keys to determine which blobs are genuinely lost from all blobstores, and fix them up.
3. Error; these will need running through scrub again to determine what's broken.
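The routing into the three output buckets can be sketched as follows; the `scrub` stub and the key names are made up for illustration:

```rust
// Sketch of routing scrub results into the success / missing / error
// buckets described above (in the real tool these are three files).
enum ScrubResult {
    Success,
    Missing,
    Error,
}

// Stand-in for a real blobstore scrub of one key.
fn scrub(key: &str) -> ScrubResult {
    match key {
        k if k.starts_with("missing") => ScrubResult::Missing,
        k if k.starts_with("bad") => ScrubResult::Error,
        _ => ScrubResult::Success,
    }
}

// Scrub every key and keep going past failures, collecting each key
// into the bucket matching its outcome.
fn route(keys: &[&str]) -> (Vec<String>, Vec<String>, Vec<String>) {
    let (mut ok, mut missing, mut err) = (vec![], vec![], vec![]);
    for k in keys {
        match scrub(k) {
            ScrubResult::Success => ok.push(k.to_string()),
            ScrubResult::Missing => missing.push(k.to_string()),
            ScrubResult::Error => err.push(k.to_string()),
        }
    }
    (ok, missing, err)
}

fn main() {
    let (ok, missing, err) = route(&["key1", "missing-key", "bad-key"]);
    assert_eq!(ok, vec!["key1"]);
    assert_eq!(missing, vec!["missing-key"]);
    assert_eq!(err, vec!["bad-key"]);
}
```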
Reviewed By: krallin
Differential Revision: D23574855
fbshipit-source-id: a613e93a38dc7c3465550963c3b1c757b7371a3b
Summary:
With three blobstores in play, we have issues working out exactly what's wrong during a manual scrub. Make the error handling better:
1. Manual scrub adds the key as context for the failure.
2. Scrub error groups blobstores by content, so that you can see which blobstore is most likely to be wrong.
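Grouping blobstores by returned content can be sketched like this; the store names and contents are illustrative:

```rust
use std::collections::HashMap;

// Sketch: group blobstores by the content they returned for a key,
// so the odd one out is easy to spot when diagnosing a scrub error.
fn group_by_content<'a>(
    results: &[(&'a str, &'a str)],
) -> HashMap<&'a str, Vec<&'a str>> {
    let mut groups: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(store, content) in results {
        groups.entry(content).or_default().push(store);
    }
    groups
}

fn main() {
    let results = [
        ("blobstore_a", "abc"),
        ("blobstore_b", "abc"),
        ("blobstore_c", "xyz"), // disagrees: most likely the wrong one
    ];
    let groups = group_by_content(&results);
    assert_eq!(groups["abc"], vec!["blobstore_a", "blobstore_b"]);
    assert_eq!(groups["xyz"], vec!["blobstore_c"]);
}
```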
Reviewed By: ahornby, krallin
Differential Revision: D23565906
fbshipit-source-id: a199e9f08c41b8e967d418bc4bc09cb586bbb94b
Summary:
Sorting bookmark names can be expensive for the MySQL server. As we
don't rely on the ordering of bookmark names when requesting all bookmarks,
remove the sorting.
I've not modified the `Select.*After` queries as they are used for pagination,
which does rely on the order of bookmark names. Further, any queries for
bookmarks that have a limit other than `std::u64::MAX` will remain sorted.
Reviewed By: ahornby
Differential Revision: D23574741
fbshipit-source-id: 79e07b64bb8bb34229c429bdf885c5144963f140
Summary:
test-blobimport.t creates a few files that conflict on a case-insensitive file system, so make them differ by changing the number of underscores in one of them.
test-pushrebase-block-casefolding.t directly tests a feature of a case-sensitive file system, so it cannot really be tested on macOS.
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/49
Reviewed By: farnz
Differential Revision: D23573165
Pulled By: lukaspiatkowski
fbshipit-source-id: fc16092d307005b6f0c8764c1ce80c81912c603b
Summary:
Previously we did not log a redacted access if the previous access was logged
less than MIN_REPORT_TIME_DIFFERENCE_NS ago. That doesn't work well with our
tests.
Let's instead add a sampling tunable.
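A sampling tunable can be sketched like this; the names and the tunable plumbing are made up for illustration:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical sketch: log one out of every `sample_rate` redacted
// accesses, instead of suppressing repeats within a time window.
static REDACTED_ACCESS_COUNTER: AtomicU64 = AtomicU64::new(0);

fn should_log_redacted_access(sample_rate: u64) -> bool {
    if sample_rate == 0 {
        // A rate of 0 disables logging entirely.
        return false;
    }
    let n = REDACTED_ACCESS_COUNTER.fetch_add(1, Ordering::Relaxed);
    n % sample_rate == 0
}

fn main() {
    // With a rate of 3, every third access is logged.
    let logged = (0..9).filter(|_| should_log_redacted_access(3)).count();
    assert_eq!(logged, 3);
}
```

Unlike the time-based check, this is deterministic, which makes it easy to exercise in tests by setting the rate to 1.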
Reviewed By: krallin
Differential Revision: D23595067
fbshipit-source-id: 47f6152945d9fdc2796fd1e74804e8bcf7f34940
Summary: Moving some of the functionality required for Mercurial changeset derivation into a separate crate. This is needed to convert Mercurial changesets to derived data without the circular dependency it would otherwise create.
Reviewed By: StanislavGlebik
Differential Revision: D23566293
fbshipit-source-id: 9d30b4b3b7d8a922f72551aa5118c43104ef382c
Summary: This is needed in a later diff that requires "codec" feature from `future-util`.
Reviewed By: dtolnay
Differential Revision: D23575630
fbshipit-source-id: e9cdf11b6ec05e5f2744da6b6efd8cb7bf08b212
Summary:
This method is the future of synced-commit-mapping: there can be multiple query
results, and the business logic should decide whether that is acceptable,
rather than picking a random one.
In later diffs I will introduce the consumers for this method.
Reviewed By: mitrandir77
Differential Revision: D23574165
fbshipit-source-id: f256f82c9848f54e5096c6e50d42600bfd260081
Summary:
Another preparatory step for the actual mapping model fix. This just renames
the `get` method to `get_one` to emphasize its use case and to ease searching later.
At the end of this change, I expect there to be no use cases for `get_one` and expect it to be gone.
Reviewed By: mitrandir77
Differential Revision: D23574116
fbshipit-source-id: f5015329b15f3f08961006607d0f9bf10f499a88
Summary: This is just preparatory extraction to make further work more convenient.
Reviewed By: mitrandir77
Differential Revision: D23574077
fbshipit-source-id: 352ca8ac62bae4fd8fcb980da05c95ce477a414e
Summary:
See D23538897 for context. This adds a killswitch so we can roll out client
certs gradually through dynamicconfig.
Reviewed By: StanislavGlebik
Differential Revision: D23563905
fbshipit-source-id: 52141365d89c3892ad749800db36af08b79c3d0c
Summary:
Like it says in the title, this updates remotefilelog to present client
certificates when connecting to LFS (this was historically the case in the
previous LFS extension). This has a few upsides:
- It lets us understand who is connecting, which makes debugging easier;
- It lets us enforce ACLs;
- It lets us apply different rate limits to different use cases.
Config-wise, those certs were historically set up for Ovrsource, and the auth
mechanism will ignore them if not found, so this should be safe. That said, I'd
like to add a killswitch for this nonetheless. I'll reach out to Durham to see
if I can use dynamic config for that.
Also, while I was in there, I cleaned up a few functions that were taking
ownership of things they didn't need.
Reviewed By: DurhamG
Differential Revision: D23538897
fbshipit-source-id: 5658e7ae9f74d385fb134b88d40add0531b6fd10
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).
This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.
---
*Why now?* **:** The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land to the 2.0 branch.
---
Reviewed By: StanislavGlebik
Differential Revision: D23568780
fbshipit-source-id: b4b4a0aa683d236e2fdeb5b96d723ac2d84b9faf
Summary:
This test fails on sandcastle because the last two lines are not showing up.
I have a hunch that the last two lines just weren't flushed, and this diff
attempts to fix that.
Reviewed By: krallin
Differential Revision: D23570321
fbshipit-source-id: fd7a3315582c313a05e9f46b404e811384bd2a50
Summary: When we imported a repo (T71717570), we received a network connect error after querying graphql a lot. I am not sure if it's because of the frequency of the queries, but just to be on the safe side, I increased the default sleep time between queries.
Reviewed By: krallin
Differential Revision: D23538886
fbshipit-source-id: 6a84f509e5e19f86880d3f8c6413f2f47e4a469b
Summary: Add Scuba logging using `ScubaMiddleware` from `gotham_ext`. Each request will be logged to the Scuba dataset specified by the `--scuba-dataset` flag, as well as optionally to the log file specified by `--scuba-log-file`.
Reviewed By: sfilipco
Differential Revision: D23547668
fbshipit-source-id: e6cd88ad729a40cf45b63538f7481ee098ea12dc
Summary: Import middleware directly from `gotham_ext` rather than relying on reexports in the `middleware` module.
Reviewed By: farnz
Differential Revision: D23547320
fbshipit-source-id: e64a8acff55445a646b0a1b3b1e71cf6606c3d02
Summary:
Move `ScubaMiddleware` out of the LFS server and into `gotham_ext`.
This change required splitting up the `ScubaKey` enum to separate generally useful column names (e.g., HTTP columns that would be applicable to any HTTP service) from LFS-specific columns. `ScubaMiddlewareState` has been modified to accept any type that implements `Into<String>` as a key, and the `ScubaKey` enum has been split into `HttpScubaKey` (in `gotham_ext`) and `LfsScubaKey` (in `lfs_server`).
The middleware now takes a type parameter to specify a "handler" (implementing the new `ScubaHandler` trait) which allows the application to add application-specific Scuba columns in addition to the default columns. The application-specific columns will be added immediately prior to the sample being logged.
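The handler hook can be sketched like this; the trait and types below are simplified stand-ins for the real `ScubaHandler` trait and Scuba client:

```rust
use std::collections::HashMap;

// Minimal sketch of the handler idea: the generic middleware collects
// default columns, then lets an app-specific handler append its own
// columns just before the sample is logged.
trait ScubaHandler {
    fn add_stats(&self, sample: &mut HashMap<String, String>);
}

// Hypothetical EdenAPI handler adding the `repo` and `method` columns.
struct EdenApiHandler {
    repo: String,
    method: String,
}

impl ScubaHandler for EdenApiHandler {
    fn add_stats(&self, sample: &mut HashMap<String, String>) {
        sample.insert("repo".into(), self.repo.clone());
        sample.insert("method".into(), self.method.clone());
    }
}

// Stand-in for the middleware's logging step, generic over the handler.
fn log_sample<H: ScubaHandler>(handler: H) -> HashMap<String, String> {
    let mut sample = HashMap::new();
    // Default columns any HTTP service would log.
    sample.insert("http_status".into(), "200".into());
    // App-specific columns are added immediately before logging.
    handler.add_stats(&mut sample);
    sample
}

fn main() {
    let sample = log_sample(EdenApiHandler {
        repo: "fbsource".into(),
        method: "files".into(),
    });
    assert_eq!(sample.get("repo").map(String::as_str), Some("fbsource"));
}
```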
Reviewed By: krallin
Differential Revision: D23458748
fbshipit-source-id: 3e99f3e0b5d3475a4f5ac9eaefade2eeff12c2fa