Summary:
HgManifestFileNode is one of the last remaining types we don't walk (the other known one is the git derived data).
It's added as a separate NodeType from HgFileNode because HgManifestFileNode is used much less and users may want to see only the HgFileNodes. Server-side, the manifest file node is only used to build the bundles returned to the client.
Differential Revision: D28010248
fbshipit-source-id: ce4c773b0f1996df308f1b271890f29947c2c304
Summary:
This uses code that depends on Tokio 0.x so let's get this removed. I think
I could have updated this to just use manifold-thrift, but while I'm at it,
might as well update to the new shiny client.
Reviewed By: farnz
Differential Revision: D28025165
fbshipit-source-id: 4b79c8a4bd0b8789e6827d2135d36245db4447d5
Summary:
Fix missing stats for the bookmarks endpoint because we have the wrong name in
code.
Reviewed By: quark-zju
Differential Revision: D28008091
fbshipit-source-id: 128fe00e00a06ebe9b65fb11512cd03a042d55fe
Summary:
This doesn't link the errors into Scuba, which makes it not very useful for
debugging, since we're not routinely looking at stderr on our tasks, and makes
it impossible to e.g. look anywhere and find a count of failed requests.
Instead, update this to use `capture_first_err`, which will report both the
first error and the total error count to Scuba.
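A minimal sketch of the idea behind `capture_first_err` (the type and method names here are illustrative, and the real code wraps a futures `Stream` rather than an iterator): pass items through unchanged, but remember the first error and count all of them so a post-response callback can report to Scuba.

```rust
// Illustrative wrapper: forwards every item, records the first error
// and the total error count for later reporting.
pub struct CaptureFirstError<I> {
    inner: I,
    first_error: Option<String>,
    error_count: usize,
}

impl<I> CaptureFirstError<I> {
    pub fn new(inner: I) -> Self {
        Self { inner, first_error: None, error_count: 0 }
    }

    /// What a post-response callback would log to Scuba.
    pub fn report(&self) -> (Option<&str>, usize) {
        (self.first_error.as_deref(), self.error_count)
    }
}

impl<I, T> Iterator for CaptureFirstError<I>
where
    I: Iterator<Item = Result<T, String>>,
{
    type Item = Result<T, String>;

    fn next(&mut self) -> Option<Self::Item> {
        let item = self.inner.next()?;
        if let Err(e) = &item {
            self.error_count += 1;
            if self.first_error.is_none() {
                self.first_error = Some(e.clone());
            }
        }
        Some(item)
    }
}
```

The key property is that the wrapper is transparent to the consumer of the stream; only the post-response hook inspects `report()`.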
Reviewed By: sfilipco
Differential Revision: D27998037
fbshipit-source-id: b941d44a2ac21bbf640b9bc977de749207f12d9a
Summary:
In EdenAPI, most endpoints don't raise errors when they fail, and instead just take items out of the response stream (though there are some exceptions: D24315399 (0ccae1cef9)).
Right now, this just gets logged to stderr, which isn't great. This diff introduces a CaptureFirstError wrapper we can use to capture those errors and expose them in post-response callbacks (i.e. Scuba).
Reviewed By: sfilipco
Differential Revision: D27998036
fbshipit-source-id: 960d0e09a82ca79adfafe22e2eeef2e0294d27dc
Summary:
This raw XDB connections API is not used in Mononoke although there are projects depending on it.
The API creates connection objects based on shed/sql crate.
This diff moves `SqlConnections` into `shed/sql:sql_common`, closer to the `Connection` definition, and moves `create_raw_xdb_connections` API into the `shed/sql:sql_facebook::raw`.
Reviewed By: krallin
Differential Revision: D28003638
fbshipit-source-id: ea4a29b6e239a89c97237877e2dfde4c7c7ff89b
Summary:
As I split the pushrebase crate in D27968029 (93b8cf116b) it makes sense to stop re-exporting hook definitions from it.
This change will also make dependencies more accurate.
Reviewed By: krallin
Differential Revision: D28028045
fbshipit-source-id: 622ef5d7db737e19153109f4d2dcefe54fba2bb4
Summary:
Finally! This is basically the end goal of this stack (though we still need to
do the same thing with the "ForwardErr" combinator used by EdenAPI next).
With this, we can now capture errors that occur while sending the response
stream, instead of just errors that occur while producing the response (which
is usually a little piece of the work we have to do).
Reviewed By: mitrandir77
Differential Revision: D27967991
fbshipit-source-id: a5483c58f0550a19e711e712cf860d9328a0eb9e
Summary:
`ContentMeta` sounds a lot like a struct containing content metadata, so let's
rename this to something less ambiguous (`ContentMetaProvider`).
The reason I'm doing this now is that I'd like to introduce a similar
trait for errors that occur on our response streams, and there I'll need
both a struct and a trait.
Reviewed By: mitrandir77
Differential Revision: D27998039
fbshipit-source-id: f0372c62d13167ef4bd07cb9e9e9fb75ea105b9a
Summary:
Like it says in the title. The goal here is to make it not matter where the
error came from. In this diff, we capture the same errors as before, but we
do it via PostResponseInfo instead of via ad-hoc things in our `State`.
Reviewed By: mitrandir77
Differential Revision: D27967994
fbshipit-source-id: dbbb1a8f5ea1a439089c41b4a08cd6088476ae33
Summary: Like it says in the title.
Reviewed By: mitrandir77
Differential Revision: D27967992
fbshipit-source-id: 0deb4d90538a6889bee6b41de4c5d1533b29519b
Summary:
Very small refactor. I want this stuff to all be in the same module instead
of spread across `response` and `error`.
Reviewed By: mitrandir77
Differential Revision: D27967993
fbshipit-source-id: aca22f952d756d298e5e342f0c4f8ebd31f108bf
Summary:
This is a bit of an abstract change, but the goal of this change is to make
post-response handlers oblivious to the distinction between sending a response
(or failing to send one) and returning a response that actually contains a
(fallible) stream.
The underlying / ultimate goal here is to move our error reporting out of
ad-hoc router wrappers where we call `set_error_message` on some context
entity, and to instead move it into post-response callbacks.
The upshot of that is that if we fail to send a response even though we sent a
200 from the handler, we'll be able to log it! Indeed, remember that when
sending a streaming response, we have to send a 200 to start streaming but we
might hit an error producing later parts of the response!
Reviewed By: mitrandir77
Differential Revision: D27966422
fbshipit-source-id: ab49639bfcc4af23ddc2e84181278f105ebb28b9
Summary:
This stuff runs after we sent the response, so PostResponse is a more
appropriate name than PostRequest.
Reviewed By: ikostia
Differential Revision: D27966420
fbshipit-source-id: 1f7b7a55490f731eb512793024bcfafb0ea4ef79
Summary:
Those two have historically used different (but largely copy-pasted) code to
produce their responses. Let's unify them.
While in there, let's also modernize the formatting a little bit by letting
anyhow do the formatting for us (we used to use `failure` when this code was
written, and it didn't do that).
There's a bit of ugliness here in the sense that our formatting is injecting
the error into the state so it can be logged later. This is equivalent to what
we had, but it's kinda clunky. That said, I'm working on refactoring our error
handling in this stack, so the plan is to clean this up (i.e. it won't stay
very long).
Finally, note that this diff removes the `ResponseCreationFailure` error kind
in LFS. This is replaced by a `.context()` in `gotham_ext`. That said, we never
really use this stuff: creations are fallible in Hyper but you only run into
an error if you e.g. forget to add a status code, so we don't expect them to
actually occur outside of development.
Reviewed By: mitrandir77
Differential Revision: D27966421
fbshipit-source-id: 097f3b69f25fe39f8fbef925a272e88199896b39
Summary:
Like it says in the title, I'd like to use names that reflect that this isn't
just *any* content stream: it's specifically for responses.
Reviewed By: ahornby
Differential Revision: D27964045
fbshipit-source-id: 50530cf85ba7840a2baa14151351d0b288d9ae70
Summary:
We have a set of things that are meant to be used together that are spread
across 3-4 different modules. Let's move them together. This also allows us to
make some things (e.g. the `ContentMeta` trait) private as they're no longer
needed.
Note: this diff only moves stuff around & renames things a bit. I'll also split
some of those modules in the next diff.
Reviewed By: HarveyHunt
Differential Revision: D27964047
fbshipit-source-id: 02528d84adfd70ec346c32163cb185d89266a53e
Summary:
We have a module called "content" that contains two completely unrelated
things: some enums for content encodings (+ associated parsing), and our output
streams.
I'd like to add more of said output streams, so let's clean this up.
Reviewed By: HarveyHunt
Differential Revision: D27964046
fbshipit-source-id: b42e56aa3fadaf9b93a44216977da19d950a16b9
Summary:
We used to have to do this because of overly strict trait bounds in Hyper, but
we no longer do, so let's get rid of it. Note that we have one Tokio task per
request, and polling the response stream is basically the only thing that task
does, so this should make little difference as far as task scheduling is
concerned besides avoiding unnecessary context switches.
Reviewed By: ahornby
Differential Revision: D27963458
fbshipit-source-id: 24e762eb173156dab909fefe11dcf2d58272048a
Summary: We don't use this anymore. Might as well remove it.
Reviewed By: ahornby
Differential Revision: D27963411
fbshipit-source-id: a6ac337936e8b2bd788dd79a835eef11b19dde70
Summary:
This change makes it so that our binaries do not instantiate a real configo
client in integration test setup.
Reviewed By: ahornby
Differential Revision: D28026790
fbshipit-source-id: 0fb9ce66a1324e845f4b8a80d4479263ec6e4ee1
Summary:
First, some background on the existing WrappedPath type: In Mononoke the MPath type is such that None==Root and Some(MPath)==NonRoot. This means that where a path may be present one needs to use double-Option with Option<Option<MPath>>, so that Root is Some(None).
To reduce the need for double Option, and subsequently to allow for newtype features like memoization, the walker has WrappedPath, so we can use Option<WrappedPath> instead.
This change introduces a similar type WrappedPathHash for MPathHash, which means that the sample_fingerprint for WrappedPath can now be non-optional, as even root paths/manifests can now have a sample_fingerprint.
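The double-Option problem and the wrapper described above can be sketched as follows (a minimal illustration: MPath is simplified to a String here, and the enum shape is an assumption about what the wrapper looks like):

```rust
// In Mononoke, None == repo root and Some(MPath) == a non-root path, so
// "optionally, a path" forces Option<Option<MPath>>, where the root is
// the awkward Some(None).
pub type MPath = String; // simplified stand-in for the real MPath type

// A wrapper flattens this: Option<WrappedPath> is either absent (None)
// or a present path that is Root or NonRoot.
#[derive(Debug, Clone, PartialEq)]
pub enum WrappedPath {
    Root,
    NonRoot(MPath),
}

impl WrappedPath {
    pub fn from_mpath(p: Option<MPath>) -> Self {
        match p {
            None => WrappedPath::Root,
            Some(p) => WrappedPath::NonRoot(p),
        }
    }
}
```

Being a newtype also gives a place to hang extras like memoized hashes, which is what WrappedPathHash adds for MPathHash.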
Reviewed By: mitrandir77
Differential Revision: D27995143
fbshipit-source-id: b674abd4ec94749f4f5797c697ae7381e1a08d02
Summary:
This adds the first part of new logging from the walker that can be used to gather details on what keys might make sense to pack together.
Unlike the corpus command that dumps file content by path (which was useful for analysis of compression approaches), this adds logging to the scrub command that includes the path hash rather than the full path. This should keep memory usage down during the run, hopefully mean we can log from existing scrub jobs, and mean the logs are more compact.
Reviewed By: mitrandir77
Differential Revision: D27974448
fbshipit-source-id: 47b55112b47e9b022f16fbb473cf233a7d46bcf3
Summary:
Knowing the prepushrebase changeset id is required for retroactive_review.
retroactive_review checks landed commits, but the verify_integrity hook runs on a commit before landing. As a result, the landed commit has no straightforward connection with the original one, and retroactive_review can't tell whether verify_integrity has seen it.
Reviewed By: krallin
Differential Revision: D27911317
fbshipit-source-id: f7bb0cfbd54fa6ad2ed27fb9d4d67b9f087879f1
Summary:
Split pushrebase crate into pushrebase hook definition and pushrebase implementation.
Before this change it was impossible to store an attribute in BlobRepo that would depend on PushrebaseHook as it would create a circular dependency `pushrebase -> blobrepo -> pushrebase`.
Reviewed By: krallin
Differential Revision: D27968029
fbshipit-source-id: 030ef1b02216546cd3658de1f417bc52b8dd9843
Summary:
This diff wires up actual `scs_server` methods `megarepo_add_sync_target_config`, `megarepo_add_sync_target`, `megarepo_add_sync_target_poll` to the underlying logic in the `CfgrMononokeMegarepoConfigs`.
One of these is a synchronous method (`megarepo_add_sync_target_config`), so it is implemented properly. This method only allows adding new configs to an existing target.
The other two are a pair of async methods (create request + poll request) for target creation with an initial config. On the one hand, we don't yet have any infrastructure for async methods, so properly implementing this pair is not possible. What's more, target creation is a two-part operation: save a config + create an initial repo commit. The second part is not yet implemented at all (and is what requires an async implementation, as it is going to be expensive). On the other hand, I would like to expose the concept of target creation for the client to test; that's why I add `FAKE_ADD_TARGET_TOKEN` to mask a so-far synchronous impl of this method as asynchronous.
Once I implement async methods, I will come back and work on a proper `megarepo_add_sync_target` impl (this is the first method to be implemented).
Important: any use of these methods now should be considered experimental, and we'll have to delete all of these configs later (because all of the targets won't have any corresponding bookmarks in the real repos, which makes them invalid).
Reviewed By: StanislavGlebik
Differential Revision: D27885979
fbshipit-source-id: 9e2a914af1a7db2ec00ffa11a832ddd71fd19d0f
Summary:
This will help people to introduce new blobstore objects in their code (for
instance I intend to use it in the following diff).
The `private` module exists to allow the use of the exported macro without the
need to write a bunch of `use` statements, and without polluting the re-export
namespace. The idea is that everything needed by the exported macro exists in
the `private` module of the crate, and this module is public.
So long as some other crate imports the macro, it expands to
`$crate::private::NEEDED_THING` in the right places and no further `use`
statements of dependencies are needed. At the same time, the name `private`
should discourage people from using whatever is in this module directly. The
idea is taken from `anyhow`.
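A hypothetical sketch of the pattern (the module name `private` matches the description above, but the helper and macro names are made up for illustration): the macro body reaches its dependencies through `$crate::private`, so call sites need no extra `use` statements.

```rust
// Public-but-discouraged module holding everything the macro expansion needs.
pub mod private {
    pub fn make_blobstore_id(name: &str) -> String {
        format!("blobstore:{}", name)
    }
}

#[macro_export]
macro_rules! define_blobstore {
    ($name:expr) => {
        // The expansion reaches the helper via `$crate::private`, so the
        // caller never has to import `make_blobstore_id` itself.
        $crate::private::make_blobstore_id($name)
    };
}
```

This is the same trick `anyhow` uses: the macro works anywhere the crate is a dependency, while the `private` name signals that the module is not a supported API surface.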
Reviewed By: StanislavGlebik
Differential Revision: D27997228
fbshipit-source-id: fd2c421d0daf0fe88e2b9001bb4088fe7b4d59b7
Summary:
Collecting an unsorted iterator into SortedVecMap is inefficient because of
how [try_collect()](https://docs.rs/futures-util/0.3.0/src/futures_util/stream/try_stream/try_collect.rs.html#57) works. It calls the `extend()` method every time a new element is fetched from the
stream. So if the stream is unsorted, we run into quadratic behaviour with
lots of reallocations.
Let's instead collect into BTreeMap, and then convert it into SortedVecMap.
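The contrast can be sketched like this (a stand-in for SortedVecMap: a Vec kept sorted by key; function names are illustrative):

```rust
use std::collections::BTreeMap;

// Inserting unsorted elements one at a time (as try_collect's repeated
// `extend()` effectively does) shifts the tail of the Vec on every
// insert: O(n^2) in the worst case.
pub fn collect_via_sorted_vec(
    items: impl Iterator<Item = (u64, String)>,
) -> Vec<(u64, String)> {
    let mut v: Vec<(u64, String)> = Vec::new();
    for (k, val) in items {
        let pos = v.partition_point(|(key, _)| *key < k);
        v.insert(pos, (k, val)); // shifts everything after pos
    }
    v
}

// The fix: collect into a BTreeMap first (O(n log n)), then do one
// final conversion into the sorted representation.
pub fn collect_via_btreemap(
    items: impl Iterator<Item = (u64, String)>,
) -> Vec<(u64, String)> {
    let map: BTreeMap<u64, String> = items.collect();
    map.into_iter().collect()
}
```

Both produce the same sorted result; only the intermediate work differs.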
Reviewed By: markbt
Differential Revision: D27997469
fbshipit-source-id: 58f837e6cc946ccc8809cce3d7a5a2e6ca24df40
Summary:
I'd like to have a quick way of documenting how this is supposed to be used,
so let's add it.
Reviewed By: HarveyHunt
Differential Revision: D27996500
fbshipit-source-id: 0d138ac3335a3ecb7f0e15aebbdc89d01941cbed
Summary:
Scrub blobstore logging was missing the common server logging fields that LogBlob and MultiplexedBlobstore add.
Also moved the LogBlob scuba construction closer to its use site for clarity.
Reviewed By: StanislavGlebik
Differential Revision: D27966453
fbshipit-source-id: 77fe70606602753301a2503691a490c0b11c755a
Summary:
Currently when we call with_mutated_scuba() we create a new LoggingContainer.
That means that all the data from the previous LoggingContainer like PerfCounters
is lost. I suspect this is the reason we don't log any BlobGets/Puts for
repo_create_commit methods (see
[scuba](https://fburl.com/scuba/mononoke_scs_server/fautos3s)) - we call
[with_mutated_scuba method](https://fburl.com/code/srd1c4xu) right before
calling repo_create_commit(), and I suspect this loses the counters.
Let's instead copy all the Logging fields when calling `with_mutated_scuba`.
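The shape of the fix can be sketched with simplified stand-ins for the real types (field and type names here are illustrative, not the actual Mononoke definitions): mutate only the scuba sample and carry every other logging field over, instead of building a fresh container.

```rust
use std::collections::HashMap;

#[derive(Clone, Default, Debug, PartialEq)]
pub struct ScubaSample(pub HashMap<String, String>);

#[derive(Clone, Default)]
pub struct LoggingContainer {
    pub scuba: ScubaSample,
    // Stand-in for PerfCounters etc. -- the state that was silently
    // dropped when a brand-new container was constructed.
    pub perf_counters: HashMap<String, u64>,
}

impl LoggingContainer {
    pub fn with_mutated_scuba(
        &self,
        f: impl FnOnce(ScubaSample) -> ScubaSample,
    ) -> Self {
        Self {
            // Only the scuba sample changes...
            scuba: f(self.scuba.clone()),
            // ...everything else is copied across, so counters survive.
            perf_counters: self.perf_counters.clone(),
        }
    }
}
```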
Reviewed By: krallin
Differential Revision: D27964719
fbshipit-source-id: 881c11bb5fb1927dbf55d0d625ea8bfbf11be131
Summary:
In test-cross-repo-commit-validator.t and test-cross-repo-commit-sync., we
modify bookmarks outside of Mononoke, so we need to flush them before pulling.
In test-megarepo-invisible-merge.t, things are actually a little more subtle
and I wonder if there might be another issue lying around there. If we don't
flush bookmarks, then we attempt to upload one more hg commit, and that
blows up: P410235472. However, if we flush bookmarks, then we don't attempt to
upload, and all is fine. Here, flushing is just a workaround, but for now
that'll do. There was also another bug here, where we change configs but
don't force them to take effect.
Reviewed By: StanislavGlebik
Differential Revision: D27964959
fbshipit-source-id: 9c4304b38513177e402ee64f309e019e227ed2a7
Summary:
When we're packing, we pay an overhead price for keeping the key in the pack. As we're only bothered about reducing size, let's limit that price to when the savings from packing are worth it.
There are two cases where it's not worth it:
1. When the compressed pack is larger than the sum of single compressed files sizes.
2. When compressing a single file on its own.
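The two cases above amount to a small decision rule, sketched here with a hypothetical helper (sizes in bytes; the real packer's accounting, including the per-key overhead, is more involved):

```rust
// Pack only when the compressed pack beats compressing each file on
// its own. `packed_size` should already include the overhead of
// keeping the keys in the pack.
pub fn should_pack(packed_size: u64, individual_sizes: &[u64]) -> bool {
    // Case 2: a single file (or nothing) never benefits from packing.
    if individual_sizes.len() < 2 {
        return false;
    }
    // Case 1: skip if the pack is no smaller than the standalone files.
    let standalone: u64 = individual_sizes.iter().sum();
    packed_size < standalone
}
```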
Reviewed By: ahornby
Differential Revision: D27913258
fbshipit-source-id: 36cdc2a2b30aa508281ac3bbd70da41322533edb
Summary:
Like it says in the title. This test wants to see consistent bookmarks so it
should flush them.
Note that this used to just opt out of bookmarks caching entirely, but I'd like
us to try and avoid having so many snowflakes in our tests (because it makes
their maintenance harder), so instead of changing the environment, let's change
the test to do what it needs to do.
Reviewed By: mzr, HarveyHunt
Differential Revision: D27964099
fbshipit-source-id: 72e00bad07dec15f18faaf4aa2e32e78cb333ab0
Summary: This way the fallback server knows which traffic is coming from Mononoke.
Reviewed By: krallin
Differential Revision: D27946019
fbshipit-source-id: 8c13ae641ba340ba55322871ca30fb6accb3f007
Summary:
Update the zstd crates.
This also patches async-compression crate to point at my fork until upstream PR https://github.com/Nemo157/async-compression/pull/117 to update to zstd 1.4.9 can land.
Reviewed By: jsgf, dtolnay
Differential Revision: D27942174
fbshipit-source-id: 26e604d71417e6910a02ec27142c3a16ea516c2b
Summary:
We get pretty frequent query errors from MySQL on this, but it's hard to debug
without knowing what is being queried.
Reviewed By: StanislavGlebik
Differential Revision: D27941603
fbshipit-source-id: 62e0f0fe9c3af36ed829c401e957ecf7683a4000
Summary:
This diff makes the MySQL FFI client the default option for a MySQL connection. It means that if no arguments are provided, the MySQL FFI client is used. The `--use-mysql-client` option is still accepted, as it is used in the configs, and will be removed a bit later.
I also removed raw connections as a way to connect to MySQL from Mononoke, as they are no longer used. Although I had to keep some `sql_ext` API for now because other projects rely on it.
(I talked to the teams and they are willing to switch to the new client as well. I'm helping where it's possible to replace these raw xdb conns.)
Reviewed By: krallin
Differential Revision: D27925435
fbshipit-source-id: 4f08eef07df676a4e6be58b6e351be3e3d3e8ab7
Summary:
Right now, we can't have defaults in our tunables, because some tests clobber
them. Let's start updating tunables instead of replacing them.
NOTE: I was planning to use this in my next diff, but I ended up not needing
it. That said, it feels like a generally positive improvement, so I figured
I'd keep it.
Reviewed By: StanislavGlebik
Differential Revision: D27915402
fbshipit-source-id: feeb868d99565a375e4e9352520f05493be94a63
Summary:
This updates the bookmarks cache TTL to be something we configure using
tunables instead of repo configs. There are a few goals here:
- Lets us tune the pressure we put on SQL via a tunable
- Lets us experiment more easily with disabling this cache and tuning the
WBC poll interval.
Right now, this defaults to what we had set up in all our repo configs, which
is 2000ms.
Reviewed By: farnz
Differential Revision: D27915403
fbshipit-source-id: 4361d38939e5b2a0fe37442ed8c1dd0611af5098
Summary:
One of our plans for this half is to replace the warm bookmarks cache with a
service, and we suspect this will effectively eliminate bookmarks queries from
our hosts, because we think they all come from the WBC.
But, before we invest our time into this, let's make sure that this assumption
is actually correct, by tracking who's querying bookmarks.
Reviewed By: StanislavGlebik
Differential Revision: D27938407
fbshipit-source-id: d9a9298e7409c9518a4b9bf8ac0a6cef53750473
Summary:
I'd like to be able to track the proportion of traffic coming to bookmarks from
warm bookmarks cache vs. from elsewhere. We don't have a great abstraction to
pass this via the CoreContext at this time, but the SessionClass seems like a
pretty good fit.
Indeed, it's always available in the CoreContext and can be freely mutated
without having to rebuild the whole session. Besides, it aligns pretty well
with the existing use cases we have for SessionClass, which is to give you a
different level of service depending on who you are.
Reviewed By: StanislavGlebik
Differential Revision: D27938413
fbshipit-source-id: a9dcc5a10c8d1459ee9586324a727c668e2e4e40
Summary:
Phases calculation can be expensive on the server, so it should be a perf win to disable it when it's not needed.
It shouldn't be needed if narrow heads are enabled.
Reviewed By: quark-zju
Differential Revision: D27908691
fbshipit-source-id: 7000fb23f9332d58c2c488ffbef14d73af4ac532
Summary:
`MononokeMegarepoConfig` is going to be a single point of access to the
config storage system, providing both writes and reads. It is also a trait, to
allow for unit-test implementations later.
This diff introduces a trait, as well as implements the write side of the
configerator-based implementor. The read side/oss impl/test impl
are left `unimplemented`. The read side and test impl will be implemented in the future.
Things I had to consider while implementing this:
- I wanted to store each version of `SyncTargetConfig` in an individual
`.cconf` in configerator
- at the same time, I did not want all of them to live in the same dir, to
avoid having dirs with thousands of files in it
- dir sharding uses sha1 of the target repo + target bookmark + version name,
then separates it into a dir name and a file name, like git does
- this means that these `.cconf` files are not "human-addressable" in the
configerator repo
- to help this, each new config creation also creates an entry in one of the
"index" files: human-readable maps from target + version name to a
corresponding `.cconf`
- using a single index file is also impractical, so these are separated by
asciification of the repo_id + bookmark name
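The git-style dir sharding above can be sketched as follows (an illustrative helper: the real code hashes target repo + bookmark + version name with SHA-1; here the hex digest is passed in so the sketch stays std-only, and it assumes a digest of at least two hex chars):

```rust
// Split a hex digest into (directory, file) the way git shards its
// object store: first two hex chars name the directory, the rest name
// the file.
pub fn shard_path(hex_digest: &str) -> (String, String) {
    let (dir, file) = hex_digest.split_at(2);
    (dir.to_string(), file.to_string())
}
```

For example, a digest starting "ab34..." lands at "ab/34...", which caps the number of entries in any one directory.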
Note: this design means that there's no automatic way to fetch the list of all
targets in use. This can be bypassed by maintaining an extra index layer, which
will list all the targets. I don't think this is very important atm.
Reviewed By: StanislavGlebik
Differential Revision: D27795663
fbshipit-source-id: 4d824ee4320c8be5187915b23e9c9d261c198fe1