Summary:
This updates multiplexedblob and logblob to capture perf counters for the
operations we run, and log them to Scuba. Along with the previous diffs in this
stack, this gives us the number of Manifold retries, total delay, and conflicts
all logged to blobstore trace on a per-operation basis.
Reviewed By: HarveyHunt
Differential Revision: D24333039
fbshipit-source-id: 9c7d0a467f8df08dcb2a0d3bb6b543cdb3ea1d90
Summary:
This updates ManifoldBlob to log the aformentioned data points to perf
counters. There's a bit of refactoring that also had to go into this to make
`ctx` available everywhere it's needed.
Reviewed By: aslpavel
Differential Revision: D24333040
fbshipit-source-id: 1b63bcd1e1ee36bae4dbbc1da053c7f1bdf96675
Summary:
This adds support for "forking" perf counters at a point in the stack, giving
you a CoreContext that logs to one or more sets of perf counters.
This is useful for low level operations where we want to collect granular more
logging data — in particular blobstore operations, where we'd like to collect
the time spent waiting for Manifold retries or the number of Manifold retries
in blobstore trace for each individual blobstore operation (we can't do that
using the `CoreContext` we have because that would be missing
The implementation supports a list of reference counted perf counters in the
CoreContext. When you want to add a new counter, we replace the list with a new
one, and give you a reference to the one you just added. When you write, we
write to all perf counters, and when you read, we read from the "top" perf
counter (which is always there). To read from one of the forked counters, you
use the reference you got whne you created it.
Reviewed By: aslpavel
Differential Revision: D24333041
fbshipit-source-id: ce318dfc04a1ea435b2454b53df4cae93d57c0a5
Summary:
The x-repo usually have "noop" mapping i.e. mapping that doesn't change the
paths (it might have arbitrary name though)
It's useful for commits that are identical between small and large repo to be
able to backfill this mapping. This diff adds a command to do that.
Reviewed By: krallin
Differential Revision: D24337281
fbshipit-source-id: 89a058f95677e4a5c8686122a317eadf8b1bb995
Summary: It will be used in the next diff, so let's move it to a separate function.
Reviewed By: krallin
Differential Revision: D24334717
fbshipit-source-id: e50d13d45c633397504cf08954f2ced9ace8f570
Summary: Convert derived data utils to use new style futures
Reviewed By: StanislavGlebik
Differential Revision: D24331068
fbshipit-source-id: ad658b278802afa1e4ecd44c5a24164135748790
Summary:
This is needed to be able to use `has_redaction_root_cause()` with a metadata
rebuilding error.
Reviewed By: StanislavGlebik
Differential Revision: D24360816
fbshipit-source-id: 388df8cedb769ff001bfe4ff9cd5063ccd9de9f1
Summary:
This is in line with other changes we're making to map logic now. Note that
apart from checking in-repo prefix-free-ness of the map, this also checked the
same across many small repos. It probably does not make sense to do that either
now that we allow non-prefix free maps within a repo.
Reviewed By: StanislavGlebik
Differential Revision: D24348161
fbshipit-source-id: caaa22953c8a15a08607157b99c9f0fd0edf633f
Summary:
Until we have the same standards for the native and push-redirected pushes,
these need to be automatically bypassed.
Reviewed By: krallin
Differential Revision: D24357372
fbshipit-source-id: f85459145f6a5217c07445d7017f3b11ed1284a7
Summary:
Except for the tail mode of x_repo_sync_job which we use normally there's also
"once" mode which means "sync a single commit". Previously it did just that -
synced a single commit and failed if parents of this commit weren't synced.
However this is very unpleasant to use - instead let's change the semantics to
sync the commit and all of its ancestors.
also I made target_bookmark an optional parameter - sometimes we just want to sync a commit without any bookmarks at all.
Reviewed By: mitrandir77
Differential Revision: D24135771
fbshipit-source-id: 341c1808a44c58f89536b8c07394b77d8ced3f37
Summary: These were stabilized in 1.45.0 and 1.47.0 respectively.
Reviewed By: StanislavGlebik
Differential Revision: D24353680
fbshipit-source-id: f2afe906e5260b1b360455acc20d9a806c988c9c
Summary:
Update from 1.44.0. Updates have been blocked because of a bad interaction
between platform007+mode/opt-clang-thinlto+gold. Now that the default linker is
lld, perhaps this is no longer an issue.
Various tweaks due to updates:
- `atomic_min_max`, `const_transmute`, `inner_deref`, `ptr_offset_from` and `str_strip` are now stable
- `asm` renamed to `llvm_asm`
- `intra_doc_link_resolution_failure` lint renamed to `broken_intra_doc_link` (ndmitchell I didn't fix Gazebo because you'd explicitly suppressed the warning - I'll let you work out what to do with that)
(This is caused by incompatibility between the llvm used by rustc in
platform007 and llvm-fb. In platform009, rustc uses llvm-fb directly so there's
no scope for incompatibility.)
Reviewed By: dtolnay
Differential Revision: D24288638
fbshipit-source-id: 5155d85c186fd79d3cc86cb0bb554ab77d76c12c
Summary: Rename futures01 types from Foo to Foo01 in the top level lib.rs and derive_impl.rs files in preparation for adding a trait method that returns new futures
Reviewed By: aslpavel
Differential Revision: D24311165
fbshipit-source-id: 4f3b12ba3eaf8023959d6d4bbb4568d191b1fffb
Summary: This allows the user to see how the mover of a particular version operates on any given path.
Reviewed By: StanislavGlebik
Differential Revision: D24335975
fbshipit-source-id: f67847112eb0d3c8c49584604e2f9d93579cdde4
Summary:
Those tests are kinda broken in a number of ways right now.
First, they try to connect to a prod DB to record what bundle they just pushed.
That's not ideal, so this adds a flag to have them not do this.
Second, they are racy by design, and they mostly don't pass at tall in
mode/dev.
The way we make the tests run here is by having them forwardfill for 10 seconds
then give up, and we hope that during that time, they've fetched the bundles
they should fetch from SQL, and synced them. However, that's not really
sufficient because establishing your first connection to SQL from a mode/dev
binary is quite slow, so in the 10 seconds, you might pick up your first
bundle, start replaying it, then exit before you get to the second one.
To fix this, this diff updates the fillers to expect a specific number of
bundles to replay. We still have a limit on the number of total iterations to
avoid letting the tests hang if the number of bundles isn't the one we expect.
Fixing this revealed further breakage, which I solved earlier in this stack.
Unfortunately, it's not sufficient to make test-commitcloud-reversefiller.t
work on Python 3, because the infinitepush extension appears to be broken
server side. I'll file a task for the Mercurial oncall for that.
Reviewed By: mitrandir77
Differential Revision: D24277474
fbshipit-source-id: 0a5e1f7db8dc0c0068b0fc203abc0503226107ec
Summary:
Our case conflict checking is very inefficient on large changesets. The root
cause is that we traverse the parent manifest for every single file we are
modifying in the new changeset.
This results in very poor performance on large changes since we end up
reparsing manifests and doing case comparisons a lot more than we should. In
some pathological cases, it results in us taking several *minutes* to do a case
conflict check, with all of that time being spent on CPU lower-casing strings
and deserializing manifests.
This is actually a step we do after having uploaded all the data for a commit,
so this is pure overhead that is being added to the push process (but note it's
not part of the pushrebase critical section).
I ended up looking at this issue because it is contributing to the high
latencies we are seeing in commit cloud right now. Some of the bundles I
checked had 300+ seconds of on-CPU time being spent to check for case
conflicts. The hope is that with this change, we'll get fewer pathological
cases, and might be able to root cause remaining instances of latency (or have
that finally fixed).
This is pretty easy to repro.
I added a binary that runs case conflict checks on an arbitrary commit, and
tested it on `38c845c90d59ba65e7954be001c1eda1eb76a87d` (a commit that I noted
was slow to ingest in commit cloud, despite all its data being present already,
meaning it was basically a no-op). The old code takes ~3 minutes. The new one
takes a second.
I also backtested this by rigging up the hook tailer to do case conflict checks
instead (P145550763). It is about the same speed for most commits (perhaps
marginally slower on some, but we're talking microseconds here), but for some
pathological commits, it is indeed much faster.
This notably revealed one interesting case:
473b6e21e910fcdf7338df66ee0cbeb4b8d311989385745151fa7ac38d1b46ef (~8K files)
took 118329us in the new code (~0.1s), and 86676677us in the old (~87 seconds).
There are also commits with more files in recent history, but they're
deletions, so they are just as fast in both (< 0.1 s).
Reviewed By: StanislavGlebik
Differential Revision: D24305563
fbshipit-source-id: eb548b54be14a846554fdf4c3194da8b8a466afe
Summary:
I'm reworking some of our case conflict handling, and as part of this, I'm
going to be using check_case_conflicts for all our checking of case conflicts,
and notably for the case where we introduce a new commit and check it against
its parent (which, right now, does not check for case conflicts).
To do this and provide a good user experience (i.e. indicate which files
conflicted and with what), I need `check_case_conflicts` to report what files
the change conflicts with. This is what this diff does.
This does mean a few more allocations, so I "paid those off" by updating our
case lowering to allocate one fewer Vec and one fewer String per MPathElement
being lowercased.
Reviewed By: StanislavGlebik
Differential Revision: D24305562
fbshipit-source-id: 8ac14466ba3e84a3ee3d9216a84c2d9125a51b86
Summary:
Those tests need an empty blobstore, but to do this, they `fixtures::linear`,
which a) instantiates a whole blobrepo, and b) fill it with commits we never
use.
In particular, a) is kinda annoying, because that creates a dependency on
blobrepo and therefore on most of Mononoke. This makes the tests much slower to
build than they could be.
This diff updates the test to just get their memblob directly instead of via a
fixture, thus bringing down the number of Mononoke crates we pull in from 120
to 21.
Reviewed By: StanislavGlebik
Differential Revision: D24305564
fbshipit-source-id: 91c9423f6284f985461744d9c61ce207024a65c8
Summary:
Current large to small movers do not cover some of the edge cases, and this
diff attempts to fix them. More details are in the comment
Reviewed By: aslpavel, ikostia
Differential Revision: D24328574
fbshipit-source-id: 35cceff61f21603baf446d83e8daa4bda2f17d2a
Summary:
This trait is no longer used all that much outsides of a handful of tests, the
walker, and an admin subcommand, as it has been replaced by the `Manifest`
trait, which works over all kinds of Manifests, and has stronger typing (its
sub-entries always have a path, and they are wrapped in an enum that knows if
they're leaves or trees).
This left a bunch of old legacy code here or there, which is worth removing
to make sure we don't introduce any new callsites to this. Another motivation
is that this legacy code is often not very compatible with new code, and has
historically made it a bit tricky (everything owns a blobstore in this code,
which is pretty awkward and not at all how we do things nowadays).
There is, I think, a bit more potential here since we could also perhaps try to
remove the `HgBlobEntry` struct, but that has a callsites still, so I'm not
doing this here.
Reviewed By: StanislavGlebik
Differential Revision: D24306946
fbshipit-source-id: 8a73dbbf40a904ce19ac65d791b732091c206263
Summary: We need to wait for records before checking said records.
Reviewed By: StanislavGlebik
Differential Revision: D24329119
fbshipit-source-id: c06564b98c12c805444a7e0c3fcc968a8b8cc175
Summary:
We don't want to log slow derivations in tests. To do that, let's look at at
unable to decide if we should log.
We only log if the threshold is > 0, so since the default is 0, that means we
won't log unless a config overrides it (which won't be the case in tests, but
will be the case everywhere else: D24329372).
Reviewed By: StanislavGlebik
Differential Revision: D24329377
fbshipit-source-id: e9d70236c614780e383870297d08afaf64450383
Summary:
We don't give `jq` a program here, and it looks like depending on the exact
setup, this might result in it printing the help instead of running, as
reported by Jun here:
https://fb.workplace.com/groups/scm.mononoke/permalink/1280281002334472/
I think this is the behaviro inquestion:
https://github.com/stedolan/jq/issues/1110. I'm guessing something might have
changed in jq 1.6 because I am on that version and don't get that behavior but
Jun's output uses jq 1.5.
Let's give jq a program to fix this.
Reviewed By: StanislavGlebik
Differential Revision: D24329472
fbshipit-source-id: f4ba4246770dd79630433a3c46c8bbd38aa4c842
Summary: Add support for deriving all types of derived data for a given repository
Reviewed By: StanislavGlebik
Differential Revision: D23842909
fbshipit-source-id: 2fa5c4a9444169b26c5cf70d91a6cc707cca8022
Summary: This change make it possible to find underived changesets for multiple derived data types at once, and then batch derive everything taking into account dependencies between derived data types.
Reviewed By: StanislavGlebik
Differential Revision: D23762152
fbshipit-source-id: 8112fc5deb86f3c55f2fa5079cf917a0506f045c
Summary: Introduces fetching of child entry IDs, and child file metadata for a specified tree manifest ID. The aux data lookup will only be performed if `with_file_metadata` is set, which is actually kind of wrong. Instead `with_children` from the wire type should be exposed in the API request type, and `with_*_metadata` should be hidden or used for data other than the child entry `Key`s.
Reviewed By: kulshrax
Differential Revision: D23886678
fbshipit-source-id: 0cba72cea7be47ae3348a406d407a19b60976c0c
Summary: This code predates the use of `pin_project`, which simplifies the implementation of pinned types using proc macros. We now use `pin_project` in most other places in this crate where we define custom `Future`s and `Stream`s, so let's use it in `SignalStream` as well. (This makes the code easier to modify and removes the need for `unsafe` code.)
Reviewed By: farnz
Differential Revision: D24285792
fbshipit-source-id: da018b9ed62c33d9d06ecf6caa4748969a057e66
Summary:
Add a stream combinator that causes a `TryStream` to end upon encountering the error. The resulting stream remains a `TryStream`, but will always terminate after yielding an error item.
Later in this stack, this will be used in the LFS server to cause mid-stream errors to always terminate the response.
Reviewed By: krallin
Differential Revision: D24301254
fbshipit-source-id: e71ddc16e79caeb195e23be7d6a93ee63a96a713
Summary:
We had a small issue with our logging when pushredirection was enabled. See for
example https://fburl.com/scuba/mononoke_test_perf/orsld6yh. If you count the
number of "Start processing" and "Command processed" entries you'll notice that
there's one more instance of the former. That's wrong, and turned out this
additional "Start processing" line is coming from backsyncing.
Turned out that the problem was two fold:
1) start_command() logs "Start processing" row and also it sets "Start
processing" as default message for every next scuba calls. That doesn't seem
intentional, so this diff clones scuba sample before logging "Start
processing" to avoid "poisoning" the "tag" field
2) Backsyncing weren't setting the log_tag at all. So this diff fixes it
Reviewed By: ikostia
Differential Revision: D24305167
fbshipit-source-id: 3448b88693e0c5c8f5a9b90f602bc2dfe99db15a
Summary:
In D19023924 (e9df42c192) ikostia allowed non-prefix free movers, however he left the
safeguard in just in case. Well, now we finally need to use non-prefix-free
mapping, so let's remove the safeguard.
Reviewed By: ikostia
Differential Revision: D24277040
fbshipit-source-id: 4d658ad813171ab0dcb23656e95e3e443ec9961a
Summary:
This is a tool that's handy to use right before the binding. It accepts a list
of large repo changesets, and for each commit does not yet have a mapping it
inserts a NotSyncCandidate. This is exactly what we want to do right before the
bind - we'd like large repo to have most of its commits marked as
NotSyncCandidate.
Reviewed By: ikostia
Differential Revision: D24276642
fbshipit-source-id: b040e5dee981c06da8a4c58dacbb8c2660f0917d
Summary:
Likely `MPath` constructor got more clever since this was written. In any case
I am just happy that it builds.
Reviewed By: farnz
Differential Revision: D24278551
fbshipit-source-id: 2378a43bd01b1826b7f81d294a035053fba5322f
Summary:
Bookmark filler doesn't make much sense outside of FB. In fact the commit
filler is already in the `facebook/` dir.
D24253307 contains the fbpkg change that has to be landed in-sync with this one.
Reviewed By: lukaspiatkowski
Differential Revision: D24253070
fbshipit-source-id: 52734ae34779801b4cae4882a6d0880586ef505f
Summary: For the purposes of megarepo, we need to treat certain paths differently depending on whether the push is native to a large repo, or is push-redirected from a small repo. Let's add a separate patterns list to the `deny_files` hook. Of course, we can also add a list, specific for push-redirected pushes, but it seems unneeded atm.
Reviewed By: StanislavGlebik
Differential Revision: D24278409
fbshipit-source-id: 9fd815940bb656ceac6ab234f3a0647a5c57db06
Summary:
If we fail to read from MySQL, we immediately go back to attempting to read.
That's the exact opposite of what we should be doing. This fixes that by not
changing our poll schedule if we hit an error.
Reviewed By: farnz
Differential Revision: D24279086
fbshipit-source-id: 2c05ec9c33a1e0cbdfb63dda63a68f2a93615512
Summary:
This diff adds all third party dependencies that are required by getdeps to be able to build and runn Mononoke's integration tests.
Also add a stub Makefile with no-op steps that will be filled in next diff.
Reviewed By: ahornby
Differential Revision: D24251894
fbshipit-source-id: 67384ecfd0ced6762dddc3c6e61feb1240b1162d
Summary: Reorder scrub walk expansion to reduce queue depth. The bounded_traversal_stream inserts to its unscheduled queue at the front, therefore reversing the order of the children, so we want to add the easiest to complete Nodes to the list of edges last.
Reviewed By: farnz
Differential Revision: D24247625
fbshipit-source-id: 59caa5898e7f38f41cc04a15723370de38f8474f
Summary:
Delegate chaosblob put to put_impl to remove some duplication.
This is possible as all chaosblob construction has an inner BlobstorePutOps.
Reviewed By: StanislavGlebik
Differential Revision: D24258401
fbshipit-source-id: dee9fce888e5ef9c2f34865c97921b9cc87ac3bf
Summary:
This change adds some wiring to allow the hooks to treat native vs push-redirected pushes differently. This is needed because `deny-files` needs to block native pushes to `.ovrsource-rest` and `arvr-legacy`, while it needs to allow push-redirection into these directories.
The plan further is to change the actual hook body for `deny_files` to have different handling of the two cases.
Reviewed By: StanislavGlebik
Differential Revision: D24257454
fbshipit-source-id: 2f5931149115210aeeeebb3294a6512effd36350
Summary:
The logic inside `eden/scm/tests/features.py` script enables certain
features of HG on tests based on the name of the test. Mononoke's integration
tests suit reuses eden/scm's tests suit and as a consequence it triggers the
same `features.py` logic. It was fine until D24201934 (f5988c415c) introduced a feature for
test that is named in the same way as one of the Mononoke tests - the
`test-clienttelemetry.t`.
In order to fix this problem Mononoke will pass a `--nofeatures` flag to the
`eden/scm/run-tests.py` invocations that will turn off the usage of
`features.py`.
Reviewed By: farnz
Differential Revision: D24276294
fbshipit-source-id: eb28ed55a05de3b012e37407603c2370adaaad16