Commit Graph

2628 Commits

Author SHA1 Message Date
Stefan Filip
9fdb3faff6 segmented_changelog: add builder with SegmentedChangelogConfig
Summary:
Pull in SegmentedChangelogConfig and build a SegmentedChangelog instance.
This ties the config with the object that we build on the servers.

This separates the instantiation of the SQL connections from building any
kind of segmented changelog structure. The primary reason is that there may be
multiple objects that get instantiated, and for that it is useful to be able
to pass this object around.

Reviewed By: krallin

Differential Revision: D26708175

fbshipit-source-id: 90bc22eb9046703556381399442117d13b832392
2021-03-17 20:12:27 -07:00
Stefan Filip
4217421d20 segmented_changelog: remove unused dependency
Summary:
This was lost somehow. I probably incorrectly resolved some conflict when
rebasing a previous change.

Reviewed By: quark-zju

Differential Revision: D27146022

fbshipit-source-id: 13bb0bb3df565689532b2ab5299cd757f278f26e
2021-03-17 19:49:58 -07:00
Katie Mancini
8decc4733f Update reclone instructions to include fbclone --reclone
Summary:
The reclone option has landed for fbclone, so now we can direct users there
first, so they don't have to go through all these steps.

(won't land until I check that this option has actually made it to production)

I also updated the wiki this points to so that it tells users to use `eden list`
to detect EdenFS checkouts instead of looking for .eden, as these steps are also
for when an EdenFS checkout is borked and needs a reclone, and `eden list` works
more reliably in this situation.

Reviewed By: StanislavGlebik

Differential Revision: D26435380

fbshipit-source-id: 9153e730e1be949d130af85d604623d2bfbd3990
2021-03-17 19:10:24 -07:00
Stanislau Hlebik
ac6c609e01 mononoke: do not change repo_id when logging noop hg sync job iteration
Summary:
In D26945466 (7a3539b9c6) I started to use the correct repo name for backup
repos whenever we sync an entry. However, most of the time the sync job is
idle, and while idle it also logs a heartbeat to a scuba table. That heartbeat
was using the wrong repo_id (i.e. for instagram-server_backup it was using the
instagram-server repo_id). This diff fixes that.

Reviewed By: krallin

Differential Revision: D27123193

fbshipit-source-id: 80425a56ad0a432180f420f5c7957105407e0fc9
2021-03-17 11:13:03 -07:00
Thomas Orozco
840a572036 Daily common/rust/cargo_from_buck/bin/autocargo
Reviewed By: HarveyHunt

Differential Revision: D27124565

fbshipit-source-id: d2e4ca99324ee2037f05741c55a3d6ee8ad98211
2021-03-17 10:48:37 -07:00
Mark Juggurnauth-Thomas
c310245e9b repo_listener: log when connection handlers are cancelled
Summary:
Unlike the source control service, requests aren't usually cancelled in the
main server.  However, if the request doesn't complete within the shutdown
timeout, it does get cancelled.

Add logging for this case.

Reviewed By: krallin

Differential Revision: D27086622

fbshipit-source-id: dbd9dee1a6a84b4cd5570302a0a62fb96d2489aa
2021-03-17 08:59:19 -07:00
Alex Hornby
de84f76280 mononoke: packblob simplify key substring in create_packed
Summary: Small cleanup. Spotted this while doing the previous diff.

Reviewed By: krallin

Differential Revision: D27046218

fbshipit-source-id: 186c52e579d093a37a9e9c99015da3859f1d0e64
2021-03-17 03:49:44 -07:00
Alex Hornby
dc48654580 mononoke: packblob decode doesn't need BlobstoreMetadata
Summary: The decode can be simplified, as it was just passing the metadata in and then out again.

Reviewed By: krallin

Differential Revision: D27044277

fbshipit-source-id: 4e8fb995d3643f5420f9315fab6453b027be6297
2021-03-17 03:49:44 -07:00
Alex Hornby
c34c30e8b9 mononoke: allow using cmdlib --blobstore-cachelib-attempt-zstd with benchmark_filestore
Summary: This is useful so we can measure the effect of cachelib marshalling overhead in CacheBlob separately, without zstd compression in play.

Reviewed By: krallin

Differential Revision: D27043229

fbshipit-source-id: cf7e35688bdd96c029ee7858f59e46583726f271
2021-03-17 02:21:15 -07:00
Alex Hornby
65699c56ef mononoke: use cmdlib throttling argument in benchmark_filestore
Summary: We can use the throttling options from BlobstoreOptions rather than implementing our own.

Reviewed By: krallin

Differential Revision: D27043230

fbshipit-source-id: d3db19adaf8819709d069296dec955b2159d5546
2021-03-17 02:21:15 -07:00
Alex Hornby
153c1132bb mononoke: allow running non-sharded sqlblob for benchmark_filestore
Summary: The benchmark had a partial duplicate of the sqlblob opening code from the blobstore factory, but without unsharded support, so use the factory instead.

Reviewed By: krallin

Differential Revision: D27060010

fbshipit-source-id: d1c5704cdec17e3d0b1b54538caf7a3893c3610f
2021-03-17 02:21:15 -07:00
Stefan Filip
c81edb9f71 segmented_changelog: fix idmap assignment
Summary:
Finding a parent that was previously found signals that we want to assign
that changeset sooner if it was not already assigned.

Reviewed By: quark-zju

Differential Revision: D27092205

fbshipit-source-id: ed39a91460ff2f91a458236cdab8018341ec618b
2021-03-16 20:38:04 -07:00
Stefan Filip
f9599c714d segmented_changelog: add logging to seeder process commit loading
Summary:
While seeding fbsource I found that loading the commits from SQL took longer
than I was expecting: around 90 minutes where I was expecting around 10 minutes.
I added more logging to validate that commits were actively being loaded rather
than something being stuck.

Reviewed By: krallin

Differential Revision: D27084739

fbshipit-source-id: 07972707425ecccd4458eec849c63d6d9ccd923d
2021-03-16 20:38:04 -07:00
Mark Juggurnauth-Thomas
991e24fab6 scs_server: log when requests are cancelled
Summary:
When requests are cancelled, their futures are dropped without completion.
Currently this causes no logs or statistics to be logged, as normally this
would happen after the request implementation completes.

Add logging for cancelled requests.  Include the gathered statistics so far,
so that we know how much time was spent on the cancelled request.

Reviewed By: StanislavGlebik

Differential Revision: D27084866

fbshipit-source-id: d4c5c276d496478f0c7caa700627b92d8f9e80a2
2021-03-16 13:04:32 -07:00
Chengxiong Ruan
4fb5ba1152 Use released cursive_tab and cursive_buffered_backend version (#8078)
Summary:
Pull Request resolved: https://github.com/facebookincubator/resctl/pull/8078

Pull Request resolved: https://github.com/facebookexperimental/rust-shed/pull/21

Pull Request resolved: https://github.com/facebookexperimental/eden/pull/76

Use released version to fix cursive_core version conflicts.

Reviewed By: boyuni

Differential Revision: D27032206

fbshipit-source-id: ba664b21cd55453dbc8124ff967a6f9d61fc4926
2021-03-16 10:00:37 -07:00
Stefan Filip
62cca2ec9b segmented_changelog: add scuba logs for loads
Summary: Logs. Minimal observability for loading Segmented Changelog.

Reviewed By: ahornby

Differential Revision: D27048940

fbshipit-source-id: 3005e7c71a32572743d06d5d371a009a030f8e4c
2021-03-16 09:30:55 -07:00
Stefan Filip
deae65979e segmented_changelog: update OverlayIdMap with assigned vertex ranges
Summary:
There was a pretty big bug here with the "Overlay" when we are updating both
stores. It turns out that we don't really want a standard overlay. We want the
loaded iddag to operate with the Ids in the shared IdMap, and we want whatever
is updated to use the in-process IdMap. The problem we have with the overlay is
that the shared IdMap may have more data than the in-process IdMap. The shared
IdMap is always updated by the tailer, after all. This means that when we query
the overlay, we may get data from the shared store even if this is the first
time we are trying to update a changeset in the current process.

The solution here is to specify which vertexes are fetched from either store.

Reviewed By: quark-zju

Differential Revision: D27028367

fbshipit-source-id: e09f003d94100778eabd990724579c84b0f86541
2021-03-16 09:30:55 -07:00
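The fix described above can be sketched as a lookup routed by an explicit vertex boundary, rather than a plain fallthrough overlay. This is a minimal std-only Rust sketch; the struct shape and the `shared_max` boundary field are illustrative assumptions, not Mononoke's actual API:

```rust
use std::collections::HashMap;

// Instead of "check in-process first, then shared", route each lookup by an
// explicit boundary: vertexes at or below `shared_max` were assigned by the
// tailer and live in the shared store; anything above it was assigned by this
// process. Stale tailer data can then never shadow an in-process assignment.
struct OverlayIdMap {
    shared: HashMap<u64, String>,     // updated by the tailer
    in_process: HashMap<u64, String>, // updated by this server process
    shared_max: u64,
}

impl OverlayIdMap {
    fn get(&self, vertex: u64) -> Option<&String> {
        if vertex <= self.shared_max {
            self.shared.get(&vertex)
        } else {
            self.in_process.get(&vertex)
        }
    }
}

fn main() {
    let mut shared = HashMap::new();
    shared.insert(1, "aaa".to_string());
    // The tailer may race ahead and also write vertex 2 to the shared store;
    // the boundary ensures we still answer from the in-process map for it.
    shared.insert(2, "stale".to_string());
    let mut in_process = HashMap::new();
    in_process.insert(2, "bbb".to_string());

    let idmap = OverlayIdMap { shared, in_process, shared_max: 1 };
    assert_eq!(idmap.get(1).unwrap(), "aaa");
    assert_eq!(idmap.get(2).unwrap(), "bbb");
    println!("ok");
}
```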
Stefan Filip
c18b35a400 segmented_changelog: update PeriodicReload to work with any SegmentedChangelog
Summary:
Use the generic load function from SegmentedChangelogManager. This loads the
SegmentedChangelog that is consistent with the specified configuration.

I wanted to have another look at ArcSwap to understand if
`Arc<ArcSwap<Arc<dyn SegmentedChangelog>>>` was the type that it was
recommending for our situation and indeed it is.

Reviewed By: quark-zju

Differential Revision: D27028369

fbshipit-source-id: 7c601d0c664f2be0eef782700ef4dcefa9b5822d
2021-03-16 09:30:55 -07:00
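The reloadable-instance pattern above can be sketched with std only, substituting `RwLock<Arc<dyn ...>>` for `ArcSwap` (the commit names `Arc<ArcSwap<Arc<dyn SegmentedChangelog>>>` as the real type; the trait method and struct names here are illustrative):

```rust
use std::sync::{Arc, RwLock};

trait SegmentedChangelog: Send + Sync {
    fn version(&self) -> u64;
}

struct InMemory(u64);
impl SegmentedChangelog for InMemory {
    fn version(&self) -> u64 {
        self.0
    }
}

// std-only stand-in for Arc<ArcSwap<Arc<dyn SegmentedChangelog>>>: readers
// clone the inner Arc under a short read lock, and the periodic reloader
// swaps in a freshly loaded instance without disturbing in-flight readers.
struct Reloadable {
    current: RwLock<Arc<dyn SegmentedChangelog>>,
}

impl Reloadable {
    fn load(&self) -> Arc<dyn SegmentedChangelog> {
        self.current.read().unwrap().clone()
    }

    fn reload(&self, fresh: Arc<dyn SegmentedChangelog>) {
        *self.current.write().unwrap() = fresh;
    }
}

fn main() {
    let r = Reloadable {
        current: RwLock::new(Arc::new(InMemory(1))),
    };
    let snapshot = r.load();
    r.reload(Arc::new(InMemory(2)));
    // Old readers keep their snapshot; new readers see the fresh instance.
    assert_eq!(snapshot.version(), 1);
    assert_eq!(r.load().version(), 2);
    println!("ok");
}
```

`ArcSwap` provides the same swap semantics lock-free, which is why it is the recommended type for this read-mostly, reload-rarely situation.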
Stefan Filip
19e10a7250 segmented_changelog: update reload and update to master periods
Summary:
Keep SegmentedChangelog up to date by triggering an update to the master
bookmark every minute.
Updating SegmentedChangelog in process has the side effect of adding some
in-process-only bookkeeping. Over long periods of time this can result in
increased memory usage. To mitigate any potential issues, we reload Segmented
Changelog every hour. This will make its parameters more predictable.

Reviewed By: quark-zju

Differential Revision: D27028368

fbshipit-source-id: dae581b9a067c6eae7975b4517203085b168e2f0
2021-03-16 09:30:55 -07:00
Stefan Filip
e097ff6951 segmented_changelog: clarify logs
Summary: Words.

Reviewed By: quark-zju

Differential Revision: D27028370

fbshipit-source-id: 4e4be1048837f09e18b1b65762b6f23c28cc4c6a
2021-03-16 09:30:54 -07:00
Mark Juggurnauth-Thomas
e9806d5a6f scs_server: commonize methods that resolve two commits
Summary:
Several methods (`commit_compare`, `commit_is_ancestor_of`, `commit_file_diffs`
and `commit_common_base_with`) operate on a pair of commits.  Currently these
all resolve the other commit manually and in different ways.  Commonize the
code, and add contextual information so the caller can see which of the two
commits failed to resolve.

Reviewed By: StanislavGlebik

Differential Revision: D27079920

fbshipit-source-id: a2b735801ed75232dd302061aaff2da23448d812
2021-03-16 08:02:20 -07:00
Mark Juggurnauth-Thomas
ee82a59544 scs_server: add context for service errors
Summary:
Add a `.context` method for `ServiceError`, which allows the addition of
context information in errors.

Since these are wrapped Thrift errors, we can't use the usual error-chain
mechanism of `std::error::Error`.  Instead, we just prepend the context to the
message that the Thrift client will see.

Add an extension to `Result` for results that contain an error that can be
converted into a `ServiceError`, to allow the addition of context when
processing a chain of `Result`s.

Reviewed By: StanislavGlebik

Differential Revision: D27079921

fbshipit-source-id: a1200f44346530c91bd559f4be0ca2b04f7d4480
2021-03-16 08:02:20 -07:00
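A minimal sketch of the prepend-context approach: since the wrapped Thrift errors can't carry a `std::error::Error` source chain, `.context` just prepends to the message the client will see. The `ServiceError` shape and the `ResultExt` trait name here are illustrative stand-ins, not the actual scs_server definitions:

```rust
use std::fmt::Display;

#[derive(Debug, PartialEq)]
struct ServiceError(String);

impl ServiceError {
    // No source chain available on a wrapped Thrift error, so context is
    // prepended directly onto the message the client receives.
    fn context(self, ctx: impl Display) -> ServiceError {
        ServiceError(format!("{}: {}", ctx, self.0))
    }
}

// Extension trait so context can be added while processing a chain of Results.
trait ResultExt<T> {
    fn context(self, ctx: &str) -> Result<T, ServiceError>;
}

impl<T, E: Into<ServiceError>> ResultExt<T> for Result<T, E> {
    fn context(self, ctx: &str) -> Result<T, ServiceError> {
        self.map_err(|e| e.into().context(ctx))
    }
}

fn main() {
    let r: Result<(), ServiceError> = Err(ServiceError("commit not found".to_string()));
    let with_ctx = r.context("resolving other commit");
    assert_eq!(
        with_ctx.unwrap_err().0,
        "resolving other commit: commit not found"
    );
    println!("ok");
}
```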
Toan Mai
adb30561fc Autocargo-ed serde_php (#80)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/80

Trying to use this in the `ctp` codebase, where most things are autocargo
compatible. Looks like it can be auto-cargoed, as it doesn't have any fancy deps.

So I followed the instructions here to make that happen:
https://www.internalfb.com/intern/wiki/Rust-at-facebook/Cargo.toml_from_Buck_(autocargo)/

Reviewed By: farnz

Differential Revision: D27079404

fbshipit-source-id: 83c10c2899a6a0da52c8f3379c9fbfcde5052eea
2021-03-16 07:52:34 -07:00
Stanislau Hlebik
01b6a2faa8 mononoke: init cachelib only once and add more logging
Summary:
Initializing cachelib twice causes it to fail. Let's not do that, and also
let's use the init_mononoke function instead of our ad-hoc logger and runtime
creation (at the very least it also initializes tunables and sets correct tokio
runtime parameters).

Also, let's add more logging to see the progress of uploading.

Reviewed By: ahornby

Differential Revision: D27079673

fbshipit-source-id: 940135a9aed62f7139835b2450a1964b879e814b
2021-03-16 06:20:00 -07:00
Stanislau Hlebik
2e3b7d0a65 mononoke: add --skip-last-chunk argument to streaming changelog
Summary:
The way I plan to use the new streaming_changelog in prod is by running it
periodically (say, every 15 mins or so). However some repos won't get many
commits in the last 15 mins (in fact, they might get just 1 or 2).
And even for high-commit-rate repos, most of the time the last chunk
will not be a full chunk (i.e. it will be less than --max-data-chunk-size).

If we were to upload the last chunk regardless of its size, the streaming
changelog database table would just keep growing by 1 entry every 15 mins,
even when it's completely unnecessary. Instead, I suggest adding an option
to not upload the last chunk if it's not necessary.

Reviewed By: farnz

Differential Revision: D27045681

fbshipit-source-id: 2d0fed3094944c4ed921f36943b881af394d9c17
2021-03-16 03:29:54 -07:00
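The intended behavior can be sketched as a chunk splitter that drops a trailing partial chunk. The function name and flag semantics below are assumptions drawn from the summary, not the actual implementation:

```rust
// Split entries into chunks of at most `max_chunk_size`, optionally dropping
// a trailing partial chunk so that periodic runs don't add one undersized
// chunk to the database every 15 minutes.
fn split_into_chunks(entries: &[u32], max_chunk_size: usize, skip_last: bool) -> Vec<&[u32]> {
    let mut chunks: Vec<&[u32]> = entries.chunks(max_chunk_size).collect();
    if skip_last {
        if let Some(last) = chunks.last() {
            if last.len() < max_chunk_size {
                chunks.pop(); // leave the partial tail for a later, fuller run
            }
        }
    }
    chunks
}

fn main() {
    let entries: Vec<u32> = (0..10).collect();
    // 10 entries in chunks of 4: [4, 4, 2]
    assert_eq!(split_into_chunks(&entries, 4, false).len(), 3);
    // with --skip-last-chunk the trailing chunk of 2 is not uploaded
    assert_eq!(split_into_chunks(&entries, 4, true).len(), 2);
    // an exact multiple keeps all chunks either way
    let exact: Vec<u32> = (0..8).collect();
    assert_eq!(split_into_chunks(&exact, 4, true).len(), 2);
    println!("ok");
}
```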
Stanislau Hlebik
fb4c1544b9 mononoke: add update command to streaming changelog
Summary:
This command can be used to update already existing streaming changelog.
It takes a newly cloned changelog and updates the new streaming changelog
chunks in the database.

The biggest difference from the "create" command is that we first need to figure
out what's already uploaded to the streaming changelog. For that, two new
methods were added to SqlStreamingChunksFetcher.

Reviewed By: farnz

Differential Revision: D27045386

fbshipit-source-id: 36fc9387f621e1ec8ad3eb4fbb767ab431a9d0bb
2021-03-16 03:29:54 -07:00
Stanislau Hlebik
d7d3064a79 mononoke: refactor upload_chunks_to_blobstore
Summary:
Small refactoring that will be used in the next diff. In the next diff we'll add
the "update" command, and this command will specify the chunk numbers itself.

So let's move setting chunk numbers out of the upload_chunks_to_blobstore function.

Differential Revision: D27045387

fbshipit-source-id: c5387a60841fe184c6db5edc4812ddd409eb2215
2021-03-16 03:29:54 -07:00
Stanislau Hlebik
7163c47c82 mononoke: refactor split_into_chunks a bit
Summary:
Small refactoring that makes a few things easier to do in the later diffs:

1) Adds a verification that checks the data offset
2) We now read the first chunk's offset from the revlog, instead of hardcoding
it to 0, 0. This will be useful in the "update" command, which needs to skip
revlog entries that already exist in the database

Differential Revision: D27045388

fbshipit-source-id: 4ee80c96d9307c77b1108889e457f10e83c8beb7
2021-03-16 03:29:54 -07:00
Stanislau Hlebik
3d8cf49381 mononoke: fix getdeps build
Summary: Duplicate name caused getdeps build to fail. This diff fixes it

Reviewed By: krallin

Differential Revision: D27049661

fbshipit-source-id: b23fe52ad89cbe764e656dfe960921ff1ac92b32
2021-03-15 23:07:45 -07:00
Arun Kulshreshtha
0e6eba5880 edenapi_service: add edenapi prefix to scuba column names
Summary: Now that EdenAPI requests are being logged to the same dataset as regular requests (`mononoke_test_perf`), let's prefix the EdenAPI-specific columns with `edenapi_` to avoid confusion.

Reviewed By: krallin

Differential Revision: D26896670

fbshipit-source-id: 92a0710ff1a7297c9cf46ff9bd9576c9bc155e26
2021-03-15 15:44:34 -07:00
Alex Hornby
44cff08838 mononoke: add --read-count option to benchmark_filestore
Summary: This was fixed at 2 reads; add an option to allow testing read performance more thoroughly.

Reviewed By: farnz

Differential Revision: D27043234

fbshipit-source-id: 4bb5f49007a4fa67c42e872e236417fa5ce5c9a0
2021-03-15 08:50:09 -07:00
Thomas Orozco
7ad1561b68 mononoke/fileblob: create tempfiles in blobstore location
Summary:
Right now, fileblob crashes if those two things are on different devices
(because you cannot atomically rename across devices), so you need to make sure
your TMPDIR is on the same volume as your fileblob.

This is kinda annoying and kinda unnecessary. Let's just default to putting
the temporary files into the same location as the blobstore.

Note: while the temp files will be in the same directory as the rest of our
blobs, they don't have the `blob-` prefix (their prefix will be `.tmp`), so
they cannot be read by accident as if they were blobs.

Reviewed By: farnz

Differential Revision: D27046889

fbshipit-source-id: c2b47cd6927eef34ac19325f87f446a6f6532eaf
2021-03-15 08:32:59 -07:00
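The write-to-temp-then-rename pattern only stays atomic when both paths are on the same filesystem; `rename(2)` fails with `EXDEV` across devices, which is the crash described above. A hedged sketch of the idea (the `put_blob` helper is hypothetical; only the `blob-` and `.tmp` prefixes come from the commit message):

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

// Write a blob atomically: create the temp file in the *same directory* as
// the final destination so fs::rename never crosses a filesystem boundary.
// Temp files get a ".tmp" prefix so they can't be misread as blobs, which
// all carry the "blob-" prefix.
fn put_blob(dir: &Path, key: &str, data: &[u8]) -> std::io::Result<()> {
    let tmp = dir.join(format!(".tmp-{}", key));
    let dest = dir.join(format!("blob-{}", key));
    let mut f = fs::File::create(&tmp)?;
    f.write_all(data)?;
    f.sync_all()?;
    fs::rename(&tmp, &dest)?; // atomic within one filesystem
    Ok(())
}

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir().join("fileblob-demo");
    fs::create_dir_all(&dir)?;
    put_blob(&dir, "key1", b"hello")?;
    assert_eq!(fs::read(dir.join("blob-key1"))?, b"hello");
    println!("ok");
    Ok(())
}
```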
Alex Hornby
76e60fd0da mononoke: use the constants in benchmark_filestore
Summary: Tidy up a bit and use the constants

Reviewed By: krallin

Differential Revision: D27043233

fbshipit-source-id: 3208e2f35c67b4b22bb5f8189cd8c5b399604833
2021-03-15 03:53:41 -07:00
Stanislau Hlebik
480a0e9ef7 mononoke: start moving streaming changelog logic to rust
Summary:
Our current streaming changelog updater logic is written in Python, and it has a
few downsides:
1) It writes directly to manifold, which means it bypasses all the multiplexed
blobstore logic...
2) ...more importantly, we can't write to non-manifold blobstores at all.
3) There are no tests for the streaming changelogs

This diff moves the logic for the initial creation of a streaming changelog
entry to Rust, which should fix the issues mentioned above. I want to highlight
that this implementation only works for the initial creation case, i.e. when
there are no entries in the database. Next diffs will add incremental update
functionality.

Reviewed By: krallin

Differential Revision: D27008485

fbshipit-source-id: d9583bb1b98e5c4abea11c0a43c42bc673f8ed48
2021-03-12 14:46:30 -08:00
Robin Håkanson
d2a411abf6 gitimport add fb303 profiling support
Summary: Add fb303 profiling support to gitimport.

Reviewed By: krallin

Differential Revision: D27014516

fbshipit-source-id: 90183cc5c9069960536469ead57030c860a052a3
2021-03-12 12:35:05 -08:00
Carolyn Busch
bd89a4c855 edenapi_server: add bookmark endpoint
Summary: Add the EdenAPI endpoint for resolving bookmarks. This is a first pass that just takes a bookmark name as a path variable, to make sure that this is on the right track. We'll want to add a proper request type that includes a list of bookmarks and a response type that can indicate that no bookmark was found. Then the hg bookmark command will also need support for prefix listing capabilities.

Reviewed By: kulshrax

Differential Revision: D26920845

fbshipit-source-id: 067db6a636a75531ee5953392b734c038a58efb6
2021-03-12 12:07:35 -08:00
Stefan Filip
41049b62ca segmented_changelog: add scuba logs for updates
Summary:
Scuba stats provide a lot of context around the workings of the service.
The most interesting operation for segmented changelog is the update.

Reviewed By: krallin

Differential Revision: D26770846

fbshipit-source-id: a5250603f74930ef4f86b4167d43bdd1790b3fce
2021-03-12 11:29:40 -08:00
Stefan Filip
3d50bcc878 segmented_changelog: add stats for inprocess update
Summary:
STATS!!!
Count, success, failure, duration. Per instance, per repo.

I wavered on what to name the stats. I wondered whether it was worth being more
specific than "mononoke.segmented_changelog.update" with something like
"inprocess". In my view the in-process stats are more important than the tailer
stats because the tailer is simpler and thus easier to understand. So I add
extra qualifications to the tailer stats and keep the name short for the
in-process stats.

Reviewed By: krallin

Differential Revision: D26770845

fbshipit-source-id: 8e02ec3e6b84621327e665c2099abd7a034e43a5
2021-03-12 11:29:39 -08:00
Stefan Filip
0bd89797a1 segmented_changelog: add repo_id to OnDemandUpdateSegmentedChangelog
Summary: Currently unused. Will add stats that reference it.

Reviewed By: krallin

Differential Revision: D26770847

fbshipit-source-id: d5694cd221c90ba3adaf89345ffeb06fa46b9e7b
2021-03-12 11:29:39 -08:00
Robin Håkanson
650f0ad049 Handle (ignore) git-sub-modules in gitimport
Summary:
Handle (ignore) git submodules in gitimport.

Git submodules are represented as ObjectType::Commit inside the tree. For now we
do not support git submodules, but we still need to import repositories that
have submodules in them (just not synchronized), so we ignore any submodules for
now.

Reviewed By: StanislavGlebik

Differential Revision: D26999625

fbshipit-source-id: eb32247d4ad0325ee433e21a516ac4a92469fd90
2021-03-12 10:46:41 -08:00
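A minimal sketch of the filtering described above: tree entries of `ObjectType::Commit` (gitlinks, i.e. submodule pointers) are skipped during import. The types here are simplified stand-ins, not gitimport's actual definitions:

```rust
#[derive(PartialEq)]
enum ObjectType {
    Blob,
    Tree,
    // A Commit entry inside a tree is a gitlink: a pointer to a commit in
    // another repository, i.e. a submodule.
    Commit,
}

struct TreeEntry {
    name: &'static str,
    kind: ObjectType,
}

// Keep blobs and subtrees; drop submodule pointers so the repo can still be
// imported even though submodules themselves aren't supported.
fn importable(entries: Vec<TreeEntry>) -> Vec<&'static str> {
    entries
        .into_iter()
        .filter(|e| e.kind != ObjectType::Commit)
        .map(|e| e.name)
        .collect()
}

fn main() {
    let entries = vec![
        TreeEntry { name: "README.md", kind: ObjectType::Blob },
        TreeEntry { name: "src", kind: ObjectType::Tree },
        TreeEntry { name: "third-party/lib", kind: ObjectType::Commit }, // submodule
    ];
    assert_eq!(importable(entries), vec!["README.md", "src"]);
    println!("ok");
}
```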
Alex Hornby
93aee06296 mononoke: record checkpoint update stats and finish time
Summary: Record some more stats so we can see the last finish time. Also record update stats for run and chunk number so we can see how far along a run is.

Differential Revision: D26949482

fbshipit-source-id: 5e7df4412c25149559883b6e15afa70e1c670cdc
2021-03-12 10:46:40 -08:00
Stanislau Hlebik
819b26d2d1 mononoke: fix getdeps test
Summary:
Apparently getdeps uses python2, and hg-git outputs commit hashes differently.
This diff fixes it.

Reviewed By: krallin

Differential Revision: D27012424

fbshipit-source-id: 1a8a2fc8266b4035190cfd6056f37b52132a1b9b
2021-03-12 09:09:02 -08:00
Simon Farnsworth
1eecbfa82d Switch default Manifold client from Thrift to C++
Summary: All important jobs (SCS Server, LFS Server, Mononoke Server, derived data) have switched successfully. Roll up anything that's been missed by switching the default and letting contbuild take care of it

Reviewed By: krallin

Differential Revision: D26980991

fbshipit-source-id: 2c9f7cd56c38e9e1a2f8374c76141e7a99c88a2a
2021-03-12 08:37:15 -08:00
Mark Juggurnauth-Thomas
33ec4db653 bounded_traversal: require futures to be boxed
Summary:
Bounded traversal's internal book-keeping moves the futures returned from fold and unfold callbacks around while they are being queued to be scheduled.  If these futures are large, then this can result in a significant portion of bounded traversal's CPU time being spent on `memcpy`ing these futures around.

This can be prevented by always boxing the futures that are returned to bounded traversal.  Make this a requirement by changing the type from `impl Future<...>` to `BoxFuture<...>`.

Reviewed By: mitrandir77

Differential Revision: D26997706

fbshipit-source-id: 23a3583adc23c4e7d3607a78e82fc9d1056691c3
2021-03-12 08:12:57 -08:00
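The cost being avoided can be demonstrated with std only: an `async` block that holds a large buffer across an `.await` produces an equally large state machine, while boxing it means only a fat pointer gets moved by any subsequent book-keeping. The names below are illustrative, not bounded_traversal's API:

```rust
use std::future::Future;
use std::pin::Pin;

// std-only equivalent of futures::future::BoxFuture (without Send).
type BoxFuture<T> = Pin<Box<dyn Future<Output = T>>>;

// A future whose state machine embeds a 4 KiB buffer, because the buffer is
// held across an .await point and must be stored inline.
fn make_large() -> impl Future<Output = usize> {
    async {
        let buf = [0u8; 4096];
        async {}.await; // buf lives across this await
        buf.len()
    }
}

// Boxing moves the state machine to the heap; moving the future around now
// only copies a fat pointer instead of memcpy'ing the whole state.
fn make_boxed() -> BoxFuture<usize> {
    Box::pin(make_large())
}

fn main() {
    let large = make_large();
    let boxed = make_boxed();
    assert!(std::mem::size_of_val(&large) >= 4096);
    assert!(std::mem::size_of_val(&boxed) < std::mem::size_of_val(&large));
    println!(
        "unboxed: {} bytes, boxed: {} bytes",
        std::mem::size_of_val(&large),
        std::mem::size_of_val(&boxed)
    );
}
```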
Stanislau Hlebik
f7e3a55184 mononoke: allow using streaming clone with sqlite
Summary:
Previously it was possible to use the streaming clone only with an xdb table.
This diff changes that.

Reviewed By: farnz

Differential Revision: D27008486

fbshipit-source-id: b8d51832dd62b4343b36c3a7a96b83a327056025
2021-03-12 06:44:00 -08:00
Ilia Medianikov
8cf4bc15dc mononoke: pushrebase: save prepushrebase changeset id in bonsai changeset extra
Summary:
Knowing the prepushrebase changeset id is required for retroactive review.
Retroactive review checks landed commits, but the verify integrity hook runs on a commit before landing. This means the landed commit has no straightforward connection with the original one, and retroactive review can't acknowledge whether verify integrity has seen it.

Reviewed By: StanislavGlebik

Differential Revision: D26944453

fbshipit-source-id: af1ec3c2e7fd3efc6572bb7be4a8065afa2631c1
2021-03-12 04:09:41 -08:00
Stanislau Hlebik
55dc8bf7d6 mononoke: remove unused getbundle_low_gen_num_threshold
Summary:
This tunable is not used anymore, we use
getbundle_high_low_gen_num_difference_threshold instead. Let's remove it.

Differential Revision: D26984966

fbshipit-source-id: 4e8ded5982f7e90c90476ff758b766df55644273
2021-03-12 03:14:51 -08:00
Alex Hornby
6e75bcc077 mononoke: add hg derived data support to benchmark_large_directory
Summary: Add ability to benchmark hg derivation

Reviewed By: krallin

Differential Revision: D26983309

fbshipit-source-id: 593f16c752610db4253e44509ffcf218dd241796
2021-03-12 02:58:53 -08:00
Alex Hornby
0035efe1f5 mononoke: add BonsaiHgMappingToHgBonsaiMapping edge to assist LFS validation
Summary:
The existing query to establish HgChangesetId on the path to FileContentMetadata for LFS validation is quite complex, using HgFilenode linknodes.

This change adds an optional BonsaiHgMappingToHgBonsaiMapping edge that can be used to simplify the LFS validation case and load less data to get there.

Reviewed By: mitrandir77

Differential Revision: D26975799

fbshipit-source-id: 799acb8228721c1878f33254ebfa5e6345673e5d
2021-03-12 02:54:51 -08:00
Alex Hornby
40bc3dec6b mononoke: fix comment about thrift Blake2 encoding
Summary: The comment can go, as we're using SmallVec now.

Reviewed By: farnz

Differential Revision: D26987009

fbshipit-source-id: f520c90b3a210283d139ba1de8ce140e12a4f875
2021-03-12 02:50:22 -08:00