Commit Graph

6160 Commits

Author SHA1 Message Date
Muir Manders
0cba5730b5 repo: expose scmstore stores
Summary: Retain and expose scmstore file and tree stores. I'm going to use this to share store instances between Python and Rust.

Reviewed By: sggutier

Differential Revision: D46683379

fbshipit-source-id: ad5204c47eb5d4147d27e9901611037384615bf8
2023-06-26 06:48:14 -07:00
Zoltán Nagy
72fadf3011 Upgrade clap, clap_completion
Summary:
This Clap update adds some new checks in `debug_asserts` that trigger new test failures in some existing CLIs (that I know of; we'll see if there are more red signals before landing). This diff fixes those along with the upgrade itself. The failure looks like this:

```
thread 'main' panicked at 'Command : Argument group 'multirepos' conflicts with non-existent 'repo' id', third-party/rust/vendor/clap_builder-4.2.7/src/builder/debug_asserts.rs:317:13
```

More context:
* https://github.com/clap-rs/clap/blob/master/clap_builder/src/builder/debug_asserts.rs#L317
* 185729a7dc

First added in Clap 4.2.5.

Reviewed By: zertosh

Differential Revision: D46934728

fbshipit-source-id: 2d17c2b02fb88af04ff65a0ea35a2be171acce31
2023-06-26 06:48:13 -07:00
Youssef Ibrahim
7d304fb8d3 sql_commit_graph_storage: allow querying from master in the lower level APIs and add a method for querying only changeset ids
Summary: Adds a parameter to the lower level APIs methods in SqlCommitGraphStorage to allow querying from master, and adds a method for querying only the changeset ids in a range without the associated changeset edges.

Differential Revision: D46862197

fbshipit-source-id: 57193eefdadd07474402108f452b9b6caf93b1f8
2023-06-26 05:12:38 -07:00
Youssef Ibrahim
efda6b6aae commit_graph: parallelize calculating disjoint segments
Summary: Each call to disjoint_segments is completely independent, so let's try to run them in parallel.

Reviewed By: markbt

Differential Revision: D46801400

fbshipit-source-id: 979ddaed4ac81d395debbea89b7a2cd84af25a0b
2023-06-26 05:12:38 -07:00
Youssef Ibrahim
7d9641bc72 cache_warmup: add cache warmup for commit graph segments
Summary: We need to make sure that the changesets the commit graph segments methods are querying are cache warm.

Reviewed By: markbt

Differential Revision: D46796191

fbshipit-source-id: 2d2d069222e909a6351ab7d596a41db590ceb87b
2023-06-26 05:12:38 -07:00
Youssef Ibrahim
21cbf11f30 edenapi: add commit/graph_segments handler
Summary: Adds a new EdenAPI endpoint commit/graph_segments that returns a segmented representation of ancestors of one set of commits (heads) excluding ancestors of another set of commits (common).

Reviewed By: markbt

Differential Revision: D46796194

fbshipit-source-id: 45513e97a1b42f3fbdbe14ed94961512011cb0d5
2023-06-26 05:12:38 -07:00
Youssef Ibrahim
629a1fd0c3 commit_graph: implement ancestors_difference_segments
Summary: Implements a method called ancestors_difference_segments that returns a segmented representation of ancestors of one set of changesets (heads) excluding ancestors of another set of changesets (common).
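
As a rough illustration of the set being described (ancestors of heads minus ancestors of common), here is a minimal sketch on a toy in-memory graph. The `Parents` type and both function names are hypothetical, and the grouping of the result into segments is not attempted here.

```rust
use std::collections::{BTreeSet, HashMap};

/// Toy commit graph (an assumption for this sketch): changeset id -> parent ids.
type Parents = HashMap<u32, Vec<u32>>;

/// All ancestors of `starts`, including the starting changesets themselves.
fn ancestors(graph: &Parents, starts: &[u32]) -> BTreeSet<u32> {
    let mut seen = BTreeSet::new();
    let mut stack: Vec<u32> = starts.to_vec();
    while let Some(id) = stack.pop() {
        // Only expand a changeset the first time it is seen.
        if seen.insert(id) {
            if let Some(parents) = graph.get(&id) {
                stack.extend(parents.iter().copied());
            }
        }
    }
    seen
}

/// Ancestors of `heads` that are not ancestors of `common`.
fn ancestors_difference(graph: &Parents, heads: &[u32], common: &[u32]) -> BTreeSet<u32> {
    let included = ancestors(graph, heads);
    let excluded = ancestors(graph, common);
    included.difference(&excluded).copied().collect()
}
```

With a graph where 2 has parent 1, 3 and 5 have parent 2, and 4 has parent 3, calling `ancestors_difference(&graph, &[4, 5], &[2])` leaves changesets 3, 4 and 5.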

Reviewed By: markbt

Differential Revision: D46796195

fbshipit-source-id: c355cb3fe59829ae26d15b6064cfdf35510825c0
2023-06-26 05:12:38 -07:00
Muir Manders
aba56cfd2e scmstore: kill "enableshim" flag
Summary: Kill scmstore.enableshim flag, always using scmstore over content store.

Reviewed By: sggutier

Differential Revision: D45556672

fbshipit-source-id: 159eb2bfa14a9870ac0a3931f18f654c638bcc3f
2023-06-25 23:58:14 -07:00
Mateusz Kwapich
35d04fb79f add --no-merge flag
Summary:
This flag allows the repo import to progress through all the steps
but bail before actually merging in the repos.

Currently the repo_import tool is so slow that all the pre-merge steps
can take hours. This makes it really hard to control when the actual merge
commit will happen. This flag will allow us to prepare all those steps
ahead-of-time and then resume with just merge in mind.

I don't think it's a proper long-term fix but I found it useful when working
with whatsapp/biz and would use it again until we properly optimize the repo_import
tool.

Differential Revision: D46802952

fbshipit-source-id: 2e8185482c4ba9c04fed20013efcc80d75e80bad
2023-06-22 10:31:22 -07:00
Egor Tkachenko
6daec5ad8d non-oss work 84/n
Differential Revision: D46901370

fbshipit-source-id: 45e8549191975ba143194e0de32f3c3948ebb36f
2023-06-22 07:46:20 -07:00
Egor Tkachenko
cd679ea8ec non-oss work 83/n
Differential Revision: D46860120

fbshipit-source-id: 21a71f785bc2261db654b0f72890668c12c3d30b
2023-06-22 07:46:20 -07:00
Haitao Mei
773bd915e7 new admin tool unlink keys after doing a sanitising check
Summary: This diff allows the new admin tool to double-check that the key is really the one we want to delete before doing the actual deletion.

Reviewed By: mitrandir77

Differential Revision: D46901306

fbshipit-source-id: 58904c5272d22b696dd22b3c83a9caf33fa3a0b2
2023-06-22 02:16:25 -07:00
Mateusz Kwapich
c79fc047d3 improve repo_create_commit docs
Summary:
Similarly named method in scmquery pushes commits onto some branch.
Let's make the description very clear.

Reviewed By: malmond77

Differential Revision: D46916234

fbshipit-source-id: 6137f42df97be80fff2775638f766f05c1113488
2023-06-21 18:33:21 -07:00
Egor Tkachenko
0c0282221b non-oss work 82/n
Differential Revision: D46836985

fbshipit-source-id: da87e06165fe72044c2944f45fcac045c8a40286
2023-06-21 10:04:36 -07:00
Mateusz Kwapich
0ee50a92c3 pushrebase - allow exemptions from casefolding check
Summary:
The casefolding pushrebase check is blocking the sync of www commits causing
problems. Let's exempt www/ dir from it.

We'll also have to modify commit hooks - but that's a separate thing.

Reviewed By: markbt

Differential Revision: D46860559

fbshipit-source-id: 87db959e0d025c0c1fc5c6cfecbdcf96af9e0f81
2023-06-20 16:48:55 -07:00
Haitao Mei
a66a40a339 new admin tool bulk unlinking refactoring
Summary:
This diff does the following refactoring:
* Remove redundant key checking when parsing the blobstore key and extracting the repo_id from it
* Replace all the `writeln!` with `println!`
* Print errors using `eprintln!`
* Remove misuse of verbose

Differential Revision: D46857681

fbshipit-source-id: 1b065756eb91d8b4b67422ef49ded2783660992f
2023-06-20 10:39:12 -07:00
Haitao Mei
4d25e33775 new admin tool print how long does it take to delete all the keys from a file
Summary: This diff allows the new admin tool to print out how long it takes to process each file during bulk blobstore key unlinking.

Differential Revision: D46838088

fbshipit-source-id: 92fa57edcce5b171a1d4f6be6d7a57571fdb618b
2023-06-20 09:07:49 -07:00
Haitao Mei
39d3400de2 new admin tool uses HashMap to cache the blobstores regarding a given repo_id
Summary:
This diff:
* Adds a new HashMap to store the blobstores for a given repo_id, to avoid duplicated computation
* Removes some debugging output
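
A minimal sketch of the caching idea, assuming a `u32` repo id and a placeholder `Blobstore` type. `BlobstoreCache`, `get_or_build` and `build_blobstores` are hypothetical names for illustration, not the tool's actual code.

```rust
use std::collections::HashMap;

/// Stand-in for a real blobstore handle; only a name is kept here.
#[derive(Debug, Clone, PartialEq)]
struct Blobstore {
    name: String,
}

/// Caches the blobstores built for each repo id so they are constructed once.
struct BlobstoreCache {
    by_repo: HashMap<u32, Vec<Blobstore>>,
    /// Counts how many times the (expensive) construction actually ran.
    builds: usize,
}

impl BlobstoreCache {
    fn new() -> Self {
        Self { by_repo: HashMap::new(), builds: 0 }
    }

    /// Return the cached blobstores for `repo_id`, building them on first use.
    fn get_or_build(&mut self, repo_id: u32) -> &[Blobstore] {
        if !self.by_repo.contains_key(&repo_id) {
            self.builds += 1;
            let stores = build_blobstores(repo_id);
            self.by_repo.insert(repo_id, stores);
        }
        &self.by_repo[&repo_id]
    }
}

/// Hypothetical, deliberately cheap replacement for the real construction.
fn build_blobstores(repo_id: u32) -> Vec<Blobstore> {
    vec![Blobstore { name: format!("repo{repo_id}-primary") }]
}
```

Calling `get_or_build(2122)` twice runs the construction once; a second repo id triggers one more build.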

Differential Revision: D46802649

fbshipit-source-id: 8ddf0243b192b5684af2d4a430cdb8873a78246d
2023-06-20 09:07:49 -07:00
Haitao Mei
29d634f146 new admin tool does the actual deletion for the bulk key unlinking
Summary:
This diff allows the new admin tool to actually delete the content from the blobstore.

When the key is not present, there is nothing happening.

Differential Revision: D46802651

fbshipit-source-id: ac1d9aabcb4fd17455263c29b1348f5d1619868b
2023-06-20 09:07:49 -07:00
Pierre Chevalier
6ed7f57fcc non-oss work 80/n
Summary:
[plan_deletion] Add subcommand to redact relevant blobs

To make it easy to use in the context of plan deletion, we take the provisional deletion plan issued from the `propose-deletion-plan` subcommand in and we output the redaction id to a file.

In the real world, one would use this redaction id to make a diff to the mononoke config in configerator and restart the mononoke servers to make sure that the redaction actually takes effect before moving on to the next step.

Reviewed By: markbt

Differential Revision: D46768075

fbshipit-source-id: 368dafd485a02d677f3a74c6ede59a80cbe931d4
2023-06-20 07:15:08 -07:00
Pierre Chevalier
f377936c00 Expose fetch-key-list through the API and the CLI
Summary:
The current interface relies on the assumption that redacted content only relates to file contents.
`list`, which is the way to read what was redacted, discovers the blobs for file contents by walking the fsnodes manifests.

For a feature I'm working on, I need to be able to redact arbitrary blobs, so I need a way to identify what was redacted
without relying on some derived data blobs being available.

Reviewed By: markbt

Differential Revision: D46806525

fbshipit-source-id: 78b5470d4dd741538e3e85353c0b1634f3b83b1c
2023-06-20 07:15:08 -07:00
Pierre Chevalier
14bbdd45bf Make the API to the redaction crate more minimal
Summary: Instead of taking in the full context of a `MononokeApp`, only take the two blobstores that we need.

Reviewed By: markbt

Differential Revision: D46768087

fbshipit-source-id: 773f0f31e12c58ee0cc8341e638fcbcedde8bab9
2023-06-20 07:15:08 -07:00
Pierre Chevalier
2fe77034e8 Extract redaction feature to its own crate
Summary:
The crux of the redaction process is this `create_key_list` function which appends each key to the redaction blobstores and provides instruction to generate the correct redaction config.

I will need to re-use this mechanism, so extract it to an accessible crate.

Reviewed By: markbt

Differential Revision: D46763714

fbshipit-source-id: 8f06a12c6348bc49d0ca4fe65155818a216c5b88
2023-06-20 07:15:08 -07:00
Rajiv Sharma
6da90296c8 Clippy Fixes
Summary: As in title

Reviewed By: mitrandir77

Differential Revision: D46855503

fbshipit-source-id: 715548f1a138e950bc2ae00e9b83a47a13944109
2023-06-20 05:19:24 -07:00
Haitao Mei
c8968599aa new admin tool extract the blobstore key from the given key
Summary:
This diff allows the new admin tool to extract the blobstore key from a given key.

For example, `flat/repo2122.blame.fileunode.blake2.2e58d897760aa5927c7ca1b0755992c9109be6048a9dc1deab1c86a4619f5839` will give `repo2122.blame.fileunode.blake2.2e58d897760aa5927c7ca1b0755992c9109be6048a9dc1deab1c86a4619f5839`.
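
A minimal sketch of that extraction, assuming keys have the shape shown in the example above (`flat/` prefix, then `repo<ID>.`). Both helper names are hypothetical, and the repo id parsing is inferred only from the example key.

```rust
/// Hypothetical helper: drop a storage prefix such as `flat/` from a key.
/// Keys without the prefix are returned unchanged.
fn strip_key_prefix<'a>(key: &'a str, prefix: &str) -> &'a str {
    key.strip_prefix(prefix).unwrap_or(key)
}

/// Hypothetical helper: read the numeric repo id out of a key shaped like
/// `repo<ID>.<rest>`. Returns `None` when the key does not match that shape.
fn repo_id_from_key(key: &str) -> Option<u32> {
    let rest = key.strip_prefix("repo")?;
    let (digits, _) = rest.split_once('.')?;
    // Reject empty or non-digit ids (for example a leading sign).
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    digits.parse().ok()
}
```

For a key like `flat/repo2122.blame.fileunode.blake2.<hash>`, stripping `flat/` and then parsing gives repo id 2122 under these assumptions.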

Differential Revision: D46802650

fbshipit-source-id: bec63353b1ca50eeeab73cd84b5e3d8fb967c694
2023-06-20 05:17:23 -07:00
Haitao Mei
0b7b932ef6 Implement getting blobstores for a given repo id
Summary:
This diff adds
* a new function to extract repo id from a given key
* a new function to construct all the blobstores given a repo id

Differential Revision: D46763054

fbshipit-source-id: 64bb1ab4fd8bcae8ab414dd03e78f0a7c2dd5639
2023-06-20 05:17:23 -07:00
Youssef Ibrahim
034dc93e70 commit_graph: use fetch_edges_required and fetch_many_edges_required everywhere except in exists, changeset_parents and changeset_generation
Summary: Some of the usages of the commit graph (e.g. in pushrebase) query very recent commits that might not have been fully replicated in xdb yet. This diff switches to using fetch_edges_required and fetch_many_edges_required everywhere except in exists, changeset_parents and changeset_generation, to avoid any errors caused by replication lag. This shouldn't cause any performance degradation as we always use the normal read connection first and only query master in the rare case that a commit is missing from replicas.

Reviewed By: mitrandir77

Differential Revision: D46842804

fbshipit-source-id: 3822d883b079447d0af90c9169cd3db7600f9b25
2023-06-19 14:20:27 -07:00
Youssef Ibrahim
0b9254b329 commit_graph: use fetch_edges_required in changeset_generation_required and changeset_parents_required
Summary: The SQL implementation of fetch_edges_required first tries fetching the changeset edges using the normal read connection, then tries to fetch any missing changesets using the master connection. This can sometimes be the desired behavior when working with very recent commits so let's expose it through changeset_generation_required and changeset_parents_required.
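
The replica-first, master-fallback pattern described here can be sketched with two in-memory maps standing in for the two connections. `Edges` and `fetch_with_master_fallback` are hypothetical names; the real storage is SQL-backed and async, which this sketch does not model.

```rust
use std::collections::HashMap;

/// Toy stand-in for a SQL connection: changeset id -> parent ids.
type Edges = HashMap<u32, Vec<u32>>;

/// Fetch edges for every id in `ids`, trying `replica` first and asking
/// `master` only for ids the replica did not return.
/// Returns `Err` with the ids that are missing from both.
fn fetch_with_master_fallback(
    replica: &Edges,
    master: &Edges,
    ids: &[u32],
) -> Result<HashMap<u32, Vec<u32>>, Vec<u32>> {
    let mut found = HashMap::new();
    let mut missing = Vec::new();
    // First pass: the normal read connection.
    for id in ids {
        match replica.get(id) {
            Some(parents) => {
                found.insert(*id, parents.clone());
            }
            None => missing.push(*id),
        }
    }
    // Second pass: only the ids the replica lacked go to master.
    let mut still_missing = Vec::new();
    for id in missing {
        match master.get(&id) {
            Some(parents) => {
                found.insert(id, parents.clone());
            }
            None => still_missing.push(id),
        }
    }
    if still_missing.is_empty() {
        Ok(found)
    } else {
        Err(still_missing)
    }
}
```

In the common case every id is found on the replica and master is never consulted, which matches the "no extra cost unless a commit is missing" reasoning in these diffs.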

Reviewed By: markbt

Differential Revision: D46796193

fbshipit-source-id: 38b3d79dd88ca4b8b98f36705ca1a215379a87b3
2023-06-19 08:30:20 -07:00
Youssef Ibrahim
1afff25d5b newadmin: add the missing blobstore fetch decodings
Summary: Added the missing blobstore fetch decodings, except RedactionKeyList, as that requires creating a separate config and already has its own newadmin command.

Differential Revision: D46491769

fbshipit-source-id: 82208ac9dcb1acb5370ff7904c285e843be4f18f
2023-06-19 07:08:40 -07:00
Mateusz Kwapich
fbea1eb670 change the megarepo tool to not use committer date
Summary:
This makes commits created by megarepo tools roundtrippable through bonsais (so far they were not).
D27852341 shows that it's not a problem for the actual backup job, but it's a
problem for the mirror_hg_commits binary, for example, so let's make those commits
nice and tidy.

Reviewed By: mzr

Differential Revision: D46489608

fbshipit-source-id: d15ba6f6622f51189a1ba9e76efd68faf1bf2b71
2023-06-19 06:28:57 -07:00
Mateusz Kwapich
e058be184f new, faster verification
Summary:
Verification is used to verify if given megarepo mapping configuration can be cleanly applied to given pair of small repo and large repo commits. It's used when we want to change current config and we want to check if according to the new config the mapping is still sound.

The previous *fast path* verification worked by limiting the amount of visited entries to `O(small repo)`

The new fast path verification doesn't walk every file in the repository; instead it leverages FSNodes to compare hashes of entire directories. This way, if the repository verifies OK, the verification is very fast. The amount of entries visited is `O(differences to report + number of config entries)`. When the configuration is simple and the repository is intact (the usual case) it's almost instant. In case of more complex configurations (like the ovrsource one) it's still **much** faster than the current one.

WARNING: The implementation is a bit hacky because the path mover functions were originally designed for moving file paths, not directory paths. The hack is mostly contained in the wrap_mover_result function.
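
The core idea of skipping whole directories whose hashes match can be sketched on a toy tree. This is a simplification under stated assumptions: `Node`, its precomputed `hash` field and `diff_paths` are hypothetical, and the path mapping between small and large repos is left out entirely.

```rust
use std::collections::BTreeMap;

/// Toy tree node. `hash` stands in for a content hash covering the whole subtree.
enum Node {
    File { hash: u64 },
    Dir { hash: u64, entries: BTreeMap<String, Node> },
}

impl Node {
    fn hash(&self) -> u64 {
        match self {
            Node::File { hash } | Node::Dir { hash, .. } => *hash,
        }
    }
}

/// Collect paths that differ between `a` and `b` into `out`, skipping any
/// subtree whose hashes match. Returns the number of node pairs visited.
fn diff_paths(a: &Node, b: &Node, path: &str, out: &mut Vec<String>) -> usize {
    // Equal hashes: the whole subtree is identical, so stop here.
    if a.hash() == b.hash() {
        return 1;
    }
    match (a, b) {
        (Node::Dir { entries: ea, .. }, Node::Dir { entries: eb, .. }) => {
            let mut visited = 1;
            for (name, child_a) in ea {
                let child_path = format!("{path}/{name}");
                match eb.get(name) {
                    Some(child_b) => visited += diff_paths(child_a, child_b, &child_path, out),
                    None => out.push(child_path),
                }
            }
            // Entries present only on the `b` side.
            for name in eb.keys().filter(|n| !ea.contains_key(*n)) {
                out.push(format!("{path}/{name}"));
            }
            visited
        }
        // Two differing files, or a file on one side and a directory on the other.
        _ => {
            out.push(path.to_string());
            1
        }
    }
}
```

When the two trees are identical the walk stops at the root, and in general it descends only along paths that lead to a difference.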

Differential Revision: D45864549

fbshipit-source-id: 2fad0ed29a5718e655fa4a69b28306ca3db31dda
2023-06-19 03:38:35 -07:00
Mateusz Kwapich
6dc96eaecf add committer date and time support
Summary:
I want to test for some bug in megarepotooling and I'd like to use that
feature.

Differential Revision: D46440647

fbshipit-source-id: 1f4bb0d937f6e681a02f9843a7241fde4ae9b774
2023-06-19 03:33:58 -07:00
Mateusz Kwapich
b15dc88eb6 add author_date
Summary: For completeness it would be great to have it.

Differential Revision: D46440646

fbshipit-source-id: 0c57ab1a5b865ba3a93cc46ac92c9345e2a39d86
2023-06-19 03:33:58 -07:00
Mateusz Kwapich
b203146f0f add committer and committer date to test commits
Summary:
Some of our commits have those fields set so let's allow setting them on test
commits.

Differential Revision: D46440648

fbshipit-source-id: 393d9fcb616ddee5481f239264ab87bf5c8315e9
2023-06-19 03:33:58 -07:00
Chad Austin
d110d684f2 hooks: allow other nocommit spellings
Reviewed By: mzr

Differential Revision: D46464575

fbshipit-source-id: d2cf8059062ea68f465a5db40185f0faec3b89e3
2023-06-16 14:53:10 -07:00
Shayne Fletcher
ca860fbcef Update autocargo component on FBS:master
Reviewed By: zertosh

Differential Revision: D46811187

fbshipit-source-id: 5d4e3993ccbc3871bf56495822d15de8d73f832c
2023-06-16 14:06:54 -07:00
Mateusz Kwapich
f7e8639689 option to rewrite dates in forward syncer
Summary:
We need that for current catchup to avoid confusing clients with backdated
commits.

Reviewed By: markbt

Differential Revision: D46743487

fbshipit-source-id: a71ee30eada63dab7c64cbe43a4495662fe04371
2023-06-16 04:08:02 -07:00
Mark Juggurnauth-Thomas
8eed042d80 cmdlib_caching: remove deprecated cachelib options
Summary: Remove the deprecated `--skip-caching`, `--blobstore-cachelib-only` and `--cachelib-only-blobstore` options.

Reviewed By: clara-9

Differential Revision: D45089767

fbshipit-source-id: 52707ce00865bfd9ed7e09f0df4f6ce190ac7481
2023-06-16 03:21:13 -07:00
Mark Juggurnauth-Thomas
32d13cc799 walker: specify cache-mode default
Summary: Walker previously specified a default for the deprecated `cachelib-only-blobstore` argument.  Switch to `cache-mode`, allowing us to specify local caching for all types, not just the blobstore.

Differential Revision: D45089769

fbshipit-source-id: 978ac545590935be5a825669db9df58f0281c41a
2023-06-16 03:21:13 -07:00
Eddie Shen
5db8c486ab Replace remaining atty usages with stdlib
Summary:
`std::io::IsTerminal` was stabilized in 1.70. It is intended to be a drop in replacement for `atty::is(Stdout)`.

This codemod removes all remaining instances of atty

`arc f` was run twice, followed by two invocations of `arc lint -e extra --take RUSTFIXDEPS`. `arc autocargo` was also run.

This script was a little naive though, and as a result, I had to manually move the newly added imports to the correct line. I ran lints again afterwards.

Reviewed By: dbxfb

Differential Revision: D46736264

fbshipit-source-id: a686b96b1fa0aa4389f65487ed426f226c86e8e9
2023-06-15 18:42:33 -07:00
Mateusz Kwapich
e7b8e8b8b5 option to skip syncing empty commits
Summary: www has lots of empty commits that are being synced to all small repos. Until we fix that problem let's refrain from backsyncing any empty commits. I'm making it a tunable for easy on-off

Reviewed By: liubov-dmitrieva

Differential Revision: D46727657

fbshipit-source-id: 5f204cc8231fec3e6a4eaf25b6575c0d637d67a3
2023-06-15 08:46:20 -07:00
Pierre Chevalier
1017a247d0 non-oss work 79/n
Summary:
[plan_deletion] Implement `commit-deletion-plan-to-deletion-log-db` subcommand

This is the last necessary step before enacting the deletion.
As the outcome of this step, the deletion_log db is populated and we can start taking the necessary steps to safely proceed with the actual deletion of changesets and blobs.

We still need to document and potentially write tools to help with the whole process, including blobs redaction.

Differential Revision: D46724273

fbshipit-source-id: bc60209a28e2ee8a50ac0371f78c0b3be234781b
2023-06-15 07:57:35 -07:00
Haitao Mei
c27e9a8576 new admin tool creates a new struct to implement all features of bulk unlink
Summary: This diff moves all the bulk unlinking functions into a struct.

Reviewed By: mitrandir77

Differential Revision: D46759318

fbshipit-source-id: 143171706485ec6a56152bdfea6202221d88fe03
2023-06-15 07:51:33 -07:00
Haitao Mei
61bcdc04d4 new admin tool making the getting blobstores function for unlinking a key to be public
Summary: This diff makes the `get_blobstores` function public, so that we can use it to get blobstores. These blobstores are going to be used to unlink keys in bulk.

Reviewed By: mitrandir77

Differential Revision: D46729000

fbshipit-source-id: 11d7ddc5f90527ebdb79141f61f1671997108ac5
2023-06-15 07:51:33 -07:00
Haitao Mei
f76eb1f9c6 new admin tool bulk unlink keys subcommand read keys from all files
Summary:
This diff allows the new admin tool to do the following for the new bulk unlinking keys subcommand:
* Read the key list from the files
* Print all the keys extracted when running in dry-run mode
* Report the progress
* Fix the argument to take a boolean value

Reviewed By: mitrandir77

Differential Revision: D46725312

fbshipit-source-id: c039b89e00a9ffbbbee41e20617f45543ef147ce
2023-06-15 07:51:33 -07:00
Haitao Mei
b556d7cc08 Adding a new command to bulk unlink keys into new admin tool
Summary: This diff adds a new subcommand for our new admin tool, so that we can delete keys in bulk. The input is a directory that contains a list of files; each file contains a list of *Manifold* keys to remove.

Reviewed By: mitrandir77

Differential Revision: D46689957

fbshipit-source-id: 077b6a4c76872ef01bc7f0eab7096abeb87d7047
2023-06-15 04:17:16 -07:00
Eddie Shen
842b985bd4 Replace atty::is(Stdout) with stdlib
Summary:
`std::io::IsTerminal` was stabilized in 1.70. It is intended to be a drop in replacement for `atty::is(Stdout)`.

This codemod replaces all usages of `atty::is(Stdout)` with `stdout().is_terminal()`, importing the necessary paths.

It used this script (generated with assistance from Metamate):
```
#!/bin/bash

files=$(fbgs -sl "atty::is(atty::Stream::Stdout)" | arc linttool debugfilterpaths --take RUSTFMT)

for file in $files; do
  sed -i 's/atty::is(atty::Stream::Stdout)/stdout().is_terminal()/g' $file
done

for file in $files; do
  sed -i '1i use std::io::{stdout, IsTerminal};' $file
done
```

`arc f` was run twice, followed by two invocations of `arc lint -e extra --take RUSTFIXDEPS`.

This script was a little naive though, and as a result, I had to manually move the newly added imports to the correct line. I ran lints again afterwards.
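
For reference, the replacement call looks roughly like this after the codemod. `stdout_is_terminal` and `status_prefix` are hypothetical wrappers added only for illustration; `std::io::IsTerminal` itself requires Rust 1.70 or later.

```rust
use std::io::{stdout, IsTerminal};

/// Returns true when standard output is attached to a terminal.
/// Before the codemod this would have been `atty::is(atty::Stream::Stdout)`.
fn stdout_is_terminal() -> bool {
    stdout().is_terminal()
}

/// Hypothetical caller: pick a colored or plain prefix based on the check.
fn status_prefix(is_tty: bool) -> &'static str {
    if is_tty {
        "\x1b[32mok\x1b[0m"
    } else {
        "ok"
    }
}
```

A caller would typically write `println!("{}", status_prefix(stdout_is_terminal()))`, so that escape codes are not written when output is piped to a file.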

Reviewed By: danielocfb

Differential Revision: D46601191

fbshipit-source-id: 3f2269c518e36241ab0e995b87976323d7da9cc6
2023-06-14 15:29:14 -07:00
Rajiv Sharma
c81071e34e Allow bypassing read-only status in gitimport
Summary: `gitimport` is typically used for mirror repos which by definition are locked. Creating or moving bookmarks on locked repos is disallowed by default. This behaviour can be overridden by passing the bypass flag. This diff adds argument support for bypassing the read-only check.

Differential Revision: D46722005

fbshipit-source-id: 9defcbc159a55d0e7ff197d8d732e3d5a208a998
2023-06-14 12:32:56 -07:00
Haitao Mei
354da6c438 newadmin tool now deletes keys from all the related blobstores II
Summary: We handle the error gracefully, when deleting a key from the underlying blobstores.

Reviewed By: RajivTS

Differential Revision: D46684795

fbshipit-source-id: 45349aab01e56b11fd1ab455aec2bb69d72c497b
2023-06-14 04:27:47 -07:00
Jan Mazur
5da5975071 import async-fcgi and http-body; add stream feature to hyper
Summary: Importing into third-party

Reviewed By: zertosh

Differential Revision: D46639634

fbshipit-source-id: c9eb7d056e75ae2c5d9a8bb9bcc781d419f65f4c
2023-06-13 17:56:22 -07:00