Commit Graph

3220 Commits

Author SHA1 Message Date
Stanislau Hlebik
39e915d8d9 mononoke: allow creation of multiple symlinks that point to the same directory
Summary:
Previously this wasn't possible because the symlink target was the key in the map that
mega_grepo_sync sends to scs, so we couldn't have two different symlinks
for the same symlink target. However, we actually need this: some AOSP repos
have different symlink sources that point to the same symlink target.

This diff fixes it by reversing the key and value in the `linkfiles` map.
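
A minimal sketch of why the map direction matters, assuming `linkfiles` behaves like a plain string-to-string map (the real types are not shown in this message, and both helper names are hypothetical). Keyed by target, a second symlink to the same target overwrites the first; keyed by symlink path, both entries survive.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: the old shape, keyed by symlink target.
/// Each input pair is (symlink_path, target).
fn keyed_by_target(links: &[(&str, &str)]) -> BTreeMap<String, String> {
    // A later symlink to the same target replaces the earlier entry.
    links
        .iter()
        .map(|(path, target)| (target.to_string(), path.to_string()))
        .collect()
}

/// Hypothetical helper: the new shape, keyed by symlink path.
fn keyed_by_symlink(links: &[(&str, &str)]) -> BTreeMap<String, String> {
    // Symlink paths are unique, so no entry is lost.
    links
        .iter()
        .map(|(path, target)| (path.to_string(), target.to_string()))
        .collect()
}
```

With two symlinks that share a target, the first map ends up with one entry and the second with two.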

Differential Revision: D29359634

fbshipit-source-id: da74d6e934350822d82d2135ab06c754824525c9
2021-06-28 04:04:46 -07:00
Xavier Deguillard
41897e3acc third-party: patch os_info to properly support Centos Stream
Summary:
This is just updating the os_info crate to my fork with a fix for Centos
Stream: https://github.com/stanislav-tkach/os_info/pull/267

Reviewed By: quark-zju

Differential Revision: D29410043

fbshipit-source-id: 3642e704f5a056e75fee4421dc59020fde13ed5e
2021-06-25 21:07:33 -07:00
Daniel Xu
431a4ed16b Fix autocargo skew
Summary: I think someone landed a dependency change or something and forgot to update autocargo

Reviewed By: dtolnay

Differential Revision: D29402335

fbshipit-source-id: e9a4906bf249470351c2984ef64dfba9daac8891
2021-06-25 17:23:33 -07:00
Arun Kulshreshtha
60139f5316 localrepo: add option to explicitly enable or disable EdenAPI globally
Summary: Add an option to allow manually forcing EdenAPI to be enabled or disabled. This is useful in a variety of cases, such as bypassing the normal EdenAPI activation logic in tests, or to forcibly disable EdenAPI in cases where it isn't working correctly.

Differential Revision: D29377923

fbshipit-source-id: f408efe2a46ef3f1bd2914669310c3445c7d4121
2021-06-25 15:33:00 -07:00
Mark Juggurnauth-Thomas
a818b2a73d mononoke_api: detect multiple copies or renames when diffing
Summary:
When diffing a changeset with its parents, if a file is copied to multiple places, then we should include all of those copies in the diff.

Furthermore, if the file is also removed, then the *first* of those copies
should be considered a move.  Note that "first" here means the first in the
lexicographic ordering of the repository manifest.
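
The rule above can be sketched as follows. This is an illustration only: `CopyKind` and `classify_copies` are hypothetical names, and plain string ordering stands in for manifest ordering.

```rust
#[derive(Debug, PartialEq)]
enum CopyKind {
    Copy,
    Move,
}

/// Classify every destination that was copied from a single source file.
/// `source_removed` says whether the source is deleted in the same changeset.
fn classify_copies(mut destinations: Vec<String>, source_removed: bool) -> Vec<(String, CopyKind)> {
    // Assumption: string ordering approximates the manifest's ordering.
    destinations.sort();
    destinations
        .into_iter()
        .enumerate()
        .map(|(index, dest)| {
            // Only the first destination can be a move, and only if the
            // source was removed; every other destination is a copy.
            let kind = if source_removed && index == 0 {
                CopyKind::Move
            } else {
                CopyKind::Copy
            };
            (dest, kind)
        })
        .collect()
}
```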

Reviewed By: liubov-dmitrieva

Differential Revision: D29359516

fbshipit-source-id: eeed630c2e4d20f3fb8c923611a0433c74fd25d0
2021-06-25 11:17:26 -07:00
Jeremy Fitzhardinge
174de901a2 thrift/rust: remove new constructor for newtype typedefs
Summary:
A `new` constructor isn't necessary because it's identical to just
`TypeName`. Now that user-provided constructors can be included, it occupies
valuable namespace.
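
To illustrate the redundancy, here is a hand-written newtype. `UserId` is a hypothetical example and not actual generated Thrift code.

```rust
/// Hypothetical newtype, standing in for what a typedef might generate.
#[derive(Debug, PartialEq)]
pub struct UserId(pub i64);

impl UserId {
    /// The kind of constructor being removed: it only forwards to the
    /// tuple-struct constructor `UserId(..)`.
    pub fn new(value: i64) -> Self {
        UserId(value)
    }
}

/// The tuple-struct constructor is itself a function, so it can be passed
/// directly wherever `UserId::new` would have been used.
pub fn wrap_all(raw: Vec<i64>) -> Vec<UserId> {
    raw.into_iter().map(UserId).collect()
}
```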

#forcetdhashing

Reviewed By: krallin

Differential Revision: D29387037

fbshipit-source-id: 7de343c13842c74772f7eca83ddd7019e1040c5c
2021-06-25 10:15:36 -07:00
Jun Wu
4b7bcc2553 dag: rename parents_and_head to parents_head_and_roots
Summary: The returned value now includes roots. Rename the function to clarify.

Reviewed By: kulshrax

Differential Revision: D29383072

fbshipit-source-id: 02a255ce20d9797f482f6fe1c716f2d79a12d4e0
2021-06-25 09:29:03 -07:00
Stanislau Hlebik
5ef7ba764b mononoke: do not check non-prefix free paths and verify_config earlier
Summary:
1) It turned out that it's possible to have non-prefix-free paths in AOSP manifests, so
we have to remove this check for now.
2) Also, let's verify the config earlier so that we can return an error to the user
faster.

Differential Revision: D29335602

fbshipit-source-id: 3dd72d63a370515eca5d356b3b98bb2ac2245aee
2021-06-25 09:26:33 -07:00
Egor Tkachenko
9961e81a80 Add logging of rebased changeset to scuba
Summary:
When we pushrebase, the changesets sent to us by the client are rebased and get a new hash, which is not available in mononoke_test_perf at the moment.
Let's log the rebased changeset_id.

Reviewed By: Croohand

Differential Revision: D29362816

fbshipit-source-id: bebab24b12de1be9a9b81502453fcf44444f94b5
2021-06-25 06:57:34 -07:00
Thomas Orozco
8c83bd9a1c third-party/rust: update Tokio to 1.7.1
Summary: There is a regression in 1.7.0 (which we're on at the moment) so we might as well update.

Reviewed By: zertosh, farnz

Differential Revision: D29358047

fbshipit-source-id: 226393d79c165455d27f7a09b14b40c6a30d96d3
2021-06-25 06:17:41 -07:00
Stanislau Hlebik
1ab8052daf mononoke: add a command to fetch deleted file manifest
Summary: It's nice to be able to inspect it

Differential Revision: D29366561

fbshipit-source-id: 12b3cb31e9c5821d942a0a10f97962e3ae1ddc41
2021-06-25 03:29:25 -07:00
Yan Soares Couto
af0811bdbc Add type RedactionKeyList
Summary:
This adds the blob object RedactionKeyList, which just contains a list of Strings, each of which will be a key to be redacted.

This will be stored on the blobstore, while a key to this object will be stored in configerator.

Some stuff that might be worth discussing:
- This class just holds a list of strings, per se it doesn't have much to do with redaction. If we want to change this to a more generic object like `KeyList`, I'm happy to do it. By default I'll leave it like this.
- I used serde (more precisely, JSON) to (de)serialise it. The only reason is that I wanted to keep this as simple as possible: from what I see, other objects need to define a thrift struct with the same config, then write `into/from_thrift` implementations. If preferred, I can do that.

It's not used in this diff but will be used in the future; I split it out mostly to make it easier to review.

Reviewed By: markbt

Differential Revision: D29033597

fbshipit-source-id: 5550dbf58c5214201b739f8150fd06471bd67ab8
2021-06-25 03:15:28 -07:00
Andrey Chursin
7ed94dde6a OnDemandUpdateSegmentedChangelog: build up master before generating pull data
Summary: This is required to make sure segmented changelog has all the data needed

Reviewed By: quark-zju

Differential Revision: D29347285

fbshipit-source-id: 82ee1ffca178492b7ad363c53cee7ec57058733f
2021-06-24 13:58:02 -07:00
Andrey Chursin
dc97c2544a edenapi_service: add fast forward pull handler
Reviewed By: quark-zju

Differential Revision: D29342138

fbshipit-source-id: 056dad3bb7c207b1f0e9d0ee50a95e96ad690254
2021-06-24 13:58:02 -07:00
Robin Håkanson
4056944213 Add git LFS support to gitimport and grepo branch_forest.
Summary:
Add git LFS support to gitimport and grepo branch_forest.

I did not want to add the parsing of .gitattributes and .lfsconfig to the gitimport library. This needs to be done by the users of gitimport before the import is started, and the GitImportLfs object needs to be configured accordingly. Currently we are extracting this data from the manifest files for the "g"repo imports.

I am not sure the simple git-lfs download client works with git-lfs server back ends other than Dewey. But it is a fairly simple implementation and it should be easy to extend to be more generic.

Reviewed By: farnz

Differential Revision: D29082867

fbshipit-source-id: a7b0272147b3d44a0b6b9782d2a1b8ec94653b8f
2021-06-24 13:49:20 -07:00
Stanislau Hlebik
129d4fa88f mononoke: support multiple directories in mononoke_admin rsync
Summary: It's useful to be able to copy multiple dirs at once

Reviewed By: markbt

Differential Revision: D29358375

fbshipit-source-id: f1cc351195cc2c19de36a1b6936b598e314848c3
2021-06-24 11:44:34 -07:00
Stanislau Hlebik
1044dd545d mononoke: support mononoke admin convert for git
Summary:
Previously only conversion between bonsai and hg was supported. Let's add git
as well.

Obviously you can use `scsc lookup`, but mononoke_admin can be useful for repos
that are not on scs yet.

Reviewed By: farnz

Differential Revision: D29360793

fbshipit-source-id: eb2b71eab192b3456ba3d580f7eb8c4a85b2fd1d
2021-06-24 07:32:51 -07:00
Yan Soares Couto
a3e0290fe1 Move CoreContext creation in repo_factory to a new function
Summary: Very simple refactor. This logic was already used twice and I will use it another time in following diffs.

Reviewed By: markbt

Differential Revision: D29033594

fbshipit-source-id: 96040a2eee2b58f6851646e51b67c46c6bf334fe
2021-06-24 06:33:04 -07:00
Mark Juggurnauth-Thomas
728f145e78 ephemeral_blobstore: add ephemeral blobstore
Summary:
Implement get and put for the ephemeral blobstore.  This allows blobs to
be stored and retrieved in bubbles.

Ephemeral bubbles always have a repo associated with them when they are opened,
to simplify blob prefixing.  It is valid for a bubble id to have multiple repos
associated with it, but they must be accessed separately, and in practice this
won't be used.
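
A minimal sketch of blob prefixing inside a bubble, assuming an in-memory map as the backing store. The `Bubble` struct and the exact key format (`eph<bubble>.repo<id>.`) are assumptions for illustration, not the real implementation.

```rust
use std::collections::HashMap;

/// Hypothetical bubble handle: always opened for exactly one repo.
struct Bubble<'a> {
    bubble_id: u64,
    repo_id: u32,
    store: &'a mut HashMap<String, Vec<u8>>,
}

impl Bubble<'_> {
    /// Assumed key layout: bubble prefix, then repo prefix, then the key.
    fn prefixed(&self, key: &str) -> String {
        format!("eph{}.repo{:04}.{}", self.bubble_id, self.repo_id, key)
    }

    fn put(&mut self, key: &str, value: Vec<u8>) {
        let full_key = self.prefixed(key);
        self.store.insert(full_key, value);
    }

    fn get(&self, key: &str) -> Option<&Vec<u8>> {
        self.store.get(&self.prefixed(key))
    }
}
```

Because the repo id is part of the prefix, the same bubble id opened for another repo does not see the first repo's blobs.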

Reviewed By: StanislavGlebik

Differential Revision: D29067722

fbshipit-source-id: d870f695fc1d0c825fdaec9337c82a13209165ce
2021-06-24 04:13:58 -07:00
Mark Juggurnauth-Thomas
3c9bf458be metaconfig: add ephemeral blobstore config
Summary:
Extend metaconfig to include configuration for the ephemeral blobstore.

An ephemeral blobstore is optional: repos without an ephemeral blobstore cannot
store ephemeral commits or snapshots.

Reviewed By: StanislavGlebik

Differential Revision: D29067719

fbshipit-source-id: fe7d42173d5c34a937c99c72f4b2bd08af503889
2021-06-24 04:13:58 -07:00
Mark Juggurnauth-Thomas
5716174f8f packblob: generalise key prefixes
Summary:
Packblob currently expects key prefixes of the form `repoNNNN.` to be stripped, but also allows keys without this prefix. For the ephemeral blobstore we want to allow prefixes of the form `ephXXX.repoNNNN.` as well.

Generalise packblob so that we can have multiple key prefixes.

Packblob will enforce that none of the blobs in the packblob have a prefix that matches any of the patterns - this will prevent us from accidentally storing `repoNNNN.`-prefixed blobs in an ephemeral blobstore that requires `ephXXX.repoNNNN.` prefixes, for example.
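
A rough sketch of prefix stripping with more than one pattern, assuming that both `NNNN` and `XXX` are runs of ASCII digits. All function names are hypothetical, and the real code may match prefixes differently.

```rust
/// Strip `tag` followed by one or more ASCII digits and a dot, e.g. `repo0001.`.
fn strip_numbered<'a>(key: &'a str, tag: &str) -> Option<&'a str> {
    let rest = key.strip_prefix(tag)?;
    let digits = rest.chars().take_while(|c| c.is_ascii_digit()).count();
    if digits == 0 {
        return None;
    }
    // ASCII digits are one byte each, so the count is a valid byte offset.
    rest[digits..].strip_prefix('.')
}

/// Remove a recognised prefix, or return the key unchanged when none matches.
fn strip_key_prefix(key: &str) -> &str {
    // Try the longer ephemeral form first, then the plain repo form.
    strip_numbered(key, "eph")
        .and_then(|rest| strip_numbered(rest, "repo"))
        .or_else(|| strip_numbered(key, "repo"))
        .unwrap_or(key)
}

/// Check applied before packing: a stored key must not still carry a prefix.
fn has_known_prefix(key: &str) -> bool {
    strip_key_prefix(key) != key
}
```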

Reviewed By: liubov-dmitrieva

Differential Revision: D29067720

fbshipit-source-id: 953909d47c9c4af91b529bcc684340d26411463d
2021-06-24 04:13:58 -07:00
Alex Hornby
196ade1c06 mononoke: extract chunking params in walker
Summary: Make it clearer which of the TailParams are only required when chunking, removing parallel Option<> so that all items that should be set together are inside one optional item.
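
The idea of folding parallel options into one optional item can be sketched like this. The field names and the `Direction` enum are hypothetical; the real `TailParams` has different contents.

```rust
/// Hypothetical setting that only makes sense when chunking.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Direction {
    NewestFirst,
    OldestFirst,
}

/// Everything that must be set together lives in one struct.
#[derive(Debug, Clone, PartialEq)]
struct ChunkingParams {
    chunk_size: usize,
    direction: Direction,
}

/// Illustrative shape only.
#[derive(Debug, Clone, PartialEq)]
struct TailParams {
    interval_secs: Option<u64>,
    // One optional item replaces several parallel `Option<>` fields, so a
    // state like "chunk size set but direction missing" cannot be built.
    chunking: Option<ChunkingParams>,
}

impl TailParams {
    fn is_chunking(&self) -> bool {
        self.chunking.is_some()
    }

    fn chunk_size(&self) -> Option<usize> {
        self.chunking.as_ref().map(|c| c.chunk_size)
    }
}
```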

Reviewed By: farnz

Differential Revision: D29264647

fbshipit-source-id: d64cddf94b35e62d6e50cd8afe906eef2444c730
2021-06-24 01:49:39 -07:00
Alex Hornby
3d59baacd5 mononoke: check if chunking in walker defer_visit()
Summary: Makes defer_visit return result, so we can detect if it is called when not chunking.

Reviewed By: farnz

Differential Revision: D29268346

fbshipit-source-id: b8ea503c2848adb5d7ca3fb0e61399be2930c3de
2021-06-24 01:49:39 -07:00
Andrey Chursin
ea95fbdee8 api: introduce segmented_changelog_pull_fast_forward_master
Reviewed By: quark-zju

Differential Revision: D29319057

fbshipit-source-id: 88ff9e1f4acc0109c8a1e4978914f84832ebeb36
2021-06-23 14:51:39 -07:00
Andrey Chursin
f9b85a5a93 segmented_changelog: impl for ReadOnlySegmentedChangelog::pull_fast_forward_master
Summary: This is roughly similar to the algorithm in NameDag

Reviewed By: quark-zju

Differential Revision: D29318721

fbshipit-source-id: 51a9123daa2b4cf0fbe2346a8a0c7e75172d9afb
2021-06-23 14:51:39 -07:00
Andrey Chursin
b13454d54b segmented_changelog: introduce SegmentedChangelog::pull_fast_forward_master
Summary: The naming is used in other parts of the dag crate. This introduces the Mononoke-side binding for the corresponding functions on the dag side.

Reviewed By: quark-zju

Differential Revision: D29318722

fbshipit-source-id: e9eea5536b041b6ab2ce578914817bca43a10d48
2021-06-23 14:51:39 -07:00
Stanislau Hlebik
3c14f3c20b mononoke: fix symlink handling in megarepo_api
Summary:
Path should be relative to the symlink path, not to the repo root. This diff
fixes it
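
As an illustration of the difference, the sketch below computes the link text for a symlink from two repo-root-relative paths. `relative_symlink_target` is a hypothetical helper, not the megarepo_api code.

```rust
use std::path::{Component, Path, PathBuf};

/// Compute the link text for a symlink at `link` that points to `target`.
/// Both inputs are assumed to be normalized paths relative to the repo root.
fn relative_symlink_target(link: &Path, target: &Path) -> PathBuf {
    // A symlink is resolved from the directory that contains it.
    let base: Vec<Component> = link
        .parent()
        .unwrap_or(Path::new(""))
        .components()
        .collect();
    let dest: Vec<Component> = target.components().collect();

    // Number of leading directories shared by both paths.
    let common = base
        .iter()
        .zip(dest.iter())
        .take_while(|(a, b)| a == b)
        .count();

    let mut out = PathBuf::new();
    // Climb out of the directories that are not shared...
    for _ in common..base.len() {
        out.push("..");
    }
    // ...then descend into the rest of the target.
    for part in &dest[common..] {
        out.push(part.as_os_str());
    }
    if out.as_os_str().is_empty() {
        out.push(".");
    }
    out
}
```

For a symlink at `a/b/link` pointing to `a/c/dir`, the link text is `../c/dir` rather than the root-relative `a/c/dir`.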

Reviewed By: farnz

Differential Revision: D29327682

fbshipit-source-id: a51161a8039a88263fe941562f2c2134aa5d4fef
2021-06-23 04:20:33 -07:00
Meyer Jacobs
b5858adee1 scmstore: update remaining tests
Summary: Update the remaining tests for scmstore. In each of these cases we're just disabling scmstore for various reasons. I think the failures in `test-lfs-bundle.t` and `test-lfs.t` represent a legitimate issue with scmstore's contentstore fallback, but I don't think it should block the rollout.

Reviewed By: kulshrax

Differential Revision: D29289515

fbshipit-source-id: 10d055bf679db8efdeb16ac96b7ed597d7b6d82c
2021-06-22 13:14:58 -07:00
Stanislau Hlebik
56c926297f mononoke: reuse hg manifest from parents if they are identical
Summary:
This is a followup from D28903515 (9a3fbfe311). In D28903515 (9a3fbfe311) we added support for reusing
hg filenodes if a parent has the same filenode. However, we weren't reusing
manifests even if a parent has an identical manifest, and this diff adds
support for doing so.

There's one caveat: we try to reuse parent manifests only if there is more
than one parent manifest. See the explanation in the comments.

Reviewed By: farnz

Differential Revision: D29098908

fbshipit-source-id: 5ecfdc4b022ffc7620501cc024e7a659fb82f768
2021-06-22 11:50:02 -07:00
Andres Suarez
fc37fea20c Update itertools 0.8.2 to 0.10.1
Reviewed By: dtolnay

Differential Revision: D29286012

fbshipit-source-id: 6923c0b750692e6932e85fd539b076b172ff43b7
2021-06-22 04:09:00 -07:00
Alex Hornby
4c94a2bfc3 mononoke: no need to revisit deferred nodes in OldestFirst mode
Summary:
In the walker, an Option<NodeData> value of None is used to indicate that no data could be found for a node, and that for derived data mappings we should try again to load it later, when it may have been derived.

When a node is outside the chunk boundary this isn't appropriate; we should just mark it as visited and move on, which is what this change does.

Reviewed By: farnz

Differential Revision: D29230223

fbshipit-source-id: c2afdee9b914af89c7954c8e6a7d17a174df7ed1
2021-06-22 01:41:39 -07:00
Meyer Jacobs
43a75431bb scmstore: update additional test
Summary: Only four tests remaining after this.

Reviewed By: kulshrax

Differential Revision: D29229656

fbshipit-source-id: 56c0a17f6585263e983ce8bc3c345b1f266422e0
2021-06-21 20:32:50 -07:00
Meyer Jacobs
88ab7198bc scmstore: update more tests
Summary: Update more tests to avoid relying on pack files and legacy LFS, and override configs in `test-inconsistent-hash.t` to continue using pack files even after the scmstore rollout, in order to test Mononoke's response to corruption, which is not currently as easy with indexedlog.

Reviewed By: quark-zju

Differential Revision: D29229650

fbshipit-source-id: 11fe677fcecbb19acbefc9182b17062b8e1644d8
2021-06-21 20:32:50 -07:00
Andrew Gallagher
05cf7acd77 object-0.25.3: patch SHT_GNU_versym entsize fix
Summary:
Pull in a patch which fixes writing out an incorrect entsize for the
`SHT_GNU_versym` section:
ddbae72082

Reviewed By: igorsugak

Differential Revision: D29248208

fbshipit-source-id: 90bbaa179df79e817e3eaa846ecfef5c1236073a
2021-06-21 09:31:49 -07:00
Yan Soares Couto
73212fc9bf Return Arc instead of reference
Summary:
For context and high level goal, see: https://fb.quip.com/8zOkAQRiXGQ3

On RedactedBlobs, let's return an `Arc<HashMap>` instead of `&HashMap`.

This is not needed now, but when reloading information from configerator, we won't be able to return a reference, only a pointer.
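
A small sketch of why a reloadable store has to hand out an `Arc`. The struct below is a hypothetical stand-in that uses a `RwLock` and `String` values; the real `RedactedBlobs` and its reload mechanism are not shown in this message.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for a redaction store that can be reloaded.
struct RedactedBlobs {
    // The whole map is swapped on reload, so it lives behind a lock.
    current: RwLock<Arc<HashMap<String, String>>>,
}

impl RedactedBlobs {
    fn new(map: HashMap<String, String>) -> Self {
        Self { current: RwLock::new(Arc::new(map)) }
    }

    /// A `&HashMap` would borrow from the lock guard, which is released
    /// when this function returns. Cloning the `Arc` shares ownership instead.
    fn redacted(&self) -> Arc<HashMap<String, String>> {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Replace the map; callers holding an older `Arc` keep their snapshot.
    fn reload(&self, map: HashMap<String, String>) {
        *self.current.write().unwrap() = Arc::new(map);
    }
}
```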

Reviewed By: StanislavGlebik

Differential Revision: D28962040

fbshipit-source-id: 0848acc1a81a87c0b51d968efe31f61dacd57c47
2021-06-21 08:42:16 -07:00
Yan Soares Couto
f0a287580e Add wrapper around redacted blobs
Summary:
For context and high level goal, see: https://fb.quip.com/8zOkAQRiXGQ3

Instead of using `HashMap<String, RedactedMetadata>` everywhere, let's use a `Arc<RedactedBlobs>` object from which we can instead borrow a map. The borrow function is async because it will need to be when we're fetching from configerator, as it may need to rebuild the redaction data.

Wrapping it in `Arc` will also let the same data be reused across repos; I believe right now it's cloned everywhere.

In later diffs I'll use this enum to add a new way to fetch configs.

Reviewed By: markbt

Differential Revision: D28935506

fbshipit-source-id: befa96810ee7ebb9487f99f9e769a945981b58ed
2021-06-21 08:42:16 -07:00
Simon Farnsworth
23cd985c98 Add a tool to check working copy equivalence between git and Mononoke
Summary:
We're doing imports for AOSP megarepo work, and want a tool to quickly check that our imports are what we expect.

Use libgit2 and a simple LFS parser to read git SHA-256 entries, and FSNodes to get the Mononoke entries to match

Reviewed By: StanislavGlebik

Differential Revision: D29169743

fbshipit-source-id: 1ef1e2c780b8742c7fa5f15f9ee01bc0481a6543
2021-06-21 07:35:31 -07:00
Simon Farnsworth
e07bd8ab5a Fix up building of the test case for scrubbing
Summary: This is a minimal fix so that it builds, not enough to test the new bit, but enough to unbreak contbuild

Reviewed By: yancouto, HarveyHunt

Differential Revision: D29263246

fbshipit-source-id: c5430ff4bc885103664c33caca90af5819d97ddd
2021-06-21 07:29:25 -07:00
Alex Hornby
51ee68fa24 mononoke: improve couple of ifs in walker
Summary: Spotted this in passing. Save a DashMap lookup in the OldestFirst case by checking the enum first

Reviewed By: farnz

Differential Revision: D29232280

fbshipit-source-id: 72e93ee704767a42c36ffeec505fd79a22c4d88e
2021-06-21 02:37:24 -07:00
Stanislau Hlebik
fdaea05176 mononoke: log when derived data mapping was inserted
Summary:
At the moment we have a few ways of deriving data:
1) "normal", which is used by most of the mononoke code. In this case we insert
derived data mapping after all the data for a given derived data type was
safely saved.
2) "backfill", which is used when we are backfilling a lot of commits. In this case
we write all the data to an in-memory blobstore first, and only later save the data
to the real blobstore and then write the derived data mapping
3) "batch", when we derive data for a few commits at once. It can be combined
with "backfill" mode.

We also have a special scuba table for derived data derivation, however there
are a few problems with it.

Only "normal" mode has good and predictable logging i.e. it logs once before we
attempt to derive a commit, and once after commit was derived or failed.

"backfill" logs right after data for a given commit was "derived", however this is an in-memory
derivation, and at this point no data was saved to the blobstore.
So if backfill process crashes a bit later then commit might not be derived
after all, and it's impossible to tell it just by looking at the scuba table.

With "batch" mode it's even worse - we don't get any logs at all.

A bigger refactoring is needed here, because currently the process of
derivation is very hard to grok. But for now I suggest slightly improving
scuba logging by logging an event when a derived data mapping is actually written (or fails to be
written). After this diff we'll get the following:

1) "normal" mode will get three entries in scuba table in this order: derivation start,
mapping written, derivation end,
2) "backfill" mode will also get three entries in scuba table but in a different
order: derivation start, derivation end, mapping written
3) "batch" mode will get one entry for writing the mapping. Not great, but
better than nothing!
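
The resulting log sequences can be written down as data. The enum and function names below are hypothetical; only the orderings come from the list above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Normal,
    Backfill,
    Batch,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Event {
    DerivationStart,
    DerivationEnd,
    MappingWritten,
}

/// Scuba entries expected for one successfully derived commit.
fn expected_events(mode: Mode) -> Vec<Event> {
    match mode {
        Mode::Normal => vec![
            Event::DerivationStart,
            Event::MappingWritten,
            Event::DerivationEnd,
        ],
        Mode::Backfill => vec![
            Event::DerivationStart,
            Event::DerivationEnd,
            Event::MappingWritten,
        ],
        Mode::Batch => vec![Event::MappingWritten],
    }
}

/// In every mode, a commit counts as durably derived only once the
/// mapping-written entry appears.
fn is_durably_derived(events: &[Event]) -> bool {
    events.contains(&Event::MappingWritten)
}
```

A backfill run that crashes after logging derivation end but before the mapping write is then distinguishable: its log lacks the mapping-written entry.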

Reviewed By: farnz

Differential Revision: D29231404

fbshipit-source-id: 2c601e7dc58c00e22fda1ddd542833a818d1d023
2021-06-21 01:19:52 -07:00
Stanislau Hlebik
222352e0a5 mononoke: move derived data logging code to a separate file
Summary: Just moving code around a bit to make the derive_impl file a bit smaller

Reviewed By: farnz

Differential Revision: D29231405

fbshipit-source-id: c923f42710f4be98147bc58d5b828d5d6c7bf1a6
2021-06-21 01:19:52 -07:00
Simon Farnsworth
3404fb6b66 New manual_scrub mode for checking that a write-mostly store is populated
Summary:
I'm seeing significant Zippy load when I do a check scrub of our big repo to make sure that it's all in SQL Blobstore as well as our main blob stores.

Teach scrub to not bother talking to the main blobstores unless the write-mostly blobstore is either missing the data or unable to retrieve it.
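
A simplified sketch of the read order, using in-memory maps as stand-ins for blobstores. The names, the read counter, and the repair step that copies a found blob into the write-mostly store are assumptions for illustration; the real scrub logic is more involved.

```rust
use std::collections::HashMap;

type Store = HashMap<String, Vec<u8>>;

#[derive(Debug, PartialEq)]
enum ScrubOutcome {
    AlreadyPresent,
    Repaired,
    MissingEverywhere,
}

/// Check one key. `main_reads` counts how often a main store was consulted.
fn scrub_key(
    key: &str,
    write_mostly: &mut Store,
    main_stores: &[Store],
    main_reads: &mut usize,
) -> ScrubOutcome {
    // Fast path: the write-mostly store has the blob, so the main stores
    // are never touched.
    if write_mostly.contains_key(key) {
        return ScrubOutcome::AlreadyPresent;
    }
    // Slow path: fall back to the main stores only when needed.
    for store in main_stores {
        *main_reads += 1;
        if let Some(value) = store.get(key) {
            // Assumed repair step: copy the blob into the write-mostly store.
            write_mostly.insert(key.to_string(), value.clone());
            return ScrubOutcome::Repaired;
        }
    }
    ScrubOutcome::MissingEverywhere
}
```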

Reviewed By: ahornby

Differential Revision: D29233349

fbshipit-source-id: 1127129ff283477558cddb03686c3c13aee47fb5
2021-06-18 10:26:22 -07:00
Aida Getoeva
b340165c59 mononoke/eden: reduce the number of ODS timeseries
Summary: We have over [17M timeseries](https://www.internalfb.com/intern/ods/category?cat_id=1475&selection=timeseries) now with the [edenapi far ahead](https://fburl.com/scuba/gorilla_keys/yurnzsfi). Let's not group the timeseries by repo name, as it's not very useful (we can look into Scuba for more details), and remove some of the percentiles.

Reviewed By: ahornby

Differential Revision: D29196854

fbshipit-source-id: 0158fe9e9526fb3db35a4ac6234bf580cbd6805b
2021-06-18 04:16:59 -07:00
Andres Suarez
845128485c Update bytecount
Reviewed By: dtolnay

Differential Revision: D29213998

fbshipit-source-id: 92e7a9de9e3d03f04b92a77e16fa0e37428fe2fb
2021-06-17 19:50:32 -07:00
Davide Cavalca
b82c5672fc Update several rust crate versions
Summary: Update versions for several of the crates we depend on.

Reviewed By: danobi

Differential Revision: D29165283

fbshipit-source-id: baaa9fa106b7dad000f93d2eefa95867ac46e5a1
2021-06-17 16:38:19 -07:00
Liubov Dmitrieva
1b818d114d add an option to pass some metadata in the token
Summary:
add an option to pass some metadata in the token

This will be used for content tokens, for example. We would like to guarantee that the specific content has been uploaded and it had the specific length. This will be used for hg filenodes upload.

Reviewed By: markbt

Differential Revision: D29136295

fbshipit-source-id: 2fbd3917ee0a55f43216351fdbc1a6686eb80176
2021-06-17 08:22:33 -07:00
Liubov Dmitrieva
500a232716 implement upload of file content into blobstore
Summary:
upload file content into blobstore

the existing Mononoke API already validates the provided hashes and calculates the missing one

we would probably need to write to all multiplexed blobstores, but multiplexing will be addressed separately

Reviewed By: markbt

Differential Revision: D29103111

fbshipit-source-id: 0cac837efc238f618a35420523279fb7aa91668a
2021-06-17 08:22:33 -07:00
Alex Hornby
8f2b3a8a9d mononoke: sqlblob allow inline mysql puts
Summary: Allow puts to sqlblob with mysql backing to use the InlineBase64 hash type.

Reviewed By: farnz

Differential Revision: D28829452

fbshipit-source-id: 265cf45e55284d34d3002a9db205e14eaee4fa39
2021-06-17 07:26:45 -07:00
Stanislau Hlebik
fbc07cb4c3 mononoke: make chunk size configurable in regenerate_filenodes binary
Summary:
It's useful to have it configurable.
While here, also use slog instead of println so that timestamps are attached as well

Reviewed By: Croohand

Differential Revision: D29165693

fbshipit-source-id: d844926560b15042445d5861a281870ac102d12e
2021-06-17 03:07:24 -07:00
Thomas Orozco
b170b80412 mononoke: add an --oncall argument to megarepo bind commits
Summary:
Like it says in the title. Let's allow specifying an oncall here since that
oncall will be tasked with retroactive review of the commit.

Reviewed By: StanislavGlebik

Differential Revision: D29162534

fbshipit-source-id: 9ed3ac43c38a1120bb16a2f5b5218fdbf80e0d47
2021-06-16 08:50:52 -07:00