Summary:
Previously this wasn't possible because the symlink target was the key in the map that
mega_grepo_sync sends to scs, so we couldn't have two different symlinks
for the same symlink target. However, we actually need this: some AOSP repos
have different symlink sources that point to the same symlink target.
This diff fixes it by swapping the key and value in the `linkfiles` map.
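To illustrate why the key choice matters, here is a minimal sketch. It assumes `linkfiles` is a plain string-to-string map; the real types sent to scs are not shown here, and both function names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Old layout (assumed): keyed by the symlink target, so a second link to the
/// same target silently overwrites the first one.
fn build_keyed_by_target(links: &[(&str, &str)]) -> BTreeMap<String, String> {
    let mut map = BTreeMap::new();
    for (link_path, target) in links {
        map.insert(target.to_string(), link_path.to_string());
    }
    map
}

/// New layout (assumed): keyed by the symlink path, which is unique, so any
/// number of links may share one target.
fn build_keyed_by_link(links: &[(&str, &str)]) -> BTreeMap<String, String> {
    let mut map = BTreeMap::new();
    for (link_path, target) in links {
        map.insert(link_path.to_string(), target.to_string());
    }
    map
}
```

With two links pointing at the same target, the first layout keeps only one entry while the second keeps both.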
Differential Revision: D29359634
fbshipit-source-id: da74d6e934350822d82d2135ab06c754824525c9
Summary:
This just updates the os_info crate to my fork, which includes a fix for CentOS
Stream: https://github.com/stanislav-tkach/os_info/pull/267
Reviewed By: quark-zju
Differential Revision: D29410043
fbshipit-source-id: 3642e704f5a056e75fee4421dc59020fde13ed5e
Summary: I think someone landed a dependency change or something and forgot to update autocargo
Reviewed By: dtolnay
Differential Revision: D29402335
fbshipit-source-id: e9a4906bf249470351c2984ef64dfba9daac8891
Summary: Add an option to allow manually forcing EdenAPI to be enabled or disabled. This is useful in a variety of cases, such as bypassing the normal EdenAPI activation logic in tests, or to forcibly disable EdenAPI in cases where it isn't working correctly.
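As a rough illustration of such an override, the sketch below models the setting as a tri-state value. The function and parameter names are hypothetical and are not taken from the actual config; only the idea that a forced value takes precedence over the normal activation logic comes from the summary.

```rust
/// Hypothetical helper: `forced` models the manual override, and
/// `auto_detected` stands in for the result of the normal activation logic.
fn edenapi_enabled(forced: Option<bool>, auto_detected: bool) -> bool {
    // An explicit override always wins; otherwise fall back to detection.
    forced.unwrap_or(auto_detected)
}
```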
Differential Revision: D29377923
fbshipit-source-id: f408efe2a46ef3f1bd2914669310c3445c7d4121
Summary:
When diffing a changeset with its parents, if a file is copied to multiple places, then we should include all of those copies in the diff.
Furthermore, if the file is also removed, then the *first* of those copies
should be considered a move. Note that "first" here means the first in the
lexicographic ordering of the repository manifest.
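A minimal sketch of this rule, assuming copies are given as a destination-to-source map and that byte-wise string order is a good enough stand-in for manifest order. The `Change` type and `classify` function are hypothetical, not the actual API.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, PartialEq)]
enum Change {
    Copied { from: String, to: String },
    Moved { from: String, to: String },
}

/// `copies` maps each destination path to the source it was copied from.
/// `removed` holds the paths deleted in the same changeset.
fn classify(copies: &BTreeMap<String, String>, removed: &BTreeSet<String>) -> Vec<Change> {
    // Sources that already had one of their copies reported as a move.
    let mut move_assigned = BTreeSet::new();
    let mut changes = Vec::new();
    // BTreeMap iterates destinations in lexicographic order, so the first
    // destination seen for a removed source becomes the move.
    for (to, from) in copies {
        if removed.contains(from) && move_assigned.insert(from.clone()) {
            changes.push(Change::Moved { from: from.clone(), to: to.clone() });
        } else {
            changes.push(Change::Copied { from: from.clone(), to: to.clone() });
        }
    }
    changes
}
```

Every copy is reported, and at most one destination per removed source is reported as a move.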
Reviewed By: liubov-dmitrieva
Differential Revision: D29359516
fbshipit-source-id: eeed630c2e4d20f3fb8c923611a0433c74fd25d0
Summary:
A `new` constructor isn't necessary because it's identical to just
`TypeName`. Now that a user-provided constructor can be included, it occupies
valuable namespace.
#forcetdhashing
Reviewed By: krallin
Differential Revision: D29387037
fbshipit-source-id: 7de343c13842c74772f7eca83ddd7019e1040c5c
Summary: The returned value now includes roots. Rename the function to clarify.
Reviewed By: kulshrax
Differential Revision: D29383072
fbshipit-source-id: 02a255ce20d9797f482f6fe1c716f2d79a12d4e0
Summary:
1) It turned out that AOSP manifests can contain paths that are not prefix-free, so
we have to remove this check for now.
2) Also verify the config earlier so that we can return an error to the user
faster.
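For reference, "prefix-free" in item 1 means that no path in the set is an ancestor of another. The sketch below shows one way such a check could look; it is not the removed code, and `find_prefix_conflict` is a hypothetical name.

```rust
use std::path::Path;

/// Returns the first pair (ancestor, descendant) found, or None if the set is
/// prefix-free. Comparison is per path component, so "a/b" is a prefix of
/// "a/b/c" but not of "a/bc". Quadratic, which is fine for a sketch.
fn find_prefix_conflict<'a>(paths: &[&'a str]) -> Option<(&'a str, &'a str)> {
    for (i, a) in paths.iter().enumerate() {
        for (j, b) in paths.iter().enumerate() {
            if i != j && Path::new(b).starts_with(Path::new(a)) {
                return Some((*a, *b));
            }
        }
    }
    None
}
```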
Differential Revision: D29335602
fbshipit-source-id: 3dd72d63a370515eca5d356b3b98bb2ac2245aee
Summary:
When we do a pushrebase, the changesets sent to us by the client are rebased and get new hashes, which are not available in mononoke_test_perf at the moment.
Let's log the rebased changeset_id.
Reviewed By: Croohand
Differential Revision: D29362816
fbshipit-source-id: bebab24b12de1be9a9b81502453fcf44444f94b5
Summary: There is a regression in 1.7.0 (which we're on at the moment) so we might as well update.
Reviewed By: zertosh, farnz
Differential Revision: D29358047
fbshipit-source-id: 226393d79c165455d27f7a09b14b40c6a30d96d3
Summary:
This adds the blob object RedactionKeyList, which just contains a list of Strings, each of which will be a key to be redacted.
This will be stored on the blobstore, while a key to this object will be stored in configerator.
Some stuff that might be worth discussing:
- This class just holds a list of strings, per se it doesn't have much to do with redaction. If we want to change this to a more generic object like `KeyList`, I'm happy to do it. By default I'll leave it like this.
- I used serde (more precisely, JSON) to (de)serialise it. The only reason was that I wanted to keep this as simple as possible: from what I see, other objects need to define a thrift struct with the same config and then write `into/from_thrift` implementations. If preferred, I can do that.
It's not used in this diff but will be used in the future; I split it out mostly to make it easier to review.
Reviewed By: markbt
Differential Revision: D29033597
fbshipit-source-id: 5550dbf58c5214201b739f8150fd06471bd67ab8
Summary: This is required to make sure segmented changelog has all the data needed
Reviewed By: quark-zju
Differential Revision: D29347285
fbshipit-source-id: 82ee1ffca178492b7ad363c53cee7ec57058733f
Summary:
Add git LFS support to gitimport and grepo branch_forest.
I did not want to add the parsing of .gitattributes and .lfsconfig to the gitimport library. This needs to be done by the users of gitimport before the import is started, and the GitImportLfs object needs to be configured accordingly. Currently we are extracting this data from the manifest files for the "g"repo imports.
I am not sure the simple git-lfs download client works with git-lfs server back ends other than Dewey, but it is a fairly simple implementation and it should be easy to extend to be more generic.
Reviewed By: farnz
Differential Revision: D29082867
fbshipit-source-id: a7b0272147b3d44a0b6b9782d2a1b8ec94653b8f
Summary: It's useful to be able to copy multiple dirs at once
Reviewed By: markbt
Differential Revision: D29358375
fbshipit-source-id: f1cc351195cc2c19de36a1b6936b598e314848c3
Summary:
Previously only conversion between bonsai and hg was supported. Let's add git
as well.
Obviously you can use `scsc lookup`, but mononoke_admin can be useful for repos
that are not on scs yet.
Reviewed By: farnz
Differential Revision: D29360793
fbshipit-source-id: eb2b71eab192b3456ba3d580f7eb8c4a85b2fd1d
Summary: Very simple refactor. This logic was already used twice and I will use it another time in following diffs.
Reviewed By: markbt
Differential Revision: D29033594
fbshipit-source-id: 96040a2eee2b58f6851646e51b67c46c6bf334fe
Summary:
Implement get and put for the ephemeral blobstore. This allows blobs to
be stored and retrieved in bubbles.
Ephemeral bubbles always have a repo associated with them when they are opened,
to simplify blob prefixing. It is valid for a bubble id to have multiple repos
associated with it, but they must be accessed separately, and in practice this
won't be used.
Reviewed By: StanislavGlebik
Differential Revision: D29067722
fbshipit-source-id: d870f695fc1d0c825fdaec9337c82a13209165ce
Summary:
Extend metaconfig to include configuration for the ephemeral blobstore.
An ephemeral blobstore is optional: repos without an ephemeral blobstore cannot
store ephemeral commits or snapshots.
Reviewed By: StanislavGlebik
Differential Revision: D29067719
fbshipit-source-id: fe7d42173d5c34a937c99c72f4b2bd08af503889
Summary:
Packblob currently expects key prefixes of the form `repoNNNN.` to be stripped, but also allows keys without this prefix. For the ephemeral blobstore we want to allow prefixes of the form `ephXXX.repoNNNN.` as well.
Generalise packblob so that we can have multiple key prefixes.
Packblob will enforce that none of the blobs in the packblob have a prefix that matches any of the patterns - this will prevent us from accidentally storing `repoNNNN.`-prefixed blobs in an ephemeral blobstore that requires `ephXXX.repoNNNN.` prefixes, for example.
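A rough sketch of the stripping and enforcement described above. It assumes that both `NNNN` and `XXX` are runs of ASCII digits, which the summary does not state, and every function name here is hypothetical rather than taken from packblob.

```rust
/// Strips `literal`, then one or more ASCII digits, then a `.` from the front
/// of `key`. The digits-only id format is an assumption.
fn strip_numbered<'a>(key: &'a str, literal: &str) -> Option<&'a str> {
    let rest = key.strip_prefix(literal)?;
    let digits = rest.bytes().take_while(|b| b.is_ascii_digit()).count();
    if digits == 0 {
        return None;
    }
    let tail = &rest[digits..];
    tail.strip_prefix('.')
}

/// Matches the `repoNNNN.` pattern.
fn strip_repo(key: &str) -> Option<&str> {
    strip_numbered(key, "repo")
}

/// Matches the `ephXXX.repoNNNN.` pattern.
fn strip_ephemeral(key: &str) -> Option<&str> {
    strip_numbered(key, "eph").and_then(strip_repo)
}

/// Tries every known pattern and leaves unprefixed keys untouched.
fn strip_known_prefix(key: &str) -> &str {
    strip_ephemeral(key)
        .or_else(|| strip_repo(key))
        .unwrap_or(key)
}

/// Rejects keys inside a pack that still carry any known prefix.
fn check_packed_key(key: &str) -> Result<(), String> {
    if strip_ephemeral(key).is_some() || strip_repo(key).is_some() {
        Err(format!("packed key still has a store prefix: {}", key))
    } else {
        Ok(())
    }
}
```

Checking every pattern on the way in is what stops a `repoNNNN.`-prefixed key from ending up in a store that expects the longer ephemeral prefix.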
Reviewed By: liubov-dmitrieva
Differential Revision: D29067720
fbshipit-source-id: 953909d47c9c4af91b529bcc684340d26411463d
Summary: Make it clearer which of the TailParams are only required when chunking, removing parallel Option<> so that all items that should be set together are inside one optional item.
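The shape of this refactor can be sketched as follows. The field names, the `Direction` enum and `group_chunking` are hypothetical stand-ins; only the idea of folding parallel `Option<>` fields into one optional group comes from the summary.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Direction {
    NewestFirst,
    OldestFirst,
}

/// Everything that is only meaningful when chunking lives in one struct.
#[derive(Debug, PartialEq)]
struct ChunkingParams {
    chunk_size: usize,
    direction: Direction,
}

/// Hypothetical stand-in for TailParams: one optional group instead of
/// several parallel `Option<>` fields.
struct TailParams {
    chunking: Option<ChunkingParams>,
}

/// Converts the old parallel-option layout, rejecting the half-set states
/// that the new layout cannot even represent.
fn group_chunking(
    chunk_size: Option<usize>,
    direction: Option<Direction>,
) -> Result<TailParams, String> {
    let chunking = match (chunk_size, direction) {
        (Some(chunk_size), Some(direction)) => Some(ChunkingParams { chunk_size, direction }),
        (None, None) => None,
        _ => return Err("chunk_size and direction must be set together".to_string()),
    };
    Ok(TailParams { chunking })
}
```

Once the group is a single `Option`, code that reads it no longer needs to handle the case where only some of the chunking settings are present.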
Reviewed By: farnz
Differential Revision: D29264647
fbshipit-source-id: d64cddf94b35e62d6e50cd8afe906eef2444c730
Summary: Makes defer_visit return result, so we can detect if it is called when not chunking.
Reviewed By: farnz
Differential Revision: D29268346
fbshipit-source-id: b8ea503c2848adb5d7ca3fb0e61399be2930c3de
Summary: This is roughly similar to the algorithm in NameDag
Reviewed By: quark-zju
Differential Revision: D29318721
fbshipit-source-id: 51a9123daa2b4cf0fbe2346a8a0c7e75172d9afb
Summary: The naming is used in other parts of the dag crate. This introduces Mononoke-side bindings for the corresponding functions on the dag side.
Reviewed By: quark-zju
Differential Revision: D29318722
fbshipit-source-id: e9eea5536b041b6ab2ce578914817bca43a10d48
Summary:
The path should be relative to the symlink path, not to the repo root. This diff
fixes it.
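One way to compute such a path is sketched below. It assumes both inputs are repo-root-relative paths without `.` or `..` components, and that "relative to the symlink path" means relative to the directory containing the symlink. `relative_symlink_target` is a hypothetical helper, not the function changed in this diff.

```rust
use std::path::{Component, Path, PathBuf};

/// Rewrites a repo-root-relative `target` so that it resolves correctly when
/// stored in a symlink located at the repo-root-relative path `link`.
fn relative_symlink_target(link: &Path, target: &Path) -> PathBuf {
    // Components of the directory that contains the symlink.
    let link_dir: Vec<Component> = link
        .parent()
        .map(|p| p.components().collect::<Vec<_>>())
        .unwrap_or_default();
    let target_parts: Vec<Component> = target.components().collect();

    // Length of the shared leading directory prefix.
    let common = link_dir
        .iter()
        .zip(&target_parts)
        .take_while(|(a, b)| a == b)
        .count();

    let mut result = PathBuf::new();
    // Climb out of the directories that are not shared...
    for _ in common..link_dir.len() {
        result.push("..");
    }
    // ...then descend into the rest of the target.
    for part in &target_parts[common..] {
        result.push(part.as_os_str());
    }
    result
}
```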
Reviewed By: farnz
Differential Revision: D29327682
fbshipit-source-id: a51161a8039a88263fe941562f2c2134aa5d4fef
Summary: Update the remaining tests for scmstore. In each of these cases we're just disabling scmstore for various reasons. I think the failures in `test-lfs-bundle.t` and `test-lfs.t` represent a legitimate issue with scmstore's contentstore fallback, but I don't think it should block the rollout.
Reviewed By: kulshrax
Differential Revision: D29289515
fbshipit-source-id: 10d055bf679db8efdeb16ac96b7ed597d7b6d82c
Summary:
This is a followup to D28903515 (9a3fbfe311), where we added support for reusing
hg filenodes if a parent has the same filenode. However, we weren't reusing
manifests even if a parent has an identical manifest, and this diff adds
support for doing so.
There's one caveat: we try to reuse parent manifests only if there is more
than one parent manifest. See the explanation in the comments.
Reviewed By: farnz
Differential Revision: D29098908
fbshipit-source-id: 5ecfdc4b022ffc7620501cc024e7a659fb82f768
Summary:
In the walker, an Option<NodeData> value of None is used to indicate that no data could be found for a node, and that for derived data mappings we should try again to load it later, when it may have been derived.
When a node is outside the chunk boundary this isn't appropriate: we should just mark it as visited and move on, which is what this change does.
Reviewed By: farnz
Differential Revision: D29230223
fbshipit-source-id: c2afdee9b914af89c7954c8e6a7d17a174df7ed1
Summary: Only four tests remaining after this.
Reviewed By: kulshrax
Differential Revision: D29229656
fbshipit-source-id: 56c0a17f6585263e983ce8bc3c345b1f266422e0
Summary: Update more tests to avoid relying on pack files and legacy LFS, and override configs in `test-inconsistent-hash.t` to continue using pack files even after the scmstore rollout, so that we can test Mononoke's response to corruption, which is not currently as easy with indexedlog.
Reviewed By: quark-zju
Differential Revision: D29229650
fbshipit-source-id: 11fe677fcecbb19acbefc9182b17062b8e1644d8
Summary:
Pull in a patch which fixes writing out an incorrect entsize for the
`SHT_GNU_versym` section:
ddbae72082
Reviewed By: igorsugak
Differential Revision: D29248208
fbshipit-source-id: 90bbaa179df79e817e3eaa846ecfef5c1236073a
Summary:
For context and high level goal, see: https://fb.quip.com/8zOkAQRiXGQ3
On RedactedBlobs, let's return an `Arc<HashMap>` instead of `&HashMap`.
This is not needed now, but when reloading information from configerator, we won't be able to return a reference, only a pointer.
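The sketch below shows why a shared pointer fits better once the data can be reloaded. It is only an illustration: `ReloadableRedaction` is a hypothetical type, the `String` values stand in for the real metadata, and the `RwLock` is an assumed way to swap the map.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical holder for redaction data that can be swapped at runtime.
struct ReloadableRedaction {
    current: RwLock<Arc<HashMap<String, String>>>,
}

impl ReloadableRedaction {
    fn new(initial: HashMap<String, String>) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    /// A `&HashMap` could not be returned here, because it would borrow from
    /// the lock guard that is dropped when this function returns. Cloning the
    /// `Arc` is cheap and gives the caller a stable snapshot.
    fn redacted(&self) -> Arc<HashMap<String, String>> {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Replaces the map; snapshots handed out earlier stay valid.
    fn reload(&self, fresh: HashMap<String, String>) {
        *self.current.write().unwrap() = Arc::new(fresh);
    }
}
```

A caller that fetched the map before a reload keeps reading the old snapshot, while later callers see the new one.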
Reviewed By: StanislavGlebik
Differential Revision: D28962040
fbshipit-source-id: 0848acc1a81a87c0b51d968efe31f61dacd57c47
Summary:
For context and high level goal, see: https://fb.quip.com/8zOkAQRiXGQ3
Instead of using `HashMap<String, RedactedMetadata>` everywhere, let's use an `Arc<RedactedBlobs>` object from which we can borrow a map. The borrow function is async because it will need to be when we're fetching from configerator, as it may need to rebuild the redaction data.
Wrapping it in `Arc` will also make it reuse the same object across repos; I believe right now it's cloned everywhere.
In later diffs I'll use this enum to add a new way to fetch configs.
Reviewed By: markbt
Differential Revision: D28935506
fbshipit-source-id: befa96810ee7ebb9487f99f9e769a945981b58ed
Summary:
We're doing imports for AOSP megarepo work, and want a tool to quickly check that our imports are what we expect.
Use libgit2 and a simple LFS parser to read git SHA-256 entries, and FSNodes to get the Mononoke entries to match.
Reviewed By: StanislavGlebik
Differential Revision: D29169743
fbshipit-source-id: 1ef1e2c780b8742c7fa5f15f9ee01bc0481a6543
Summary: This is a minimal fix so that it builds, not enough to test the new bit, but enough to unbreak contbuild
Reviewed By: yancouto, HarveyHunt
Differential Revision: D29263246
fbshipit-source-id: c5430ff4bc885103664c33caca90af5819d97ddd
Summary: Spotted this in passing. Save a DashMap lookup in the OldestFirst case by checking the enum first
Reviewed By: farnz
Differential Revision: D29232280
fbshipit-source-id: 72e93ee704767a42c36ffeec505fd79a22c4d88e
Summary:
At the moment we have a few ways of deriving data:
1) "normal", which is used by most of the Mononoke code. In this case we insert
the derived data mapping after all the data for a given derived data type has been
safely saved.
2) "backfill", which is used when we are backfilling a lot of commits. In this case
we write all the data to an in-memory blobstore first, and only later save the data
to the real blobstore and then write the derived data mapping.
3) "batch", when we derive data for a few commits at once. It can be combined
with "backfill" mode.
We also have a special scuba table for derived data derivation; however, there
are a few problems with it.
Only "normal" mode has good and predictable logging, i.e. it logs once before we
attempt to derive a commit, and once after the commit was derived or failed.
"backfill" logs right after data for a given commit was "derived"; however, this is an in-memory
derivation, and at this point no data has been saved to the blobstore.
So if the backfill process crashes a bit later, the commit might not be derived
after all, and it's impossible to tell just by looking at the scuba table.
With "batch" mode it's even worse: we don't get any logs at all.
A bigger refactoring is needed here, because currently the process of
derivation is very hard to grok. But for now I suggest slightly improving
scuba logging by also logging an event when a derived data mapping was actually written (or failed to be
written). After this diff we'll get the following:
1) "normal" mode will get three entries in the scuba table in this order: derivation start,
mapping written, derivation end.
2) "backfill" mode will also get three entries in the scuba table, but in a different
order: derivation start, derivation end, mapping written.
3) "batch" mode will get one entry for writing the mapping. Not great, but
better than nothing!
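The resulting event order per mode can be written down as a small model. This is only a sketch: `Mode`, `Event` and both functions are hypothetical names, and the real scuba rows carry more fields than this.

```rust
#[derive(Clone, Copy)]
enum Mode {
    Normal,
    Backfill,
    Batch,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Event {
    DerivationStart,
    DerivationEnd,
    MappingWritten,
}

/// Events expected in the scuba table for one successfully derived commit.
fn expected_events(mode: Mode) -> Vec<Event> {
    match mode {
        Mode::Normal => vec![
            Event::DerivationStart,
            Event::MappingWritten,
            Event::DerivationEnd,
        ],
        Mode::Backfill => vec![
            Event::DerivationStart,
            Event::DerivationEnd,
            Event::MappingWritten,
        ],
        Mode::Batch => vec![Event::MappingWritten],
    }
}

/// In every mode, a logged mapping write is the signal that the commit was
/// really derived, which "derivation end" alone cannot show for backfill.
fn mapping_was_written(events: &[Event]) -> bool {
    events.contains(&Event::MappingWritten)
}
```

For example, a backfill that crashes before saving would leave only the first two events, so `mapping_was_written` would return false for that commit.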
Reviewed By: farnz
Differential Revision: D29231404
fbshipit-source-id: 2c601e7dc58c00e22fda1ddd542833a818d1d023
Summary: Just moving some code around to make the derive_impl file a bit smaller
Reviewed By: farnz
Differential Revision: D29231405
fbshipit-source-id: c923f42710f4be98147bc58d5b828d5d6c7bf1a6
Summary:
I'm seeing significant Zippy load when I do a check scrub of our big repo to make sure that it's all in SQL Blobstore as well as our main blob stores.
Teach scrub to not bother talking to the main blobstores unless the write-mostly blobstore is either missing the data or unable to retrieve it.
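The read path described above can be sketched like this. `MemStore` and `scrub_get` are hypothetical in-memory stand-ins used only to show the ordering of reads; the real scrub code is async and does more than this.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Hypothetical in-memory store that counts how often it is read.
struct MemStore {
    blobs: HashMap<String, Vec<u8>>,
    healthy: bool,
    reads: Cell<usize>,
}

impl MemStore {
    fn new(entries: &[(&str, &str)], healthy: bool) -> Self {
        let blobs = entries
            .iter()
            .map(|(k, v)| (k.to_string(), v.as_bytes().to_vec()))
            .collect();
        Self { blobs, healthy, reads: Cell::new(0) }
    }

    fn get(&self, key: &str) -> Result<Option<Vec<u8>>, String> {
        self.reads.set(self.reads.get() + 1);
        if !self.healthy {
            return Err("store unavailable".to_string());
        }
        Ok(self.blobs.get(key).cloned())
    }
}

/// Asks the write-mostly store first and only falls back to the main stores
/// when the blob is missing there or the read fails.
fn scrub_get(write_mostly: &MemStore, main: &[MemStore], key: &str) -> Option<Vec<u8>> {
    if let Ok(Some(blob)) = write_mostly.get(key) {
        return Some(blob);
    }
    main.iter().find_map(|store| store.get(key).ok().flatten())
}
```

When the write-mostly store already has the blob, the main stores see no reads at all, which is where the load reduction comes from.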
Reviewed By: ahornby
Differential Revision: D29233349
fbshipit-source-id: 1127129ff283477558cddb03686c3c13aee47fb5
Summary: Update versions for several of the crates we depend on.
Reviewed By: danobi
Differential Revision: D29165283
fbshipit-source-id: baaa9fa106b7dad000f93d2eefa95867ac46e5a1
Summary:
Add an option to pass some metadata in the token.
This will be used for content tokens, for example. We would like to guarantee that specific content has been uploaded and that it had a specific length. This will be used for hg filenodes upload.
Reviewed By: markbt
Differential Revision: D29136295
fbshipit-source-id: 2fbd3917ee0a55f43216351fdbc1a6686eb80176
Summary:
Upload file content into the blobstore.
The existing Mononoke API already validates the provided hashes and calculates the missing one.
We would probably need to write to all multiplexed blobstores, but multiplexing will be addressed separately.
Reviewed By: markbt
Differential Revision: D29103111
fbshipit-source-id: 0cac837efc238f618a35420523279fb7aa91668a
Summary: Allow puts to sqlblob with mysql backing to use the InlineBase64 hash type.
Reviewed By: farnz
Differential Revision: D28829452
fbshipit-source-id: 265cf45e55284d34d3002a9db205e14eaee4fa39
Summary:
It's useful to have it configurable.
While here, also use slog instead of println so that a timestamp is attached as well.
Reviewed By: Croohand
Differential Revision: D29165693
fbshipit-source-id: d844926560b15042445d5861a281870ac102d12e
Summary:
Like it says in the title. Let's allow specifying an oncall here since that
oncall will be tasked with retroactive review of the commit.
Reviewed By: StanislavGlebik
Differential Revision: D29162534
fbshipit-source-id: 9ed3ac43c38a1120bb16a2f5b5218fdbf80e0d47