Commit Graph

539 Commits

Stanislau Hlebik
3874bb6de0 mononoke: support stacks in sync job bundle generator
Summary:
Previously it was possible to sync just a single commit - this diff adds
support for generating bundles with stacks of commits.

It uses DifferenceOfUnionsOfAncestorsNodeStream which is exactly the same
stream we use in Mononoke to serve `hg pull` requests.

Reviewed By: krallin

Differential Revision: D17498873

fbshipit-source-id: d94066ad0f0cbdd44fd958c57739de1e070eec9f
2019-09-24 05:29:42 -07:00
Stanislau Hlebik
d471529aec mononoke: first diff for the sync job that generates bundles (only single commit)
Summary:
This is the very first and very limited implementation of a new mode of the
sync job, in which bundles are generated by the sync job instead of reusing
the bundles pushed by the client.

At the moment it syncs only a single commit; more functionality has to be added
later.

A few notes:
1) We need to manually generate the replycaps part. Mononoke essentially ignores it
(with a few small exceptions), but Mercurial actually pays attention to it and
generates a different response.
At the moment reply capabilities are hard-coded in the sync job, which might be
problematic if the Mercurial client changes its replycaps but the sync job hasn't
been updated.
There are a few protections against this: we can make replycaps dynamic (for
example, specify them in the tw spec), and we can also rely on integration tests
to catch regressions.

2) create_filenode_entries_stream has complicated logic for renames - please
have a look at it to double-check my understanding!

Reviewed By: krallin

Differential Revision: D17477983

fbshipit-source-id: fdb4584e768bdc5637868468a035c81f1584f7fe
2019-09-24 05:29:42 -07:00
Kostia Balytskyi
7afc953929 mononoke: allow tests to work with multiple mononoke repos
Reviewed By: farnz

Differential Revision: D17499277

fbshipit-source-id: 99c3e624cea855a6984b0a3c1d991d16e13c64a3
2019-09-23 07:16:15 -07:00
Kostia Balytskyi
d7640811c6 mononoke: make all binaries in library.sh use --repo-id
Reviewed By: StanislavGlebik

Differential Revision: D17500362

fbshipit-source-id: c7e167e99313547e98a54256a28f33a184d0702d
2019-09-20 08:26:22 -07:00
Kostia Balytskyi
d1bb74ea06 mononoke: make megarepotool capable of pushing to bookmarks
Reviewed By: krallin

Differential Revision: D17448128

fbshipit-source-id: ca5bd6f90ce370806ab23120d3a34567bedfae47
2019-09-19 03:33:54 -07:00
Kostia Balytskyi
13cde788a4 mononoke: refactor megarepotool
Reviewed By: farnz

Differential Revision: D17448129

fbshipit-source-id: ec8231f6174c3db811668bac6a22007d6ad567ff
2019-09-19 03:33:53 -07:00
Kostia Balytskyi
126fe17a2f megarepotool: consume bookmarks as source arguments for move
Reviewed By: farnz

Differential Revision: D17448127

fbshipit-source-id: 7ff57fd3fff510bd293b818bfa027e0e37f7abf2
2019-09-19 03:33:53 -07:00
Kostia Balytskyi
e972a82545 mononoke: migrate megarepotool move subcommand to core mononoke movers
Summary:
Rather than using hardcoded path maps and ad-hoc movers, let's use the
logic that creates movers from the config.

Reviewed By: farnz

Differential Revision: D17424678

fbshipit-source-id: 64a0a0b1c7332661408444a6d81f5931ed680c3c
2019-09-19 03:33:53 -07:00
Simon Farnsworth
e20c051ac5 Forbid forced pushrebase where the commit contains mutation markers
Summary: Mercurial and Mononoke disagree on the meaning of a forced pushrebase - Mercurial does a partial rewrite, Mononoke does not. Handle one of the differences by rejecting a commit that Mercurial would rewrite. The user can get identical behaviour by rewriting their commit, anyway.

Reviewed By: StanislavGlebik

Differential Revision: D17420573

fbshipit-source-id: 8332c8f86e56bbcfb00e0bf126b78e11b63d83c3
2019-09-17 09:09:20 -07:00
David Tolnay
713973975d Replace *fbinit::FACEBOOK with #[fbinit::main]
Summary:
This diff moves initFacebook calls that used to happen just before FFI calls to instead happen at the beginning of main.

The basic assumption of initFacebook is that it happens at the beginning of main before there are additional threads. It must be allowed to modify process-global state like env vars or gflags without the possibility of a data race from other code concurrently reading those things. As such, the previous approach of calling initFacebook through `*fbinit::FACEBOOK` near FFI calls was prone to race conditions.

The new approach is based on attribute macros added in D17245802.
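
For reference, a minimal sketch of the attribute-macro entry point (just the basic shape; the exact signature options are defined by the macro):

    #[fbinit::main]
    fn main(fb: fbinit::FacebookInit) {
        // By the time main runs, initFacebook has already been called on the
        // main thread; `fb` is the token proving that, and is passed down to
        // any code that needs to call into C++.
        let _ = fb;
    }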

 ---

The primary remaining situations that still require `*fbinit::FACEBOOK` are when we don't directly control the function arguments surrounding the call to C++, such as in lazy_static:

    lazy_static! {
        static ref S: Ty = {
            let _ = *fbinit::FACEBOOK;
            /* call C++ */
        };
    }

and quickcheck:

    quickcheck! {
        fn f(/* args that impl Arbitrary */) {
            let _ = *fbinit::FACEBOOK;
            /* call C++ */
        }
    }

I will revisit these in a separate diff. They are a small fraction of total uses of fbinit.

Reviewed By: Imxset21

Differential Revision: D17328504

fbshipit-source-id: f80edb763e7f42b3216552dd32f1ea0e6cc8fd12
2019-09-13 20:17:29 -07:00
Thomas Orozco
53bf2886cc mononoke: connect stdlog and slog
Summary:
This wires up the stdlog crate with our slog output. The upshot is that we can
now run binaries with `RUST_LOG` set and expect it to work.

This is nice because many crates use stdlog (e.g. Tokio, Hyper), so this is a
convenient way to get access to their logging. For example, if you run with
`RUST_LOG=gotham=info,hyper=debug`, then you get debug logs from Hyper and info
logs from Gotham.

The way this works is by registering a stdlog logger that uses env_logger's
filter (the crate that "invented" `RUST_LOG`) to filter logs, and routes them to
slog if they pass. Note that the slog Logger used there doesn't do any
filtering, since we already do it before sending logs there.

One thing to keep in mind is that we should only register the stdlog global
logger once. I've renamed `get_logger` to `init_logging` to make this clearer.
This behavior is similar to what we do with `init_cachelib`. I've updated
callsites accordingly.

Note that we explicitly tell the stdlog framework to ignore anything that we
won't consider for logging. If you don't set `RUST_LOG`, then the default
logging level is `Error`, which means that anything below error that is sent to
stdlog won't even reach our filtering logic (the stdlog macros for logging
check the global level before actually logging), so this is cheap unless
you do set `RUST_LOG`.
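
A minimal sketch of the bridge described above (type and function names are illustrative, and the RUST_LOG parsing via env_logger's filter is elided for brevity):

    use log::{Level, Metadata, Record};
    use slog::Logger;

    // Forwards records from the stdlog facade into a slog Logger.
    struct StdlogBridge {
        logger: Logger,
    }

    impl log::Log for StdlogBridge {
        fn enabled(&self, metadata: &Metadata) -> bool {
            // The real code asks the env_logger filter built from RUST_LOG;
            // accept Info and above here for brevity.
            metadata.level() <= Level::Info
        }

        fn log(&self, record: &Record) {
            if self.enabled(record.metadata()) {
                match record.level() {
                    Level::Error => slog::error!(self.logger, "{}", record.args()),
                    Level::Warn => slog::warn!(self.logger, "{}", record.args()),
                    Level::Info => slog::info!(self.logger, "{}", record.args()),
                    Level::Debug => slog::debug!(self.logger, "{}", record.args()),
                    Level::Trace => slog::trace!(self.logger, "{}", record.args()),
                }
            }
        }

        fn flush(&self) {}
    }

    fn init_stdlog_bridge(root: Logger, max: log::LevelFilter) {
        // Register once; set_boxed_logger errors out if a global logger is
        // already installed, which is why this lives in init_logging.
        let _ = log::set_boxed_logger(Box::new(StdlogBridge { logger: root }));
        // Anything above `max` is skipped cheaply inside the stdlog macros,
        // before it ever reaches the filter above.
        log::set_max_level(max);
    }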

As part of this, I've also updated all our binaries (and therefore, tests) to
use glog for logging. We had been meaning to do this, and it was convenient to
do it here because the other logger factory we were using didn't make it easy
to get a Drain without putting it in a Logger.

Reviewed By: ahornby

Differential Revision: D17314200

fbshipit-source-id: 19b5e8edc3bbe7ba02ccec4f1852dc3587373fff
2019-09-12 04:13:11 -07:00
Thomas Orozco
8b4012de75 mononoke: corrupt packs, not loose files
Summary:
Our tests were dead when D17244837 landed, but that diff removed loose files and
broke our tests because they were trying to corrupt them. Let's fix this by
corrupting packs instead.

Reviewed By: StanislavGlebik

Differential Revision: D17314430

fbshipit-source-id: 708f24b902ba1f94836f48989a32b696d2f6e52a
2019-09-11 05:40:56 -07:00
Kostia Balytskyi
a878f7ed13 mononoke: make blobimport capable of prefixing bookmarks
Summary:
We need this for `megarepo_test` blobimport, which will prefix `fbsource` and
`ovrsource` bookmarks.

Reviewed By: krallin

Differential Revision: D17286275

fbshipit-source-id: 40bb3f97b0a06dcd636f891d9b32c7ef9b55a0fc
2019-09-11 03:19:38 -07:00
Thomas Orozco
15b6f902e3 mononoke/dbbookmarks: finally use >list for allowed bookmark updates
Summary:
Now that I've updated the `queries!` macro to allow a `>list` along with other
arguments in a query, we can use it in the dbbookmarks update query that was
checking for valid `UPDATE`s.

Reviewed By: farnz

Differential Revision: D17288575

fbshipit-source-id: 051e03bd711bf26e43ec79509051ffff68316db7
2019-09-11 01:40:11 -07:00
Kostia Balytskyi
800824a2bd mononoke: add config reading logic for commit sync configs
Summary:
Commit sync will operate based on the following idea:
- there's one "large" and potentially many "small" repos
- there are two possible sync directions: large-to-small and small-to-large
- when syncing a small repo into a large repo, it is allowed to change paths
  of each individual file, but not to drop files
- the large repo prepends a predefined prefix to every bookmark name from the small repo, except for the bookmarks in the `common_pushrebase_bookmarks` list, which can be advanced by any small repo (a rough sketch of the resulting config shape follows)
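
The sketch below is illustrative only - field and type names are hypothetical, not the actual Mononoke config structs:

    use std::collections::HashMap;

    // Hypothetical shapes, for illustration only.
    struct SmallRepoCommitSyncConfig {
        // Prefix prepended to small-repo paths when syncing into the large
        // repo (paths may change, files are never dropped).
        default_path_prefix: String,
        // Prefix prepended to small-repo bookmark names in the large repo.
        bookmark_prefix: String,
    }

    struct CommitSyncConfig {
        large_repo_id: i32,
        // Bookmarks that any small repo may advance without prefixing.
        common_pushrebase_bookmarks: Vec<String>,
        // Per-small-repo settings, keyed by repo id.
        small_repos: HashMap<i32, SmallRepoCommitSyncConfig>,
    }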

Reviewed By: krallin

Differential Revision: D17258451

fbshipit-source-id: 6cdaccd0374250f6bbdcbc9a280da89ccd7dff97
2019-09-10 07:23:15 -07:00
Thomas Orozco
9e458f7acf mononoke/apiserver: remove LFS implementation
Summary: This removes the LFS implementation from the apiserver. We haven't used it, and we weren't going to.

Reviewed By: StanislavGlebik

Differential Revision: D17263036

fbshipit-source-id: cad3e8a7d54abef06ff1bc1cb5e73c55342743b6
2019-09-10 06:39:54 -07:00
Thomas Orozco
11f0cc609b mononoke: update LFS tests to use new LFS server
Summary:
This updates all the existing integration tests that talked to the API Server to talk to the LFS Server instead.

Note that some tests have changed a little bit. That's because unlike the old implementation, the new one doesn't ask clients to re-upload the things it already has.

Reviewed By: HarveyHunt

Differential Revision: D17263038

fbshipit-source-id: b3ef9af6f416458354d49b0ab9e99e588f8edd26
2019-09-10 06:39:53 -07:00
Thomas Orozco
cff0b359b3 mononoke/lfs_server: run without upstream + integration tests
Summary:
This adds support in the LFS Server for running without an upstream. This is
useful for running tests, because it lets us chain LFS Servers and have one acting
as a proxy and one acting as a backend.

This also highlighted a bug in my implementation (we were expecting a download
action from upstream if the data didn't need to be uploaded there), which I
fixed here.

For the time being, we won't use this in production.

Reviewed By: HarveyHunt

Differential Revision: D17263039

fbshipit-source-id: 7cba550054e5f052a4b8953ebe0195907919aade
2019-09-10 06:39:53 -07:00
Thomas Orozco
73f713ca70 mononoke: remove manifold / MySQL flags + safe_writes
Summary: None of these do anything.

Reviewed By: StanislavGlebik

Differential Revision: D16963565

fbshipit-source-id: bb0dd344daf151dd1d5c45392c43b014d2b17a07
2019-09-06 07:17:45 -07:00
Kostia Balytskyi
1ece1d8688 mononoke: add a tool for megarepo manipulations
Reviewed By: StanislavGlebik

Differential Revision: D17181600

fbshipit-source-id: 87565ac13f8bca36a65a09c0a6169c3dcf692ff9
2019-09-06 05:28:18 -07:00
Mark Thomas
3a124c9d82 debugmutation: improve format and render successors
Summary:
Update the debugmutation format to collapse long chains.  Add
`hg debugmutation -s` to follow the successor relationship (printing what the
commit became) instead of the predecessor relationship.

Reviewed By: mitrandir77

Differential Revision: D17156463

fbshipit-source-id: 44a68692e3bdea6a2fe2ef9f53b533199136eab1
2019-09-06 00:42:03 -07:00
Daniel Grzegorzewski
94f41ba5aa Added option to use reponame instead of repo-id
Summary: Currently in the mononoke_admin tool we have to use repo-id to identify a repository. Sometimes that can be inconvenient. Changed it so we can use either repo-id or reponame.
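
A hypothetical sketch of the argument wiring (clap 2.x style; names are illustrative rather than the real mononoke_admin definitions):

    extern crate clap; // clap = "2"
    use clap::{App, Arg, ArgGroup};

    fn main() {
        // Hypothetical wiring: accept either --repo-id or --reponame,
        // exactly one of which must be given.
        let matches = App::new("mononoke_admin")
            .arg(Arg::with_name("repo-id").long("repo-id").takes_value(true))
            .arg(Arg::with_name("reponame").long("reponame").takes_value(true))
            .group(
                ArgGroup::with_name("repo")
                    .args(&["repo-id", "reponame"])
                    .required(true),
            )
            .get_matches();
        if let Some(id) = matches.value_of("repo-id") {
            println!("using repo id {}", id);
        } else if let Some(name) = matches.value_of("reponame") {
            println!("resolving repo name {} to an id", name);
        }
    }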

Reviewed By: StanislavGlebik

Differential Revision: D17202962

fbshipit-source-id: d33ad55f53c839afc70e42c26501ecd4421e32c0
2019-09-05 09:36:27 -07:00
Stanislau Hlebik
f50e19a70e mononoke: compress file history
Summary:
This diff adds handling of "overflows", i.e. making sure that each FastlogBatch
doesn't store more than 60 entries.
The idea is that FastlogBatch has "latest" entries and "previous_batches". A new
entry is prepended to "latest", but if "latest" has more than 10 entries then a
new FastlogBatch is uploaded and prepended to "previous_batches". If
"previous_batches" has more than 5 entries then the oldest entry is removed
from the batch.
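
An illustrative sketch of that overflow rule (type names and the upload step are stand-ins, not the real FastlogBatch code):

    const MAX_LATEST: usize = 10;
    const MAX_PREVIOUS_BATCHES: usize = 5;

    #[derive(Clone)]
    struct FastlogBatch<Entry, BatchId> {
        latest: Vec<Entry>,             // newest first
        previous_batches: Vec<BatchId>, // newest first
    }

    impl<Entry: Clone, BatchId: Clone> FastlogBatch<Entry, BatchId> {
        // Prepend a new entry; when `latest` is full, spill it into its own
        // uploaded batch and keep at most MAX_PREVIOUS_BATCHES batch ids,
        // dropping the oldest. Total stays at <= 10 + 5 * 10 = 60 entries.
        fn prepend(&self, entry: Entry, upload: impl FnOnce(Vec<Entry>) -> BatchId) -> Self {
            let mut new = self.clone();
            if new.latest.len() >= MAX_LATEST {
                let spilled = std::mem::take(&mut new.latest);
                new.previous_batches.insert(0, upload(spilled));
                new.previous_batches.truncate(MAX_PREVIOUS_BATCHES);
            }
            new.latest.insert(0, entry);
            new
        }
    }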

Note that merges are still not handled in this diff; this functionality will be added in the next
diffs.

Reviewed By: krallin

Differential Revision: D17160506

fbshipit-source-id: 956bc5320667c6c5511d2a567740a4f6ebd8ac1b
2019-09-03 07:30:55 -07:00
Stanislau Hlebik
90a4d40551 mononoke: do not generate hgchangeset unnecessarily
Summary:
We were generating hgchangesets in the list_directory_unodes() method; however, this
is not always necessary - if a client sends a bookmark, we can fetch bonsai
changesets and generate unodes directly, bypassing the hg changeset generation stage.
The same goes for the is_ancestor method.

Reviewed By: krallin

Differential Revision: D17146609

fbshipit-source-id: 9613e28f8bacbce5d8de94a6ab88b152d19b0a08
2019-09-02 04:18:16 -07:00
Pavel Aslanov
f43b19c431 remove not needed BlobRepo::*{store|fetch} methods
Summary: `BlobRepo::*{store|fetch}` methods are subsumed by `Storable|Loadable`

Reviewed By: krallin

Differential Revision: D17133306

fbshipit-source-id: 6379aea6335d57f0d90a9669ba5ef0300f82a399
2019-08-30 13:34:29 -07:00
Harvey Hunt
a494c558c4 mononoke: Prevent humans from moving specific bookmarks in fbsource
Summary:
There is a collection of bookmarks that should only be moved by automation. If they're
moved by users, SEVs can happen (S184864).

Configure the restrict_users hook to disallow moves of the following bookmarks:

    fbandroid/stable
    fbcode/stable
    fbcode/warm
    fbobjc/stable
    fbsource/stable

I collected this list by running the following on a hg server:

    [svcscm@hg009.cln2 /data/scm/fbsource]$ hg config | grep automation
    hooks.automationbookmarks=fbandroid/stable\nfbcode/stable\nfbcode/warm\nfbobjc/stable\nfbsource/stable

The usernames came from the disable-nonff.py hook on hg servers.

Reviewed By: suitingtseng, StanislavGlebik

Differential Revision: D16960784

fbshipit-source-id: 9cf63b40492431bf34fb0574738794b0190703c6
2019-08-22 07:10:37 -07:00
Thomas Orozco
ce005af688 mononoke: add lfs_import command
Summary: This adds a command that allows importing a set of LFS blobs into Mononoke by streaming them out of another storage location (e.g. Dewey).

Reviewed By: HarveyHunt

Differential Revision: D16917806

fbshipit-source-id: 4917d56e11a187c89e00c23a32c6e791b351f8ef
2019-08-21 02:33:31 -07:00
Kostia Balytskyi
5757d6196a redaction: rename relevant constants into their desired names
Summary:
We do not want to use censorship/blacklisting, as those names are ambiguous
and redaction has exactly the meaning we need.

Reviewed By: quark-zju

Differential Revision: D16890423

fbshipit-source-id: 1a6dcc61ca3fe77b7a938a37fb1ec9250ae9e15a
2019-08-20 04:01:00 -07:00
Kostia Balytskyi
cff091b4c6 mononoke: rename censorship/blacklisting into a redaction
Summary:
This is a mechanical part of rename, does not change any commit messages in
tests, does not change the scuba table name/config setting. Those are more
complex.

Reviewed By: krallin

Differential Revision: D16890120

fbshipit-source-id: 966c0066f5e959631995a1abcc7123549f7495b6
2019-08-20 04:01:00 -07:00
Harvey Hunt
b778fd2482 mononoke: Add a rechunker binary
Summary:
Create a new binary that can be used to rechunk file content using the filestore.
The binary accepts multiple filenodes, which it will then rechunk using the filestore
config provided to it.

Reviewed By: krallin

Differential Revision: D16802701

fbshipit-source-id: d7c05729f5072ff2925bbc90cdd89fcfed56bba2
2019-08-16 09:08:45 -07:00
Kostia Balytskyi
ee03dfa250 mononoke: introduce blacklist list admin subcommand
Summary:
We may need to inspect the list of blacklisted files for a given commit.
This command allows us to do so.

Reviewed By: krallin

Differential Revision: D16643467

fbshipit-source-id: ad389c4b9262ffa9ee6ea9b792b4866a23bebaa5
2019-08-16 02:32:45 -07:00
Thomas Orozco
895aa3c27a mononoke: getpackv2 LFS support (+ getfiles / getpack / eden_get_data refactor)
Summary:
This updates Mononoke to support LFS metadata when serving data over getpackv2.

However, in doing so, I've also refactored the various ways in which we currently access file data to serve it to clients or to process client uploads (when we need to compute deltas). The motivation to do that is that we've had several issues recently where some protocols knew about some functionality, and others didn't. Notably, redaction and LFS were supported in getfiles, but neither of them were supported in getpack or eden_get_data.

This patch refactors all those callsites away from blobrepo and instead through repo_client/remotefilelog, which provides an internal common method to fetch a filenode and return its metadata and bytes (prepare_blob), and separate protocol specific implementations for getpackv1 (includes metadata + file content -- this is basically the existing fetch_raw_filenode_bytes function), getpackv2 (includes metadata + file contents + getpackv2 metadata), getfiles (includes just file content, and ties file history into its response) and eden_get_data (which uses getpackv1).

Here are a few notable changes here that are worth noting as you review this:

- The getfiles method used to get its filenode from get_maybe_draft_filenode, but all it needed was the copy info. However, the updated method gets its filenode from the envelope (which also has this data). This should be equivalent.
- I haven't been able to remove fetch_raw_filenode_bytes yet because there's a callsite that still uses it and it's not entirely clear to me whether this is used and why. I'll look into it, but for now I left it unchanged.
- I've used the Mercurial implementation of getpack metadata here. This feels like the better approach so we can reuse some of the code, but historically I don't think we've depended on many Mercurial crates. Let me know if there's a reason not to do that.

Finally, there are a couple things to be aware of as you review:

- I removed some more `Arc<BlobRepo>` in places where it made it more difficult to call the new remotefilelog methods.
- I updated the implementation to get copy metadata out of a file envelope to not require copying the metadata into a mercurial::file::File only to immediately discard it.
- I cleaned up an LFS integration test a little bit. There are few functional changes there, but it makes tests a little easier to work with.

Reviewed By: farnz

Differential Revision: D16784413

fbshipit-source-id: 5c045d001472fb338a009044ede1e22ccd34dc55
2019-08-14 08:48:35 -07:00
Kostia Balytskyi
da7be55f9f mononoke: support tombstone approach to redaction in the apiserver
Summary:
This is needed so that we do not break existing clients when they update
to commits with redacted content.

Reviewed By: farnz

Differential Revision: D16752748

fbshipit-source-id: fdd4e2931fd362d2bcb5eb590d56ac71b74a8b03
2019-08-12 08:03:10 -07:00
Kostia Balytskyi
b68d21b5b3 mononoke: support blacklisting magic string in getpack wireproto requests
Summary: We need this to not break clients with `remotefilelog.fetchpacks=True`.

Reviewed By: krallin

Differential Revision: D16735900

fbshipit-source-id: a993610c2e95cfde95a8be1e31eaad825f95dc15
2019-08-12 08:03:10 -07:00
Kostia Balytskyi
533ca41d85 mononoke: add admin command to remove blacklisting
Summary:
This resolves the list of files at a given commit to a list of `ContentId`s
and unblacklists all of those `ContentId`s.

Note 1: This means that these particular versions of files will be
unblacklisted in all commits which reference them.

Note 2: There's no auditing of this admin command invocation; if the
need for one arises, we should add it externally.

Reviewed By: krallin

Differential Revision: D16643436

fbshipit-source-id: 3af488e3e4d748027605587ab1e08c0abd0da109
2019-08-12 04:19:20 -07:00
Kostia Balytskyi
e332a64e32 mononoke: introduce cli structure for blacklisting subcommands
Summary: In the following diffs, I will implement the actual logic.

Reviewed By: krallin

Differential Revision: D16639720

fbshipit-source-id: f42b91476035343a6f00b11e4921e26836b1a574
2019-08-12 04:19:19 -07:00
Thomas Orozco
98f94de92a mononoke/blobimport: lfs retries
Summary: This updates blobimport to retry on failures to fetch through its LFS helper.

Reviewed By: HarveyHunt

Differential Revision: D16667042

fbshipit-source-id: dfaec8fc029030a7d90806018301d22525d094df
2019-08-07 08:33:43 -07:00
Thomas Orozco
97b0763798 mononoke/blobimport: check for reusable LFS blobs
Summary: Updates blobimport to check for existing LFS blobs. This is perfectly safe since at worst it'll fail closed (i.e. blobs will be missing and blobimport will hang).

Reviewed By: HarveyHunt

Differential Revision: D16648560

fbshipit-source-id: ed9da7f2fa69c28451ac058a4e1adc937d111b31
2019-08-07 08:33:42 -07:00
Thomas Orozco
823e7a14ee mononoke/blobimport: log LFS helper context
Summary:
If you give blobimport a bad path for the LFS helper, it just tells you the
file doesn't exist, which is not very useful if you don't know what file we're
talking about. This fixes that.
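
A tiny sketch of the kind of context this adds (using anyhow here purely for illustration; the real code uses Mononoke's own error types):

    use anyhow::{Context, Result};
    use std::fs::File;

    fn open_lfs_helper(path: &str) -> Result<File> {
        // Instead of a bare "No such file or directory", the error now says
        // which path was being treated as the LFS helper.
        File::open(path).with_context(|| format!("while opening LFS helper: {}", path))
    }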

Reviewed By: HarveyHunt

Differential Revision: D16621624

fbshipit-source-id: ff7fb5c8800604f8799c682e5957c888d2b01647
2019-08-07 08:33:41 -07:00
Thomas Orozco
07f3fa1a88 mononoke/integration: un-blackhole apiserver tests
Summary:
Johan fixed retry logic in Mercurial, so those tests can now succeed even if
the blackhole is enabled (though we haven't fully understood why the blackhole
was breaking them in the first place).

Differential Revision: D16646032

fbshipit-source-id: 8b7ff2d8d284e003e49681e737367e9942370fa1
2019-08-05 05:21:14 -07:00
Thomas Orozco
68569e5d0c mononoke/{types,filestore}: use a separate type for File Chunks
Summary:
NOTE: This changes our file storage format. It's fine to do it now since we haven't started enabling chunking yet (see: D16560844).

This updates the Filestore's chunking to store chunks as their own entity in Thrift, as opposed to having them be just FileContents.

The upside of this approach is that we can't have an entity that's both a File and a Chunk, which means:

- We don't have to deal with recursive chunks (since, unlike Files, Chunks can't contain pointers to other chunks).
- We don't have to produce metadata (forward mappings and backmappings) for chunks (the reason we had to produce it was to make sure we wouldn't accidentally produce inconsistent data if the upload for one of our chunks happened to have been tried as a file earlier and failed).

Note that this also updates the return value from the Filestore to `ContentMetadata`. We were using `Chunk` before there because it was sufficient and convenient, but now that `Chunk` no longer contains a `ContentId`, it no longer is convenient, so it's worth changing :)
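
Roughly, the distinction looks like this (illustrative Rust shapes, not the actual Thrift definitions):

    // A file is either inline bytes or a list of pointers to chunks.
    enum FileContents {
        Bytes(Vec<u8>),
        Chunked(Vec<ContentChunkId>),
    }

    // A chunk is a separate entity holding raw bytes only, so it can never
    // point at further chunks and never needs its own alias metadata.
    struct ContentChunkId(String);
    struct ContentChunk(Vec<u8>);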

Reviewed By: HarveyHunt

Differential Revision: D16598906

fbshipit-source-id: f6bec75d921f1dea3a9ea3441f57213f13aeb395
2019-08-02 03:43:16 -07:00
Thomas Orozco
cfa4c8332f mononoke/integration: disable blackhole for apiserver tests
Summary:
The network blackhole is causing the API server to occasionally hang while serving requests, which has broken some LFS tests. This appears to have happened in the last month or so, but unfortunately, I haven't been able to root-cause why it is happening.

From what I can tell, we have an hg client that tries an upload to the API Server, and uploads everything... and then the API server just hangs. If I kill the hg client, then the API server responds with a 400 (so it's not completely stuck), but otherwise it seems like the API server is waiting for something to happen on the client-side, but the client isn't sending that.

As far as I can tell, the API Server isn't actually trying to make outbound requests (strace does report that it has a Scribe client that's trying to connect, but Scuba logging isn't enabled, and this is just trying to connect but not send anything), but something with the blackhole is causing this hg - API server interaction to fail.

In the meantime, this diff disables the blackhole for those tests that definitely don't work when it's enabled ...

Reviewed By: HarveyHunt

Differential Revision: D16599929

fbshipit-source-id: c6d77c5428e206cd41d5466e20405264622158ab
2019-08-01 07:36:02 -07:00
Thomas Orozco
4e30164506 mononoke/blobimport: support LFS filenodes
Summary:
This adds support for LFS filenodes in blobimport. This works by passing a `--lfs-helper` argument, which should be an executable that can provide a LFS blob's contents on stdout when called with the OID and size of a LFS blob. My thinking is to `curl` directly from Dewey when running this in prod.
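
A minimal sketch of how blobimport could drive such a helper (argument order and the exact contract are assumptions here, shown for illustration only):

    use std::io::Read;
    use std::process::{Command, Stdio};

    fn fetch_lfs_blob(helper: &str, oid: &str, size: u64) -> std::io::Result<Vec<u8>> {
        // Run the helper with the blob's OID and size; it writes the blob
        // contents to stdout (e.g. a script that curls Dewey).
        let mut child = Command::new(helper)
            .arg(oid)
            .arg(size.to_string())
            .stdout(Stdio::piped())
            .spawn()?;
        let mut bytes = Vec::with_capacity(size as usize);
        child
            .stdout
            .take()
            .expect("stdout was piped")
            .read_to_end(&mut bytes)?;
        child.wait()?;
        Ok(bytes)
    }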

Note that, as of this change, we can blobimport LFS files, but doing so might be somewhat inefficient, since we'll roundtrip the blobs to our filestore, then generate filenodes. For now, however, I'd like to start with this so we can get a sense of whether this is acceptable performance-wise.

Reviewed By: farnz

Differential Revision: D16494241

fbshipit-source-id: 2ae032feb1530c558edf2cfbe967444a9a7c0d0f
2019-07-31 05:19:41 -07:00
Thomas Orozco
f9360cab9d mononoke/filestore: incorporate in Mononoke
Summary:
NOTE: This isn't 100% complete yet. I have a little more work to do around the aliasverify binary, but I think it'll make sense to rework this a little bit with the Filestore anyway.

This patch incorporates the Filestore throughout Mononoke. At this time, what this means is:

- Blobrepo methods return streams of `FileBytes`.
- Various callsites that need access to `FileBytes` call `concat2` on those streams (a toy sketch of this pattern follows the list).
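
The sketch below is a toy, self-contained stand-in using futures 0.1; plain byte chunks take the place of the stream of `FileBytes` that BlobRepo actually returns:

    extern crate futures; // futures = "0.1"
    use futures::{stream, Future, Stream};

    fn main() {
        // Stand-in for the stream of FileBytes returned by BlobRepo.
        let chunks = stream::iter_ok::<_, ()>(vec![b"hello ".to_vec(), b"world".to_vec()]);
        // Callsites that need the whole file concatenate the stream.
        let whole_file = chunks.concat2().wait().unwrap();
        assert_eq!(whole_file, b"hello world".to_vec());
    }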

This also eliminates the Sha256 aliasing code that we had written for LFS and replaces it with a Filestore-based implementation.

However, note that this does _not_ change how files submitted through `unbundle` are written to blobstores right now. Indeed, those contents are passed into the Filestore through `store_bytes`, which doesn't do chunking. This is intentional since it lets us use LFS uploads as a testbed for chunked storage before turning it on for everything else (also, chunking those requires further refactoring of content uploads, since right now they don't expect the `ContentId` to come back through a Future).

The goal of doing it this way is to make the transition simpler. In other words, this diff doesn't change anything functionally - it just updates the underlying API we use to access files. This is also important to get a smooth release: if we had new servers that started chunking things while old servers tried to read them, things would be bad. Doing it this way ensures that doesn't happen.

This means that streaming is there, but it's not being leveraged just yet. I'm planning to do so in a separate diff, starting with the LFS read and write endpoints in

Reviewed By: farnz

Differential Revision: D16440671

fbshipit-source-id: 02ae23783f38da895ee3052252fa6023b4a51979
2019-07-31 05:19:40 -07:00
Mateusz Kwapich
6c5b692bb3 autoformat all of fbcode/scm with black
Summary: blackpylogo

Reviewed By: krallin

Differential Revision: D16543521

fbshipit-source-id: 75977d00329e1068a92a344c99e0ad0392601c54
2019-07-30 08:19:24 -07:00
Simon Farnsworth
ba39d7e2e6 Commits we are going to pushrebase should be treated as drafts
Summary:
When a commit is pushrebased, we generate a new Bonsai commit to represent the result of the commit, which is the commit we actually make public.

Therefore, the commit that will be rebased should be uploaded as a draft. The rebase result will then provide filenodes, and thus we will generate good linknodes for pushrebased commits.

Reviewed By: krallin

Differential Revision: D16550469

fbshipit-source-id: 1f2f00f8cd9ad0f75441aca0eb8daae62ae299e0
2019-07-30 06:40:21 -07:00
George-Catalin Tintareanu
6976315469 Introduce quoted arguments
Summary: Introduce quoted arguments to prevent unwanted behavior

Reviewed By: ikostia

Differential Revision: D16458255

fbshipit-source-id: 842ba54782b5baf3dea7a22b220a595b11f274af
2019-07-25 03:19:13 -07:00
Stanislau Hlebik
099856d71c mononoke: avoid combinatoric explosion in derived_data framework
Summary:
In very mergy repos we can hit a combinatoric explosion by visiting the same
node over and over again. The derived data framework has the same problem, and this diff
fixes it.

I had a few attempts at implementing it:

**1** Use `bounded_traversal`, but change unfold to filter out parents that were already visited.
That wasn't correct because fold will then be called only with
the "unvisited" parents. For example, in a case like

```
  D
 /  \
C    B
 \ /
  A
```

fold for C or B will be called with empty parents, and that's incorrect.

**2** Use `bounded_traversal`, change unfold to filter out visited parents but
also remember real parents.

That won't work either. The reason is that fold might be called before unfold
for the parents has finished, so in a case like

```
  D
 /  \
C    B
 \ /
  A
  |
 ...
thousands of commits
```

If C reaches A first, then B won't visit any other node, and it will try to
derive data for B. However derived data for A might not be ready yet, so
deriving data for B might fail.

**3** Change bounded_traversal to support DAGs, not just trees.

From the two points above it's clear that bounded_traversal should be called
bounded_tree_traversal, because on any other DAG it might hit a combinatoric
explosion. I looked into changing bounded_traversal to support DAGs, and that
was possible but not easy. Specifically, we need to make sure that all
unfold operations are called after fold operations, stop using integers for
nodes, etc. It might also have a perf hit for the tree case, but it's not clear how
big that is.
While I think supporting DAGs in bounded_traversal makes sense, I don't want to
block the derived data implementation on that. I'll create a separate task for it.

---------------------------------------------------------------------------------

The approach I took in the end was to use a bounded_stream_traversal that doesn't
visit the same node twice. Doing this will find all commits that need to be
regenerated, but it might return them in an arbitrary order. After that we need
to topo_sort the commits (note that I introduced the bug for hg changeset
generation in D16132403, so this diff fixes it as well).

This is not the most optimal implementation because it will generate the nodes
sequentially even if they could be generated in parallel (e.g. if the nodes are
in different branches). I don't think that's a huge concern, so I think it's worth
waiting for the bounded_dag_traversal implementation (see point 3 above).
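
A simplified, synchronous stand-in for that approach (the real code is a stream-based traversal over changesets; every name here is illustrative):

    use std::collections::{HashMap, HashSet, VecDeque};

    type Cs = u64; // stand-in for a changeset id

    // Walk ancestors without revisiting nodes, collect everything underived,
    // then topo-sort so parents are derived before children.
    fn underived_in_order(
        start: Cs,
        parents: &HashMap<Cs, Vec<Cs>>,
        is_derived: impl Fn(Cs) -> bool,
    ) -> Vec<Cs> {
        let mut visited = HashSet::new();
        let mut queue = VecDeque::new();
        queue.push_back(start);
        let mut underived = Vec::new();
        while let Some(cs) = queue.pop_front() {
            if !visited.insert(cs) || is_derived(cs) {
                continue; // each node is expanded at most once
            }
            underived.push(cs);
            for &p in parents.get(&cs).into_iter().flatten() {
                queue.push_back(p);
            }
        }
        // The traversal order is arbitrary, so sort parents-first before
        // actually deriving (Kahn-style, restricted to the collected nodes).
        let in_set: HashSet<Cs> = underived.iter().copied().collect();
        let mut done = HashSet::new();
        let mut out = Vec::new();
        while !underived.is_empty() {
            underived.retain(|&cs| {
                let ready = parents
                    .get(&cs)
                    .into_iter()
                    .flatten()
                    .all(|p| !in_set.contains(p) || done.contains(p));
                if ready {
                    done.insert(cs);
                    out.push(cs);
                }
                !ready
            });
        }
        out
    }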

---------------------------------------------------------------------------------

Finally, there were concerns about memory usage from the additional hashset that
keeps visited nodes. I think these concerns are unfounded for a few reasons:

1) We have to keep the nodes we visited *anyway*, because we need to generate
derived data from parents to children. In fact, bounded_traversal keeps them in
a map as well.
It's true that bounded_traversal can do it a bit more efficiently in cases where
we have two different branches that do not intersect. I'd argue that's a rare
case and happens only on repo merges with two independent, equally sized
branches. But even for that case it's not a huge problem (see below).

2) The hashset just keeps commit ids, which are 32 bytes long. So even if we have 1M
commits to generate, that would take 32Mb + hashset overhead. And cases like
that should never happen in the first place - we do not expect to generate
derived data for 1M commits except for the initial huge repo imports (and
for those cases we can afford a ~30Mb memory hit). If we're in a state where we
need to generate too many commits, we should just return an error to the user,
and we'll add that in later diffs.

Reviewed By: krallin

Differential Revision: D16438342

fbshipit-source-id: 4d82ea6111ac882dd5856319a16dda8392dfae81
2019-07-24 11:55:34 -07:00
George-Catalin Tintareanu
25e53f3dd0 Merged integration tests for censorship config
Summary: `test-censoring-enabled` and `test-censoring-disabled` test whether the config option to disable/enable censorship verification works. This diff merges these two tests, since they test the same thing and the configuration is similar in both cases.

Reviewed By: krallin

Differential Revision: D16441380

fbshipit-source-id: 97b3cfc6969be32d661d4676d6cdc7cfcd4eed04
2019-07-24 09:49:57 -07:00
George-Catalin Tintareanu
bfe906c02a censored magic string response
Summary:
Any get request for a blacklisted file returns a magic string - not an error anymore - which replaces the actual content of the file. The magic string represents a sequence of 256 random characters.
Put still returns an error if there is an attempt to change a blacklisted file, but that's the normal behavior for this type of request.

Reviewed By: ikostia

Differential Revision: D16333820

fbshipit-source-id: 54e0189ebe2a71ea365979950d44dbb3fd1e5dc7
2019-07-24 03:42:27 -07:00