Commit Graph

2322 Commits

Pavel Aslanov
06c8fae85b added bounded_traversal_dag
Summary:
This diff introduces `bounded_traversal_dag`, which can handle arbitrary DAGs and detect invalid DAGs with cycles, but it has limitations in comparison to `bounded_traversal`:
  - `bounded_traversal_dag` keeps the `Out` result of the computation for all nodes,
     while `bounded_traversal` only keeps results for nodes that have not been completely
     evaluated
  - `In` has the additional constraint of being `Eq + Hash + Clone`
  - `Out` has the additional constraint of being `Clone` (see the sketch below)
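
As a rough illustration of where those extra bounds come from (this is a hedged sketch, not the actual `bounded_traversal_dag` code): memoizing per-node results and detecting cycles means nodes are used as hash-map keys and cached outputs are handed back to every parent that reaches the same node.

```rust
use std::collections::{HashMap, HashSet};
use std::hash::Hash;

/// Toy recursive DAG evaluation: `unfold` lists a node's children, `fold`
/// combines their results. Returns `None` if a cycle is detected.
fn evaluate<In, Out>(
    node: In,
    unfold: &impl Fn(&In) -> Vec<In>,
    fold: &impl Fn(&In, Vec<Out>) -> Out,
    done: &mut HashMap<In, Out>,
    in_progress: &mut HashSet<In>,
) -> Option<Out>
where
    In: Eq + Hash + Clone, // nodes are HashMap/HashSet keys
    Out: Clone,            // cached results are shared with every parent
{
    if let Some(out) = done.get(&node) {
        return Some(out.clone()); // node already evaluated: reuse its result
    }
    if !in_progress.insert(node.clone()) {
        return None; // node reachable from itself: not a valid DAG
    }
    let mut child_outs = Vec::new();
    for child in unfold(&node) {
        child_outs.push(evaluate(child, unfold, fold, done, in_progress)?);
    }
    in_progress.remove(&node);
    let out = fold(&node, child_outs);
    done.insert(node, out.clone());
    Some(out)
}
```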

Reviewed By: krallin

Differential Revision: D16621004

fbshipit-source-id: b9f60e461d5d50e060be4f5bb6b970f16a9b99f9
2019-08-05 05:41:17 -07:00
Thomas Orozco
e192f295df mononoke/admin: add filestore debug subcommands
Summary: This adds the debug subcommands `metadata` and `verify` to the Filestore admin tooling. Those respectively output the metadata for a file and verify that the file is reachable through all its aliases.

Reviewed By: ahornby

Differential Revision: D16621789

fbshipit-source-id: 4a2156bfffb9d9641ce58f6d5f691364ba9dc145
2019-08-05 05:28:53 -07:00
Thomas Orozco
07f3fa1a88 mononoke/integration: un-blackhole apiserver tests
Summary:
Johan fixed retry logic in Mercurial, so those tests can now succeed even if
the blackhole is enabled (though we haven't fully understood why the blackhole
was breaking them in the first place).

Differential Revision: D16646032

fbshipit-source-id: 8b7ff2d8d284e003e49681e737367e9942370fa1
2019-08-05 05:21:14 -07:00
Alex Hornby
e2e9f35211 mononoke: Ensure blobstore_healer can still make progress when queue contains unknown stores
Summary:
Update blobstore_healer's handling of unknown stores to re-queue them and delete the original entries.

Make sure we still make progress in the case where there are lots of unknown blobstore entries on the queue.

The previous diff in the stack took the approach of not deleting, which could keep loading and logging the same entries if there were more than blobstore_sync_queue_limit of them. It is better to reinsert with a new timestamp and delete the old entries.

Reviewed By: krallin

Differential Revision: D16599270

fbshipit-source-id: efa3e5602f0ab3a037d0534e1fe8e3d42fbb52e6
2019-08-05 03:50:55 -07:00
Alex Hornby
d17d3475ed mononoke: make blob store healer preserve queue entries for unknown blobstores
Summary: Make the blobstore healer preserve queue entries for unknown blobstores rather than erroring.

Reviewed By: ikostia

Differential Revision: D16586816

fbshipit-source-id: 3d4987a95adcddd0329b9ededdf95887aa11286e
2019-08-05 03:50:54 -07:00
Alex Hornby
f864348558 mononoke: add healer logic to fetch from all source blobstores on the queue
Summary:
Add healer logic to fetch from all source blobstores on the queue

Add tests for the healer queue state including put failures

Reviewed By: krallin

Differential Revision: D16549013

fbshipit-source-id: 6aa55b3cb2ed7fa9a1630edd5bc5b2ad2c6f5011
2019-08-05 03:50:54 -07:00
Alex Hornby
00d855084d mononoke/blobstore_healer: fixes for the handling of blobstores after heal
Summary:
Fixes for the handling of blobstores after heal:
1. If all blobstores are successfully healed for a key, no need to requeue it
2. Where all heal puts fail, make sure we requeue with at least the original source blobstore we loaded the blob from
3. When we do write to the queue, write with all blobstore ids where we know we have good data, so that when it is read later it is not considered missing.

Reviewed By: krallin

Differential Revision: D15911853

fbshipit-source-id: 1c81ce4ec5f975e5230b27934662e02ec515cb8f
2019-08-05 03:50:54 -07:00
Alex Hornby
d1a8c487ae mononoke: make blobstore_healer auto-heal missing source blobstores where possible
Summary: Make blobstore_healer auto-heal source blobstores that are found to be missing data, so long as at least one other source blobstore from the queue has the data for the missing key.

Reviewed By: krallin

Differential Revision: D16464895

fbshipit-source-id: 32549e58933f39bb20c173caf02a35c91123fe8d
2019-08-05 03:50:54 -07:00
Alex Hornby
a06468acd6 mononoke: add key filter option to blobstore healer
Summary: Add blobstore key filter option to blobstore healer to allow easier reproduction of healer issues for particular keys.

Reviewed By: StanislavGlebik

Differential Revision: D16457530

fbshipit-source-id: 23201e45fbdf14fa7fdccbaf8e0f4b29355aa906
2019-08-05 03:50:53 -07:00
Alex Hornby
82f2a70c3b mononoke/blobstore_healer: remove ratelimiter
Summary: Since we're only running a single healer in the process for a single blobstore, it's easy to bound the concurrency by limiting it to the number of entries we deal with at once. As a result, we don't need a separate mechanism for overall control.
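
A minimal sketch of the idea (assumed names, not the healer's actual code): instead of a global rate limiter, each iteration only drains a bounded number of queue entries, which caps the amount of work in flight.

```rust
/// Drain at most `limit` entries from the queue and heal them; overall
/// concurrency is bounded by the batch size rather than by a rate limiter.
fn heal_iteration(queue: &mut Vec<String>, limit: usize) {
    let batch: Vec<String> = queue.drain(..limit.min(queue.len())).collect();
    for key in batch {
        // Placeholder for healing a single queue entry.
        println!("healing {}", key);
    }
}
```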

Reviewed By: StanislavGlebik

Differential Revision: D15912818

fbshipit-source-id: 3087b88cfdfed2490664cd0df10bd6f126267b83
2019-08-05 03:50:53 -07:00
Alex Hornby
3d27faba08 mononoke/blobstore_healer: add comments and type annotations
Summary: Basically notes I took for myself to truly understand the code.

Reviewed By: StanislavGlebik

Differential Revision: D15908406

fbshipit-source-id: 3f21f7a1ddce8e15ceeeffdb5518fd7f5b1749c4
2019-08-05 03:50:53 -07:00
Alex Hornby
978242fb35 mononoke/repoconfig: update to using chain_err
Reviewed By: StanislavGlebik

Differential Revision: D15072797

fbshipit-source-id: 5339d78de265463ad800d7fb8db8a1444e3fdd6b
2019-08-05 03:50:52 -07:00
Alex Hornby
4322423811 mononoke: Update blobstore_healer for new storage config model
Summary:
Allow blobstore_healer to be directly configured to operate on a blobstore.
This makes two changes:
- The blobstore to operate on is defined in storage.toml (this doesn't
  currently support server.toml-local storage configs)
- Only heal one blobstore at a time. We can run multiple separate instances of the
  healer to heal multiple blobstores.

Reviewed By: HarveyHunt

Differential Revision: D15065422

fbshipit-source-id: 5bc9f1a16fc83ca5966d804b5715b09d359a3832
2019-08-05 03:50:52 -07:00
Alex Hornby
b43dc6e5a7 mononoke: Migrate populate_healer for new storage config data model
Summary: Update populate_healer to act directly on a blobstore config rather than indirectly via a repo config.

Reviewed By: StanislavGlebik

Differential Revision: D15065424

fbshipit-source-id: 638778a61283dc9ed991c49936a21d02b8d2e3f3
2019-08-05 03:50:52 -07:00
Alex Hornby
59b47cf4fa mononoke: Drop repoid from healer structures
Summary:
The healer is a blobstore-level operation, which is orthogonal to the concept of a repo; therefore, there should be no mention of repoid in any of the healer's structures or tables.

For now this leaves the schema unmodified, and fills the repoid with a dummy value (0). We can clean that up later.

Reviewed By: lukaspiatkowski, HarveyHunt

Differential Revision: D15051896

fbshipit-source-id: 438b4c6885f18934228f43d85cdb8bf2f0e542f1
2019-08-05 03:50:51 -07:00
Alex Hornby
f4e304eb09 mononoke/sqlblob: drop repo_id from everywhere
Summary: RepositoryId shouldn't leak into the blobstore layer. This leaves repoid in the schema, but just populates it with a dummy value (0). We can clean up the schema and this code in a later diff.

Reviewed By: StanislavGlebik

Differential Revision: D15021285

fbshipit-source-id: 3ecb04a76ce74409ed0cced3d2a0217eacd0e2fb
2019-08-05 03:50:51 -07:00
Kostia Balytskyi
bc985480e9 mononoke: add the filenode subcommand to the admin tool
Summary:
This is useful to inspect the Mercurial filenodes in Mononoke, like in S183272.

For example, I intend to use this subcommand to verify how well the future linknode healing works.

Reviewed By: krallin

Differential Revision: D16621516

fbshipit-source-id: 4266f85bce29b59072bf9c4f3e63777dae09a4f1
2019-08-02 12:45:57 -07:00
Kostia Balytskyi
1420897ff8 mononoke: separate cs_id fetching from filenode fetching in the admin tool
Summary: Let's separate some concerns.

Reviewed By: krallin

Differential Revision: D16621518

fbshipit-source-id: 2d6ca96b72d5ffbc0fac4a4f9643ecc2acde0ca2
2019-08-02 12:45:57 -07:00
Kostia Balytskyi
432f1f6401 mononoke: in admin, move get_file_nodes to common
Summary: This is needed in the following diff.

Reviewed By: krallin

Differential Revision: D16621517

fbshipit-source-id: 5a50cae7c8b761d7578bcbe5caf302a5ee2578a3
2019-08-02 12:45:57 -07:00
Thomas Orozco
ea059ef2c7 mononoke/benchmark_filestore: add support for testing with caches
Summary: This updates benchmark_filestore to allow testing with caches (notably, Memcache & Cachelib). It also reads twice now, which is nice for caches that aren't filled by us (e.g. Manifold CDN).

Reviewed By: ahornby

Differential Revision: D16584952

fbshipit-source-id: 48ceaa9f2ea393626ac0e5f3988672df020fbc28
2019-08-02 05:40:29 -07:00
Thomas Orozco
bea4a85117 mononoke/types: clean up ContentMetadata out of FileContents
Summary: There's a lot of stuff in file_contents.rs that's not actually about file contents per se. This fixes that.

Reviewed By: ahornby

Differential Revision: D16598905

fbshipit-source-id: 9832b96261264c54809e0c32980cf449f8537517
2019-08-02 03:43:16 -07:00
Thomas Orozco
68569e5d0c mononoke/{types,filestore}: use a separate type for File Chunks
Summary:
NOTE: This changes our file storage format. It's fine to do it now since we haven't started enabling chunking yet (see: D16560844).

This updates the Filestore's chunking to store chunks as their own entity in Thrift, as opposed to have them be just FileContents.

The upside of this approach is that we can't have an entity that's both a File and a Chunk, which means:

- We don't have to deal with recursive chunks (since, unlike Files, Chunks can't contain pointers to other chunks).
- We don't have to produce metadata (forward mappings and backmappings) for chunks (the reason we had to produce it was to make sure we wouldn't accidentally produce inconsistent data if the upload for one of our chunks happened to have been tried as a file earlier and failed).

Note that this also updates the return value from the Filestore to `ContentMetadata`. We were using `Chunk` before there because it was sufficient and convenient, but now that `Chunk` no longer contains a `ContentId`, it is no longer convenient, so it's worth changing :)

Reviewed By: HarveyHunt

Differential Revision: D16598906

fbshipit-source-id: f6bec75d921f1dea3a9ea3441f57213f13aeb395
2019-08-02 03:43:16 -07:00
Thomas Orozco
cfa4c8332f mononoke/integration: disable blackhole for apiserver tests
Summary:
The network blackhole is causing the API server to occasionally hang while serving requests, which has broken some LFS tests. This appears to have happened in the last month or so, but unfortunately, I haven't been able to root-cause why this is happening.

From what I can tell, we have an hg client that tries an upload to the API Server, and uploads everything... and then the API server just hangs. If I kill the hg client, then the API server responds with a 400 (so it's not completely stuck), but otherwise it seems like the API server is waiting for something to happen on the client-side, but the client isn't sending that.

As far as I can tell, the API Server isn't actually trying to make outbound requests (strace does report that it has a Scribe client that's trying to connect, but Scuba logging isn't enabled, and this is just trying to connect but not send anything), but something with the blackhole is causing this hg / API server interaction to fail.

In the meantime, this diff disables the blackhole for those tests that definitely don't work when it's enabled ...

Reviewed By: HarveyHunt

Differential Revision: D16599929

fbshipit-source-id: c6d77c5428e206cd41d5466e20405264622158ab
2019-08-01 07:36:02 -07:00
Thomas Orozco
7ba44d737c mononoke: add filestore params to configuration
Summary: This updates our repo config to allow passing through Filestore params. This will be useful to conditionally enable Filestore chunking for new repos.

Reviewed By: HarveyHunt

Differential Revision: D16580700

fbshipit-source-id: b624bb524f0a939f9ce11f9c2983d49f91df855a
2019-07-31 11:48:18 -07:00
Thomas Orozco
9b90fd63cb mononoke/blobimport: allow running without re-creating store
Summary:
This allows running blobimport multiple times over the same path locally (with a file blobstore, for example), which is how we use it in prod (though there we don't use the file blobstore, so it already works).

This is helpful when playing around with local changes to blobimport.

Reviewed By: HarveyHunt

Differential Revision: D16580697

fbshipit-source-id: 4a62ff89542f67ce6396948c666244ef40ffe5e7
2019-07-31 11:42:36 -07:00
Thomas Orozco
ac19446c03 mononoke/filestore: dont boxify
Summary: This updates the Filestore to avoid boxifying in its chunking code. The upshot is that this gets us to a place where passing a Send Stream into the Filestore gives you a Send Future back, and passing a non-Send Stream into the Filestore gives you a non-Send Future back (I hinted at this earlier in the diff that introduced faster writes).

Reviewed By: aslpavel

Differential Revision: D16560768

fbshipit-source-id: b77766380f2eaed5919f78cef6fbc02afeead0b9
2019-07-31 05:19:42 -07:00
Thomas Orozco
6c07d8da97 mononoke/filestore: make reads faster
Summary: As noted earlier in this stack, the Filestore read implementation was very inefficient, since it required reading a Chunk before moving on to the next one. Since our Chunked files will typically have just one level of chunking, this is very inefficient (we could be fetching additional chunks ahead of time). This new implementation lets us take advantage of buffering, so we can load arbitrarily as many chunks in parallel as we'd like.
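
A sketch of the buffering idea in modern futures 0.3 syntax (the Filestore code of this era used futures 0.1, and `fetch_chunk` here is just a stand-in for a blobstore fetch):

```rust
use futures::stream::{self, Stream, StreamExt};

/// Stand-in for fetching a single chunk blob from the blobstore.
async fn fetch_chunk(_chunk_id: u64) -> Vec<u8> {
    vec![0u8; 1024]
}

/// Keep up to `concurrency` chunk fetches in flight while preserving order,
/// instead of fetching one chunk at a time.
fn read_chunks(chunk_ids: Vec<u64>, concurrency: usize) -> impl Stream<Item = Vec<u8>> {
    stream::iter(chunk_ids).map(fetch_chunk).buffered(concurrency)
}
```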

Reviewed By: aslpavel

Differential Revision: D16560767

fbshipit-source-id: c02c10c5de0fc5fdc3ee3897ae855b316ea34605
2019-07-31 05:19:42 -07:00
Thomas Orozco
1c6ca01a25 mononoke/filestore: make writes fast
Summary:
This updates the Filestore to make writes faster by farming out all hashing to
separate Tokio tasks. This lets us increase throughput of the Filestore
substantially, since we're no longer limited by the ability of a single core to
hash data.

On my dev server, when running on a 1MB file, this lets us improve the
throughput of the Filestore for writes from 36.50 MB/s (0.29 Gb/s) to 152.61
MB/s (1.19 Gb/s) when using a chunk size of 1MB and a concurrency level of 10
(i.e. 10 concurrent chunk uploads).

Note that the chunk size has a fairly limited impact on performance (e.g.
making it 10KB instead has a <10% impact on performance).

Of course, this doesn't reflect performance when uploading to a remote
blobstore, but note that we can tune that by tweaking our upload concurrency
(making uploads faster at the expense of more memory).

 ---

Note that as part of this change, I updated the implementation away from stream
splitting and into an implementation that fans out to Sinks. I actually had
implementations of a higher-performance filestore for both approaches, but went
with this one because it doesn't require the incoming Stream to be Send (and I
have a forthcoming diff to make the whole Filestore not require a Send input),
which will be useful when integrating with the API Server, which unfortunately
does not provide us with a Send input.
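
As a toy illustration of the fan-out (using OS threads and std's `DefaultHasher`; the actual change spawns Tokio tasks and computes the real content hashes):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::thread;

/// Stand-in for hashing one chunk of data.
fn hash_chunk(chunk: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(chunk);
    hasher.finish()
}

/// Hash each chunk on its own worker so a single core no longer limits throughput.
fn hash_chunks_in_parallel(chunks: Vec<Vec<u8>>) -> Vec<u64> {
    let handles: Vec<_> = chunks
        .into_iter()
        .map(|chunk| thread::spawn(move || hash_chunk(&chunk)))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```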

Reviewed By: aslpavel

Differential Revision: D16560769

fbshipit-source-id: b2e414ea3b47cc4db17f82d982618bbd837f93a9
2019-07-31 05:19:42 -07:00
Thomas Orozco
9bc7076a6a mononoke/filestore: default to no chunking
Summary:
This is a very trivial patch simply intended to make it easier to safely roll
out the Filestore. With this, we can roll out the Filestore with chunking fully
disabled and make sure all hosts know how to read chunked data before we turn
it on.

Reviewed By: aslpavel, HarveyHunt

Differential Revision: D16560844

fbshipit-source-id: 30c49c27e839dfb06b417050c0fcde13296ddade
2019-07-31 05:19:42 -07:00
Thomas Orozco
6a9007e9e0 mononoke: add a filestore benchmark + configurable concurrency
Summary: This adds a filestore benchmark that allows for playing around with Filestore parameters and makes it easier to measure performance improvements.

Reviewed By: aslpavel

Differential Revision: D16559941

fbshipit-source-id: 50a4e91ad07bf6f9fc1efab14aa1ea6c81b9ca27
2019-07-31 05:19:41 -07:00
Thomas Orozco
b4a2d2f36a mononoke/apiserver: make DownloadLargeFile a streaming response
Summary: This updates the API server's DownloadLargeFile method to return a streaming response, instead of buffering all file contents in a Bytes blob.

Reviewed By: aslpavel

Differential Revision: D16494240

fbshipit-source-id: bdbece99215d87be6a65e67f8f2d920933109e15
2019-07-31 05:19:41 -07:00
Thomas Orozco
4e30164506 mononoke/blobimport: support LFS filenodes
Summary:
This adds support for LFS filenodes in blobimport. This works by passing a `--lfs-helper` argument, which should be an executable that can provide a LFS blob's contents on stdout when called with the OID and size of a LFS blob. My thinking is to `curl` directly from Dewey when running this in prod.

Note that, as of this change, we can blobimport LFS files, but doing so might be somewhat inefficient, since we'll roundtrip the blobs to our filestore, then generate filenodes. For now, however, I'd like to start with this so we can get a sense of whether this is acceptable performance-wise.
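
A hypothetical illustration of the `--lfs-helper` contract described above (the argument order here is an assumption): the helper is invoked with the OID and size, and the blob is read from its stdout.

```rust
use std::process::Command;

/// Ask the LFS helper executable for a blob, identified by OID and size.
fn fetch_lfs_blob(helper: &str, oid: &str, size: u64) -> std::io::Result<Vec<u8>> {
    let output = Command::new(helper).arg(oid).arg(size.to_string()).output()?;
    if !output.status.success() {
        return Err(std::io::Error::new(
            std::io::ErrorKind::Other,
            format!("lfs helper failed for oid {}", oid),
        ));
    }
    Ok(output.stdout)
}
```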

Reviewed By: farnz

Differential Revision: D16494241

fbshipit-source-id: 2ae032feb1530c558edf2cfbe967444a9a7c0d0f
2019-07-31 05:19:41 -07:00
Thomas Orozco
7c25a6010e mononoke: UploadHgFileContents: don't buffer contents to compute a Filenode ID
Summary: This updates our UploadHgFileContents::ContentUploaded implementation to not require buffering file contents in order to produce a Mercurial Filenode ID.

Reviewed By: farnz

Differential Revision: D16457833

fbshipit-source-id: ce2c5577ffbe91dfd0de1cac7d85b8d90ded140e
2019-07-31 05:19:40 -07:00
Thomas Orozco
f9360cab9d mononoke/filestore: incorporate in Mononoke
Summary:
NOTE: This isn't 100% complete yet. I have a little more work to do around the aliasverify binary, but I think it'll make sense to rework this a little bit with the Filestore anyway.

This patch incorporates the Filestore throughout Mononoke. At this time, what this means is:

- Blobrepo methods return streams of `FileBytes`.
- Various callsites that need access to `FileBytes` call `concat2` on those streams.

This also eliminates the Sha256 aliasing code that we had written for LFS and replaces it with a Filestore-based implementation.

However, note that this does _not_ change how files submitted through `unbundle` are written to blobstores right now. Indeed, those contents are passed into the Filestore through `store_bytes`, which doesn't do chunking. This is intentional since it lets us use LFS uploads as a testbed for chunked storage before turning it on for everything else (also, chunking those requires further refactoring of content uploads, since right now they don't expect the `ContentId` to come back through a Future).

The goal of doing it this way is to make the transition simpler. In other words, this diff doesn't change anything functionally; it just updates the underlying API we use to access files. This is also important for a smooth release: if we had new servers that started chunking things while old servers tried to read them, things would be bad. Doing it this way ensures that doesn't happen.

This means that streaming is there, but it's not being leveraged just yet. I'm planning to do so in a separate diff, starting with the LFS read and write endpoints in

Reviewed By: farnz

Differential Revision: D16440671

fbshipit-source-id: 02ae23783f38da895ee3052252fa6023b4a51979
2019-07-31 05:19:40 -07:00
Thomas Orozco
16801112a2 mononoke/filestore: add store_bytes for compatibility
Summary:
This adds a `store_bytes` call in the Filestore that can be used to store a set of Bytes *without chunking* and return a `FileContents` blob. This is intended as a transitional API while we incorporate the Filestore throughout the codebase. It's useful for 2 reasons:

- It lets us roll out chunked writes gradually where we need it. My goal is to use the Filestore chunking writes API for LFS, but keep using `store_bytes` initially in other places. This means content submitted through regular Mercurial bundles won't be chunked until we feel comfortable with chunking it.
- It lets us use the Filestore in places where we were relying on the assumption that you can immediately turn Bytes into a ContentId (notably: in the `UploadHgFileContents::RawBytes` code path).

This is intended to be removed later.

Reviewed By: aslpavel

Differential Revision: D16440670

fbshipit-source-id: e591f89bb876d08e6b6f805e35f0b791e61a6474
2019-07-31 05:19:40 -07:00
Thomas Orozco
4faaa6b2f7 mononoke/filestore: introduce peek()
Summary:
This adds a new `peek(.., N, ..)` call in the Filestore that allows reading at least N bytes from a file in the Filestore. This is helpful for generating Mercurial metadata blobs.

This is implemented using the same ChunkStream we use to write to the Filestore (but that needed a little fixing to support streams of empty bytes as well).
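
A hedged illustration of the "at least N bytes" behaviour (not the actual ChunkStream code): keep consuming chunks, including empty ones, until enough bytes have been gathered.

```rust
/// Gather chunks until at least `n` bytes are available; callers may get more.
fn peek_at_least(chunks: impl IntoIterator<Item = Vec<u8>>, n: usize) -> Vec<u8> {
    let mut out = Vec::new();
    for chunk in chunks {
        // Empty chunks simply contribute nothing and the loop keeps going.
        out.extend_from_slice(&chunk);
        if out.len() >= n {
            break;
        }
    }
    out
}
```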

Reviewed By: aslpavel

Differential Revision: D16440672

fbshipit-source-id: a584099f87ab34e2151b9c3f5c9f1289575f024b
2019-07-31 05:19:39 -07:00
Thomas Orozco
6d675846bf mononoke/filestore: switch to functions (config is only needed in writes)
Summary: This updates the Filestore API to be a set of functions, instead of a Struct. The rationale here is that there is only one Filestore call that needs anything on top of a blobstore (the write call), so turning the API into functions makes it much easier to incorporate the Filestore in the various places that need it without having to thread a Filestore instance all the way through.

Reviewed By: aslpavel

Differential Revision: D16440668

fbshipit-source-id: 4b4bc8872e205a66a12ec96a478f0f1811f2e6b1
2019-07-31 05:19:39 -07:00
Thomas Orozco
5e8148a968 mononoke: UploadHgFileContents: optimistically reuse HG filenodes
Summary:
This adds support for reusing Mercurial filenodes in
`UploadHgFileContents::execute`.

The motivation behind this is that given file contents, copy info, and parents,
the file node ID is deterministic, but computing it requires fetching and
hashing the body of the file. This implementation implements a lookup through
the blobstore to find a pre-computed filenode ID.

Doing so is in general a little inefficient (but it's not entirely certain that
the implementation I'm proposing here is faster -- more on this below), but
it's particularly problematic for large files. Indeed, fetching a multiple-GB file
to recompute the filenode even if we received it from the client can be fairly
slow (and use up quite a bit of RAM, though that's something we can mitigate by
streaming file contents).

One thing worth noting here (hence the RFC flag) is that there is a bit of a
potential for performance regression. Indeed, we could have a cache miss when
looking up the filenode ID, and then we'll have to fetch the file.

*At this time*, this is also somewhat inefficient, since we'll have to fetch
the file anyway to peek at its contents in order to generate metadata. This
is fixed later in this Filestore stack.

That said, an actual regression seems a little unlikely to happen, since in
practice we'll write out the lookup entry when accepting a pushrebase and
then do a lookup on it later when converting the pushrebased Bonsai changeset
to a Mercurial changeset.

If we're worried, then perhaps adding hit / miss stats on the lookup might make
sense. Let me know what you think.

 ---

Finally, there's a bit I don't love here, which is trusting LFS clients with the size of their uploads.
I'm thinking of fixing this when I finish the Filestore work.

Reviewed By: aslpavel

Differential Revision: D16345248

fbshipit-source-id: 6ce8a191efbb374ff8a1a185ce4b80dc237a536d
2019-07-31 05:19:39 -07:00
Thomas Orozco
386d2025fb mononoke/filestore: add invariant testing (lots of calls, simulated failures)
Summary: This adds invariant testing in the Filestore. Specifically, this tests for the fact that if you can read a file through any alias, you can read it through all aliases, and if you can read a file, you can also read its backmapping.

Reviewed By: StanislavGlebik

Differential Revision: D16440677

fbshipit-source-id: 737f736d99ec91bd6219d145380582341af755ae
2019-07-31 05:19:38 -07:00
Thomas Orozco
064b6a501c mononoke/filestore: rebuild metadata on read
Summary:
NOTE: This diff updates Thrift serialization for chunked files. Normally, this wouldn't be OK, but we never stored anything in this format, so that's fine.

This updates the Filestore to rebuild backmappings on read. The rationale for this is that since we store backmappings *after* writing FileContents, it is possible to have FileContents that logically exist but don't have backmappings. This should be a fairly exceptional case, but we can handle it by recomputing the missing backmappings on the fly.

As part of this change, I'm updating our Thrift serialization for chunked files and backmappings (as mentioned earlier, this is normally not a good idea, but it should be fine here):

- Chunked files now contain chunk lengths, which lets us derive offsets as well as total size for a chunked file.
- Backmappings don't contain the size anymore (since FileContents contains it now).

This was necessary because we need to have a file's size in order to recompute backmappings.
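
A small sketch of what per-chunk lengths buy us (a hypothetical type, not the real Thrift schema): both a file's total size and each chunk's offset can be derived without fetching any chunk data.

```rust
struct ChunkedFile {
    chunk_sizes: Vec<u64>, // length of each chunk, in order
}

impl ChunkedFile {
    /// Total logical file size.
    fn total_size(&self) -> u64 {
        self.chunk_sizes.iter().sum()
    }

    /// Byte offset at which each chunk starts within the logical file.
    fn offsets(&self) -> Vec<u64> {
        self.chunk_sizes
            .iter()
            .scan(0u64, |offset, len| {
                let start = *offset;
                *offset += len;
                Some(start)
            })
            .collect()
    }
}
```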

In this change, I also updated GitSha1 to stop embedding its blob type and length. This lets us reconstruct a strongly-typed GitSha1 object from just the hash (and was therefore necessary to avoid duplicating file size into backmappings), which seems sufficient at this stage (and if we do need the size, we can obtain it through the Filestore by querying the file).

Reviewed By: aslpavel

Differential Revision: D16440676

fbshipit-source-id: 23b66caf40fde2a2f756fef89af9fe0bb8bdadef
2019-07-31 05:19:38 -07:00
Thomas Orozco
c995b9f481 mononoke/filestore: Make ContentMetadata contents mandatory
Summary:
NOTE: This makes Thrift changes, but we never stored anything in this format.

This updates `ContentAliasBackmapping` to make all fields we are currently computing mandatory. We never stored anything using the old format, so this is safe to change for now.

In general, this makes code a little simpler, since we can rely on those fields to exist. Later on, if we add new backmappings, then we'll have to handle cases where they might not exist, but there's no reason to force this work upon ourselves for now if we don't need to.

Reviewed By: aslpavel

Differential Revision: D16440673

fbshipit-source-id: 6f5d0e4a687a2641a9b5e8e518859b796997e22c
2019-07-31 05:19:38 -07:00
Thomas Orozco
455ff23a1d mononoke/filestore: remove unused shared()
Summary: This removes unused calls to `shared()`. This lets us leverage the never type a little more and avoid having to spell out some `unreachable!`'s (instead, the compiler can prove them for us).

Reviewed By: StanislavGlebik

Differential Revision: D16440675

fbshipit-source-id: e53c496962ba9d3920dfae9953f6dc8778295509
2019-07-31 05:19:38 -07:00
Thomas Orozco
051000a6b8 mononoke/filestore: verify hashes before committing a store
Summary:
This adds support for verifying the hashes that were provided by writers (if any) when committing a store. This lets writers do conditional writes (i.e. if the writer knows the SHA-256 of their content, they can ask the blobstore to verify said SHA-256 when uploading).

Note that any uploaded chunks will not be cleaned up if a conditional write fails.
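
A minimal sketch of the conditional-write check, using the `sha2` crate for illustration (the real code verifies whichever hashes the writer supplied):

```rust
use sha2::{Digest, Sha256};

/// Refuse to commit the store if the writer-supplied SHA-256 doesn't match
/// what we actually hashed while consuming the content.
fn verify_sha256(content: &[u8], expected: &[u8; 32]) -> Result<(), String> {
    let actual = Sha256::digest(content);
    if &actual[..] == &expected[..] {
        Ok(())
    } else {
        Err("sha256 mismatch: refusing to commit this store".to_string())
    }
}
```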

Reviewed By: StanislavGlebik

Differential Revision: D16440669

fbshipit-source-id: 88bc99e646616997a4e9d7e59d59315c18f47da9
2019-07-31 05:19:37 -07:00
Thomas Orozco
d7a318f6fc mononoke/filestore: use a strongly typed ExpectedSize to avoid confusing hints and real sizes
Summary: This updates the Filestore implementation to use an ExpectedSize type to differentiate sizes we know to be correct from sizes that were given to us by writers. This ensures we can't accidentally use the ExpectedSize we received from the writer when we should be using the actual observed size.
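
A minimal sketch of the newtype idea (names are assumptions, not the real API): a writer-supplied size hint gets its own type, so it can't silently be used where a verified, observed size is required.

```rust
/// Size claimed by the writer; only a hint until checked against reality.
#[derive(Debug, Clone, Copy)]
pub struct ExpectedSize(u64);

impl ExpectedSize {
    pub fn new(hint: u64) -> Self {
        ExpectedSize(hint)
    }

    /// Compare the hint with the size we actually observed while streaming.
    pub fn check_equals(self, observed: u64) -> Result<u64, String> {
        if self.0 == observed {
            Ok(observed)
        } else {
            Err(format!("expected {} bytes, got {}", self.0, observed))
        }
    }
}
```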

Reviewed By: aslpavel

Differential Revision: D16440674

fbshipit-source-id: 8bf03b9a962339ea2896f2f60d4a52417ca0327e
2019-07-31 05:19:37 -07:00
Thomas Orozco
ff10fff199 mononoke/filestore: introduce chunking support
Summary:
This adds support for chunking in the Filestore. To do so, this reworks writes to be a 2 stage process:

- First, you prepare your write. This puts everything into place, but doesn't upload the aliases, backmappings, or the logical FileContents blob representing your file. If you're uploading a file that fits in a single chunk, preparing your write effectively makes no blobstore changes. If you're uploading chunks, then you upload the individual chunks (those are uploaded essentially as if they were small files).
- Then, you commit your write. At that point, the prepare step gave you `FileContents` that represent your write, and a set of aliases. To commit your write, you write the aliases, then the file contents, and then a backmapping (see the sketch below).
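
A rough sketch of the two-stage shape (names and types are assumptions, not the real Filestore API): `prepare` only uploads chunk data, while `commit` writes the aliases, the logical FileContents blob, and the backmapping.

```rust
struct Prepared {
    contents: Vec<u8>,    // stand-in for the logical FileContents value
    aliases: Vec<String>, // stand-in for the content aliases (e.g. sha256)
}

/// Stage 1: put everything in place. For a single-chunk file this makes no
/// blobstore changes; for a chunked file the individual chunks go up here.
fn prepare(data: &[u8]) -> Prepared {
    Prepared {
        contents: data.to_vec(),
        aliases: vec![format!("len:{}", data.len())],
    }
}

/// Stage 2: write the aliases, then the file contents, then a backmapping.
fn commit(prepared: Prepared) {
    for alias in &prepared.aliases {
        println!("put alias {}", alias);
    }
    println!("put file contents ({} bytes)", prepared.contents.len());
    println!("put backmapping");
}
```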

Note that we never create hierarchies when writing to the Filestore (i.e. chunked files always reference concrete chunks that contain file data), but that's not a guarantee we can rely on when reading. Indeed, since chunks and files are identical as far as blobstore storage is concerned, we have to handle the case where a chunked file references chunks that are themselves chunked (as I mentioned, we won't write it that way, but it could happen if we uploaded a file, then later on reduced the chunk size and wrote an identical file).

Note that this diff also updates the FileContents enum and adds unimplemented methods there. These are all going away later in this stack (in the diff where I incorporate the Filestore into the rest of Mononoke).

Reviewed By: StanislavGlebik

Differential Revision: D16440678

fbshipit-source-id: 07187aa154f4fdcbd3b3faab7c0cbcb1f8a91217
2019-07-31 05:19:37 -07:00
Thomas Orozco
1e18736a6e mononoke/filestore: initial implementation of the FileStore API
Summary:
NOTE: This was Jeremy's original API design for the Filestore. I've left this diff mostly unchanged (since I didn't feel super comfortable commandeering a diff then substantially changing it!), but note that the implementations for reads and writes were largely updated in my next diff, so it's probably not necessary to review them here except for context.

This is a basic implementation of the API. While the API is expressed in terms
of streaming, there's no chunking yet, either in memory or in storage.

Questions:
- Should `store` accept the aliases, or should it only take `ContentId` and always recompute the aliases?
- Should it check the ContentId is correct? It could do so easily now since it's walking the entire stream anyway, but it seems like an expensive API guarantee to make.

TODO:
- Interop testing with existing blobs
- Integrate into the rest of the code
- Spawn hashers into their own tasks for parallelism
- Chunking in later diff(s)

Reviewed By: StanislavGlebik

Differential Revision: D15795778

fbshipit-source-id: e56aad086ae9e3bba0227cf3c6206faac8c97f5e
2019-07-31 05:19:36 -07:00
Thomas Orozco
874e110c3d mononoke: move typed fetch to Blobstore Trait
Summary:
This moves typed fetches from the blobrepo to the blobstore. The upshot is that this allows consumers of a blobstore to do typed fetches, instead of forcing them to get bytes, then cast the bytes to a blob, then cast the blob to the thing they want.

This required refactoring our crate hierarchy a little bit by moving BlobstoreBytes into Mononoke Types, since we now need the Blobstore crate to depend on Mononoke types, whereas it was the other way around before (since BlobstoreValue was already there, that seems reasonable).
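
A hedged sketch of what a typed fetch looks like (trait and method names are assumptions): the caller asks for a concrete type and the bytes-to-value decoding lives in one place instead of at every call site.

```rust
use std::collections::HashMap;

/// Anything that knows how to decode itself from raw blob bytes.
trait BlobstoreValue: Sized {
    fn from_bytes(bytes: &[u8]) -> Result<Self, String>;
}

struct Blobstore {
    blobs: HashMap<String, Vec<u8>>,
}

impl Blobstore {
    fn get_raw(&self, key: &str) -> Option<&[u8]> {
        self.blobs.get(key).map(|b| b.as_slice())
    }

    /// Fetch a blob and decode it straight into the caller's type.
    fn fetch_typed<T: BlobstoreValue>(&self, key: &str) -> Result<Option<T>, String> {
        match self.get_raw(key) {
            None => Ok(None),
            Some(bytes) => T::from_bytes(bytes).map(Some),
        }
    }
}
```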

Reviewed By: StanislavGlebik

Differential Revision: D16486774

fbshipit-source-id: 05751986ce3cb7273d68a8b4ebe9957bb892bcd6
2019-07-31 05:19:36 -07:00
Thomas Orozco
6352d98a3e mononoke: don't require MononokeId impls to know how to recompute themselves from a value
Summary:
Currently, we require in the MononokeId trait that the Id know how to recompute itself from a value (and this has to return `Self`, not `Result<Self>`, so it also can't fail).

This isn't ideal for a few reasons:

- It means we cannot support the MononokeId trait (and therefore typed blobstore fetches) for things that aren't purely a hash of their contents
- One workaround (which jsgf had implemented for ContentAliasBackmappingId) is to peek at the contents and embed an ID in there. A downside of this approach is that we end up parsing the content twice when loading a value from a blobstore (once to get the ID, and once to get the contents later).
- It's a little inefficient in general, since it means we recompute hashes of things we just fetched just to know what their hash should be (which we often proceed to immediately discard afterwards). This could be worth doing if we verified that the ID we got is the ID we wanted, but we don't actually do this right now.

Reviewed By: StanislavGlebik

Differential Revision: D16486775

fbshipit-source-id: a75864eed3efa7e07b8bf642dbac3ada00cadc7c
2019-07-31 05:19:36 -07:00
Thomas Orozco
1871ddc473 mononoke/mononoke_types: Add a ContentAlias and ContentMetadata serialized types
Summary:
ContentAlias - the content of a blob mapping from an alias to the canonical id
ContentMetadata - metadata (aliases, size) for a piece of content

Reviewed By: StanislavGlebik

Differential Revision: D15404826

fbshipit-source-id: 7c5284a73caa873e7655858aa31f4817a1cb648b
2019-07-31 05:19:35 -07:00
Thomas Orozco
68b77f8fd1 mononoke/mononoke_types: add Sha1 and GitSha1 hash types
Summary:
These will be needed as object aliases, so put them on the same
footing as Sha256.

Refactor the implementation of these secondary hash functions so they are
implemented by macro.

Reviewed By: StanislavGlebik

Differential Revision: D15404825

fbshipit-source-id: 2ca65f96003e2b68875fad6f34e7c30d0dc6a8b1
2019-07-31 05:19:35 -07:00