Summary:
When a client requests a blob that is redacted, we should tell them that
instead of returning a 500. This diff does that: we now return a `410 Gone`
when redacted content is accessed.
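A minimal sketch of the status mapping, assuming a made-up error type (the real LFS server error enum differs):
```rust
use http::StatusCode;

// Hypothetical error type, for illustration only.
enum FetchError {
    Redacted,
    Missing,
    Internal(String),
}

// Redacted content is a deliberate "gone", not a server bug.
fn status_for(err: &FetchError) -> StatusCode {
    match err {
        FetchError::Redacted => StatusCode::GONE,                     // 410
        FetchError::Missing => StatusCode::NOT_FOUND,                 // 404
        FetchError::Internal(_) => StatusCode::INTERNAL_SERVER_ERROR, // 500
    }
}
```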
Reviewed By: farnz
Differential Revision: D20897251
fbshipit-source-id: fc6bd75c82e0cc92a5dbd86e95805d0a1c8235fb
Summary:
If a blob is redacted, we shouldn't crash in batch. Instead, we should return
that the blob exists, and let the download path return to the client the
information that the blob is redacted. This diff does that.
Reviewed By: HarveyHunt
Differential Revision: D20897247
fbshipit-source-id: 3f305dfd9de4ac6a749a9eaedce101f594284d16
Summary:
502 made a bit of sense since we can occasionally proxy things to upstream, but
it's not very meaningful, because our inability to service a batch request is
never fully upstream's fault (it would not be a failure if we had everything
internally).
So, let's just return a 500, which makes more sense.
Reviewed By: farnz
Differential Revision: D20897250
fbshipit-source-id: 239c776d04d2235c95e0fc0c395550f9c67e1f6a
Summary:
I noticed this while doing some unrelated work on this code. Basically, if we
get an error from upstream, then we shouldn't return an error to the client
*unless* upstream being down means we are unable to satisfy their request
(meaning we are unable to say whether a particular piece of content is
definitely present or definitely missing).
This diff fixes that. Instead of checking for success when hearing from
upstream _then_ running our routing logic, let's instead only fail if, in the
course of trying to route the client, we discover that we need a URL from
upstream AND upstream has failed.
Concretely, this means that if upstream blew up but internal has all the data
we want, we ignore the fact that upstream is down. In practice, internal is
usually very fast (because it's typically all locally-cached), so this is
unlikely to actually occur in real life, but it's still a good idea to account
for this failure scenario.
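A minimal sketch of that routing rule, with made-up types standing in for the real batch-handler ones:
```rust
// Illustrative only: the real logic lives in the LFS server's batch handler.
enum Upstream<T> {
    Available(T),
    Failed(String),
}

fn route_object(
    internal_url: Option<String>,
    upstream: &Upstream<Option<String>>,
) -> Result<Option<String>, String> {
    // If internal can satisfy the request, upstream's health is irrelevant.
    if let Some(url) = internal_url {
        return Ok(Some(url));
    }
    // Only now does an upstream failure matter.
    match upstream {
        Upstream::Available(url) => Ok(url.clone()),
        Upstream::Failed(err) => Err(format!("upstream required but failed: {}", err)),
    }
}
```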
Reviewed By: HarveyHunt
Differential Revision: D20897252
fbshipit-source-id: f5a8598e8a9da382d0d7fa6ea6a61c2eee8ae44c
Summary:
Right now we have a couple of functions, but they're not easily composable. I'd
like to make the redacted blobs configurable when creating a test repo, but I
also don't want to add 2 new variants, so let's create a little builder for
test repos.
This should make it easier to extend in the future to add more customizability
to test repos, which should in turn make it easier to write unit tests :)
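A rough sketch of the builder shape I have in mind (field and method names here are illustrative, not the final API):
```rust
use std::collections::HashMap;

#[derive(Default)]
struct TestRepoBuilder {
    // blob key -> redaction task/reason
    redacted: Option<HashMap<String, String>>,
}

impl TestRepoBuilder {
    fn new() -> Self {
        Self::default()
    }

    fn redacted(mut self, redacted: HashMap<String, String>) -> Self {
        self.redacted = Some(redacted);
        self
    }

    fn build(self) -> TestRepo {
        TestRepo {
            redacted: self.redacted.unwrap_or_default(),
        }
    }
}

struct TestRepo {
    redacted: HashMap<String, String>,
}
```
A test would then write something like `TestRepoBuilder::new().redacted(map).build()` rather than needing one constructor variant per option.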
Reviewed By: HarveyHunt
Differential Revision: D20897253
fbshipit-source-id: 3cb9b52ffda80ccf5b9a328accb92132261616a1
Summary:
This asyncifies the internals of `subcommand_tail`, which
loops over a stream, by taking the operation performed in
the loop and making it an async function.
The resulting code saves a few heap allocations by reducing
clones, and is also *much* less indented, which helps with
readability.
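The pattern, as a hedged sketch (stand-in types, not the actual subcommand_tail code):
```rust
use anyhow::Result;
use futures::stream::{Stream, StreamExt};

// Stand-ins for the real entry type and per-entry work.
struct BookmarkEntry;

async fn process_one(_entry: BookmarkEntry) -> Result<()> {
    Ok(())
}

// Instead of chaining combinators that need cloned state, drive the stream
// with a plain loop calling an async fn, so per-iteration state can be borrowed.
async fn tail_loop<S>(mut entries: S) -> Result<()>
where
    S: Stream<Item = Result<BookmarkEntry>> + Unpin,
{
    while let Some(entry) = entries.next().await {
        process_one(entry?).await?;
    }
    Ok(())
}
```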
Reviewed By: krallin
Differential Revision: D20664511
fbshipit-source-id: 8e81a1507e37ad2cc59e616c739e19574252e72c
Summary: These hooks behave the same way in Mercurial and Bonsai form. Port them over to operating on the Bonsai form.
Reviewed By: krallin
Differential Revision: D20891165
fbshipit-source-id: cbcdf217398714642d2f2d6669376defe8b944d7
Summary: Running on Mercurial hooks isn't scalable long term - move the consumers of hooks to run on both forms for a transition period
Reviewed By: krallin
Differential Revision: D20879136
fbshipit-source-id: 4630cafaebbf6a26aa6ba92bd8d53794a1d1c058
Summary: To use Bonsai-based hooks, we need to be able to load them. Make it possible.
Reviewed By: krallin
Differential Revision: D20879135
fbshipit-source-id: 9b44d7ca83257c8fc30809b4b65ec27a8e9a8209
Summary: We want all hooks to run against the Bonsai form, not a Mercurial form. Create a second form of hooks (currently not used) which acts on Bonsai changesets. Later diffs in the stack will move us over to Bonsai only, and remove support for Mercurial-changeset-derived hooks.
Reviewed By: krallin
Differential Revision: D20604846
fbshipit-source-id: 61eece8bc4ec5dcc262059c19a434d5966a8d550
Summary:
Thanks to StanislavGlebik for this idea: we can turn the loop over
changesets to upload into straightforward imperative code, instead
of using `.and_then` + `.fold`, by taking the next chunk in a
while loop.
The resulting code is probably easier to understand (depends whether
you come from a functional background, I guess), and it's less indented,
which is definitely more readable.
Reviewed By: StanislavGlebik
Differential Revision: D20881862
fbshipit-source-id: 7ecf76a2fae3eb0e6c24a1ee14e0684b6334b087
Summary:
A couple of minor improvements, removing some overhead:
- We don't need to pass cloned structs to `derive_data_for_csids`,
refs work just fine
- We can strip out one of the boxing blocks by directly assigning
an `async` block to `globalrevs_work`
- We can't do the same for `synced_commit_mapping_work` because
we have to iterate over `chunk` in synchronous code, so that
`chunk` can later be consumed by the line defining `changesets`.
Reviewed By: StanislavGlebik
Differential Revision: D20863304
fbshipit-source-id: 14cad3324978a66bcf325b77df7803d77468d30b
Summary:
This wound up being a little tricky, because
`async move` blocks capture any data they use,
and most of the fields of the `Blobimport` struct
are values rather than refs.
The easiest solution that I came up with, which looks
a little weird but works better than anything else
I tried, is to just inject a little block of code
(which I commented so it will hopefully be clear to
future readers) taking refs of anything that we need
to use in an async block but also have available later.
In the process, we are able to strip out a layer of
clones, which should improve efficiency a bit.
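A simplified sketch of the trick (the `Blobimport` fields here are placeholders):
```rust
struct Blobimport {
    repo_name: String,
    changesets: Vec<u64>,
}

impl Blobimport {
    async fn import(&self) {
        // Take refs of anything we need in the async block but also want
        // available later, so the owned fields aren't moved into the block.
        let repo_name = &self.repo_name;
        let changesets = &self.changesets;

        let upload = async move {
            // Uses only the references captured above.
            println!("importing {} changesets into {}", changesets.len(), repo_name);
        };
        upload.await;

        // The owned fields are still usable here.
        println!("done with {}", self.repo_name);
    }
}
```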
Reviewed By: StanislavGlebik
Differential Revision: D20862358
fbshipit-source-id: 186bf9939b9496c432ff0d9a01e602da47f4b5d4
Summary: Some methods that were unused or barely used outside of the cmdlib crate were made non-public (parse_caching, CachelibSettings, init_cachelib_from_settings).
Reviewed By: krallin
Differential Revision: D20671251
fbshipit-source-id: 232e786fa5af5af543239aca939cb15ca2d6bc10
Summary:
"Old" is defined as being based on a commit that is more than 30 days old.
The build date is taken from the version string.
One observation is that if we fail to release for more than 30 days then all
users will start seeing this message without any way of turning it off. It
doesn't seem worthwhile to add a config for silencing it, though.
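A hedged sketch of the check, assuming the version string embeds a yyyymmdd build date and that the caller passes a cutoff date 30 days in the past in the same form (the actual format and the real hg-side code may differ):
```rust
// e.g. is_stale("4.4.2_20200407_123456_abcdef", 20200307) -> false (build is newer)
fn is_stale(version: &str, cutoff_yyyymmdd: u32) -> bool {
    version
        .split('_')
        .find(|part| part.len() == 8 && part.chars().all(|c| c.is_ascii_digit()))
        .and_then(|part| part.parse::<u32>().ok())
        // yyyymmdd integers compare in date order, so this is a date comparison.
        // A dev build without a parsable date never warns.
        .map_or(false, |build_date| build_date < cutoff_yyyymmdd)
}
```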
Reviewed By: quark-zju
Differential Revision: D20825399
fbshipit-source-id: f97518031bbda5e2c49226f3df634c5b80651c5b
Summary:
I decided to go with an integration test because backfilling derived data at the
moment requires two separate calls - a first one to prefetch changesets, and a
second one to actually run the backfill. So an integration test is better suited to
this case than unit tests.
While doing so I noticed that fetch_all_public_changesets actually won't fetch
all changesets - it loses the last commit because t_bs_cs_id_in_range was
returning an exclusive range (i.e. max_id was not included). I fixed the bug and made the name clearer.
Reviewed By: krallin
Differential Revision: D20891457
fbshipit-source-id: f6c115e3fcc280ada26a6a79e1997573f684f37d
Summary:
`log_v2` supports time filters, which means it needs to be able to drop the history stream once the commits get older than the given time frame (if not, it just traverses the whole history...).
However, this cannot be done from the SCS commit_path API or from changeset_path, because they already receive a history stream where commits are not ordered by creation time. And the naive solution "if the next commit in the stream is older than `after_ts` then drop" won't work: there might be another branch (a commit after the current one) which is still **in** the time frame.
I added a terminator function to `list_file_history` that is called on the changeset id for which a new fastlog batch is about to be fetched. If the terminator returns true, the fastlog is not fetched and the current history branch is dropped. All ready nodes are still streamed.
For example, if we have a history of the file changes like this:
```
A 03/03 ^|
| |
B 02/03 |
| | - one fastlog batch
C 01/03 |
| \ |
02/01 D E 10/02 _| - let's assume that fastlog batches for D's and E's ancestors need to be prefetched
| |
01/01 F G 05/02
```
# Example 1
We query "history from A after time 01/02"
The old version would fetch all the commits and then filter them in `commit_path`. We would fetch both fastlog batches for the D branch and E branch.
With the terminator, `list_file_history` will call terminator on commit D and get `true` in return and then will drop the D branch,
then it will call terminator on E and get `false` and proceed with fetching fastlog for the E branch.
# Example 2
We query "history from A after time 01/04"
The old version would fetch all the commits and then filter them in `commit_path`, despite the fact that
the very first commit is already older than needed.
With the terminator it will call terminator on A and get `true` and won't proceed any further.
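A sketch of the terminator shape (names and signatures are illustrative, not the real fastlog API):
```rust
use anyhow::Result;

struct ChangesetId(u64);

// In the real code this would look up the commit's creation time.
async fn changeset_timestamp(_cs_id: &ChangesetId) -> Result<i64> {
    Ok(0)
}

// Called right before a fastlog batch would be fetched for `cs_id`; returning
// true drops that history branch, while already-ready nodes are still streamed.
async fn older_than(cs_id: ChangesetId, after_ts: i64) -> Result<bool> {
    Ok(changeset_timestamp(&cs_id).await? < after_ts)
}
```
`commit_path` would then pass something like `|cs_id| older_than(cs_id, after_ts)` as the terminator whenever a time filter is set.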
Reviewed By: StanislavGlebik
Differential Revision: D20801029
fbshipit-source-id: e637dcfb6fddceb4a8cfc29d08b427413bf42e79
Summary: Asyncified the main functions of fastlog/ops, so it'll be easier to modify them and proceed with the new features.
Reviewed By: StanislavGlebik
Differential Revision: D20801028
fbshipit-source-id: 2a03eedca776c6e1048a72c7bd613a6ef38c5c17
Summary: We need to parse `directories` here. Let's do so.
Reviewed By: HarveyHunt
Differential Revision: D20869830
fbshipit-source-id: 74830aa0045b801fba089812447fb61d7d09ad14
Summary: As it says in the title!
Reviewed By: HarveyHunt
Differential Revision: D20869828
fbshipit-source-id: df7728ce548739ef2dadad1629817fb56c166b66
Summary:
We use the logged arguments directly for wireproto replay, and then we replay
this directly in traffic replay, but just joining a list with `,` doesn't
actually work for directories:
- We need trailing commas
- We need wireproto encoding
This does that. It also clarifies that this encoding is for debug purposes by
updating function names, and relaxes a bunch of types (since hgproto uses
bytes_old).
Reviewed By: StanislavGlebik
Differential Revision: D20868630
fbshipit-source-id: 3b805c83505aefecd639d4d2375e0aa9e3c73ab9
Summary: This is helpful to draw conclusions as to how fast it is.
Reviewed By: StanislavGlebik
Differential Revision: D20872108
fbshipit-source-id: d323358bbba29de310d6dfb4c605e72ce550a019
Summary:
`list_file_history` implements BFS on the commit graph and returns a stream of changeset ids using `bounded_traversal_stream`.
The old version iterated BFS "levels" and each iteration streamed all nodes on the current level. For example, for the commit graph:
```
1 <- start # 1 level
|
2 # 2
| \
3 4 # 3
| |
```
there would be 3 iterations, and on each one the nodes of that level would be yielded: [1], [2], [3, 4]. If fastlog batches needed to be prefetched, they were prefetched in parallel for the changesets on the same level.
The implementation was a bit hacky, and it is a bit unfortunate that we need to make 100 iterations to stream changesets that are ready and do not require fetching fastlog. I also needed some simplification so I could then add a terminator function (3rd diff in the stack) at the fastlog batch prefetching stage (and add Deleted Manifest integration).
So now `bounded_traversal_stream` keeps a BFS queue as its state, and on each iteration streams all nodes in the queue until it hits a node for which it needs to prefetch a fastlog batch, then goes to the next iteration.
```
state - [queue, prefetch_cs_id]
* on each iteration:
1. If prefetch_cs_id.is_some() =>
- fetch fastlog batch for prefetch_cs_id
- fill the commit graph
- add parents of the prefetch_cs_id to the bfs queue
2. Get from the queue all nodes until we meet changeset without fetched parents.
Mark this node as `prefetch_cs_id` for the next iteration.
3. Stream ready nodes and go to the next iteration.
```
Thus
- we still fetch fastlog batches on demand and not before we really need them
- if we have 100 commits in the queue that are ready to be yielded, we won't do 100 iterations, and will stream them in one go
- if we now need to prefetch fastlog batches for 2 branches on the same "bfs level", we will do it one by one and not in parallel, but this situation is pretty uncommon
- the code is simpler and allows us to integrate Deleted Manifest and add a terminator function.
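A plain-Rust sketch of the queue-plus-prefetch state (the real code expresses this as `bounded_traversal_stream` state; these types are stand-ins and visited-set bookkeeping is omitted):
```rust
use std::collections::{HashMap, VecDeque};

type CsId = u64;

struct Traversal {
    queue: VecDeque<CsId>,
    // Nodes whose parents are already known, i.e. their fastlog batch is fetched.
    parents: HashMap<CsId, Vec<CsId>>,
}

impl Traversal {
    // One "iteration": drain every ready node, stopping at the first node
    // whose parents still need a fastlog prefetch.
    fn next_chunk(&mut self) -> (Vec<CsId>, Option<CsId>) {
        let mut ready = Vec::new();
        while let Some(cs_id) = self.queue.pop_front() {
            match self.parents.get(&cs_id) {
                Some(parents) => {
                    self.queue.extend(parents.iter().copied());
                    ready.push(cs_id);
                }
                None => {
                    // Parents unknown: the caller fetches the fastlog batch for
                    // this node, fills `parents`, and resumes; put it back first.
                    self.queue.push_front(cs_id);
                    return (ready, Some(cs_id));
                }
            }
        }
        (ready, None)
    }
}
```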
Reviewed By: StanislavGlebik
Differential Revision: D20768401
fbshipit-source-id: cdba40539a842b3628826f6c72a29514da0d539e
Summary:
In the previous diff we asyncified the signature of Blobimport::import,
but the body remained an old-style future with a compat and await at
the end.
This diff asyncifies the outermost logic from within the function,
slightly improving readability and removing one layer of clones
to cut down on heap allocations. The derivation of `max_rev` still
currently uses old-style streams and futures.
Reviewed By: StanislavGlebik
Differential Revision: D20861230
fbshipit-source-id: 1b462f17581c764e77a0a0a163c86ffa894df742
Summary:
Switch the Blobimport struct to take a reference to a ctx,
and have `import` be an `async fn`.
Reviewed By: StanislavGlebik
Differential Revision: D20861165
fbshipit-source-id: eda9d599af2e525ec3142facc1eeb6b5b433ab06
Summary: Add default implementations for the sampling blobstore's put and is_present handlers to save some boilerplate.
Reviewed By: farnz
Differential Revision: D20868507
fbshipit-source-id: 40275cc832870019238c0635e097e53671b76783
Summary:
This way we'll never select more than (no_of_stores * limit) rows, rather than
a potentially unbounded number.
NOTE: This diff has to be landed and rolled out **after** D20557702 is rolled out. I'm assuming that after some time since D20557702 rollout all the rows in the production db will have proper `operation_key` value set so we can make queries based on them.
Reviewed By: krallin
Differential Revision: D20557700
fbshipit-source-id: 5a1d4b69949b425915214f5227c5c0dcce374360
Summary: So we're sure that all the queries work not only in sqlite.
Reviewed By: krallin
Differential Revision: D20839958
fbshipit-source-id: 9d05cc175d65396af7495b31f8c6958ac7bd8fb6
Summary:
When we have multiple entries with a single blob key, we always select all of them,
regardless of when and how they were added. That's why we need to select based on
operation_key.
Reviewed By: krallin
Differential Revision: D20557699
fbshipit-source-id: 77ccf992bb24d1a46ea28a13ab0780e6c92935ae
Summary: Log errors to scuba regardless of the error_as_data setting, as finding the logs is much easier from scuba than from stderr.
Reviewed By: farnz
Differential Revision: D20838462
fbshipit-source-id: b78e3a3213ed4aee4e4b2feb871ad7e42e25ed00
Summary:
Combined with the unbundle resolver stats, we will be able to say which
percentage of pushrebases fails, for example.
Reviewed By: StanislavGlebik
Differential Revision: D20818224
fbshipit-source-id: 70888b1cb90ffae8b11984bb024ec1db0e0542f7
Summary:
We need this to be able to monitor how frequently we get pushes vs
infinitepushes, etc. A further diff will add similar reporting to
`processing.rs`, so that we can compute a percentage of successful pushes to
all pushes, for example.
Reviewed By: StanislavGlebik
Differential Revision: D20818225
fbshipit-source-id: 7945dc285560d1357bdc6aef8e5fe50b61622254
Summary:
Our phases caching wasn't great. If you tried to ask for a draft commit then
we'd call the mark_reachable_as_public method, but this method was bypassing
caches.
The reason we had this problem was that we had caching at a higher level
than necessary - we had the SqlPhases struct, which was "smarter" (i.e. it has the
logic of traversing ancestors of public heads and marking these ancestors as
public), and SqlPhasesStore, which just did sql access. Previously we had our
caching layer on top of SqlPhases, meaning that when SqlPhases calls
`mark_reachable_as_public` it can't use caches anymore.
This diff fixes it by moving caching one layer lower - now we have a cache
right on top of SqlPhasesStore. Because of this change we no longer need
CachingPhases, and they were removed. Also the `ephemeral_derive` logic was
simplified a bit.
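Roughly, the layering change looks like this (names and fields approximate):
```rust
// Before: the cache wrapped SqlPhases, so SqlPhases' own traversal bypassed it.
// After: the cache sits directly on top of the SQL store.
struct SqlPhasesStore; // plain sql access

struct CachingPhasesStore {
    inner: SqlPhasesStore,
    // memcache / cachelib handles would live here
}

// The "smart" layer that walks ancestors of public heads.
struct SqlPhases {
    store: CachingPhasesStore,
}

impl SqlPhases {
    fn mark_reachable_as_public(&self) {
        // Every read/write goes through `self.store`, so the traversal now
        // hits the cache instead of bypassing it.
    }
}
```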
Reviewed By: krallin
Differential Revision: D20834740
fbshipit-source-id: 908b7e17d6588ce85771dedf51fcddcd2fabf00e
Summary:
Very small refactoring to store a MemcacheHandler (i.e. an enum which can either
be a real Memcache client or a mock) instead of a memcache client.
It will be used in the next diff to create mock caches.
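The shape of the enum, roughly (the real variants and client type may differ):
```rust
use std::collections::HashMap;

// Stand-in for the actual memcache client type.
struct MemcacheClient;

impl MemcacheClient {
    fn get(&self, _key: &str) -> Option<Vec<u8>> {
        None
    }
}

enum MemcacheHandler {
    Real(MemcacheClient),
    // An in-memory map is enough to mock caching behaviour in unit tests.
    Mock(HashMap<String, Vec<u8>>),
}

impl MemcacheHandler {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        match self {
            MemcacheHandler::Real(client) => client.get(key),
            MemcacheHandler::Mock(map) => map.get(key).cloned(),
        }
    }
}
```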
Reviewed By: krallin
Differential Revision: D20834916
fbshipit-source-id: cb1e3e8f0ae0e2c0f7018d3a003ada56725c65c6
Summary: SelectPhases does the same thing - no need to keep two queries
Reviewed By: krallin
Differential Revision: D20817379
fbshipit-source-id: 8cc56ea4a94e81f110a286899a8f5c596566a142
Summary: I'm going to refactor it soon, for now just move it to another file.
Reviewed By: krallin
Differential Revision: D20817293
fbshipit-source-id: 6fb44b4be858ebbd0e8c9dfee160b91806f78285
Summary:
This diff turns off the support_old_nightly feature of async-trait (https://github.com/dtolnay/async-trait/blob/0.1.24/Cargo.toml#L28-L32) everywhere in fbcode. I am getting ready to remove the feature upstream. It was an alternative implementation of async-trait that produces worse error messages but supports some older toolchains dating back to before stabilization of async/await that the default implementation does not support.
This diff includes updating async-trait from 0.1.24 to 0.1.29 to pull in fixes for some patterns that used to work in the support_old_nightly implementation but not the default implementation.
Differential Revision: D20805832
fbshipit-source-id: cd34ce55b419b5408f4f7efb4377c777209e4a6d
Summary:
Add a fingerprint method that returns a subset of the hash.
This will allow us to see compression benefit, or write out a corpus, sampling 1 in N of a group of keys.
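As a sketch of the idea (not the actual hash type or method name):
```rust
// Take the leading bytes of the (already uniformly distributed) content hash
// as a fingerprint, then keep roughly 1 in `sample_rate` keys.
fn sampling_fingerprint(hash_bytes: &[u8]) -> u64 {
    hash_bytes
        .iter()
        .take(8)
        .fold(0u64, |acc, &b| (acc << 8) | u64::from(b))
}

fn sample_this_key(hash_bytes: &[u8], sample_rate: u64) -> bool {
    sample_rate != 0 && sampling_fingerprint(hash_bytes) % sample_rate == 0
}
```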
Reviewed By: krallin
Differential Revision: D20541312
fbshipit-source-id: 93bd44ba9c14285daf50d8cd18eeb4b6dcc38d82
Summary:
Use the new sampling blobstore and sampling key in the existing compression-benefit subcommand and check the new vs old reported sizes.
The overall idea for these changes is that the walker uses a CoreContext tagged with a SamplingKey to correlate walker steps for a node to the underlying blobstore reads; this allows us to track the overall byte size (used in scrub stats) or the data itself (used in compression benefit) per node type.
The SamplingVisitor and NodeSamplingHandler cooperate to gather the sampled data into the maps in NodeSamplingHandler, which the output stream from the walk then operates on, e.g. to compress the blobs and report on compression benefit.
The main new logic sits in sampling.rs; it is used from sizing.rs (and later in the stack from scrub.rs).
Reviewed By: krallin
Differential Revision: D20534841
fbshipit-source-id: b20e10fcefa5c83559bdb15b86afba209c63119a
Summary: Now that everything is using `sql_construct`, we can remove the old `SqlConstructors` trait.
Reviewed By: StanislavGlebik
Differential Revision: D20734881
fbshipit-source-id: af46b41d17b40f6eb0839cdb9e85b00067360fe9
Summary:
Migrate the configuration of sql data managers from the old configuration using `sql_ext::SqlConstructors` to the new configuration using `sql_construct::SqlConstruct`.
In the old configuration, sharded filenodes were included in the configuration of remote databases, even when that made no sense:
```
[storage.db.remote]
db_address = "main_database"
sharded_filenodes = { shard_map = "sharded_database", shard_num = 100 }
[storage.blobstore.multiplexed]
queue_db = { remote = {
db_address = "queue_database",
sharded_filenodes = { shard_map = "valid_config_but_meaningless", shard_num = 100 }
}
```
This change separates out:
* **DatabaseConfig**, which describes a single local or remote connection to a database, used in configuration like the queue database.
* **MetadataDatabaseConfig**, which describes the multiple databases used for repo metadata.
**MetadataDatabaseConfig** is either:
* **Local**, which is a local sqlite database, the same as for **DatabaseConfig**; or
* **Remote**, which contains:
* `primary`, the database used for main metadata.
* `filenodes`, the database used for filenodes, which may be sharded or unsharded.
More fields can be added to **RemoteMetadataDatabaseConfig** when we want to add new databases.
New configuration looks like:
```
[storage.metadata.remote]
primary = { db_address = "main_database" }
filenodes = { sharded = { shard_map = "sharded_database", shard_num = 100 } }
[storage.blobstore.multiplexed]
queue_db = { remote = { db_address = "queue_database" } }
```
The `sql_construct` crate facilitates this by providing the following traits:
* **SqlConstruct** defines the basic rules for construction, and allows construction based on a local sqlite database.
* **SqlShardedConstruct** defines the basic rules for construction based on sharded databases.
* **FbSqlConstruct** and **FbShardedSqlConstruct** allow construction based on unsharded and sharded remote databases on Facebook infra.
* **SqlConstructFromDatabaseConfig** allows construction based on the database defined in **DatabaseConfig**.
* **SqlConstructFromMetadataDatabaseConfig** allows construction based on the appropriate database defined in **MetadataDatabaseConfig**.
* **SqlShardableConstructFromMetadataDatabaseConfig** allows construction based on the appropriate shardable databases defined in **MetadataDatabaseConfig**.
Sql database managers should implement:
* **SqlConstruct** in order to define how to construct an unsharded instance from a single set of `SqlConnections`.
* **SqlShardedConstruct**, if they are shardable, in order to define how to construct a sharded instance.
* If the database is part of the repository metadata database config, either of:
* **SqlConstructFromMetadataDatabaseConfig** if they are not shardable. By default they will use the primary metadata database, but this can be overridden by implementing `remote_database_config`.
* **SqlShardableConstructFromMetadataDatabaseConfig** if they are shardable. They must implement `remote_database_config` to specify where to get the sharded or unsharded configuration from.
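For the unsharded case, an implementation might look roughly like this (the trait items shown are assumptions based on the description above, not the verbatim `sql_construct` definitions):
```rust
pub struct SqlConnections; // stand-in for the real connection bundle

pub trait SqlConstruct: Sized {
    const LABEL: &'static str;
    const CREATION_QUERY: &'static str;

    fn from_sql_connections(connections: SqlConnections) -> Self;
}

pub struct SqlMyStore {
    connections: SqlConnections,
}

impl SqlConstruct for SqlMyStore {
    const LABEL: &'static str = "my_store";
    const CREATION_QUERY: &'static str =
        "CREATE TABLE IF NOT EXISTS my_store (id INTEGER PRIMARY KEY)";

    fn from_sql_connections(connections: SqlConnections) -> Self {
        Self { connections }
    }
}
```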
Reviewed By: StanislavGlebik
Differential Revision: D20734883
fbshipit-source-id: bb2f4cb3806edad2bbd54a47558a164e3190c5d1
Summary:
Refactor `sql_ext::SqlConstructors` and its related traits into a separate
crate. The new `SqlConstruct` trait is joined by `SqlShardedConstruct` which
allows construction of sharded databases.
The new crate will support a new configuration model where the distinction
between the database configuration for different repository metadata types
will be made clear.
Reviewed By: StanislavGlebik
Differential Revision: D20734882
fbshipit-source-id: b44cf9d1efd014c29df88e2ad933025e440119dc
Summary:
The new API does nothing that cloud sync does not want: bookmarks, obsmarkers,
prefetch, etc. Wrappers to disable features are removed.
This solves a "lagged master" issue where selectivepull adds `-B master` to
pull extra commits but cloud sync cannot hide them without narrow-heads. Now
cloud sync just does not pull the extra commits.
Reviewed By: sfilipco
Differential Revision: D20808884
fbshipit-source-id: 0e60d96f6bbb9d4ce02c04e8851fc6bda442c764
Summary:
For the initial rollout of lfs on fbsource we want to roll it out just for our
team using the rollout_smc_tier option. This diff adds support for that in
Mononoke.
It spawns a future that periodically updates the list of enabled hosts in the smc tier.
I had a slight concern about listing all the available services and storing
them in memory - what if the smc tier has too many services? I decided to go ahead
with that because
1) The [Smc antipatterns](https://fburl.com/wiki/ox43ni3a) wiki page doesn't seem
to list it as a concern.
2) We are unlikely to use it for a large tier - most likely we'll use it just for
hg-dev, which contains < 100 hosts.
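The refresh task is roughly this shape (`list_tier_hosts` stands in for whatever SMC lookup the real code uses, and the interval is made up):
```rust
use std::collections::HashSet;
use std::sync::{Arc, RwLock};
use std::time::Duration;

async fn list_tier_hosts(_tier: &str) -> HashSet<String> {
    // Placeholder for the real SMC query.
    HashSet::new()
}

fn spawn_rollout_refresher(tier: String) -> Arc<RwLock<HashSet<String>>> {
    let hosts = Arc::new(RwLock::new(HashSet::new()));
    let shared = hosts.clone();
    tokio::spawn(async move {
        loop {
            let latest = list_tier_hosts(&tier).await;
            *shared.write().expect("lock poisoned") = latest;
            tokio::time::sleep(Duration::from_secs(60)).await;
        }
    });
    hosts
}

// The serving path then only does a membership check:
// hosts.read().unwrap().contains(client_hostname)
```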
Reviewed By: krallin
Differential Revision: D20789751
fbshipit-source-id: d35323e49530df6983e159e2ed5bce205cc5666d
Summary:
This is a followup from D20766465. In D20766465 we avoided re-traversing
fsnodes for all entries except for copy/move sources. This diff makes copy/move
source fetching more efficient as well.
It does so by sending a find_entries() request to prefetch all the fsnodes.
Reviewed By: mitrandir77
Differential Revision: D20770182
fbshipit-source-id: 7e4a68a2ded20b2895ee4d1c4f8fd897dbe1c850
Summary:
We are currently having problems with streaming clone:
```
$ hg --config 'extensions.fsmonitor=!' clone --shallow -U --config 'ui.ssh=ssh -oControlMaster=no' --configfile /etc/mercurial/repo-specific/fbsource.rc 'ssh://hg.vip.facebook.com//data/scm/fbsource?force_mononoke' "$(pwd)/fbsource-clone-test"
remote: server: https://fburl.com/mononoke
remote: session: vJ3qkiQIm9FT7mCp
connected to twshared11499.02.cln2.facebook.com
streaming all changes
2 files to transfer, 5.42 GB of data
abort: unexpected response from remote server:
'\x00\x01B?AB\x00\x00\x00\x00\x02U\x00\x00\x02\xc7\x00b\xf0\xd5\x00b\xf0\xd5\x00b\xf0\xd4\xff\xff\xff\xff\xa8z\xc7W\xd0&\xab\xb2\xf1{\xbfq\xac<\xaf6W\x06q\x81\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01B?C\x97\x00\x00\x00\x00\x053\x00\x00\x06\xce\x00b\xf0\xd6\x00b\xf0\xd6\x00b\xf0\xd5\xff\xff\xff\xff\xa3I\x19+\xe2\x0f\xae\xd2\x95\x14\x8a\xde\x19\x18\xf0\x8cUQu\xf1\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01B?H\xca\x00\x00\x00\x00\x02\xe4\x00\x00\x03\x9e\x00b\xf0\xd7\x00b\xf0\xd7\x00b\xf0\xd6\xff\xff\xff\xffx\xd6}\x12nt\xb9\xbc(\x83\xfb\xfa\xcc\xc1o?\xde\xcc\x06L\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01B?K\xae\x00\x00\x00\x00\x02j\x00\x00\x02\xb5\x00b\xf0\xd8\x00b\xf0\xd8\x00b\xf0\xd7\xff\xff\xff\xff\x04"\xfcw6\'M\xba\xf1f\xdb\x02\xbeE\x93:\xc8\x17\x88P\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01B?N\x18\x00\x00\x00\x00\x03\xbb\x00\x00\x04\xb8\x00b\xf0\xd9\x00b\xf0\xd9\x00b\xf0\xd8\xff\xff\xff\xff\xb9\x15*p/\xa4*\x00\x9dZw\x01B\x87L\x8f\x08\x11\x89\xe0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0000changelog.d\x005406413267\n'
```
As a result of the debugging, it turned out that we are sending more data than expected. This diff adds a better error, so that next time we'll know if we have any corruption of the `streaming_changelog_chunks` table.
Reviewed By: StanislavGlebik
Differential Revision: D20763738
fbshipit-source-id: 6f6fa9f9a29909e044d9ba42fe84916ddcb62e8f
Summary: Each benchmark takes about 3 minutes to run. We've already got 16 benchmarks, and we're going to grow. Allow limiting the number of benchmarks we run at once.
Reviewed By: ahornby
Differential Revision: D20735795
fbshipit-source-id: 241184085b35da8ab85314fef1c6a08404bdb769
Summary: We're going to be doing a variety of changes to sqlblob - let's enable working against a known baseline each time, instead of incremental changes.
Reviewed By: ahornby
Differential Revision: D20735796
fbshipit-source-id: 86f15dac1f004b2f3c83ced829a65f3f6e111d6b
Summary:
We want to be able to benchmark blobstore stacks so that we can compare them, and ensure that we're not regressing as we improve SQLblob to production state.
Use Criterion to benchmark a few basic workflows - starting with writes, but reads can be added later.
Reviewed By: ahornby
Differential Revision: D20720789
fbshipit-source-id: e8b10664a9d08a1aa7e646e1ebde251bec0db991
Summary: Use the client identity middleware from gotham_ext in the EdenAPI server. This middleware parses validated client identities from HTTP headers inserted by Proxygen; these identities can then be used to enforce repo ACLs.
Reviewed By: HarveyHunt
Differential Revision: D20744887
fbshipit-source-id: 651e171d1b20448b3e99bfc938d118fb6dddea91
Summary: Looks like it's not dead anymore
Reviewed By: krallin
Differential Revision: D20766497
fbshipit-source-id: c49ae3b6c8a660b33e61e65adda94f78addd1498
Summary:
It is preferable to use the higher-level API of cached_config instead of ConfigeratorAPI whenever possible since the higher-level API supports OSS builds.
For `ConfigStore` let `poll_interval` be None so that for one-off reading of configs the ConfigStore doesn't needlessly spawn an updating thread.
Also, this update is in compliance with the discussion in D19026190.
Reviewed By: ahornby
Differential Revision: D20670224
fbshipit-source-id: 24fc124d440fd458a9fa88a906fc3a1cfdbd827e
Summary: In the process the blobstore/factory/lib.rs was split into submodules - this way it was easier to untangle the dependencies and refactor it, so I've left the split in this diff.
Reviewed By: markbt
Differential Revision: D20302068
fbshipit-source-id: caa3a2b5487c30198c62f7e4f4e9cb7c488dc8de
Summary:
As suggested in D20680173, we can reduce the overall need to copy things by
storing refs in the resolver.
Reviewed By: krallin
Differential Revision: D20696588
fbshipit-source-id: 9456e2e208cfef6faed57fc52ca59fafdccfc68c
Summary:
See bottom diff of this stack for overview.
This diff in particular asyncifies the `upload_changeset` fn. Apart from that,
it also makes sure it can accept `&RevlogChangeset` instead of
`RevlogChangeset`, which helps us to get rid of cloning.
Reviewed By: krallin
Differential Revision: D20693932
fbshipit-source-id: b0e5e1604cbfb6f6b6e269c85a79208115325734
Summary: Same as the bottom diff of this stack, but for another file.
Reviewed By: krallin
Differential Revision: D20693934
fbshipit-source-id: 4c2d12bf9d9ab272898a7830ece6d9f563adb8fb
Summary:
This diff focuses on the following:
- replaces clones with references, both when this decreases the total number of
clones and when it leaves the only clone on the boundary with the
compat code. Thus, when those boundaries are pushed further, we only need to fix
one place in the resolver
- removes a weird wrapping of a closure into an `Arc` and just calls
`upload_changesets` directly instead
- in cases when `BundleResolver` methods take `ctx` as an argument, removes it
and makes those methods use the one stored in the struct
Reviewed By: StanislavGlebik
Differential Revision: D20680173
fbshipit-source-id: c397c4ade57a07cbbc9206fa8a44f4225426778c
Summary:
This bitrotted through two different changes:
- D19473960 put it on a v2 runtime, but the I/O is v1 so that doesn't work (it
panics).
- The clap update a couple months ago made duplicate arguments illegal, and a
month before that we had put `debug` in the logger args (arguably where it
belongs), so this binary was now setting `debug` twice, which would now panic.
Evidently, static typing alone wasn't quite enough to keep this working through
time (though that's perhaps because both of those changes were
invisible to the type system), so I also added a smoke test for this.
Reviewed By: farnz
Differential Revision: D20618785
fbshipit-source-id: a1bf33783885c1bb2fe99d3746d1b73853bcdf38
Summary:
As the name indicates, this updates unbundle_replay to run hooks. Hook failures
don't block the replay, but they're logged to Scuba.
Differential Revision: D20693851
fbshipit-source-id: 4357bb0d6869a658026dbc5421a694bc4b39816f
Summary:
Setting up a derived data tailer for this is a better approach (see D20668301
for context).
Reviewed By: StanislavGlebik
Differential Revision: D20693270
fbshipit-source-id: 7a06ffe059c41c4e100f8b0f8837978717293829
Summary:
Since we do those concurrently, it makes sense to do them on their own task.
Besides, since those are still old futures that need ownership, there is
effectively no tradeoff here.
Differential Revision: D20691373
fbshipit-source-id: 1a45e43ec857d91bed1614568b4354d56a2b0848
Summary: This will make it easier to compare performance.
Differential Revision: D20674164
fbshipit-source-id: eb1a037b0b060c373c1e87635f52dd228f728c89
Summary: This adds some Scuba reporting to unbundle_replay.
Differential Revision: D20674162
fbshipit-source-id: 59e12de90f5fca8a7c341478048e68a53ff0cdc1
Summary:
This updates unbundle_replay to do things concurrently where possible.
Concretely, this means we do ingest unbundles concurrently, and filenodes
derivation concurrently, and only do the actual pushrebase sequentially. This
lets us get ahead on work wherever we can, and makes the process faster.
Doing unbundles concurrently isn't actually guaranteed to succeed, since it's
*possible* that an unbundle coming in immediately after a pushrebase actually
depends on the commits created in said pushrebase. In this case, we simply retry
the unbundle when we're ready to proceed with the pushrebase (in the code, this
is the `Deferred` variant). This is fine from a performance perspective.
As part of this, I've also moved the loading of the bundle to processing, as
opposed to the hg recording client (the motivation for this is that we want to
do this loading in parallel as well).
This will also let us run hooks in parallel once I add this in.
Reviewed By: StanislavGlebik
Differential Revision: D20668301
fbshipit-source-id: fe2c62ca543f29254b4c5a3e138538e8a3647daa
Summary:
pushrebase_errmsg is NULL when we have conflicts, but we still shouldn't replay
the entry (because it'll fail, with conflicts). Let's exclude those.
Reviewed By: StanislavGlebik
Differential Revision: D20668304
fbshipit-source-id: a058bb466e0a8a53ec81e41db7ba138d6aedf3f9
Summary:
This updates unbundle_replay to support sleeping when watching for updates in a
bookmark and said bookmark isn't moving. This will be useful so it can run as a
service.
Reviewed By: StanislavGlebik
Differential Revision: D20645157
fbshipit-source-id: 6edeb66b65b2ef8b88c8db5e664982756acbfaf1
Summary:
I accidentally forgot to insert the entry, so that made this test a bit
useless. Let's make it not useless.
Reviewed By: StanislavGlebik
Differential Revision: D20645158
fbshipit-source-id: 0f0eb0cf9d16e8c346897088891aa3277b4d9c07
Summary:
This adds support for replaying the updates to a bookmark through unbundle
replay. The goal is to be able to run this as a process that keeps a bookmark
continuously updated.
There is still a bit of work here, since we don't yet allow the stream to pause
until a bookmark update becomes available (i.e. once caught up, it will exit).
I'll introduce this in another diff.
Note that this is only guaranteed to work if there is a single bookmark in the
repo. With more, it could fail if a commit is first introduced in a bookmark that
isn't the one being replayed here, and later gets introduced in said bookmark.
Reviewed By: StanislavGlebik
Differential Revision: D20645159
fbshipit-source-id: 0aa11195079fa6ac4553b0c1acc8aef610824747
Summary:
I'm going to update this to run in a loop, so to do that it would be nice to
represent the things to replay as a stream. This does that change, but for now
all our streams have just one element.
Reviewed By: StanislavGlebik
Differential Revision: D20645156
fbshipit-source-id: fce7536d0ccbc1911335704816b71c17e80f2116
Summary:
We normally derive those lazily when accepting pushrebase, but we do derive
them eagerly in blobimport. For now, let's be consistent with blobimport.
This ensures that we don't lazily generate them, which would require read traffic,
and gives a picture a little more consistent with what an actual push would look like.
Reviewed By: ikostia
Differential Revision: D20623966
fbshipit-source-id: 2209877e9f07126b7b40561abf3e6067f7a613e6
Summary:
This makes it easier to realize if you used the wrong entry ID when replaying
(instead of telling you the bookmark isn't at `None` as expected, it tells you
the Hg Changeset could not be mapped to a Bonsai).
Reviewed By: ikostia
Differential Revision: D20623847
fbshipit-source-id: aaa66e7825f12373742efd4f779ae20ff21f0b46
Summary:
This updates unbundle_replay to account for pushrebase hooks, notably to assign
globalrevs.
To do so, I've extracted the creation of pushrebase hooks in repo_client and
reused it in unbundle_replay. I also had to update unbundle_replay to no longer
use `args::get_repo` since that doesn't give us access to the config (which we
need to know what pushrebase hooks to enable).
Reviewed By: ikostia
Differential Revision: D20622723
fbshipit-source-id: c74068c920822ac9d25e86289a28eeb0568768fc
Summary:
This adds a unbundle_replay Rust binary. Conceptually, this is similar to the
old unbundle replay Python script we used to have, but there are a few
important differences:
- It runs fully in-process, as opposed to pushing to a Mononoke host.
- It will validate that the pushrebase being produced is consistent with what
is expected before moving the bookmark.
- It can find sources to replay from the bookmarks update log (which is
convenient for testing).
Basically, this is to writes and to the old unbundle replay mechanism what
Fastreplay is to reads and to the traffic replay script.
There is still a bit of work to do here, notably:
- Make it possible to run this in a loop to ingest updates iteratively.
- Run hooks.
- Log to Scuba!
- Add the necessary hooks (notably globalrevs)
- Set up pushrebase flags.
I would also like to see if we can disable the presence cache here, which would
let us also use this as a framework for benchmarking work on push performance,
if / when we need that.
Reviewed By: StanislavGlebik
Differential Revision: D20603306
fbshipit-source-id: 187c228832fc81bdd30f3288021bba12f5aca69c
Summary: I'd like to get the timestamps here without needing to clone them.
Reviewed By: StanislavGlebik
Differential Revision: D20603308
fbshipit-source-id: 2d8f72b4fb3a3eed33b58dc2f0fb1a857bb3f5b9
Summary:
This updates pushrebase hooks to allow into_transaction_hook to be async (the
reason I hadn't made it async is because it hadn't been needed yet).
Currently, this is a no-op, but I'm going to use this later in this stack.
Reviewed By: StanislavGlebik
Differential Revision: D20603307
fbshipit-source-id: 79651184dbe08322c4cab03d7119a31036391852
Summary:
A few of our tasks failed on startup, and most likely it was during warmup,
though we are not sure (see attached task).
Let's add more logging.
Reviewed By: farnz
Differential Revision: D20698273
fbshipit-source-id: 4facd21a94d2917103e417a014b820c893da4718
Summary:
The IdDag provides graph algorithms using Segments.
The IdMap allows converting from the SegmentedChangelogId domain to the
ChangesetId domain.
The Dag struct wraps IdDag and IdMap in order to provide graph algorithms using
the common application level identifiers for commits (ChangesetId).
The construction of the Dag is currently mocked with something that can only be
used in a test environment (unit tests but also integration tests).
This diff also implements a location_to_name function. This is the most
important new functionality that segmented changelog clients require. It
recovers the hash of a commit for which the client only has a segmented
changelog Id. The current assumption is that clients have identifiers for all
merge commit parents, so the path to a known commit always follows a chain
of first parents.
The IdMap queries will have to be changed to async in the future, but we expect
IdDag queries to stay sync.
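Conceptually, location_to_name answers "which commit is `distance` first-parent steps behind this known descendant". A simplified sketch (stand-in types; the real code walks IdDag segments rather than one edge at a time):
```rust
use anyhow::{anyhow, Result};

#[derive(Clone, Copy, Debug)]
struct ChangesetId(u64);

// Stand-in for an IdMap/IdDag-backed first-parent lookup.
fn first_parent(_cs_id: ChangesetId) -> Option<ChangesetId> {
    None
}

fn location_to_name(known_descendant: ChangesetId, distance: u64) -> Result<ChangesetId> {
    let mut current = known_descendant;
    for _ in 0..distance {
        current = first_parent(current)
            .ok_or_else(|| anyhow!("no first parent walking from {:?}", known_descendant))?;
    }
    Ok(current)
}
```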
Reviewed By: quark-zju
Differential Revision: D20635577
fbshipit-source-id: 4f9bd8dd4a5bd9b0de55f51086f3434ff507963c