sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-11 01:07:15 +03:00

Author	SHA1	Message	Date
Lukas Piatkowski	12c684afcd	mononoke/hooks: make deny_files public Reviewed By: aslpavel Differential Revision: D23537799 fbshipit-source-id: 58c9568e30982f682b00faae42bc3a3f3595890f	2020-09-04 12:23:35 -07:00
Thomas Orozco	3ba2c2b429	mononoke/hg_sync: make it work on Mercurial Python 3 Summary: A few things here: - The heads must be bytes. - The arguments to wireproto must be strings (we used to encode / decode them, but we shouldn't). - The bookmark must be a string (otherwise it gets serialized as `"b\"foo\""` and then it deserializes to that instead of `foo`). Reviewed By: StanislavGlebik Differential Revision: D23499846 fbshipit-source-id: c8a657f24c161080c2d829eb214d17bc1c3d13ef	2020-09-04 11:56:44 -07:00
Thomas Orozco	747b355236	mononoke: make mononoke_hg_sync_job sendunbundlereplaybatch more debuggable Summary: Right now we get very little logging out of errors in here, which is making it difficult to fix it on Py3 (where it currently is broken). This diff doesn't fix anything, but at the very least, let's make the errors better so we can make this easier to start debugging. Reviewed By: ahornby Differential Revision: D23499369 fbshipit-source-id: 7ee60b3f2a3be13f73b1f72dee062ca80cb8d8d9	2020-09-04 11:56:44 -07:00
Thomas Orozco	c8dd8ae4e3	mononoke: run tests using hg Python 3 as well Summary: The motivation for this is to surface potential regressions in hg Python 3 by testing code paths that are exercised in Mononoke. The primary driver for this were the regressions in the LFS extension that broke uploads, and for which we have test coverage here in Mononoke. To do this, I extracted the manifest generation (the manifest is the list of binaries that the tests know about, which is passed to the hg test runner), and moved it into its own function, then added a new target for the py3 tests. Unfortunately, a number of tests are broken in Python 3 currently. We should fix those. It looks like there are some errors in Mercurial when walking a manifest with non-UTF-8 files, and the other problem is that the hg sync job is in fact broken: https://fburl.com/testinfra/545af3p8. Reviewed By: ahornby Differential Revision: D23499370 fbshipit-source-id: 762764147f3b57b2493d017fb7e9d562a58d67ba	2020-09-04 11:56:44 -07:00
Stanislau Hlebik	7b323a4fd9	mononoke: add log-only mode in redaction Summary: Before redacting something it would be good to check that this file is not accessed by anything. Having log-only mode would help with that. Reviewed By: ikostia Differential Revision: D23503666 fbshipit-source-id: ae492d4e0e6f2da792d36ee42a73f591e632dfa4	2020-09-04 07:37:15 -07:00
Stanislau Hlebik	0740f99f13	mononoke: allow logging censored scuba accesses to file Summary: In the next diff I'm going to add log-only mode to redaction, and it would be good to have a way of testing it (i.e. testing that it actually logs accesses to bad keys). In this diff let's use a config option that allows logging censored scuba accesses to file, and let's update redaction integration test to use it Reviewed By: ikostia Differential Revision: D23537797 fbshipit-source-id: 69af2f05b86bdc0ff6145979f211ddd4f43142d2	2020-09-04 07:37:14 -07:00
Thomas Orozco	f1e4f62e2d	mononoke/fsnodes: expose FsnodeFile as the LeafId Summary: Fsnodes have a lot of data about files, but right now we can't access it through a Fsnode lookup or a manifest walk, because the LeafId for a Fsnode is just the content id and the file type. This is a bit sad, because it means we e.g. cannot dump a manifest with file sizes (D23471561 (`179e4eb80e`)). Just changing the LeafId is easy, but that brings a new problem with Fsnode derivation. Indeed, deriving manifests normally expects us to have the "derive leaf" function produce a LeafId (so we'd want to produce a `FsnodeFile`), but in Fsnodes, this currently happens in deriving trees instead. Unfortunately, we cannot easily just move the code that produces `FsnodeFile` from the tree derivation to the leaf derivation, that is, do: ``` fn check_fsnode_leaf( leaf_info: LeafInfo<FsnodeFile, (ContentId, FileType)>, ) -> impl Future<Item = (Option<FsnodeSummary>, FsnodeFile), Error = Error> ``` Indeed, the performance of Fsnode derivation relies on all the leaves for a given tree being derived together with the tree and its parents in context. So, we'd need the ability for deriving a new leaf to return something different from the actual leaf id. This means we want to return a `(ContentId, FileType)`, even though our `LeafId` is a `FsnodeFile`. To do this, this diff introduces a new `IntermediateLeafId` type in the derivation. This represents the type of the leaf that is passed from deriving a leaf to deriving a tree. We need to be able to turn a real `LeafId` into it, because sometimes we don't re-derive leaves. I think we could also refactor some of the code that passes a context here to just do this through the `IntermediateLeafId`, but I didn't look into this too much. So, this diff does that, and uses it in Mononoke Admin so we can print file sizes. Reviewed By: StanislavGlebik Differential Revision: D23497754 fbshipit-source-id: 2fc480be0b1e4d3d261da1d4d3dcd9c7b8501b9b	2020-09-04 06:30:18 -07:00
Mateusz Kwapich	f7be2eef14	tunable scuba sampling Summary: This allows us to sample the most popular method logs (`repo_list_hg_manifest` calls make up 90% of the samples in our scuba table) while still having full logging for other queries and errors. The sampling can be easily disabled via a tunable. In case we get a lot of errors we can also start sampling the error requests with a simple configerator change. Reviewed By: krallin Differential Revision: D23507333 fbshipit-source-id: c7e34467d99410ec3de08cce2db275a55394effd	2020-09-04 06:26:35 -07:00
Viet Hung Nguyen	437a0e905b	mononoke/repo_import: add deriving data types for multiple repos Summary: Previously, we only supported deriving data types for the repo we import into. This diff expands on this and now we can do that for multiple repos (e.g. small repos we backsync commits to from large repo we import to). Reviewed By: StanislavGlebik Differential Revision: D23499953 fbshipit-source-id: 223209a6a2739eae93082cae4f04e53e0cba0c58	2020-09-04 05:39:21 -07:00
Stanislau Hlebik	11a45b6b60	mononoke: do not pass tasks to find_files_with_given_content_id_blobstore_keys Summary: In the next diff I'm going to add log_only mode for redaction. In this diff I make a small refactoring that makes the next diff simpler: find_files_with_given_content_id_blobstore_keys doesn't accept tasks anymore, just content keys. Reviewed By: aslpavel Differential Revision: D23535829 fbshipit-source-id: 1dac37f5ea7038fc779ad51192a290fcc23e6556	2020-09-04 05:22:03 -07:00
Lukas Piatkowski	67a71d1f98	mononoke/hooks: make limit_commitsize and limit_filesize public Reviewed By: aslpavel Differential Revision: D23502908 fbshipit-source-id: 8b9070cfaa28af7b808d02548c0fb7c5d344550d	2020-09-04 04:23:05 -07:00
Lukas Piatkowski	462cb96cc2	mononoke/hooks: make no_questionable_filenames public Reviewed By: aslpavel Differential Revision: D23478259 fbshipit-source-id: 642948c2685690298a71fbe7177c4bd6a6e43f85	2020-09-04 04:23:05 -07:00
Lukas Piatkowski	eebdc0b896	mononoke/metaconfig: sync thrift changes from configerator for HookConfig Summary: Use the new fields from RawHookConfig in HookConfig Reviewed By: StanislavGlebik Differential Revision: D23499766 fbshipit-source-id: 43e9d2dfdcfb0fa0dd4de6310ea0013db1b69474	2020-09-04 02:02:06 -07:00
Stefan Filip	3f0b08e46f	segmented_changelog: add version field to IdMap Summary: The version is going to be used to seamlessly upgrade the IdMap. We can generate the IdMap in a variety of ways. Naturally, algorithms for generating the IdMap may change, so we want a mechanism for updating the shared IdMap. A generated IdDag is going to require a specific IdMap version. To be more precise, the IdDag is going to specify which version of IdMap it has to be interpreted with. Reviewed By: quark-zju Differential Revision: D23501158 fbshipit-source-id: 370e6d9f87c433645d2a6b3336b139bea456c1a0	2020-09-03 16:33:20 -07:00
Stefan Filip	58a4821fe3	segmented_changelog: add IdMap trait with SqlIdMap implementation Summary: Separate the operational bits of the IdMap from the core SegmentedChangelog requirements. I debated whether it makes sense to add repo_id to SqlIdMap. Given the current architecture I don't see a reason not to do it. On the contrary, separating the two objects felt convoluted. Reviewed By: quark-zju Differential Revision: D23501160 fbshipit-source-id: dab076ab65286d625d2b33476569da99c7b733d9	2020-09-03 16:33:20 -07:00
Stefan Filip	f3c353edbc	segmented_changelog: change idmap module from file to directory Summary: Planning to add a trait for core idmap functionality (that's just translating cs_id to vertex and back). The current IdMap will then be an implementation of that trait. Reviewed By: quark-zju Differential Revision: D23501159 fbshipit-source-id: 34e3b26744e4b5465cd108cca362c38070317920	2020-09-03 16:33:20 -07:00
Stanislau Hlebik	4947e07cb7	mononoke: asyncify one function in redaction admin subcommand Summary: I'm going to change this function soon, so it's nice to asyncify it to make next diffs simpler and also remove duplicated logic. Also remove unnecessary `logger` parameter - we can always get logger from CoreContext Reviewed By: krallin Differential Revision: D23501634 fbshipit-source-id: 7ad2fc17167e4107481ceb230e0b7cb3e7f2549a	2020-09-03 12:22:24 -07:00
Mateusz Kwapich	20d096f5d5	add thrift metadata support Summary: This closely replicates EscapeZero work in D23328638 and will allow us to issue requests to SCS using Thrift Fiddle (https://www.internalfb.com/thrift_fiddle). Reviewed By: EscapeZero Differential Revision: D23475864 fbshipit-source-id: fb286e3fcd6ea79704fa2e7e1ed9ab5595ff7b81	2020-09-03 12:18:18 -07:00
Arun Kulshreshtha	858a080502	gotham_ext: make StreamBody automatically delay post-request callbacks Summary: Now that post-request callbacks are available in `gotham_ext`, we can make `StreamBody` use them directly instead of using an LFS-specific wrapper (previously required to access the LFS server's `RequestContext`). This also means that the EdenAPI server will get this behavior for free. Reviewed By: krallin Differential Revision: D23402969 fbshipit-source-id: 56ab710473f13e8983b136664af364af6884bd3f	2020-09-03 11:59:32 -07:00
Arun Kulshreshtha	5556a447d1	edenapi_server: use LogMiddleware Summary: Add `LogMiddleware` to the EdenAPI server, which will print a log message whenever a request is received or has completed. Reviewed By: DurhamG Differential Revision: D23299902 fbshipit-source-id: f44ef1b01692f0e4f9b109917fcee89a84ca4208	2020-09-03 11:59:32 -07:00
Arun Kulshreshtha	96a6a3fcfb	edenapi_server: use LoadMiddleware Summary: Use `LoadMiddleware` to track the number of outstanding requests in the server. Reviewed By: DurhamG Differential Revision: D23298415 fbshipit-source-id: bdcdb0f657d8deac593d356c87ac0d8d3f39e322	2020-09-03 11:59:32 -07:00
Arun Kulshreshtha	7144363d2c	gotham_ext: move LogMiddleware to gotham_ext Summary: Now that `LogMiddleware` no longer depends on `RequestContext`, it can be moved into `gotham_ext`. Reviewed By: DurhamG Differential Revision: D23298412 fbshipit-source-id: d5288decba98c3dd4605b9a44e41eba0f47fee37	2020-09-03 11:59:31 -07:00
Arun Kulshreshtha	35d292e513	gotham_ext: move LoadMiddleware to gotham_ext Summary: Now that `LoadMiddleware` no longer depends on `RequestContext`, it can be moved into `gotham_ext`. Reviewed By: DurhamG Differential Revision: D23298416 fbshipit-source-id: 5d29da492e39beb5621daf0570d9b3e657cbfc04	2020-09-03 11:59:31 -07:00
Arun Kulshreshtha	82c451fb9f	lfs_server: use PostRequestMiddleware Summary: This diff removes the post-request callback functionality from the LFS server's `RequestContext` and replaces it with the new `PostRequestMiddleware`. The middleware is directly based on `RequestContext`, so the underlying behavior is essentially the same as before. Reviewed By: krallin Differential Revision: D23298413 fbshipit-source-id: 1e58a40f6ce6d526456dbd9ae3a8efc85768bf04	2020-09-03 11:59:31 -07:00
Arun Kulshreshtha	3ad7fa8b6f	gotham_ext: allow applications to dynamically configure PostRequestMiddleware Summary: Make `PostRequestMiddleware` generic over a user-provided config struct which can be used to dynamically configure the behavior of post-request callback dispatching. Right now this is only used to support disabling hostname logging, but could be easily extended to cover more uses in the future. Reviewed By: krallin Differential Revision: D23495005 fbshipit-source-id: 3d59a8346f449775ec76d03c260d973d04fb90a9	2020-09-03 11:59:31 -07:00
Arun Kulshreshtha	cc0f2e4c40	gotham_ext: add PostRequestMiddleware Summary: Add new middleware that allows HTTP handlers and other middleware to register callbacks that will be run once the current request completes. This is heavily based on the post-request callback functionality from the LFS server's `RequestContext`. The intention here is to expose this functionality in a manner that's independent of other, application-specific logic. Reviewed By: krallin Differential Revision: D23298419 fbshipit-source-id: e4b1534b02c35f685ce544de13e331947e187818	2020-09-03 11:59:31 -07:00
Thomas Orozco	d77cf89ead	mononoke/admin: clean up unodes subcommand a bit Summary: I pattern matched off of this for the previous diff in this stack, and spotted a bit of clean up that might make sense here: - Using `.help()` for a subcommand overrides the whole help text. We meant to use `.about()` here. I fixed this in some copy-pasted code as well. - Printing debug output alongside real output makes it harder to select the real output. I fixed this by logging debug output to stderr instead. Reviewed By: StanislavGlebik Differential Revision: D23471560 fbshipit-source-id: 7900cfe65613c48abd77faad6d6a45a7aa523b36	2020-09-03 09:32:06 -07:00
Thomas Orozco	179e4eb80e	mononoke/admin: add a subcommand for dumping paths Summary: This adds a subcommand for dumping all the paths in a repository. This is helpful when you have a Content ID, limited imagination and time on your hands, and you'd like to turn those into a file path where that Content ID lives. This uses fsnodes for the traversal because that's O(# directories) as opposed to O(# files). I had an earlier implementation that used unodes, but that was really slow. Reviewed By: StanislavGlebik Differential Revision: D23471561 fbshipit-source-id: 948bfd20939adf4de0fb1e4b2852ad4d12182f16	2020-09-03 09:32:06 -07:00
Viet Hung Nguyen	7c34b39ec8	mononoke/repo_import: add backsyncing to rewrite file paths, remove backup file Summary: add backsyncing to rewrite file paths: After setting the variables for large repo (D23294833 (`d6895d837d`)), we try to import the git commits into large repo and rewrite the file paths. Following this, repo import tool should back-sync the commits into small_repo. next step: derive all the data types for both small and large repos. Currently, we only derive it for the large repo. ============== remove backup file: The backup file was a last-minute addition when trying to import a repo for the first time. Removed it, because we shouldn't write to external files. Future plan is to include better process recoverability across the whole tool and not just rewrite file paths functionality. Reviewed By: StanislavGlebik Differential Revision: D23452571 fbshipit-source-id: bda39694fa34788218be795319dbbfd014ba85ff	2020-09-03 06:43:08 -07:00
Stanislau Hlebik	a77d9f243a	mononoke: parallelize operations in create_commit scs method Reviewed By: krallin Differential Revision: D23496535 fbshipit-source-id: 18f88abb9b85d38a93d2aa99c38edcf8190343c3	2020-09-03 04:12:35 -07:00
Lukas Piatkowski	a4af730541	mononoke/hooks: make no_bad_filenames public Reviewed By: aslpavel Differential Revision: D23474524 fbshipit-source-id: 5f7826346500b1acc7450791dd1e7806c4e623d6	2020-09-03 02:40:43 -07:00
Lukas Piatkowski	81d9338100	mononoke/hooks: make few generic hooks public Summary: More hooks will come in next diffs. Reviewed By: aslpavel Differential Revision: D23449755 fbshipit-source-id: 451fdb7a759140f2d6df8f3a18493c700fa2b761	2020-09-03 02:40:43 -07:00
Stanislau Hlebik	29bbc0dc15	mononoke: check if content we are about to redact is not reachable Summary: That's one of the sev followups. Before redacting a file content let's check if it exists in "main-bookmark" (which is by default master), and refuse to redact if it actually exists. If this check passes (i.e. the content we are about to redact is not reachable from master) that doesn't mean that we are 100% safe. E.g. this content can be in an ancestor of master, or in any other repo, or it can be added in the next commit. This check is a best-effort check to prevent shooting ourselves in the foot. Reviewed By: aslpavel Differential Revision: D23476278 fbshipit-source-id: 5a4cd10964a65b8503ba9a6391f17319f0ce37d8	2020-09-03 01:30:14 -07:00
Stefan Filip	da4c33c67a	tests: add commit-location-to-hash integration test Summary: Exercise location-to-hash functionality in edenapi. Reviewed By: kulshrax Differential Revision: D23456214 fbshipit-source-id: 2ab22eb045517a5927c2de502d8cfc9898daecef	2020-09-02 17:20:43 -07:00
Stefan Filip	932450fb15	handlers: update location-to-hash endpoint with count parameter Summary: To reduce the size over the wire on cases where we would be traversing the changelog on the client, we want to allow the endpoint to return a whole parent chain with their hashes. Reviewed By: kulshrax Differential Revision: D23456216 fbshipit-source-id: d048462fa8415d0466dd8e814144347df7a3452a	2020-09-02 17:20:42 -07:00
Stefan Filip	7122cdded7	types: rename Location to CommitLocation Summary: Renaming all the LocationToHash related structures to CommitLocationToHash. This is done for consistency. I realized the issue when the command for reading the request from cbor was not what I was expecting it to be. The reason was that the commit prefix was used inconsistently for LocationToHash. Reviewed By: kulshrax Differential Revision: D23456221 fbshipit-source-id: 0181dcaf81368b978902d8ca79c5405838e4b184	2020-09-02 17:20:42 -07:00
Stefan Filip	310b3616a6	blobrepo: instantiate segmented changelog as an attribute Summary: Segmented Changelog is a component that has multiple components of its own, each of which can be configured in different ways. It seems that it already is more complicated than other components in how it is set up and it will probably evolve to have more knobs (caching comes to mind). Right now we have 3 ways of instantiating SegmentedChangelog: - Disabled, all requests return errors - ReadOnly, requests to unprocessed commits return errors - OnDemandUpdate, requests trigger commit processing when required Reviewed By: aslpavel Differential Revision: D23456217 fbshipit-source-id: a6016f05197abbc3722764fa8e9056190a767b36	2020-09-02 17:20:42 -07:00
Stefan Filip	b818a86631	config: add segmented changelog config parsing Summary: Parsing is done in the SegmentedChangelogConfig structure which will inform how to construct the SegmentedChangelog in Mononoke. Reviewed By: aslpavel Differential Revision: D23456222 fbshipit-source-id: a7d5d81f4c166909164026e81af57f1c2ea32347	2020-09-02 17:20:42 -07:00
Stefan Filip	e57b1f9265	segmented_changelog: add on-demand updating dag implementation Summary: The Segmented Changelog must be built somewhere. One of the simplest deployments involves the on-demand update of the graph. When a commit that wasn't yet processed is encountered, we send it for processing along with all of its ancestors. At this time not much attention was paid to the distinction of master commit versus non-master commit. For now the expectation is that only commits from master will exercise this code path. The current expectation is that clients will only call location-to-hash using commits from master. Let me know if there is an easy way to check if a commit is part of master. Later changes will invest more in handling non-master commits. Reviewed By: aslpavel Differential Revision: D23456218 fbshipit-source-id: 28c70f589cdd13d08b83928c1968372b758c81ad	2020-09-02 17:20:42 -07:00
Stefan Filip	d50e09a41d	segmented_changelog: add SegmentedChangelogBuilder Summary: This builder implements SqlConstruct and SqlConstructFromMetadataDatabaseConfig to make handling the Sql connection for IdMap consistent with what happens in Mononoke in general. Reviewed By: aslpavel Differential Revision: D23456219 fbshipit-source-id: 6998afbbfaf1e0690a40be6e706aca1a3b47829f	2020-09-02 17:20:42 -07:00
Stefan Filip	66706d77c5	segmented_changelog: add SegmentedChangelog trait Summary: The trait provides two methods for location to hash translation. The first returns a single hash and is existing functionality. The second returns a list of hashes and represents new functionality. This diff also adds this functionality to the Dag structure which is currently the only real implementation for SegmentedChangelog. Reviewed By: aslpavel Differential Revision: D23456215 fbshipit-source-id: 0c2ca91672cf23129342c585f98446c0ebbdf7ef	2020-09-02 17:20:41 -07:00
Stefan Filip	10b233f180	blobrepo: move ChangesetFetcher to attributes Summary: I am planning to add Segmented Changelog to attributes. I am writing an integration test for an EdenApi endpoint that depends on Segmented Changelog and I would like to set it up to update on demand. When a request comes in for a commit that we haven't parsed for Segmented Changelog we want to update the structure on demand. This means that we probably need to fetch commits. This means that we want to pass the ChangesetFetcher to Segmented Changelog when it is built. Since Segmented Changelog fits well as an attribute we want the ChangesetFetcher as an attribute. I wonder how much thought has been given to attributes behaving as a dependency injector in the `guice` sense. Reviewed By: aslpavel Differential Revision: D23428201 fbshipit-source-id: 7003c018ba806fd657dd8f071e0e83d35058b10f	2020-09-02 17:20:41 -07:00
Kostia Balytskyi	6e8cbd31b1	megarepotool: add gradual-merge-progress subcommand Summary: This is to be able to automatically report progress: how many merges have been done already. Note: this intentionally uses the same logic as regular `gradual-merge`, so that we always report correct numbers. Reviewed By: StanislavGlebik Differential Revision: D23478448 fbshipit-source-id: 3deb081ab99ad34dbdac1057682096b8faebca41	2020-09-02 12:18:31 -07:00
Thomas Orozco	b8e197fdb4	mononoke/lfs_server: allow enabling rate limits probabilistically Summary: If we exceed a rate limit, we probably don't want to just drop 100% of traffic. This would create a sawtooth pattern where we allow a bunch of traffic, update our counters, drop a bunch of traffic, update our counters again, allow a bunch of traffic, etc. To fix this, let's make limits probabilistic. This lets us say "beyond X GB/s, drop Y% of traffic", which is closer to a sane rate limit. It might also make sense to eventually change this to use ratelim. Initially, we didn't do this because we needed our rate limiting decisions to be local to a single host (because different hosts served different traffic), but now that we spread the load for popular blobs across the whole tier, we should be able to just delegate to ratelim. For now, however, let's finish this bit of a functionality so we can turn it on. The corresponding Configerator change is here: D23472683 Reviewed By: aslpavel Differential Revision: D23472945 fbshipit-source-id: f7d985fded3cdbbcea3bc8cef405224ff5426a25	2020-09-02 11:02:18 -07:00
Stanislau Hlebik	cdf96a20dd	mononoke: asyncify redaction_add Summary: Will change it in the next diff, so let's asyncify it now. Reviewed By: aslpavel Differential Revision: D23475332 fbshipit-source-id: f25fb7dc16f99cb140df9374f435e071401c2b90	2020-09-02 09:28:48 -07:00
Alex Hornby	b22599c500	mononoke: memo the hash values of interned paths in the walker Summary: Memo the hash values of interned paths in the walker. The interner calls the hash function inside a lock that gets heavily contended, so this reduces the time the lock is held. Reviewed By: farnz Differential Revision: D23075260 fbshipit-source-id: 3ee50e3ce56106eadd17dc7d737ba95282640051	2020-09-02 05:52:33 -07:00
Alex Hornby	46cc110012	mononoke: switch walker from arc-intern to internment Summary: Switch the walker from arc-intern::ArcIntern to internment::ArcIntern as internment does not need to acquire its map's locks on every drop. Reviewed By: farnz Differential Revision: D23075265 fbshipit-source-id: 6dd241aed850ec0fd3c8a4e68dda06053ec0b424	2020-09-02 05:52:33 -07:00
Kostia Balytskyi	d49406d847	repo_client: get rid of unneeded perf counters Summary: These two perf counters proved to be not very convenient to evaluate the volume of undesired file fetches. Let's get rid of them. Specifically, they are not convenient, because they accumulate values and it's hard to aggregate over them. Note that I don't do the same for tree fetches, as there's no better way of estimating those now. Reviewed By: mitrandir77 Differential Revision: D23452913 fbshipit-source-id: 08f8dd25eece495f986dc912a302ab3109662478	2020-09-02 05:02:46 -07:00
Kostia Balytskyi	e7ddc6cc13	undesired fetches: regex-based reporting Summary: We want to be able to report more than just on one prefix. Instead, let's add a regex-based reporting. To make deployment easier, let's keep both options for now and later just remove prefix-based one. Note: this diff also changes how a situation with absent `undesired_path_prefix_to_log` is treated. Previously, if `undesired_path_prefix_to_log` is absent, but `"undesired_path_repo_name_to_log": "fbsource"`, it would report every path. Now it won't report any, which I think is a saner behavior. If we do ever want to report every path, we can just add `.*` as a regex. Reviewed By: StanislavGlebik Differential Revision: D23447800 fbshipit-source-id: 059109b44256f5703843625b7ab725a243a13056	2020-09-01 12:01:00 -07:00
Viet Hung Nguyen	2c1d4a49ad	mononoke/repo_import: change logic of file paths rewriting with multiple movers Summary: This diff modifies how we rewrite file paths when we import into a repo by allowing the tool to apply multiple movers. Motivation: When we try to import into a small repo that pushredirects to a large repo, we have decided to import into the large repo first, then backsync to the small repo. To do that, we have to set a couple of flags related to importing into the large repo (see: D23294833 (`d6895d837d`)): bookmarks and import destination path. Previously, we fixed the destination path in large repo by applying the small_to_large repo syncer's mover on the destination path in small repo. e.g: if small_to_large repo syncer mover = { default_action = prepend(large_dir) map = [...]}, then destination_path in small repo becomes large_dir/destination_path in large repo. After this, we prepended the imported files with the new prefix with another mover: prepend(large_dir/dest_path) a -> large_dir/dest_path/a Consequently, all directories and files under destination_path would get imported under large_dir/destination_path in large repo with this logic. e.g. However, it's possible that with push-redirections, some directories would get remapped to a different place in large repo. e.g small_to_large syncer mover = { default_action = prepend(large_dir) map = [ dest_path/b -> random_dir/b ]}, but with the current repo_import implementation dest_path/b would get prepended to large_dir/dest_path/b. To avoid this, we apply multiple movers on the imported files. e.g. 1. we prepend all files with dest_path: mover = { default_action: prepend(dest_path) map={}} => a -> dest_path/a b -> dest_path/b 2. we remap the files using the small_to_large repo syncer mover: mover = { default_action: prepend(large_dir) map = {dest_path/b -> random_dir/b}} => dest_path/a -> large_dir/dest_path/a dest_path/b -> random_dir/b Reviewed By: StanislavGlebik Differential Revision: D23371244 fbshipit-source-id: 0bf4193b24d73c79ed00dfb38e2b0538388d1c0f	2020-09-01 09:26:07 -07:00
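
The two-step path rewriting described in commit 2c1d4a49ad can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Mover` struct, its fields, and `apply_movers` are hypothetical names and do not reflect Mononoke's actual mover types or the repo_import code. The sketch only shows the idea of chaining movers, where each mover first checks its explicit map and otherwise falls back to its default prepend action.

```rust
/// Hypothetical stand-in for a path mover (not Mononoke's real type).
struct Mover {
    /// Prefix prepended when no map entry matches (the "default_action").
    default_prefix: String,
    /// (source prefix, replacement prefix) overrides, checked before the default.
    map: Vec<(String, String)>,
}

impl Mover {
    /// Rewrite a single path with this mover.
    fn apply(&self, path: &str) -> String {
        for (from, to) in &self.map {
            // Exact match on the mapped path.
            if path == from.as_str() {
                return to.clone();
            }
            // Match on a directory prefix, keeping the remainder of the path.
            let dir_prefix = format!("{}/", from);
            if let Some(rest) = path.strip_prefix(dir_prefix.as_str()) {
                return format!("{}/{}", to, rest);
            }
        }
        // No override matched: fall back to the default prepend action.
        format!("{}/{}", self.default_prefix, path)
    }
}

/// Apply the movers in order, feeding each mover's output into the next one.
fn apply_movers(movers: &[Mover], path: &str) -> String {
    movers.iter().fold(path.to_string(), |p, m| m.apply(&p))
}

fn main() {
    // Step 1: prepend dest_path to every imported file.
    let prepend_dest = Mover {
        default_prefix: "dest_path".to_string(),
        map: vec![],
    };
    // Step 2: remap with the small_to_large syncer mover from the commit message.
    let small_to_large = Mover {
        default_prefix: "large_dir".to_string(),
        map: vec![("dest_path/b".to_string(), "random_dir/b".to_string())],
    };
    let movers = [prepend_dest, small_to_large];
    for file in ["a", "b"] {
        println!("{} -> {}", file, apply_movers(&movers, file));
    }
}
```

With the example movers from the commit message, `a` ends up at `large_dir/dest_path/a` and `b` at `random_dir/b`, which matches the mapping listed in step 2 of the summary.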


1335 Commits