Summary:
We'd like to remove `{rev}` from templates.
This makes it easier to copy, edit, review, and switch the built templates.
Reviewed By: DurhamG
Differential Revision: D22468670
fbshipit-source-id: 59580c413f97dfbee898b97ee64d6812f82d2014
Summary:
This extends the metadata importer to work on corp.
Prefetching metadata for the entries in a tree when we fetch it saves us
an extra round trip to the server to fetch a blob when only the metadata
for that blob is fetched. This is particularly important on corp where the
latencies to the server are larger.
Requesting metadata without the blob itself happens often during the parsing
phase of buck commands, so this metadata prefetch should improve
build times and other workflows that rely heavily on buck queries.
Reviewed By: chadaustin
Differential Revision: D21698728
fbshipit-source-id: 4072be23f2fa7df33cf46879e8c1d8ddd6c316ba
Summary:
Buck uses the content SHA-1 to identify each of the source files for a target.
During the parsing phase it needs these SHAs, though the content of the
files is not yet needed, and may never be needed if the file has already
been built and is in the buck cache.
Currently, if we do not already have metadata cached for a file when
requested, we fetch the contents of the file and compute the hash.
We want to avoid this.
Eventually this data will be available from the Mononoke EdenAPI server,
but for now we want a temporary solution to unblock the Buck team, and
ship benefits early.
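What the cache buys us can be sketched in Python (the helper names and the dict-backed cache are illustrative, not EdenFS's actual API): on a metadata hit we return the hash without ever touching the blob; on a miss we pay for a full content fetch just to hash it.

```python
import hashlib

# Hypothetical in-memory metadata cache keyed by file id.
_metadata_cache = {}

def get_sha1(file_id, fetch_contents):
    """Return the content SHA-1 for file_id, fetching the blob only on a miss."""
    if file_id in _metadata_cache:
        return _metadata_cache[file_id]
    # Cache miss: today we must fetch the whole blob just to hash it.
    contents = fetch_contents(file_id)
    sha1 = hashlib.sha1(contents).hexdigest()
    _metadata_cache[file_id] = sha1
    return sha1
```

The miss path is exactly the cost this stack avoids by fetching the hashes from the server ahead of time.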
Reviewed By: chadaustin
Differential Revision: D21820913
fbshipit-source-id: 56a7e32519f0fb04881518306d94aaed33527fd9
Summary:
Prefetching metadata for the entries in a tree when we fetch it saves us
an extra round trip to the server to fetch a blob when only the metadata
for that blob is fetched. (This can happen often while parsing targets in
builds)
We want to prefetch the metadata for each of the entries in a tree when
we fetch the tree, and store the metadata for each entry under that
entry's id (to make looking up entry metadata by id quick).
However, we also don't want to unnecessarily fetch data from
the server if we have already done so.
To accomplish this, we also store the metadata for each entry under the tree
id in the local store. This will: 1) allow us to check whether we have already
fetched the metadata from the server when we are fetching a tree (only the
tree id is easily available at that point, so storing the metadata under the tree
id makes this check much easier and cheaper); 2) allow us to refill the
metadata stored under each entry's blob id if it has been cleared
from the local store (this may happen if the local store grows large and gets
partially cleaned to reclaim space).
This implements the method to store tree metadata for all entries under the tree
id and under the blob id for each entry.
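The dual-index scheme can be sketched with a dict standing in for the local store (names are illustrative; the real implementation writes RocksDB rows):

```python
# Minimal sketch of the dual-indexing scheme (dict in place of the local
# store; names are illustrative, not EdenFS's actual API).
local_store = {}

def put_tree_metadata(tree_id, entry_metadata):
    """Store metadata under the tree id and under each entry's blob id."""
    # Under the tree id: lets a tree fetch cheaply check "already prefetched?".
    local_store[("tree_meta", tree_id)] = entry_metadata
    # Under each blob id: makes per-blob metadata lookups fast, and lets us
    # refill these rows from the tree row if they get evicted.
    for blob_id, meta in entry_metadata.items():
        local_store[("blob_meta", blob_id)] = meta

def have_tree_metadata(tree_id):
    return ("tree_meta", tree_id) in local_store
```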
Reviewed By: chadaustin
Differential Revision: D22239173
fbshipit-source-id: d4e0ffd642ce0b4034188cfc4eeaf2ea05f54e77
Summary:
Prefetching metadata for the entries in a tree when we fetch it saves us
an extra round trip to the server to fetch a blob when only the metadata
for that blob is fetched. (This can happen often while parsing targets in
builds)
This implements a custom metadata fetcher to fetch this data when we
fetch a tree from the server.
Reviewed By: chadaustin
Differential Revision: D22086639
fbshipit-source-id: 5fe31d375bf6f7376eb67496d553d6b4540fc0c9
Summary:
This will allow adding custom MetadataImporters in different eden builds.
DefaultMetadataImporter provides a no-op version of the interface to be
used by default.
Reviewed By: chadaustin
Differential Revision: D21960834
fbshipit-source-id: aec8a3627ab1223f74466b92a0ebe3290b67b7ed
Summary:
Previously the BackingStore kept a raw pointer to the LocalStore; we relied on EdenServer to ensure the lifetime of the LocalStore exceeded that of the BackingStore.
This makes the LocalStore pointer a shared pointer, explicitly ensuring that the LocalStore's lifetime matches the BackingStore's lifetime.
Reviewed By: chadaustin
Differential Revision: D22394597
fbshipit-source-id: c81cb26c6fc8f834bc46d8576ced06ba6a96ac2c
Summary:
This introduces a class to manipulate the metadata for all the entries in a
tree. This adds serialization and deserialization to this class so that it can
be written to the local store.
Why do we need this? We need some way to easily check when we have already
fetched metadata for a tree and do not need to refetch this from the server to
avoid expensive network requests. Later diffs add functionality to store the metadata
for tree entries in the local store under the tree hash using this class.
Reviewed By: chadaustin
Differential Revision: D21959015
fbshipit-source-id: 0c0e8750737f3076c1f9604d0319cab7f2658656
Summary:
In following diffs we will use scs to prefetch metadata for files, so that this data
will be available without fetching the file content (which will improve build times
on eden).
This builds up the proxy hash index that serves as a conversion between eden-specific
identifiers and (commit, path) pairs, which we will use to index into scs.
Reviewed By: chadaustin
Differential Revision: D21820909
fbshipit-source-id: 17891f6772f49c7c183061d7a4df2fe0a3be9d25
Summary:
In following diffs we will use scs to prefetch metadata for files, so that this data
will be available without fetching the file content (which will improve build times
on eden).
SCS indexes trees by an scs-specific hash (a blake2 content hash) or by commit
hash and path. Since this differs from the eden and mercurial hashes, we need
another index to go between the ids we currently have in eden and the
identifiers scs expects.
This introduces a proxy hash that serves as this conversion. Because we already
have commit hashes around in eden, this is currently the easier route to
indexing into scs.
Reviewed By: chadaustin
Differential Revision: D21237648
fbshipit-source-id: 79115ac034a5f062ae879713cd2c1a17f348c725
Summary:
Fixes include:
1. Passing the "GETDEPS_BUILD_DIR" and "GETDEPS_INSTALL_DIR" env variables and using them in eden/scm/Makefile rather than assuming the source code is always in the same place regardless of getdeps arguments (it isn't).
2. Added "fbthrift-source" and "fb303-source" to avoid unnecessary compilation (at least of fb303) and to put fbthrift and fb303 source code in an easy to locate place inside getdeps' "installed" folder.
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/25
Test Plan: sandcastle, check oss-eden_scm-darwin-getdeps
Reviewed By: farnz
Differential Revision: D22431872
Pulled By: lukaspiatkowski
fbshipit-source-id: 8ccbb090713ec085a5dd56df509eb58ab6fb9e34
Summary: Stage 1 of a migration - the next step is to make all users of this trait use new futures; then I can come back, add lifetimes and references, and leave it modernised.
Reviewed By: StanislavGlebik
Differential Revision: D22460164
fbshipit-source-id: 94591183912c0b006b7bcd7388a3d7c296e60577
Summary: Use `pub(crate)` visibility to share the `SelectBookmark` query between modules.
Reviewed By: StanislavGlebik
Differential Revision: D22464059
fbshipit-source-id: 269561f5ab936b730ce2052e50173134ce241ff8
Summary:
Remove the `repo_id` parameter from the `Bookmarks` trait methods.
The `repo_id` parameter was intended to allow a single `Bookmarks` implementation
to serve multiple repos. In practice, however, each repo has its own config, which
results in a separate `Bookmarks` instance for each repo. The `repo_id` parameter
complicates the API and provides no benefit.
To make this work, we switch to the `Builder` pattern for `SqlBookmarks`, which
allows us to inject the `repo_id` at construction time. In fact nothing here
prevents us from adding back-end sharing later on, as these `SqlBookmarks` objects
are free to share data in their implementation.
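The Builder change can be illustrated with a small Python stand-in for the Rust code (all names here are hypothetical): `repo_id` is injected once at construction and disappears from every method signature.

```python
class SqlBookmarksBuilder:
    """Builds per-repo bookmark stores over a possibly shared backend."""

    def __init__(self, connection):
        self.connection = connection

    def with_repo_id(self, repo_id):
        # repo_id is bound here, once, at construction time.
        return SqlBookmarks(self.connection, repo_id)

class SqlBookmarks:
    def __init__(self, connection, repo_id):
        self.connection = connection  # backend may still be shared
        self.repo_id = repo_id

    def get(self, name):
        # Note: repo_id no longer appears in the method signature.
        return self.connection.get((self.repo_id, name))
```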
Reviewed By: StanislavGlebik
Differential Revision: D22437089
fbshipit-source-id: d20e08ce6313108b74912683c620d25d6bf7ca01
Summary:
The dbbookmarks crate is getting too large for a single file. Split it up into
a `store` module, which implements the bookmarks traits, and a `transaction`
module, which handles bookmark transactions.
Reviewed By: krallin
Differential Revision: D22437088
fbshipit-source-id: 629b62de151400cdbf56d502aef061df46c3da81
Summary:
Separate out the `BundleReplayData` from the `BookmarkUpdateReason` enum. There's
no real need for this to be part of the reason, and removing it means we can
abstract away the remaining dependency on Mercurial changeset IDs from
the main bookmarks traits.
Reviewed By: mitrandir77, ikostia
Differential Revision: D22417659
fbshipit-source-id: c8e5af7ba57d10a90c86437b59c0d48e587e730e
Summary: For populating the XDB blobstore, we'd like to copy data from Manifold - the easiest way to do that is to exploit MultiplexedBlobstore's scrub mode to copy data directly.
Reviewed By: krallin
Differential Revision: D22373838
fbshipit-source-id: 550a9c73e79059380337fa35ac94fe1134378196
Summary: All file content should be bytes.
Reviewed By: quark-zju
Differential Revision: D22468694
fbshipit-source-id: 9febc60d0fd1c49c2f5812f3ba9f10b4782ffd11
Summary: Per comments on D22429347, add a new `ExtractInnerRef` trait that is similar to `ExtractInner`, but returns a reference to the underlying value. A default implementation is provided for types whose inner value is `Clone + 'static`, so in practice most types will only need to implement `ExtractInnerRef`, whereas the callsite may choose whether it needs a reference or an owned value.
Reviewed By: quark-zju
Differential Revision: D22464158
fbshipit-source-id: 7b97329aedcddb0e51fd242b519e79eba2eed350
Summary: Add an `edenapistore` class that wraps an `EdenApiHgIdRemoteStore`. This class exists purely as a means to set up the stores from Python code and to get an `Arc<EdenApiHgIdRemoteStore>` to the Rust content store. It has no functionality of its own.
Reviewed By: quark-zju
Differential Revision: D22449702
fbshipit-source-id: ad2094c79da523071b6ed8344c8dde706e448c95
Summary: This is effectively a complete rewrite of the EdenAPI Python bindings to use the new client.
Reviewed By: quark-zju
Differential Revision: D22442903
fbshipit-source-id: c3cf2b2b8291e24d6d4d3a3546ccc69472510567
Summary: Ensure that all of the components of an EdenAPI response use the same error type.
Reviewed By: quark-zju
Differential Revision: D22443029
fbshipit-source-id: 3e00a8b83677beb5ef2d90630fe9b85760874186
Summary: Add an `add_entry` convenience method to `HgIdMutableDeltaStore`, similar to the one present in `HgIdMutableHistoryStore`.
Reviewed By: quark-zju
Differential Revision: D22443031
fbshipit-source-id: 84fdaae9fbd51e6f2df466b0441ec5f7ce6715f7
Summary:
A common pattern in Mercurial's data storage layer Python bindings is to have a Python object that wraps a Rust object. These Python objects are often passed across the FFI boundary to Rust code, which then may need to access the underlying Rust value.
Previously, the objects that used this pattern did so in an ad-hoc manner, typically by providing an `into_inner` or `to_inner` inherent method. This diff introduces a new `ExtractInner` trait that standardizes this pattern into a single interface, which in turn allows this pattern to be used with generics.
Reviewed By: quark-zju
Differential Revision: D22429347
fbshipit-source-id: cab4c24b8b98c6ef8307f72a9b4726aabdc829cc
Summary:
Bundle2 chunks have to fit under 2GB, but we have code that simply
returns an entire buffer as a single chunk, which may be over 2GB. Let's split
that up into smaller chunks.
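The splitting itself is simple; a Python sketch (the constant mirrors the bundle2 limit, the helper name is made up):

```python
MAX_CHUNK = 2 * 1024 ** 3 - 1  # stay strictly under bundle2's 2GB chunk limit

def split_chunks(buf, limit=MAX_CHUNK):
    """Yield slices of buf, each no larger than limit bytes."""
    for start in range(0, len(buf), limit):
        yield buf[start:start + limit]
```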
Reviewed By: quark-zju
Differential Revision: D21235286
fbshipit-source-id: f52366fb5ecebf4f9f00914e044c46e147873bec
Summary: Remove all of the old EdenAPI Python code from Mercurial. For the new EdenAPI client, we intend to expose HTTP fetching through the Rust storage interfaces rather than putting conditionals throughout the Python code.
Reviewed By: quark-zju
Differential Revision: D22405579
fbshipit-source-id: d3c9ed02d9f624b9490e9280b8b0b4f8a127a9b5
Summary:
D22396026 made it so that `HttpClient::send_async` no longer consumes `self`. This means that instead of creating a new HTTP client for each request, we can reuse the same one.
This has the benefit of allowing for connection reuse (which was the point of D22396026), resulting in lower latency for serial fetches.
Reviewed By: quark-zju
Differential Revision: D22397768
fbshipit-source-id: 9d066c1ec64a6aa1b36ec674ef294030c1f90b41
Summary: Allow passing multiple JSON requests to the EdenAPI CLI. The requests will be performed serially, which allows for testing the performance of serial EdenAPI calls.
Reviewed By: quark-zju
Differential Revision: D22397769
fbshipit-source-id: c59e5abf53eee9c2014010672183e202b6f180fc
Summary:
Add a pool of `Multi` handles that the client can reuse across requests.
Previously, `HttpClient`'s async functions had to consume the client in order to have a `'static` lifetime (since `Future`s generally cannot hold references to things outside of themselves). This meant that each async operation would use its own `Multi` handle, preventing connection reuse across operations, since the `Multi` handle maintains a connection cache internally.
With this change, the client can reuse the `Multi` session after an async operation, thereby benefitting from libcurl's caching. Note that the same `Multi` handle still cannot be used by concurrently running `Future`s (as this [would not be thread safe](https://curl.haxx.se/libcurl/c/threadsafe.html)), but once a `Future` has completed its `Multi` handle will return to the pool for use by subsequent requests.
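The pooling idea can be sketched in Python (a stand-in for the Rust code; names are illustrative): a handle is taken for the duration of one operation and returned afterwards, so its internal caches survive for the next request.

```python
import threading

class HandlePool:
    """Pool of reusable handles; creates a new one only when the pool is empty."""

    def __init__(self, factory):
        self._factory = factory  # creates a new handle on pool exhaustion
        self._free = []
        self._lock = threading.Lock()

    def take(self):
        # Each handle is held exclusively while in use (no concurrent sharing).
        with self._lock:
            return self._free.pop() if self._free else self._factory()

    def give_back(self, handle):
        # Returning the handle lets the next operation reuse its caches.
        with self._lock:
            self._free.append(handle)
```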
---
(Somewhat tangential)
As is noted in the code comments, `libcurl`'s C API provides a way to share caches across multiple multi sessions: [the "share" interface](https://curl.haxx.se/libcurl/c/libcurl-share.html).
While using this would seem preferable to an ad-hoc solution like this diff, it turns out that the `curl` crate does not provide safe bindings to the share interface. This means that in order to use the share interface, we'd need to directly use the unsafe bindings from `curl-sys`.
In addition to the difficulty of working with unsafe FFI code, the API expects the application to handle synchronization by passing it function pointers to handle locking/unlocking shared resources.
Ultimately, I came to the conclusion that managing lifetimes and synchronization in unsafe code across an FFI boundary would be nontrivial, and ensuring correctness would require a lot of effort that could be avoided by implementing an ad-hoc solution on top of the safe API instead. However, it might make sense to change this to use the share interface in the future.
Reviewed By: quark-zju
Differential Revision: D22396026
fbshipit-source-id: 06eea2ffacdc791527eac9ce4becc457af5c0480
Summary: Update the EdenAPI Python bindings to use the new client. This is mostly just a stopgap measure to allow us to delete the old client code; nothing in production actually uses these bindings anymore, and the new client will primarily be used from Rust.
Reviewed By: quark-zju
Differential Revision: D22379476
fbshipit-source-id: 953e0ffc2ce682869ee234d672a154046b373c1e
Summary: Update the `revisionstore` and `backingstore` crates to use the new EdenAPI crate.
Reviewed By: quark-zju
Differential Revision: D22378330
fbshipit-source-id: 989f34827b744ff4b4ac0aa10d004f03dbe9058f
Summary: Add a new `EdenApiBlocking` trait that exposes blocking versions of the `EdenApi` trait's methods, for use in non-async code.
Reviewed By: quark-zju
Differential Revision: D22305396
fbshipit-source-id: d0f3a73cad1a23a4f0892a17f18267374e63108e
Summary:
This diff adds an EdenAPI CLI program that allows manually sending requests to the server.
Requests are read from stdin in a JSON format (the same format used by the `make_req` tool and the EdenAPI server integration tests). This makes it easy to create and edit requests during debugging.
Responses are re-serialized as CBOR and written to stdout. (The program will refuse to write output if stdout is a TTY.) These responses can then be analyzed using the `read_res` tool (also used by the EdenAPI server integration tests).
The program prints real-time download statistics during data fetching, allowing the user to debug performance in addition to correctness.
The program uses standard `hgrc` files to configure the EdenAPI client, which means that one can simulate production settings by specifying a production `hgrc`. By default, it will read from `~/.hgrc.edenapi` rather than `~/.hgrc` since the user will most likely want to configure this program independently of Mercurial.
Reviewed By: quark-zju
Differential Revision: D22370163
fbshipit-source-id: 5d9974bc05fa960d26cd2c87810f4646e2bc55b4
Summary:
There was some missed usage of `Path.resolve`. This diff should cover it all.
```
cli $ rg -F ".resolve"
main.py
967: uid = self.resolve_uid(args.uid)
968: gid = self.resolve_gid(args.gid)
util.py
622: `Path.resolve`. This is a helper method to work around that by using
628: return path.resolve(strict=strict)
```
Reviewed By: chadaustin
Differential Revision: D22459188
fbshipit-source-id: c2a1b132f752cc399ebf34723f26123559939f2a
Summary:
Apparently some of these tests still run in py2. Let's let them fall back
to the old mysql-connector connector.
Reviewed By: xavierd
Differential Revision: D22458822
fbshipit-source-id: add3da42cbd18e6cb5b34b3038d96cf52c7c6387
Summary:
proxy_import_helper.py exists for compatibility with older EdenFS
builds. None of those builds are running anymore, so remove it.
Reviewed By: genevievehelsel
Differential Revision: D22451196
fbshipit-source-id: 4d258b3fafe13bb67bd11259f5d1193a7e5575e6
Summary: This diff defines `Overlaychecker::ProgressCallback` to replace repetitive function type declaration.
Reviewed By: genevievehelsel
Differential Revision: D22243160
fbshipit-source-id: ea05e451817a760b5266879b956eaea48dc8d85e
Summary:
Previously the backfill_batch_dangerous method was calling the internal derive_impl() method
directly. That wasn't great (after all, we were calling a function whose name suggests it should only be called from inside the derived data crate), so this diff changes it to call the batch_derive() method instead.
This gives a few benefits:
1) We no longer call the internal derive_impl function
2) It allows different types of derived data to override batching behaviour.
For example, we've already overridden it for fsnodes, and the next diff will override
it for blame as well.
To make it compatible with derive_impl(), batch_derive() now accepts the derive data mode and mapping.
Reviewed By: krallin
Differential Revision: D22435044
fbshipit-source-id: a4d911606284676566583a94199195860ffe2ecf
Summary:
From the RocksDB documentation:
> When opening a DB in a read-write mode, you need to specify all Column
Families that currently exist in a DB. If that's not the case, DB::Open call
will return Status::InvalidArgument()
This can cause problems for us in a couple of situations:
- When we need to rollback from an eden version where we added a column to
our configuration for RocksDB
- When we delete a column from our configuration for RocksDB
To make sure we do not encounter this error, we need to make sure that we still
open all the columns existing in the database, even if they are not in our
configured list of column families.
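A sketch of the selection logic in Python (hypothetical helper; the real code is C++ against the RocksDB API): list what exists on disk, then open the union, keeping our configured order.

```python
def columns_to_open(configured, on_disk):
    """Open every column family that exists on disk: our configured columns
    in their configured order, plus any extras found in the database (e.g.
    left behind by a newer eden version after a rollback)."""
    extras = [c for c in on_disk if c not in configured]
    return list(configured) + extras
```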
Reviewed By: wez
Differential Revision: D22425310
fbshipit-source-id: 9822b22cfedf4633f65bbed96f95a546dd3614f4
Summary:
D22206317 (9a6ed4b6ca) added requesting of predecessor information for suspected primordials
by the successor ID. This allows recovery of earlier predecessors when partial
data upload resulted in the history of a commit being extended backwards.
Unfortunately, while the individual requests are fast, the combined request
using `OR` in SQL ended up being very slow for some requests.
Separate out the requests at the application level, and aggregate the results
by concatenating them. `collect_entries` already handles duplicates should any
arise.
Most of the time the successor query will very quickly return no rows, as
it only matters when history is extended backwards, which is expected to be
rare.
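The application-level split can be sketched in Python (the `query_one` callable is hypothetical; the real code issues one SQL query per key):

```python
def fetch_entries(query_one, ids, successor_ids):
    """Run one fast query per key instead of a single slow OR query, then
    concatenate the results; downstream dedup (collect_entries) handles
    any duplicates."""
    results = []
    for i in ids:
        results.extend(query_one("id", i))
    for s in successor_ids:
        # Usually returns no rows: only matters when history was extended
        # backwards by a partial data upload.
        results.extend(query_one("successor", s))
    return results
```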
Reviewed By: ikostia
Differential Revision: D22456062
fbshipit-source-id: 1e6094b4ac1590a5824e9ae6ef48468766560188
Summary:
Renamed xdiff functions to avoid linking issues when using both libgit2-sys and xdiff.
When using repo_import tool (https://fburl.com/diffusion/8p6fhjt2) we have libgit2-sys dependency for importing git repos. However, when we derive blame data types, we need to use xdiff functionalities (from_no_parents: https://fburl.com/diffusion/pitukmyo -> diff_hunks: https://fburl.com/diffusion/9f8caan9 -> xdl_diff: https://fburl.com/diffusion/260x66hf). Both libgit2 and eden/scm have vendored versions of xdiff library. Therefore, libgit2-sys and eden/scm share functions with the same signatures, but have different behaviours and when we tried to derive blame, it used libgit2-sys's xdl_diff instead of eden's. This resulted in getting segfaults (https://fburl.com/paste/04gwalpo).
Note: repo_import is the first tool that has tried to import both and the first to run into this issue.
Reviewed By: StanislavGlebik
Differential Revision: D22432330
fbshipit-source-id: f2b965f3926a2dc45de1bf20e41dad70ca09cdfd
Summary:
Currently when we are resolving the full command line for a client pid, we only
read the first 256 bytes of the command.
This means that some commands will be truncated; this has come up in some
of our recently added logs. This ups the buffer size so that we can
hopefully get the full command line.
The longer-term solution would be to implement something fancier, as mentioned
in the comment in the code copied below, but that also has drawbacks, as mentioned there.
> // Could do something fancy if the entire buffer is filled, but it's better
// if this code does as few syscalls as possible, so just truncate the
// result
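For reference, `/proc/<pid>/cmdline` separates (and terminates) argv entries with NUL bytes; a Python sketch of the parsing, with the enlarged buffer size shown as a constant (values illustrative, the real code is C++):

```python
CMDLINE_BUF_SIZE = 4096  # was effectively 256 bytes; larger to avoid truncation

def parse_cmdline(raw):
    """Split a /proc/<pid>/cmdline buffer into argv entries.

    If the single read() that filled `raw` used too small a buffer, the
    last entry may be truncated - the failure mode this diff mitigates.
    """
    if not raw:
        return []
    return raw.rstrip(b"\0").split(b"\0")
```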
Reviewed By: wez
Differential Revision: D22436219
fbshipit-source-id: 80a9aecfe148aa3e333ca480c6a8cb8b9c5c86f2
Summary:
Bypass truncation-based transaction if narrow-heads is on.
The transaction abort still works logically because commit references stay
unchanged on abort.
Related EdenFS and Mononoke tests are updated. Mononoke tests probably
shouldn't rely on revlog / fncache implementation details in hg.
Reviewed By: DurhamG
Differential Revision: D22240186
fbshipit-source-id: f97efd60855467b52c9fb83e7c794ded269e9617
Summary:
With narrow-heads, visible heads are explicitly controlled by commit
references. Adding commits can be just writing them out directly.
This mainly removes the "buffered" writes of `00changelog.i`.
Instead of writing pending changes to `00changelog.i.a`, they
are directly written to `00changelog.i` (or buffered in memory
with future changes).
This does not bypass all transaction logic. Truncation can still
happen. Strip is also unaffected.
The change is incomplete. In the future, pending changes will
be written in-memory to the Rust HgCommits struct and we no
longer write directly to revlog.
Reviewed By: DurhamG
Differential Revision: D22240176
fbshipit-source-id: ac9d20ab95ff304fb285a503d2d3db815942d5b3
Summary: This makes pyre aware that `istest` exists on `util`.
Reviewed By: DurhamG
Differential Revision: D22421141
fbshipit-source-id: 50dd264988ffe0e93597df2d540f3de03e8aea4d
Summary:
With modern configs, repo is unfiltered and `ctx.children()` returns unfiltered
commits. Use the revset function `children` instead so invisible children won't
trigger auto restack.
Reviewed By: DurhamG
Differential Revision: D22421689
fbshipit-source-id: 3ec8f616c17254ee9ccfcad96673d209b9163da6
Summary: The test demonstrates an issue with the current auto restack logic.
Reviewed By: DurhamG
Differential Revision: D22421690
fbshipit-source-id: e035cd3212357f24322f8eb9ec5941767ad780d9
Summary:
This diff is a complete, ground-up rewrite of the EdenAPI client. Rather than attempting to use `libcurl` directly, it relies on the new `http_client` crate, which makes the code considerably simpler and allows for a proper async interface.
The most notable change is that `EdenApi` is now an async trait. A blocking API is added later in the stack for use in non-async contexts.
Reviewed By: quark-zju
Differential Revision: D22305397
fbshipit-source-id: 4c1e5d3091d6dd04cf13291e7b7a4217dfdd249f
Summary:
As was pointed out in the review for D22280745 (d73c63d862), `CborStream` is inefficient in situations where the underlying stream produces chunks that are much smaller than the size of the serialized items. To avoid pathological behavior, make `CborStream` buffer the incoming data, and only attempt deserialization if enough data has accumulated.
For now, the buffer size is fixed (with a default of 1MB, chosen arbitrarily). In the future, it might make sense to have the stream adjust the buffer size based on the average size of observed deserialized values.
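The buffering strategy can be sketched in Python, using newline-delimited JSON as a stand-in for CBOR (the names and the toy framing are illustrative): decoding is only attempted once the buffer crosses the threshold or the stream ends, instead of retrying a parse on every tiny chunk.

```python
import json

def decode_stream(chunks, threshold=1024 * 1024):
    """Buffer incoming chunks; attempt deserialization only once enough
    data has accumulated (or the stream is exhausted)."""
    buf = b""
    pending = list(chunks)
    while pending:
        buf += pending.pop(0)
        if len(buf) < threshold and pending:
            continue  # not enough data yet; keep buffering
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            yield json.loads(line)
    if buf:
        # Flush whatever remains at end of stream.
        yield json.loads(buf)
```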
Reviewed By: quark-zju
Differential Revision: D22370164
fbshipit-source-id: ed940c56ca2cbbfc07f01d47becf6f1d71872872
Summary: On Windows, an mmapped file cannot be replaced. Detect that and delete it manually.
Reviewed By: farnz
Differential Revision: D22428731
fbshipit-source-id: 4d308a07aae02dcaf2aedb7b0267a535c2e09c92
Summary:
Diff D22140187 (74da65a38f) upgraded mysql-connector-python, which enabled ssl by
default. Our db doesn't support this, so we disabled it for hgsql but forgot to
do so for infinitepush and pushrebase. Let's do it for them too.
Reviewed By: krallin
Differential Revision: D22416533
fbshipit-source-id: bc91ccd2ab4d9bc8ba423c8e60fc0191c7ff78c6
Summary:
The goal is to make it easier to implement unit tests, which depend on `LiveCommitSyncConfig`. Specifically, `scs` has a piece of code, which instantiates `mononoke_api::Repo` with a test version of `CommitSyncConfig`. To migrate it to `LiveCommitSyncConfig`, I need to be able to create a test version of that. It **is** possible now, but would require me to turn a supplied instance of `CommitSyncConfig` back into `json`, which is cumbersome. Using a `dyn LiveCommitSyncConfig` there, instead of a concrete struct seems like a good idea.
Note also that we are using this technique in many places: most (all?) of our DB tables are traits, which we then implement for SQL-specific structs.
Finally, this diff does not actually migrate all of the current users of `LiveCommitSyncConfig` (the struct) to be users of `LiveCommitSyncConfig` (the trait), and instead makes them use `CfgrLiveCommitSyncConfig` (the trait impl). The idea is that we can migrate bits to use traits when needed (for example, in an upcoming `scs` diff). When not needed, it's fine to use concrete structs. Again, this is already the case in a few places: we sometimes use the `SqlSyncedCommitMapping` struct directly, instead of `T: SyncedCommitMapping` or `dyn SyncedCommitMapping`.
Reviewed By: StanislavGlebik
Differential Revision: D22383859
fbshipit-source-id: 8657fa39b11101684c1baae9f26becad6f890302
Summary:
This updates the AsyncRead implementations we use in hgproto and
mercurial_bundles to use a LimitedAsyncRead. The upshot of this change is that
we eliminate O(N^2) behavior when parsing the data we receive from clients.
See the earlier diff on this stack for more detail on where this happens, but
the bottom line is that Framed presents a full-size buffer that we zero out
every time we try to read data. With this change, the buffer we zero out is
comparable to the amount of data we are reading.
This matters in commit cloud because bundles might be really big, and a single
big bundle is enough to take an entire core for a spin for 20 minutes (and
achieve nothing but a timeout in the end). That being said, it's also useful for
non-commit cloud bundles: we do occasionally receive big bundles (especially
for WWW codemods), and those will benefit from the exact same speedup.
One final thing I should mention: this is all in a busy CPU poll loop, and as I noted
in my earlier diff, the effect persists across our bundle receiving code. This means
it will sometimes result in not polling other futures we might have going.
Reviewed By: farnz
Differential Revision: D22432350
fbshipit-source-id: 33f1a035afb8cdae94c2ecb8e03204c394c67a55
Summary: The 0.3 version (currently used only in one crate, eden/scm/lib/commitcloudsubscriber) uses an old openssl crate which doesn't work with the openssl library installed on most machines (both at FB and on GitHub Actions).
Reviewed By: mitrandir77
Differential Revision: D22430649
fbshipit-source-id: b8fa930841dbcdd4c085d8c9488d768b3526e1c4
Summary:
The dirstate code did not prevent absolute paths from being added to
the structure, but they would cause problems later when those paths were passed
to Rust. We should move the dirstate to use the Rust path type, but for now
let's just block absolute paths.
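The guard itself is tiny; a Python sketch (hypothetical function, with a dict standing in for the dirstate):

```python
import os.path

def dirstate_add(dirstate, path):
    """Reject absolute paths before they enter the dirstate, since they
    cause problems once passed to the Rust path type later on."""
    if os.path.isabs(path):
        raise ValueError("cannot add absolute path to dirstate: %r" % path)
    dirstate[path] = "n"  # 'n' = normal, as in Mercurial's dirstate states
```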
Reviewed By: quark-zju, xavierd
Differential Revision: D22426592
fbshipit-source-id: 4ae9f004237e4c54336beb03aab29517254ae441
Summary:
We've seen a handful of users complaining about clone failing and not being
able to recover from it. From looking at the various reports and the
stacktraces, I believe this is caused by a flaky connection on the user end
that causes the Python code to retry the getpack calls. Before retrying, the
code will figure out what still needs fetching and this is done via the
getmissing API. When LFS pointers were fetched, the LFS blobs aren't yet
present on disk, and thus the underlying ContentStore::get_missing will a set
of keys that contain some StoreKey::Content keys. The code would previously
fail at this point, but since the key also contains the original key, we can
simply return this, the pointers might be refetched but these are fairly small.
Taking a step back from this bug, the issue really is that the retry logic is
done in code that cannot understand content-keys, and moving it to a part of
the code that understands this would also resolve the issue.
I went with the simple approach for now, but since other remote stores
(EdenAPI, the LFS one, etc) would also benefit from the retry logic, we may
want to move the logic into Rust and remove the getmissing API from the Python
exposed ContentStore.
Reviewed By: DurhamG
Differential Revision: D22425600
fbshipit-source-id: 69c2898cc302d2170cd0f206c89189c341db5278
Summary:
Make zsh_completion complete standard aliases like `checkout`.
This restores the behavior before D18463299 (54451585ce) stack.
Reviewed By: farnz
Differential Revision: D22396737
fbshipit-source-id: 745761041d6d1dec6adba2efb102e2021a01b36b
Summary:
Rather than dynamically allocating an event loop in the systemd async
code, make all the corresponding functions async, so the caller is
responsible for threading an event loop down.
Reviewed By: genevievehelsel
Differential Revision: D21894106
fbshipit-source-id: 398c769c30c85a3bb210dbc209f34f9f7336996c
Summary: I'd eventually like to use this in the edenfs_monitor, so I'm moving it to `proc_utils` for shareability.
Reviewed By: chadaustin
Differential Revision: D21998763
fbshipit-source-id: 052e78fb8e58515f98eb465b8041fd0e621fc9da
Summary: I'd eventually like to use this in the edenfs_monitor, so I'm adding this to `proc_utils` for future ease of use.
Reviewed By: chadaustin
Differential Revision: D21987390
fbshipit-source-id: 076672b44311c2a1e0cac934c0674a18a87649af
Summary: On macOS, an "Icon?" file (aka "Icon\r") is sometimes added. This file is weird to ignore: it should be ignored using "Icon\r\r" or "Icon[\r]", and won't be hidden by a plain "Icon\r" pattern.
Reviewed By: chadaustin
Differential Revision: D22050682
fbshipit-source-id: 51d7d4c2414a07b959120455ae991d2425c1ea4d
Summary:
I want to update the health check to stop averaging averages (like in
D22394014). To do this, I need those counters.
Reviewed By: ahornby
Differential Revision: D22410196
fbshipit-source-id: aa5cbfe6607be3b953887f1639e1de54baac7389
Summary:
Just knowing the number of fetched undesired files doesn't give the full
picture; e.g. fetching lots of small files is better than fetching a single
multi-GB file.
So knowing the size of the files is helpful as well.
Reviewed By: krallin
Differential Revision: D22408400
fbshipit-source-id: 7653c1cdceccf50aeda9ce8a4880ee5178d4b107
Summary:
This is causing the Mononoke and hg tests to break on an undefined variable.
This looks like it might have been a refactoring accident when the code
defining this got extracted into D22240176 (but the code using it landed in
D22240177 (ec58e72903)).
This diff unbreaks the tests by defaulting the parameter to False (which
seems coherent with the idea of that latter diff), and puts it in a place
that'll trigger a merge conflict for quark-zju when he rebases D22240176.
Reviewed By: farnz
Differential Revision: D22408588
fbshipit-source-id: 496808742a13dfeb17989123742a0aa8bae17b38
Summary: [edenfs] Some minor modifications to verbiage in documentation
Reviewed By: chadaustin
Differential Revision: D22394129
fbshipit-source-id: 2d662e56d621fd5e5d5ba6de284ca3d08f8bd4e5
Summary:
checkunknown is quite expensive since it has to read the contents of
every untracked file, which can be tens of thousands of non-parallel stats and
reads. For files that don't exist in the working copy, it's just wasted work to
stat for the files at all. Status can efficiently tell us what files are
unknown, so let's use that to triage most "unknown" files to normal writes
before we even get to checkunknown.
The downside of this approach is that it makes an additional call to status,
which is not cached (only non-unknown+non-ignore+non-clean status calls are
cached). We could add more caching if this is a problem.
This doesn't help the case where a user might have 10k+ untracked files due to a
ctrl+c'd checkout, but we'll improve that in a future diff.
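The triage described above amounts to something like this minimal sketch (the names are illustrative, not hg's internal API):

```python
def triage_targets(targets, status_unknown):
    """Split checkout targets into plain writes and files that need the
    expensive content comparison in checkunknown. Files that status did
    not report as unknown cannot conflict, so they are written directly."""
    unknown = set(status_unknown)
    plain_writes = [f for f in targets if f not in unknown]
    need_check = [f for f in targets if f in unknown]
    return plain_writes, need_check
```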
Reviewed By: quark-zju
Differential Revision: D22366758
fbshipit-source-id: b54fec113dc162f97a35e705ed083ddd14babe55
Summary: D22381744 updated the version of `futures` in third-party/rust to 0.3.5, but did not regenerate the autocargo-managed Cargo.toml files in the repo. Although this is a semver-compatible change (and therefore should not break anything), it means that affected projects would see changes to all of their Cargo.toml files the next time they ran `cargo autocargo`.
Reviewed By: dtolnay
Differential Revision: D22403809
fbshipit-source-id: eb1fdbaf69c99549309da0f67c9bebcb69c1131b
Summary:
Automatic run to suppress type errors.
#pyreupgrade
Differential Revision: D22404242
fbshipit-source-id: a7eabe6c6eb0dc9f29b3cf01f780c34fff1c6810
Summary:
The key is "hit_count" not "hits". This typo caused the trace to always claim
that no data was fetched from memcache, which is obviously not true as the
getpack trace that follows lists significantly fewer requested keys.
Reviewed By: kulshrax
Differential Revision: D22401592
fbshipit-source-id: ab2ea3e7f8ff3a9c7322678afc8a174e09d6dc09
Summary:
Mercurial's concept of a `null` revision (hardcoded as 20 zeros) is a
headache to special-case. See https://www.mercurial-scm.org/wiki/RevsetVirtualRevisionPlan.
The Rust DAG layer cannot handle it. Make pydag drop the nullid or nullrev when
crossing the Python -> Rust boundary.
A cleaner way to handle `null` might be:
- Create a new vertex in the DAG in memory that has empty content.
Calculate its commit hash normally. The commit is isolated from other parts
of the commit graph. It has no parents and no children.
The vertex has an assigned Id, which is not zero if the repo is not empty.
- Assign the `null` special name (like how we do for `tip`) to the commit.
- Remove all hard-coded special cases of the 20-zero `nullid`.
That would allow things like `hg up null` and `hg diff -r null -r X` to
continue to work without special-casing them in the commit graph layer.
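The boundary filtering amounts to something like this sketch (`NULL_ID` is Mercurial's hardcoded 20-zero node; the function name is illustrative):

```python
NULL_ID = b"\0" * 20  # Mercurial's hardcoded null node: 20 zero bytes

def drop_null(nodes):
    """Filter out the null node before handing a node list to the Rust
    DAG layer, which has no concept of a null revision."""
    return [n for n in nodes if n != NULL_ID]
```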
Reviewed By: sfilipco
Differential Revision: D22240188
fbshipit-source-id: 707af47cbf36a7df60097a17d69094aae89d3250
Summary:
The new Rust structure has enough features to replace index2.
This will eventually allow us to delete index2 related logic (namely,
pyrevlogindex).
Reviewed By: sfilipco
Differential Revision: D22240178
fbshipit-source-id: 1af9e6045c8d8d1a220a6abad6d33b129a3afa70
Summary:
The old revlog C code works with strip. The new Rust code is never designed to
work with strip (especially the segmented changelog does not support strip).
In the strip code path, just reload the changelog after strip to avoid issues.
Reviewed By: sfilipco
Differential Revision: D22323188
fbshipit-source-id: c4f790c66372c28a71173cf16910ad1d7cb89223
Summary:
Strip is a special case. `tr.changes["revs"]` can contain invalidated
revision numbers. Do not use it.
Reviewed By: sfilipco
Differential Revision: D22240179
fbshipit-source-id: 6b9d29e099f821a7fc7aa6055dc8eccf4597ffd0
Summary:
This exposes the Rust HgCommits object. It will replace changelog operations
gradually.
This diff only makes changelog load the new Rust HgCommits structure, and
maintain it side-by-side with the original data structures when there are
changes. It does not replace real changelog operations yet.
Reviewed By: sfilipco
Differential Revision: D22240177
fbshipit-source-id: b585c1585defdc133d2b9ef2fda4aea8702152bf
Summary:
If the revlog on disk was changed to include new commits, read them and avoid
writing duplicated commits (which breaks nodemap building).
Reviewed By: sfilipco
Differential Revision: D22323187
fbshipit-source-id: cdd65f31e65865d9f3868e43416633297896c0f9
Summary:
Change pydag from using concrete `namedag` and `memnamedag` to trait objects:
- `commits`: High-level read-write commits storage, supports Rust `HgCommits`
(segmented changelog), `MemHgCommits`, and `RevlogCommits`.
- `dagalgo`: maps to the `DagAlgorithm` Rust trait.
- `idmap`: maps to the `IdConvert + PrefixLookup` Rust traits.
The idea is that we move the revlog / segmented changelog difference from Python
to behind Rust trait objects so the Python code looks overall cleaner, the Rust
revset alternative gets exercised early, and switching from revlog to segmented
changelog becomes easier.
Reviewed By: sfilipco
Differential Revision: D21796242
fbshipit-source-id: 3a4a3ff3d9e7e46059d1ed3461a55003c352e82d
Summary: This is used by the next diff.
Reviewed By: sfilipco
Differential Revision: D21944139
fbshipit-source-id: 184c4e97aaeca36c3608665defd1473c9300fb5b
Summary: This will satisfy some use-cases.
Reviewed By: sfilipco
Differential Revision: D21854225
fbshipit-source-id: 76758716b35cfd31dc3843c118917c0fb7609027
Summary: This will help move more Python logic to Rust.
Reviewed By: sfilipco
Differential Revision: D21854224
fbshipit-source-id: b03cbacedc11d77e8c56262437a8d10bd9a89e59
Summary: This is discovered by using it in Python world.
Reviewed By: sfilipco
Differential Revision: D22323186
fbshipit-source-id: 295811e0950b94ad2ad73ad242228b6a3f9765d0
Summary: Adding a same commit multiple times is a no-op.
Reviewed By: sfilipco
Differential Revision: D22323190
fbshipit-source-id: 61a06335581a9cad32dc7e929b841ec69b551a9c
Summary: This adds some test coverage for the revlog DagAlgorithm implementation.
Reviewed By: sfilipco
Differential Revision: D22249157
fbshipit-source-id: a1d347b4d90d0e7f8fb229c317cc75c2b8e16242
Summary:
This makes RevlogIndex compatible with the generic DAG testing API from the dag
crate.
Reviewed By: sfilipco
Differential Revision: D22249156
fbshipit-source-id: 54a3c458e85804968964174eab674e494a6fa8a2
Summary: Some DAG implementations do not support it.
Reviewed By: sfilipco
Differential Revision: D22249158
fbshipit-source-id: ebcdf164677ee647ef44aa1ee3cfd318bac658b0
Summary:
Different implementations might return different orders. They should all be
considered correct.
Reviewed By: sfilipco
Differential Revision: D22249159
fbshipit-source-id: 36e4cadf814366f7ee2ed8a778948ff810760550
Summary: This makes it possible to run tests for other DAGs, like the revlog.
Reviewed By: sfilipco
Differential Revision: D22249155
fbshipit-source-id: 205579eeaccd42a21297d965973957168bb8726e
Summary:
For revlog, calculating `only` can have some fast paths that do not scan the
entire changelog.
Reviewed By: sfilipco
Differential Revision: D21944136
fbshipit-source-id: 58391636350f8f19643d59c46d663f55861d6de3
Summary:
This will be used to maintain the narrow-heads phase calculation and to sunset
the revlog-specific changelog.index2.
Reviewed By: sfilipco
Differential Revision: D21944131
fbshipit-source-id: a8bbd1fd24546f4891ffa677476bff750c3faf5f
Summary: The values of `pending_nodes_index` should start from 0 instead of 1.
Reviewed By: sfilipco
Differential Revision: D21944133
fbshipit-source-id: a2a332868f16b398037289c81bf8076d1400c0a7
Summary:
This drops the `file` parameter from the `raw_data` API, making
RevlogIndex easier to use.
Reviewed By: sfilipco
Differential Revision: D21854228
fbshipit-source-id: 259726524d1cc6a1f9d00783e22f9502c7decdeb
Summary:
The reverse, `to_id_set`, already exists.
It turns out that the Python side wants this in many places.
Reviewed By: sfilipco
Differential Revision: D22240175
fbshipit-source-id: b6a3a3a3869dc0c521a21b1d86394421b816632b
Summary:
This provides a way for implementations to optimize the operation.
For segmented changelog, the default implementation is good enough.
For revlog, `only` can have a fast path that does not iterate through the
entire changelog.
A related API `only_both` is added. For revlog it has multiple use-cases,
including narrow-heads phase calculation and revlog.findcommonmissing used by
discovery.
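In terms of ancestor sets, the semantics can be sketched like this (with a caller-supplied `ancestors` function standing in for the real DAG; not the actual Rust trait):

```python
def only(ancestors, x_heads, y_heads):
    """Commits reachable from x_heads but not from y_heads (`::X - ::Y`)."""
    return ancestors(x_heads) - ancestors(y_heads)

def only_both(ancestors, x_heads, y_heads):
    """Return (only, ancestors_of_y). Callers like findcommonmissing need
    both sets, so computing them together avoids a second traversal."""
    ys = ancestors(y_heads)
    return ancestors(x_heads) - ys, ys
```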
Reviewed By: markbt
Differential Revision: D21944132
fbshipit-source-id: d11660dae85ea6158977eb00d1ceaceddf1d8234
Summary:
Sometimes you want to fetch a file. Using curl and the LFS server works, but
this really should be part of Mononoke admin.
Reviewed By: ikostia
Differential Revision: D22397472
fbshipit-source-id: 17decf4aa2017a2c1be52605a254692f293d1bcd
Summary:
This got broken when we moved to Tokio 0.2. Let's fix it and add a test to make
sure it does not regress.
Reviewed By: ikostia
Differential Revision: D22396261
fbshipit-source-id: a8359aee33b4d6d840581f57f91af6c03125fd6a
Summary: Use `thiserror` to provide a more ergonomic API for `DataEntry` hash checking. The `.data()` method now simply returns a `Result` rather than a tuple with an ad-hoc enum.
Reviewed By: quark-zju
Differential Revision: D22376164
fbshipit-source-id: fc39cb212ec1ee5830292db4aa5eca18f2c16a2b
Summary:
The streaming pager expects bytes as inputs, but we were sending
strings to the progress buffer. This fixes it.
Reviewed By: quark-zju
Differential Revision: D22394395
fbshipit-source-id: 4acbfc08ca624ca3c794e6e369df669e370e5b42
Summary:
The test relies on Python revlog implementation details which
do not exist in the Rust revlog implementation.
Reviewed By: DurhamG
Differential Revision: D22240183
fbshipit-source-id: b245b35e561c3364618a0e199244df030cc47942
Summary:
The original test is unmaintainable. Rewrite it to test key features.
I dropped detailed tests about merge conflict / content handling.
In the future we probably will have a clean Rust implementation of "applying
diff between X and Y to Z" which can replace various unmaintainable patch
application logic in Python. We can test that Rust library extensively and
commands will just use the clean library (ex. revert, backout).
Reviewed By: sfilipco
Differential Revision: D22240184
fbshipit-source-id: 4d6c65fe02ccc92e64c62a48f702187678973086
Summary:
`debugstrip` is an operation that depends on multiple legacy components (revlog
strip, truncate-based transaction). They are incompatible with modern configs
(no truncation, heads-based visibility, metalog-based transaction).
Avoid using it in the test.
Reviewed By: DurhamG
Differential Revision: D22240187
fbshipit-source-id: ec215d75fb766957a3d6f58e491ef815f5bedbdc
Summary:
Changed by `fix-revnum.py`. Part of the tests use `hg debugstrip`,
which I'm trying to avoid.
Reviewed By: DurhamG
Differential Revision: D22240181
fbshipit-source-id: a569b712fe4b985378e5c61c000deecccefbc488
Summary:
Tests do not use it, and we have long disabled it in production, so let's
just disable the command unconditionally.
Reviewed By: DurhamG
Differential Revision: D22368834
fbshipit-source-id: 7ebc5b07c4044b6809defc06437cda7256cb2ebf
Summary:
`hg rollback` has long been disabled in the production setup. It has weird behavior and
is likely incompatible with modern transaction frameworks. Remove its usage in
tests.
Reviewed By: DurhamG
Differential Revision: D22240180
fbshipit-source-id: 453684ebbc77132e09b1b717b6ad1e106dcad214
Summary:
End-users have been using visibleheads + narrowheads for a while, hgsql does
not require any filtering, and most tests are migrated to modern configs
(visibility + narrow heads). Now it's time to consider removing the repoview
layer.
layer.
This removes complexities around `changelog.filteredrevs` and the various
different `repoview` objects with caching problems (ex. I have seen `repo` and
`unfi` have inconsistent phasecache, causing them to calculate phases
differently, which is quite hard to reason about confidently).
This will also make it easier to migrate to segmented changelog.
Reviewed By: DurhamG
Differential Revision: D22201084
fbshipit-source-id: 3661c26dd72a64b5005d86e164af4da5a6895649
Summary:
This diff adds two new bits of functionality to `LiveCommitSyncConfig`:
- getting all possible versions of `CommitSyncConfig` for a given repo
- getting `CommitSyncConfig` for a repo by version name
These bits are meant to be used in:
- `commit_validator` and `bookmarks_validator`, which would
need to run validation against a specific config version
- `mononoke_admin`, which would need to be able to query all versions,
display the version used to sync two commits and so on
Reviewed By: StanislavGlebik
Differential Revision: D22235381
fbshipit-source-id: 42326fe853b588849bce0185b456a5365f3d8dff
Summary:
For various reasons (ex. wrong configs like investigating test repos) the
initialization can fail. Ignore them.
Reviewed By: DurhamG
Differential Revision: D22368942
fbshipit-source-id: ae01dcc499f63f373b0f7bec00554ea8074ae7cf
Summary:
This updates the virtually_sharded_blobstore to deduplicate puts only if the
data being put is actually the data we have put in the past. This is done by
keeping track of the hash of things we've put in the presence cache.
This has 2 benefits:
- This is safer. We only dedupe puts we 100% know succeeded (because this
particular instance was the one to attempt the put).
- This creates fewer surprises; notably, it lets us overwrite data in the
backing store (if we are writing something different).
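A minimal sketch of the hash-checked presence cache described above (the class and method names are hypothetical, and sha256 stands in for whatever hash the real store uses):

```python
import hashlib

class DedupingPutCache:
    """A put is skipped only if *this* instance already wrote identical
    bytes for the key; different data for the same key is written through."""

    def __init__(self):
        self._seen = {}  # key -> content hash of what we put

    def should_put(self, key, data):
        digest = hashlib.sha256(data).hexdigest()
        if self._seen.get(key) == digest:
            return False  # identical data already put by us: dedupe
        self._seen[key] = digest
        return True  # new key, or different data: write through
```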
Reviewed By: StanislavGlebik
Differential Revision: D22392809
fbshipit-source-id: d76a49baa9a5749b0fb4865ee1fc1aa5016791bc
Summary:
Running those on my devserver, I noticed they can be a bit flaky. They are
racy on purpose, but let's relax them a bit.
We have a lot of margin here: our blobstore is rate limited at one request
every 10ms, and we need to do 100 requests (the goal is to show that they don't
all wait), so 100ms is fine to prove that they're not rate limited when sharing
the same data.
Reviewed By: StanislavGlebik
Differential Revision: D22392810
fbshipit-source-id: 2e3c9cdf19b0e4ab979dfc000fbfa8da864c4fd6
Summary:
When we look up how a commit was synced, we frequently need to know which version of `CommitSyncConfig` was used to sync it. Specifically, this is useful for admin tooling and commit validator, which I am planning to migrate to use versioned `CommitSyncConfig` in the near future.
Later I will also include this information in the `RewrittenAs` variant of `CommitSyncOutcome`, so that we expose it to real users. I did not do it in this diff to keep it small and easy to review. And because the other part is not ready :P
Reviewed By: StanislavGlebik
Differential Revision: D22255785
fbshipit-source-id: 4312e9b75e2c5f92ba018ff9ed9149efd3e7b7bc
Summary: When I implemented this method, I didn't test that it preserves the order of the input changesets, and I noticed my mistake while testing the scmquery part.
Reviewed By: StanislavGlebik
Differential Revision: D22374981
fbshipit-source-id: 4529f01370798377b27e4b6a706fc192a1ea928e
Summary:
Add the `scsc list-bookmarks` command, which lists bookmarks in a repository.
If a commit id is also provided, `list-bookmarks` will be limited to bookmarks
that point to that commit or one of its descendants.
Reviewed By: mitrandir77
Differential Revision: D22361240
fbshipit-source-id: 17067ba47f9285b8137a567a70a87fadcaabec80
Summary:
There is inevitably interaction between caching, deduplication and rate
limiting:
- You don't want the rate limiting to be above caching (in the blobstore stack,
that is), because you shouldn't rate-limit cache hits (this is where we are
today).
- You don't want the rate limiting to be below deduplication, because then you get
priority inversion where a low-priority rate-limited request might hold the
semaphore while a higher-priority, non rate limited request wants to do the
same fetch (we could have moved rate limiting here prior to introducing
deduplication, but I didn't do it earlier because I wanted to eventually
introduce deduplication).
So, now that we have caching and deduplication in the same blobstore, let's
also incorporate rate limiting there!
Note that this also brings a potential motivation for moving Memcache into this
blobstore, in case we don't want rate limiting to apply to requests before they
go to the _actual_ blobstore (I did not do this in this diff).
The design here when accessing the blobstore is as follows:
- Get the semaphore
- Check if the data is in cache, if so release the semaphore and return the
data.
- Otherwise, check if we are rate limited.
Then, if we are rate limited:
- Release the semaphore
- Wait for our turn
- Acquire the semaphore again
- Check the cache again (someone might have put the data we want while we were
waiting).
- If the data is there, then return our rate limit token.
- If the data isn't there, then proceed to query the blobstore.
If we aren't rate limited, then we just proceed to query the blobstore.
There are a couple subtle aspects of this:
- If we have a "late" cache hit (i.e. after we waited for rate limiting), then
we'll have waited but we won't need to query the blobstore.
- This is important when a large number of requests from the same key
arrive at the same time and get rate limited. If we don't do this second
cache check or if we don't return the token, then we'll consume a rate
limiting token for each request (instead of 1 for the first request).
- If a piece of data isn't cacheable, we should treat it like a cache hit with
regard to semaphores (i.e. release early), but like a miss with regard to
rate limits (i.e. wait).
Both of those are captured in the code by returning the `Ticket` on a
cache hit. We can then choose to either return the ticket on a cache hit, or
wait for it on a cache miss.
(all of this logic is covered by unit tests: removing any of the blocks
there in `Shards::acquire` makes a test fail)
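The access flow above can be sketched roughly as follows (the semaphore handling is elided, and `limiter`/`fetch` are stand-ins for the real types, not Mononoke's actual API):

```python
def get_blob(key, cache, limiter, fetch):
    """Sketch: check cache, wait for the rate limit, re-check cache,
    and only consume the rate-limit ticket on a genuine blobstore query.
    Assumes the semaphore is held on entry."""
    if key in cache:
        limiter.give_back_ticket()  # early hit: no blobstore query needed
        return cache[key]
    limiter.wait_turn()             # rate limited: wait for our turn
    if key in cache:                # late hit: filled while we waited
        limiter.give_back_ticket()
        return cache[key]
    value = fetch(key)              # miss: the ticket is consumed here
    cache[key] = value
    return value
```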
Reviewed By: farnz
Differential Revision: D22374606
fbshipit-source-id: c3a48805d3cdfed2a885bec8c47c173ee7ebfe2d
Summary:
Sometimes we take a token, then realize we don't want it. In this case, giving it back is convenient.
This diff adds that!
Reviewed By: farnz
Differential Revision: D22374607
fbshipit-source-id: ccf47e6c75c37d154704645c9e826f514d6f49f6
Summary:
This is a mirror image of a diff, which made backsyncer use `LiveCommitSyncConfig`: we want to use configerator-based live configs, when we run in the continuous tailing mode.
As no-op iteration time used to be 10s and that's a bit wasteful for tests, this diff changes it to be configurable.
Finally, because of instantiating various additional `CommitSyncerArgs` structs, this diff globs out some of the `using repo` logs (which aren't very useful as test signals anyway, IMO).
Reviewed By: StanislavGlebik
Differential Revision: D22209205
fbshipit-source-id: fa46802418a431781593c41ee36f468dee9eefba
Summary: Tidy up some comments in this file.
Reviewed By: ikostia
Differential Revision: D22376165
fbshipit-source-id: ce4760776048aa8e72809b4f828d0ea426fcf878
Summary: This diff actually starts to use the option.
Reviewed By: krallin
Differential Revision: D22373943
fbshipit-source-id: fe23da9c3daa1f9f91a5ee5e368b33e0091aa9c1
Summary:
Previously, if a blame request was rejected (e.g. because a file was too
large), we returned BlameError::Error.
This doesn't look correct, because there's BlameError::Rejected. This diff
makes the fetch_blame function return BlameError::Rejected instead.
Reviewed By: aslpavel
Differential Revision: D22373948
fbshipit-source-id: 4859809dc315b8fd66f94016c6bd5156cffd7cc2
Summary:
In the next diffs we'll need to read override_blame_filesize_limit from the
derived data config, and this config is stored in BlobRepo.
This diff makes a small refactoring to pass BlobRepo to fetch_full_file_content.
Reviewed By: krallin
Differential Revision: D22373946
fbshipit-source-id: b209abce82c0279d41173b5b25f6761659a92f3d
Summary: This will make adding the blame file size limit override in the next diffs easier.
Reviewed By: krallin
Differential Revision: D22373945
fbshipit-source-id: 4857e43c5d80596340878753ea90bf31d7bb3367
Summary:
We're always yielding zero or one child during traversal, so bounded traversal
is unnecessary here.
Differential Revision: D22242148
fbshipit-source-id: b4c8a1279ef7bd15e9d0b3b2063683f45e30a97a
Summary:
Let's use the new option in the CLI. Unfortunately we can't easily accept
commit ids in named params, so it has to be a positional one.
Differential Revision: D22234412
fbshipit-source-id: a9c27422fa65ae1c42cb1c243c7694507a957437
Summary:
If anything were to go wrong, we'd be happy to know which puts we ignored. So,
let's log them.
Reviewed By: farnz
Differential Revision: D22356714
fbshipit-source-id: 5687bf0fc426421c5f28b99a9004d87c97106695
Summary:
Eventually, I plan to make this the default, but for now I'd like to make it
something we can choose to turn on or off as a cmd argument (so we can start
with the experimental tier and Fastreplay).
Note that this mixes volatile vs. non-volatile pools when accessing the pools
for cacheblob. In practice, those pools are actually volatile, it's just that
things don't break if you access them as non-volatile.
Reviewed By: farnz
Differential Revision: D22356537
fbshipit-source-id: 53071b6b21ca5727d422e10f685061c709114ae7
Summary:
I canaried this on Fastreplay, but unfortunately that showed that sometimes we
just deadlock, or get so slow we might as well be deadlocked (and it happens
pretty quickly, after ~20 minutes). I tried spawning all the `get()` futures,
and that fixes the problem (but it makes gettreepack noticeably slower), so
that suggests something somewhere is creating futures, polling them a little
bit, then never driving them to completion.
For better or worse, I'd experienced the exact same problem with the
ContextConcurrencyBlobstore (my initial attempt at QOS, which also used a
semaphore), so I was kinda expecting this to happen.
In a sense, this is nice because we've suspected there were things like that in
the codebase for a while (e.g. with the occasional SQL timeout we see where it
looks like MySQL responds fast but we don't actually poll it until past the
timeout), and it gives us a somewhat convenient repro.
In another sense, it's annoying because it blocks this work :)
So, to work around the problem, for now, let's spawn futures to force the work
to complete when a semaphore is held. I originally had an unconditional spawn
here, but that is too expensive for the cache-hit code path and slows things
down (by about ~2x).
However, having it only if we'll query the blobstore isn't as expensive,
and that seems to be fine (in fact, it is a ~20% p99 perf improvement,
though the exact number depends on the number of shards we use for this, which I've had to tweak a bit).
https://pxl.cl/1c18H
I did find what I think is one potential instance of this problem in
`bounded_traversal_stream`, which is that we never try to poll `scheduled` to
completion. Instead, we just poll for the next ready future in our
FuturesUnordered, and if that turns out to be synchronous work then we'll just
re-enqueue more stuff (and sort of starve async work in this FuturesUnordered).
I tried updating bounded traversal to try a fairer implementation (which polls
everything), but that wasn't sufficient to make the problem go away, so I think
this is something we have to just accept for now (note that this actually has
some interesting perf impact in isolation: it's a free ~20% perf improvement on
p95+: https://pxl.cl/1c192).
See 976b6b92293a0912147c09aa222b2957873ef0df if you're curious.
Reviewed By: farnz
Differential Revision: D22332478
fbshipit-source-id: 885b84cda1abc15c51fbc5dd34473e49338e13f4
Summary: Like it says in the title. Those are useful!
Reviewed By: farnz
Differential Revision: D22332479
fbshipit-source-id: f9bddad75fcbed2593c675f9ba45965bd87f1575
Summary:
The goal of this blobstore is to dedupe reads by waiting for them to finish and
hit cache instead (and also to dedupe writes, but that's not relevant here).
However, this is not a desirable feature if a blob cannot be stored in cache,
because then we're serializing accesses for no good reason. So, when that
happens, we store "this cannot be stored in cache", and we release reads
immediately.
Reviewed By: farnz
Differential Revision: D22285269
fbshipit-source-id: be7f1c73dc36b6d58c5075172e5e3c5764eed894
Summary:
I'm going to store things that aren't quite the exact blobs in here, so on the
off chance that we somehow have two caching blobstores (the old one and this
one) that use the same pools, we should avoid collisions by using a prefix.
And, since I'm going to use a prefix, I'm adding a newtype wrapper to not use
the prefixed key as the blobstore key by accident.
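The newtype idea, sketched in Python for illustration (the `vsb.` prefix is made up here, not the real one):

```python
class CachelibKey:
    """A prefixed cache key with its own type, so it cannot be passed by
    accident where a plain blobstore key is expected."""

    __slots__ = ("value",)
    PREFIX = "vsb."  # hypothetical prefix, purely illustrative

    def __init__(self, blobstore_key):
        self.value = self.PREFIX + blobstore_key
```

With distinct types, a function that takes a blobstore key string will not silently accept a `CachelibKey`, which is the collision-avoidance point of the wrapper.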
Differential Revision: D22285271
fbshipit-source-id: e352ba107f205958fa33af829c8a46896c24027e
Summary:
This introduces a caching blobstore that deduplicates reads and writes. The
underlying motivation is to improve performance for processes that might find
themselves inadvertently reading the same data concurrently from a bunch of
independent callsites (most of Mononoke), or writing the same bit of data over
and over again.
The latter is particularly useful for things like commit cloud backfilling in
WWW, where some logger commits include the same blob being written hundreds or
thousands of times, and cause us to overload the underlying Zippy shard in
Manifold. This is however a problem we've also encountered in the past in e.g.
the deleted files manifest and had to solve there. This blobstore is a little
different in the sense that it solves that problem for all writers.
This comes at the cost of writes being dropped if they're known to be
redundant, which prevents updates through this blobstore. This is desirable for
most of Mononoke, but not all (notably, for skiplist updates it's not great).
For now, I'm going to add this behind an opt-in flag, and later on I'm planning
to make it opt-out and turn it off there (I'm thinking to use the CoreContext
for this).
Reviewed By: farnz
Differential Revision: D22285270
fbshipit-source-id: 4e3502ab2da52a3a0e0e471cd9bc4c10b84a3cc5
Summary: This allowed me to compare two alternative approaches to queue draining, and generally seems like a useful thing to do.
Reviewed By: krallin
Differential Revision: D22364733
fbshipit-source-id: b6c76295c85b4dec6f0bfd7107c30bb4e4a28942
Summary: It's useful to derive all enabled derived data at once
Reviewed By: krallin
Differential Revision: D22336338
fbshipit-source-id: 54bc27ab2c23c175913fc02e6bf05d18a54c249c
Summary:
We've recently added an option to perform a stack move in megarepolib. A
"stack move" is a stack of commits that move files according to a mover. Now
let's expose it in the megarepotool.
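A rough sketch of what a "stack move" computes (assuming `mover` maps an old path to a new path or `None`, and that moves are chunked into one commit's worth each; the names and chunking are illustrative, not megarepolib's API):

```python
def stack_move_chunks(files, mover, chunk_size):
    """Apply the mover to every file and split the resulting
    (old_path, new_path) moves into chunks, one chunk per commit."""
    moves = [(f, mover(f)) for f in files if mover(f) is not None]
    return [moves[i:i + chunk_size] for i in range(0, len(moves), chunk_size)]
```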
Reviewed By: ikostia
Differential Revision: D22312486
fbshipit-source-id: 878d4b2575ed2930bbbf0b9b35e51bb41393e622
Summary:
Printing it can take too much time.
Use a larger threshold for `dict` than for `list` or `set`, since `hg` commonly
uses a dict for command-line options, and that can exceed 8 items.
As we're here, also fix the traceback order so it's "most recent call last".
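A sketch of the thresholding (the function name and the exact thresholds here are illustrative, not the ones in the diff):

```python
def safe_repr(obj, threshold=8, dict_threshold=64):
    """Repr a value, but summarize large containers instead of printing
    them in full; dicts get a larger threshold than lists/sets."""
    limit = dict_threshold if isinstance(obj, dict) else threshold
    if isinstance(obj, (list, set, dict)) and len(obj) > limit:
        return "<%s of length %d>" % (type(obj).__name__, len(obj))
    return repr(obj)
```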
Reviewed By: kulshrax
Differential Revision: D22362003
fbshipit-source-id: 3d2f4664bec6b4cfaf42b8e5d2fc47b0f3d96411
Summary:
In order to do what the title says, this diff does:
1. Add the `eden/oss/.../third-party/rust/.../Cargo.toml` files. As mentioned in the previous diff, those are required by GitHub so that the third party dependencies that are local in fbsource are properly defined with a "git" dependency in order for Cargo to "link" crates properly.
2. Changes to `eden/scm/Makefile` to add build/install commands for getdeps to invoke. Those commands, knowing that they are called from within a getdeps context, link the dependencies brought by getdeps into their proper places that match their folder layout in fbsource. Those Makefile commands also pass a GETDEPS_BUILD env to the setup.py invocations so that it knows it is being called within a getdeps build.
3. Changes to `eden/scm/setup.py` that add "thriftasset" that makes use of the getdeps.py provided "thrift" binary to build .py files out of thrift files.
4. Changes to `distutils_rust` to use the vendored crates dir provided by getdeps.
5. Changes to `getdeps/builder.py` and `getdeps/manifest.py` that enable more fine-grained configuration of how Makefile builds are invoked.
6. Changes to `getdeps/buildopts.py` and `getdeps/manifest.py` to disable overriding PATH and pkgconfig env, so that "eden/scm" builds in getdeps using system libraries rather than getdeps-provided ones (NOTE: I've tried to use getdeps-provided libraries, but the trickiest bit was that Rust links with Python, which is currently not providable by getdeps, so if you try to build everything, the system-provided Python libraries will collide with the getdeps-provided ones).
7. Added `opensource/fbcode_builder/manifests/eden_scm` for the getdeps build.
Reviewed By: quark-zju
Differential Revision: D22336485
fbshipit-source-id: 244d10c9e06ee83de61e97e62a1f2a2184d2312f
Summary:
The test case assumed that clone would return data in order from the server.
That is not a valid assumption and Mononoke doesn't return data in order.
Reviewed By: xavierd
Differential Revision: D22364636
fbshipit-source-id: abfcbe0074a08c9a76c42d351ce5c792eb65e24f