sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-11 01:07:15 +03:00

Author	SHA1	Message	Date
Xavier Deguillard	e7b5eb81d9	remotefilelog: add lfs fetching retry logic Summary: The LFS server might be temporarily having issues, let's retry a bit before giving up. Reviewed By: DurhamG Differential Revision: D20686659 fbshipit-source-id: 90dabd19e45a681d6eae5cd50c72b635d44c0517	2020-03-30 14:45:48 -07:00
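The retry idea in the commit above can be sketched as a bounded retry loop. This is only an illustration under assumptions: `retry` and `max_attempts` are hypothetical names, and the actual change may also wait between attempts or only retry certain errors.

```rust
/// Hypothetical helper: runs `op` until it succeeds or `max_attempts`
/// attempts have been made, then returns the last error.
/// `op` is always called at least once, even if `max_attempts` is 0.
fn retry<T, E>(max_attempts: u32, mut op: impl FnMut() -> Result<T, E>) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: give up and surface the error.
            Err(err) if attempt >= max_attempts => return Err(err),
            // Otherwise try again.
            Err(_) => attempt += 1,
        }
    }
}
```

A caller would wrap the fetch in a closure, for example `retry(3, || fetch_blob(&hash))`, where `fetch_blob` stands in for whatever performs the request.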
Xavier Deguillard	f06132def2	configparser: add FromConfigValue for floats Summary: Since we have all the integer types, let's also allow float types in the config. Reviewed By: kulshrax Differential Revision: D20697007 fbshipit-source-id: 21fa264d24c0f63c233f47c3bcfb2448b4c05c70	2020-03-30 14:45:48 -07:00
Xavier Deguillard	15c81cd54f	revisionstore: repack even when having one packfile Summary: When repacking for the purpose of file format changes, a single packfile may contain data that needs to be moved out of it, and thus, we need to do a repack then. Reviewed By: DurhamG Differential Revision: D20677442 fbshipit-source-id: c621dd2e657f5a4565b37d4b029731415b899117	2020-03-30 14:45:47 -07:00
Xavier Deguillard	e0cfd08e7f	revisionstore: properly implement get_missing for remote stores Summary: Remotestores can implement get_missing properly by simply querying the underlying store that they will be writing to. This may prevent double fetching some blobs in `hg prefetch` that we already have. Reviewed By: DurhamG Differential Revision: D20662668 fbshipit-source-id: 22140b5b7200c687e0ec723dd8879dc8fbea6fb9	2020-03-30 14:45:47 -07:00
Xavier Deguillard	e1ccc29eec	revisionstore: add an is_local method to the IndexedLog abstraction Summary: There are cases where the user of the abstraction needs to know if this is a local store, this will simplify the caller code. Reviewed By: DurhamG Differential Revision: D20662666 fbshipit-source-id: e0bde7eb0dc3484979732a7c4cdf888fedc70e13	2020-03-30 14:45:46 -07:00
Xavier Deguillard	8ed454845a	revisionstore: auto-sync the LFS blobs store Summary: By regularly flushing the blob store, we avoid keeping too many LFS blobs in memory, which could cause OOM issues. The default size is chosen to be 1GB, but is configurable for more control. Reviewed By: DurhamG Differential Revision: D20646213 fbshipit-source-id: 12c06fd0212ef3974bea10c82026b6e74fb5bf21	2020-03-30 14:45:46 -07:00
Xavier Deguillard	5ea2e581c6	revisionstore: store LFS blobs in an IndexedLog Summary: In the legacy lfs extension, LFS blobs were stored as loosefiles on disk, and as we saw with loosefiles for remotefilelog, they can incur a significant overhead to maintain. Due to LFS blobs being large by definition, the number of loose LFS blobs should be reasonable for repack to walk over all of them to choose which ones to throw away. A different approach would be to simply store the blobs in an on-disk format that allows automatic size management, and simple indexing. That format is an IndexedLog. This of course doesn't come without drawbacks, the main one being that the IndexedLog API mandates that the full blob is present on insertion, preventing streaming writes to it. The solution is to simply chunk the blobs before writing them to it. While proper streaming is not done just yet, the storage format no longer prevents it from being implemented. Reviewed By: DurhamG Differential Revision: D20633783 fbshipit-source-id: 37a88331e747cf22511aa348da2d30edfa481a60	2020-03-30 14:45:46 -07:00
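The chunking mentioned in the commit above could look roughly like the following. This is a minimal sketch, not the store's real layout: `chunk_blob` and the `(index, bytes)` pairing are assumptions made for illustration.

```rust
/// Hypothetical helper: splits a blob into pieces of at most `chunk_size`
/// bytes, tagging each with its position so the blob can be reassembled
/// in order. Panics if `chunk_size` is 0, like `slice::chunks`.
fn chunk_blob(blob: &[u8], chunk_size: usize) -> Vec<(usize, &[u8])> {
    blob.chunks(chunk_size).enumerate().collect()
}

/// Reassembles a blob from chunks that are already in index order.
fn join_chunks(chunks: &[(usize, &[u8])]) -> Vec<u8> {
    chunks.iter().flat_map(|(_, bytes)| bytes.iter().copied()).collect()
}
```

Each chunk is small enough to be inserted whole, which is the constraint the commit message describes.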
Arun Kulshreshtha	91e76774fe	edenapi: fix typo in logging Reviewed By: sfilipco Differential Revision: D20744831 fbshipit-source-id: 194e550d88c33ed9601d2f24fea281996696087b	2020-03-30 14:39:42 -07:00
Jun Wu	6bddc1df08	rotatelog: avoid loading broken Logs multiple times Summary: RotateLog loads older logs lazily. If an older log is broken, remember that and avoid loading the broken log again. Reviewed By: DurhamG Differential Revision: D20663899 fbshipit-source-id: 7a4b5279cc6387c19329a51048bfe1be2e0bc1f8	2020-03-30 11:34:49 -07:00
Xavier Deguillard	fd72344578	revisionstore: feature gate the Mononoke LFS tests Summary: Due to the Mononoke LFS server only being available on FB's network, the tests using them cannot run outside of FB, including in the github workflows. Reviewed By: quark-zju Differential Revision: D20698062 fbshipit-source-id: f780c35665cf8dc314d1f20a637ed615705fd6cf	2020-03-30 08:40:43 -07:00
Stefan Filip	ea89b541e1	segmented_changelog: add Dag struct and location_to_name functionality Summary: The IdDag provides graph algorithms using Segments. The IdMap allows converting from the SegmentedChangelogId domain to the ChangesetId domain. The Dag struct wraps IdDag and IdMap in order to provide graph algorithms using the common application level identifiers for commits (ChangesetId). The construction of the Dag is currently mocked with something that can only be used in a test environment (unit tests but also integration tests). This diff also implements a location_to_name function. This is the most important new functionality that segmented changelog clients require. It recovers the hash of a commit for which the client only has a segmented changelog Id. The current assumption is that clients have identifiers for all merge commit parents so the path to a known commit always follows a set of first parents. The IdMap queries will have to be changed to async in the future, but we expect IdDag queries to stay sync. Reviewed By: quark-zju Differential Revision: D20635577 fbshipit-source-id: 4f9bd8dd4a5bd9b0de55f51086f3434ff507963c	2020-03-27 13:48:52 -07:00
Stefan Filip	7502ce31ca	dag: add in process stored IdMap constructor Summary: The interesting observation is that InProcessStore is not public. Reviewed By: quark-zju Differential Revision: D20635578 fbshipit-source-id: a0149929c8059ff77f047fd385bf3b26dc738dfd	2020-03-27 13:48:51 -07:00
Lukas Piatkowski	bace9bc0ae	mononoke: make filenode and mercurial_types OSS buildable Reviewed By: farnz Differential Revision: D20281345 fbshipit-source-id: bc4910d3040d74c7ceb4cb825bea6960952f6310	2020-03-27 11:40:13 -07:00
Xavier Deguillard	179caf60dd	revisionstore: pass the config when building the LFS stores Summary: Instead of hardcoding some values, use configured ones. Reviewed By: DurhamG Differential Revision: D20633784 fbshipit-source-id: d17657b1fca29280f2d07e8cf824e553ad3704f7	2020-03-26 19:02:49 -07:00
Xavier Deguillard	d404b0a228	revisionstore: revamp repack to ease file format migration Summary: One of the main drawbacks of the current version of repack is that it writes back the data to a packfile, making it hard to change file format. Currently, 2 file format changes are ongoing: moving away from packfiles entirely, and moving from having LFS pointers stored in the packfiles, to a separate storage. While an ad-hoc solution could be designed for this purpose, repack can fulfill this goal easily by simply writing to the ContentStore; the configuration of the ContentStore will then decide where this data will be written. The main drawback of this code is the unfortunate added duplication of code. I'm sure there is a way to avoid it by having new traits, but I decided against it for now from a code readability point of view. Reviewed By: DurhamG Differential Revision: D20567118 fbshipit-source-id: d67282dae31db93739e50f8cc64f9ecce92d2d30	2020-03-26 19:02:48 -07:00
Xavier Deguillard	d4780ee322	revisionstore: store a HashMap of ContentHash in LFS pointer Summary: While the primary (for now) way of addressing an LFS blob is via its sha256, being able to address them via different hash schemes (sha1 for Eden/Buck, blake2, etc) will be helpful down the line. Thus, let's store a HashMap of ContentHash in the pointer store. Reviewed By: DurhamG Differential Revision: D20560197 fbshipit-source-id: 8bdc4fc4cd7fc19c7eed6a27d11953c4eedf9195	2020-03-26 19:02:47 -07:00
Xavier Deguillard	2f314fc43c	revisionstore: remove unnecessary lock around the LfsBlobsStore Summary: No locking is required for this one due to being loose files on disk. Reviewed By: DurhamG Differential Revision: D20522890 fbshipit-source-id: 72b7ebc063060a89f54976a1128977a3b7501053	2020-03-26 19:02:47 -07:00
Xavier Deguillard	6372a4a4fc	revisionstore: add an is_lfs method to Metadata Summary: Instead of having the magic number 0x2000 all over the place, let's move the logic to this method. Reviewed By: DurhamG Differential Revision: D20637749 fbshipit-source-id: bf666f8787e37e6d6c58ad8982a5679b7e3e717b	2020-03-25 12:29:25 -07:00
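Moving the magic number behind a method, as the commit above describes, might be sketched like this. Only the `0x2000` value and the `is_lfs` name come from the commit message; the struct layout and the `LFS_FLAG` constant are assumptions.

```rust
/// Hypothetical metadata record; the real struct may carry other fields.
#[derive(Default)]
struct Metadata {
    flags: Option<u64>,
}

impl Metadata {
    /// Assumed name for the flag bit the commit message calls 0x2000.
    const LFS_FLAG: u64 = 0x2000;

    /// True when the flags are present and the LFS bit is set.
    fn is_lfs(&self) -> bool {
        self.flags.map_or(false, |flags| flags & Self::LFS_FLAG != 0)
    }
}
```

Callers then write `meta.is_lfs()` instead of repeating the bit test at every call site.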
Stefan Filip	c400809eba	dag: rename child index iteration to iter_master_flat_segments_with_parent Summary: `iter_segments_with_parent` has a few more conditions attached to it than the name would imply. We are renaming it to give a better sense of its true behavior. Reviewed By: quark-zju Differential Revision: D20547631 fbshipit-source-id: 406f46b9de5efc9e8e6a8c4bc22ab18fa5bc54bb	2020-03-24 13:58:07 -07:00
Stefan Filip	59ff2a8571	dag: remove_non_master implementation for Summary: Also adding better tests for non master entries. Reviewed By: quark-zju Differential Revision: D20504483 fbshipit-source-id: 60d4a20aecb00f7750db2fff5d3832aac99d00e2	2020-03-24 13:58:06 -07:00
Stefan Filip	03c1e1cac5	dag: iterator implementations for InProcessStore Summary: The main question I had while writing the tests was whether we expect a specific order for Segments for `iter_segments_with_parent`. `InProcessStore` will return the segments in the order that they were inserted. Reviewed By: quark-zju Differential Revision: D20501401 fbshipit-source-id: 48ceb78f3191c7425c1488a3392cf3167f7e7268	2020-03-24 13:58:06 -07:00
Stefan Filip	5f4e706f81	dag: Add InProcessStore as iddagstore Summary: First 6 methods implemented from the IdDagStore trait for the InProcessStore. Any suggestions welcome. Reviewed By: quark-zju Differential Revision: D20499228 fbshipit-source-id: cb536a3a0136077ada78934d82a25d079a5bc809	2020-03-24 13:58:06 -07:00
Stefan Filip	3dcb56535e	dag: add descriptions to IdDagStore methods Summary: Documentation. Reviewed By: quark-zju Differential Revision: D20499926 fbshipit-source-id: ebbb7a1249109bd56ff459a659e0c628c2974179	2020-03-24 13:58:05 -07:00
Steven Troxler	d90435ced9	Deprecate `rust-crypto` in `eden/scm/lib/revisionstore` Summary: Replace `rust-crypto` with `hex`, `sha-1`, `sha2`. - `crypto::sha1::Sha1` with `sha1::Sha1` - `crypto::sha2::Sha2` with `sha2::Sha2` - `crypto::digest::Digest` with `sha1::Digest` and `sha2::Digest` - `.result_str()` with `hex::encode` and `.result()` Reviewed By: jsgf Differential Revision: D20588313 fbshipit-source-id: 75c4342e8b6285f0f960f864c21457a1a0808f64	2020-03-23 16:38:07 -07:00
Xavier Deguillard	674e9c7900	fsinfo: return an enum instead of a String Summary: In a strongly typed language, using strings should be avoided whenever possible as they do not provide the safety guarantees that types provide. I took the liberty of removing all the filesystems that are not relevant for Mercurial for simplification reasons. If needs arise, we can always add a new FsType to the enum. Reviewed By: DurhamG Differential Revision: D20517138 fbshipit-source-id: 0a38b53c6a87f05f4b2d664038e10c4293de96ae	2020-03-23 14:29:10 -07:00
Steven Troxler	b6de099da8	Deprecate `rust-crypto` in `zstore` Summary: Replace `rust-crypto` with `sha-1`: - `crypto::digest::Digest` with `sha1::Digest` - `crypto::Sha1` with `sha1::Sha1` The interface changes slightly - no need to pass a mutable byte array when getting the result. Reviewed By: jsgf Differential Revision: D20587638 fbshipit-source-id: c6c737f3f8eba94b98c728e198eb4fac12c5c80b	2020-03-23 11:29:09 -07:00
Steven Troxler	71279b3916	Deprecate rust-crypto in `manifest-tree` Summary: Replace `rust-crypto` with `sha-1`: - `crypto::digest::Digest` with `sha1::Digest` - `crypto::sha1::Sha1` with `sha1::Sha1` Reviewed By: jsgf Differential Revision: D20587716 fbshipit-source-id: de801c20bffd356eb5b2205a63ec0218b3aca6c0	2020-03-23 11:15:58 -07:00
Steven Troxler	154c458285	Deprecate `rust-crypto` in `eden/scm/lib/types` Summary: Swap out `rust-crypto` for `sha-1` - `crypto::sha1::Sha1` is replaced by `sha1::Sha1` - `crypto::digest::Digest` is replaced by `digest::Digest` Reviewed By: jsgf Differential Revision: D20587685 fbshipit-source-id: 971fdaa8ce5b3e9e60db219131f6c36dcbc213d9	2020-03-23 11:15:57 -07:00
Steven Troxler	3534ca3bb8	Deprecate rust-crypto in edenfs-client Summary: Switched out the `rust-crypto` package for the `sha` package. The APIs aren't an exact match, so I had to insert a clone in place of a modification to a mutable reference. Reviewed By: jsgf Differential Revision: D20585336 fbshipit-source-id: 22245157aea1115ae6f225b17b0346f0696653f7	2020-03-23 11:04:36 -07:00
Xavier Deguillard	767134797c	pyerror: stringify Rust errors with "{:?}" Summary: According to the anyhow documentation[0], the behavior of `.to_string()` is to only stringify the top-level errors, hiding all the context of the error. Instead, the debug format allows all the context to be displayed, and, if available the backtrace. This should significantly help debug Rust errors when context is available, which we should strive to have everywhere! [0]: https://docs.rs/anyhow/1.0.27/anyhow/struct.Error.html#display-representations Reviewed By: sfilipco Differential Revision: D20575944 fbshipit-source-id: 2968d7fb755edec7f7e5151138e8049ded181c1b	2020-03-20 20:22:14 -07:00
Lukas Piatkowski	12f639159e	cargo_from_buck: get rid of signatures in generated Cargo.toml files Summary: The signatures were used by the linter to warn if the files require regenerating, since the linter now regenerates the files regardless of the signature it is no longer needed to sign the files. Reviewed By: krallin Differential Revision: D20467745 fbshipit-source-id: aff2643f80939d5693e7a30abf07484c9060796f	2020-03-20 08:56:11 -07:00
Xavier Deguillard	68edce4365	lfs: allow the LFS remote to be a local directory Summary: This is only intended for Mercurial .t tests and not in any production environment. Reviewed By: DurhamG Differential Revision: D20504236 fbshipit-source-id: 618e17631b73afa650875cb7217ba7c55fb9f737	2020-03-19 14:36:19 -07:00
Xavier Deguillard	092cfcec7d	revisionstore: add a ContentDataStore trait Summary: For now, this is only used for LFS, as this is the only store that can correctly answer both. This API will be exposed to Python to be able to have cheap filectx comparison, and other use cases. Reviewed By: DurhamG Differential Revision: D20504234 fbshipit-source-id: 0edb912ce479eb469d679b7df39ba80fceef05f2	2020-03-19 14:36:18 -07:00
Xavier Deguillard	632bd53a02	revisionstore: add a LFS remote store Summary: This enables fetching blobs from the LFS server. For now, this is limited to fetching them, but the protocol specifies ways to also upload. That second part will matter for commit cloud and when pushing code to the server. One caveat to this code is that the LFS server is not mocked in tests, and thus requests are done directly to the server. I chose very small blobs to limit the disruption to the server. By setting a test specific user-agent, we should be able to monitor traffic due to tests and potentially rate limit it. Reviewed By: DurhamG Differential Revision: D20445628 fbshipit-source-id: beb3acb3f69dd27b54f8df7ccb95b04192deca30	2020-03-19 14:36:18 -07:00
Jun Wu	6ffdcebadf	tracing: write some blackbox events as tracing events Summary: This is the start of migrating blackbox events to tracing events. The motivation is to have a single data source for log processing (for simplicity) and the tracing data seems a better fit, since it can represent a tree of spans, instead of just a flat list. Eventually blackbox might be mostly a wrapper for tracing data, with some minimal support for logging some indexed events. Reviewed By: DurhamG Differential Revision: D19797710 fbshipit-source-id: 034f17fb5552242b60e759559a202fd26061f1f1	2020-03-19 10:23:24 -07:00
Jun Wu	8cc30ac302	dag: add Segment::new API Summary: Now Segment has no lifetime we can create it directly and return the ownership. Performance of "building segments" does not seem to change: # before building segments 750.129 ms # after building segments 712.177 ms Reviewed By: sfilipco Differential Revision: D20505200 fbshipit-source-id: 2448814751ad1a754b90267e43262da072bf4a16	2020-03-18 15:05:58 -07:00
Jun Wu	1bd54a5971	dag: drop lifetime on Segment<'a> Summary: This allows structures like BTreeMap to own and store Segment. It was not possible until D19818714, which adds minibytes::Bytes interface for indexedlog. In theory this hurts performance a little bit. But the perf difference does not seem visible by `cargo bench --bench dag_ops`: # before building segments 714.420 ms ancestors 54.045 ms children 490.386 ms common_ancestors (spans) 2.579 s descendants (small subset) 406.374 ms gca_one (2 ids) 161.260 ms gca_one (spans) 2.731 s gca_all (2 ids) 287.857 ms gca_all (spans) 2.799 s heads 234.130 ms heads_ancestors 39.383 ms is_ancestor 113.847 ms parents 251.604 ms parent_ids 11.412 ms range (2 ids) 117.037 ms range (spans) 241.156 ms roots 507.328 ms # after building segments 750.129 ms ancestors 53.341 ms children 515.607 ms common_ancestors (spans) 2.664 s descendants (small subset) 411.556 ms gca_one (2 ids) 164.466 ms gca_one (spans) 2.701 s gca_all (2 ids) 290.516 ms gca_all (spans) 2.801 s heads 240.548 ms heads_ancestors 39.625 ms is_ancestor 115.735 ms parents 239.353 ms parent_ids 11.172 ms range (2 ids) 115.483 ms range (spans) 235.694 ms roots 506.861 ms Reviewed By: sfilipco Differential Revision: D20505201 fbshipit-source-id: c34d48f0216fc5b20a1d348a75ace89ace7c080b	2020-03-18 15:05:57 -07:00
Xavier Deguillard	db310fc87f	revisionstore: replace lazy_init with once_cell Summary: The latter is what is now recommended, and no longer requires a macro to initialize a lazy value, leading to nicer code. Reviewed By: DurhamG Differential Revision: D20491488 fbshipit-source-id: 2e0126c9c61d0885e5deee9dbf112a3cd64376d6	2020-03-18 12:20:12 -07:00
Xavier Deguillard	9c8633bb0a	revisionstore: address clippy warnings Summary: Lots of different warnings on this one. Main ones were: - One bug where .write was used instead of .write_all - Using .next instead of .nth(0) for iterators - Using .cloned() instead of .map(|x| x.clone()) - Using conditions as expressions instead of mut variables - Using .to_vec() on slices instead of .iter().cloned().collect() - Using .is_empty instead of comparing .len() against 0. Reviewed By: DurhamG Differential Revision: D20469894 fbshipit-source-id: 3666a44ad05e0fbfa68d490595703c022073af63	2020-03-18 10:16:39 -07:00
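The idioms listed in the commit above can be illustrated with a few standard-library snippets. The function names here are made up for the example and are not taken from revisionstore.

```rust
use std::io::{self, Write};

/// `write` may write only part of the buffer and return how many bytes it
/// wrote; `write_all` keeps going until every byte is written or an error occurs.
fn store_blob<W: Write>(out: &mut W, blob: &[u8]) -> io::Result<()> {
    out.write_all(blob)
}

/// `.next()` instead of `.nth(0)`, and `.cloned()` instead of
/// `.map(|x| x.clone())`.
fn first_key(keys: &[String]) -> Option<String> {
    keys.iter().next().cloned()
}

/// A condition used as an expression instead of assigning to a `mut`
/// variable, and `.is_empty()` instead of comparing `.len()` against 0.
fn describe(blob: &[u8]) -> &'static str {
    if blob.is_empty() { "empty" } else { "non-empty" }
}

/// `.to_vec()` on a slice instead of `.iter().cloned().collect()`.
fn copy_prefix(blob: &[u8], n: usize) -> Vec<u8> {
    blob[..n.min(blob.len())].to_vec()
}
```

The first item is the only behavioral fix in the list: ignoring the count returned by `write` can silently drop the tail of a buffer.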
Xavier Deguillard	a760c0e672	edenapi: address clippy warnings Reviewed By: DurhamG Differential Revision: D20469646 fbshipit-source-id: 222f75196ef140c2e9bdfc0a0500f3fbcffb2309	2020-03-18 10:16:39 -07:00
Xavier Deguillard	121e524df9	blackbox: address clippy warnings Reviewed By: DurhamG Differential Revision: D20469649 fbshipit-source-id: 99b0e68259b5e2ed5b1c969d0a5fa8473e899f17	2020-03-18 10:16:39 -07:00
Xavier Deguillard	aae9075762	lz4-pyframe: address clippy warnings. Reviewed By: DurhamG Differential Revision: D20469648 fbshipit-source-id: 346c8a23ff2b4a895a066843ebe5341103956e76	2020-03-18 10:16:38 -07:00
Xavier Deguillard	8c1f033f50	indexedlog: address clippy warnings Summary: These were from a wide variety of warnings. The only one I haven't addressed is that clippy complains that Pin<Box<Vec<u8>>> can be replaced by Pin<Vec<u8>>. I haven't investigated too much into it, someone more familiar with this code can probably figure out if this is buggy or not :) Reviewed By: DurhamG Differential Revision: D20469647 fbshipit-source-id: d42891d95c1d21b625230234994ab49bbc45b961	2020-03-18 10:16:38 -07:00
Xavier Deguillard	42f1213efa	util: address clippy warnings Summary: The lifetime is unnecessary. Reviewed By: DurhamG Differential Revision: D20452750 fbshipit-source-id: 184f5e109a0ff59931bdddaf611a7581d2255e78	2020-03-18 09:35:36 -07:00
Jun Wu	e48079180f	indexedlog: fix a typo in benchmarks Summary: This belongs to D20149376. However buck test does not include benchmarks so it was not noticed. Reviewed By: DurhamG Differential Revision: D20505097 fbshipit-source-id: 24daeb17b68808f8e69e18452ab2cf26c7aa10a7	2020-03-18 09:30:31 -07:00
Mark Thomas	5666399fcf	mutationstore: switch mutation entry timestamp from f64 to i64 Summary: The mutation store stores entries with a floating-point timestamp. This pattern was copied from obsmarkers. However, Mercurial uses integer timestamps in the commit metadata (the parser supports floats for historical reasons, but only stores integer timestamps). Mononoke also uses integer timestamps in its `DateTime` type. To keep things simple, switch to using integer timestamps for mutation entries. Existing entries with floating point timestamps are truncated. Add a new entry format version that encodes the timestamp as an integer. For now, continue to generate the old version so that old clients can read entries created by new clients. Reviewed By: quark-zju Differential Revision: D20444366 fbshipit-source-id: 4d6d9851aacb314abea19b87c9d0130c47fdf512	2020-03-17 04:18:44 -07:00
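The truncation of legacy floating-point timestamps described above can be shown with a plain cast. This is a sketch under the assumption that truncation means dropping the fractional part; `truncate_timestamp` is a hypothetical name.

```rust
/// Hypothetical conversion from a legacy floating-point timestamp (in
/// seconds) to an integer one. `as i64` truncates toward zero, so the
/// fractional part is discarded rather than rounded.
fn truncate_timestamp(legacy: f64) -> i64 {
    legacy as i64
}
```

Note that truncation toward zero differs from flooring for negative values: `-1.5` becomes `-1`, not `-2`.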
Mark Thomas	ac80212e8f	mutationstore: remove mutation entry origins Summary: Tracking the origin of mutation entries did not prove useful, and just creates unnecessary overhead. Remove the tracking and repurpose the field as a version field. Reviewed By: quark-zju Differential Revision: D20444365 fbshipit-source-id: 65ff11ee8cfe77d5e67a83d03a510541d58ef69b	2020-03-17 04:18:44 -07:00
Xavier Deguillard	deffd9a477	minibytes: address clippy warnings Summary: Using ptr.add is shorter and preferred to ptr.offset. Reviewed By: quark-zju Differential Revision: D20452752 fbshipit-source-id: 1dc2fdbc392267d2d690673c10dcc161ecd00dfa	2020-03-16 14:58:22 -07:00
Xavier Deguillard	67c8cf22a3	hgtime: address clippy warnings Summary: These warnings are fairly trivial, as it recommends using single quote (char) for single characters search instead of a double quote (str). Reviewed By: quark-zju Differential Revision: D20452408 fbshipit-source-id: b2951e133e57633a8e766536e22969fa9ac0ecee	2020-03-16 14:58:22 -07:00
Xavier Deguillard	bb30c40375	types: address clippy warnings Summary: Clippy had 3 sources of warnings in this crate: - from_str method not in impl FromStr. We still have 2 of them in path.rs, but this is documented as not supported by the FromStr trait due to returning a reference. Maybe we can find a different name? - Use of mem::transmute while casts are sufficient. I find the casts to be ugly, but they are simply safer as the compiler can do some type checking on them. - Unnecessary lifetime parameters Reviewed By: quark-zju Differential Revision: D20452257 fbshipit-source-id: 94abd8d8cd76ff7af5e0bbfc97c1e106cdd142b0	2020-03-16 14:58:21 -07:00
Xavier Deguillard	82d3c7f544	configparser: address clippy warnings Summary: Clippy complains about 3 things: - Using raw pointers in a public function that is not declared as unsafe. This happens for C exported ones, this feels like a warning, so I haven't changed it. - Using .map(...).unwrap_or(<default value constructed>). The recommendation is to use .unwrap_or_default(). - Single match instead of if let, the latter makes code much shorter. Reviewed By: quark-zju Differential Revision: D20452751 fbshipit-source-id: 8eeff7581c119c651ca41d8117f1f70f15774833	2020-03-16 14:53:45 -07:00
Stefan Filip	1fb5acf242	dag: use IdDagStore in IdDag with type parameter Summary: Make IdDag storage generic by depending on IdDagStore. Reviewed By: quark-zju Differential Revision: D20471712 fbshipit-source-id: 3a2668f301758a3c880db35c9f0db6887ef1dd38	2020-03-16 14:41:41 -07:00
Stefan Filip	236292c0fd	dag: add the GetLock trait Summary: Used to generalize `get_lock` functionality. Reviewed By: quark-zju Differential Revision: D20471710 fbshipit-source-id: e44d5b22ecacdb653170ef83914354f521f82dfc	2020-03-16 14:41:40 -07:00
Stefan Filip	66436b4a3c	dag: add the IdDagStore trait Summary: Abstract the storage functionality required by IdDag. Reviewed By: quark-zju Differential Revision: D20449122 fbshipit-source-id: fc3c7d7b88d74f7a93670d310be2e680f35e8ce7	2020-03-16 14:41:40 -07:00
Stefan Filip	1239628ef8	dag: move IdDag storage details to the iddagstore module Summary: Right now the module has one implementation IndexedLogStore. The name could be more specific in the context of the crate. The goal will be to add a trait for storage requirements of IdDag and make IndexedLogStorage one implementation of that trait. Reviewed By: quark-zju Differential Revision: D20446042 fbshipit-source-id: 7576e1cc4ad757c1a2c00322936cc884838ff710	2020-03-16 14:41:40 -07:00
Jun Wu	1f64b4ec50	nameset: fix LazySet iteration Summary: The `next` method forgot to increase the iteration index, causing infinite iteration. Reviewed By: ikostia Differential Revision: D20473206 fbshipit-source-id: 82a95de1b1c12ac4e9e4d328a0adba7145d7b24c	2020-03-16 13:00:35 -07:00
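The class of bug fixed in the commit above (an iterator whose `next` never advances) can be reproduced with a small stand-in. `Iter` below is hypothetical and much simpler than the real LazySet iterator.

```rust
/// Hypothetical iterator over already-computed items, tracking its
/// position with an explicit index.
struct Iter<'a> {
    items: &'a [u32],
    index: usize,
}

impl<'a> Iterator for Iter<'a> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        let item = self.items.get(self.index).copied();
        if item.is_some() {
            // Without this increment, `next` would return the first item
            // forever and any loop over the iterator would never end.
            self.index += 1;
        }
        item
    }
}
```

Collecting such an iterator into a `Vec` in a test is a cheap way to catch the missing increment, because the collection only terminates when `next` eventually returns `None`.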
Jun Wu	8115053c00	indexedlog: implement xxd-like fmt::Debug for Log Summary: This makes `hg debugindexedlog dump` more useful. Reviewed By: sfilipco Differential Revision: D20448863 fbshipit-source-id: c5cc24449ae00ee329ce02bf0adf947ff57e72ed	2020-03-16 10:21:46 -07:00
Durham Goode	a13fcd4910	workingcopy: support returning directories from the walker Summary: Purge needs to be able to see what directories the walker traversed, so it can delete them if they are empty. Instead of having the walker call match.traversedir (which it seems like a bizarre pattern to use the matcher as a holder for a non-matching related function), let's have the walker return an enum and have an option to return directories. At the python layer we then translate this into match.traversedir calls, but we can clean that up later. Reviewed By: quark-zju Differential Revision: D19543795 fbshipit-source-id: cc51c86c91799d3df2c65d25a7b6cfe810206d0a	2020-03-16 10:15:26 -07:00
Durham Goode	fc7739fa26	workingcopy: rename walker results Summary: In preparation for supporting returning directories from the walker (to support purge), let's rename the result structure to be more generic. Reviewed By: kulshrax Differential Revision: D19543791 fbshipit-source-id: 9b71452c879cf397ae92533a4ef4727140ac7369	2020-03-16 10:15:26 -07:00
Durham Goode	05e09b2b89	workingcopy: report invalid file types from rust walker Summary: The mercurial tests print errors when they encounter 'fifo' files. Let's handle that case. Differential Revision: D19543796 fbshipit-source-id: f87d4b9c3f0ad8b8d8ebe2e6d18e325fc93d0ae9	2020-03-16 10:15:25 -07:00
Xavier Deguillard	fd8d92f1f5	revisionstore: allow indexing LFS pointers via sha256 Summary: While the sha256 of a blob gives access to its content, it doesn't allow accessing its metadata, by adding a sha256 index, we can easily get the metadata of a blob via its content hash. Reviewed By: quark-zju Differential Revision: D20445624 fbshipit-source-id: 42c04bd69d3c7380706c6237c5b4f4061c016cca	2020-03-13 19:03:29 -07:00
Xavier Deguillard	d9cca63444	types: add a into_inner method to Sha256 Reviewed By: quark-zju Differential Revision: D20445623 fbshipit-source-id: d9cba7ddd16a8e89c76cd5e988ab0fb79383d0c2	2020-03-13 19:03:29 -07:00
Xavier Deguillard	60be0ac94d	types: fix typo when displaying Sha256 Reviewed By: quark-zju Differential Revision: D20445622 fbshipit-source-id: dc9a8a165ca55fdece90a5eb3a87cd3c28f444cb	2020-03-13 19:03:29 -07:00
Xavier Deguillard	6ee3a8f42f	revisionstore: add metadata to FakeHgIdRemoteStore Summary: This is necessary to properly test LFS stores. Reviewed By: quark-zju Differential Revision: D20445625 fbshipit-source-id: 530ddf87249e8d721957806f2d8edef3262f303c	2020-03-13 19:03:28 -07:00
Xavier Deguillard	5002d01e0a	revisionstore: allow indexedlogutil users to lookup in different indices Summary: The OpenOptions allow for multiple indices to be added, but lookup had no way of querying these multiple indices. Reviewed By: quark-zju Differential Revision: D20445627 fbshipit-source-id: 0cb754ba17b452d892b7bcb56d502d5753ef963a	2020-03-13 19:03:28 -07:00
Xavier Deguillard	01fb3c0a77	revisionstore: add a new StoreKey type Summary: This type can either be a Mercurial type key, or a content hash based key. Both the prefetch and get_missing now can handle these properly. This is essential for stores where data can either be fetched in both ways or when the data is split in 2. For LFS for instance, it is possible to have the LFS pointer (via getpackv2), but not the actual blob. In which case get_missing will simply return the content hash version of the StoreKey, to signify what it actually has missing. Reviewed By: quark-zju Differential Revision: D20445631 fbshipit-source-id: 06282f70214966cc96e805e9891f220b438c91a7	2020-03-13 19:03:28 -07:00
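The two-flavored key and the `get_missing` behavior described above might be modeled as follows. This is a rough sketch under assumptions: the variant names, the `String` stand-ins for the key and hash types, and the `LfsStore` layout are all hypothetical.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical key type: either a Mercurial-style key or a content hash.
#[derive(Debug, Clone, PartialEq)]
enum StoreKey {
    HgId(String),
    Content(String),
}

/// Hypothetical LFS-like store where pointers and blobs live separately.
struct LfsStore {
    pointers: HashMap<String, String>, // Mercurial key -> content hash
    blobs: HashSet<String>,            // content hashes of blobs we hold
}

impl LfsStore {
    /// Returns what is still missing for each requested key. When the
    /// pointer is present but the blob is not, the content-hash form of
    /// the key is returned to signal exactly what has to be fetched.
    fn get_missing(&self, keys: &[StoreKey]) -> Vec<StoreKey> {
        keys.iter()
            .filter_map(|key| match key {
                StoreKey::HgId(id) => match self.pointers.get(id) {
                    None => Some(key.clone()),
                    Some(hash) if self.blobs.contains(hash) => None,
                    Some(hash) => Some(StoreKey::Content(hash.clone())),
                },
                StoreKey::Content(hash) if self.blobs.contains(hash) => None,
                StoreKey::Content(_) => Some(key.clone()),
            })
            .collect()
    }
}
```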
Xavier Deguillard	d900874401	revisionstore: rename HistoryStore to HgIdHistoryStore Summary: Similarly to the DataStore trait, this makes it easier to understand that they deal with a Mercurial type Key. Reviewed By: quark-zju Differential Revision: D20445621 fbshipit-source-id: a1143d5f5d6a2c8686d517a6ea3c25b07c0df072	2020-03-13 19:03:27 -07:00
Xavier Deguillard	2e4742cefc	revisionstore: rename DataStore traits to HgIdDataStore Summary: This makes it clear that these traits are dealing with Mercurial Key. Reviewed By: quark-zju Differential Revision: D20445626 fbshipit-source-id: d5acbf442e9407b973e95e40af69b5a61bff0a4d	2020-03-13 19:03:27 -07:00
Jun Wu	cf04fe3e1f	thrift-types: recompile Thrift sources Summary: The thrift compiler and sources are changed. Reviewed By: xavierd Differential Revision: D20445164 fbshipit-source-id: f20f16ae02a922042f366a9a80a3642577f60e57	2020-03-13 14:25:23 -07:00
Jun Wu	7a7f98f1b2	configparser: migrate from Bytes to Text Summary: Since configparser enforces utf-8 config files (because pest wants Rust strings), let's migrate from Bytes to Text to remove extra encoding conversions. Previously this was blocked by the lack of ref-counted text (since the "source" of each config location is the entire config file). Now minibytes provides Text so we can use it. This unfortunately requires dependent code to be updated. The pyconfigparser interface is in theory wrong - it shouldn't return utf-8 bytes but local-encoded bytes. I think it's cleaner to make pyconfigparser unaware of HGENCODING, so I changed pyconfigparser to use unicode, and add compatibility layer in uiconfig.py. This also fixes non-ascii encoding issues on user name (especially on Windows). The hgrc config file should be in utf-8 and the config parser returns explicit unicode types, and Python code round-trip them with local encodings. Reviewed By: markbt Differential Revision: D20432938 fbshipit-source-id: b1359429b8f1c133ab2d6b2deea6048377dfeca1	2020-03-13 10:51:41 -07:00
Jun Wu	715bc5d451	configparser: migrate from bytes to minibytes Summary: This makes it easier to further migrate to `Text` interface. Dependent crate (`auth`) is updated. Reviewed By: markbt Differential Revision: D20432941 fbshipit-source-id: 1dc29d52c9b17ce14676ef0555470c6d36a09c2b	2020-03-13 10:51:41 -07:00
Jun Wu	c4ec99ded4	minibytes: implement Text Summary: Text is a reference-counted shared String. It's similar to Bytes but works for utf-8 strings. The motivation is to replace configparser's use of Bytes to Text. Reviewed By: markbt Differential Revision: D20432940 fbshipit-source-id: ef990255d269e60d433c6520819f60ccdcbe488f	2020-03-13 10:51:41 -07:00
Jun Wu	7895e70dcf	minibytes: make Bytes abstract Summary: This makes it possible to implement "Text". See the next diff. Reviewed By: markbt Differential Revision: D20432943 fbshipit-source-id: 94b3810ab205c260d33f57bd637e4accc3ee871d	2020-03-13 10:51:40 -07:00
Jun Wu	e9b14b3608	minibytes: implement From<&'static {str,[u8]}> Summary: This makes the API easier to use. Practically this makes it easier for configparser to migrate to minibytes. Reviewed By: markbt Differential Revision: D20432942 fbshipit-source-id: ad08eb118d2216054dc24c86b0b129ae82b9d17c	2020-03-13 10:51:40 -07:00
Jun Wu	ad8190713b	cpython-ext: serialize Rust str into Python str type Summary: Previously Rust str was serialized into bytes. To be Python 3 friendly, let's serialize it into `str`. Reviewed By: markbt Differential Revision: D19797706 fbshipit-source-id: 388eb044dc7e25cdc438f0c3d6fa5a5740f22e3d	2020-03-12 12:19:38 -07:00
Jun Wu	3376363721	tracing-collector: add is_event to TreeSpan Summary: Expose the is_event property via public APIs. Reviewed By: DurhamG Differential Revision: D19797705 fbshipit-source-id: f441825e98208964f7b3d6815a177b464430cbb7	2020-03-12 12:19:38 -07:00
Stanislau Hlebik	ba871d3bdc	xdiff: allow rendering diff for large files Summary: The goal of the stack is to support "rendering" diffs for large files in scs server. Note that rendering is in quotes - we are fine with just showing a placeholder like "Binary file ... differs". This is still better than the current behaviour which just return an error. In order to do that I suggest to tweak xdiff library to accept FileContentType which can be either Normal(...) meaning that we have file content available, or Omitted, which usually means the file is large and we don't even want to fetch it, and we just want xdiff to generate a placeholder. Reviewed By: markbt, krallin Differential Revision: D20389226 fbshipit-source-id: 0b776d4f143e2ac657d664aa9911f6de8ccfea37	2020-03-12 04:27:23 -07:00
Jun Wu	194b38385a	nameset: add a way to convert between NameSet and SpanSet Summary: This will be used in the Python world for legacy reasons. It shouldn't be used in new Rust code. To use it, the name `LegacyCodeNeedIdAccess` has to be used so we can do a code search to find all users of it. Reviewed By: sfilipco Differential Revision: D20367834 fbshipit-source-id: 9b93a29f1461ce24bba6f31a2bbb1f327e216c6d	2020-03-11 20:37:30 -07:00
Jun Wu	eef56d9c5b	namedag: add a sort API Summary: This will be useful to actually sort commits. Reviewed By: sfilipco Differential Revision: D20367835 fbshipit-source-id: 43bc7835277af3a14ef323ce34247e0c03878dc8	2020-03-11 20:37:29 -07:00
Jun Wu	2ecc0bb757	namedag: move "all" concept to DagSet Summary: The old "AllSet" implementation is not very practical - it does not support iteration. Practically, the "all()" set comes from the DAG. Change the "all" concept to a hint similar to "is_topo_sorted", and update the fast path (intersection) accordingly. Reviewed By: sfilipco Differential Revision: D20367837 fbshipit-source-id: fdbf370897c93058bfcab0571c1f6fa4b99b0f6b	2020-03-11 20:37:29 -07:00
Jun Wu	ef1696b4db	namedag: rename arc_map to snapshot_map Summary: The word "snapshot" more accurately describes its purpose. Reviewed By: sfilipco Differential Revision: D20367836 fbshipit-source-id: c91a0bd402fa1718b5d805beedc0e062824c53d3	2020-03-11 20:37:29 -07:00
Jun Wu	c5c75c9f59	fsinfo: autocorrect "" to "." Summary: Without this: In [3]: util.getfstype('') IOError: [Errno 2] No such file or directory (os error 2) And there is a code path hitting this: File "edenscm/mercurial/util.py", line 1483, in checknlink fstype = getfstype(os.path.dirname(testfile)) # testfile = '.' # os.path.dirname(".") = "" The old implementation works fine for an empty path: In [2]: m.util.getfstype('') Out[2]: 'eden' So let's make the new Rust implementation consistent. Reviewed By: xavierd Differential Revision: D20313387 fbshipit-source-id: 258c424a3e8a796d983e20b0d4656e8e3f413706	2020-03-11 17:35:40 -07:00
Jun Wu	61bebcaacc	fsinfo: try harder to get fuse fs type Summary: Similar to D13982877. Try to get names like "fuse.ntfs". Reviewed By: farnz Differential Revision: D20313392 fbshipit-source-id: 8363d3d92843e6afb53a0003950be083034bd841	2020-03-11 17:35:39 -07:00
Jun Wu	13374f9d74	fsinfo: drop most type parameters Summary: Only keep type parameters at the top-level function. This reduces the binary size and speeds up rustc. Reviewed By: xavierd Differential Revision: D20313388 fbshipit-source-id: 29d77731ff462fee1f1bb9f234601e3430198ae7	2020-03-11 17:35:39 -07:00
Jun Wu	c83006002c	fsinfo: return unknown on unsupported platforms Summary: This makes the code a bit more portable. Reviewed By: xavierd Differential Revision: D20313389 fbshipit-source-id: 080538939fa4d2d72e5905f23ad9be987d952748	2020-03-11 17:35:38 -07:00
Jun Wu	9cdc818915	fsinfo: drop "repo" from method names Summary: Rename the main method to "fstype". The API has no relation with repo. So let's rename it. Reviewed By: xavierd Differential Revision: D20313386 fbshipit-source-id: 80dd1231ccccfe945150b117b151bce773f0dfeb	2020-03-11 17:35:38 -07:00
Jun Wu	951c8ab082	fsinfo: backport from telemetry Summary: The fsinfo crate provides the "filesystem type" information. Reviewed By: xavierd Differential Revision: D20313391 fbshipit-source-id: f717f5edb32957d59d03090117cfdb8123f03933	2020-03-11 17:35:37 -07:00
Xavier Deguillard	f466037b4b	revisionstore: fix memcache test flakiness Summary: Since the mocked memcache is shared between the tests, we need to make sure the keys used by the tests are different, otherwise they are just caching each others data. Reviewed By: ikostia Differential Revision: D20388783 fbshipit-source-id: 0f2f926e0ffe0e52e55291e46142808ce0921288	2020-03-11 15:58:03 -07:00
Jun Wu	97e9b81ba5	indexedlog: remove compiler warnings on Windows Summary: Some `use`s are not used on Windows. The code was also formatted using the latest rustfmt. Reviewed By: xavierd Differential Revision: D20379704 fbshipit-source-id: ffadcd68e4e0440dcbd2a4e1ad8532b47a9d83e2	2020-03-11 15:54:19 -07:00
Xavier Deguillard	c98b9cfff9	revisionstore: remove Arc from MetadataStore Summary: Similarly to the ContentStore, remove the Arc from MetadataStore. Reviewed By: quark-zju Differential Revision: D20376838 fbshipit-source-id: 4321600b752c919b6d9fa7bdee6f6cb7ae083b10	2020-03-11 13:39:06 -07:00
Xavier Deguillard	7e704ec7fb	revisionstore: remove the Arc from ContentStore Summary: The clients should use an Rc/Arc if they need the ability to clone it. This makes it more obvious and reduces the number of pointer indirection. Reviewed By: quark-zju Differential Revision: D20376839 fbshipit-source-id: c56e7e8f89ab17727be621894c329e344a7f3adb	2020-03-11 13:39:05 -07:00
Jun Wu	4960709aa3	dag: do not depend on types Summary: The dag crate is designed to work with any kind of binary commit hashes (ex. bonsai, git or hg). The only use of `types` is to convert from binary to hex. Since dag already has its own `to_hex` logic in `VertexName`, let's use that instead. Reviewed By: sfilipco Differential Revision: D20378447 fbshipit-source-id: 00ecb551ea927fdb60dd91e5e645064f23139bcd	2020-03-11 10:49:31 -07:00
Jun Wu	009ea22175	indexedlog: retry rename in atomic_write on Windows Summary: Recently there has been some Windows-related test flakiness in . All cases are caused by `file.persist(path)` in `atomic_write_plain` failing with "Access Denied". Since that can be caused by Windows Anti-Virus scans or other weird stuff, let's work around it using automatic retries. Process Explorer does not provide extra information: indexedlog-d0c6135fd7ed9ece.exe 5868 SetRenameInformationFile C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\.tmpcfDsQQ ACCESS DENIED ReplaceIfExists: True, FileName: C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\meta A successful rename looks like: indexedlog-d0c6135fd7ed9ece.exe 5868 SetRenameInformationFile C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\.tmpbXEVw0 SUCCESS ReplaceIfExists: True, FileName: C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\meta Reviewed By: ikostia Differential Revision: D20379618 fbshipit-source-id: db3e6be3d785875486f7a517df11cbf58bf65ddd	2020-03-11 10:06:47 -07:00
Xavier Deguillard	5d230aef68	backingstore: use get_file_content to strip metadata Summary: Now that the ContentStore can automatically strip the metadata header, no need for duplicated code in the backingstore. Reviewed By: fanzeyi Differential Revision: D20376812 fbshipit-source-id: e863e1cc2fcdc8b9e612a464b305fa25ceb66e13	2020-03-11 09:40:26 -07:00
Xavier Deguillard	40bbe7b4da	merge: add a Rust threaded file updater Summary: During `hg update`, Mercurial forks multiple processes to write files on disk concurrently. This is done because fetching blobs from the content store and writing them to disk is CPU bound. Usually, threads would be the preferred way of speeding up such a process, but unfortunately, Python has a GIL that severely limits the available concurrency. So, multiple processes were chosen. Unfortunately, the multi-process solution also brings a lot of other issues: more recently, we've had cases where the connections to the server and memcache had to be dropped after the fork. In some other cases, this caused deadlocks. And the solution is not effective on Windows. Now that Mercurial is getting more and more Rust, we could instead go back to the threads solution by using them in Rust, and have Python just push work to them. This is exactly what this change does. Things that are left to be done, but I wanted to get a diff out first: - no file path audit - no file backup - no symlink creation - probably other things I'm missing Reviewed By: quark-zju Differential Revision: D20102888 fbshipit-source-id: d47829fd7818b97710586b9851880f178048e27b	2020-03-11 01:13:54 -07:00
Xavier Deguillard	185bc0f437	revisionstore: add an LfsMultiplexer store Summary: With this new store, blobs will be transparently written to either an LFS store, or a non-LFS one, depending on their size. Initially, and as long as getpackv2 is supported, we also need to support parsing lfs pointer data that the server is sending and write these to the lfs pointer store. This code is very ad hoc and does manual parsing of the pointer data, definitely not great; suggestions for a simpler and better solution are welcome :). From a migration standpoint, the read-only LFS stores are added to the ContentStore, which allows blobs written in it to be readable at all times even when `remotefilelog.lfs` isn't set. The code will effectively be dormant for a while until the option is turned on; if we need to disable it, the dormant code will still be able to read all the blobs written to disk. This forces us to deploy a release that contains this code to stable first, before setting `remotefilelog.lfs`. Reviewed By: quark-zju Differential Revision: D19986878 fbshipit-source-id: 260f5a542d52e748c0c703bfa7bb8ffac0e7b388	2020-03-10 18:14:54 -07:00
Jun Wu	5f84fc8222	indexedlog: use dev-logger Summary: This makes `RUST_LOG` work for indexedlog tests. Reviewed By: xavierd Differential Revision: D20286515 fbshipit-source-id: ff4a1476eb01a9067dabe3622fd598f65fe86a18	2020-03-10 14:16:39 -07:00
Jun Wu	7a12c33163	dev-logger: a simple library to enable env_logger for testing Summary: The tracing / env_logger integration works for hg as a binary. However I'd also like to use it in library tests. This crate makes it easier to do so. Reviewed By: xavierd Differential Revision: D20286507 fbshipit-source-id: f5bf3288ce950591ddfe64b524ad51ce21ee4099	2020-03-10 14:16:38 -07:00
Jun Wu	cf72dc45f5	indexedlog: add some tracing information Summary: These have helped me debug some issues. Reviewed By: xavierd Differential Revision: D20286513 fbshipit-source-id: 012ddb16c2d0efd8f8697a5ecd4564ea31d65630	2020-03-10 14:16:38 -07:00
Jun Wu	e7ed737a64	hgcommands: make env_logger show exit code Summary: Move the scope of spans so the exit code is shown. Reviewed By: xavierd Differential Revision: D20286516 fbshipit-source-id: f39cbf60c86ea19a1bb0a09958748f04ff6a42e8	2020-03-10 14:16:37 -07:00
Jun Wu	bb9023c2cb	hgcommands: move env_logger initialization to hgcommands Summary: Previously env_logger is only initialized if Python is initialized. This diff makes env_logger initialized for Rust native commands. Reviewed By: xavierd Differential Revision: D20286517 fbshipit-source-id: 18fee96c2b41db1da9648d615d1e18809de90a63	2020-03-10 14:16:37 -07:00
Jun Wu	97d0a976fd	tracing: make it write to the log eco-system Summary: This means crates like env_logger (which reads $RUST_LOG, and writes to stderr) can be used for convenient debugging. Reviewed By: xavierd Differential Revision: D20286514 fbshipit-source-id: e3b80cc4830ba5cc6dbf7aa1cbb92a4f4f046a54	2020-03-10 14:16:37 -07:00
Jun Wu	796f199130	tracing: save static metadata from tracing to Spans Summary: Those metadata include module_path, target, line number, etc, in Rust native format. They will be used for the upcoming `log` integration. Reviewed By: xavierd Differential Revision: D20286510 fbshipit-source-id: 27019b941bef08c0bb3e505bbdae642282dcb141	2020-03-10 14:16:36 -07:00
Stefan Filip	d8b4ddcecf	dag: split lock file acquisition to own function Summary: Splitting lock file acquisition from `IdDag::prepare_filesystem_sync` to its own function. Useful when looking ahead to split IdDag from IndexedLog. Reviewed By: quark-zju Differential Revision: D20316443 fbshipit-source-id: a0fd43439730376920706bb4349ce497f6624335	2020-03-09 10:18:07 -07:00
Stefan Filip	620cdd96f2	dag: add IdDag::iter_segments_with_parent Summary: This removes an inline use of the indexedlog indexes. This is going to be useful when we try to separate IndexedLog specifics from IdDag functionality. Reviewed By: quark-zju Differential Revision: D20316058 fbshipit-source-id: 942a0a71660bb327376c81fd3ac435d002ecca6e	2020-03-09 10:18:07 -07:00
Kuba Zika	6a25dbee81	Simplify error pattern matching Summary: Instead of returning `anyhow::Error` wrapping an `ErrorKind` enum from each Thrift client method, just return an error type specific to that method. This will make error handling simpler and less error-prone by removing the need to downcast the returned error. This diff also removes the `ErrorKind` enums so that we can be sure that there are no leftover places trying to downcast to them. (Note: this ignores all push blocking failures!) Reviewed By: dtolnay Differential Revision: D20260398 fbshipit-source-id: f0dd96a7b83dd49f6b30948660456539012f82e6	2020-03-06 12:09:38 -08:00
Jun Wu	3103fcf62b	indexedlog: reload content after obtaining a lock at open time Summary: The old code does "read, lock, write", which is unsound because after "lock" the data just read can be outdated and needs a reload. Reviewed By: xavierd Differential Revision: D20306137 fbshipit-source-id: a1c29d5078b2d47ee95cf00db8c1fcbe3447cccf	2020-03-06 08:12:02 -08:00
Jun Wu	75e4ffc17f	indexedlog: change IndexDef.lag_threshold back from entries to bytes Summary: I thought the index function could be the bottleneck. However, the Log reading (xxhash, decoding vlqs) can be much slower for very long entries. Therefore using bytes as the lag threshold is better. It does leak the Log implementation details (how it encodes an entry) to some extent, though. Reverts D20042045 and D20043116 logically. The lagging calculation uses the new Index::get_original_meta API, which makes correctness easier to verify (in fact, it seems the old code is wrong - it might skip Index flushes if sync() is called multiple times without flushing). This should mitigate an issue where a huge entry (generated by `hg trace`) in blackbox does not get indexed in time and causes performance regressions. Reviewed By: DurhamG Differential Revision: D20286508 fbshipit-source-id: 7cd694b58b95537490047fb1834c16b30d102f18	2020-03-05 13:29:48 -08:00
Jun Wu	efff6f3592	indexedlog: add an API to get the Index meta that is not dirty Summary: This will be used to more reliably detect index lags. Reviewed By: DurhamG Differential Revision: D20286518 fbshipit-source-id: c553b6587363a55603b75df12580588e3100e35f	2020-03-05 13:29:47 -08:00
Jun Wu	66e60bacb9	rotatelog: build indexes for older logs on access Summary: This ensures indexes are complete even if index format or definition has been changed. Reviewed By: DurhamG Differential Revision: D20286509 fbshipit-source-id: fcc4ebc616a4501e4b6fd2f1a9826f54f40b99b8	2020-03-05 13:29:47 -08:00
Jun Wu	669c58bd56	blackbox: use RotateLog::iter_dirty() Summary: This avoids loading all blackbox logs when `init()` gets called multiple times (for example, once in Rust and once in Python). Reviewed By: DurhamG Differential Revision: D20286511 fbshipit-source-id: ef985e454782b787feac90a6249651a882b6552e	2020-03-05 13:29:47 -08:00
Jun Wu	1c6310b9d6	rotatelog: add iter_dirty() API Summary: This API has the benefit that it does not trigger loading older logs. Reviewed By: DurhamG Differential Revision: D20286512 fbshipit-source-id: 426421691ad1130cdbb2305612d76f18c9f8798c	2020-03-05 13:29:46 -08:00
Jun Wu	64ba669a51	nameset: add some tests for DagSet Summary: With the new crate-public interfaces and Debug implementations it's possible to write tests for DagSet. So let's do it. Reviewed By: sfilipco Differential Revision: D20242561 fbshipit-source-id: 180e04d9535f79471c79c4307f6ab6e8e8815067	2020-03-05 11:46:18 -08:00
Xavier Deguillard	34bce8690f	revisionstore: silence compiler warning Summary: Don't restrict constructing a c_api datapack store to only Unix, we can construct it on Windows too by assuming that their path will be valid UTF-8. Reviewed By: quark-zju Differential Revision: D20250718 fbshipit-source-id: 07234b6a71b50c803cfe3b962fa727f57037c919	2020-03-05 09:35:57 -08:00
Xavier Deguillard	751fc53638	types: add an ancestors method to RepoPath Summary: This returns the ancestors in the reverse order of the parents method. Reviewed By: sfilipco Differential Revision: D20265277 fbshipit-source-id: 83277cee3d8e9070fc56d20d4c1877e6782c22f7	2020-03-05 09:31:32 -08:00
Jun Wu	bb1562604a	dag: make some test APIs public in crate Summary: Those will be reused by nameset::DagSet. Reviewed By: sfilipco Differential Revision: D20242563 fbshipit-source-id: 944e9a04aeb15439256ecea64355b67e326e5c89	2020-03-04 17:33:25 -08:00
Jun Wu	b8e1477401	nameset: impl Debug for other sets Summary: This is useful for `assert_eq!(format!("{:?}", set), "...")` tests. It will be eventually exposed to Python as `__repr__`, similar to Python's smartsets. Reviewed By: sfilipco Differential Revision: D20242562 fbshipit-source-id: 5373bb180db7cafebf273ace7cf2cb80fbfb8038	2020-03-04 17:33:25 -08:00
Jun Wu	fa069204e3	nameset: impl Debug for StaticSet Summary: In the Python world all smartsets have some kind of "debug" information. Let's do something similar in Rust. Related code is updated so the test is more readable. Reviewed By: sfilipco Differential Revision: D20242564 fbshipit-source-id: 7439c93d82d5d037c7167818f4e1125c5a1e513e	2020-03-04 17:33:24 -08:00
Jun Wu	0ae5a59e9e	indexedlog: fix metadata-only updates for Indexes Summary: Previously, `flush()` would skip writing the file if there were only metadata changes. Fix it by detecting metadata changes. This can potentially fix an issue where certain blackbox indexes are empty, lagging and require scanning the whole log again and again. In that case, the index itself is not changed (the root radix entry is not changed), but only the metadata tracking how many bytes in Log the index covered changed. Reviewed By: sfilipco Differential Revision: D20264627 fbshipit-source-id: 7ee48454a92b5786b847d8b1d738cc38183f7a32	2020-03-04 15:59:12 -08:00
Xavier Deguillard	314d1978ef	clidispatch: silence warning on windows Summary: Using `if cfg!` instead of `#[cfg]` allows for the compiler to understand that the arguments aren't unused, and silence the warnings. Reviewed By: quark-zju Differential Revision: D20242280 fbshipit-source-id: 332dfe17b3a80a1096d15c91c9fb6644bd10e0cd	2020-03-04 09:49:15 -08:00
Xavier Deguillard	7d9d38017c	configparser: silence compiler warning Summary: Compiling it on Windows produced a bunch of warning due to `hgrc_configset_load_path` not being compiled on it. Fixed it so it no longer depends on Unix specific imports. Reviewed By: quark-zju Differential Revision: D20241102 fbshipit-source-id: 3002f961191fbb9bc51aa9ac1154d6d50bd7fe23	2020-03-04 09:49:14 -08:00
Xavier Deguillard	db76b7d52b	procinfo: address compiler warning Summary: The `.into_iter()` for this object is being deprecated and won't compile in the future, fix it now. Reviewed By: quark-zju Differential Revision: D20241103 fbshipit-source-id: fdee463ed81cd07a65f3cc4c70a96c88928b3b87	2020-03-04 09:49:14 -08:00
Xavier Deguillard	be7ae642ea	commitcloudsubscriber: silence compiler warning Summary: While compiling on Windows, this file issues a bunch of warnings. Use `if cfg!` instead of `#[cfg]` to silence them. The behavior is the same, but `if cfg!` allows the compiler to recognize that the code is not unused. Reviewed By: quark-zju Differential Revision: D20241104 fbshipit-source-id: 2cd7f171c7a2f7220cc73bea9be3359260de19b2	2020-03-04 09:49:14 -08:00
Jun Wu	49464342fd	indexedlog: try to use symlink for atomic_write on unix Summary: The change is in theory not necessary. However it improves the reliability on OS crashes a bit, and can potentially workaround some bugs in filesystems (as we saw in production where the atomic-written files are empty and the system didn't crash). The idea is, the `symlink` syscall does the file creation and "content" writing together, while there is no way to create a file and write specific content in one syscall. Note that the C symlink call uses 0-terminated string, and the Rust stdlib exports it as accepting `Path`. To be safe, we encode binary or non-utf8 content using `hex`. For downgrade safety, the write path does not use symlink by default unless format.use-symlink-atomic-write is set to true. This makes downgrade possible: the read path is rolled out first, then we can turn on and off the write path. The indexedlog Rust unit tests and test-doctor.t are migrated to use the new symlink code paths. Reviewed By: DurhamG Differential Revision: D20153864 fbshipit-source-id: c31bd4287a8d29575180fbcf7227d2b04c4c1252	2020-03-04 07:23:48 -08:00
Jun Wu	def12896db	indexedlog: add a utility function to read files created by atomic_write Summary: This makes it possible to implement atomic_write differently (ex. use a symlink). Reviewed By: DurhamG Differential Revision: D20153865 fbshipit-source-id: 07fa78c2f2dac696668f477c75f65cf70950b73f	2020-03-04 07:23:47 -08:00
Jun Wu	7c5e47bab1	indexedlog: rename chunk_size_log to chunk_size_logarithm Summary: This makes it clear that `log` is a math concept, not an append-only file like `Log`. Reviewed By: DurhamG Differential Revision: D20149376 fbshipit-source-id: 67d2e9584b15f48759ca9b6dfce4279a5b1365a0	2020-03-03 13:41:28 -08:00
David Tolnay	e988a88be9	rust: Rename futures_preview:: to futures:: Summary: Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/ This codemod replaces all dependencies on `//common/rust/renamed:futures-preview` with `fbsource//third-party/rust:futures-preview` and their uses in Rust code from `futures_preview::` to `futures::`. This does not introduce any collisions with `futures::` meaning 0.1 futures because D20168958 previously renamed all of those to `futures_old::` in crates that depend on both 0.1 and 0.3 futures. Codemod performed by: ``` rg \ --files-with-matches \ --type-add buck:TARGETS \ --type buck \ --glob '!/experimental' \ --regexp '(_\|\b)rust(_\|\b)' \ \| sed 's,TARGETS$,:,' \ \| xargs \ -x \ buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:futures-preview, 1))" \ \| xargs sed -i 's,\bfutures_preview::,futures::,' rg \ --files-with-matches \ --type-add buck:TARGETS \ --type buck \ --glob '!/experimental' \ --regexp '(_\|\b)rust(_\|\b)' \ \| xargs sed -i 's,//common/rust/renamed:futures-preview,fbsource//third-party/rust:futures-preview,' ``` Reviewed By: k21 Differential Revision: D20213432 fbshipit-source-id: 07ee643d350c5817cda1f43684d55084f8ac68a6	2020-03-03 11:01:20 -08:00
Genevieve Helsel	528015f9fe	allow more hg fastpath cases Reviewed By: simpkins Differential Revision: D20143888 fbshipit-source-id: 4b1a73159bde6835626ad1766b2cf9dcd2faf6c4	2020-03-02 07:43:39 -08:00
Jun Wu	c718a5dc19	pathmatcher: add a test about a bug in globset/aho-corasick Summary: Also patch aho-corasick to fix the issue. The issue was introduced by [an optimization path](`063ca0d253`) added in aho-corasick 0.7 series (used by globset 0.4.3). aho-corasick 0.6.x (globset 0.4.2) are not affected. The next aho-corasick release (0.7.9) contains the fix. See https://github.com/BurntSushi/aho-corasick/issues/53 for more context. Reported by: yns88 Reviewed By: DurhamG Differential Revision: D20125697 fbshipit-source-id: 592375b43d7ee494bb3e916a1cb11c18f9ebe425	2020-02-28 22:09:28 -08:00
Jun Wu	10bb5a144e	revset: replace some repo.revs with repo.nodes Summary: Migrate away from some uses of revision numbers. Some dead code in discovery.py is removed. I also fixed some test issues when I run tests locally. Reviewed By: sfilipco Differential Revision: D20155399 fbshipit-source-id: bfdcb57f06374f9f27be51b0980652ef50a2c8e0	2020-02-28 17:45:26 -08:00
Jun Wu	0220b4a0c3	nameset: make NameIter Send Summary: This makes it possible to use NameIter in py_class. Reviewed By: sfilipco Differential Revision: D20020529 fbshipit-source-id: b9147b7dccb38d18d8361b420507fcbe97e01351	2020-02-28 16:35:25 -08:00
Jun Wu	782f2017aa	dag: add hex prefix lookup Summary: This will be used by commit hash prefix lookup. Reviewed By: sfilipco Differential Revision: D20020523 fbshipit-source-id: f2905ddf63098704b08dad8eb48272c3ffba7e25	2020-02-28 16:35:24 -08:00
Jun Wu	12441f48bf	dag: re-export common types at top-level Summary: Export common types at the top-level of the crate so it's easier to use. Reviewed By: sfilipco Differential Revision: D20020526 fbshipit-source-id: e9a0a8bc3cc91f81d0bc74e7530cd4613fc1dd61	2020-02-28 16:35:23 -08:00
Jun Wu	bc9f72ccf3	dag: implement DAG algorithms on NameDag Summary: Those just delegate to IdDag for the actual calculation. Reviewed By: sfilipco Differential Revision: D20020522 fbshipit-source-id: 272828c520097c993ab50dac6ecc94dc370c8e8b	2020-02-28 16:35:23 -08:00
Jun Wu	b88da34fb0	dag: expose NameDag in tests Summary: This allows tests to check NameDag APIs. Reviewed By: sfilipco Differential Revision: D20020525 fbshipit-source-id: 4ee8e4bcbd0731512ba17068e827b8045fc5d522	2020-02-28 16:35:23 -08:00
Jun Wu	194cd25f4f	dag: add Arc<IdMap> to NameDag Summary: This will be used to produce NameSet. Reviewed By: sfilipco Differential Revision: D20020519 fbshipit-source-id: abf6d73f2b985b74560d6b5db2800ff25450f02e	2020-02-28 16:35:22 -08:00
Jun Wu	7a343271b9	dag: rename NameDag::parents to NameDag::parent_names Summary: This matches IdDag::parents (taking a set) and IdDag::parent_ids. Reviewed By: sfilipco Differential Revision: D20020524 fbshipit-source-id: 6e90727c355a7400f9a23e0b25e3392bdc032f49	2020-02-28 16:35:22 -08:00
Jun Wu	e3b28a683c	nameset: add fast paths for DagSet Summary: DagSet's SpanSet has fast paths for set operations. Use them. Reviewed By: sfilipco Differential Revision: D19912104 fbshipit-source-id: 24b55aa14d03be2f1be59c923e0b8e79d6bcbe6d	2020-02-28 16:35:22 -08:00
Jun Wu	587b06efee	nameset: AllSet Summary: This is similar to hg's fullreposet. It'll be useful as a dummy "subset". Reviewed By: sfilipco Differential Revision: D19912108 fbshipit-source-id: 33a95bcb3cf5931a431a1201d1a1f3c627cec7a1	2020-02-28 16:35:21 -08:00
Jun Wu	d41c55a13b	nameset: SortedSet Summary: SortedSet is a wrapper to other sets that marks it as topologically sorted. Reviewed By: sfilipco Differential Revision: D19912111 fbshipit-source-id: 2637e8fd29b97f6db0c5bae3f0decd7ac382eeb1	2020-02-28 16:35:21 -08:00
Jun Wu	51bea7aff7	nameset: LazySet Summary: Similar to Mercurial's smartset.generatorset. Reviewed By: sfilipco Differential Revision: D19912110 fbshipit-source-id: 7d940b8578ec7090282e2addb1fde871cddb2b25	2020-02-28 16:35:20 -08:00
Jun Wu	5e451d07b1	nameset: DagSet Summary: Wraps SpanSet + IdMap so it only exposes commit names without ids. There is no equivalent smartset in Mercurial. Reviewed By: sfilipco Differential Revision: D19912112 fbshipit-source-id: 0d257de11527dfa8836065ac94f652730a97a468	2020-02-28 16:35:20 -08:00
Jun Wu	e7e7a5b356	nameset: StaticSet Summary: Similar to Mercurial's smartset.baseset. All names are statically known. Reviewed By: sfilipco Differential Revision: D19912105 fbshipit-source-id: e4fcf2d59291adb3ca01b3b90f1ac32c65ad7eaa	2020-02-28 16:35:20 -08:00
Jun Wu	349d1bc33e	nameset: IntersectionSet Summary: Similar to Mercurial's smartset.filterset. Reviewed By: sfilipco Differential Revision: D19912113 fbshipit-source-id: 7cf2101b2eb7ba34b542199293cdbfd3973ef72f	2020-02-28 16:35:19 -08:00
Jun Wu	c0a1a3ab22	nameset: DifferenceSet Summary: Similar to Mercurial's smartset.filterset. Reviewed By: sfilipco Differential Revision: D19912107 fbshipit-source-id: a3187c94f8e0c64f6d92e924ba46e83ce74c3e19	2020-02-28 16:35:19 -08:00
Jun Wu	b6cea95ea5	dag: use Bytes to avoid some VertexName copies Summary: This is an example about how to use the new Bytes type. The performance change is not obviously visible in benchmarks since the bottleneck is not at the bytes copying. Reviewed By: DurhamG Differential Revision: D19818720 fbshipit-source-id: a431ae206cfa4fa08b2e162a48b3d7cbcd900f7f	2020-02-28 09:23:59 -08:00
Jun Wu	76ab726056	dag: switch from bytes to minibytes Summary: The APIs are compatible so the switch is straightforward. Reviewed By: DurhamG Differential Revision: D19818713 fbshipit-source-id: 504e9149567c90eb661804e0dad20580a401aa76	2020-02-28 09:23:59 -08:00
Jun Wu	9e3920ca1c	dag: fix benchmarks Summary: D19559127 forgot those files. Reviewed By: DurhamG Differential Revision: D19818715 fbshipit-source-id: 92321492eae89ed9f748800b3bfcc306a54aab20	2020-02-28 09:23:59 -08:00
Jun Wu	c417232b1b	mutationstore: update lag_threshold Summary: D20042045 changes the meaning of "lag_threshold". Update the value in mutation store accordingly. Reviewed By: DurhamG Differential Revision: D20043116 fbshipit-source-id: 154e6dc2aa88ab0a9a9b21929ae5fa6163dcd403	2020-02-28 09:23:59 -08:00
Jun Wu	1962fd5f5b	indexedlog: update lagging indexes at open time Summary: Previously indexes were only updated at `sync()` time. This diff makes it so `open()` can also update lagging indexes. This should make index migration (ex. D19851355) smoother - indexes are built in time and users suffer less from the absence of indexes. Reviewed By: DurhamG Differential Revision: D20042046 fbshipit-source-id: 20412661a0ca4f5f67b671137c47b6373a42981d	2020-02-28 09:23:58 -08:00
Jun Wu	6da3bdadd2	indexedlog: extract logic writing indexes to disk to a method Summary: The logic is currently only used by `sync()`. I'd like to reuse it at `open()`. Reviewed By: DurhamG Differential Revision: D20042044 fbshipit-source-id: 5c9734ff68bdcf8f8c8710c6a821b18d3afeaca0	2020-02-28 09:23:58 -08:00
Jun Wu	afb24f8a8a	indexedlog: change IndexDef.lag_threshold from bytes to entries Summary: This is more friendly for indexedlog users - deciding lag_threshold by number of entries is easier than by bytes. Initially, I thought checking `bytes` is cheaper and checking `entries` is more expensive. However, practically we will have to build indexes for `entries` anyway. So we do know the number of entries lagging behind. Reviewed By: DurhamG Differential Revision: D20042045 fbshipit-source-id: 73042e406bd8b262d5ef9875e45a3fd5f29f78cf	2020-02-28 09:23:58 -08:00
Jun Wu	55363a78a7	indexedlog: add API to convert `&[u8]` to zero-copy `Bytes` Summary: This can be useful for users of indexedlog when they want `Bytes` (to get rid of the lifetime parameter). This might be useful for storage layer that wants to take the ownership of the returned bytes. Reviewed By: xavierd Differential Revision: D19818714 fbshipit-source-id: cb2d4e7deff921915e07454fee15cb94a3d5c00d	2020-02-28 09:23:57 -08:00
Jun Wu	556850e715	indexedlog: remove unused mmap utility functions Summary: Those utilities are no longer necessary since the new code uses Bytes. Reviewed By: xavierd Differential Revision: D19818717 fbshipit-source-id: 0b43af0f1eae1a4288e84d4170db058b27f80334	2020-02-28 09:23:57 -08:00
Jun Wu	aaf59c569d	indexedlog: replace Mmap with Bytes in Log Summary: This simplifies the code a bit and makes it cheaper to clone the Log. Reviewed By: xavierd Differential Revision: D19818716 fbshipit-source-id: bbf07b8b36009d53b63d8066ec422fc3c3796840	2020-02-28 09:23:57 -08:00
Jun Wu	90ee3cb05a	indexedlog: remove ChecksumTable Summary: It's no longer used since Index now has inlined its checksum logic. Reviewed By: ikostia Differential Revision: D19850744 fbshipit-source-id: eb134e4c1613573a2d238710b44ad8119c80a5ee	2020-02-28 09:23:56 -08:00
Jun Wu	a1601bfdd9	indexedlog: bump index filename Summary: Change index filename and metadata name. This makes sure the new format and old format are separate so upgrading or downgrading won't have issues. Reviewed By: DurhamG Differential Revision: D19851355 fbshipit-source-id: 25dee018073a90040f5818b32b753a3f589c10e0	2020-02-28 09:23:56 -08:00
Jun Wu	6f4bf325d5	indexedlog: write Checksum inline with Log Summary: Enhance the index format: The Root entry can be followed by an optional Checksum entry which replaces the need of ChecksumTable. The format is backwards compatible since the old format will be just treated as "there is no ChecksumTable", and the ChecksumTable will be built on the next "flush". This change is non-trivial. But the tests are pretty strong - the bitflip test alone covered a lot of issues, and the dump of Index content helps a lot too. For the index itself without ".sum", checksum, this change is bi-directional compatible: 1. New code reading old file will just think the old file does not have the checksum entry, similar to new code having checksum disabled. 2. Old code will think the root+checksum slice is the "root" entry. Parsing the root entry is fine since it does not complain about unknown data at the end. However, this change dropped the logic updating ".sum" files. That part is an issue blocking old clients from reading new data. Reviewed By: DurhamG Differential Revision: D19850741 fbshipit-source-id: 551a45cd5422f1fb4c5b08e3b207a2ffe3d93dea	2020-02-28 09:23:55 -08:00
Jun Wu	b9e3046a8d	indexedlog: add Checksum entry to Index Summary: To solve the soundness issue of ChecksumTable raised by the last diff, I plan to move Checksum logic to Index. This has multiple benefits: - Solve the soundness issue of ChecksumTable. - Indexedlog no longer writes the ".sum" files. `atomic_write` can be quite slow (tens of milliseconds) on Windows. So this should help perf - with many indexes, it can save hundreds of milliseconds on Windows per indexedlog sync. This diff adds the definition and serialization of the new Checksum entry. The index format is not updated yet. Reviewed By: markbt Differential Revision: D19850742 fbshipit-source-id: df6e6ed12a12ef0d2a782dc9d6b4dc5dec3f4b46	2020-02-28 09:23:55 -08:00
Jun Wu	0f09413ed4	indexedlog: add a broken test showing checksum_table is racy Summary: With the last change, mmap cost is reduced, but ChecksumTable is unsound in a corner case: the buffer to check is shorter than what ChecksumTable covers: checksum: \|----chunk----\|----chunk----\|----chunk--\| buf: \|-------------------------------\| \| ^ ^ logic len physical len The checksum table will be unable to verify the last chunk, since it does not have enough data in buf. The issue is exposed by stress testing the multithread sync tests. It's not always easy to reproduce, though. Reviewed By: markbt Differential Revision: D19850745 fbshipit-source-id: a1a96080163b7b9b56dcd6c1673d5d8d10e18a2b	2020-02-28 09:23:55 -08:00
Jun Wu	1e10527482	indexedlog: share Bytes between Index and ChecksumTable Summary: This avoids some extra mmap syscalls by ChecksumTable. Reviewed By: xavierd Differential Revision: D19818721 fbshipit-source-id: dace55193f2b4b0f35e3868781faa2d2998d3b58	2020-02-28 09:23:54 -08:00
Jun Wu	1ece621c4d	indexedlog: replace Mmap with Bytes in Index Summary: This simplifies the code a bit (no special cases about 0-sized mmap buffers) and makes it cheaper to clone the index buffer (just an Arc::clone, without another mmap syscall). Reviewed By: xavierd Differential Revision: D19818718 fbshipit-source-id: e96d42af74c7f0bb11703c5da31cdfbd5d76c372	2020-02-28 09:23:54 -08:00
Jun Wu	918672b106	tracing-collector: support owned strings in TreeSpans Summary: TreeSpans used to use `&str`, which adds a lifetime to the struct, making it harder to be used in the Python land. Use a type parameter so TreeSpans<String> can be used. Reviewed By: DurhamG Differential Revision: D19797708 fbshipit-source-id: c66429abfaf16d876151ca6f29da976bed91485d	2020-02-28 09:16:14 -08:00
Jun Wu	4cd7df6a01	tracing-collector: rename structs Summary: TreeSpan -> RawTreeSpan; TreeSpanWithMeta -> TreeSpanRef. I'm going to add a non-reference version of TreeSpanRef. Differential Revision: D19797701 fbshipit-source-id: 42b04c23d4d0ddbe821b94fa2ccb133ce9eafa05	2020-02-28 09:16:14 -08:00
Jun Wu	957617c8b8	tracing-collector: support filtering in TreeSpans Summary: The filtering interface allows callsite to select what they want. It's similar to manifest walk with files or directory matchers in source control. Reviewed By: DurhamG Differential Revision: D19784467 fbshipit-source-id: 5cf6e4016d6fa1c90f8aeccc50809baccd4af5ab	2020-02-28 09:16:13 -08:00
Jun Wu	366e701239	tracing-collector: support Events in TreeSpans Summary: The idea is that instants (events) can be a drop-in replacement for `ui.log`. Reviewed By: DurhamG Differential Revision: D19782897 fbshipit-source-id: 795bbba23d921e460f723f19ef529b203aea366a	2020-02-28 09:16:13 -08:00
Jun Wu	d205592d42	tracing-collector: extract logic finding parent span to a function Summary: This function will be reused by the next diff. Reviewed By: DurhamG Differential Revision: D19782895 fbshipit-source-id: 1e636eabee9b0dffd287a1e6784a24ab2259f51f	2020-02-28 09:16:13 -08:00
Jun Wu	8b5fdc01fc	tracing-collector: put treespans into a struct Summary: This allows us to define methods on the treespans, such as filtering APIs. Reviewed By: DurhamG Differential Revision: D19782896 fbshipit-source-id: 2e7bd8344c0196e382728c26a8233abf944bbf29	2020-02-28 09:16:12 -08:00
David Tolnay	37a8401761	rust/thrift: Un-rename futures-preview dependency Summary: The Thrift generated code depends only on futures 0.3, not 0.1. Thus it isn't necessary to depend on renamed:futures-preview and we can depend on futures-preview directly, which is exposed to Rust code as `futures::`. Reviewed By: jsgf Differential Revision: D20145921 fbshipit-source-id: 5cae94ec6747a374c2bf05f124ab237c798de005	2020-02-27 22:27:58 -08:00
Xavier Deguillard	6fac9ebad0	revisionstore: add a get_stripped method to ContentStore Summary: This new method returns the content of a blob without the copy-from metadata header. Reviewed By: DurhamG Differential Revision: D20102889 fbshipit-source-id: e96f636b7d30460b59707a2cb700d667e616116a	2020-02-27 12:29:42 -08:00
Jun Wu	bce29c9562	nameset: UnionSet Summary: Similar to Mercurial's smartset.addset. Reviewed By: sfilipco Differential Revision: D19912106 fbshipit-source-id: 0d0c8d0b71d2757259d26295eb4a564fea807dea	2020-02-27 07:34:57 -08:00
Jun Wu	c906a21ce1	nameset: initial NameSet abstraction Summary: The NameSet is something similar to SpanSet and Mercurial's smartset but speaks VertexNames instead of Ids. The idea is, NameSet will be part of NameDag APIs, and potentially replace Mercurial's smartset layer (just smartset the container types, not the revset language), in a way that revision numbers are completely hidden behind the scenes. This diff adds some basic abstraction around iteration-related operations. Other operations will be added later. Reviewed By: sfilipco Differential Revision: D19912109 fbshipit-source-id: 504a26c074282ec51f260535ca63e943124f688e	2020-02-27 07:34:57 -08:00
Adam Simpkins	0ffcf3e450	update the Rust `print_status()` function to take an `IO` parameter Summary: Update the `print_status()` function to take a `clidispatch::io::IO` object as a parameter, instead of a simple output object. This will allow us to also print error messages from this function in a future diff. Reviewed By: quark-zju Differential Revision: D19958504 fbshipit-source-id: bf482fdc4420e1350363a730c6a539cd760aef25	2020-02-26 14:54:40 -08:00
Adam Simpkins	b22fc79e4b	clean up PathRelativizer API usage of Path vs PathBuf Summary: Fix the PathRelativizer APIs to accept `Path` and even `str` arguments instead of just `PathBuf`. The old code required a `PathBuf`, which often forced callers to make a copy of the path data. Reviewed By: quark-zju Differential Revision: D19958505 fbshipit-source-id: 6fa40dd4b75df4e3faf9ad2ae4f0e4e6595669f6	2020-02-24 15:38:36 -08:00
Xavier Deguillard	934b64397b	convert to bytes 0.5 Summary: The bytes 0.5 is a dependency of newer tokio, it's also newer, and thus better. Staying on 0.4 means that copies between Bytes 0.4 and 0.5 need to be done, this will be especially bad in the LFS code since 10+MB buffer will have to be copied... One main API change is for the configparser. The code used to take Into<Bytes> for the keys, I switched it to AsRef<[u8]>. For hg_memcache_client, an extra copy is performed to build a Delta, since this code uses an old tokio, and is being replaced right now, the effort of switching to a new tokio and new bytes was not deemed worth it, the copy will do for now. Reviewed By: dtolnay Differential Revision: D20043137 fbshipit-source-id: 395bfc3749a3b1bdfea652262019ac6a086e61e0	2020-02-24 10:28:46 -08:00
Jun Wu	142937c2f8	cargo: bump serde_cbor to 0.11 Summary: Follow up of D20024491. Reviewed By: sfilipco Differential Revision: D20043585 fbshipit-source-id: f66896c8f41c3918fb37611d87fa26c39cdecef1	2020-02-21 14:08:43 -08:00
Xavier Deguillard	44c4f2f5d9	revisionstore: add copyfrom information to the LFS pointer Summary: Mercurial filenode hash is computed by including the copy information in the blob header. Before computing the blob content hash, or returning it to the upper layers, we need to either strip or reconstruct this header appropriately. Reviewed By: DurhamG Differential Revision: D19975887 fbshipit-source-id: 7555e7219e50f4d18ec677fdecc216ee705d7af4	2020-02-20 14:28:52 -08:00
Xavier Deguillard	7fb75ce4f0	lfs: move contenthash computation to the enum impl Summary: This will make it easier to support more hash schemes in the future. Reviewed By: DurhamG Differential Revision: D19975888 fbshipit-source-id: 8b8ce3b20d72199bac3cd20a48475b5ab56bfc52	2020-02-20 14:28:52 -08:00
Xavier Deguillard	cd56a8b39a	revisionstore: move Arc outside of the stores Summary: With the Arc embedded into the store themselves, this forces a second allocation in order to use them as trait objects. Since in most cases, we do not want the stores themselves to be cloneable, we can move the Arc outside and thus reduce the number of pointer indirection. Reviewed By: DurhamG Differential Revision: D19867568 fbshipit-source-id: 9cd126831fe2b9ee715472ac3299b7a09df95fce	2020-02-20 14:28:52 -08:00
Xavier Deguillard	7c1a623d8a	revisionstore: add the LfsStore to the ContentStore Summary: The ContentStore now can read LFS blobs from both the shared cache, and the local store. Reviewed By: DurhamG Differential Revision: D19866249 fbshipit-source-id: a6fb3523495e9d3832613b56438f631cfa552b91	2020-02-20 14:28:51 -08:00
Xavier Deguillard	58d9d92e88	revisionstore: simplify ContentStore/MetadataStore initialization a bit Summary: With the LFS store being added, and the indexedlog being soon used for trees, this simplification should help in formalizing the hierarchy of files/folders. It will look like the following: <root dir>/lfs: for the lfs store <root dir>/indexedlog*: for the indexedlog <root dir>/foobar: for a hypothetical foobar store For manifests, <root dir> will therefore be: <store dir>/manifests. The unfortunate part is that the current tree data lives under <store dir>/packs/manifests. As packfiles will be replaced, this small discrepancy is acceptable. Reviewed By: DurhamG Differential Revision: D19866248 fbshipit-source-id: 7ef59ef7df19149b19a529b4f4a45a479cc9d23b	2020-02-20 14:28:51 -08:00
Xavier Deguillard	f512b5658d	revisionstore: add an LfsStore Summary: This is the first step in having a stronger integration between LFS blobs and the ContentStore abstraction. The 2 main differences between the Python based LFS implementation and this one are: - pointers are not stored alongside plain data, - blobs are split between local and shared blobs As of now, no reclamation is being performed for shared blobs, blobs aren't fetched or uploaded. This will come in future diffs. Reviewed By: DurhamG Differential Revision: D19859291 fbshipit-source-id: 45000fc574e6fbd6d3487f4966cad4f49dab731c	2020-02-20 14:28:51 -08:00
Adam Simpkins	5ffa268af2	use absolute includes for the native cext modules Summary: Update the C files under edenscm/mercurial/cext to use absolute includes from the repository root. Also update a few of the libraries in edenscm/mercurial that the cext code depends on. This makes these files easier to build with Buck in fbsource, and reduces the number of places where we have to use deprecated Buck functionality to help find these headers. This also allows autodeps to work with the build targets for these rules. Reviewed By: xavierd Differential Revision: D19958221 fbshipit-source-id: e6e471583a795ba5773bae5f16ed582c9c5fd57e	2020-02-19 13:05:06 -08:00
Lukas Piatkowski	c4f0887fc2	eden/scm: cover xdiff with autocargo Summary: Generate the Cargo.toml files inside xdiff with autocargo. This will enable Mononoke to depend on this code easily without sacrificing anything on eden/scm side. Reviewed By: aslpavel Differential Revision: D19948741 fbshipit-source-id: 905ff3d64b90830e5f075e4c6ed2b3de959e3f00	2020-02-19 05:15:17 -08:00
Xavier Deguillard	d8064b5e2a	types: add a Sha256 type Summary: This will be used in the LFS store. Reviewed By: DurhamG Differential Revision: D19895803 fbshipit-source-id: 4cf447987c10fed0b5c98904f20c841428965d89	2020-02-18 08:32:33 -08:00
Xavier Deguillard	17cc9ab5ab	revisionstore: add a wrapper around IndexedLog/RotateLog Summary: In some cases, higher level stores may want to store data in either a plain IndexedLog, or in a RotateLog, for local and shared data. Due to slight difference between the 2, they can't easily be adapted into a common trait. Instead let's just wrap both into an enum and implement the main functions that the higher level stores need. The first use of this will be the LfsStore, future use will include the IndexedLogDataStore and the IndexedLogHistoryStores. Reviewed By: DurhamG Differential Revision: D19859292 fbshipit-source-id: 920572e0cf5f69bda4901a727a6b0dc0f08fc8d0	2020-02-18 08:32:32 -08:00
Durham Goode	f530333e06	edenfs: update eden thrift types Summary: When I run make local it's creating changes in our checked in thrift types. I guess I need to check these in? Reviewed By: quark-zju Differential Revision: D19848706 fbshipit-source-id: 8a2e9a2617734eda41eade1f2645689362b1d75d	2020-02-17 14:52:29 -08:00
Xavier Deguillard	7bb3e384d8	remotefilelog: append the repo name to memcache key Summary: Up to now, this has been done in chef, and thus for repos that we do not list, they may share the memcache keys, with potential unintended consequences. Let's always add the repo name to the key, so we can simplify the code in chef. One small negative effect of this change is that while it is being rolled out, the cache hit rate will be impacted. This should resolve itself quickly. Reviewed By: DurhamG Differential Revision: D19885775 fbshipit-source-id: 0b59ce9e378b0ab70f696a39d19d27cd89921098	2020-02-14 14:10:48 -08:00
Xavier Deguillard	28564b228d	backingstore: do not fail if memcache can't be initialized Summary: Failing means that we fallback to the Python importer. Let's simply warn about it. Reviewed By: fanzeyi Differential Revision: D19897274 fbshipit-source-id: f9c63f5aa76015c28b31f00bba98244f5c86e923	2020-02-14 09:00:27 -08:00
Jun Wu	03baa31789	indexedlog: switch from bytes to minibytes Summary: This makes it possible to use `Bytes` for mmap buffers. The changes are because `minibytes::Bytes` does not implement `From<&[u8]>` with the intention to make slice copy explicit. Reviewed By: xavierd Differential Revision: D19818719 fbshipit-source-id: c34ee451bfd2dc7bcbbcebd52a76444b6c236849	2020-02-12 13:57:37 -08:00
Xavier Deguillard	49c953bc7e	backingstore: plumb the MemcacheStore Summary: EdenFS will now be able to fetch blobs directly from memcache. This won't have any big benefits as no blobs are in memcache right now, but over time, this will significantly reduce the cost of fetching blobs. Reviewed By: fanzeyi Differential Revision: D19861643 fbshipit-source-id: c2e9d317bd30d4656bf0b3f8897794161697761a	2020-02-12 13:43:00 -08:00
Xavier Deguillard	a4b83e384a	revisionstore: add tracing point for memcache Summary: These tracing points will help us understand the memcache hit rate as well as the fetching speed. Reviewed By: quark-zju Differential Revision: D19836499 fbshipit-source-id: 1936c44efc3e7715069e6a959f5331139d591d5c	2020-02-12 10:38:59 -08:00
Xavier Deguillard	2c4a10bf4b	revisionstore: move memcache set to a background thread Summary: Every time a cache miss is seen, the data fetched from the server will be sent directly to memcache for future use. Unfortunately, doing so in a blocking manner severely impacts the overall fetching speed from the server. Since memcache is purely an optimization, we can afford to send data to it asynchronously. Let's move as much as possible of the code to a background thread to reduce the overhead of memcache. Reviewed By: DurhamG Differential Revision: D19836011 fbshipit-source-id: 68e506ef7464d6e99d98457d0d37178f514be1a9	2020-02-12 10:38:59 -08:00
Xavier Deguillard	dc7f7908ef	revisionstore: prefetch data with get_iter Summary: Instead of fetching data one-by-one, let's prefetch data concurrently by using the new get_iter function. Reviewed By: DurhamG Differential Revision: D19836009 fbshipit-source-id: 4a50328c0cbbba677c2de3777ebe4c34cb10c1e2	2020-02-12 10:38:58 -08:00
Xavier Deguillard	8b082a18f7	revisionstore: don't prefetch with an empty key set Summary: Even when memcache would be able to prefetch everything, this would always call into the underlying remote store with an empty key set. For things like `hg prefetch` and a large number of keys, the effect of doing that is minimal, but for EdenFS or `hg log -p`, the roundtrip to the server for every file/revision would add a significant amount of overhead. Let's simply stop iterating when we no longer need to fetch anything. Reviewed By: DurhamG Differential Revision: D19835797 fbshipit-source-id: 54ad704428c3b20d973cfa87f7171899ec44b3f9	2020-02-11 18:05:16 -08:00
Jun Wu	abb8ccb346	minibytes: add serde support Summary: See also https://github.com/serde-rs/bytes/. This will be used in the `dag` crate. Reviewed By: DurhamG Differential Revision: D19770858 fbshipit-source-id: 2a870a564e0ceecdc7a4667853b2b2a5ea4ce6e3	2020-02-07 14:21:39 -08:00
Jun Wu	8029cd3878	minibytes: port benchmark from tokio/bytes Summary: Performance looks okay comparing with tokio/bytes v0.5.4: minibytes: test clone_arc_vec ... bench: 16,542 ns/iter (+/- 1,524) test clone_shared ... bench: 16,211 ns/iter (+/- 596) test clone_static ... bench: 1,437 ns/iter (+/- 502) test deref_shared ... bench: 367 ns/iter (+/- 101) test deref_static ... bench: 366 ns/iter (+/- 1) test deref_unique ... bench: 367 ns/iter (+/- 4) test from_long_slice ... bench: 91 ns/iter (+/- 18) = 1406 MB/s test slice_empty ... bench: 10,382 ns/iter (+/- 104) test slice_short_from_arc ... bench: 23,823 ns/iter (+/- 1,411) tokio/bytes: test clone_arc_vec ... bench: 16,213 ns/iter (+/- 1,864) test clone_shared ... bench: 18,685 ns/iter (+/- 634) test clone_static ... bench: 3,983 ns/iter (+/- 163) test deref_shared ... bench: 366 ns/iter (+/- 26) test deref_static ... bench: 373 ns/iter (+/- 36) test deref_unique ... bench: 391 ns/iter (+/- 33) test from_long_slice ... bench: 67 ns/iter (+/- 7) = 1910 MB/s test slice_empty ... bench: 15,149 ns/iter (+/- 1,708) test slice_short_from_arc ... bench: 36,541 ns/iter (+/- 3,485) clone_static is faster because minibytes doesn't call into vtable's clone. from_long_slice is slower because minibytes uses Arc unconditionally while bytes can avoid Arc overhead if refcount is 1. Reviewed By: DurhamG Differential Revision: D19770857 fbshipit-source-id: 5bafcc57a38c68baccfcafd3906f1a47b2bf4530	2020-02-07 14:21:39 -08:00
Jun Wu	108f1c947a	minibytes: minimalist zero-copy Bytes with mmap support Summary: This crate provides the core features of the commonly known `Bytes` crate: zero-copy slicing and cloning, while also supporting mmap-backed buffers. The main motivation is to replace `Mmap` in `indexedlog`. That has multiple benefits: - Handles 0-sized mmap more cleanly. - Handles clones more cleanly. - Gain the flexibility to zero-copy data without lifetime / reference. - Gain the flexibility to switch to non-mmap data. The `bytes::Bytes` crate does not yet support mmap buffers as of its latest release (0.5.4). Implementation wise, `minibytes::Bytes` uses `Option<Arc<dyn Trait>>` for the "trait object". This makes implementing the mmap storage just one line. `bytes 0.5.4` re-invents the "trait object" manually using unsafe code. It requires about 50 lines to implement the mmap storage (in D19756122). Reviewed By: xavierd Differential Revision: D19770856 fbshipit-source-id: 8cfa7052a18ac2e0cd6348b77d5e2a4acc61195c	2020-02-07 14:21:38 -08:00
Jun Wu	69aa37f23b	tracing: limit column width on ASCII output Summary: This makes the output more readable even if the "name" of a span is very long. Reviewed By: DurhamG Differential Revision: D19780536 fbshipit-source-id: dce0d3777409c32b0752db51341a572addb823ea	2020-02-06 15:46:53 -08:00
Xavier Deguillard	6ea4bb998e	revisionstore: move memcache initialization to a background thread Summary: As initializing the memcache client takes ~0.7s, let's move it to a background thread as to not impact Mercurial startup time. This diff uses ArcSwap in order to reduce the overhead of the very common read paths as much as possible. Using Mutex or RwLock instead would have caused unnecessary contention. Reviewed By: DurhamG Differential Revision: D19518693 fbshipit-source-id: 886e9b86813fda6ff005ccce99659890026f643a	2020-02-05 14:01:54 -08:00


562 Commits