sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-10 16:57:49 +03:00

Author	SHA1	Message	Date
Jun Wu	556850e715	indexedlog: remove unused mmap utility functions Summary: Those utilities are no longer necessary since the new code uses Bytes. Reviewed By: xavierd Differential Revision: D19818717 fbshipit-source-id: 0b43af0f1eae1a4288e84d4170db058b27f80334	2020-02-28 09:23:57 -08:00
Jun Wu	aaf59c569d	indexedlog: replace Mmap with Bytes in Log Summary: This simplifies the code a bit and makes it cheaper to clone the Log. Reviewed By: xavierd Differential Revision: D19818716 fbshipit-source-id: bbf07b8b36009d53b63d8066ec422fc3c3796840	2020-02-28 09:23:57 -08:00
Jun Wu	90ee3cb05a	indexedlog: remove ChecksumTable Summary: It's no longer used since Index now has inlined its checksum logic. Reviewed By: ikostia Differential Revision: D19850744 fbshipit-source-id: eb134e4c1613573a2d238710b44ad8119c80a5ee	2020-02-28 09:23:56 -08:00
Jun Wu	a1601bfdd9	indexedlog: bump index filename Summary: Change index filename and metadata name. This makes sure the new format and old format are separate so upgrading or downgrading won't have issues. Reviewed By: DurhamG Differential Revision: D19851355 fbshipit-source-id: 25dee018073a90040f5818b32b753a3f589c10e0	2020-02-28 09:23:56 -08:00
Jun Wu	6f4bf325d5	indexedlog: write Checksum inline with Log Summary: Enhance the index format: The Root entry can be followed by an optional Checksum entry which replaces the need of ChecksumTable. The format is backwards compatible since the old format will be just treated as "there is no ChecksumTable", and the ChecksumTable will be built on the next "flush". This change is non-trivial. But the tests are pretty strong - the bitflip test alone covered a lot of issues, and the dump of Index content helps a lot too. For the index itself without ".sum", checksum, this change is bi-directional compatible: 1. New code reading old file will just think the old file does not have the checksum entry, similar to new code having checksum disabled. 2. Old code will think the root+checksum slice is the "root" entry. Parsing the root entry is fine since it does not complain about unknown data at the end. However, this change dropped the logic updating ".sum" files. That part is an issue blocking old clients from reading new data. Reviewed By: DurhamG Differential Revision: D19850741 fbshipit-source-id: 551a45cd5422f1fb4c5b08e3b207a2ffe3d93dea	2020-02-28 09:23:55 -08:00
Jun Wu	b9e3046a8d	indexedlog: add Checksum entry to Index Summary: To solve the soundness issue of ChecksumTable raised by the last diff. I plan to move Checksum logic to Index. This has multiple benefits: - Solve the soundness issue of ChecksumTable. - Indexedlog no longer writes the ".sum" files. `atomic_write` can be quite slow (tens of milliseconds) on Windows. So this should help perf - with many indexes, it can save hundreds of milliseconds on Windows per indexedlog sync. This diff adds the definition and serialization of the new Checksum entry. The index format is not updated yet. Reviewed By: markbt Differential Revision: D19850742 fbshipit-source-id: df6e6ed12a12ef0d2a782dc9d6b4dc5dec3f4b46	2020-02-28 09:23:55 -08:00
Jun Wu	0f09413ed4	indexedlog: add a broken test showing checksum_table is racy Summary: With the last change, mmap cost is reduced, but ChecksumTable is unsound in a corner case: the buffer to check is shorter than what ChecksumTable covers: checksum: |----chunk----|----chunk----|----chunk--| buf: |-------------------------------| | ^ ^ logic len physical len The checksum table will be unable to verify the last chunk, since it does not have enough data in buf. The issue is exposed by stress testing the multithread sync tests. It's not always easy to reproduce, though. Reviewed By: markbt Differential Revision: D19850745 fbshipit-source-id: a1a96080163b7b9b56dcd6c1673d5d8d10e18a2b	2020-02-28 09:23:55 -08:00
Jun Wu	1e10527482	indexedlog: share Bytes between Index and ChecksumTable Summary: This avoids some extra mmap syscalls by ChecksumTable. Reviewed By: xavierd Differential Revision: D19818721 fbshipit-source-id: dace55193f2b4b0f35e3868781faa2d2998d3b58	2020-02-28 09:23:54 -08:00
Jun Wu	1ece621c4d	indexedlog: replace Mmap with Bytes in Index Summary: This simplifies the code a bit (no special cases about 0-sized mmap buffers) and makes it cheaper to clone the index buffer (just an Arc::clone, without another mmap syscall). Reviewed By: xavierd Differential Revision: D19818718 fbshipit-source-id: e96d42af74c7f0bb11703c5da31cdfbd5d76c372	2020-02-28 09:23:54 -08:00
Jun Wu	918672b106	tracing-collector: support owned strings in TreeSpans Summary: TreeSpans used to use `&str`, which adds a lifetime to the struct, making it harder to be used in the Python land. Use a type parameter so TreeSpans<String> can be used. Reviewed By: DurhamG Differential Revision: D19797708 fbshipit-source-id: c66429abfaf16d876151ca6f29da976bed91485d	2020-02-28 09:16:14 -08:00
Jun Wu	4cd7df6a01	tracing-collector: rename structs Summary: TreeSpan -> RawTreeSpan; TreeSpanWithMeta -> TreeSpanRef. I'm going to add a non-reference version of TreeSpanRef. Differential Revision: D19797701 fbshipit-source-id: 42b04c23d4d0ddbe821b94fa2ccb133ce9eafa05	2020-02-28 09:16:14 -08:00
Jun Wu	957617c8b8	tracing-collector: support filtering in TreeSpans Summary: The filtering interface allows callsite to select what they want. It's similar to manifest walk with files or directory matchers in source control. Reviewed By: DurhamG Differential Revision: D19784467 fbshipit-source-id: 5cf6e4016d6fa1c90f8aeccc50809baccd4af5ab	2020-02-28 09:16:13 -08:00
Jun Wu	366e701239	tracing-collector: support Events in TreeSpans Summary: The idea is that instants (events) can be a drop-in replacement for `ui.log`. Reviewed By: DurhamG Differential Revision: D19782897 fbshipit-source-id: 795bbba23d921e460f723f19ef529b203aea366a	2020-02-28 09:16:13 -08:00
Jun Wu	d205592d42	tracing-collector: extract logic finding parent span to a function Summary: This function will be reused by the next diff. Reviewed By: DurhamG Differential Revision: D19782895 fbshipit-source-id: 1e636eabee9b0dffd287a1e6784a24ab2259f51f	2020-02-28 09:16:13 -08:00
Jun Wu	8b5fdc01fc	tracing-collector: put treespans into a struct Summary: This allows us to define methods on the treespans, such as filtering APIs. Reviewed By: DurhamG Differential Revision: D19782896 fbshipit-source-id: 2e7bd8344c0196e382728c26a8233abf944bbf29	2020-02-28 09:16:12 -08:00
David Tolnay	37a8401761	rust/thrift: Un-rename futures-preview dependency Summary: The Thrift generated code depends only on futures 0.3, not 0.1. Thus it isn't necessary to depend on renamed:futures-preview and we can depend on futures-preview directly, which is exposed to Rust code as `futures::`. Reviewed By: jsgf Differential Revision: D20145921 fbshipit-source-id: 5cae94ec6747a374c2bf05f124ab237c798de005	2020-02-27 22:27:58 -08:00
Xavier Deguillard	6fac9ebad0	revisionstore: add a get_stripped method to ContentStore Summary: This new method returns the content of a blob without the copy-from metadata header. Reviewed By: DurhamG Differential Revision: D20102889 fbshipit-source-id: e96f636b7d30460b59707a2cb700d667e616116a	2020-02-27 12:29:42 -08:00
Jun Wu	bce29c9562	nameset: UnionSet Summary: Similar to Mercurial's smartset.addset. Reviewed By: sfilipco Differential Revision: D19912106 fbshipit-source-id: 0d0c8d0b71d2757259d26295eb4a564fea807dea	2020-02-27 07:34:57 -08:00
Jun Wu	c906a21ce1	nameset: initial NameSet abstraction Summary: The NameSet is something similar to SpanSet and Mercurial's smartset but speaks VertexNames instead of Ids. The idea is, NameSet will be part of NameDag APIs, and potentially replace Mercurial's smartset layer (just smartset the container types, not the revset language), in a way that revision numbers are completely hidden behind the scenes. This diff adds some basic abstraction around iteration-related operations. Other operations will be added later. Reviewed By: sfilipco Differential Revision: D19912109 fbshipit-source-id: 504a26c074282ec51f260535ca63e943124f688e	2020-02-27 07:34:57 -08:00
Adam Simpkins	0ffcf3e450	update the Rust `print_status()` function to take an `IO` parameter Summary: Update the `print_status()` function to take a `clidispatch::io::IO` object as a parameter, instead of a simple output object. This will allow us to also print error messages from this function in a future diff. Reviewed By: quark-zju Differential Revision: D19958504 fbshipit-source-id: bf482fdc4420e1350363a730c6a539cd760aef25	2020-02-26 14:54:40 -08:00
Adam Simpkins	b22fc79e4b	clean up PathRelativizer API usage of Path vs PathBuf Summary: Fix the PathRelativizer APIs to accept `Path` and even `str` arguments instead of just `PathBuf`. The old code required a `PathBuf`, which often forced callers to make a copy of the path data. Reviewed By: quark-zju Differential Revision: D19958505 fbshipit-source-id: 6fa40dd4b75df4e3faf9ad2ae4f0e4e6595669f6	2020-02-24 15:38:36 -08:00
Xavier Deguillard	934b64397b	convert to bytes 0.5 Summary: The bytes 0.5 is a dependency of newer tokio, it's also newer, and thus better. Staying on 0.4 means that copies between Bytes 0.4 and 0.5 need to be done, this will be especially bad in the LFS code since 10+MB buffer will have to be copied... One main API change is for the configparser. The code used to take Into<Bytes> for the keys, I switched it to AsRef<[u8]>. For hg_memcache_client, an extra copy is performed to build a Delta, since this code uses an old tokio, and is being replaced right now, the effort of switching to a new tokio and new bytes was not deemed worth it, the copy will do for now. Reviewed By: dtolnay Differential Revision: D20043137 fbshipit-source-id: 395bfc3749a3b1bdfea652262019ac6a086e61e0	2020-02-24 10:28:46 -08:00
Jun Wu	142937c2f8	cargo: bump serde_cbor to 0.11 Summary: Follow up of D20024491. Reviewed By: sfilipco Differential Revision: D20043585 fbshipit-source-id: f66896c8f41c3918fb37611d87fa26c39cdecef1	2020-02-21 14:08:43 -08:00
Xavier Deguillard	44c4f2f5d9	revisionstore: add copyfrom information to the LFS pointer Summary: Mercurial filenode hash is computed by including the copy information in the blob header. Before computing the blob content hash, or returning it to the upper layers, we need to either strip or reconstruct this header appropriately. Reviewed By: DurhamG Differential Revision: D19975887 fbshipit-source-id: 7555e7219e50f4d18ec677fdecc216ee705d7af4	2020-02-20 14:28:52 -08:00
Xavier Deguillard	7fb75ce4f0	lfs: move contenthash computation to the enum impl Summary: This will make it easier to support more hash schemes in the future. Reviewed By: DurhamG Differential Revision: D19975888 fbshipit-source-id: 8b8ce3b20d72199bac3cd20a48475b5ab56bfc52	2020-02-20 14:28:52 -08:00
Xavier Deguillard	cd56a8b39a	revisionstore: move Arc outside of the stores Summary: With the Arc embedded into the store themselves, this forces a second allocation in order to use them as trait objects. Since in most cases, we do not want the stores themselves to be cloneable, we can move the Arc outside and thus reduce the number of pointer indirection. Reviewed By: DurhamG Differential Revision: D19867568 fbshipit-source-id: 9cd126831fe2b9ee715472ac3299b7a09df95fce	2020-02-20 14:28:52 -08:00
Xavier Deguillard	7c1a623d8a	revisionstore: add the LfsStore to the ContentStore Summary: The ContentStore now can read LFS blobs from both the shared cache, and the local store. Reviewed By: DurhamG Differential Revision: D19866249 fbshipit-source-id: a6fb3523495e9d3832613b56438f631cfa552b91	2020-02-20 14:28:51 -08:00
Xavier Deguillard	58d9d92e88	revisionstore: simplify ContentStore/MetadataStore initialization a bit Summary: With the LFS store being added, and the indexedlog being soon used for trees, this simplification should help in formalizing the hierarchy of files/folders. It will look like the following: <root dir>/lfs: for the lfs store <root dir>/indexedlog*: for the indexedlog <root dir>/foobar: for a hypothetical foobar store For manifests, <root dir> will therefore be: <store dir>/manifests. The unfortunate part is that the current tree data lives under <store dir>/packs/manifests. As packfiles will be replaced, this small discrepancy is acceptable. Reviewed By: DurhamG Differential Revision: D19866248 fbshipit-source-id: 7ef59ef7df19149b19a529b4f4a45a479cc9d23b	2020-02-20 14:28:51 -08:00
Xavier Deguillard	f512b5658d	revisionstore: add an LfsStore Summary: This is the first step in having a stronger integration between LFS blobs and the ContentStore abstraction. The 2 main differences between the Python based LFS implementation and this one are: - pointers are not stored alongside plain data, - blobs are split between local and shared blobs As of now, no reclamation is being performed for shared blobs, blobs aren't fetched or uploaded. This will come in future diffs. Reviewed By: DurhamG Differential Revision: D19859291 fbshipit-source-id: 45000fc574e6fbd6d3487f4966cad4f49dab731c	2020-02-20 14:28:51 -08:00
Adam Simpkins	5ffa268af2	use absolute includes for the native cext modules Summary: Update the C files under edenscm/mercurial/cext to use absolute includes from the repository root. Also update a few of the libraries in edenscm/mercurial that the cext code depends on. This makes these files easier to build with Buck in fbsource, and reduces the number of places where we have to use deprecated Buck functionality to help find these headers. This also allows autodeps to work with the build targets for these rules. Reviewed By: xavierd Differential Revision: D19958221 fbshipit-source-id: e6e471583a795ba5773bae5f16ed582c9c5fd57e	2020-02-19 13:05:06 -08:00
Lukas Piatkowski	c4f0887fc2	eden/scm: cover xdiff with autocargo Summary: Generate the Cargo.toml files inside xdiff with autocargo. This will enable Mononoke to depend on this code easily without sacrificing anything on eden/scm side. Reviewed By: aslpavel Differential Revision: D19948741 fbshipit-source-id: 905ff3d64b90830e5f075e4c6ed2b3de959e3f00	2020-02-19 05:15:17 -08:00
Xavier Deguillard	d8064b5e2a	types: add a Sha256 type Summary: This will be used in the LFS store. Reviewed By: DurhamG Differential Revision: D19895803 fbshipit-source-id: 4cf447987c10fed0b5c98904f20c841428965d89	2020-02-18 08:32:33 -08:00
Xavier Deguillard	17cc9ab5ab	revisionstore: add a wrapper around IndexedLog/RotateLog Summary: In some cases, higher level stores may want to store data in either a plain IndexedLog, or in a RotateLog, for local and shared data. Due to slight difference between the 2, they can't easily be adapted into a common trait. Instead let's just wrap both into an enum and implement the main functions that the higher level stores need. The first use of this will be the LfsStore, future use will include the IndexedLogDataStore and the IndexedLogHistoryStores. Reviewed By: DurhamG Differential Revision: D19859292 fbshipit-source-id: 920572e0cf5f69bda4901a727a6b0dc0f08fc8d0	2020-02-18 08:32:32 -08:00
Durham Goode	f530333e06	edenfs: update eden thrift types Summary: When I run make local it's creating changes in our checked in thrift types. I guess I need to check these in? Reviewed By: quark-zju Differential Revision: D19848706 fbshipit-source-id: 8a2e9a2617734eda41eade1f2645689362b1d75d	2020-02-17 14:52:29 -08:00
Xavier Deguillard	7bb3e384d8	remotefilelog: append the repo name to memcache key Summary: Up to now, this has been done in chef, and thus for repos that we do not list, they may share the memcache keys, with potential unintended consequences. Let's always add the repo name to the key, so we can simplify the code in chef. One small negative effect of this change is that while it is being rolled out, the cache hit rate will be impacted. This should resolve itself quickly. Reviewed By: DurhamG Differential Revision: D19885775 fbshipit-source-id: 0b59ce9e378b0ab70f696a39d19d27cd89921098	2020-02-14 14:10:48 -08:00
Xavier Deguillard	28564b228d	backingstore: do not fail if memcache can't be initialized Summary: Failing means that we fallback to the Python importer. Let's simply warn about it. Reviewed By: fanzeyi Differential Revision: D19897274 fbshipit-source-id: f9c63f5aa76015c28b31f00bba98244f5c86e923	2020-02-14 09:00:27 -08:00
Jun Wu	03baa31789	indexedlog: switch from bytes to minibytes Summary: This makes it possible to use `Bytes` for mmap buffers. The changes are because `minibytes::Bytes` does not implement `From<&[u8]>` with the intention to make slice copy explicit. Reviewed By: xavierd Differential Revision: D19818719 fbshipit-source-id: c34ee451bfd2dc7bcbbcebd52a76444b6c236849	2020-02-12 13:57:37 -08:00
Xavier Deguillard	49c953bc7e	backingstore: plumb the MemcacheStore Summary: EdenFS will now be able to fetch blobs directly from memcache. This won't have any big benefits as no blobs are in memcache right now, but over time, this will significantly reduce the cost of fetching blobs. Reviewed By: fanzeyi Differential Revision: D19861643 fbshipit-source-id: c2e9d317bd30d4656bf0b3f8897794161697761a	2020-02-12 13:43:00 -08:00
Xavier Deguillard	a4b83e384a	revisionstore: add tracing point for memcache Summary: These tracing points will help us understand the memcache hit rate as well as the fetching speed. Reviewed By: quark-zju Differential Revision: D19836499 fbshipit-source-id: 1936c44efc3e7715069e6a959f5331139d591d5c	2020-02-12 10:38:59 -08:00
Xavier Deguillard	2c4a10bf4b	revisionstore: move memcache set to a background thread Summary: Every time a cache miss is seen, the data fetched from the server will be sent directly to memcache for future use. Unfortunately, doing so in a blocking manner severely impacts the overall fetching speed from the server. Since memcache is purely an optimization, we can afford to send data to it asynchronously. Let's move as much as possible of the code to a background thread to reduce the overhead of memcache. Reviewed By: DurhamG Differential Revision: D19836011 fbshipit-source-id: 68e506ef7464d6e99d98457d0d37178f514be1a9	2020-02-12 10:38:59 -08:00
Xavier Deguillard	dc7f7908ef	revisionstore: prefetch data with get_iter Summary: Instead of fetching data one-by-one, let's prefetch data concurrently by using the new get_iter function. Reviewed By: DurhamG Differential Revision: D19836009 fbshipit-source-id: 4a50328c0cbbba677c2de3777ebe4c34cb10c1e2	2020-02-12 10:38:58 -08:00
Xavier Deguillard	8b082a18f7	revisionstore: don't prefetch with an empty key set Summary: Even when memcache would be able to prefetch everything, this would always call into the underlying remote store with an empty key set. For things like `hg prefetch` and a large number of keys, the effect of doing that is minimal, but for EdenFS or `hg log -p`, the roundtrip to the server for every file/revision would add a significant amount of overhead. Let's simply stop iterating when we no longer need to fetch anything. Reviewed By: DurhamG Differential Revision: D19835797 fbshipit-source-id: 54ad704428c3b20d973cfa87f7171899ec44b3f9	2020-02-11 18:05:16 -08:00
Jun Wu	abb8ccb346	minibytes: add serde support Summary: See also https://github.com/serde-rs/bytes/. This will be used in the `dag` crate. Reviewed By: DurhamG Differential Revision: D19770858 fbshipit-source-id: 2a870a564e0ceecdc7a4667853b2b2a5ea4ce6e3	2020-02-07 14:21:39 -08:00
Jun Wu	8029cd3878	minibytes: port benchmark from tokio/bytes Summary: Performance looks okay comparing with tokio/bytes v0.5.4: minibytes: test clone_arc_vec ... bench: 16,542 ns/iter (+/- 1,524) test clone_shared ... bench: 16,211 ns/iter (+/- 596) test clone_static ... bench: 1,437 ns/iter (+/- 502) test deref_shared ... bench: 367 ns/iter (+/- 101) test deref_static ... bench: 366 ns/iter (+/- 1) test deref_unique ... bench: 367 ns/iter (+/- 4) test from_long_slice ... bench: 91 ns/iter (+/- 18) = 1406 MB/s test slice_empty ... bench: 10,382 ns/iter (+/- 104) test slice_short_from_arc ... bench: 23,823 ns/iter (+/- 1,411) tokio/bytes: test clone_arc_vec ... bench: 16,213 ns/iter (+/- 1,864) test clone_shared ... bench: 18,685 ns/iter (+/- 634) test clone_static ... bench: 3,983 ns/iter (+/- 163) test deref_shared ... bench: 366 ns/iter (+/- 26) test deref_static ... bench: 373 ns/iter (+/- 36) test deref_unique ... bench: 391 ns/iter (+/- 33) test from_long_slice ... bench: 67 ns/iter (+/- 7) = 1910 MB/s test slice_empty ... bench: 15,149 ns/iter (+/- 1,708) test slice_short_from_arc ... bench: 36,541 ns/iter (+/- 3,485) clone_static is faster because minibytes doesn't call into vtable's clone. from_long_slice is slower because minibytes uses Arc unconditionally while bytes can avoid Arc overhead if refcount is 1. Reviewed By: DurhamG Differential Revision: D19770857 fbshipit-source-id: 5bafcc57a38c68baccfcafd3906f1a47b2bf4530	2020-02-07 14:21:39 -08:00
Jun Wu	108f1c947a	minibytes: minimalist zero-copy Bytes with mmap support Summary: This crate provides the core features of the commonly known `Bytes` crate: zero-copy slicing and cloning, while also supporting mmap-backed buffers. The main motivation is to replace `Mmap` in `indexedlog`. That has multiple benefits: - Handles 0-sized mmap more cleanly. - Handles clones more cleanly. - Gain the flexibility to zero-copy data without lifetime / reference. - Gain the flexibility to switch to non-mmap data. The `bytes::Bytes` crate does not yet support mmap buffers as of its latest release (0.5.4). Implementation wise, `minibytes::Bytes` uses `Option<Arc<dyn Trait>>` for the "trait object". This makes implementing the mmap storage just one line. `bytes 0.5.4` re-invents the "trait object" manually using unsafe code. It requires about 50 lines to implement the mmap storage (in D19756122). Reviewed By: xavierd Differential Revision: D19770856 fbshipit-source-id: 8cfa7052a18ac2e0cd6348b77d5e2a4acc61195c	2020-02-07 14:21:38 -08:00
Jun Wu	69aa37f23b	tracing: limit column width on ASCII output Summary: This makes the output more readable even if the "name" of a span is very long. Reviewed By: DurhamG Differential Revision: D19780536 fbshipit-source-id: dce0d3777409c32b0752db51341a572addb823ea	2020-02-06 15:46:53 -08:00
Xavier Deguillard	6ea4bb998e	revisionstore: move memcache initialization to a background thread Summary: As initializing the memcache client takes ~0.7s, let's move it to a background thread so as not to impact Mercurial startup time. This diff uses ArcSwap in order to reduce the overhead of the very common read paths as much as possible. Using Mutex or RwLock instead would have caused unnecessary contention. Reviewed By: DurhamG Differential Revision: D19518693 fbshipit-source-id: 886e9b86813fda6ff005ccce99659890026f643a	2020-02-05 14:01:54 -08:00
Xavier Deguillard	b8947748b5	pyrevisionstore: expose the memcache client to python Summary: This allows the Python code to build a memcache client and build ContentStore and MetadataStore with it. Reviewed By: DurhamG Differential Revision: D19518694 fbshipit-source-id: d932fd5223ccfdf37db69cbb54a11a6571312709	2020-02-05 14:01:54 -08:00
Xavier Deguillard	920ea27a17	revisionstore: add memcache client Summary: This enables an in-process memcache client for the Rust ContentStore/MetadataStore. For now, this implementation is lacking several necessary optimizations: - Start-up time is always slowed down by ~0.7s, the initialization will be moved to a background thread - Writing data to memcache is blocking and will be moved to a background thread too. - Prefetching data does a roundtrip to memcache for every key, batching memcache APIs will be added. Compared to the existing hg_memcache_client, this implementation is both significantly shorter and does not exhibit some of the pathological behavior of having to flush the indexedlog for every fetched blob when used in Eden. Reviewed By: DurhamG Differential Revision: D19518696 fbshipit-source-id: 4725447d13e7eddd9586135c2511e13ddb921771	2020-02-05 14:01:53 -08:00
Jun Wu	7316c4cc22	cpython-ext: add a way to wrap Rust Write object into a Python object Summary: The library already has a way to wrap a Python object into a Rust object that exposes the Rust Read/Write interface. This is the reverse direction for the Write interface. The initial intention is to expose Rust stdout as described in D19702533. However, I found Python's `sys.stdout.buffer` also enforces utf-8 encoding on Windows (unless PYTHONLEGACYWINDOWSSTDIO is set). So Python's stdout actually behaves similarly with Rust's stdout on Windows and is okay to use. That said, it's still useful to have this abstraction, for streampager [1] integration. [1]: https://github.com/markbt/streampager/ Reviewed By: sfilipco Differential Revision: D19716127 fbshipit-source-id: ba39898122561d9a49b7080ee95d7c940540eb40	2020-02-04 18:41:13 -08:00
David Tolnay	34a520536a	Update rustfmt and reformat fbsource Summary: ``` $ tools/third-party/rustfmt/rustfmt --version rustfmt 1.4.11-nightly (1838235 2019-12-03) ``` Reviewed By: zertosh Differential Revision: D19704678 fbshipit-source-id: fe8707e964495e76746edcb8b68e34fc1411f52a	2020-02-04 17:14:27 -08:00
Jun Wu	3e0b781197	py3: only use binary stdin/stdout/stderr Summary: Drop stdoutbytes/stdinbytes. They make things unnecessarily complicated (especially for chg / Rust dispatch entry point). The new idea is that IO uses bytes. Text is written in utf-8 (Python 3) or local encoding (Python 2). To make stdout behave reasonably on systems not using utf-8 locale (ex. Windows), we might add a Rust binding to Rust's stdout, which does the right thing: - When writing to stdout console, expect text to be utf-8 encoded and do proper decoding. - When writing to stdout file, write the raw bytes without translation. Note Python's `sys.stdout.buffer` does not do translation when writing to stdout console like Rust's stdout. For now, my main motivation of this change is to fix chg on Python 3. Reviewed By: xavierd Differential Revision: D19702533 fbshipit-source-id: 74704c83e1b200ff66fb3a2d23d97ff21c7239c8	2020-02-03 18:26:57 -08:00
Mateusz Kwapich	e2dc4e8014	diff: update for py3 Summary: All diff functions are (bytes, bytes) -> bytes to preserve the original file encoding. Because of that I had to add ui.writebytes output function that accepts bytes for terminal output. Reviewed By: farnz Differential Revision: D19656673 fbshipit-source-id: b9a1e4361e825fc8c2313e8402c2bbe00f490dd4	2020-01-31 13:00:23 -08:00
Mark Thomas	18ecb01b8a	mutationstore: update tests so that user is now a string Summary: D19649887 changed mutation entry users to be strings. Update the tests accordingly. Reviewed By: simpkins Differential Revision: D19656792 fbshipit-source-id: fcff677099dc0200130bf30eadaaf66822c6139c	2020-01-30 19:54:45 -08:00
Mark Thomas	914607cac7	cpython-ext: add PyPath for references to paths Summary: `PyPath` is to `PyPathBuf` as `Path` is to `PathBuf` and `str` is to `String`. Reviewed By: quark-zju Differential Revision: D19647995 fbshipit-source-id: 841a5f6fea295bc72b00da028ae256ca38578504	2020-01-30 17:33:35 -08:00
Durham Goode	b567c16b60	py3: make mutation markers 'user' utf8 Summary: Usernames are utf8, so let's make mutationmarker treat them as such. Reviewed By: xavierd Differential Revision: D19649887 fbshipit-source-id: 3f8b2db434a57ee8ee3017de8d925c19a2002b20	2020-01-30 15:22:24 -08:00
Mark Thomas	13b7a759a2	cpython-ext: add PyNone, a marker struct for functions that can only return None Summary: Add `PyNone`. This is a marker struct that indicates that a python function can only return `PyNone`. Reviewed By: xavierd Differential Revision: D19644338 fbshipit-source-id: f846b146237ebf7de996177494934fec662cde0f	2020-01-30 12:28:38 -08:00
Mark Thomas	6b8042662a	cpython_ext: rename PyPath to PyPathBuf Summary: `PyPath` is the type that owns the data. Rename it to `PyPathBuf` for analogy with `PathBuf` and `RepoPathBuf`, and to allow us to introduce a reference type named `PyPath`. Reviewed By: xavierd Differential Revision: D19643797 fbshipit-source-id: 56d80fea5677f7223e967b0723039d1763b26f68	2020-01-30 11:06:24 -08:00
Jun Wu	c5dd6829c7	cpython-ext: add more utilities for PyPath Summary: Make the type easier to use. Namely, the treestate bindings want PyPath <-> bytes since treestate internally uses bytes. Reviewed By: xavierd Differential Revision: D19635357 fbshipit-source-id: 37d1889b5da1d7f3869bb7820de0219b87b71a8b	2020-01-30 08:27:33 -08:00
Mark Thomas	1e63f205f4	rust-cpython: allow compilation for both py2 and py3 Summary: Set up the `cpython-ext` and `hgcommands` libraries so that they can compile against py2 and py3 versions of rust-cpython. Make py2 the default so that cargo test still works. Reviewed By: singhsrb Differential Revision: D19615656 fbshipit-source-id: 3403e7077deb3c0a9dfe0e3b7d4f4ad1da73bba3	2020-01-28 20:17:20 -08:00
Adam Simpkins	ad957e7803	py3: update Rust hgcommands code to pass argv to python as Str Summary: Update the Rust hgcommands code to pass the command line arguments into the Python logic as `Str` types, so that this will be Unicode `str` objects when using Python 3. Reviewed By: xavierd Differential Revision: D19596739 fbshipit-source-id: 7cdfd44a1c4ce8b0f86d20b634d9b27eab822b2d	2020-01-28 15:58:37 -08:00
Jun Wu	ed3a2b2247	cpython-ext: add missed types dep Summary: This is incorrectly removed due to a bad rebase / merge. Reviewed By: DurhamG Differential Revision: D19607801 fbshipit-source-id: a6ee7a3f184ff1882eb1f1513f7fed74a7108727	2020-01-28 13:50:14 -08:00
Jun Wu	8703970cea	py3: update Cargo.toml to make py3 buildable Summary: This makes `make hg3` work. It requires cleaning up the `build` directory when switching between py2 and py3 build, which will be fixed later. Reviewed By: DurhamG Differential Revision: D19604824 fbshipit-source-id: 060ff313420126a5dba935c4451b45dc9af45f13	2020-01-28 13:39:38 -08:00
Xavier Deguillard	d087e39a34	pypathmatcher: use PyPath instead of PyByte Reviewed By: DurhamG Differential Revision: D19592136 fbshipit-source-id: 5db6ca629cd920d52ffbf7f10963c44c8f7b203d	2020-01-28 12:40:48 -08:00
Adam Simpkins	beff6fdea7	py3: add additional from() conversion methods for Str Summary: Add methods to convert to a `Str` object from `String` and from `Vec<u8>` Reviewed By: xavierd Differential Revision: D19596743 fbshipit-source-id: 6499f7f1b8329f4d14ce8179a41ed46982a85c8e	2020-01-28 12:25:39 -08:00
Mark Thomas	4fe02f3607	bindings: update to rust-cpython 0.4 Summary: Update to the new version of rust-cpython. This supports `list.append`, so make use of it. Reviewed By: xavierd Differential Revision: D19590905 fbshipit-source-id: 03609d4f698ae8e4380e82b8144caaa205b4c2d4	2020-01-28 10:46:33 -08:00
Stefan Filip	5720b9a2a1	py3/pymanifest: convert path types from PyBytes to PyPath Reviewed By: xavierd Differential Revision: D19594134 fbshipit-source-id: e8532a125aa2ed4b7740e669ad572fcbb327692f	2020-01-28 10:29:11 -08:00
Xavier Deguillard	283b120bb6	pyconfigparser: use PyPath instead of PyByte Summary: Also, add a util::path::strip_unc function that is more clear than the normalize_for_display Reviewed By: DurhamG Differential Revision: D19595961 fbshipit-source-id: 330bcb708bf64320a3562d79db685d6cb1e14f16	2020-01-28 10:14:14 -08:00
Xavier Deguillard	61aaf894c3	pyrevisionstore: use PyPath instead of PyBytes Summary: For Python3 compatibility, let's use PyPath, it hides the logic of encoding for Python2 Reviewed By: DurhamG Differential Revision: D19590024 fbshipit-source-id: 7bed134a500b266837f3cab9b10604e1f34cc4a0	2020-01-28 10:01:50 -08:00
Jun Wu	373073df47	py3/rust: cpython-ext: optionally show Python error traceback Summary: This is optional, but it helps investigating Python errors chained with other Rust errors. For example: error.RustError: failed fetching from store (, cc38739855a7f356b4a2aaac0a0a858fd646e6bf) Caused by: TypeError() Traceback (most recent call last): File "scm3/edenscm/hgext/remotefilelog/contentstore.py", line 53, in get chain = self.getdeltachain(name, node) File "scm3/edenscm/hgext/remotefilelog/contentstore.py", line 91, in getdeltachain chain = self._getpartialchain(name, node) File "scm3/edenscm/hgext/remotefilelog/contentstore.py", line 125, in _getpartialchain return store.getdeltachain(name, node) TypeError Without this diff there is only "TypeError()" without the traceback. This can be turned off by unsetting RUST_BACKTRACE. Reviewed By: markbt Differential Revision: D19581173 fbshipit-source-id: 74605b78146b6b1c9ddd5ad720dcd19ff73908a8	2020-01-27 18:56:10 -08:00
Xavier Deguillard	24ae9f9592	cpython-ext: fix python3 compile error Summary: The format_err is used in shared code too, we need to import it. Reviewed By: quark-zju Differential Revision: D19592591 fbshipit-source-id: bd344bf3c295473f4647235a98432d11c9678bf9	2020-01-27 16:58:42 -08:00
Xavier Deguillard	33ea1763ce	cpython-ext: add a PyPath type Summary: This will be used as an argument to the Rust bindings when using paths. This type is either a PyBytes in Python2, using the various encoding functions to convert into a String, or a PyUnicode in Python3 with no encoding change. Reviewed By: farnz Differential Revision: D19587890 fbshipit-source-id: 58903426585693193754691fe3c756b9097b35f6	2020-01-27 16:50:14 -08:00
Xavier Deguillard	e512c370fd	py3/rust: cpython-ext: set ob_size on raw PyObject Summary: Without this, Rust code using the feature (ex. lz4, used by lz4revlog) will panic. Reviewed By: sfilipco Differential Revision: D19581188 fbshipit-source-id: b499449df4fede27fe66cf8e5af57e8347a0dd48	2020-01-27 16:50:14 -08:00
Xavier Deguillard	f16bb04977	py3/rust: cpython-ext: support memoryview as PySimpleBuf Summary: Otherwise we got RustPanic when clindex or dagindex reads mmapped changelog.i. Reviewed By: sfilipco Differential Revision: D19581189 fbshipit-source-id: 3ee74a1bd000d58272551ae404dcfe7f957bb2c0	2020-01-27 16:50:13 -08:00
Xavier Deguillard	ad58839ca1	py3/rust: use Str type in cliparser and hgcommands Reviewed By: sfilipco Differential Revision: D19581176 fbshipit-source-id: e92e5c2538537ec16da25a9819c9a097a24a4d6e	2020-01-27 16:50:13 -08:00
Xavier Deguillard	1c697dbc49	py3/rust: cpython-ext: add Str type Summary: This converts to bytes on Python 2, but unicode on Python 3. Reviewed By: markbt Differential Revision: D19581180 fbshipit-source-id: 0de9056a01ae30810a72352387de5a940b37d7ab	2020-01-27 16:50:13 -08:00
Xavier Deguillard	789d2b5fbb	py3/rust: types: add AsRef<str> for RepoPath Summary: In a future diff, I have RepoPath in Rust and want to send unicode path to Python. Reviewed By: sfilipco Differential Revision: D19581184 fbshipit-source-id: 73a03707a6bdae4a497a8ee2c14314aa4ffefb6d	2020-01-27 16:50:12 -08:00
Kostia Balytskyi	6bf47a9f5a	hgtime: fix corner case of date range parsing Summary: The docs promise that both `<` and `>` bounds are inclusive, so let's fix that. Reviewed By: markbt Differential Revision: D19580840 fbshipit-source-id: 13770a8e9351fe62f58e9a701b526a167752543a	2020-01-27 09:37:00 -08:00
Stefan Filip	d78982a6e8	dag: move iddag to own file Summary: Separate Segment and IdDag in two individual files. This is preparation for refactoring IdDag to be more flexible in terms of storage. That will probably involve moving stuff out of IdDag into a new file that deals with the storage abstractions. Reviewed By: quark-zju Differential Revision: D19559127 fbshipit-source-id: b3b9b18e2653157e69148b1f29292a57b30016ec	2020-01-24 15:49:54 -08:00
Jun Wu	52af332c28	renderdag: add tests showing how orders affect rendering Summary: I wrote it to understand how renderdag draws the same graph with different orders. It seems useful for future optimization that tries to reduce the number of columns. So let's check it in. Reviewed By: xavierd Differential Revision: D19440713 fbshipit-source-id: 8bc580799f6b24c87886d5ac306020f50bb694e5	2020-01-23 20:50:56 -08:00
Jun Wu	29c749ef7d	dag: add fuzz tests on the octopus DAG Summary: This gives us some confidence about octopus merge handling. Reviewed By: DurhamG Differential Revision: D19540726 fbshipit-source-id: e84de74aecae54429483edd185d39fd1bd858f87	2020-01-23 17:58:51 -08:00
Jun Wu	8ac97da54e	bindag: make TestContext more flexible Summary: TestContext uses ParentRevs. That limits parents to at most 2. Use a type parameter so we can opt-in Vec<usize> for octopus merge support, at the cost of worse cache efficiency. Reviewed By: DurhamG Differential Revision: D19540727 fbshipit-source-id: f9e8de151b7b296fd6f0fd89be9de2b8996634c7	2020-01-23 17:58:51 -08:00
Jun Wu	df23791d08	bindag: add some octopus examples Summary: Our new algorithms support octopus merges. However there were no tests using octopus merges. This diff adds a simple one. Reviewed By: DurhamG Differential Revision: D19540728 fbshipit-source-id: 8411024f0b7e27c2ebfabbe1935496124c25df7b	2020-01-23 17:58:51 -08:00
Jun Wu	494bdae7cc	dag: add a fuzz test about range algorithm Summary: The test runs the old and new algorithm and compares their result. This is more interesting than using random numbers, since the fuzzing framework will try to explore new code paths. Reviewed By: sfilipco Differential Revision: D19511576 fbshipit-source-id: e9a2066769b54a60bb92643e5715f91a6fccbcb5	2020-01-23 17:58:50 -08:00
Jun Wu	78ea96cb9d	bindag: port range algorithm from hg Summary: The ported algorithm will work as a comparison to verify dag's range implementation. Reviewed By: sfilipco Differential Revision: D19511574 fbshipit-source-id: 589353d6e6c91b8d6707c977eeb8558ac733b525	2020-01-23 17:58:50 -08:00
Stefan Filip	f5280b75e9	thrift: update thrift generated files Summary: Commit updates after having run `make local` Reviewed By: xavierd Differential Revision: D19543278 fbshipit-source-id: 00fdc3ebec32e8a3d706b89402dc91f771984c3c	2020-01-23 16:06:51 -08:00
Xavier Deguillard	b6589bde84	revisionstore: prefetch takes &[Key] instead of Vec<Key> Summary: This can prevent potential moves and clones on the caller of prefetch. Reviewed By: quark-zju Differential Revision: D19518697 fbshipit-source-id: 63839fc3f4bb9ca420e290eabaffb481a3584f7b	2020-01-23 08:57:22 -08:00
Jun Wu	fff2cb833f	dag: add a fuzz test about gca algorithm Summary: The test runs the old and new algorithm and compares their result. This is more interesting than using random numbers, since the fuzzing framework will try to explore new code paths. This cannot run on stable Rust yet. I added a README for how to run it. Reviewed By: sfilipco Differential Revision: D19504096 fbshipit-source-id: 621da02c50a771dee9932f9d7a407cb1f412a543	2020-01-22 19:30:50 -08:00
Jun Wu	af85f4ff3b	bindag: add a way to get a subdag of a parsed bindag Summary: Sometimes the graph is too large. Provide a way to slice it. Reviewed By: sfilipco Differential Revision: D19511575 fbshipit-source-id: 504317d6894764043b23ea49dcf09c8cdea96961	2020-01-22 19:30:49 -08:00
Jun Wu	b5482f8976	bindag: add utilities for easier testing Summary: As we plan to test the dag crate with some other DAG implementation, add a convenient structure that sets up both DAG implementations. Reviewed By: sfilipco Differential Revision: D19503371 fbshipit-source-id: 3e9933ad37301bfac36eb1af6d82b4298af778b6	2020-01-22 19:30:49 -08:00
Jun Wu	7d11508dfa	bindag: port GCA algorithm from hg Summary: The ported algorithm will work as a comparison to verify dag's gca implementation. Reviewed By: sfilipco Differential Revision: D19503373 fbshipit-source-id: f5253db89fbcdc2fd02f3fdaa0796e24338b1fba	2020-01-22 19:30:49 -08:00
Jun Wu	a98d288938	bindag: apply smallvec optimization Summary: This is similar to D17581248. It will make the old linear-scan algorithm (which will be added later) about 5x faster. Reviewed By: sfilipco Differential Revision: D19503372 fbshipit-source-id: c65d7217e7b144603dadd57f54a5e70f513c8e51	2020-01-22 19:30:48 -08:00
Jun Wu	64271f24ba	dag: move bindag from benches to a separate crate Summary: This allows bindag to be used outside benches. Reviewed By: sfilipco Differential Revision: D19503374 fbshipit-source-id: 131061f7d1d28125875a86afc330dbb9634249cf	2020-01-22 19:30:48 -08:00
Xavier Deguillard	524c85d711	revisionstore: limit delta chain to 1000 entries Summary: We've seen a case where a datapack contains a circular delta chain, causing Mercurial to fall into an infinite loop when trying to read it. Let's fail when the chain is over 1000 entries. Reviewed By: quark-zju Differential Revision: D19458453 fbshipit-source-id: bfa503f7807122eca72cf94418abda161dafa41c	2020-01-21 08:50:59 -08:00
Jun Wu	124e275377	dag: make NameDag use MultiLog for data consistency Summary: This ensures IdMap and IdDag are guaranteed consistent in the storage layer. Reviewed By: DurhamG Differential Revision: D19432658 fbshipit-source-id: 00f1a9b4c747baa1f14d78c31d925682317463b4	2020-01-17 21:49:57 -08:00
Jun Wu	907aadcdd7	indexedlog: add MultiLog Summary: The MultiLog holds multiple Logs and can atomically sync them. Reviewed By: DurhamG Differential Revision: D19432659 fbshipit-source-id: 6ac7dc6f74468f985c6a6b0c419e888722a80037	2020-01-17 21:49:57 -08:00
Jun Wu	5aa872599c	indexedlog: make ScopedDirLock remember which directory gets locked Summary: This makes it possible to do extra sanity checks. Reviewed By: DurhamG Differential Revision: D19443783 fbshipit-source-id: 254c2537a6aadd25a67c5e48a768187ce65aa686	2020-01-17 21:49:56 -08:00
Jun Wu	2c9a3b9c61	indexedlog: use a method to create LogMetadata Summary: This makes the code overall shorter. Reviewed By: DurhamG Differential Revision: D19443552 fbshipit-source-id: abd1db227830a88549c7eca1cfd08b67c4914518	2020-01-17 21:49:56 -08:00
Zeyi (Rice) Fan	a431e64e4e	eden: periodically refresh content store to give it chances to release mapped files Summary: As reported by JT, EdenFS may hold the file descriptors of mapped pack files too long even after the files are deleted by external processes, thus taking up more disk space. This diff fixes this problem by making EdenFS periodically rescan the pack files. Reviewed By: chadaustin Differential Revision: D19395439 fbshipit-source-id: 4bfd6a7ac13dceb3099d2704d62b3825433aff4b	2020-01-17 15:00:01 -08:00
Jun Wu	733961456f	indexedlog: fix try_clone external key buffer handling Summary: In both cases (clone with or without dirty content), the external key buffers used by indexes should be re-created, since mem_buf location has changed. Reviewed By: DurhamG Differential Revision: D19432657 fbshipit-source-id: fe6f76e7ccfd16ccd2f5c1d89866687a3503603e	2020-01-17 03:58:00 -08:00

309 Commits