Commit Graph

44837 Commits

Author SHA1 Message Date
Maksim Solovjov
a42978939e Add exclude functionality to dirsync
Reviewed By: DurhamG

Differential Revision: D13607512

fbshipit-source-id: 80d48eab8fb49d209f856dd82ba084bf2c059150
2019-01-15 07:14:23 -08:00
Mark Thomas
9d6b58b5cd localrepo: add automigrate mechanism
Summary:
Generalise the `migrateonpull` mechanism of `treestate` into a generic
`automigrate` step that is invoked at the start of pulling.  This will be used
for other migrations in the future.

Reviewed By: liubov-dmitrieva

Differential Revision: D13608718

fbshipit-source-id: d558dc21176a6b8d786836d06414e3fc88a20d47
2019-01-15 07:05:46 -08:00
Mark Thomas
3b9eb801e1 types: use Fallible
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.

Differential Revision: D13657313

fbshipit-source-id: ae249bc15037cc2be019ce7ce8a440c153aa31cc
2019-01-15 03:50:47 -08:00
Mark Thomas
3570402d79 watchman_client: use Fallible
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.

Differential Revision: D13657312

fbshipit-source-id: 55134ee93f1f3aaaeefe5644a4a1f2285603bc1c
2019-01-15 03:50:47 -08:00
Mark Thomas
7f1258f091 commitcloudsubscriber: use Fallible
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.

Differential Revision: D13657314

fbshipit-source-id: f1a379089972f7f0066c49ddedf606d36b7ac260
2019-01-15 03:50:47 -08:00
Mark Thomas
d3709fde5b mononokeapi: use Fallible
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.

Differential Revision: D13657310

fbshipit-source-id: cae73fc239a6ad30bb6ef56a664d1ef5a2a19b5f
2019-01-15 03:50:47 -08:00
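The change in each of the four crates above is the same: drop the locally defined alias and use the one re-exported by `failure`. A minimal sketch of the pattern (the function below is illustrative, not code from the repo):

```
use failure::{format_err, Fallible};

// Instead of a locally defined `type Fallible<T> = Result<T, failure::Error>`,
// use the alias re-exported by the `failure` crate.
fn parse_port(s: &str) -> Fallible<u16> {
    // `?` converts the parse error into `failure::Error`.
    let port: u16 = s.parse()?;
    if port == 0 {
        return Err(format_err!("port must be non-zero"));
    }
    Ok(port)
}

fn main() -> Fallible<()> {
    println!("{}", parse_port("8080")?);
    Ok(())
}
```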
Xavier Deguillard
f170cceea2 revisionstore: Repackable::delete now takes the ownership of self.
Summary:
On some platforms, removing a file can fail if it's still mapped or opened. In
mercurial, this can happen during repack as the datapacks are removed while
still being mapped.

Reviewed By: DurhamG

Differential Revision: D13615938

fbshipit-source-id: fdc1ff9370e2767e52ee1828552f4598105f784f
2019-01-14 21:14:13 -08:00
Xavier Deguillard
736d0ba7d4 pyrevisionstore: Use a RefCell<Option<_>> instead of the {Data,History}Pack
Summary:
The cpython crate forces all the methods to take a `&self`, which forbids
modification of the embedded pack data structure. Its documentation recommends
using interior mutability for this purpose. Most of the code is wrapped to
avoid lots of boilerplate.

Reviewed By: DurhamG

Differential Revision: D13640638

fbshipit-source-id: 3b7513b6117d429322efe32868e683239c68806e
2019-01-14 21:14:13 -08:00
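A minimal sketch of the interior-mutability pattern being described, in plain Rust rather than the actual `py_class!` bindings; `DataPackHandle` and its methods are illustrative stand-ins:

```
use std::cell::RefCell;

// Stand-in for the real DataPack type.
struct DataPack {
    path: String,
}

struct DataPackHandle {
    // Interior mutability: the pack can be replaced or consumed even though
    // the binding methods only receive `&self`.
    inner: RefCell<Option<DataPack>>,
}

impl DataPackHandle {
    fn new(path: &str) -> Self {
        DataPackHandle {
            inner: RefCell::new(Some(DataPack { path: path.to_string() })),
        }
    }

    // A `&self` method that takes ownership of the wrapped pack, e.g. for delete().
    fn take_pack(&self) -> Option<DataPack> {
        self.inner.borrow_mut().take()
    }
}

fn main() {
    let handle = DataPackHandle::new("store/pack-1");
    let pack = handle.take_pack().expect("pack already taken");
    println!("took {}", pack.path);
}
```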
Durham Goode
dbca4b04d8 hgsubversion: don't circumvent revmap abstraction in updatemeta
Summary:
In D13433558 we moved lastpulled into the sqlite database for sqlite
backed revmaps. It turns out the update/rebuildmeta code circumvents the revmap
abstraction and attempts to read the lastpulled file directly from disk. In a
sql backed world, this attempt fails and it then rebuilds the entire revmap
which is very slow.

There was a comment stating the code was intentionally not reading from the
revmap, but I don't believe it applies anymore, and the tests pass fine with
this new change.

Reviewed By: singhsrb

Differential Revision: D13662697

fbshipit-source-id: 2db8f346d89053604d34fbda8f531f688cf71210
2019-01-14 20:31:34 -08:00
Xavier Deguillard
678fd5c0fe remotefilelog: Remove old temporary files.
Summary:
We've observed some users with very large hgcache directories that were filled
with temporary pack files that for some reason were not removed/renamed on a
previous repack.

These files can appear for a variety of reasons, such as forcibly killing hg, a
host power-off, or simply a bug in mercurial. The latter case is likely what
causes some of the hgcache directories to grow, and this patch doesn't attempt
to find the underlying mercurial issue. Rather, let's alleviate the problem by
simply removing temporary files older than 24h.

Reviewed By: ikostia

Differential Revision: D13646642

fbshipit-source-id: faa0605e322d440a75187e2517cbbcb13031dae0
2019-01-14 14:56:54 -08:00
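A rough sketch of this kind of cleanup using only std APIs; the directory, prefix, and function name are illustrative, not the actual remotefilelog logic:

```
use std::fs;
use std::io;
use std::time::{Duration, SystemTime};

/// Delete files in `dir` whose name starts with `prefix` and whose
/// modification time is older than `max_age`.
fn remove_stale_temp_files(dir: &str, prefix: &str, max_age: Duration) -> io::Result<()> {
    let now = SystemTime::now();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let name = entry.file_name();
        if !name.to_string_lossy().starts_with(prefix) {
            continue;
        }
        let modified = entry.metadata()?.modified()?;
        if now.duration_since(modified).unwrap_or_default() > max_age {
            // Ignore errors: another process may have removed the file already.
            let _ = fs::remove_file(entry.path());
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Hypothetical cache path and temp-file prefix; 24 hours as in the commit.
    remove_stale_temp_files("/tmp/hgcache", "tmp", Duration::from_secs(24 * 60 * 60))
}
```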
Aida Getoeva
7046f81a00 bisect -c: fix empty changeset after sparse skip
Summary: If there is no changeset left after a sparse skip in `--command` mode, show the result and return, as is done in manual testing mode.

Reviewed By: markbt

Differential Revision: D13650568

fbshipit-source-id: 8e867a38858d84d9a10078b74e2087318c81b01e
2019-01-14 11:55:11 -08:00
Aida Getoeva
7775a9c220 bisect: test -c in case when all nodes skipped
Summary: Add a new test showing that a sparse skip with `--command` doesn't produce the correct answer.

Reviewed By: markbt

Differential Revision: D13650567

fbshipit-source-id: f4b6670fe67d6ef2543efedd91d9760e1e6bc74c
2019-01-14 11:55:11 -08:00
Xavier Deguillard
da3dd2319f revisionstore: remove repacked pack files
Summary:
After repacking the data/history packs, we need to clean up the
repacked files. This was an omission from D13363853.

Reviewed By: markbt

Differential Revision: D13577592

fbshipit-source-id: 36e7d5b8e86affe47cdd10d33a769969f02b8a62
2019-01-11 16:54:15 -08:00
Xavier Deguillard
ce16778656 remotefilelog: set proper file permissions on closed mutable packs.
Summary:
The Python version of the mutable packs sets their permissions to read-only
after writing them, while the Rust version keeps them writable. Let's make the
Rust one consistent with that.

Reviewed By: markbt

Differential Revision: D13573572

fbshipit-source-id: 61256994562aa09058a88a7935c16dfd7ddf9d18
2019-01-11 16:54:15 -08:00
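For reference, a minimal sketch of marking a finished pack read-only with std APIs (the path is illustrative):

```
use std::fs;
use std::io;

fn mark_read_only(path: &str) -> io::Result<()> {
    let mut perms = fs::metadata(path)?.permissions();
    // Clear the write bits so the closed pack cannot be modified in place.
    perms.set_readonly(true);
    fs::set_permissions(path, perms)
}

fn main() -> io::Result<()> {
    mark_read_only("store/packs/abcdef.datapack")
}
```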
Liubov Dmitrieva
603daf671b Back out "[mercurial] enable modern way to transport phases on client side"
Summary:
Original commit changeset: 63c72654a345

Backing out because there are issues in the implementation of this transport on the Mononoke side.

It requires better testing.

Reviewed By: StanislavGlebik

Differential Revision: D13635800

fbshipit-source-id: 1cdd43658889f68cd13df757ca6e21de01140dc9
2019-01-11 08:01:44 -08:00
Harvey Hunt
cc3b3d869a archival: Use manifest.matches to improve extdiff performance
Summary:
Previously, archival.py ran a matchfn on all files reported by the manifest. This is slow for treemanifest repos.
Update this to allow the manifest to run the matchfn itself, which is considerably quicker.

Reviewed By: DurhamG, markbt

Differential Revision: D13624052

fbshipit-source-id: eca022170e6bb0c0cf3c845bcafcb4e35207e825
2019-01-11 03:12:59 -08:00
Chad Austin
a948959e2e hg: improve connection closed early error message
Summary:
Several people observed this "connection closed early" message last week
and were unable to go further, so perhaps it would be helpful to
include a bit of additional data in the error message.

Reviewed By: quark-zju

Differential Revision: D13165697

fbshipit-source-id: 8f7d9d29d52697393a8474d6b8697099b0e33442
2019-01-10 21:21:59 -08:00
Liubov Dmitrieva
2086ce2ed6 enable modern way to transport phases on client side
Summary:
Upstream hg started using a separate bundle2 part to push phases, but we are
stuck with the old way.

The old way was enabled in facebook.rc but nowhere in the tests except for the
one (where I disabled it) proving that the new way works.

The original commit that introduced the config option as temporary mentions
pushrebase:

changeset:   877f6928075428a4470bd399c82c9b1e9eaba9ad  D6156039
user:        Durham Goode <durham@fb.com>
date:        Wed, 25 Oct 2017 16:39:02 -0800

    configs: set devel.legacy.exchange=phases

    Summary:
      Upstream has enable sending phase bundle parts by default now, but
      our server doesn't have the pushrebase fix to make this work.  Let's
      temporarily disable this until we've updated the servers.

I see that pushrebase support was implemented, but the temporary disabling
ended up being permanent.

changeset:   f6cf77230612bdf8be0c7fdd6ab21737fdaf35bf  D1204
user:        Stanislau Hlebik <stash@fb.com>
date:        Mon, 23 Oct 2017 09:36:16 -0800

    pushrebase: handle pushing phases through separate bundle2 part

    Differential Revision: https://phab.mercurial-scm.org/D1204

I would like to switch to this transport because it is the one we support on
the Mononoke side, and I don't want client-side configs to differ depending on
whether Mononoke is used or not.

Reviewed By: DurhamG

Differential Revision: D13622994

fbshipit-source-id: 63c72654a34584ad31d17b174660166b46f087fb
2019-01-10 14:58:28 -08:00
Mark Thomas
30eb09c931 commitcloud: don't autojoin users who have manually disconnected
Summary:
If a user manually disconnects from Commit Cloud Sync, their next background
backup will automatically reconnect them if `commitcloud.autocloudjoin` is set.

Make the `autocloudjoin` setting only work if the user has never connected
to a workspace before.  Distinguish the two cases by leaving a
`commitcloudrc` file in place after disconnecting.

Reviewed By: liubov-dmitrieva

Differential Revision: D13621476

fbshipit-source-id: ffccd473cb3da592e5b991dd863b8afed45dc83a
2019-01-10 06:37:20 -08:00
Mark Thomas
3298b1c7ab commitcloud: don't updateonmove if backups are disabled
Summary:
If `commitcloud.updateonmove` is set, but backups are disabled, we shouldn't
still follow any new obsmarkers that have appeared (e.g. because of
pullcreatemarkers) when cloud sync runs.

Reviewed By: liubov-dmitrieva

Differential Revision: D13599478

fbshipit-source-id: 10c5c190a08fe5cf72cdfd0165ca61928c0d1800
2019-01-09 07:56:54 -08:00
Mark Thomas
093bb82cca commitcloud: don't updateonmove if the new commit is public
Summary:
The pullcreatemarkers extension interacts badly with commit cloud sync's
updateonmove.  If the current commit has been landed, the next cloud sync
will follow the marker to the landed commit.  This is not usually what the user
wants, and slows down usage as they have to wait for the update to finish
before they can continue working.

Reviewed By: liubov-dmitrieva

Differential Revision: D13599477

fbshipit-source-id: f25d50d9dcb023894f2459e632fbd5ff4d172dd0
2019-01-09 06:12:53 -08:00
Mark Thomas
98417b1ffb configparser: fix warning about unused Result
Summary:
Use of `write!` requires checking for errors; however, in this case there is no
need to use `write!`, as we just want the error as a string.

Reviewed By: ikostia

Differential Revision: D13596497

fbshipit-source-id: 5892025344936936188cf3a8ca227e71eff57d55
2019-01-08 06:19:55 -08:00
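A minimal illustration of the warning and the fix: `write!` into a `String` returns a `Result` that must be handled, while `format!` produces the string directly. The error type and functions below are made up for the example:

```
use std::fmt::Write;

#[derive(Debug)]
struct ParseError {
    line: usize,
    message: String,
}

fn describe_with_write(err: &ParseError) -> String {
    let mut out = String::new();
    // `write!` returns a Result; dropping it triggers the unused_must_use
    // warning, so the Result has to be checked or explicitly ignored.
    let _ = write!(out, "line {}: {}", err.line, err.message);
    out
}

fn describe_with_format(err: &ParseError) -> String {
    // No Result to worry about: we just want the error as a string.
    format!("line {}: {}", err.line, err.message)
}

fn main() {
    let err = ParseError { line: 3, message: "unexpected token".to_string() };
    assert_eq!(describe_with_write(&err), describe_with_format(&err));
    println!("{}", describe_with_format(&err));
}
```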
Adam Simpkins
47b14e9c2e have fsmonitor explicitly avoid wrapping Eden repository objects
Summary:
Teach the fsmonitor extension about Eden, and have it explicitly avoid
wrapping repository objects for Eden-backed repositories.

Reviewed By: quark-zju

Differential Revision: D13523302

fbshipit-source-id: d1114b24311a933fe46baef74d3e514778bd400b
2019-01-07 13:01:35 -08:00
Mark Thomas
4ce2783f74 remotefilelog: don't prune commits with null linknodes
Summary:
When building changegroups, remotefilelog omits filenodes for which the
linknode is known to be available at the server.  Since linknodes may
now be null, we need to include these filenodes, as we don't know
whether the linknode is available or not.

Reviewed By: quark-zju

Differential Revision: D13504563

fbshipit-source-id: 8d7106d32f4ec3f2e006b253d68b32c031638b4d
2019-01-02 04:43:58 -08:00
Mark Thomas
bde717a925 treemanifest: fixup linknodes when sending to the server
Summary:
When sending a bundle to the server, ensure that the treemanifest linknodes
correctly refer to a commit that is in the bundle.

Treemanifest ignores the linknodes of the subtrees - instead they inherit the
linknode of the root manifest in which they were introduced, so we only need to
fix up the linknode for the root manifest and propagate that to the subtrees.

Reviewed By: quark-zju

Differential Revision: D13504564

fbshipit-source-id: 2f481a4939239784d84d5db12c70d473b3045610
2019-01-02 04:43:58 -08:00
Mark Thomas
00951e42c2 treemanifest: add test demonstrating problem with linknodes
Summary:
When we send a bundle to the server, the treemanifest nodes are sent with
whatever linknode they have.  This linknode might not refer to a commit that
the server knows about, which makes the bundle invalid.  Add a test that
demonstrates that the server aborts in this case.

Pushrebase works fine as it rewrites the commits, generating new linknodes for
all of the trees.

Reviewed By: quark-zju

Differential Revision: D13504565

fbshipit-source-id: 39894d367c111aea5cef7de2d7da122e39d9debe
2019-01-02 04:43:58 -08:00
Mark Thomas
2915dd2086 remotefilelog: don't store draft remotefilelog blobs in memcache
Summary:
If we receive a remotefilelog blob from the server where the history
information contains a null linknode, don't store that blob in memcache.

Reviewed By: quark-zju

Differential Revision: D13517715

fbshipit-source-id: 6ea68391ab2488db223ca261e8303ea24e091915
2019-01-02 04:43:58 -08:00
Mark Thomas
6ae8397b26 pymononokeapi: use cpython-failure for PyErr generation
Summary:
Use the cpython-failure crate for generating `PyResult` from `Result` by
mapping to a `PyErr`.

Reviewed By: DurhamG

Differential Revision: D13464988

fbshipit-source-id: d927f89c111dce737b59905ceeab1d30381a8510
2019-01-02 04:13:20 -08:00
Jun Wu
f6158659f8 configparser: use hardcoded system config path on Windows
Summary:
When I was debugging an eden importer issue with Puneet, we saw errors caused
by important extensions (ex. remotefilelog, lz4revlog) not being loaded.  It
turned out that configparser was checking the "exe dir" to decide where to
load "system configs". For example, if we run:

  C:\open\fbsource\fbcode\scm\hg\build\pythonMSVC2015\python.exe eden_import_helper.py

The "exe dir" is "C:\open\fbsource\fbcode\scm\hg\build", and system config is
not there.

Instead of copying "mercurial.ini" to every possible "exe dir", this diff just
switches to a hard-coded system config path. It's now consistent with what we
do on POSIX systems.

The logic to copy "mercurial.ini" to "C:\open\fbsource\fbcode\scm\hg" or
"C:\tools\hg" becomes unnecessary and is removed.

Reviewed By: singhsrb

Differential Revision: D13542939

fbshipit-source-id: 5fb50d8e42d36ec6da28af29de89966628fe5549
2018-12-22 01:53:03 -08:00
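A rough sketch of the behavioural change; the commit does not state the exact hard-coded path, so the paths and function names below are purely illustrative:

```
use std::env;
use std::path::PathBuf;

/// Old behaviour: derive the system config location from the running executable.
fn system_config_from_exe_dir() -> Option<PathBuf> {
    let exe = env::current_exe().ok()?;
    Some(exe.parent()?.join("mercurial.ini"))
}

/// New behaviour (sketch): use one hard-coded location on Windows, mirroring
/// what is done on POSIX systems. Both paths here are illustrative only.
fn system_config() -> PathBuf {
    if cfg!(windows) {
        PathBuf::from(r"C:\ProgramData\Facebook\Mercurial\mercurial.ini")
    } else {
        PathBuf::from("/etc/mercurial/hgrc")
    }
}

fn main() {
    println!("old: {:?}", system_config_from_exe_dir());
    println!("new: {:?}", system_config());
}
```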
Saurabh Singh
b193e23dd2 test-check-fix-code: unbreak test by fixing copyrights
Summary:
`test-check-fix-code.t` was failing due to the copyright header missing
from certain files. This commit fixes the files by running

```
contrib/fix-code.py FILE
```

as suggested in the failure message.

Reviewed By: DurhamG

Differential Revision: D13538506

fbshipit-source-id: d8063c9a0e665377a9976abeccb68fbef6781950
2018-12-21 10:03:26 -08:00
Jun Wu
c2b973b47a absorb: fix message when there are only deleted commits
Summary:
Even if no new commits are created, absorb might still have "applied" some
changes by deleting commits. Let's fix the end-user message.

Reviewed By: DurhamG

Differential Revision: D13531959

fbshipit-source-id: 4d942f3ccd8201e8b62c8bc1c86227d41021b5f9
2018-12-20 17:54:23 -08:00
Jun Wu
5e0fc4c563 absorb: use scmutil.cleanupnodes
Summary:
`scmutil.cleanupnodes` was initially ported from absorb to simplify other
commands. Now use it to simplify absorb itself.

This fixes a crash when `self.finalnode` is empty (e.g. no new commits are
created, only commits deleted).

Reviewed By: DurhamG

Differential Revision: D13531961

fbshipit-source-id: 7006b5ac5dfc4db897413d18ccd26eedde3c98d9
2018-12-20 17:54:22 -08:00
Jun Wu
a74541c40c scmutil: make cleanupnodes return calculated moves
Summary: It's used in the upcoming patches.

Reviewed By: DurhamG

Differential Revision: D13531962

fbshipit-source-id: 9caf6c3d5ef079082c9fc677ff2c2ef0e492a1db
2018-12-20 17:54:22 -08:00
Jun Wu
de6a5ca10d scmutil: revise cleanupnodes behavior when moving bookmarks backwards
Summary:
Previously, when a commit did not have a replacement, its bookmark would be
moved far back to a commit that is not being replaced. For example, when A::C is
rebased to A'::C' and B disappears due to being empty, the old code would move
BOOK-B to Z, while the new code moves BOOK-B to A':

  C             C'
  |             |
  B BOOK-B  ->  |
  |             |
  A      ------ A' <- new BOOK-B
  |     /
  Z ----           <- old BOOK-B

Note that the current `rebase` implementation overrides the "moves" calculation
used in cleanupnodes and already has the new behavior, so there are no changes
in the rebase tests.

Right now, the real intended user is absorb, so I'm not adding new tests here.
Without this change, when absorb migrates to cleanupnodes, its tests would
break.

Reviewed By: DurhamG

Differential Revision: D13531964

fbshipit-source-id: 03b6afa116e1a7b08b33a2c8856f2e52d6f8043a
2018-12-20 17:54:22 -08:00
Jun Wu
e07d80c6af absorb: stop writing absorb_source metadata
Summary:
This was a workaround for the upstream obsmarker design, which cannot support
cycles. Our internal obsmarkers can now handle cycles just fine, and the
upcoming mutation metadata makes it impossible to have cycles, so drop the
workaround.

Reviewed By: DurhamG

Differential Revision: D13531960

fbshipit-source-id: d569172f0d2d5a3b4e1f6589be44ac21a09604f3
2018-12-20 17:54:22 -08:00
Jun Wu
22ee659eec absorb: default to "yes" for prompting changes
Summary: Otherwise absorb never works with HGPLAIN=1.

Reviewed By: DurhamG

Differential Revision: D13531963

fbshipit-source-id: af598b985501db425405f0c851e196e9eddc2350
2018-12-20 17:54:22 -08:00
Jun Wu
1251bb0736 tests: add a test logging files with filenode collision
Summary:
I have been wondering how hg behaves with filenode collisions, so I added some
tricky-looking tests about it. They actually show that the existing logic is
problematic :(

Reviewed By: DurhamG

Differential Revision: D13011554

fbshipit-source-id: fffb026e05adc8d8de4a1e5692bbee57293cce4e
2018-12-20 17:54:22 -08:00
Jun Wu
03a4b9d606 setup: embed Cython
Summary:
Use an `asset` to download Cython on demand, so we don't need to install Cython
as a build dependency on all supported platforms, or maintain the "Cython"
package for those platforms.

Also upgrade to the latest Cython while we're at it.

Reviewed By: singhsrb

Differential Revision: D13513514

fbshipit-source-id: 5ebe9a3e5b785a8f85cd51624663f9cc1e5c66fd
2018-12-20 17:54:22 -08:00
Jun Wu
94565d0386 lz4: use Rust lz4 binding
Summary:
Drop the dependency on `python-lz4`.

Add some conversions from bytearray to bytes to keep the code compatible.

Reviewed By: DurhamG

Differential Revision: D13516212

fbshipit-source-id: 89beb0aa92be4c5442a8e837f509e1eb17bb1512
2018-12-20 17:54:22 -08:00
Jun Wu
fafc7c6b1c tests: re-open store after datapack truncation
Summary:
When I replaced lz4 with rust.lz4, the test failed; this change fixes it. That
also means datapack corruption detection is not that reliable. However, usually
those files are not changed once they are loaded into an in-memory store, so
it's probably fine.

Also note that the python-lz4 used in production has strange behavior when
compressing an empty string:

  In [6]: lz4.compressHC('')
  Out[6]: '\x00\x00\x00\x00\xa0#\xd9\x040\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00'
  In [7]: lz4.compress('')
  Out[7]: '\x00\x00\x00\x00\xa0#\xd9\x040\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00'

  In [9]: rustlz4.compress('')
  Out[9]: bytearray(b'\x00\x00\x00\x00')
  In [10]: rustlz4.compresshc('')
  Out[10]: bytearray(b'\x00\x00\x00\x00')

  In [13]: lz4.compress('1')
  Out[13]: '\x01\x00\x00\x00\x101'
  In [14]: rustlz4.compress('1')
  Out[14]: bytearray(b'\x01\x00\x00\x00\x101')

Reviewed By: DurhamG

Differential Revision: D13528199

fbshipit-source-id: 9b3e8674f989062928900766156a97d28262c8cb
2018-12-20 17:54:22 -08:00
Jun Wu
22e9000fc9 lz4-pyframe: add compresshc
Summary:
Unfortunately, the required symbols are not exposed by lz4-sys, so we just
declare them ourselves.

Verify that it compresses better:

  In [1]: c=open('/bin/bash').read();
  In [2]: from mercurial.rust import lz4
  In [3]: len(lz4.compress(c))
  Out[3]: 762906
  In [4]: len(lz4.compresshc(c))
  Out[4]: 626970

It is much slower for larger data, though (and compresshc is slower than pylz4):

  Benchmarking (easy to compress data, 20MB)...
            pylz4.compress: 10328.03 MB/s
       rustlz4.compress_py:  9373.84 MB/s
          pylz4.compressHC:  1666.80 MB/s
     rustlz4.compresshc_py:  8298.57 MB/s
          pylz4.decompress:  3953.03 MB/s
     rustlz4.decompress_py:  3935.57 MB/s
  Benchmarking (hard to compress data, 0.2MB)...
            pylz4.compress:  4357.88 MB/s
       rustlz4.compress_py:  4193.34 MB/s
          pylz4.compressHC:  3740.40 MB/s
     rustlz4.compresshc_py:  2730.71 MB/s
          pylz4.decompress:  5600.94 MB/s
     rustlz4.decompress_py:  5362.96 MB/s
  Benchmarking (hard to compress data, 20MB)...
            pylz4.compress:  5156.72 MB/s
       rustlz4.compress_py:  5447.00 MB/s
          pylz4.compressHC:    33.70 MB/s
     rustlz4.compresshc_py:    22.25 MB/s
          pylz4.decompress:  2375.42 MB/s
     rustlz4.decompress_py:  5755.46 MB/s

Note that python-lz4 was using an ancient version of lz4, so there could be differences.

Reviewed By: DurhamG

Differential Revision: D13528200

fbshipit-source-id: 6be1c1dd71f57d40dcffcc8d212d40a853583254
2018-12-20 17:54:22 -08:00
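A sketch of what declaring the missing symbol by hand looks like. The `LZ4_compress_HC` and `LZ4_compressBound` signatures match the public liblz4 C API, but the wrapper below is illustrative, not the actual lz4-pyframe code, and it assumes a system liblz4 is available to link against:

```
use std::os::raw::{c_char, c_int};

// Declared manually because the binding crate does not expose them.
// Assumes liblz4 is linked (e.g. installed on the system).
#[link(name = "lz4")]
extern "C" {
    fn LZ4_compress_HC(
        src: *const c_char,
        dst: *mut c_char,
        src_size: c_int,
        dst_capacity: c_int,
        compression_level: c_int,
    ) -> c_int;
    fn LZ4_compressBound(input_size: c_int) -> c_int;
}

/// Illustrative safe wrapper: block-compress `data` at the given HC level.
fn compress_hc(data: &[u8], level: i32) -> Vec<u8> {
    unsafe {
        let bound = LZ4_compressBound(data.len() as c_int) as usize;
        let mut out = vec![0u8; bound];
        let written = LZ4_compress_HC(
            data.as_ptr() as *const c_char,
            out.as_mut_ptr() as *mut c_char,
            data.len() as c_int,
            bound as c_int,
            level as c_int,
        );
        out.truncate(written as usize);
        out
    }
}

fn main() {
    let compressed = compress_hc(b"hello hello hello hello hello", 9);
    println!("{} compressed bytes", compressed.len());
}
```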
Jun Wu
4f24bffdde cpython-ext: move pybuf to cpython-ext
Summary:
`pybuf` provides a way to read `bytes`, `bytearray`, and some `buffer` types in
a zero-copy way. The main benefit is using the same code to support different
input types. It's currently copied to a couple of places; let's move it to
`cpython-ext`.

Reviewed By: DurhamG

Differential Revision: D13516206

fbshipit-source-id: f58881c4bfe651a6fdb84cf317a74c3c8d7a4961
2018-12-20 17:54:22 -08:00
Jun Wu
08981fee2e rustlz4: use zero-copy return type
Summary:
Use the newly added zero-copy method to improve Rust lz4 performance. It's now
roughly as fast as python-lz4 when tested by stresstest-compress.py:

  Benchmarking (easy to compress data)...
            pylz4.compress: 10461.62 MB/s
       rustlz4.compress_py:  9379.41 MB/s
          pylz4.decompress:  3802.85 MB/s
     rustlz4.decompress_py:  3975.61 MB/s
  Benchmarking (hard to compress data)...
            pylz4.compress:  5341.69 MB/s
       rustlz4.compress_py:  5012.30 MB/s
          pylz4.decompress:  6768.17 MB/s
     rustlz4.decompress_py:  6651.08 MB/s

(Note: decompress can be visibly faster if we return `bytearray` instead of
`bytes`. However a lot of places expect `bytes`)

Previously, the result looks like:

  Benchmarking (easy to compress data)...
            pylz4.compress: 10810.05 MB/s
       rustlz4.compress_py: 11175.36 MB/s
          pylz4.decompress:  3868.92 MB/s
     rustlz4.decompress_py:   634.56 MB/s
  Benchmarking (hard to compress data)...
            pylz4.compress:  4565.91 MB/s
       rustlz4.compress_py:   622.94 MB/s
          pylz4.decompress:  6887.76 MB/s
     rustlz4.decompress_py:  2854.79 MB/s

Note this changes the return type from `bytes` to `bytearray` for the
`compress` function. `decompress` still returns `bytes`, which is important for
compatibility. Note that zero-copy `bytes` cannot be implemented for `compress`
- the size of the `PyBytes` is unknown and cannot be pre-allocated.

Reviewed By: DurhamG

Differential Revision: D13516211

fbshipit-source-id: b21f852c390722c086aa2f37a758bf3f58af31b4
2018-12-20 17:54:22 -08:00
Jun Wu
f23c6bc7e3 cpython-ext: add a way to pre-allocate PyBytes
Summary: Make it possible to write content directly into a PyBytes buffer.

Reviewed By: DurhamG

Differential Revision: D13528202

fbshipit-source-id: 8c0a4ed030439a8dc40cdfbd72b1f6734a8b2036
2018-12-20 17:54:22 -08:00
Jun Wu
6e88ac4794 lz4-pyframe: provide decompress_into API
Summary:
This allows decompressing into a pre-allocated buffer. After some experiments,
it seems `bytearray` would just break too many things, e.g.:

- bytearray is not hashable
- bytearray[index] returns an int
- a = bytearray('x'); b = a; b += '3' # will mutate 'a'
- ''.join([bytearray('')]) will raise TypeError

Therefore we have to use zero-copy `bytes` instead, which is less elegant. But
this API change is a step forward.

Reviewed By: DurhamG

Differential Revision: D13528201

fbshipit-source-id: 1cfaf5d55efdc0d6c0df85df9960fe9682028b08
2018-12-20 17:54:22 -08:00
Jun Wu
7831e2a4ce cpython-ext: add ways to zero-copy Vec<u8> into a Python object
Summary:
I need to convert `Vec<u8>` to a Python object in a zero-copy way for rustlz4
performance.

Assuming Python and Rust use the same memory allocator, it's possible to
transfer control of a malloc-ed pointer from Rust to Python. Use this to
implement zero-copy. PyByteArrayObject is chosen because its struct contains
such a pointer. PyBytes cannot be used, as it embeds the bytes inline rather
than using a pointer.

Sadly, there are no CPython APIs to do this job, so we have to write to the raw
structures. That means the code will crash if python is replaced by
python-debug (due to the Python object header change). However, that seems less
of an issue given the performance wins. If python-debug does become a problem,
we can try vendoring libpython directly.

I didn't implement a feature-rich `PyByteArray` Rust object. It's not easy to
do so outside the cpython crate. Most helper macros to declare types cannot be
reused, because they refer to `::python`, which is not available in the current
crate.

Reviewed By: DurhamG

Differential Revision: D13516209

fbshipit-source-id: 9aa089b309beb71d4d21f6c63fcb97dbc798b5f8
2018-12-20 17:54:22 -08:00
Jun Wu
3b35a77fe8 rustlz4: expose lz4-pyframe to Python
Summary:
This is intended to replace the python-lz4 library so we have a unified code
path.

However, the added benchmark indicates the Rust version is significantly slower
than python-lz4:

  Benchmarking (easy to compress data)...
            pylz4.compress: 10964.14 MB/s
       rustlz4.compress_py: 12126.00 MB/s
          pylz4.decompress:  3908.29 MB/s
     rustlz4.decompress_py:   798.68 MB/s
  Benchmarking (hard to compress data)...
            pylz4.compress:  5615.86 MB/s
       rustlz4.compress_py:   740.32 MB/s
          pylz4.decompress:  6145.68 MB/s
     rustlz4.decompress_py:  2423.99 MB/s

The only case where the Rust version is fine is when the returned data is
small. That suggests rust-cpython was likely doing some unnecessary memcpy.

Reviewed By: DurhamG

Differential Revision: D13516207

fbshipit-source-id: 72150b15c38bc8d8c7e7717a56a41f48d114db19
2018-12-20 17:54:21 -08:00
Jun Wu
35c85018cd lz4-pyframe: add a benchmark
Summary:
This gives some sense of how fast it is.

Background: I was trying to get rid of python-lz4 by exposing this crate to
Python. However, I noticed it was 10x slower than python-lz4, so I added a
benchmark here to test whether the wrapper or the Rust lz4 code is at fault.

It does not seem to be this crate:

```
  # Pure Rust
  compress (100M)                77.170 ms
  decompress (~100M)             67.043 ms

  # python-lz4
  In [1]: import lz4, os
  In [2]: b=os.urandom(100000000);
  In [3]: %timeit lz4.compress(b)
  10 loops, best of 3: 87.4 ms per loop
```

Reviewed By: DurhamG

Differential Revision: D13516205

fbshipit-source-id: f55f94bbecc3b49667ed12174f7000b1aa29e7c4
2018-12-20 17:54:21 -08:00
Jun Wu
b3893b3d3c indexedlog: add methods on Log to do prefix lookups
Summary:
This exposes the underlying lookup functions from `Index`.

Alternatively we can allow access to `Index` and provide an `iter_started_from`
method on `Log` which takes a raw offset. I have been trying to avoid exposing
raw offsets in public interfaces, as they would change after `flush()` and cause
problems.

Reviewed By: markbt

Differential Revision: D13498303

fbshipit-source-id: 8b00a2a36a9383e3edb6fd7495a005bc985fd461
2018-12-20 15:50:55 -08:00
Jun Wu
3237b77e4c indexedlog: add APIs to lookup by prefix
Summary:
This is the missing API before `indexedlog::Index` can fit the
`changelog.partialmatch` case. It's actually more flexible, as it can provide
some example commit hashes while the existing revlog.c or radixbuf
implementations just error out saying "ambiguous prefix".

It can also be "abused" for the semantics of sorted "sub-keys": replace
"key" with "key + subkey" when inserting into the index, and looking up by
"key" then returns a lazy result list (`PrefixIter`) sorted by "subkey". Note:
the radix tree is NOT efficient (in both time and space) when there are common
prefixes, so this use case needs care.

Reviewed By: markbt

Differential Revision: D13498301

fbshipit-source-id: 637856ebd761734d68b20c15866424b1d4518ad6
2018-12-20 15:50:55 -08:00
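The "sub-key" trick is independent of indexedlog itself; here is a minimal illustration of the same semantics using a std `BTreeSet` as a stand-in for the prefix-searchable index:

```
use std::collections::BTreeSet;
use std::ops::Bound;

fn main() {
    // Insert "key + subkey" as a single entry.
    let mut index: BTreeSet<Vec<u8>> = BTreeSet::new();
    for (key, subkey) in [("file.txt", "node3"), ("file.txt", "node1"), ("other", "node2")] {
        let mut entry = key.as_bytes().to_vec();
        entry.extend_from_slice(subkey.as_bytes());
        index.insert(entry);
    }

    // A prefix lookup on "key" yields the sub-keys in sorted order, which is
    // the semantics described above for `PrefixIter`.
    let prefix = b"file.txt".to_vec();
    let mut upper = prefix.clone();
    upper.push(0xff); // crude exclusive upper bound, fine for the illustration
    for entry in index.range((Bound::Included(prefix.clone()), Bound::Excluded(upper))) {
        println!("{}", String::from_utf8_lossy(&entry[prefix.len()..]));
    }
}
```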