Summary:
I need to convert `Vec<u8>` to a Python object in a zero-copy way for rustlz4
performance.
Assuming Python and Rust use the same memory allocator, it's possible to transfer
the control of a malloc-ed pointer from Rust to Python. Use this to implement
zero-copy. PyByteArrayObject is chosen because its struct contains such a pointer.
PyBytes cannot be used as it embeds the bytes, without using a pointer.
Sadly there are no CPython APIs to do this job, so we have to write to the raw
structures. That means the code will crash if python is replaced by
python-debug (because the Python object header changes). However, that seems
less of an issue given the performance wins. If python-debug does become a
problem, we can try vendoring libpython directly.
I didn't implement a feature-rich `PyByteArray` Rust object. It's not easy to
do so outside the cpython crate. Most helper macros to declare types cannot be
reused, because they refer to `::python`, which is not available in the current
crate.
Reviewed By: DurhamG
Differential Revision: D13516209
fbshipit-source-id: 9aa089b309beb71d4d21f6c63fcb97dbc798b5f8
Summary:
This is intended to replace the python-lz4 library so we have a unified code
path.
However, the added benchmark indicates the Rust version is significantly slower
than python-lz4:
Benchmarking (easy to compress data)...
pylz4.compress: 10964.14 MB/s
rustlz4.compress_py: 12126.00 MB/s
pylz4.decompress: 3908.29 MB/s
rustlz4.decompress_py: 798.68 MB/s
Benchmarking (hard to compress data)...
pylz4.compress: 5615.86 MB/s
rustlz4.compress_py: 740.32 MB/s
pylz4.decompress: 6145.68 MB/s
rustlz4.decompress_py: 2423.99 MB/s
The only case where the Rust version is fine is when the returned data is
small. That suggests rust-cpython was likely doing some memcpy unnecessarily.
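The MB/s figures above come from timing each function over an input of known size. A minimal sketch of such a harness follows. It uses the standard-library `zlib` as a stand-in codec, an assumption made so the snippet runs without python-lz4 or rustlz4 installed, and the helper name `throughput_mb_s` is hypothetical rather than taken from the actual benchmark.

```python
import os
import time
import zlib


def throughput_mb_s(func, data, nbytes=None, repeat=3):
    """Return the best observed throughput of func(data) in MB/s.

    nbytes is the payload size to report against; it defaults to len(data).
    """
    if nbytes is None:
        nbytes = len(data)
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(data)
        best = min(best, time.perf_counter() - start)
    best = max(best, 1e-9)  # guard against a zero reading from a coarse clock
    return nbytes / 1e6 / best


size = 1000000
easy = b"\0" * size        # easy to compress
hard = os.urandom(size)    # hard to compress

for label, data in (("easy", easy), ("hard", hard)):
    packed = zlib.compress(data)
    print("%s compress:   %.2f MB/s" % (label, throughput_mb_s(zlib.compress, data)))
    # Report decompression against the size of the data it produces.
    print("%s decompress: %.2f MB/s"
          % (label, throughput_mb_s(zlib.decompress, packed, nbytes=size)))
```

Reporting decompression against the decompressed size is what makes a large returned buffer, and any extra copy of it, show up in the number.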
Reviewed By: DurhamG
Differential Revision: D13516207
fbshipit-source-id: 72150b15c38bc8d8c7e7717a56a41f48d114db19
Summary:
This gives some sense about how fast it is.
Background: I was trying to get rid of python-lz4 by exposing this to Python.
However, I noticed it's 10x slower than python-lz4. Therefore I added a
benchmark here to test whether the slowdown comes from the wrapper or the Rust
lz4 code.
It does not seem to be this crate:
```
# Pure Rust
compress (100M) 77.170 ms
decompress (~100M) 67.043 ms
# python-lz4
In [1]: import lz4, os
In [2]: b=os.urandom(100000000);
In [3]: %timeit lz4.compress(b)
10 loops, best of 3: 87.4 ms per loop
```
Reviewed By: DurhamG
Differential Revision: D13516205
fbshipit-source-id: f55f94bbecc3b49667ed12174f7000b1aa29e7c4
Summary:
This exposes the underlying lookup functions from `Index`.
Alternatively we can allow access to `Index` and provide an `iter_started_from`
method on `Log` which takes a raw offset. I have been trying to avoid exposing
raw offsets in public interfaces, as they would change after `flush()` and cause
problems.
Reviewed By: markbt
Differential Revision: D13498303
fbshipit-source-id: 8b00a2a36a9383e3edb6fd7495a005bc985fd461
Summary:
This is the missing API before `indexedlog::Index` can fit in the
`changelog.partialmatch` case. It's actually more flexible as it can provide
some example commit hashes while the existing revlog.c or radixbuf
implementations just error out saying "ambiguous prefix".
It can also be "abused" for the semantics of sorted "sub-keys": by replacing
"key" with "key + subkey" when inserting into the index, looking up using "key"
returns a lazy result list (`PrefixIter`) sorted by "subkey". Note:
the radix tree is NOT efficient (in both time and space) when there are common
prefixes, so this use case needs care.
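The prefix lookup and the "key + subkey" trick can be sketched in Python as follows. This is only an illustration of the semantics: it uses a sorted list with `bisect` instead of a radix tree, and the class and method names (`PrefixIndex`, `scan_prefix`) are hypothetical, not the `indexedlog::Index` API.

```python
import bisect


class PrefixIndex(object):
    """Sorted-list stand-in for a prefix-searchable index (not a radix tree)."""

    def __init__(self):
        self._keys = []

    def insert(self, key):
        pos = bisect.bisect_left(self._keys, key)
        if pos == len(self._keys) or self._keys[pos] != key:
            self._keys.insert(pos, key)

    def scan_prefix(self, prefix):
        """Lazily yield every stored key starting with prefix, in sorted order."""
        pos = bisect.bisect_left(self._keys, prefix)
        while pos < len(self._keys) and self._keys[pos].startswith(prefix):
            yield self._keys[pos]
            pos += 1


# Ambiguous hash prefix: return the candidates instead of only failing.
hashes = PrefixIndex()
for node in (b"abc123", b"abd456", b"ff0000"):
    hashes.insert(node)
candidates = list(hashes.scan_prefix(b"ab"))

# "key + subkey": entries under one key come back ordered by subkey.
subkeys = PrefixIndex()
for subkey in (b"003", b"001", b"002"):
    subkeys.insert(b"key:" + subkey)
ordered = list(subkeys.scan_prefix(b"key:"))
```

Here `candidates` holds both hashes that share the `ab` prefix, and `ordered` lists the three composite keys sorted by their subkey part.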
Reviewed By: markbt
Differential Revision: D13498301
fbshipit-source-id: 637856ebd761734d68b20c15866424b1d4518ad6
Summary: This will be used in prefix lookups.
Reviewed By: markbt
Differential Revision: D13498300
fbshipit-source-id: 3db7a21d6f35a18699d9dc3a0eca71a5410e0e61
Summary:
We tell users that running `hg cloud join -w <name>` is fine, so this should be
fixed. Running
hg cloud leave
hg cloud join -w <name>
worked correctly, but users could skip `hg cloud leave`.
Currently, skipping it leads to excessive unused connections to IceBreaker.
Differential Revision: D13528559
fbshipit-source-id: 91f36089ccac6718f7a29413c8a3dae80b6b25c6
Summary:
When pulling an infinitepush bundle from the server to a client, it
uses remotefilelog's excludepattern functionality to force the server to serve
filelogs to the client. Unfortunately, it didn't specify what type of pattern,
and therefore it was treated as a regular expression. This meant that any path
that happened to be a regular expression would generally not match itself,
resulting in data not being sent to the client.
The fix is to just prefix these with 'path:' since we know they are all exact
paths.
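The failure mode is easy to reproduce with Python's `re` module. The sketch below does not use Mercurial's matcher; the file name is a hypothetical example, and the two helpers only contrast regex interpretation with exact-path interpretation.

```python
import re

# Hypothetical file name that happens to contain regex syntax.
path = "dir/foo(bar).txt"


def matches_as_regex(pattern, candidate):
    """Treat pattern as a regular expression anchored at the start."""
    return re.match(pattern, candidate) is not None


def matches_as_exact_path(pattern, candidate):
    """Treat pattern as a literal path, which is what a 'path:' prefix asks for."""
    return pattern == candidate


# As a regex, the parentheses form a group, so the pattern expects "foobar"
# followed by any character and "txt". It matches a different name, not itself.
print(matches_as_regex(path, path))
print(matches_as_regex(path, "dir/foobar_txt"))
print(matches_as_exact_path(path, path))
```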
Reviewed By: phillco, quark-zju
Differential Revision: D13521814
fbshipit-source-id: 8afc35b37d5858913b80ed0babfa0b5e401f0ab4
Summary:
If a filename is a valid regex and it doesn't match itself, that file
data will not be delivered to the client when doing an infinitepush pull that
requires a rebundle. This adds a test. The next diff fixes it.
Reviewed By: phillco
Differential Revision: D13522579
fbshipit-source-id: 6f8c5df20c31b834a33dbb9b73dc9031e26e969b
Summary: Adds the current flavor of the eden rust+python deps to LFS and has setup.py bake them into the IPython.zip.
Reviewed By: quark-zju
Differential Revision: D13516179
fbshipit-source-id: 64a0a86bc97b187f35475a3df4b580e0f0bc5deb
Summary:
Rather than only making it available to `hg dbsh`,
add it for all commands/extensions to use.
Reviewed By: quark-zju
Differential Revision: D13505762
fbshipit-source-id: c79fee888d9394c5ad70d3d8b7f59addef1381a1
Summary:
This is the library shared between the eden cli and the
eden hg extension.
Reviewed By: quark-zju
Differential Revision: D13503730
fbshipit-source-id: 45ab550da3126042cb3baacaf8469b8acd6b1c4a
Summary:
Makes it possible to `import thrift` from python2.
This is implemented as a subclass of `asset`, but rather than downloading
files we copy them from their location in fbsource. Extraction
copies them from their fbsource path, optionally excluding specific files
(eg: python3 only files) to the build path.
Reviewed By: quark-zju
Differential Revision: D13503711
fbshipit-source-id: fbe69400de31376ff6135c2c5173a984ff97f282
Summary:
In addition to `six` (already listed), these packages are
required to use the eden extension thrift deps.
I downloaded these using `pip2 download URL` and then uploaded using:
```
$ ../../tools/lfs/lfs.py upload /tmp/futures-3.2.0-py2-none-any.whl -l fb/tools/.lfs-pointers
```
Reviewed By: quark-zju
Differential Revision: D13503336
fbshipit-source-id: 87a21984dbe544882bfc4a818f7b5bff46907693
Summary:
dsp implemented the remotefilelog prefetch logic in https://phab.mercurial-scm.org/D732. With that change, LFS download should only be triggered once per update for a remotefilelog repo since remotefilelog/__init__.py wrapped around checkunknownfiles() (called by calculateupdates() in hg update) to first run prefetch(). However, lfs prefetch is only called when there's missing data in either datastore or historystore.
This diff reuses the logic for lfs prefetch in remotefilelog/fileserverclient.py, but enables lfs prefetch regardless of whether there's missing data in hg stores. The logic to prefetch lfs files is very similar to that of lfs update, except that files to download are batch-processed.
Reviewed By: quark-zju, ikostia
Differential Revision: D13515102
fbshipit-source-id: 5e14c615b192d775db238bacd5e16ceec8141efa
Summary:
Sandcastle jobs are using mercurial in conjunction with Phabricator CATs as part of the Phabricator security effort.
Here is an example of a mercurial command that failed because of this:
https://our.intern.facebook.com/intern/sandcastle/job/1498468091/
This adds logic to support parsing CATs from .arcrc for mercurial commands.
Reviewed By: quark-zju
Differential Revision: D13468442
fbshipit-source-id: 033806d0e0779f9e7ade054d21e4cdbbdef08ed0
Summary:
Tests don't close the child process stdout. On newer versions of Python, this
can lead to ResourceWarnings when the test runner thread terminates.
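A minimal sketch of the pattern, assuming the child is started with `subprocess.Popen` and a piped stdout (how run-tests.py actually spawns children is not shown in this message). The helper name `run_child` is hypothetical.

```python
import subprocess
import sys


def run_child(code):
    """Run a Python snippet in a child process and return (proc, stdout bytes)."""
    proc = subprocess.Popen([sys.executable, "-c", code], stdout=subprocess.PIPE)
    try:
        output = proc.stdout.read()
    finally:
        proc.stdout.close()  # close the pipe explicitly instead of leaking it
        proc.wait()          # reap the child so no process handle lingers
    return proc, output


proc, output = run_child("print('ok')")
```

Closing the pipe in a `finally` block means the file object is released even when reading raises, so nothing is left for the interpreter to warn about later.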
Reviewed By: HarveyHunt
Differential Revision: D13517425
fbshipit-source-id: 6cedf4f39efe1299c41dbde784daf8c159309640
Summary:
Some of the escape sequences in run-tests.py are invalid. These cause
DeprecationWarnings on newer versions of Python.
In both cases, there are `\` characters that need to be escaped as `\\`.
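The effect can be checked by compiling a string literal and recording what the compiler reports. This is a sketch on Python 3, where the warning category is `DeprecationWarning` on older releases and `SyntaxWarning` on newer ones. The literals are made-up examples, not the ones from run-tests.py.

```python
import warnings


def escape_warnings(source):
    """Compile source and return any warnings the compiler emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "eval")
    return caught


bad = r"'\d+ files'"     # source text containing the invalid escape \d
good = r"'\\d+ files'"   # same literal with the backslash escaped as \\

print(len(escape_warnings(bad)) > 0)   # the invalid escape is reported
print(len(escape_warnings(good)))      # the escaped form compiles cleanly
```

Both literals evaluate to the same string, so escaping the backslash changes only the warning, not the behavior.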
Reviewed By: HarveyHunt
Differential Revision: D13517137
fbshipit-source-id: a899c3c28d55210f5972a515474a2fa69d051671
Summary:
The duringundologlock config and hook were used to test the undolog using
timing. We've replaced that, so remove the hook.
Reviewed By: quark-zju
Differential Revision: D13504644
fbshipit-source-id: a6b5fb308bc8938eec72788d93c9be6c237b72d7
Summary:
The undo tests use timing to detect when the lock is being taken. This is
flaky. Instead add extra logging to detect when the lock is taken.
Reviewed By: quark-zju
Differential Revision: D13504643
fbshipit-source-id: 07b80e416047d11b4ba3e1631c2385e5f12fa36f
Summary:
The setup function for remotefilelogserver wraps `cg1packer.generatefiles`,
however it does this each time a repo is set up. After many repo objects
have been instantiated, we can end up with hundreds of wrappers around the
method, which overflows the stack when it is called.
Furthermore, in `wrappackers` we have replaced `cg1packer` with our own
`shallowbundle.shallowcg1packer`, so here remotefilelog is actually wrapping
itself! We can just move the code for the server case into the remotefilelog
version of `generatefiles`.
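The stacking problem can be reproduced with a toy model. In the sketch below, `wrapfunction` is a simplified stand-in written for this example and all class names are hypothetical. Only the shape of the bug (wrapping inside per-repo setup) and of the fix (putting the logic in the override itself) follows the description above.

```python
class BasePacker(object):
    def generatefiles(self):
        return ["file"]


def wrapfunction(container, name, wrapper):
    """Simplified stand-in: replace container.name with wrapper(orig, ...)."""
    orig = getattr(container, name)

    def wrapped(*args, **kwargs):
        return wrapper(orig, *args, **kwargs)

    setattr(container, name, wrapped)


class WrappedPacker(BasePacker):
    pass


layers_entered = []


def server_wrapper(orig, self):
    layers_entered.append(1)   # one entry per wrapper layer passed through
    return orig(self)


def reposetup():
    # Buggy pattern: runs once per repo object and wraps again every time.
    wrapfunction(WrappedPacker, "generatefiles", server_wrapper)


for _ in range(3):             # three repo objects are set up
    reposetup()
WrappedPacker().generatefiles()
print(len(layers_entered))     # a single call now passes through 3 layers


class ShallowPacker(BasePacker):
    """Fixed pattern: the server-side logic lives in the override itself."""

    server_calls = 0

    def generatefiles(self):
        ShallowPacker.server_calls += 1
        return super(ShallowPacker, self).generatefiles()
```

With hundreds of repo objects, the wrapped version nests hundreds of calls deep, which is how the stack overflow arises. The override runs its logic once no matter how many repos are created.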
Reviewed By: DurhamG, quark-zju
Differential Revision: D13505002
fbshipit-source-id: a236f7e62e6d2e5186f135bfe79477ce3e09e374
Summary:
It duplicates testing: `cargo test` now tries to run tests on 2 entry points,
lib.rs and indexedlog_dump.rs. Move it to a separate crate to solve the issue.
Reviewed By: markbt
Differential Revision: D13498266
fbshipit-source-id: 8abf07c1272dfa825ec7701fd8ea9e0d1310ec5f
Summary:
When pushing in treeonly mode, we process the manifests to decide what
files to send. In large stacks this can be slow, so let's add a progress bar.
Reviewed By: quark-zju
Differential Revision: D13460745
fbshipit-source-id: b037419e4a5e17c831492768e97064bf635678f6
Summary:
In a number of places, the remotefilelog code needs the information for
a single node but instead fetches the entire ancestor set. Now that we have
getnodeinfo, let's switch to using that. This was a primary hot spot in hg push
of large stacks of commits.
Reviewed By: quark-zju
Differential Revision: D13460746
fbshipit-source-id: 3aa288c70c87dcb32c0404311f27bbc87ddc5267
Summary:
Be compatible with `hgext.` or `hgext/` prefix. But print a warning saying it's
deprecated.
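One way to picture the compatibility shim is a small normalizer that strips the legacy prefix and warns. This is a sketch under assumptions: the message does not say where the names come from, so the function name, the warning text, and the use of the `warnings` module are all hypothetical.

```python
import warnings


def normalize_name(name):
    """Strip a legacy 'hgext.' or 'hgext/' prefix, warning when one is found."""
    for prefix in ("hgext.", "hgext/"):
        if name.startswith(prefix):
            warnings.warn(
                "%r uses the deprecated %r prefix" % (name, prefix),
                DeprecationWarning,
                stacklevel=2,
            )
            return name[len(prefix):]
    return name


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    names = [normalize_name(n) for n in ("hgext.rebase", "hgext/rebase", "rebase")]
```

All three spellings resolve to the same name, and only the two prefixed ones produce a warning.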
Reviewed By: DurhamG
Differential Revision: D13490362
fbshipit-source-id: ef13bd57a74be810df409af18a6259bc7b2b6dad
Summary: `write!` result needs to be used.
Reviewed By: markbt
Differential Revision: D13471967
fbshipit-source-id: d48752bcac05dd33b112679d7faf990eb8ddd651
Summary:
Many code paths assume 'build' exists. So let's create it on demand.
Use the "scratch" tool to make it more friendly on an Eden checkout.
Reviewed By: markbt
Differential Revision: D13471293
fbshipit-source-id: cce461ab67b984c53a00a98d481a821ad1f11c35
Summary: The former is deprecated and thus compiling revisionstore shows many warnings.
Reviewed By: markbt
Differential Revision: D13379278
fbshipit-source-id: d4b4662a1ad00997de4c46274deaf22f48487328
Summary:
This fixes LFS compatibility with packlocaldata.
Also drop the config as the default version is 1 now.
Reviewed By: DurhamG
Differential Revision: D13469486
fbshipit-source-id: 1dc4a1051667419d7aab97bf95f93cacd166468a
Summary:
Reraising a caught exception loses the context of the original exception, which
makes debugging harder. Instead, just call `raise`, which keeps the context
intact.
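A minimal Python 3 sketch of the bare `raise` form follows; the function names are hypothetical. On Python 2, re-raising with `raise e` restarts the traceback at the re-raise site, whereas a bare `raise` keeps the original frames on both major versions.

```python
import traceback


def parse(value):
    return int(value)


def load(value):
    try:
        return parse(value)
    except ValueError:
        # Bare `raise` re-raises the active exception with its traceback intact.
        raise


def frames_of(func, arg):
    """Return the function names recorded in the traceback of func(arg)."""
    try:
        func(arg)
    except ValueError as exc:
        return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    return []


frames = frames_of(load, "not a number")
```

The innermost recorded frame is still `parse`, the function where the error actually occurred, which is the context a debugger or log reader needs.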
Reviewed By: quark-zju
Differential Revision: D13468943
fbshipit-source-id: 091f060327ca732a1534a7730bc6536d9a101865
Summary:
Previously, it would raise `KeyError` if the node requested cannot be found in
the first pack file.
This allows `*` to be used in `.t` tests. So file name changes are still
`run-tests.py -i` friendly.
Reviewed By: DurhamG
Differential Revision: D13469489
fbshipit-source-id: 84349ab1cf963b6d40216cec4de0e8ab10838b97
Summary:
It's meaningless to test the fetch cost in .t tests. So let's just set it to 0.
This also makes the output stable. So `run-tests.py -i` can be used directly.
Reviewed By: DurhamG
Differential Revision: D13469488
fbshipit-source-id: a2cf900531f3ff0f58957bd79d7283928b5da700
Summary: Previously, the test wasn't using a remotefilelog repo.
Reviewed By: DurhamG
Differential Revision: D13469487
fbshipit-source-id: 3c85633b8bfc209fa0d0317ca05247e14faf689d
Summary:
Add a third argument to the `get(dict, key)` template function that serves as a
default.
Reviewed By: DurhamG, quark-zju
Differential Revision: D12980774
fbshipit-source-id: d619275dd3ae880f6aba4c2b3d91aea4b45ea6d6
Summary:
Adds a new crate `cpython-result`, which provides a `ResultExt` trait. The trait
extends the failure `Result` type to allow conversion to `PyResult` by
converting the error to an appropriate Python Exception.
Reviewed By: quark-zju
Differential Revision: D12980782
fbshipit-source-id: 44a63d31f9ecf2f77efa3b37c68f9a99eaf6d6fa
Summary:
The mutationstore is a new store for recording records of commit mutations for
commits that are not in the local repository.
It uses an indexedlog to store the data. Each mutation entry corresponds to
the information about the mutation that led to the creation of a particular
commit, which is recorded as the successor in the entry.
Entries can come from three possible places:
* `Commit` metadata for a commit not available locally
* `Obsmarkers` for repos that have been migrated from evolution tracking
* `Synthetic` for entries created synthetically, e.g. by a pullcreatemarkers
implementation.
The other commits referred to in an entry must predate the successor commit.
For entries that originated from commits, this is ensured, as the successor
commit hash includes the other commit hashes. For other entry types, it is
an error to refer to later commits, and any entry that causes a cycle will
be ignored.
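The "ignore any entry that causes a cycle" rule can be sketched with a toy in-memory model. This is not the mutationstore implementation, which is backed by an indexedlog. The class `MutationGraph` and its methods are hypothetical and only illustrate one way to detect that a new entry would make a commit its own ancestor.

```python
class MutationGraph(object):
    """Toy model mapping each successor to the predecessors it was made from."""

    def __init__(self):
        self._preds = {}

    def _reaches(self, start, target):
        """Return True if target is start or one of its recorded ancestors."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self._preds.get(node, ()))
        return False

    def add(self, successor, predecessors):
        """Record an entry unless it would create a cycle; return whether kept."""
        if any(self._reaches(pred, successor) for pred in predecessors):
            return False  # successor already precedes a predecessor: ignore
        self._preds[successor] = list(predecessors)
        return True


graph = MutationGraph()
kept = [
    graph.add("B", ["A"]),   # A was rewritten into B
    graph.add("C", ["B"]),   # B was rewritten into C
    graph.add("A", ["C"]),   # would close the loop A -> B -> C -> A
]
```

The first two entries are kept and the third is dropped, because `A` is already an ancestor of `C`.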
Reviewed By: quark-zju
Differential Revision: D12980773
fbshipit-source-id: 040d3f7369a113e710ed8c9f61fabec6c5ec9258
Summary:
The derived debug for Node prints out each byte as a decimal number. Instead,
make the Debug output for nodes look like `Node("hexstring")`.
Reviewed By: DurhamG
Differential Revision: D12980775
fbshipit-source-id: 042cbf6eade8403759684969e1f69f7f4e335582
Summary:
Add a utility function for tests to generate a vector of random nodes. This
will be used in future tests.
Reviewed By: DurhamG
Differential Revision: D12980784
fbshipit-source-id: 73fc8643503e11a46a845671df94c912a5e49d23
Summary:
Add traits that extend `std::io::Read` and `std::io::Write` to implement new
`read_node` and `write_node` methods, allowing simple reading and writing of
binary nodes from and to streams.
Reviewed By: DurhamG
Differential Revision: D12980778
fbshipit-source-id: fc6751cd43a1693a5a5a3ac93aea74aec5fda4fe
Summary:
Add user's current working directory, which can be relevant when
debugging user issues.
Reviewed By: phillco
Differential Revision: D13426163
fbshipit-source-id: f4c58d15237c188235aad74e82333f58262cbc51
Summary:
The test failed because "hgext" is no longer a namespace package. The check
becomes unnecessary since we almost always use the in-repo modules.
Reviewed By: DurhamG
Differential Revision: D13453160
fbshipit-source-id: 0c5a72b3abeb0a12426d8ed15c045a4ce478a4c6
Summary:
Add a new `--print` option to drawdag which prints the commit hashes and
descriptions of all the nodes that were generated by drawdag.
Reviewed By: quark-zju
Differential Revision: D10149266
fbshipit-source-id: 55b4c133b3c98c0258419811d7c00a3ec73a02cc
Summary:
Update drawdag to generate mutation metadata in commits when mutation recording
is enabled.
In order for this to work, we need to add the mutation relationships as edges
between the commits, so that when walking the DAG the predecessors are already
committed and we can find their hashes.
Reviewed By: ikostia
Differential Revision: D9989024
fbshipit-source-id: 671c1eb6c4ae6e87760efb4d3aa47e5e0585c94d
Summary:
Add support for mutation information being added when commits are histedited.
For the most part, the histedit predecessor and successor have a 1:1
relationship. The special case is for when commits are folded or rolled up.
In this case we act in the same way as a fold.
Reviewed By: quark-zju
Differential Revision: D9975466
fbshipit-source-id: 748040232f49aa87af4e25a97d948995d956f04a
Summary:
Add a convenience function that generates the nodes that match a revset, rather
than the contexts or rev numbers.
Reviewed By: DurhamG, quark-zju
Differential Revision: D10149262
fbshipit-source-id: 5c889b1c7f03e3fcda7dccd297674c40729ccd90