Commit Graph

45292 Commits

Author SHA1 Message Date
Jun Wu
7831e2a4ce cpython-ext: add ways to zero-copy Vec<u8> into a Python object
Summary:
I need to convert `Vec<u8>` to a Python object in a zero-copy way for rustlz4
performance.

Assuming Python and Rust use the same memory allocator, it's possible to transfer
the control of a malloc-ed pointer from Rust to Python. Use this to implement
zero-copy. PyByteArrayObject is chosen because its struct contains such a pointer.
PyBytes cannot be used as it embeds the bytes, without using a pointer.

Sadly there are no CPython APIs to do this job. So we have to write to the raw
structures. That means the code will crash if python is replaced by
python-debug (due to Python object header change). However, that seems less of an
issue given the performance wins. If python-debug does become a problem, we can
try vendoring libpython directly.

I didn't implement a feature-rich `PyByteArray` Rust object. It's not easy to
do so outside the cpython crate. Most helper macros to declare types cannot be
reused, because they refer to `::python`, which is not available in the current
crate.

Reviewed By: DurhamG

Differential Revision: D13516209

fbshipit-source-id: 9aa089b309beb71d4d21f6c63fcb97dbc798b5f8
2018-12-20 17:54:22 -08:00
Jun Wu
3b35a77fe8 rustlz4: expose lz4-pyframe to Python
Summary:
This is intended to replace the python-lz4 library so we have a unified code
path.

However, the added benchmark indicates the Rust version is significantly slower
than python-lz4 in most cases:

  Benchmarking (easy to compress data)...
            pylz4.compress: 10964.14 MB/s
       rustlz4.compress_py: 12126.00 MB/s
          pylz4.decompress:  3908.29 MB/s
     rustlz4.decompress_py:   798.68 MB/s
  Benchmarking (hard to compress data)...
            pylz4.compress:  5615.86 MB/s
       rustlz4.compress_py:   740.32 MB/s
          pylz4.decompress:  6145.68 MB/s
     rustlz4.decompress_py:  2423.99 MB/s

The only case where the Rust version is fine is when the returned data is
small. That suggests rust-cpython was likely doing some memcpy unnecessarily.

Reviewed By: DurhamG

Differential Revision: D13516207

fbshipit-source-id: 72150b15c38bc8d8c7e7717a56a41f48d114db19
2018-12-20 17:54:21 -08:00
Jun Wu
35c85018cd lz4-pyframe: add a benchmark
Summary:
This gives some sense of how fast it is.

Background: I was trying to get rid of python-lz4 by exposing this to Python.
However, I noticed it's 10x slower than python-lz4. Therefore I added a
benchmark here to test whether the slowdown comes from the wrapper or from the Rust lz4 code.

It does not seem to be this crate:

```
  # Pure Rust
  compress (100M)                77.170 ms
  decompress (~100M)             67.043 ms

  # python-lz4
  In [1]: import lz4, os
  In [2]: b=os.urandom(100000000);
  In [3]: %timeit lz4.compress(b)
  10 loops, best of 3: 87.4 ms per loop
```

Reviewed By: DurhamG

Differential Revision: D13516205

fbshipit-source-id: f55f94bbecc3b49667ed12174f7000b1aa29e7c4
2018-12-20 17:54:21 -08:00
Jun Wu
b3893b3d3c indexedlog: add methods on Log to do prefix lookups
Summary:
This exposes the underlying lookup functions from `Index`.

Alternatively we can allow access to `Index` and provide an `iter_started_from`
method on `Log` which takes a raw offset. I have been trying to avoid exposing
raw offsets in public interfaces, as they would change after `flush()` and cause
problems.

Reviewed By: markbt

Differential Revision: D13498303

fbshipit-source-id: 8b00a2a36a9383e3edb6fd7495a005bc985fd461
2018-12-20 15:50:55 -08:00
Jun Wu
3237b77e4c indexedlog: add APIs to lookup by prefix
Summary:
This is the missing API before `indexedlog::Index` can fit in the
`changelog.partialmatch` case. It's actually more flexible, as it can provide
some example commit hashes, while the existing revlog.c or radixbuf
implementation just errors out saying "ambiguous prefix".

It can also be "abused" for the semantics of sorted "sub-keys": replace
"key" with "key + subkey" when inserting into the index, and looking up "key"
returns a lazy result list (`PrefixIter`) sorted by "subkey". Note:
the radix tree is NOT efficient (in either time or space) when there are common
prefixes, so this use case needs care.
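
The "key + subkey" idea can be sketched in Python. This is only an illustration: a sorted list with `bisect` stands in for the radix tree, and `SortedIndex`, `insert`, and `scan_prefix` are hypothetical names, not the `indexedlog` API.

```python
import bisect


class SortedIndex:
    """Toy stand-in for a prefix-searchable index (not the real radix tree)."""

    def __init__(self):
        self._keys = []  # kept sorted, so a prefix maps to one contiguous run

    def insert(self, key: bytes) -> None:
        pos = bisect.bisect_left(self._keys, key)
        if pos == len(self._keys) or self._keys[pos] != key:
            self._keys.insert(pos, key)

    def scan_prefix(self, prefix: bytes):
        """Lazily yield every stored key that starts with `prefix`, in order."""
        pos = bisect.bisect_left(self._keys, prefix)
        while pos < len(self._keys) and self._keys[pos].startswith(prefix):
            yield self._keys[pos]
            pos += 1


index = SortedIndex()
for key, subkey in [(b"user:", b"carol"), (b"user:", b"alice"), (b"repo:", b"hg")]:
    index.insert(key + subkey)  # store "key + subkey" as a single index key

# Looking up by "key" returns all entries for it, sorted by subkey.
matches = list(index.scan_prefix(b"user:"))
```

An ambiguous prefix yields every candidate instead of raising an error, which is the flexibility described above.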

Reviewed By: markbt

Differential Revision: D13498301

fbshipit-source-id: 637856ebd761734d68b20c15866424b1d4518ad6
2018-12-20 15:50:55 -08:00
Jun Wu
562b7a1704 indexedlog: add a function to convert base16 to base256
Summary: This will be used in prefix lookups.
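
A rough Python sketch of what such a conversion might look like. The function name mirrors the commit title, but the signature and the handling of an odd trailing hex digit are assumptions for illustration, not the Rust implementation.

```python
def base16_to_base256(hex_prefix: str):
    """Convert a hex string to bytes.

    Returns (full_bytes, trailing_nibble). trailing_nibble is None when the
    input has an even number of hex digits; otherwise it is the value of the
    last, unpaired digit (an assumption about how odd prefixes are handled).
    """
    even_len = len(hex_prefix) - (len(hex_prefix) % 2)
    full_bytes = bytes.fromhex(hex_prefix[:even_len])
    if even_len < len(hex_prefix):
        trailing = int(hex_prefix[even_len:], 16)
    else:
        trailing = None
    return full_bytes, trailing


even = base16_to_base256("1a2b")
odd = base16_to_base256("1a2")
```

A hex prefix typed by a user can have odd length, which is why the sketch reports the leftover half-byte separately.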

Reviewed By: markbt

Differential Revision: D13498300

fbshipit-source-id: 3db7a21d6f35a18699d9dc3a0eca71a5410e0e61
2018-12-20 15:50:55 -08:00
Xinyue Zhang
56a249c467 Added config to guard logic in D13515102
Summary: As titled

Reviewed By: quark-zju

Differential Revision: D13525086

fbshipit-source-id: 168bd7109396869326a35d2d87f8e1e97a3feeb2
2018-12-20 10:20:50 -08:00
Liubov Dmitrieva
cc62f6eb93 hg cloud join -w shut down previous subscription
Summary:
Since we tell users that running hg cloud join -w <name> directly is fine, this should be fixed.

hg cloud leave
hg cloud join -w <name>

worked correctly, but users could skip hg cloud leave.

Currently, that leads to excessive unused connections to IceBreaker.

Differential Revision: D13528559

fbshipit-source-id: 91f36089ccac6718f7a29413c8a3dae80b6b25c6
2018-12-20 10:15:34 -08:00
Durham Goode
0d2d0365db infinitepush: prefix exclude patterns with path:
Summary:
When pulling an infinitepush bundle from the server to a client, it
uses remotefilelog's excludepattern functionality to force the server to serve
filelogs to the client. Unfortunately, it didn't specify what type of pattern,
and therefore it was treated as a regular expression. This meant that any path
that happened to be a regular expression would generally not match itself,
resulting in data not being sent to the client.

The fix is to just prefix these with 'path:' since we know they are all exact
paths.
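
The failure mode can be reproduced with Python's `re` module alone. This is a sketch of the underlying regex behavior, not of Mercurial's matcher, and the file name is made up.

```python
import re

filename = "data[1].txt"  # hypothetical file name that is also a valid regex

# Interpreted as a regular expression, "[1]" is a character class matching "1",
# so the pattern matches "data1.txt" but not the literal file name.
as_regex = re.match(filename + r"\Z", filename)
other = re.match(filename + r"\Z", "data1.txt")

# Escaping every metacharacter is the regex-level analogue of declaring
# the pattern to be an exact path.
as_literal = re.match(re.escape(filename) + r"\Z", filename)
```

Here `as_regex` is `None` while `other` and `as_literal` both match, which is why an untyped pattern could silently drop a file.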

Reviewed By: phillco, quark-zju

Differential Revision: D13521814

fbshipit-source-id: 8afc35b37d5858913b80ed0babfa0b5e401f0ab4
2018-12-19 17:04:35 -08:00
Durham Goode
2e933d02b7 infinitepush: add test showing issues with regex file names
Summary:
If a filename is a valid regex and it doesn't match itself, that file
data will not be delivered to the client when doing an infinitepush pull that
requires a rebundle. This adds a test. The next diff fixes it.

Reviewed By: phillco

Differential Revision: D13522579

fbshipit-source-id: 6f8c5df20c31b834a33dbb9b73dc9031e26e969b
2018-12-19 17:04:35 -08:00
Wez Furlong
d26efc592e hg: add eden-rust-deps.zip to LFS and setup.py
Summary: Adds the current flavor of the eden rust+python deps to LFS and has setup.py bake them into the IPython.zip.

Reviewed By: quark-zju

Differential Revision: D13516179

fbshipit-source-id: 64a0a86bc97b187f35475a3df4b580e0f0bc5deb
2018-12-19 15:58:57 -08:00
Wez Furlong
617ce8db95 hg: unconditionally add IPython.zip to sys.path on startup
Summary:
Rather than only making it available to `hg dbsh`,
add it for all commands/extensions to use.

Reviewed By: quark-zju

Differential Revision: D13505762

fbshipit-source-id: c79fee888d9394c5ad70d3d8b7f59addef1381a1
2018-12-19 15:58:57 -08:00
Wez Furlong
a96271834d hg: install eden.dirstate module
Summary:
This is the library shared between the eden cli and the
eden hg extension.

Reviewed By: quark-zju

Differential Revision: D13503730

fbshipit-source-id: 45ab550da3126042cb3baacaf8469b8acd6b1c4a
2018-12-19 15:58:57 -08:00
Wez Furlong
c3565636de hg: install python thrift runtime
Summary:
Makes it possible to `import thrift` from python2.

This is implemented as a subclass of `asset`, but rather than downloading
files we copy them from their location in fbsource.  Extraction
copies them from their fbsource path, optionally excluding specific files
(eg: python3 only files) to the build path.

Reviewed By: quark-zju

Differential Revision: D13503711

fbshipit-source-id: fbe69400de31376ff6135c2c5173a984ff97f282
2018-12-19 15:58:56 -08:00
Wez Furlong
aeb57e8954 hg: add deps for eden + python thrift runtime
Summary:
In addition to `six` (already listed), these packages are
required to use the eden extension thrift deps.

I downloaded these using `pip2 download URL` and then uploaded using:

```
$ ../../tools/lfs/lfs.py upload /tmp/futures-3.2.0-py2-none-any.whl -l fb/tools/.lfs-pointers
```

Reviewed By: quark-zju

Differential Revision: D13503336

fbshipit-source-id: 87a21984dbe544882bfc4a818f7b5bff46907693
2018-12-19 15:58:56 -08:00
Xinyue Zhang
48b033c535 Enable lfs prefetch in remotefilelog regardless of whether there's missing data in hg stores
Summary:
dsp implemented the remotefilelog prefetch logic in https://phab.mercurial-scm.org/D732. With that change, LFS download should only be triggered once per update for a remotefilelog repo since remotefilelog/__init__.py wrapped around checkunknownfiles() (called by calculateupdates() in hg update) to first run prefetch(). However, lfs prefetch is only called when there's missing data in either datastore or historystore.

This diff reuses the logic for lfs prefetch in remotefilelog/fileserverclient.py, but enables lfs prefetch regardless of whether there's missing data in hg stores. The logic to prefetch lfs files is very similar to that of lfs update, except that files to download are batch-processed.

Reviewed By: quark-zju, ikostia

Differential Revision: D13515102

fbshipit-source-id: 5e14c615b192d775db238bacd5e16ceec8141efa
2018-12-19 15:53:04 -08:00
Bennett Magy
95fb0431d8 Added support for cats in arcrc
Summary:
Sandcastle jobs are using mercurial in conjunction with Phabricator CATs as a part of the Phabricator security effort.

Here is an example of a mercurial command that failed because of this:
https://our.intern.facebook.com/intern/sandcastle/job/1498468091/

This adds logic to support parsing CATs from .arcrc for mercurial commands

Reviewed By: quark-zju

Differential Revision: D13468442

fbshipit-source-id: 033806d0e0779f9e7ade054d21e4cdbbdef08ed0
2018-12-19 10:33:11 -08:00
Mark Thomas
45ac931774 run-tests: close child stdout after running test
Summary:
Tests don't close the child process stdout.  On newer versions of Python, this
can lead to ResourceWarnings when the test runner thread terminates.
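
A minimal sketch of the pattern, assuming a child process started with `subprocess.Popen`. It does not reproduce the run-tests.py code; the child command is a placeholder.

```python
import subprocess
import sys

proc = subprocess.Popen(
    [sys.executable, "-c", "print('ok')"],  # placeholder child process
    stdout=subprocess.PIPE,
)
try:
    output = proc.stdout.read()
    proc.wait()
finally:
    # Close the pipe explicitly instead of leaving it to garbage collection,
    # which is what produces ResourceWarning on newer Python versions.
    proc.stdout.close()
```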

Reviewed By: HarveyHunt

Differential Revision: D13517425

fbshipit-source-id: 6cedf4f39efe1299c41dbde784daf8c159309640
2018-12-19 07:53:02 -08:00
Mark Thomas
49ebfc610d run-tests: fix escape sequences
Summary:
Some of the escape sequences in run-tests.py are invalid.  These cause
DeprecationWarnings on newer versions of Python.

In both cases, there are `\` characters that need to be escaped as `\\`.
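
For illustration, the two usual ways to write such a backslash so that Python does not see an invalid escape sequence (the pattern and sample text are made up, not taken from run-tests.py):

```python
import re

# A backslash followed by "d" is not a recognized string escape, so it must
# be written either with a doubled backslash or inside a raw string.
escaped = "\\d+"
raw = r"\d+"

same = escaped == raw
digits = re.findall(raw, "ran 12 tests, 3 skipped")
```

Both spellings produce the same two-character sequence, so `same` is `True`.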

Reviewed By: HarveyHunt

Differential Revision: D13517137

fbshipit-source-id: a899c3c28d55210f5972a515474a2fa69d051671
2018-12-19 07:53:02 -08:00
Mark Thomas
933be4cf4d undo: remove duringundologlock
Summary:
The duringundologlock config and hook were used to test the undolog using
timing.  We've replaced that, so remove the hook.

Reviewed By: quark-zju

Differential Revision: D13504644

fbshipit-source-id: a6b5fb308bc8938eec72788d93c9be6c237b72d7
2018-12-19 04:02:42 -08:00
Mark Thomas
de804dc6c0 undo: use extralog in tests rather than timing
Summary:
The undo tests use timing to detect when the lock is being taken.  This is
flaky.  Instead add extra logging to detect when the lock is taken.

Reviewed By: quark-zju

Differential Revision: D13504643

fbshipit-source-id: 07b80e416047d11b4ba3e1631c2385e5f12fa36f
2018-12-19 04:02:42 -08:00
Mark Thomas
f79c393d86 remotefilelog: don't continuously wrap generatefiles
Summary:
The setup function for remotefilelogserver wraps `cg1packer.generatefiles`,
however it does this each time a repo is set up.  After many repo objects
have been instantiated, we can end up with hundreds of wrappers around the
method, which overflows the stack when it is called.

Furthermore, in `wrappackers` we have replaced `cg1packer` with our own
`shallowbundle.shallowcg1packer`, so here remotefilelog is actually wrapping
itself!  We can just move the code for the server case into the remotefilelog
version of `generatefiles`.
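
The stacking problem can be shown with a toy model. The `wrapfunction` below is a simplified stand-in written for this sketch, and `Packer`, `server_generatefiles`, and `setup_repo` are hypothetical names.

```python
class Packer:
    def generatefiles(self):
        return ["file"]


def wrapfunction(container, name, wrapper):
    """Replace container.name with a function that calls wrapper(orig, ...)."""
    orig = getattr(container, name)

    def wrapped(*args, **kwargs):
        return wrapper(orig, *args, **kwargs)

    wrapped._depth = getattr(orig, "_depth", 0) + 1  # nesting counter for the demo
    setattr(container, name, wrapped)


def server_generatefiles(orig, self):
    return orig(self)


def setup_repo():
    # Called once per repo object: every call adds another wrapper layer.
    wrapfunction(Packer, "generatefiles", server_generatefiles)


for _ in range(3):
    setup_repo()

depth = Packer.generatefiles._depth
result = Packer().generatefiles()
```

After three setups the method is three wrappers deep; with hundreds of repo objects, each call recurses through hundreds of frames. Putting the server logic directly in the method, as the commit does, avoids wrapping altogether.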

Reviewed By: DurhamG, quark-zju

Differential Revision: D13505002

fbshipit-source-id: a236f7e62e6d2e5186f135bfe79477ce3e09e374
2018-12-18 11:44:18 -08:00
Jun Wu
443a8f33b3 indexedlog: move binary indexedlog_dump out
Summary:
Keeping the binary in this crate duplicates testing: `cargo test` runs the tests on 2 entry points,
lib.rs and indexedlog_dump.rs.  Move it to a separate crate to solve the issue.

Reviewed By: markbt

Differential Revision: D13498266

fbshipit-source-id: 8abf07c1272dfa825ec7701fd8ea9e0d1310ec5f
2018-12-18 08:17:21 -08:00
Durham Goode
ce243170da remotefilelog: add progress bar when computing what files to send
Summary:
When pushing in treeonly mode, we process the manifests to decide what
files to send. In large stacks this can be slow, so let's add a progress bar.

Reviewed By: quark-zju

Differential Revision: D13460745

fbshipit-source-id: b037419e4a5e17c831492768e97064bf635678f6
2018-12-17 16:28:12 -08:00
Durham Goode
431980c0c0 remotefilelog: use getnodeinfo instead of getancestors
Summary:
In a number of places, the remotefilelog code needs the information for
a single node but instead fetches the entire ancestor set. Now that we have
getnodeinfo, let's switch to using that. This was a primary hot spot in hg push
of large stacks of commits.

Reviewed By: quark-zju

Differential Revision: D13460746

fbshipit-source-id: 3aa288c70c87dcb32c0404311f27bbc87ddc5267
2018-12-17 16:28:12 -08:00
Marla Azriel
02d41c83e8 commands: help text for update / checkout
Summary: Updated help text for hg update / hg checkout and removed --date option

Reviewed By: ikostia

Differential Revision: D13071724

fbshipit-source-id: 31b51b26f5d199e356f2148353d6714cecc1f632
2018-12-17 14:33:04 -08:00
Jun Wu
fd1b928138 extensions: work with 'hgext.' prefix
Summary:
Be compatible with `hgext.` or `hgext/` prefix. But print a warning saying it's
deprecated.
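
One way this compatibility shim could look, as a sketch only. `normalize_extension_name` is a hypothetical helper, and the warning text and category are assumptions.

```python
import warnings


def normalize_extension_name(name: str) -> str:
    """Strip a legacy "hgext." or "hgext/" prefix, warning that it is deprecated."""
    for prefix in ("hgext.", "hgext/"):
        if name.startswith(prefix):
            warnings.warn(
                "extension name %r: the %r prefix is deprecated" % (name, prefix),
                DeprecationWarning,
                stacklevel=2,
            )
            return name[len(prefix):]
    return name


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    short = normalize_extension_name("hgext.rebase")
    unchanged = normalize_extension_name("rebase")
```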

Reviewed By: DurhamG

Differential Revision: D13490362

fbshipit-source-id: ef13bd57a74be810df409af18a6259bc7b2b6dad
2018-12-17 12:53:12 -08:00
Jun Wu
61b1a5f475 indexedlog: fix rustc warnings
Summary: `write!` result needs to be used.

Reviewed By: markbt

Differential Revision: D13471967

fbshipit-source-id: d48752bcac05dd33b112679d7faf990eb8ddd651
2018-12-17 12:10:52 -08:00
Jun Wu
cc9d529053 setup: create 'build' directory automatically
Summary:
Many code paths assume 'build' exists. So let's create it on demand.
Use the "scratch" tool to make it more friendly on an Eden checkout.

Reviewed By: markbt

Differential Revision: D13471293

fbshipit-source-id: cce461ab67b984c53a00a98d481a821ad1f11c35
2018-12-17 12:10:52 -08:00
Xavier Deguillard
79164e920c revisionstore: replace rand::chacha with rand_chacha
Summary: The former is deprecated and thus compiling revisionstore shows many warnings.

Reviewed By: markbt

Differential Revision: D13379278

fbshipit-source-id: d4b4662a1ad00997de4c46274deaf22f48487328
2018-12-17 12:07:22 -08:00
Jun Wu
4b5df986f1 remotefilelog: use latest supported version for mutabledatapack
Summary:
This fixes LFS compatibility with packlocaldata.

Also drop the config as the default version is 1 now.

Reviewed By: DurhamG

Differential Revision: D13469486

fbshipit-source-id: 1dc4a1051667419d7aab97bf95f93cacd166468a
2018-12-15 16:27:04 -08:00
Mark Thomas
30d3800c0f crdump: don't reraise exception by name
Summary:
Reraising a caught exception loses the context of the original exception, which
makes debugging harder.  Instead, just call `raise`, which keeps the context
intact.
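
A small sketch of the bare `raise` form (function names are made up). In Python 2, which this code targeted at the time, `raise e` restarted the traceback at the re-raise site, whereas a bare `raise` keeps the original one.

```python
import traceback


def parse(value):
    return int(value)  # raises ValueError for bad input


def load(value):
    try:
        return parse(value)
    except ValueError:
        # Cleanup or logging would go here.
        raise  # bare raise: re-raises the active exception unchanged


try:
    load("not a number")
except ValueError as exc:
    frames = [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
```

The recorded frames still end in `parse`, the function where the error originated.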

Reviewed By: quark-zju

Differential Revision: D13468943

fbshipit-source-id: 091f060327ca732a1534a7730bc6536d9a101865
2018-12-15 02:29:30 -08:00
Jun Wu
3a1b17a8dd remotefilelog: make debugdatapack search through all files for a given node
Summary:
Previously, it would raise `KeyError` if the node requested cannot be found in
the first pack file.

This allows `*` to be used in `.t` tests. So file name changes are still
`run-tests.py -i` friendly.

Reviewed By: DurhamG

Differential Revision: D13469489

fbshipit-source-id: 84349ab1cf963b6d40216cec4de0e8ab10838b97
2018-12-14 17:13:32 -08:00
Jun Wu
fba19bd752 remotefilelog: make fetchcost 0 for .t tests
Summary:
It's meaningless to test the fetch cost in .t tests. So let's just set it to 0.
This also makes the output stable. So `run-tests.py -i` can be used directly.

Reviewed By: DurhamG

Differential Revision: D13469488

fbshipit-source-id: a2cf900531f3ff0f58957bd79d7283928b5da700
2018-12-14 17:13:32 -08:00
Jun Wu
d0bdd29337 tests: really test localpacks with LFS
Summary: Previously, the test wasn't using a remotefilelog repo.

Reviewed By: DurhamG

Differential Revision: D13469487

fbshipit-source-id: 3c85633b8bfc209fa0d0317ca05247e14faf689d
2018-12-14 17:13:32 -08:00
Norbert Csongrádi
de72d417f1 Added pushbackup --delete-bookmarks option to skip pushing & delete auxiliary bookmarks
Reviewed By: StanislavGlebik

Differential Revision: D13255687

fbshipit-source-id: 05c0dcb72d88f133fb00947e587723611f51ffd1
2018-12-14 11:56:44 -08:00
Mark Thomas
75853bb1a6 templater: add default argument to dict get
Summary:
Add a third argument to the `get(dict, key)` template function that serves as a
default.
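
The intended semantics can be modeled in Python. `template_get` is a hypothetical stand-in for the template function, and returning `None` when no default is given is an assumption.

```python
def template_get(mapping, key, default=None):
    """Toy model of get(dict, key[, default])."""
    return mapping.get(key, default)


extras = {"branch": "default"}
found = template_get(extras, "branch", "unknown")
missing = template_get(extras, "topic", "unknown")
```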

Reviewed By: DurhamG, quark-zju

Differential Revision: D12980774

fbshipit-source-id: d619275dd3ae880f6aba4c2b3d91aea4b45ea6d6
2018-12-14 06:43:41 -08:00
Mark Thomas
d47eff8070 mutationstore: add Python bindings
Reviewed By: DurhamG, quark-zju

Differential Revision: D12980786

fbshipit-source-id: b1bec8618b335b2790ad9c913b2a4f46573e3c03
2018-12-14 06:43:40 -08:00
Mark Thomas
ca135cd33f cpython-failure: Integrate cpython PyResult with the failure crate
Summary:
Adds a new crate `cpython-result`, which provides a `ResultExt` trait.  The trait
extends the failure `Result` type to allow conversion to `PyResult` by
converting the error to an appropriate Python exception.

Reviewed By: quark-zju

Differential Revision: D12980782

fbshipit-source-id: 44a63d31f9ecf2f77efa3b37c68f9a99eaf6d6fa
2018-12-14 06:43:40 -08:00
Mark Thomas
cf4b52c19c mutationstore: add mutationstore
Summary:
The mutationstore is a new store for recording records of commit mutations for
commits that are not in the local repository.

It uses an indexedlog to store the data.  Each mutation entry corresponds to
the information about the mutation that led to the creation of a particular commit,
which is recorded as the successor in the entry.

Entries can come from three possible places:

* `Commit` metadata for a commit not available locally
* `Obsmarkers` for repos that have been migrated from evolution tracking
* `Synthetic` for entries created synthetically, e.g. by a pullcreatemarkers
  implementation.

The other commits referred to in an entry must predate the successor commit.
For entries that originated from commits, this is ensured, as the successor
commit hash includes the other commit hashes.  For other entry types, it is
an error to refer to later commits, and any entry that causes a cycle will
be ignored.
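
The cycle rule can be sketched as follows. This is an illustrative model using a plain dict, not the mutationstore code; `creates_cycle` and `add_entry` are hypothetical names.

```python
def creates_cycle(predecessors_of, successor, predecessors):
    """Return True if recording successor <- predecessors would close a cycle.

    predecessors_of maps each successor to the predecessors recorded for it.
    A cycle exists if `successor` is reachable by walking back from any of
    the new entry's predecessors.
    """
    stack = list(predecessors)
    seen = set()
    while stack:
        node = stack.pop()
        if node == successor:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(predecessors_of.get(node, ()))
    return False


def add_entry(predecessors_of, successor, predecessors):
    """Record the entry unless it would introduce a cycle; report if it was kept."""
    if creates_cycle(predecessors_of, successor, predecessors):
        return False
    predecessors_of[successor] = list(predecessors)
    return True


store = {}
kept_b = add_entry(store, "B", ["A"])  # A was rewritten into B
kept_c = add_entry(store, "C", ["B"])  # B was rewritten into C
kept_a = add_entry(store, "A", ["C"])  # would make A its own ancestor
```

The third entry is dropped because walking back from `C` reaches `A` again.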

Reviewed By: quark-zju

Differential Revision: D12980773

fbshipit-source-id: 040d3f7369a113e710ed8c9f61fabec6c5ec9258
2018-12-14 06:43:40 -08:00
Mark Thomas
1346ff92c4 types: implement Debug for Node
Summary:
The derived debug for Node prints out each byte as a decimal number.  Instead,
make the Debug output for nodes look like `Node("hexstring")`.

Reviewed By: DurhamG

Differential Revision: D12980775

fbshipit-source-id: 042cbf6eade8403759684969e1f69f7f4e335582
2018-12-14 06:43:40 -08:00
Mark Thomas
88ab626e9a types: Add Nodes::random_distinct to randomly generate sets of nodes
Summary:
Add a utility function for tests to generate a vector of random nodes.  This
will be used in future tests.

Reviewed By: DurhamG

Differential Revision: D12980784

fbshipit-source-id: 73fc8643503e11a46a845671df94c912a5e49d23
2018-12-14 06:43:40 -08:00
Mark Thomas
d0c03f6aaf types: Add WriteNodeExt and ReadNodeExt
Summary:
Add traits that extend `std::io::Read` and `std::io::Write` to implement new
`read_node` and `write_node` methods, allowing simple reading and writing of
binary nodes from and to streams.

Reviewed By: DurhamG

Differential Revision: D12980778

fbshipit-source-id: fc6751cd43a1693a5a5a3ac93aea74aec5fda4fe
2018-12-14 06:43:40 -08:00
Arun Kulshreshtha
9943b17cb3 rage: add cwd to rage output
Summary:
Add user's current working directory, which can be relevant when
debugging user issues.

Reviewed By: phillco

Differential Revision: D13426163

fbshipit-source-id: f4c58d15237c188235aad74e82333f58262cbc51
2018-12-13 19:58:31 -08:00
Jun Wu
810130a5b9 extensions: remove foreign extension check
Summary:
The test failed because "hgext" is no longer a namespace package.  The check
becomes unnecessary since we almost always use the in-repo modules.

Reviewed By: DurhamG

Differential Revision: D13453160

fbshipit-source-id: 0c5a72b3abeb0a12426d8ed15c045a4ce478a4c6
2018-12-13 12:01:13 -08:00
Mark Thomas
1fa24ad41e drawdag: add --print option
Summary:
Add a new `--print` option to drawdag which prints the commit hashes and
descriptions of all the nodes that were generated by drawdag.

Reviewed By: quark-zju

Differential Revision: D10149266

fbshipit-source-id: 55b4c133b3c98c0258419811d7c00a3ec73a02cc
2018-12-13 10:47:28 -08:00
Mark Thomas
641d1f8d75 drawdag: add fold and revive support
Reviewed By: quark-zju, ikostia

Differential Revision: D10149264

fbshipit-source-id: 42291bbfe41980764b5fa0a62a9cfc60523a2c50
2018-12-13 10:47:28 -08:00
Mark Thomas
1d3d4047e1 mutation: add drawdag support
Summary:
Update drawdag to generate mutation metadata in commits when mutation recording
is enabled.

In order for this to work, we need to add the mutation relationships as edges
between the commits, so that when walking the DAG the predecessors are already
committed and we can find their hashes.

Reviewed By: ikostia

Differential Revision: D9989024

fbshipit-source-id: 671c1eb6c4ae6e87760efb4d3aa47e5e0585c94d
2018-12-13 10:47:28 -08:00
Mark Thomas
26ee99ef66 mutation: add histedit support
Summary:
Add support for mutation information being added when commits are histedited.

For the most part, the histedit predecessor and successor have a 1:1
relationship.  The special case is for when commits are folded or rolled up.
In this case we act in the same way as a fold.

Reviewed By: quark-zju

Differential Revision: D9975466

fbshipit-source-id: 748040232f49aa87af4e25a97d948995d956f04a
2018-12-13 10:47:28 -08:00
Mark Thomas
079200e14f localrepo: add localrepo.nodes
Summary:
Add a convenience function that generates the nodes that match a revset, rather
than the contexts or rev numbers.

Reviewed By: DurhamG, quark-zju

Differential Revision: D10149262

fbshipit-source-id: 5c889b1c7f03e3fcda7dccd297674c40729ccd90
2018-12-13 10:47:28 -08:00