Summary:
In order to move the types in `edenapi-types` (containing types shared between Mercurial and Mononoke) to the `types` crate, we need to move a few types from the `revisionstore` crate into this crate first, because `revisionstore` depends on `types`, which would create a circular dependency since `edenapi-types` uses types from `revisionstore`.
In particular, this diff moves the `Key` and `NodeInfo` types into their own modules in the `types` crate.
Reviewed By: quark-zju
Differential Revision: D14114166
fbshipit-source-id: 8f9e78d610425faec9dc89ecc9e450651d24177a
Summary:
For HTTP data fetching, it will be necessary to have the same Rust types in Mononoke and Mercurial, so that Mononoke can send down the serialized types and Mercurial can deserialize them. These types must live in the Mercurial codebase since Mercurial can't link to code outside of fbcode/scm/hg. As such, this diff adds a new crate to Mercurial that Mononoke can link to, containing these shared types.
Right now the only shared type is a `HistoryEntry`, designed to match the interface of `MutableDatapack::add`. This type will be used as part of the HTTP history fetching API.
In the longer term, it would probably make sense to use something like Thrift for defining the on-the-wire formats used between Mercurial and Mononoke (and eventually for RPC as well). However, given that using Thrift from Mercurial is currently nontrivial (since Mercurial is typically built with Cargo and needs to be compatible with open source tooling), defining the schema in this crate and using `serde` for serialization and HTTP/2 for transport should be sufficient for now.
Reviewed By: quark-zju
Differential Revision: D14079337
fbshipit-source-id: c7880919aeb3fd7e1cf70067a89a17341c1d973f
Summary:
Those types were internally using DataPack/HistoryPack, limiting their use. We
can make them more generic by using the DataStore/HistoryStore traits. The only
drawback is having to implement the new method for each store type.
Ideally, we could have a trait StoreFromPath (or use the experimental TryFrom)
that all the datastore/historystore types would implement.
As a bonus change, I got rid of the *Builder types. These were required as the
new method was already implemented in the AsyncHistoryStore/AsyncDataStore. We
can simply rename the latter and use a new method elsewhere.
Reviewed By: DurhamG
Differential Revision: D14060159
fbshipit-source-id: 31fa278f650ba979eecd3df4175cbac30ebb8180
Summary:
Starting the implementation of tree manifest with the in memory nodes and implementing `get` and `insert`. The in memory nodes are called `Ephemeral` and the stored immutable nodes are going to be called `Durable`.
Using a `BTreeMap` for storing the children because we want to efficiently insert, fetch and remove path components. We also want iteration to be done in ordered fashion so BTreeMap is our collection in this case.
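A minimal sketch of the idea, not the code in this diff: the `Link` enum, the `/`-separated string paths and the 20-byte `Node` alias are assumptions made for illustration, and the `Durable` variant is left out because it does not exist yet.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a 20-byte node hash.
type Node = [u8; 20];

/// A link in the tree: either a file entry or an in-memory directory.
#[derive(Debug)]
enum Link {
    Leaf(Node),
    /// Children are keyed by path component; `BTreeMap` keeps them ordered.
    Ephemeral(BTreeMap<String, Link>),
}

#[derive(Debug)]
struct Tree {
    root: Link,
}

impl Tree {
    fn new() -> Self {
        Tree { root: Link::Ephemeral(BTreeMap::new()) }
    }

    /// Insert `node` at `path`, creating intermediate directories as needed.
    /// Fails if the path would have to descend through an existing file.
    fn insert(&mut self, path: &str, node: Node) -> Result<(), String> {
        let mut components: Vec<&str> = path.split('/').collect();
        let file = components.pop().expect("split yields at least one item");
        let mut cursor = &mut self.root;
        for dir in components {
            let children = match cursor {
                Link::Ephemeral(children) => children,
                Link::Leaf(_) => return Err(format!("{} traverses a file", path)),
            };
            cursor = children
                .entry(dir.to_string())
                .or_insert_with(|| Link::Ephemeral(BTreeMap::new()));
        }
        match cursor {
            Link::Ephemeral(children) => {
                children.insert(file.to_string(), Link::Leaf(node));
                Ok(())
            }
            Link::Leaf(_) => Err(format!("{} traverses a file", path)),
        }
    }

    /// Look up the node stored at `path`; directories and missing paths give `None`.
    fn get(&self, path: &str) -> Option<&Node> {
        let mut cursor = &self.root;
        for component in path.split('/') {
            match cursor {
                Link::Ephemeral(children) => cursor = children.get(component)?,
                Link::Leaf(_) => return None,
            }
        }
        match cursor {
            Link::Leaf(node) => Some(node),
            Link::Ephemeral(_) => None,
        }
    }
}
```

Iterating over an `Ephemeral` map visits children in sorted order, which is the property the summary relies on.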
Removing elements from the tree is going to be implemented in a future update.
Reviewed By: DurhamG
Differential Revision: D14016273
fbshipit-source-id: d3bc22e5ddb21b689d07a7d74bd639b8c2b138ce
Summary:
The seed for the rust implementation of manifests.
We start with the most primitive API for manifests, one that maps a path to a `Node`. At the basic level we need the same operations that a map implements, so we start with `insert`, `get` and `remove`. We know that retrieving data for manifests can fail, so we encode that in our interface using `Fallible`.
I am leaving iterators and returning manifest flags for future iterations.
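A rough sketch of what such an interface could look like. The method signatures, the string paths, the `MapManifest` type and the local `Fallible` alias (a stand-in for `failure::Fallible`, so the sketch needs no external crate) are all assumptions, not the trait added by this diff.

```rust
use std::collections::HashMap;
use std::error::Error;

/// Local stand-in for `failure::Fallible<T>`.
type Fallible<T> = Result<T, Box<dyn Error>>;

/// Hypothetical 20-byte node hash.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Node([u8; 20]);

/// The map-like operations named in the summary; each one may fail.
trait Manifest {
    fn get(&self, path: &str) -> Fallible<Option<Node>>;
    fn insert(&mut self, path: &str, node: Node) -> Fallible<()>;
    fn remove(&mut self, path: &str) -> Fallible<()>;
}

/// Toy in-memory implementation backed by a `HashMap`.
#[derive(Default)]
struct MapManifest {
    entries: HashMap<String, Node>,
}

impl Manifest for MapManifest {
    fn get(&self, path: &str) -> Fallible<Option<Node>> {
        Ok(self.entries.get(path).copied())
    }

    fn insert(&mut self, path: &str, node: Node) -> Fallible<()> {
        if path.is_empty() {
            return Err("empty path".into());
        }
        self.entries.insert(path.to_string(), node);
        Ok(())
    }

    fn remove(&mut self, path: &str) -> Fallible<()> {
        self.entries.remove(path);
        Ok(())
    }
}
```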
Reviewed By: DurhamG
Differential Revision: D14016274
fbshipit-source-id: 8f1f83610933b9e9a96f8c5ba2c6e50567c76e06
Summary:
`Node` is not friendly with plain old unit tests because constructing one is a bit involved. This diff adds a constructor from u8 purely for test purposes.
I picked a u8 for input because it is the most convenient type. When we move past Rust 1.31 it might make sense to use a u32 and use https://doc.rust-lang.org/std/primitive.u32.html#method.to_le_bytes
Note that property testing is best used in addition to plain old unit testing.
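For illustration only, a sketch of both constructors. The 20-byte layout, the choice of which bytes hold the value, and the names `from_u8` and `from_u32` are assumptions; the real constructor would presumably be gated behind `#[cfg(test)]`.

```rust
/// Hypothetical 20-byte node hash.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Node([u8; 20]);

impl Node {
    /// Test helper: all bytes zero except the last one, which holds `value`.
    fn from_u8(value: u8) -> Self {
        let mut bytes = [0u8; 20];
        bytes[19] = value;
        Node(bytes)
    }

    /// Possible u32 variant using `u32::to_le_bytes` (stable since Rust 1.32).
    fn from_u32(value: u32) -> Self {
        let mut bytes = [0u8; 20];
        bytes[..4].copy_from_slice(&value.to_le_bytes());
        Node(bytes)
    }
}
```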
Reviewed By: DurhamG
Differential Revision: D14016272
fbshipit-source-id: 5b831ab0011ef2575f7e94d158ab4ddf30d1ac06
Summary:
During repack, the repacked files are deleted without any verification. Since
Adam saw some data loss, it's possible that somehow repack didn't fully repack
a packfile but it was deleted. Let's verify that the entire packfile was
repacked before deleting it.
Since repack is mostly a background operation, we don't have a way to notify
the user, but we can log the error to a scuba table to analyse further.
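The check can be sketched roughly as follows, assuming keys can be compared as plain strings. `missing_keys` and `verify_repack` are hypothetical names, and the returned error string stands in for whatever is actually logged.

```rust
use std::collections::HashSet;

/// Keys present in the old packfile but absent from the repacked one.
fn missing_keys<'a>(old_keys: &'a [&'a str], new_keys: &HashSet<&str>) -> Vec<&'a str> {
    old_keys
        .iter()
        .copied()
        .filter(|key| !new_keys.contains(key))
        .collect()
}

/// `Ok(())` means the old packfile is safe to delete; otherwise the error
/// describes what was lost so the caller can log it and keep the file.
fn verify_repack(old_keys: &[&str], new_keys: &HashSet<&str>) -> Result<(), String> {
    let missing = missing_keys(old_keys, new_keys);
    if missing.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "repack dropped {} key(s): {}",
            missing.len(),
            missing.join(", ")
        ))
    }
}
```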
Reviewed By: DurhamG
Differential Revision: D14069766
fbshipit-source-id: 4358a87deeb9732eec1afdfb742e8d81db41cd87
Summary:
Removing files on Windows is hard. It can fail for many reasons, many of which
involve another process having the file opened in some way. One way to work around
this problem is to rename the file instead, since renaming isn't as restrictive as removing.
Since hg repack will attempt removing any temporary files it will also try to
remove the packfiles that we failed to remove earlier.
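A sketch of the fallback using only `std::fs`. The function name and the `-tmp` suffix are assumptions; the actual naming scheme that lets a later repack recognize the file as temporary is not shown in this summary.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Try to remove `path`. If that fails, rename it out of the way instead and
/// return the new name so that a later cleanup pass can retry the removal.
fn remove_or_rename(path: &Path) -> io::Result<Option<PathBuf>> {
    match fs::remove_file(path) {
        Ok(()) => Ok(None),
        Err(_) => {
            // Assumed suffix; any name the cleanup code treats as temporary works.
            let mut renamed = path.as_os_str().to_owned();
            renamed.push("-tmp");
            let renamed = PathBuf::from(renamed);
            fs::rename(path, &renamed)?;
            Ok(Some(renamed))
        }
    }
}
```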
Reviewed By: DurhamG
Differential Revision: D14030445
fbshipit-source-id: 1f3799e021c2e0451943a1d5bd4cd25ed608ffb6
Summary:
Packfiles are named based on their content, so having an on-disk file with the
same name means that they have the same content. If that happens, let's simply
continue without failing.
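Roughly, and only as a sketch: `persist_pack` is a hypothetical helper, and dropping the redundant temporary file is an assumption about what "continue" means here.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Move a finished temporary packfile to its content-derived name.
/// Returns `Ok(true)` if the file was moved and `Ok(false)` if a pack with
/// the same name (and therefore the same content) was already on disk.
fn persist_pack(tmp: &Path, dest: &Path) -> io::Result<bool> {
    if dest.exists() {
        // Assumption of this sketch: the temporary copy is simply discarded.
        fs::remove_file(tmp)?;
        return Ok(false);
    }
    fs::rename(tmp, dest)?;
    Ok(true)
}
```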
Reviewed By: DurhamG
Differential Revision: D14030446
fbshipit-source-id: f04c15507c89b2fca19c95a7b41d8e65c88da019
Summary:
Switch from using OpenSSL (via `native-tls`) to [Rustls](https://github.com/ctz/rustls), a pure-Rust TLS implementation based on the `ring` crypto crate.
Unlike `native-tls`, Rustls supports ALPN, which means it can be used along with Hyper to perform HTTP/2 requests over TLS. (OpenSSL also supports ALPN, but older versions of Windows' `schannel` library do not, and as such `native-tls` doesn't support ALPN either regardless of platform.)
Rustls also builds on Windows without any special configuration, sidestepping the issues we've been having with OpenSSL in the Windows build.
Reviewed By: quark-zju
Differential Revision: D14070084
fbshipit-source-id: 25268c58a88177f4708370696d326b4c0bdc89a0
Summary:
Repacking one packfile will yield the same packfile, so we can save some IO by
not trying to repack.
Differential Revision: D14013789
fbshipit-source-id: 8069840cc7cb1837eb94cea97e50b3bbaa548873
Summary:
D13875656 made a config path change that breaks tests without using HGRCPATH,
or local build runs.
Reviewed By: DurhamG
Differential Revision: D14034919
fbshipit-source-id: 80de214f1769a8f40e79dc0ab1dbba4d55f506a7
Summary:
We've settled on the following grouping for imports:
- standard library
- 3rd party crates
- internal crates
- modules within same crate
This diff updates revisionstore accordingly.
Reviewed By: singhsrb
Differential Revision: D14030243
fbshipit-source-id: 74a7897342e39eb1d80202c8aae8c149bf08fc41
Summary: It turns out that `hyper-tls` does not support ALPN for negotiating HTTP/2 connections, and only supports HTTP/2 prior knowledge. (This is a limitation of the underlying TLS library, `native-tls`.) Unfortunately, while the Mononoke API server itself is fine with HTTP/2 prior knowledge for non-TLS connections, the Mononoke VIPs require TLS, and thus per the HTTP/2 spec require ALPN negotiation from an HTTP/1.1 initial connection. As a result, we need to revert back to using HTTP/1.1 for now in order to use TLS.
Reviewed By: singhsrb
Differential Revision: D14015335
fbshipit-source-id: b78197d4cfecf184479162c5b14ba54cbef66ee7
Summary:
Change system config entry point to only `/etc/mercurial/system.rc` (unix) and
`\ProgramData\Facebook\Mercurial\system.rc` (Windows) so they won't overlap
with a vanilla Mercurial installation.
Another goal of this change is to make it easier to drop the directory
`%include` feature. So detecting config changes (for example, edenfs wants to
make sure ignore rules are up-to-date) can be made cheaper by just stating
files without `listdir`.
Reviewed By: markbt
Differential Revision: D13875656
fbshipit-source-id: 314c0bf87ff086dec5b88e232edca0133356484e
Summary: This will be used by scmmemcache to send history data to memcache
Reviewed By: DurhamG
Differential Revision: D13975346
fbshipit-source-id: f41eaf9a4968072dd07efbcd9d539e6293c3fa4f
Summary:
We can now fetch history data stored in memcache and write it to a history
pack.
Reviewed By: DurhamG
Differential Revision: D13975308
fbshipit-source-id: 2196328ad60a55d1e2b39d88d939f434e496837a
Summary:
The initial get_data/set_data only sent the full-text to memcache, which is
just enough for non-LFS data. Let's use Serde to serialize/deserialize the data
that we send to memcache. This will make it simple to add checksumming, or more
metadata to it.
Reviewed By: DurhamG
Differential Revision: D13974714
fbshipit-source-id: 41a235e1d1e8128b14f00b668745f4f9a070a360
Summary:
Similarly to the get_data, we can now read a datapack and send the proper
deltas to memcache. This change is lacking in the same way the get_data is.
Reviewed By: DurhamG
Differential Revision: D13886026
fbshipit-source-id: a00475e89b7e75dbbe9afa9f9d293a686f969a3f
Summary:
The IterableStore trait allows iterating over all the keys of a DataStore.
Since this is applicable to a UnionStore, let's implement it there. We can now
use it in their async variants.
Regarding the async variants, the code effectively builds a Vec of Key, which
may use a lot of memory. A better alternative would be to use a Stream of Key.
This will be tackled later.
Reviewed By: DurhamG
Differential Revision: D13951905
fbshipit-source-id: 15944b18d7ffea08d191e5dc7e1b8e2b783f69d1
Summary: TP2 version for itertools was updated to 0.8.
Reviewed By: singhsrb
Differential Revision: D14008855
fbshipit-source-id: 081a43c5b02cd39c6a0a6b491bfa0767ddf0b7ed
Summary: `lib/argparse` fails to build with cargo. Removing the crate from the workspace to unblock building with cargo.
Reviewed By: quark-zju
Differential Revision: D13969332
fbshipit-source-id: 0299f74e6aa81632ce64005d91fa2c30a32f5b96
Summary: Ensure that Hyper uses HTTP/2, since we'd like to support connection reuse and multiplexing.
Reviewed By: DurhamG
Differential Revision: D13925320
fbshipit-source-id: 0f39e66fe35a0dc95966d16772d1ab8988067c11
Summary: In Rust it is typically more idiomatic to have a static method on a struct to produce a builder, since this means the builder doesn't need to be explicitly imported to construct a new instance of the struct.
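The pattern looks roughly like this. `Client`, `ClientBuilder` and their fields are placeholder names for illustration, not the types touched by this diff.

```rust
#[derive(Debug, PartialEq)]
struct Client {
    base_url: String,
    http2: bool,
}

#[derive(Debug, Default)]
struct ClientBuilder {
    base_url: Option<String>,
    http2: bool,
}

impl Client {
    /// Entry point: callers only need `Client` in scope to reach the builder.
    fn builder() -> ClientBuilder {
        ClientBuilder::default()
    }
}

impl ClientBuilder {
    fn base_url(mut self, url: &str) -> Self {
        self.base_url = Some(url.to_string());
        self
    }

    fn http2(mut self, enabled: bool) -> Self {
        self.http2 = enabled;
        self
    }

    /// Validate the configuration and produce the client.
    fn build(self) -> Result<Client, String> {
        let base_url = self
            .base_url
            .ok_or_else(|| "base_url is required".to_string())?;
        Ok(Client { base_url, http2: self.http2 })
    }
}
```

A caller then writes `Client::builder().base_url("https://example.com").http2(true).build()` without ever importing `ClientBuilder`.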
Reviewed By: DurhamG
Differential Revision: D13925323
fbshipit-source-id: c06d5d42ba941dbbb2c619f9470e79fa23f35f68
Summary: Rename Mononoke API to Eden API, per war room discussion.
Reviewed By: quark-zju
Differential Revision: D13908195
fbshipit-source-id: 94a2fe93f8a89d0c5e9b6a24939cc4760cfaade0
Summary:
The Rc is required by the c_api, but there is no longer a reason for
UnionDataStore and UnionHistoryStore to use an Rc, so let's move the Rc into
c_api.
Reviewed By: DurhamG
Differential Revision: D13928332
fbshipit-source-id: a93b54e022d539dc4df9144a8c59e9ffbe3453e0
Summary:
By specifying the IntoIterator differently, we can avoid the clone requirement.
Since Clone isn't implemented on either DataPack or HistoryPack, this will
simplify the callers a bit
Reviewed By: DurhamG
Differential Revision: D13928274
fbshipit-source-id: f0261c50d73868689ebb3ae226f84d41c4c40925
Summary: This way, HistoryStore type constraint will work with these types.
Reviewed By: DurhamG
Differential Revision: D13928128
fbshipit-source-id: aaa9f2633166c137dca5fc2b1f44caab92b57a80
Summary: This way, DataStore type constraint will work with these types.
Reviewed By: DurhamG
Differential Revision: D13928090
fbshipit-source-id: 1567556e3ffea2901acbc754b3bd67491e23056b
Summary: The UnionStore doesn't need internal mutability, so let's simplify it.
Reviewed By: DurhamG
Differential Revision: D13928058
fbshipit-source-id: f0ba085ff8401dcc99fc69c3eb6f5e20c071d650
Summary: This just reuses the AsyncHistoryStore methods.
Reviewed By: DurhamG
Differential Revision: D13891142
fbshipit-source-id: 9553e9824eebc5eacf6a82f9d0f212a62ec8955f
Summary:
Similarly to AsyncDataStore, this is just a blocking wrapper around a
HistoryStore.
Reviewed By: DurhamG
Differential Revision: D13891140
fbshipit-source-id: 76acadfc1849770b47e2400ce8c70f7e32bba4df
Summary: This will be used to wrap a HistoryStore into an AsyncHistoryStore.
Reviewed By: DurhamG
Differential Revision: D13891139
fbshipit-source-id: 41a0ec740f05268259a654e769ff0909617102ff
Summary: Add metadata to each delta entry written to the datapack. Since the HTTP API never serves LFS files, and the only flag currently used simply indicates whether a file should use LFS, the flag field is intentionally set to `None`, leaving only the size in the metadata (which, since we're storing full file content, is the same as the content length).
Differential Revision: D13894292
fbshipit-source-id: 36db25adb0c46cd1c7fde841a69d3e6d48d08d06
Summary: Give MononokeClient the ability to fetch multiple files concurrently. Right now this functionality is not exposed via the Python bindings, so as far as the Mercurial Python code is concerned, nothing has changed. The multi-get functionality will be used later in the stack.
Differential Revision: D13893575
fbshipit-source-id: c9e514fbeb41bbb37f52f6df3920eb01a66df293
Summary: As `MononokeClient` grows, we're going to add more inherent methods on the struct. To avoid cluttering the `client` module, split out all the builder-related things into a separate module.
Reviewed By: singhsrb
Differential Revision: D13892198
fbshipit-source-id: 42918d8a775d8328cfad8a6ac0365cb336893d8f
Summary: Add a new `get_file()` method to `MononokeClient` that fetches Mercurial file content from the API server and writes it to a datapack in the cache. This functionality is exposed via the new `hg debuggetfile` debug command, which takes a filenode and file path and fetches the corresponding file.
Differential Revision: D13889829
fbshipit-source-id: 2b68bf114ee72d641de7a1043cca1975e34cf4e6
Summary:
Crate adding easy conversions between `http::Uri` and `url::Url`.
Rust has two main types for working with URLs: `http::Uri` and `url::Url`. `http::Uri` comes from the `http` crate, which is supposed to be a set of common types to be used throughout the Rust HTTP ecosystem, to ensure mutual compatibility between different HTTP crates and web frameworks. This is the type that HTTP clients like Hyper expect when specifying URLs.
Unfortunately, `http::Uri` is a very simple type that does not expose any means of mutating or otherwise manipulating the URL. It can only parse URLs from strings, forcing the users to construct URLs via error-prone string concatenation.
In contrast, the `url::Url` comes from the `rust-url` crate from the Servo project. This type does support easily constructing and manipulating URLs, making it very useful for assembling a URL from components.
The only way to convert between the two types is to first convert back to a string, and then re-parse as the desired type. Several issues [have](https://github.com/hyperium/hyper/issues/1219) [been](https://github.com/hyperium/hyper/issues/1102) [raised](https://github.com/hyperium/hyper/issues/1219) about this upstream, but there has been no consensus or action as of yet. To get around the problem for now, this crate adds convenience methods to perform the conversions.
Reviewed By: DurhamG
Differential Revision: D13887403
fbshipit-source-id: ecfaf3ea9d884621493b0fe44a6b5658d10108b4
Summary:
D13853115 adds `edenscm/` to `sys.path` and code still uses `import mercurial`.
That has nasty problems if both `import mercurial` and
`import edenscm.mercurial` are used, because Python would think `mercurial.foo`
and `edenscm.mercurial.foo` are different modules so code like
`try: ... except mercurial.error.Foo: ...`, or `isinstance(x, mercurial.foo.Bar)`
would fail to handle the `edenscm.mercurial` version. There are also some
module-level states (ex. `extensions._extensions`) that would cause trouble if
they have multiple versions in a single process.
Change imports to use the `edenscm` so ideally the `mercurial` is no longer
imported at all. Add checks in extensions.py to catch unexpected extensions
importing modules from the old (wrong) locations when running tests.
Reviewed By: phillco
Differential Revision: D13868981
fbshipit-source-id: f4e2513766957fd81d85407994f7521a08e4de48
Summary: Some of the revisionstore imports were unused.
Reviewed By: kulshrax
Differential Revision: D13865074
fbshipit-source-id: 79c7c2ba869f2e1d72fa06aac70a4b027367c831
Summary: Similar to previous diff in this stack, make this type serializable so we can send it as part of an HTTP request.
Reviewed By: singhsrb
Differential Revision: D13858440
fbshipit-source-id: 9173a3e76bcfa6a6600d30ada39d65475f95bc5e
Summary: Make this type serializable so it can be sent as part of an HTTP request. By using Serde, we can easily support a variety of serialization formats without code changes.
Reviewed By: singhsrb
Differential Revision: D13858443
fbshipit-source-id: b6c83f38eaadbb2a28be6d66faf6a3610ede970f
Summary:
The conditional if statement did not prevent the logic inside the
condition from being compiled, which in this case fails on Windows. Instead of
using an if, let's just define two functions and conditionally compile the
functions.
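To illustrate the difference: with `if cfg!(unix) { ... }` both branches are still type-checked on every platform, so a Unix-only import breaks the Windows build, whereas `#[cfg(...)]` removes the whole item. The sketch below assumes the code in question sets file permissions; `make_readable` and the `0o444` mode are hypothetical.

```rust
use std::fs::File;
use std::io;

/// Unix build: only compiled when targeting Unix, so the Unix-only
/// `PermissionsExt` trait never reaches the Windows compiler.
#[cfg(unix)]
fn make_readable(file: &File) -> io::Result<()> {
    use std::fs::Permissions;
    use std::os::unix::fs::PermissionsExt;
    // Assumed mode: read-only for user, group and others.
    file.set_permissions(Permissions::from_mode(0o444))
}

/// Non-Unix build: fall back to the portable read-only flag.
#[cfg(not(unix))]
fn make_readable(file: &File) -> io::Result<()> {
    let mut perms = file.metadata()?.permissions();
    perms.set_readonly(true);
    file.set_permissions(perms)
}
```

Callers use `make_readable(&file)` on every platform and never see the conditional compilation.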
Reviewed By: ikostia
Differential Revision: D13855560
fbshipit-source-id: ac417e6bd8fb272106fe8f3b9a8b7db57214ad88
Summary:
Move top-level Python packages `mercurial`, `hgext` and `hgdemandimport` to
a new top-level package `edenscm`. This allows the Python packages provided by
the upstream Mercurial to be installed side-by-side.
To maintain compatibility, `edenscm/` gets added to `sys.path` in
`mercurial/__init__.py`.
Reviewed By: phillco, ikostia
Differential Revision: D13853115
fbshipit-source-id: b296b0673dc54c61ef6a591ebc687057ff53b22e
Summary:
As a last step towards getting rid of loosefiles, memcache will soon be changed
to produce packfiles. One of the missing pieces to achieve this is the ability to
read and write packfiles asynchronously, as memcache is purely async.
As a first step, we can wrap the packfile into a blocking context.
Reviewed By: DurhamG
Differential Revision: D13806738
fbshipit-source-id: 2211c2a984a453edbb1647830f7f5fb399a03023
Summary:
As a last step towards getting rid of loosefiles, memcache will soon be changed
to produce packfiles. One of the missing pieces to achieve this is the ability to
read and write packfiles asynchronously, as memcache is purely async.
As a first step, we can wrap the packfile into a blocking context.
Reviewed By: DurhamG
Differential Revision: D13804184
fbshipit-source-id: 01fcb57af1558feca662b1070969f553c479871a
Summary:
The tempfile Rust crate opens the file with RW permissions for the user only,
but once written out to disk, the file needs to be readable by everyone.
Unfortunately, rust doesn't have a portable way of doing this, so we have to
resort to using `if cfg!(unix)` conditions for doing this.
Reviewed By: DurhamG
Differential Revision: D13703406
fbshipit-source-id: 688bc679b5c1a7943ceab723c1f649d555b61a7a
Summary:
This allows de-duplicating the logic for setting proper permissions on the
files. Most of the changes are code movement and rustfmt formatting.
Reviewed By: DurhamG
Differential Revision: D13703392
fbshipit-source-id: 28be85ef2d4b440202cf4885e50e62ac3c41f774
Summary: Allow the credentials for TLS mutual authentication (namely, the client certificate and private key) to come from separate PEM files. At Facebook, these are usually stored in the same file, but Mercurial's standard TLS configuration options allow these to be configured separately. As such, in order to support the standard options (which will happen in a later diff), provide the ability to handle separate files, but for now just pass the same path for both from Python to Rust.
Reviewed By: markbt
Differential Revision: D13791525
fbshipit-source-id: 556d99d77a4273b9b0bd91cac8940da136088e45
Summary: Use a builder struct rather than a constructor function to configure and initialize new `MononokeClient` instances. Doing it this way is helpful because later in this stack, we'll need to pass a lot of additional configuration to `MononokeClient`; adding all of these items as parameters to the constructor quickly becomes unwieldy. Using a builder keeps the number of parameters in check.
Differential Revision: D13780408
fbshipit-source-id: bfc43ecbe474d5285ae87d4df9cce244a7ff391d
Summary:
Split up the functionality in `MononokeClient` by moving all of the Mononoke API methods to their own separate trait. This maintains a distinction between functionality that is part of the API vs methods for setting up and configuring the client.
Originally, I had tried to avoid using a trait here because of limitations on trait methods (for example, we can't use `impl Trait` for return types). In practice, I don't think this limitation will be an issue since the API exposed by the client needs to be synchronous (since it will be called by FFI bindings to Python), and as such, there shouldn't be any complex Future return types in the API. (The client will still use async code internally, but the external API will be synchronous.)
Differential Revision: D13780089
fbshipit-source-id: 17e80f549d6ac7c41c60b2b8389eb1760531883e
Summary: Boxed slices are difficult to use in practice, so use `Vec<u8>` instead. (No need for `Bytes` here since there is no reference counting required.)
Reviewed By: DurhamG
Differential Revision: D13770055
fbshipit-source-id: 78f48ac32a4da9c105bf05eb44889c1f492721a8
Summary: Use `Bytes` instead of `Rc<Box<[u8]>>` since the former is a nicer type to represent a reference counted heap allocated byte buffer. (Note that `Rc<Box<[u8]>>` should have originally been `Rc<[u8]>` -- the former introduces an unnecessary allocation and layer of indirection.)
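The parenthetical about indirection can be seen with the standard library alone. This sketch does not use the `bytes` crate, and the two function names are made up for illustration.

```rust
use std::rc::Rc;

/// `Rc<Box<[u8]>>`: the `Rc` allocation holds the counts and a pointer,
/// and the bytes themselves live in a second allocation.
fn double_indirection(data: Vec<u8>) -> Rc<Box<[u8]>> {
    Rc::new(data.into_boxed_slice())
}

/// `Rc<[u8]>`: the bytes are stored in the same allocation as the counts.
fn single_allocation(data: Vec<u8>) -> Rc<[u8]> {
    Rc::from(data)
}
```

In both cases `Rc::clone` only bumps a reference count, so sharing the buffer never copies the bytes.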
Differential Revision: D13769306
fbshipit-source-id: 5f3e788426e28c7e9ccc478f993c717b23663f56
Summary: Boxed bytes slices (e.g., `Box<[u8]>`, `Rc<[u8]>`) are not very ergonomic to use and are somewhat unusual in Rust code. Use the more common and easier to use `Bytes` type instead. Since this type supports shallow, reference-counted copies, there shouldn't be any new O(n) copying behavior compared to `Rc<[u8]>`.
Reviewed By: markbt
Differential Revision: D13754730
fbshipit-source-id: d5fbc8e39c84c56d30174f4bb194ee21a14bf944
Summary: Use `failure::Fallible<T>` in place of `Result<T, failure::Error>`.
Reviewed By: singhsrb
Differential Revision: D13754688
fbshipit-source-id: cfbe418f5213884816d4837d1077cd90a17359b6
Summary:
Previously, `use` statements were inconsistently and arbitrarily grouped. This diff groups them in the following order:
- 3rd party crates from crates.io
- local crates
- std library imports (collapsed into a single multiline `use` statement)
- modules within current crate
This new ordering ensures that upon migration to Rust 2018, all imports from within the current crate will be grouped together with the `crate::` prefix.
Reviewed By: singhsrb
Differential Revision: D13754393
fbshipit-source-id: e774c09e0547066afa5f797c1a9c2e5ec4190834
Summary: Run the latest version of rustfmt over the code to ensure consistent style.
Reviewed By: singhsrb
Differential Revision: D13754394
fbshipit-source-id: 6cf5937bcb642530bdf41aaf83399366a9ba3c9a
Summary: There were some warnings about unused private fields in various structs in this crate. Add `#[allow(dead_code)]` as needed to suppress these warnings.
Reviewed By: singhsrb
Differential Revision: D13754234
fbshipit-source-id: ca95a2afbfc67ddb66e7c7436c81cde0fa59f06c
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.
Differential Revision: D13732298
fbshipit-source-id: 2577bc4c34da5b7a88ae2703f9b898bc2a83b816
Summary: The canonical URL type in Rust, `http::Uri`, does not support manipulating URLs easily. (e.g., concatenating path components, etc.) As such, switch to using the `Url` type from the `url` crate, which does support URL manipulation, and convert to `http::Uri` before passing the resulting URL to Hyper.
Reviewed By: phillco
Differential Revision: D13738139
fbshipit-source-id: c7de67f1596ebc1bdde89d3fe87086f49c32b5db
Summary:
Directory listing is different in every OS, and due to the current repack
implementation, this directly affects the order in which the packfiles are added
to the new one. Since the resulting packfile name depends on the hash of its
content, the name was influenced by the directory order.
By sorting the files in list_packs, the packfile name will be independent of
the directory listing and thus be the same for all the OSes.
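A sketch of a sorted listing using `std::fs::read_dir`. The signature shown here (directory plus extension) is an assumption and may not match the real `list_packs`.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// List the files in `dir` with the given extension, in sorted order, so the
/// result does not depend on the order the OS returns directory entries in.
fn list_packs(dir: &Path, extension: &str) -> io::Result<Vec<PathBuf>> {
    let mut packs = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.extension().map_or(false, |ext| ext == extension) {
            packs.push(path);
        }
    }
    packs.sort();
    Ok(packs)
}
```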
Reviewed By: singhsrb
Differential Revision: D13700935
fbshipit-source-id: 01e055a0c1bcf7fb2dc4faf614dfb20cd4499017
Summary: For now, combine all files smaller than 100MB that accumulate to less than 4GB.
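One possible reading of this rule, as a sketch: the constant names, the use of binary megabytes and gigabytes, and stopping at the first file that would cross the total are all assumptions.

```rust
/// Assumed thresholds: 100MB per file, 4GB in total (binary units).
const MAX_PACK_SIZE: u64 = 100 * 1024 * 1024;
const MAX_TOTAL_SIZE: u64 = 4 * 1024 * 1024 * 1024;

/// Given packfile sizes in bytes, return the indices of the files to combine:
/// every file under the per-file limit, until the running total would reach
/// the overall limit.
fn select_for_repack(sizes: &[u64]) -> Vec<usize> {
    let mut total = 0u64;
    let mut selected = Vec::new();
    for (index, &size) in sizes.iter().enumerate() {
        if size >= MAX_PACK_SIZE {
            continue; // too large, leave this packfile alone
        }
        if total + size >= MAX_TOTAL_SIZE {
            break; // adding this file would reach the total limit
        }
        total += size;
        selected.push(index);
    }
    selected
}
```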
Reviewed By: DurhamG
Differential Revision: D13603760
fbshipit-source-id: 3fa74f1ced3d3ccd463af8f187ef5e0254e1820b
Summary: Use the newly introduced PackWriter to write the {data,history}packs.
Reviewed By: markbt
Differential Revision: D13603759
fbshipit-source-id: 528a6af7c4ac3321aeec0559805de12114224cfd
Summary:
The packfiles are currently being written via an unbuffered file. This is
inefficient as every write to the file results in a write(2) syscall.
By buffering these writes we can reduce the number of syscalls and thus
increase the throughput of pack writing operations.
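The effect can be demonstrated with `std::io::BufWriter` and a sink that counts calls. `CountingSink` is a made-up stand-in for a file, with each call to `write` playing the role of one write(2) syscall.

```rust
use std::io::{self, BufWriter, Write};

/// Sink that records how many times `write` is called.
#[derive(Default)]
struct CountingSink {
    writes: usize,
    bytes: Vec<u8>,
}

impl Write for CountingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes += 1;
        self.bytes.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Write every entry through a `BufWriter`, flushing once at the end.
fn write_buffered(sink: &mut CountingSink, entries: &[&[u8]]) -> io::Result<()> {
    let mut writer = BufWriter::new(sink);
    for entry in entries {
        writer.write_all(entry)?;
    }
    writer.flush()
}

/// Same data without buffering: one `write` call per entry.
fn write_unbuffered(sink: &mut CountingSink, entries: &[&[u8]]) -> io::Result<()> {
    for entry in entries {
        sink.write_all(entry)?;
    }
    Ok(())
}
```

For small entries that fit in the buffer, the buffered path reaches the sink once at flush time, while the unbuffered path reaches it once per entry.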
Reviewed By: markbt
Differential Revision: D13603758
fbshipit-source-id: 649186a852d427a1473695b1d32cc9cd87a74a75
Summary:
Update pest to 2.1.0.
This version has a new behaviour for parser error messages: the line feed at
the end of the line is shown in the error output.
Reviewed By: wez
Differential Revision: D13671099
fbshipit-source-id: b8d1142a44a56a0b21b3b72cf027f3f8a30f421e
Summary:
The revisionstore crate currently consists of several public submodules,
each exposing several public types. The APIs exposed by each of the modules
require using types from the other modules. As such, users of this crate are
forced to have complex nested imports to use any of its functionality.
This diff helps ease this problem by reexporting the public types exposed from
each of the public submodules at the top level, thereby allowing crate users to
`use` all of the required types without needing nested imports.
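In miniature, the re-export looks like this. The module `revisionstore_sketch`, its submodules and the `Delta` struct are illustrative stand-ins, not the crate's actual layout.

```rust
mod revisionstore_sketch {
    pub mod datastore {
        pub struct Delta {
            pub data: Vec<u8>,
        }
    }

    pub mod key {
        pub struct Key {
            pub name: String,
        }
    }

    // Re-export the public types at the top level of the crate.
    pub use self::datastore::Delta;
    pub use self::key::Key;
}

// Users import everything from one place instead of from each submodule.
use revisionstore_sketch::{Delta, Key};

fn describe(key: &Key, delta: &Delta) -> String {
    format!("{} ({} bytes)", key.name, delta.data.len())
}
```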
Reviewed By: singhsrb
Differential Revision: D13686913
fbshipit-source-id: 9fb3cce8783787aa5f3f974c7168afada5952712
Summary:
The latter tries to read from the disk, while the former is purely in memory and
thus more efficient.
Reviewed By: DurhamG, markbt
Differential Revision: D13603757
fbshipit-source-id: 5fd120ba4065d6a65cb2982db9ab81db3ea26524
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.
Differential Revision: D13657313
fbshipit-source-id: ae249bc15037cc2be019ce7ce8a440c153aa31cc
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.
Differential Revision: D13657312
fbshipit-source-id: 55134ee93f1f3aaaeefe5644a4a1f2285603bc1c
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.
Differential Revision: D13657314
fbshipit-source-id: f1a379089972f7f0066c49ddedf606d36b7ac260
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.
Differential Revision: D13657310
fbshipit-source-id: cae73fc239a6ad30bb6ef56a664d1ef5a2a19b5f
Summary:
On some platforms, removing a file can fail if it's still mapped or opened. In
mercurial, this can happen during repack as the datapacks are removed while
still being mapped.
Reviewed By: DurhamG
Differential Revision: D13615938
fbshipit-source-id: fdc1ff9370e2767e52ee1828552f4598105f784f
Summary:
After repacking the data/history packs, we need to clean up the
repacked files. This was an omission from D13363853.
Reviewed By: markbt
Differential Revision: D13577592
fbshipit-source-id: 36e7d5b8e86affe47cdd10d33a769969f02b8a62
Summary:
The python version of the mutable packs set the permission to read-only after
writing them, while the rust version keeps them writeable. Let's make the rust
one more consistent.
Reviewed By: markbt
Differential Revision: D13573572
fbshipit-source-id: 61256994562aa09058a88a7935c16dfd7ddf9d18
Summary:
Use of `write!` requires checking for errors, however in this case, there is no
need to use `write!`, as we just want the error as a string.
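For comparison, a sketch of both approaches with hypothetical helper names: `write!` into a `String` returns a `fmt::Result` that has to be handled, while `format!` (or `to_string`) yields the string directly.

```rust
use std::fmt::{self, Write};

/// Using `write!`: the result must be checked even though writing to a
/// `String` does not fail in practice.
fn with_write<E: fmt::Display>(err: &E) -> Result<String, fmt::Error> {
    let mut out = String::new();
    write!(out, "{}", err)?;
    Ok(out)
}

/// Using `format!`: no intermediate `Result` to handle.
fn with_format<E: fmt::Display>(err: &E) -> String {
    format!("{}", err)
}
```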
Reviewed By: ikostia
Differential Revision: D13596497
fbshipit-source-id: 5892025344936936188cf3a8ca227e71eff57d55
Summary:
When I was debugging an eden importer issue with Puneet, we saw errors caused
by important extensions (ex. remotefilelog, lz4revlog) not being loaded. It
turned out that configparser was checking the "exe dir" to decide where to
load "system configs". For example, if we run:
C:\open\fbsource\fbcode\scm\hg\build\pythonMSVC2015\python.exe eden_import_helper.py
The "exe dir" is "C:\open\fbsource\fbcode\scm\hg\build", and system config is
not there.
Instead of copying "mercurial.ini" to every possible "exe dir", this diff just
switches to a hard-coded system config path. It's now consistent with what we
do on POSIX systems.
The logic to copy "mercurial.ini" to "C:\open\fbsource\fbcode\scm\hg" or
"C:\tools\hg" becomes unnecessary and is removed.
Reviewed By: singhsrb
Differential Revision: D13542939
fbshipit-source-id: 5fb50d8e42d36ec6da28af29de89966628fe5549
Summary:
`test-check-fix-code.t` was failing due to copyright header missing
from certain files. This commit fixes the files by running
```
contrib/fix-code.py FILE
```
as suggested in the failure message.
Reviewed By: DurhamG
Differential Revision: D13538506
fbshipit-source-id: d8063c9a0e665377a9976abeccb68fbef6781950
Summary:
Unfortunately required symbols are not exposed by lz4-sys. So we just declare
them ourselves.
Make sure it compresses better:
In [1]: c=open('/bin/bash').read();
In [2]: from mercurial.rust import lz4
In [3]: len(lz4.compress(c))
Out[3]: 762906
In [4]: len(lz4.compresshc(c))
Out[4]: 626970
While it's much slower for larger data (and compresshc is slower than pylz4):
Benchmarking (easy to compress data, 20MB)...
pylz4.compress: 10328.03 MB/s
rustlz4.compress_py: 9373.84 MB/s
pylz4.compressHC: 1666.80 MB/s
rustlz4.compresshc_py: 8298.57 MB/s
pylz4.decompress: 3953.03 MB/s
rustlz4.decompress_py: 3935.57 MB/s
Benchmarking (hard to compress data, 0.2MB)...
pylz4.compress: 4357.88 MB/s
rustlz4.compress_py: 4193.34 MB/s
pylz4.compressHC: 3740.40 MB/s
rustlz4.compresshc_py: 2730.71 MB/s
pylz4.decompress: 5600.94 MB/s
rustlz4.decompress_py: 5362.96 MB/s
Benchmarking (hard to compress data, 20MB)...
pylz4.compress: 5156.72 MB/s
rustlz4.compress_py: 5447.00 MB/s
pylz4.compressHC: 33.70 MB/s
rustlz4.compresshc_py: 22.25 MB/s
pylz4.decompress: 2375.42 MB/s
rustlz4.decompress_py: 5755.46 MB/s
Note python-lz4 was using an ancient version of lz4. So there could be differences.
Reviewed By: DurhamG
Differential Revision: D13528200
fbshipit-source-id: 6be1c1dd71f57d40dcffcc8d212d40a853583254
Summary:
The `pybuf` provides a way to read `bytes`, `bytearray`, some `buffer` types in
a zero-copy way. The main benefit is to use same code to support different
input types. It's copied to a couple of places. Let's move it to `cpython-ext`.
Reviewed By: DurhamG
Differential Revision: D13516206
fbshipit-source-id: f58881c4bfe651a6fdb84cf317a74c3c8d7a4961
Summary: Make it possible to write content directly into a PyBytes buffer.
Reviewed By: DurhamG
Differential Revision: D13528202
fbshipit-source-id: 8c0a4ed030439a8dc40cdfbd72b1f6734a8b2036
Summary:
This allows decompressing into a pre-allocated buffer. After some experiments,
it seems `bytearray` will just break too many things, ex:
- bytearray is not hashable
- bytearray[index] returns an int
- a = bytearray('x'); b = a; b += '3' # will mutate 'a'
- ''.join([bytearray('')]) will raise TypeError
Therefore we have to use zero-copy `bytes` instead, which is less elegant. But
this API change is a step forward.
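The first three pitfalls listed above can be reproduced with a small sketch like the following. It assumes Python 3, so literals are written as `b'...'`; the `''.join` item concerns Python 2 `str` and is left out. The function name `bytearray_pitfalls` is illustrative only.

```python
def bytearray_pitfalls():
    """Collect observable differences between bytearray and bytes."""
    results = {}

    # bytearray is mutable, so it cannot be used as a dict key or set member.
    try:
        hash(bytearray(b"x"))
        results["hashable"] = True
    except TypeError:
        results["hashable"] = False

    # Indexing a bytearray yields an int, not a one-byte string.
    results["index_type"] = type(bytearray(b"x")[0]).__name__

    # In-place += mutates the shared object, so the alias 'a' changes too.
    a = bytearray(b"x")
    b = a
    b += b"3"
    results["alias_mutated"] = bytes(a)

    # With immutable bytes, += rebinds 'd' and leaves 'c' untouched.
    c = b"x"
    d = c
    d += b"3"
    results["bytes_unchanged"] = c

    return results


print(bytearray_pitfalls())
```

The aliasing case is the most surprising one for callers that previously received `bytes`: code that appends to a returned buffer would silently modify every other reference to it.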
Reviewed By: DurhamG
Differential Revision: D13528201
fbshipit-source-id: 1cfaf5d55efdc0d6c0df85df9960fe9682028b08
Summary:
I need to convert `Vec<u8>` to a Python object in a zero-copy way for rustlz4
performance.
Assuming Python and Rust use the same memory allocator, it's possible to transfer
the control of a malloc-ed pointer from Rust to Python. Use this to implement
zero-copy. PyByteArrayObject is chosen because its struct contains such a pointer.
PyBytes cannot be used as it embeds the bytes, without using a pointer.
Sadly there are no CPython APIs to do this job. So we have to write to the raw
structures. That means the code will crash if python is replaced by
python-debug (due to the Python object header change). However, that seems less of an
issue given the performance wins. If python-debug does become a problem, we can
try vendoring libpython directly.
I didn't implement a feature-rich `PyByteArray` Rust object. It's not easy to
do so outside the cpython crate. Most helper macros to declare types cannot be
reused, because they refer to `::python`, which is not available in the current
crate.
Reviewed By: DurhamG
Differential Revision: D13516209
fbshipit-source-id: 9aa089b309beb71d4d21f6c63fcb97dbc798b5f8
Summary:
This gives some sense of how fast it is.
Background: I was trying to get rid of python-lz4, by exposing this to Python.
However, I noticed it's 10x slower than python-lz4. Therefore I added some
benchmarks here to test whether the slowdown comes from the wrapper or the Rust lz4 code.
It does not seem to be this crate:
```
# Pure Rust
compress (100M) 77.170 ms
decompress (~100M) 67.043 ms
# python-lz4
In [1]: import lz4, os
In [2]: b=os.urandom(100000000);
In [3]: %timeit lz4.compress(b)
10 loops, best of 3: 87.4 ms per loop
```
Reviewed By: DurhamG
Differential Revision: D13516205
fbshipit-source-id: f55f94bbecc3b49667ed12174f7000b1aa29e7c4
Summary:
This exposes the underlying lookup functions from `Index`.
Alternatively we can allow access to `Index` and provide an `iter_started_from`
method on `Log` which takes a raw offset. I have been trying to avoid exposing
raw offsets in public interfaces, as they would change after `flush()` and cause
problems.
Reviewed By: markbt
Differential Revision: D13498303
fbshipit-source-id: 8b00a2a36a9383e3edb6fd7495a005bc985fd461
Summary:
This is the missing API before `indexedlog::Index` can fit in the
`changelog.partialmatch` case. It's actually more flexible as it can provide
some example commit hashes, while the existing revlog.c and radixbuf
implementations just error out saying "ambiguous prefix".
It can also be "abused" to get the semantics of sorted "sub-keys": by replacing
"key" with "key + subkey" when inserting into the index, looking up using "key"
returns a lazy result list (`PrefixIter`) sorted by "subkey". Note:
the radix tree is NOT efficient (both in time and space) when there are common
prefixes, so this use case needs care.
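The "key + subkey" idea can be sketched independently of `indexedlog`. The toy class below is not the `Index` API and does not use a radix tree; it is a hypothetical stand-in backed by a sorted list, with the made-up names `SortedPrefixIndex`, `insert` and `scan_prefix`, meant only to show why a prefix scan on "key" yields entries ordered by "subkey".

```python
import bisect


class SortedPrefixIndex:
    """Toy ordered index; a stand-in for illustration, not a radix tree."""

    def __init__(self):
        self._keys = []  # kept sorted at all times

    def insert(self, key):
        # Insert while preserving sort order; ignore exact duplicates.
        i = bisect.bisect_left(self._keys, key)
        if i == len(self._keys) or self._keys[i] != key:
            self._keys.insert(i, key)

    def scan_prefix(self, prefix):
        # Lazily yield every stored key starting with 'prefix', in order.
        i = bisect.bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i]
            i += 1


index = SortedPrefixIndex()
# Store "key + subkey" instead of "key"; insertion order does not matter.
for subkey in (b"c", b"a", b"b"):
    index.insert(b"k1:" + subkey)
index.insert(b"k2:a")

# Looking up by "key" returns the entries sorted by "subkey".
print(list(index.scan_prefix(b"k1:")))
```

The same scan also covers the ambiguous-prefix case: instead of failing, a caller can take the first few results as example matches.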
Reviewed By: markbt
Differential Revision: D13498301
fbshipit-source-id: 637856ebd761734d68b20c15866424b1d4518ad6
Summary: This will be used in prefix lookups.
Reviewed By: markbt
Differential Revision: D13498300
fbshipit-source-id: 3db7a21d6f35a18699d9dc3a0eca71a5410e0e61
Summary:
Keeping indexedlog_dump.rs in the same crate duplicates testing: `cargo test` tries to run tests on 2 entry points,
lib.rs and indexedlog_dump.rs. Move it to a separate crate to solve the issue.
Reviewed By: markbt
Differential Revision: D13498266
fbshipit-source-id: 8abf07c1272dfa825ec7701fd8ea9e0d1310ec5f
Summary: `write!` result needs to be used.
Reviewed By: markbt
Differential Revision: D13471967
fbshipit-source-id: d48752bcac05dd33b112679d7faf990eb8ddd651
Summary: The former is deprecated and thus compiling revisionstore shows many warnings.
Reviewed By: markbt
Differential Revision: D13379278
fbshipit-source-id: d4b4662a1ad00997de4c46274deaf22f48487328