Summary:
Upcoming patches will move hg command implementations from exec/hgmain to
lib/hgcommands. That has two benefits:
- The `bindings` crate can use `hgcommands` to call Rust commands from Python.
- Solve a link issue about CPython APIs. Right now, if `hgmain` depends
on *multiple* libraries that depend on `cpython`, `cargo build` fails with a
link error complaining that many CPython APIs do not exist.
With this change, `hgcommands` will be the only crate that `hgmain` depends
on, therefore no such link issues.
Reviewed By: farnz
Differential Revision: D16713538
fbshipit-source-id: 3b0def6eec4870858cdb74ad1b3099dc4cbc42b2
Summary:
Add a thin library to store a side-by-side "stack" that can provide chained
human-friendly context information. Right now, it only works with the current
native thread. It can be thought of as an "annotation" of the native stack.
The concept is useful for both Python and Rust. So this diff adds a Rust
implementation, and a Python binding will be added as follow-up.
This diff only implements the minimal bits to make it useful. Some future
possibilities are:
- Have a way to collect information from all threads. This requires a global
state and customized thread spawn / join wrappers.
- Have a way to "request" the "stack description" from another thread.
This might be useful for progress-bar use-case, where the progress bar
logic runs in a different thread.
Note: I searched through `crates.io` for something similar. But I couldn't find
any. They are either coupled with error handling that will miss the "explain
why this prefetch happens" case, or are not lazy, which is undesirable as I'd
imagine some context to have complex logic like rendering a DAG graph.
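A minimal sketch of the idea, assuming a thread-local stack of closures. The
names `push_context`, `render_context`, and `ContextGuard` are hypothetical and
are not taken from this diff. Each frame is a closure, so a description is only
computed when somebody asks for it:

```rust
use std::cell::RefCell;

thread_local! {
    // One context stack per native thread (hypothetical layout).
    static CONTEXT_STACK: RefCell<Vec<Box<dyn Fn() -> String>>> =
        RefCell::new(Vec::new());
}

/// Guard that pops its frame when it goes out of scope (hypothetical name).
#[must_use]
pub struct ContextGuard;

impl Drop for ContextGuard {
    fn drop(&mut self) {
        CONTEXT_STACK.with(|stack| {
            stack.borrow_mut().pop();
        });
    }
}

/// Push a lazily evaluated description. The closure is not called here.
pub fn push_context<F>(describe: F) -> ContextGuard
where
    F: Fn() -> String + 'static,
{
    CONTEXT_STACK.with(|stack| stack.borrow_mut().push(Box::new(describe)));
    ContextGuard
}

/// Render the chain for the current thread, outermost frame first.
pub fn render_context() -> Vec<String> {
    CONTEXT_STACK.with(|stack| stack.borrow().iter().map(|f| f()).collect())
}
```

A caller would keep the guard alive for the duration of the operation, for
example `let _guard = push_context(|| "prefetching files".to_string());`.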
Reviewed By: sfilipco
Differential Revision: D16023308
fbshipit-source-id: 320a23447dea85089ba8ab02436af3ec93466dd8
Summary: This crate is no longer used by Mercurial, but may still be generally useful to others working with Rust HTTP libraries (such as Hyper, which Mercurial no longer uses but motivated the creation of this crate). Rather than just deleting this crate, let's move it to `common/rust`.
Reviewed By: quark-zju
Differential Revision: D16477315
fbshipit-source-id: 385a17751b289d5186dbd9771891c6679c6192ed
Summary: Very small library (one file) that allows for posix-style splitting. The library was not vendored in third-party and therefore was just added to unblock as fast as possible.
Reviewed By: quark-zju
Differential Revision: D15911319
fbshipit-source-id: 2820d5beb5b3493a507f00f4b94e93b0405cf991
Summary:
This is the first step to make the main hg executable log parent process
information to blackbox, so `hg blackbox` can show context about which parent
process triggers the hg command. It also enables us to remove the hg wrapper.
It seems the process info is the only thing that has to be done from the main
executable, instead of a separate fb-only scuba logging process. Therefore only
procinfo.rs was copied, to make it easier to build the main binary in an OSS
environment. Other information (network, OS, etc.) can remain unchanged and the
fb-only logging process can still depend on it.
This stack has breaking changes to the API exposed by procinfo. Namely, the new
APIs look the same across platforms, and do not shell out to `ps`.
Dependencies are cleaned up aggressively.
Reviewed By: sfilipco
Differential Revision: D16137796
fbshipit-source-id: 52b354a17527cd8a10d9c63d04524358c6dc3c73
Summary:
Switch to the new Python runtime. Remove parts that are incompatible and no
longer necessary, including:
- Building curses, python-lz4, and urllib3 (in build_nupkg.py).
- Mangling sys.path (in hgpython and entrypoint.py).
- Zip-related logic (in hgpython).
- Cargo features controlling build environment (in hgpython and hgmain).
- Zipping Python stdlib (in setup.py).
- Shipping files in `templates`, `help`, and most `contrib` files (in
setup.py).
For the hgpython part, the new expectation is: in all cases (windows, linux,
make local, installed, buck), `edenscm.mercurial.entrypoint` and
`edenscmnative` modules are always directly importable and are always the
expected modules if imported. So the `hg` logic just imports and runs it
without having any `sys.path` related logic.
To explain it further:
- When installed on a POSIX system, the default `sys.path` contains
site-packages. Both edenscm and edenscmnative are in site-packages.
- When installed on Windows, the executable (hg.real.exe), python27.dll,
python27.zip, and edenscmnative/ are all in the same directory. Therefore the
default sys.path (exe dir + python27.zip) would be sufficient.
- When using `make local`, and run via `scm/hg/hg`, the `PySys_SetArgv` API
(called by `hgpython`) inserts the `scm/hg` directory as `sys.path[0]`,
therefore edenscm and edenscmnative in `scm/hg` will be imported as expected.
Since we no longer hard code paths to search for modules, this should fix
issues on systems with a different sys.path, ex. Debian and Ubuntu use
"dist-packages" instead of "site-packages".
Note: IPython is broken. It seems to be a combination of newer Python version,
newer compiler and 64 bit (see [1]). It looks like prompt_toolkit incorrectly
uses untyped ctypes APIs where it passes "int/long"s to places expecting a
HANDLE. The ctypes library uses 4-byte integers for plain "int/long"s where a
HANDLE is 8 byte on 64 bit Windows. The new interpreter is stricter somehow
and will error out in this case (also explains why D15912516 is needed).
The fix to prompt_toolkit was sent as
https://github.com/prompt-toolkit/python-prompt-toolkit/pull/930.
[1]: https://github.com/prompt-toolkit/python-prompt-toolkit/issues/406
Reviewed By: ikostia
Differential Revision: D15894694
fbshipit-source-id: 560d11ae28c1e65d58b760eac93701e753bd397e
Summary:
When I was trying to migrate existing `.t` tests using blackbox, I found that
they all use `grep` to filter entries. That's sad. Ideally, the strongly typed
events can be matched by structured content. For example, to match all watchman
events, use something like: `{"watchman": "_"}`, where `"_"` denotes
"match anything". To match all "get_files" network operations, use
`{"network": {"op": "get_files"}}`. We can later abuse the array type to express
more complicated logic, like `["ge", 100]` for integer range selection, or things
like `["or", pat1, pat2]` for logic operations.
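A rough sketch of the matching rule described above, assuming a tiny stand-in
`Value` enum in place of `serde_json::Value` so the example has no dependencies.
The `matches` and `object` functions are hypothetical names. Only the `"_"`
wildcard and nested-object matching are shown; the array-based operators are
not implemented:

```rust
use std::collections::BTreeMap;

/// Stand-in for `serde_json::Value` (assumption: only what the sketch needs).
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Number(i64),
    String(String),
    Object(BTreeMap<String, Value>),
}

/// Return true if `data` matches `pattern`.
/// - The string "_" matches anything.
/// - An object pattern matches if every key it names exists in `data`
///   and the corresponding values match recursively.
/// - Anything else must be equal.
pub fn matches(pattern: &Value, data: &Value) -> bool {
    match (pattern, data) {
        (Value::String(s), _) if s == "_" => true,
        (Value::Object(p), Value::Object(d)) => p
            .iter()
            .all(|(key, sub)| d.get(key).map_or(false, |v| matches(sub, v))),
        _ => pattern == data,
    }
}

/// Convenience constructor for object values.
pub fn object(pairs: Vec<(&str, Value)>) -> Value {
    Value::Object(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}
```

Extra keys in the event are ignored by design, so a short pattern can select
events that carry many more fields.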
Currently, the `Event` type uses `#[serde(rename=<short_name>)]` to reduce
space usage, which is not friendly for humans writing match expressions. The
`Event` type does use `#[serde(alias=<long_name>)]` so a human-friendly
JSON string can be *deserialized* to `Event`. However, to be able to perform
the complex pattern matching, we have to *serialize* both `Event` and pattern
to `serde_json::Value`, and that should use the human-friendly long name.
Since we already have the desired long names in the "serde alias" attribute, we
can generate code that swaps "serde alias" and "serde rename", so the long name
can be used for serialization. That's this diff.
Reviewed By: xavierd
Differential Revision: D15685469
fbshipit-source-id: e71069dd3e1439ff595da68dfc877128a3e79db2
Summary:
The blackbox library is intended to be the native logging library, with the
goals:
- Unifying Rust and Python logging. Currently, there is no good story in the
Rust logging land.
- Unifying cloud logging and local logging. Currently, there are very different
implementations and both sides have data that are unavailable in the other
side. My plan is to make local logging complete and cloud logging is just
extracting from the local logs.
- Typed data, binary format, with indexes. So the data is easier for code to
consume (ex. no more regex parsing text logs).
For the initial version, I just added the basic features: `log` (write) and
`filter` (read). I was optimizing for "code is easier to write", so I have made
decisions like:
- Silence errors and lose data. No `Fallible` in return type.
- Use `Vec` instead of complex iterator (can be changed later).
- Use a single `enum` type for different event types instead of the originally
designed complex "register" framework to register different event types [1]
(based on the assumption that we know all event types).
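The shape of these decisions can be sketched as follows. This is an in-memory
illustration only; the real library is backed by indexedlog, and the `Event`
variants and field names here are hypothetical:

```rust
/// A single enum covers every known event type (variants are made up).
#[derive(Clone, Debug, PartialEq)]
pub enum Event {
    Watchman { command: String },
    Network { op: String, duration_ms: u64 },
}

/// In-memory stand-in for the on-disk log (assumption for this sketch).
#[derive(Default)]
pub struct Blackbox {
    entries: Vec<Event>,
}

impl Blackbox {
    /// Write path. There is no `Result` in the signature: a real
    /// implementation would silence errors here.
    pub fn log(&mut self, event: Event) {
        self.entries.push(event);
    }

    /// Read path. Returns a plain `Vec` rather than a custom iterator.
    pub fn filter<F>(&self, predicate: F) -> Vec<Event>
    where
        F: Fn(&Event) -> bool,
    {
        self.entries
            .iter()
            .filter(|&event| predicate(event))
            .cloned()
            .collect()
    }
}
```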
Although I did not optimize raw performance too much here, it might be good
enough with the previous improvements to `Log::sync`. I'd expect the fast path
to be able to log 10 entries per millisecond. If `Log::sync` becomes an issue,
we can move it to a background thread in the future.
There are also some places where the logic can be smarter and try to auto
recover from error cases (ex. by automatically rebuilding indexes for Log),
but they are not added in this diff. For now, such index corruption might just
delete the blackbox logs.
The indexedlog code was changed slightly to make the iterator less sensitive
to data corruptions, and continue with potentially good data. It affects one of
the tests added in this diff.
[1]: I later found https://github.com/dtolnay/typetag provides such feature. It
is not adopted for 2 reasons:
- It depends on rust-ctor, which is a bit hacky (ex. does not work for all platforms,
and might break with certain build flags).
See also https://internals.rust-lang.org/t/pre-rfc-add-language-support-for-global-constructor-functions/9840
- Event definitions are in individual crates, instead of a central one. There are pros
and cons. The cons are that programs (ex. the cloud logging executable) would have
unnecessary dependencies, and it's harder to reuse structures for different events.
Reviewed By: xavierd
Differential Revision: D15588431
fbshipit-source-id: 264a6b9c5fb587cf425ec2fe14cf1fe930c39892
Summary: Initial commit for the cliparser command line argument parsing library. Can be tested with buck, or cargo.
Reviewed By: quark-zju
Differential Revision: D15487278
fbshipit-source-id: 5f3f65e58a815d2c408e38a3eb9e9981086c6715
Summary:
Thin LTO is much faster to compile, and doesn't make the resulting binaries
that much bigger.
Reviewed By: xavierd
Differential Revision: D15396282
fbshipit-source-id: 3e2bf059756d47218061d7e41f041e445d7f60c8
Summary:
This library parses an ASCII DAG. It is similar to mercurial/drawdag.py, which
was added by me in [1].
There are some (intentional) differences from the Python drawdag:
- Stricter. Confusing DAG characters like `+` or crossing lines are forbidden.
- Do not special-case `o` as a name.
- Do not try to be compatible with `hg log -G` output.
- Do not support special comments (yet).
- Support both left to right and bottom to top directions.
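To give a feel for the strictness, here is a deliberately tiny sketch that only
handles a single left-to-right line such as `A-B-C`, assuming the name on the
left is the parent. The function name `parse_line` and the returned map shape
are hypothetical; the real parser handles two-dimensional drawings:

```rust
use std::collections::BTreeMap;

/// Parse one horizontal line like "A-B-C" into a map of name -> parents.
/// Anything other than alphanumeric names joined by `-` is rejected.
pub fn parse_line(text: &str) -> Result<BTreeMap<String, Vec<String>>, String> {
    let mut parents: BTreeMap<String, Vec<String>> = BTreeMap::new();
    let mut previous: Option<String> = None;
    for name in text.trim().split('-') {
        if name.is_empty() || !name.chars().all(|c| c.is_ascii_alphanumeric()) {
            return Err(format!("unexpected token {:?}", name));
        }
        let entry = parents.entry(name.to_string()).or_default();
        if let Some(parent) = previous.take() {
            // The name to the left is recorded as the parent.
            entry.push(parent);
        }
        previous = Some(name.to_string());
    }
    Ok(parents)
}
```

Because the result is just names and parent lists, a caller is free to decide
how each name becomes a commit.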
This library tries to be abstract. i.e. it does not have actual logic about
how to make a commit. Its intended users are Mononoke and scmdag, which have
different ways to make commits.
Since this library is intended to be used only for tests, I didn't spend too
much effort optimizing its performance.
[1]: https://www.mercurial-scm.org/repo/hg/rev/a31634336471
Reviewed By: kulshrax
Differential Revision: D15039768
fbshipit-source-id: 4c33d44759ecf59aadc3d443a84db07d702dc69b
Summary: The scmdag library is going to have things related to the commit graph.
Reviewed By: sfilipco
Differential Revision: D15004984
fbshipit-source-id: f274cceeabae4a57985763216572f7cd055f8e07
Summary: Move the contents of `edenapi-types` into the `types` crate so all of Mercurial's Rust types are in one place.
Reviewed By: quark-zju
Differential Revision: D14114547
fbshipit-source-id: feb8f9c35f102d30bf00b230df81a86a3893a49b
Summary:
For HTTP data fetching, it will be necessary to have the same Rust types in Mononoke and Mercurial, so that Mononoke can send down the serialized types and Mercurial can deserialize them. These types must live in the Mercurial codebase since Mercurial can't link to code outside of fbcode/scm/hg. As such, this diff adds a new crate to Mercurial that Mononoke can link to, containing these shared types.
Right now the only shared type is a `HistoryEntry`, designed to match the interface of `MutableDatapack::add`. This type will be used as part of the HTTP history fetching API.
In the longer term, it would probably make sense to use something like Thrift for defining the on-the-wire formats used between Mercurial and Mononoke (and eventually for RPC as well). However, given that using Thrift from Mercurial is currently nontrivial (since Mercurial is typically built with Cargo and needs to be compatible with open source tooling), defining the schema in this crate and using `serde` for serialization and HTTP/2 for transport should be sufficient for now.
Reviewed By: quark-zju
Differential Revision: D14079337
fbshipit-source-id: c7880919aeb3fd7e1cf70067a89a17341c1d973f
Summary:
The seed for the Rust implementation of manifests.
We start with the most primitive API for manifests, which maps a path to a `Node`. At the basic level we need the same operations that a map implements, so we start with `insert`, `get` and `remove`. We know that retrieving data for Manifests can fail, so we encode that in our interface using `Fallible`.
I leave iterators and manifest flags for future iterations.
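A sketch of what such a map-like interface could look like. The `Fallible`
alias below is a standard-library stand-in for the one used in the diff, and
`Node`, `MemoryManifest`, and the exact method signatures are assumptions made
for illustration:

```rust
use std::collections::BTreeMap;

/// Stand-in for a 20-byte node hash (assumption).
pub type Node = [u8; 20];

/// Stand-in for the `Fallible` alias (assumption: boxed std error).
pub type Fallible<T> = Result<T, Box<dyn std::error::Error>>;

/// Map-like operations where every call may fail.
pub trait Manifest {
    fn get(&self, path: &str) -> Fallible<Option<Node>>;
    fn insert(&mut self, path: String, node: Node) -> Fallible<()>;
    fn remove(&mut self, path: &str) -> Fallible<()>;
}

/// Trivial in-memory implementation used only to exercise the trait.
#[derive(Default)]
pub struct MemoryManifest {
    entries: BTreeMap<String, Node>,
}

impl Manifest for MemoryManifest {
    fn get(&self, path: &str) -> Fallible<Option<Node>> {
        Ok(self.entries.get(path).copied())
    }

    fn insert(&mut self, path: String, node: Node) -> Fallible<()> {
        self.entries.insert(path, node);
        Ok(())
    }

    fn remove(&mut self, path: &str) -> Fallible<()> {
        self.entries.remove(path);
        Ok(())
    }
}
```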
Reviewed By: DurhamG
Differential Revision: D14016274
fbshipit-source-id: 8f1f83610933b9e9a96f8c5ba2c6e50567c76e06
Summary: `lib/argparse` fails to build with cargo. Removing the crate from the workspace to unblock building with cargo.
Reviewed By: quark-zju
Differential Revision: D13969332
fbshipit-source-id: 0299f74e6aa81632ce64005d91fa2c30a32f5b96
Summary: Rename Mononoke API to Eden API, per war room discussion.
Reviewed By: quark-zju
Differential Revision: D13908195
fbshipit-source-id: 94a2fe93f8a89d0c5e9b6a24939cc4760cfaade0
Summary:
Crate adding easy conversions between `http::Uri` and `url::Url`.
Rust has two main types for working with URLs: `http::Uri` and `url::Url`. `http::Uri` comes from the `http` crate, which is supposed to be a set of common types to be used throughout the Rust HTTP ecosystem, to ensure mutual compatibility between different HTTP crates and web frameworks. This is the type that HTTP clients like Hyper expect when specifying URLs.
Unfortunately, `http::Uri` is a very simple type that does not expose any means of mutating or otherwise manipulating the URL. It can only parse URLs from strings, forcing the users to construct URLs via error-prone string concatenation.
In contrast, the `url::Url` comes from the `rust-url` crate from the Servo project. This type does support easily constructing and manipulating URLs, making it very useful for assembling a URL from components.
The only way to convert between the two types is to first convert back to a string, and then re-parse as the desired type. Several issues [have](https://github.com/hyperium/hyper/issues/1219) [been](https://github.com/hyperium/hyper/issues/1102) [raised](https://github.com/hyperium/hyper/issues/1219) about this upstream, but there has been no consensus or action as of yet. To get around the problem for now, this crate adds convenience methods to perform the conversions.
Reviewed By: DurhamG
Differential Revision: D13887403
fbshipit-source-id: ecfaf3ea9d884621493b0fe44a6b5658d10108b4
Summary:
I need to convert `Vec<u8>` to a Python object in a zero-copy way for rustlz4
performance.
Assuming Python and Rust use the same memory allocator, it's possible to transfer
the control of a malloc-ed pointer from Rust to Python. Use this to implement
zero-copy. PyByteArrayObject is chosen because its struct contains such a pointer.
PyBytes cannot be used as it embeds the bytes, without using a pointer.
Sadly there are no CPython APIs to do this job. So we have to write to the raw
structures. That means the code will crash if python is replaced by
python-debug (due to Python object header change). However, that seems less of
an issue given the performance wins. If python-debug does become a problem, we can
try vendoring libpython directly.
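The ownership hand-off on the Rust side can be sketched with the standard
library alone. This does not touch any CPython structure; `RawBuffer`,
`into_raw_buffer`, and `reclaim` are hypothetical names, and writing the pointer
into a `PyByteArrayObject` is intentionally left out:

```rust
use std::mem::ManuallyDrop;

/// The raw parts of a `Vec<u8>` after Rust gives up ownership.
pub struct RawBuffer {
    pub ptr: *mut u8,
    pub len: usize,
    pub cap: usize,
}

/// Release the allocation without copying or freeing it. Whoever receives
/// the pointer becomes responsible for freeing it with a compatible allocator.
pub fn into_raw_buffer(data: Vec<u8>) -> RawBuffer {
    let mut data = ManuallyDrop::new(data);
    RawBuffer {
        ptr: data.as_mut_ptr(),
        len: data.len(),
        cap: data.capacity(),
    }
}

/// Take ownership back, e.g. in a test.
///
/// # Safety
/// `buf` must come from `into_raw_buffer` and must not have been freed
/// or reclaimed already.
pub unsafe fn reclaim(buf: RawBuffer) -> Vec<u8> {
    unsafe { Vec::from_raw_parts(buf.ptr, buf.len, buf.cap) }
}
```

The pointer value is unchanged by the hand-off, which is what makes the
transfer zero-copy.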
I didn't implement a feature-rich `PyByteArray` Rust object. It's not easy to
do so outside the cpython crate. Most helper macros to declare types cannot be
reused, because they refer to `::python`, which is not available in the current
crate.
Reviewed By: DurhamG
Differential Revision: D13516209
fbshipit-source-id: 9aa089b309beb71d4d21f6c63fcb97dbc798b5f8
Summary:
Adds a new crate `cpython-result`, which provides a `ResultExt` trait, which
extends the failure `Result` type to allow conversion to `PyResult` by
converting the error to an appropriate Python exception.
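The extension-trait pattern can be sketched without the `cpython` or `failure`
crates. `PyErr` and `PyResult` below are simplified stand-ins, the method name
`into_py_result` is hypothetical, and this sketch always picks `RuntimeError`
rather than choosing an exception per error type:

```rust
use std::fmt::Display;

/// Simplified stand-in for a Python exception (assumption).
#[derive(Debug, PartialEq)]
pub struct PyErr {
    pub exception: &'static str,
    pub message: String,
}

/// Simplified stand-in for `cpython::PyResult` (assumption).
pub type PyResult<T> = Result<T, PyErr>;

/// Extension trait adding a conversion method to `Result`.
pub trait ResultExt<T> {
    fn into_py_result(self) -> PyResult<T>;
}

impl<T, E: Display> ResultExt<T> for Result<T, E> {
    fn into_py_result(self) -> PyResult<T> {
        // Every error becomes a RuntimeError carrying the error's text.
        self.map_err(|err| PyErr {
            exception: "RuntimeError",
            message: err.to_string(),
        })
    }
}
```

With the trait in scope, a binding function can end a fallible call with
`.into_py_result()` instead of writing the `map_err` by hand each time.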
Reviewed By: quark-zju
Differential Revision: D12980782
fbshipit-source-id: 44a63d31f9ecf2f77efa3b37c68f9a99eaf6d6fa
Summary:
The mutationstore is a new store for recording records of commit mutations for
commits that are not in the local repository.
It uses an indexedlog to store the data. Each mutation entry corresponds to
the information about the mutation that led to the creation of a particular
commit, which is recorded as the successor in the entry.
Entries can come from three possible places:
* `Commit` metadata for a commit not available locally
* `Obsmarkers` for repos that have been migrated from evolution tracking
* `Synthetic` for entries created synthetically, e.g. by a pullcreatemarkers
implementation.
The other commits referred to in an entry must predate the successor commit.
For entries that originated from commits, this is ensured, as the successor
commit hash includes the other commit hashes. For other entry types, it is
an error to refer to later commits, and any entry that causes a cycle will
be ignored.
Reviewed By: quark-zju
Differential Revision: D12980773
fbshipit-source-id: 040d3f7369a113e710ed8c9f61fabec6c5ec9258
Summary:
Introduces a nodemap structure that stores the mapping between two
nodes with bidirectional indexes.
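As an in-memory illustration of the bidirectional lookup, here is a sketch built
on two hash maps. The `NodeMap` API and the `Node` alias are hypothetical, and
the on-disk storage used by the actual structure is not modeled:

```rust
use std::collections::HashMap;

/// Stand-in for a 20-byte node hash (assumption).
pub type Node = [u8; 20];

/// Keeps one index per direction so both lookups are direct.
#[derive(Default)]
pub struct NodeMap {
    forward: HashMap<Node, Node>,
    backward: HashMap<Node, Node>,
}

impl NodeMap {
    /// Record a pair and update both indexes.
    pub fn add(&mut self, first: Node, second: Node) {
        self.forward.insert(first, second);
        self.backward.insert(second, first);
    }

    /// Find the second node given the first.
    pub fn lookup_by_first(&self, first: &Node) -> Option<&Node> {
        self.forward.get(first)
    }

    /// Find the first node given the second.
    pub fn lookup_by_second(&self, second: &Node) -> Option<&Node> {
        self.backward.get(second)
    }
}
```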
Reviewed By: quark-zju
Differential Revision: D13047698
fbshipit-source-id: 967bf4b26a4b57e4fa2421a342edb21d3a5adbf6
Summary:
This diff adds a new `mononokeapi` crate, which is a Rust client library for the Mononoke API server. The crate is intended for use beyond Mercurial, and as such attempts to expose functionality in a reasonably generic way.
Right now, the only method supported by this crate is `/health_check`, which is the API server's health check endpoint that simply returns the string "I_AM_ALIVE" on success. Future diffs will expand this crate to include more of the API server's actual functionality. For now, this version serves as a proof of concept of how the crate will be structured.
The crate currently uses the `hyper` crate for its HTTP client, with `native-tls` for TLS support. Given that the client credentials required for mutual authentication with the Mononoke VIP are encoded in a format that `native-tls` does not understand, some credential format conversion via the `openssl` crate is necessary.
Reviewed By: DurhamG
Differential Revision: D13055687
fbshipit-source-id: cc944abd579ce49928776646c0dcce567f99c3b6
Summary:
This makes all crates' cache shared and unifies Cargo.lock, which
is used by the next diff.
Reviewed By: ikostia
Differential Revision: D10213071
fbshipit-source-id: 48a979c41423a8e8a9795ff102646cce13c39ff4
Summary:
The `Node` type will be used in multiple places. Let's move it to a standalone
crate so new libraries depending on it won't need to pull in all of
revisionstore's dependencies.
Note: I'd also like the `types` crate to only define clean types. Given the
fact NULL_ID is not a great design in Mercurial (`Option<Node>` is a better
choice in Rust), it probably does not belong to the formal Rust `Node` type.
This diff is merely about moving things with minimal changes. NULL_ID will
be decoupled from `Node` in a follow-up.
Reviewed By: markbt
Differential Revision: D10132047
fbshipit-source-id: 5d05c5e0ac06a2d58556c4db11775503f9495626
Summary:
Create a storage object that can be used to load bookmarks from a
mercurial file, modify and query the bookmarks in memory and then write back
to a mercurial bookmark file.
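The load, modify, and write-back cycle might look like the sketch below. It
assumes the bookmark file holds one `<40-char hex node> <name>` pair per line,
works on strings instead of files to stay self-contained, and uses a
hypothetical `BookmarkStore` API:

```rust
use std::collections::BTreeMap;

/// In-memory bookmark table: bookmark name -> hex node.
#[derive(Default, Debug, PartialEq)]
pub struct BookmarkStore {
    bookmarks: BTreeMap<String, String>,
}

impl BookmarkStore {
    /// Parse file content with one "<hex node> <name>" entry per line.
    pub fn parse(content: &str) -> Result<Self, String> {
        let mut bookmarks = BTreeMap::new();
        for line in content.lines() {
            let mut parts = line.splitn(2, ' ');
            match (parts.next(), parts.next()) {
                (Some(node), Some(name))
                    if node.len() == 40
                        && node.chars().all(|c| c.is_ascii_hexdigit())
                        && !name.is_empty() =>
                {
                    bookmarks.insert(name.to_string(), node.to_string());
                }
                _ => return Err(format!("malformed line: {:?}", line)),
            }
        }
        Ok(Self { bookmarks })
    }

    /// Create or move a bookmark.
    pub fn set(&mut self, name: &str, hex_node: &str) {
        self.bookmarks.insert(name.to_string(), hex_node.to_string());
    }

    /// Look up the node a bookmark points to.
    pub fn get(&self, name: &str) -> Option<&str> {
        self.bookmarks.get(name).map(|node| node.as_str())
    }

    /// Delete a bookmark if it exists.
    pub fn remove(&mut self, name: &str) {
        self.bookmarks.remove(name);
    }

    /// Produce file content in the same format that `parse` accepts.
    pub fn serialize(&self) -> String {
        self.bookmarks
            .iter()
            .map(|(name, node)| format!("{} {}\n", node, name))
            .collect()
    }
}
```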
Reviewed By: quark-zju
Differential Revision: D9768564
fbshipit-source-id: ed469d0e588ae2200d614bf62a5a0b577e7c6f74
Summary:
In the later diffs I'll add some more functionality there, not strictly
related to encoding paths.
Reviewed By: quark-zju
Differential Revision: D9441427
fbshipit-source-id: 069ab30a24761038fa2c1a4f180bbc0699d38ef9
Summary:
Backout D9124508.
This is actually more complex than it seems. It breaks non-buck builds
everywhere:
- hgbuild on all platforms. POSIX platforms break because `hg archive` will
miss `scm/common`. Windows build breaks because of symlink.
- `make local` on GitHub repo because `failure_ext` is not public. The `pylz4`
Cargo.toml has missing dependencies.
Fixing them correctly seems non-trivial. Therefore let's backout the change to
unblock builds quickly.
The linter change is kept in case we'd like to try again in the future.
Reviewed By: simpkins
Differential Revision: D9225955
fbshipit-source-id: 4170a5f7664ac0f6aa78f3b32f61a09d65e19f63
Summary: Moved the lz4 compression code into a separate module in `scm/common/pylz4` and redirected code referencing the former two files to the new module.
Reviewed By: quark-zju, mitrandir77
Differential Revision: D9124508
fbshipit-source-id: e4796cf36d16c3a8c60314c75f26ee942d2f9e65
Summary: This will be used to parse hgrc-like config files.
Reviewed By: mitrandir77
Differential Revision: D8777330
fbshipit-source-id: 73a114df36e23246a3fc1206be202fba8705453a
Summary:
Make `lib` a cargo workspace so building in subprojects would share a
`target` directory and `cargo doc` will build documentation for all
subprojects.
Reviewed By: DurhamG
Differential Revision: D8741175
fbshipit-source-id: 512325bcb23d51e866e764bdc76dddb22c59ef05