34 Commits

Author SHA1 Message Date
Jun Wu
3a7f54dffe lib: rename hgpython to hgcommands
Summary:
Upcoming patches will move hg command implementations from exec/hgmain to
lib/hgcommands. That has two benefits:

- The `bindings` crate can use `hgcommands` to call Rust commands from Python.
- Solve a link issue about CPython APIs. Right now, if `hgmain` depends
  on *multiple* libraries that depend on `cpython`, there will be a link error
  with `cargo build` complaining that lots of CPython APIs do not exist.
  With this change, `hgcommands` will be the only crate that `hgmain` depends
  on, therefore no such link issues.

Reviewed By: farnz

Differential Revision: D16713538

fbshipit-source-id: 3b0def6eec4870858cdb74ad1b3099dc4cbc42b2
2019-08-08 22:54:09 -07:00
Jun Wu
8315564c33 stackdesc: a thin library to annotate the current thread execution
Summary:
Add a thin library to store a side-by-side "stack" that can provide chained
human-friendly context information. Right now, it only works with the current
native thread. It can be thought of as an "annotation" of the native stack.

The concept is useful for both Python and Rust. So this diff adds a Rust
implementation, and a Python binding will be added as follow-up.

This diff only implements the minimal bits to make it useful. Some future
possibilities are:

- Have a way to collect information from all threads. This requires a global
  state and customized thread spawn / join wrappers.
- Have a way to "request" for the "stack description" from another thread.
  This might be useful for progress-bar use-case, where the progress bar
  logic runs in a different thread.

Note: I searched through `crates.io` for something similar. But I couldn't find
any. They are either coupled with error handling that will miss the "explain
why this prefetch happens" case, or are not lazy, which is undesirable as I'd
imagine some context to have complex logic like rendering a DAG graph.
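
The idea of a lazy, per-thread description stack can be sketched roughly as follows. This is a minimal illustration and not the actual `stackdesc` API: the names `describe`, `render_stack`, `Frame`, and `STACK` are hypothetical. Descriptions are stored as closures, so nothing is rendered unless someone asks for the stack.

```rust
use std::cell::RefCell;

// A description is a closure so it is only evaluated on demand (lazy).
type Describe = Box<dyn Fn() -> String>;

thread_local! {
    // Hypothetical per-thread stack of descriptions.
    static STACK: RefCell<Vec<Describe>> = RefCell::new(Vec::new());
}

/// Guard returned by `describe`. Dropping it pops the matching frame.
pub struct Frame(());

impl Drop for Frame {
    fn drop(&mut self) {
        STACK.with(|s| {
            s.borrow_mut().pop();
        });
    }
}

/// Push a lazy description onto the current thread's stack.
pub fn describe<F: Fn() -> String + 'static>(f: F) -> Frame {
    STACK.with(|s| s.borrow_mut().push(Box::new(f)));
    Frame(())
}

/// Render all descriptions of the current thread, outermost first.
pub fn render_stack() -> Vec<String> {
    STACK.with(|s| s.borrow().iter().map(|f| f()).collect())
}
```

A caller would keep the returned `Frame` alive for the duration of the annotated scope, for example `let _frame = describe(|| "prefetching files".to_string());`.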

Reviewed By: sfilipco

Differential Revision: D16023308

fbshipit-source-id: 320a23447dea85089ba8ab02436af3ec93466dd8
2019-08-01 19:53:56 -07:00
Arun Kulshreshtha
ee69300428 url-ext: move url-ext to common/rust
Summary: This crate is no longer used by Mercurial, but may still be generally useful to others working with Rust HTTP libraries (such as Hyper, which Mercurial no longer uses but motivated the creation of this crate). Rather than just deleting this crate, let's move it to `common/rust`.

Reviewed By: quark-zju

Differential Revision: D16477315

fbshipit-source-id: 385a17751b289d5186dbd9771891c6679c6192ed
2019-07-25 23:01:15 -07:00
Jared Bosco
137edc1814 rustshlex: add external source code for posix-style parsing
Summary: Very small library (one file) that allows for posix-style splitting. The library was not vendored in third-party, so it was added here directly to unblock as fast as possible.

Reviewed By: quark-zju

Differential Revision: D15911319

fbshipit-source-id: 2820d5beb5b3493a507f00f4b94e93b0405cf991
2019-07-20 01:06:34 -07:00
Jun Wu
e9df7eae04 procinfo: extract part of telemetry into a standalone library
Summary:
This is the first step to make the main hg executable log parent process
information to blackbox, so `hg blackbox` can show context about which parent
process triggers the hg command. It also enables us to remove the hg wrapper.

It seems the process info is the only thing that has to be done from the main
executable, instead of a separate fb-only scuba logging process. Therefore only
procinfo.rs was copied, to make it easier to build the main binary in an OSS
environment. Other information (network, OS, etc.) can remain unchanged and the
fb-only logging process can still depend on it.

This stack has breaking changes to the API exposed by procinfo. Namely, the new
APIs look the same across platforms, and do not shell out to `ps`.

Dependencies are cleaned up aggressively.

Reviewed By: sfilipco

Differential Revision: D16137796

fbshipit-source-id: 52b354a17527cd8a10d9c63d04524358c6dc3c73
2019-07-16 17:10:57 -07:00
Jun Wu
8997f2f15d windows: switch to the new Python runtime
Summary:
Switch to the new Python runtime. Remove parts that are incompatible and no
longer necessary, including:

- Building curses, python-lz4, and urllib3 (in build_nupkg.py).
- Mangling sys.path (in hgpython and entrypoint.py).
- Zip-related logic (in hgpython).
- Cargo features controlling build environment (in hgpython and hgmain).
- Zipping Python stdlib (in setup.py).
- Shipping files in `templates`, `help`, and most `contrib` files (in
  setup.py).

For the hgpython part, the new expectation is: in all cases (windows, linux,
make local, installed, buck), `edenscm.mercurial.entrypoint` and
`edenscmnative` modules are always directly importable and are always the
expected modules if imported. So the `hg` logic just imports and runs it
without having any `sys.path` related logic.

To explain it further:
- When installed on a POSIX system, the default `sys.path` contains
  site-packages.  Both edenscm and edenscmnative are in site-packages.
- When installed on Windows, the executable (hg.real.exe), python27.dll,
  python27.zip, and edenscmnative/ are all in the same directory. Therefore the
  default sys.path (exe dir + python27.zip) would be sufficient.
- When using `make local`, and run via `scm/hg/hg`, the `PySys_SetArgv` API
  (called by `hgpython`) inserts the `scm/hg` directory as `sys.path[0]`,
  therefore edenscm and edenscmnative in `scm/hg` will be imported as expected.

Since we no longer hard code paths to search for modules, this should fix
issues on systems with a different sys.path, ex. Debian and Ubuntu use
"dist-packages" instead of "site-packages".

Note: IPython is broken. It seems to be a combination of newer Python version,
newer compiler and 64 bit (see [1]). It looks like prompt_toolkit incorrectly
uses untyped ctypes APIs where it passes "int/long"s to places expecting a
HANDLE. The ctypes library uses 4-byte integers for plain "int/long"s where a
HANDLE is 8 byte on 64 bit Windows.  The new interpreter is stricter somehow
and will error out in this case (also explains why D15912516 is needed).

The fix to prompt_toolkit was sent as
https://github.com/prompt-toolkit/python-prompt-toolkit/pull/930.

[1]: https://github.com/prompt-toolkit/python-prompt-toolkit/issues/406

Reviewed By: ikostia

Differential Revision: D15894694

fbshipit-source-id: 560d11ae28c1e65d58b760eac93701e753bd397e
2019-06-24 08:34:23 -07:00
Jared Bosco
c8a1f2e255 clidispatch: create bare library
Summary: clidispatch is the command-line dispatching library to handle dispatching actions to handlers

Reviewed By: quark-zju

Differential Revision: D15612127

fbshipit-source-id: 2c801eeeda0f72c772e938f039f78c67c44ef1f4
2019-06-10 10:19:52 -07:00
Jun Wu
bc9be5419c blackbox: add a proc macro library to rewrite serde attributes
Summary:
When I was trying to migrate existing `.t` tests using blackbox, I found them
all use `grep` to filter entries. That's sad. Ideally, the strongly typed
events can be matched by structured content. For example, to match all watchman
events, use something like: `{"watchman": "_"}`, where `"_"` denotes
"match anything". To match all "get_files" network operations, use
`{"network": {"op": "get_files"}}`. We can later abuse the array type to express
more complicated logic, like `["ge", 100]` for integer range selection, or things
like `["or", pat1, pat2]` for logic operations.
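
A rough sketch of the structured matching described above, under a few assumptions: `Value` is a tiny stand-in for `serde_json::Value`, an object pattern is treated as a subset match, and the function names and the `duration_ms` field are made up for illustration. The array-based operators (`"ge"`, `"or"`) are left out.

```rust
use std::collections::BTreeMap;

/// Tiny stand-in for `serde_json::Value` (assumption: only these variants).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Str(String),
    Int(i64),
    Object(BTreeMap<String, Value>),
}

/// Returns true if `value` matches `pattern`:
/// - the string "_" matches anything,
/// - an object pattern matches if each of its keys exists in `value`
///   and the corresponding values match recursively,
/// - anything else must be equal.
pub fn matches(value: &Value, pattern: &Value) -> bool {
    match (value, pattern) {
        (_, Value::Str(s)) if s.as_str() == "_" => true,
        (Value::Object(v), Value::Object(p)) => p
            .iter()
            .all(|(k, pv)| v.get(k).map_or(false, |vv| matches(vv, pv))),
        (v, p) => v == p,
    }
}

/// Helper to build an object from key/value pairs.
pub fn obj(pairs: &[(&str, Value)]) -> Value {
    Value::Object(
        pairs
            .iter()
            .map(|(k, v)| (k.to_string(), v.clone()))
            .collect(),
    )
}
```

With this sketch, an event such as `{"network": {"op": "get_files", "duration_ms": 120}}` matches both `{"network": "_"}` and `{"network": {"op": "get_files"}}`, but not `{"watchman": "_"}`.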

Currently, the `Event` type uses `#[serde(rename=<short_name>)]` to reduce
space usage, which is not friendly for humans writing match expressions. The
`Event` type does use `#[serde(alias=<long_name>)]` so a human-friendly
JSON string can be *deserialized* to `Event`. However, to be able to perform
the complex pattern matching, we have to *serialize* both `Event` and pattern
to `serde_json::Value`, and that should use the human-friendly long name.

Since we already have the desired long names in the "serde alias" attribute, we
can generate code that swaps "serde alias" and "serde rename", so the long name
can be used for serialization. That's this diff.

Reviewed By: xavierd

Differential Revision: D15685469

fbshipit-source-id: e71069dd3e1439ff595da68dfc877128a3e79db2
2019-06-07 20:06:03 -07:00
Jun Wu
a2a96fcf92 blackbox: initial library for native logging
Summary:
The blackbox library is intended to be the native logging library, with the
goals:

- Unifying Rust and Python logging. Currently, there is no good story in the
  Rust logging land.
- Unifying cloud logging and local logging. Currently, there are very different
  implementations and both sides have data that are unavailable in the other
  side. My plan is to make local logging complete and cloud logging is just
  extracting from the local logs.
- Typed data, binary format, with indexes. So the data is easier for code to
  consume (ex. no more regex parsing text logs).

For the initial version, I just added the basic features: `log` (write) and
`filter` (read). I was optimizing for "code is easier to write", so I have made
decisions like:

- Silence errors and lose data. No `Fallible` in return type.
- Use `Vec` instead of complex iterator (can be changed later).
- Use a single `enum` type for different event types instead of the originally
  designed complex "register" framework to register different event types [1].
  (based on the assumption that we know all event types)

Although I did not optimize raw performance too much here, it might be good
enough with the previous improvements to `Log::sync`. I'd expect the fast path
to be able to log 10 entries per millisecond.  If `Log::sync` becomes an issue,
we can move it to a background thread in the future.

There are also some places where the logic can be smarter and try to auto
recover from error cases (ex. by automatically rebuilding indexes for Log),
but they are not added in this diff. For now, such index corruption might just
delete the blackbox logs.

The indexedlog code was changed slightly to make the iterator less sensitive
to data corruptions, and continue with potentially good data. It affects one of
the tests added in this diff.

[1]: I later found https://github.com/dtolnay/typetag provides such feature. It
is not adopted for 2 reasons:
- It depends on rust-ctor, which is a bit hacky (ex. does not work for all platforms,
  and might break with certain build flags).
  See also https://internals.rust-lang.org/t/pre-rfc-add-language-support-for-global-constructor-functions/9840
- Event definitions are in individual crates, instead of a centric one. There are pros
  and cons. The cons are, programs (ex. the cloud logging executable) would have
  unnecessary dependencies, and it's harder to reuse structures for different events.

Reviewed By: xavierd

Differential Revision: D15588431

fbshipit-source-id: 264a6b9c5fb587cf425ec2fe14cf1fe930c39892
2019-06-04 16:10:34 -07:00
Jared Bosco
d94dde74b2 cliparser: create bare cliparser library
Summary: Initial commit for the cliparser command line argument parsing library.  Can be tested with buck, or cargo.

Reviewed By: quark-zju

Differential Revision: D15487278

fbshipit-source-id: 5f3f65e58a815d2c408e38a3eb9e9981086c6715
2019-05-30 14:42:37 -07:00
Mark Thomas
6c0e62aec2 rust: switch to thin lto
Summary:
Thin lto is much faster to compile, and doesn't make the resulting binaries
that much bigger.

Reviewed By: xavierd

Differential Revision: D15396282

fbshipit-source-id: 3e2bf059756d47218061d7e41f041e445d7f60c8
2019-05-20 04:08:03 -07:00
Jun Wu
c32c89ffa4 drawdag: implement a Rust drawdag library
Summary:
This library parses an ASCII DAG. It is similar to mercurial/drawdag.py, which
was added by me in [1].

There are some (intentional) differences from the Python drawdag:

- Stricter. Confusing DAG characters like `+` or crossing lines are forbidden.
- Do not special handle `o` as a name.
- Do not try to be compatible with `hg log -G` output.
- Do not support special comments (yet).
- Support both left to right and bottom to top directions.

This library tries to be abstract. i.e. it does not have actual logic about
how to make a commit. Its intended users are Mononoke and scmdag, which have
different ways to make commits.

Since this library is intended to be used only for tests, I didn't spend too
much effort to optimize its performance.

[1]: https://www.mercurial-scm.org/repo/hg/rev/a31634336471

Reviewed By: kulshrax

Differential Revision: D15039768

fbshipit-source-id: 4c33d44759ecf59aadc3d443a84db07d702dc69b
2019-05-03 13:35:40 -07:00
Jun Wu
26f1f12a28 dag: add a library
Summary: The scmdag library is going to have things related to the commit graph.

Reviewed By: sfilipco

Differential Revision: D15004984

fbshipit-source-id: f274cceeabae4a57985763216572f7cd055f8e07
2019-04-25 17:05:09 -07:00
Arun Kulshreshtha
8864e6a0da types: move edenapi-types into types crate
Summary: Move the contents of `edenapi-types` into the `types` crate so all of Mercurial's Rust types are in one place.

Reviewed By: quark-zju

Differential Revision: D14114547

fbshipit-source-id: feb8f9c35f102d30bf00b230df81a86a3893a49b
2019-02-15 22:51:04 -08:00
Arun Kulshreshtha
61f9f25a66 edenapi-types: add crate for types shared between Mercurial and Mononoke
Summary:
For HTTP data fetching, it will be necessary to have the same Rust types in Mononoke and Mercurial, so that Mononoke can send down the serialized types and Mercurial can deserialize them. These types must live in the Mercurial codebase since Mercurial can't link to code outside of fbcode/scm/hg. As such, this diff adds a new crate to Mercurial that Mononoke can link to, containing these shared types.

Right now the only shared type is a `HistoryEntry`, designed to match the interface of `MutableDatapack::add`. This type will be used as part of the HTTP history fetching API.

In the longer term, it would probably make sense to use something like Thrift for defining the on-the-wire formats used between Mercurial and Mononoke (and eventually for RPC as well). However, given that using Thrift from Mercurial is currently nontrivial (since Mercurial is typically built with Cargo and needs to be compatible with open source tooling), defining the schema in this crate and using `serde` for serialization and HTTP/2 for transport should be sufficient for now.

Reviewed By: quark-zju

Differential Revision: D14079337

fbshipit-source-id: c7880919aeb3fd7e1cf70067a89a17341c1d973f
2019-02-15 15:17:12 -08:00
Stefan Filip
c1b8cd68d8 Add manifest crate
Summary:
The seed for the rust implementation of manifests.

We start with the most primitive API for manifests, which maps a path to a `Node`. At the basic level we need the same operations that a map implements so we start with `insert`, `get` and `remove`. We know that retrieving data for Manifests can fail so we encode that in our interface using `Fallible`.

I leave requiring an iterator or returning manifest flags for future iterations.
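
As an illustration only, the map-like interface could look roughly like the sketch below. The trait shape, `MemoryManifest`, the `Node` stand-in, and the local `Fallible` alias (standing in for the one from the `failure` crate) are all assumptions, not the crate's actual definitions.

```rust
use std::collections::BTreeMap;
use std::error::Error;

/// Stand-in for a 20-byte Mercurial node hash (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Node(pub [u8; 20]);

/// Local stand-in for `failure::Fallible<T>`.
pub type Fallible<T> = Result<T, Box<dyn Error>>;

/// Map-like operations; every call may fail because data retrieval may fail.
pub trait Manifest {
    fn get(&self, path: &str) -> Fallible<Option<Node>>;
    fn insert(&mut self, path: &str, node: Node) -> Fallible<()>;
    fn remove(&mut self, path: &str) -> Fallible<()>;
}

/// Hypothetical in-memory implementation, for illustration.
#[derive(Default)]
pub struct MemoryManifest {
    files: BTreeMap<String, Node>,
}

impl Manifest for MemoryManifest {
    fn get(&self, path: &str) -> Fallible<Option<Node>> {
        Ok(self.files.get(path).copied())
    }

    fn insert(&mut self, path: &str, node: Node) -> Fallible<()> {
        if path.is_empty() {
            return Err("empty path".into());
        }
        self.files.insert(path.to_string(), node);
        Ok(())
    }

    fn remove(&mut self, path: &str) -> Fallible<()> {
        self.files.remove(path);
        Ok(())
    }
}
```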

Reviewed By: DurhamG

Differential Revision: D14016274

fbshipit-source-id: 8f1f83610933b9e9a96f8c5ba2c6e50567c76e06
2019-02-14 13:32:05 -08:00
Stefan Filip
162f93f205 Remove argparse from the lib cargo workspace
Summary: `lib/argparse` fails to build with cargo. Removing the crate from the workspace to unblock building with cargo.

Reviewed By: quark-zju

Differential Revision: D13969332

fbshipit-source-id: 0299f74e6aa81632ce64005d91fa2c30a32f5b96
2019-02-06 16:42:23 -08:00
Arun Kulshreshtha
70aff50986 edenapi: rename mononokeapi to edenapi
Summary: Rename Mononoke API to Eden API, per war room discussion.

Reviewed By: quark-zju

Differential Revision: D13908195

fbshipit-source-id: 94a2fe93f8a89d0c5e9b6a24939cc4760cfaade0
2019-02-05 21:22:48 -08:00
Arun Kulshreshtha
5ae0d91378 url-ext: add url-ext crate
Summary:
Crate adding easy conversions between `http::Uri` and `url::Url`.

Rust has two main types for working with URLs: `http::Uri` and `url::Url`.  `http::Uri` comes from the `http` crate, which is supposed to be a set of common types to be used throughout the Rust HTTP ecosystem, to ensure mutual compatibility between different HTTP crates and web frameworks. This is the type that HTTP clients like Hyper expect when specifying URLs.

Unfortunately, `http::Uri` is a very simple type that does not expose any means of mutating or otherwise manipulating the URL. It can only parse URLs from strings, forcing the users to construct URLs via error-prone string concatenation.

In contrast, the `url::Url` comes from the `rust-url` crate from the Servo project. This type does support easily constructing and manipulating URLs, making it very useful for assembling a URL from components.

The only way to convert between the two types is to first convert back to a string, and then re-parse as the desired type. Several issues [have](https://github.com/hyperium/hyper/issues/1219) [been](https://github.com/hyperium/hyper/issues/1102) [raised](https://github.com/hyperium/hyper/issues/1219) about this upstream, but there has been no consensus or action as of yet. To get around the problem for now, this crate adds convenience methods to perform the conversions.

Reviewed By: DurhamG

Differential Revision: D13887403

fbshipit-source-id: ecfaf3ea9d884621493b0fe44a6b5658d10108b4
2019-01-30 18:30:49 -08:00
Jun Wu
7831e2a4ce cpython-ext: add ways to zero-copy Vec<u8> into a Python object
Summary:
I need to convert `Vec<u8>` to a Python object in a zero-copy way for rustlz4
performance.

Assuming Python and Rust use the same memory allocator, it's possible to transfer
the control of a malloc-ed pointer from Rust to Python. Use this to implement
zero-copy. PyByteArrayObject is chosen because its struct contains such a pointer.
PyBytes cannot be used as it embeds the bytes, without using a pointer.

Sadly there are no CPython APIs to do this job. So we have to write to the raw
structures. That means the code will crash if python is replaced by
python-debug (due to Python object header change). However, that seems less of an
issue given the performance wins. If python-debug does become a problem, we can
try vendoring libpython directly.

I didn't implement a feature-rich `PyByteArray` Rust object. It's not easy to
do so outside the cpython crate. Most helper macros to declare types cannot be
reused, because they refer to `::python`, which is not available in the current
crate.

Reviewed By: DurhamG

Differential Revision: D13516209

fbshipit-source-id: 9aa089b309beb71d4d21f6c63fcb97dbc798b5f8
2018-12-20 17:54:22 -08:00
Mark Thomas
ca135cd33f cpython-failure: Integrate cpython PyResult with the failure crate
Summary:
Adds a new crate `cpython-result`, which provides a `ResultExt` trait, which
extends the failure `Result` type to allow conversion to `PyResult` by
converting the error to an appropriate Python Exception.

Reviewed By: quark-zju

Differential Revision: D12980782

fbshipit-source-id: 44a63d31f9ecf2f77efa3b37c68f9a99eaf6d6fa
2018-12-14 06:43:40 -08:00
Mark Thomas
cf4b52c19c mutationstore: add mutationstore
Summary:
The mutationstore is a new store for recording records of commit mutations for
commits that are not in the local repository.

It uses an indexedlog to store the data.  Each mutation entry corresponds to
the information about the mutation that led to the creation of a particular
commit, which is recorded as the successor in the entry.

Entries can come from three possible places:

* `Commit` metadata for a commit not available locally
* `Obsmarkers` for repos that have been migrated from evolution tracking
* `Synthetic` for entries created synthetically, e.g. by a pullcreatemarkers
  implementation.

The other commits referred to in an entry must predate the successor commit.
For entries that originated from commits, this is ensured, as the successor
commit hash includes the other commit hashes.  For other entry types, it is
an error to refer to later commits, and any entry that causes a cycle will
be ignored.
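
The "ignore entries that would cause a cycle" rule can be sketched as below. This is an assumption about one way to implement the check, not the store's actual code: commits are represented by plain `u32` ids, storage is an in-memory `HashMap` rather than an indexedlog, and the names `Store`, `Entry`, and `Origin` are illustrative.

```rust
use std::collections::{HashMap, HashSet};

/// Where an entry came from (mirrors the three sources listed above).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Origin {
    Commit,
    Obsmarkers,
    Synthetic,
}

pub struct Entry {
    pub origin: Origin,
    pub successor: u32,
    pub predecessors: Vec<u32>,
}

#[derive(Default)]
pub struct Store {
    entries: HashMap<u32, Entry>,
}

impl Store {
    /// True if `target` is reachable from `start` via predecessor links.
    fn reaches(&self, start: u32, target: u32) -> bool {
        let mut seen = HashSet::new();
        let mut todo = vec![start];
        while let Some(n) = todo.pop() {
            if n == target {
                return true;
            }
            if seen.insert(n) {
                if let Some(e) = self.entries.get(&n) {
                    todo.extend(e.predecessors.iter().copied());
                }
            }
        }
        false
    }

    /// Add an entry. Returns false (and ignores the entry) if it would
    /// make the successor one of its own ancestors.
    pub fn add(&mut self, entry: Entry) -> bool {
        let successor = entry.successor;
        if entry.predecessors.iter().any(|&p| self.reaches(p, successor)) {
            return false;
        }
        self.entries.insert(successor, entry);
        true
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```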

Reviewed By: quark-zju

Differential Revision: D12980773

fbshipit-source-id: 040d3f7369a113e710ed8c9f61fabec6c5ec9258
2018-12-14 06:43:40 -08:00
Durham Goode
e9b755198c nodemap: introduce rust bidirectional node map
Summary:
Introduces a nodemap structure that stores the mapping between two
nodes with bidirectional indexes.
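
Conceptually, a bidirectional mapping keeps one index per direction. The sketch below is an in-memory illustration under that assumption. The names (`NodeMap`, `add`, `lookup_by_first`, `lookup_by_second`) and the `[u8; 20]` node type are hypothetical and do not describe the crate's real storage or API.

```rust
use std::collections::HashMap;

/// Stand-in for a 20-byte node hash (assumption).
pub type Node = [u8; 20];

#[derive(Default)]
pub struct NodeMap {
    forward: HashMap<Node, Node>,  // first -> second
    backward: HashMap<Node, Node>, // second -> first
}

impl NodeMap {
    /// Record a pair, dropping stale entries so both indexes stay consistent.
    pub fn add(&mut self, first: Node, second: Node) {
        if let Some(old_second) = self.forward.insert(first, second) {
            self.backward.remove(&old_second);
        }
        if let Some(old_first) = self.backward.insert(second, first) {
            if old_first != first {
                self.forward.remove(&old_first);
            }
        }
    }

    pub fn lookup_by_first(&self, first: &Node) -> Option<&Node> {
        self.forward.get(first)
    }

    pub fn lookup_by_second(&self, second: &Node) -> Option<&Node> {
        self.backward.get(second)
    }
}
```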

Reviewed By: quark-zju

Differential Revision: D13047698

fbshipit-source-id: 967bf4b26a4b57e4fa2421a342edb21d3a5adbf6
2018-12-06 11:47:41 -08:00
Arun Kulshreshtha
365352a0ba mononokeapi: client library for mononoke api server
Summary:
This diff adds a new `mononokeapi` crate, which is a Rust client library for the Mononoke API server. The crate is intended for use beyond Mercurial, and as such attempts to expose functionality in a reasonably generic way.

Right now, the only method supported by this crate is `/health_check`, which is the API server's health check endpoint that simply returns the string "I_AM_ALIVE" on success. Future diffs will expand this crate to include more of the API server's actual functionality. For now, this version serves as a proof of concept of how the crate will be structured.

The crate currently uses the `hyper` crate for its HTTP client, with `native-tls` for TLS support. Given that the client credentials required for mutual authentication with the Mononoke VIP are encoded in a format that `native-tls` does not understand, some credential format conversion via the `openssl` crate is necessary.

Reviewed By: DurhamG

Differential Revision: D13055687

fbshipit-source-id: cc944abd579ce49928776646c0dcce567f99c3b6
2018-12-03 17:46:51 -08:00
Saurabh Singh
2def7c19e2 packaging: back out D10213071 to fix continuous build
Summary: D10213071 broke the continuous build. Therefore, backing it out.

Reviewed By: ikostia

Differential Revision: D10238353

fbshipit-source-id: 0b387f6dd802614112cdc969944cbe4c40582b3d
2018-10-08 08:54:08 -07:00
Jun Wu
1cde64ae27 rustlib: move Cargo.toml to top-level
Summary:
This makes all crates' cache shared and unifies Cargo.lock, which
is used by the next diff.

Reviewed By: ikostia

Differential Revision: D10213071

fbshipit-source-id: 48a979c41423a8e8a9795ff102646cce13c39ff4
2018-10-05 16:43:47 -07:00
Jun Wu
7752e9e81f rustlib: move Node to a separate "types" crate
Summary:
The `Node` type will be used in multiple places. Let's move it to a standalone
crate so new libraries depending on it won't need to pull in all of
revisionstore's dependencies.

Note: I'd also like the `types` crate to only define clean types. Given the
fact NULL_ID is not a great design in Mercurial (`Option<Node>` is a better
choice in Rust), it probably does not belong to the formal Rust `Node` type.
This diff is merely about moving things with minimal changes. NULL_ID will
be decoupled from `Node` in a follow-up.

Reviewed By: markbt

Differential Revision: D10132047

fbshipit-source-id: 5d05c5e0ac06a2d58556c4db11775503f9495626
2018-10-03 18:19:27 -07:00
Harvey Hunt
70a0c74d3b Implement a bookmark store for managing mercurial bookmarks
Summary:
Create a storage object that can be used to load bookmarks from a
mercurial file, modify and query the bookmarks in memory and then write back
to a mercurial bookmark file.
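
The load / modify / query / write-back cycle could be sketched as follows. This assumes the bookmark file consists of lines of the form `<40 hex digit node> <bookmark name>`, and it works on strings instead of files to stay self-contained. `BookmarkStore` and its method names are illustrative, not the crate's actual API.

```rust
use std::collections::BTreeMap;

/// In-memory bookmark table: bookmark name -> hex node.
#[derive(Default)]
pub struct BookmarkStore {
    bookmarks: BTreeMap<String, String>,
}

impl BookmarkStore {
    /// Parse file contents, one "<hex node> <name>" pair per line (assumed format).
    pub fn parse(data: &str) -> Result<Self, String> {
        let mut bookmarks = BTreeMap::new();
        for line in data.lines() {
            let (node, name) = line
                .split_once(' ')
                .ok_or_else(|| format!("malformed line: {:?}", line))?;
            if node.len() != 40 || !node.bytes().all(|b| b.is_ascii_hexdigit()) {
                return Err(format!("invalid node: {:?}", node));
            }
            bookmarks.insert(name.to_string(), node.to_string());
        }
        Ok(Self { bookmarks })
    }

    /// Query the node a bookmark points to.
    pub fn lookup(&self, name: &str) -> Option<&str> {
        self.bookmarks.get(name).map(|s| s.as_str())
    }

    /// Add a bookmark or move an existing one.
    pub fn update(&mut self, name: &str, node: &str) {
        self.bookmarks.insert(name.to_string(), node.to_string());
    }

    /// Remove a bookmark. Returns true if it existed.
    pub fn remove(&mut self, name: &str) -> bool {
        self.bookmarks.remove(name).is_some()
    }

    /// Serialize back to the same line-based format.
    pub fn serialize(&self) -> String {
        self.bookmarks
            .iter()
            .map(|(name, node)| format!("{} {}\n", node, name))
            .collect()
    }
}
```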

Reviewed By: quark-zju

Differential Revision: D9768564

fbshipit-source-id: ed469d0e588ae2200d614bf62a5a0b577e7c6f74
2018-09-20 05:05:08 -07:00
Kostia Balytskyi
25a8ee686f hg: rename pathencoding into encoding
Summary:
In the later diffs I'll add some more functionality there, not strictly
related to encoding paths.

Reviewed By: quark-zju

Differential Revision: D9441427

fbshipit-source-id: 069ab30a24761038fa2c1a4f180bbc0699d38ef9
2018-08-22 09:06:20 -07:00
Jun Wu
e33154698b Back out "Reuse pylz4 encoding between hg and Mononoke into a separate library"
Summary:
Backout D9124508.

This is actually more complex than it seems. It breaks non-buck build
everywhere:

- hgbuild on all platforms. POSIX platforms break because `hg archive` will
  miss `scm/common`. Windows build breaks because of symlink.
- `make local` on GitHub repo because `failure_ext` is not public. The `pylz4`
  Cargo.toml has missing dependencies.

Fixing them correctly seems non-trivial. Therefore let's back out the change to
unblock builds quickly.

The linter change is kept in case we'd like to try again in the future.

Reviewed By: simpkins

Differential Revision: D9225955

fbshipit-source-id: 4170a5f7664ac0f6aa78f3b32f61a09d65e19f63
2018-08-08 12:20:54 -07:00
Tuan Tran
f50d617d2d Reuse pylz4 encoding between hg and Mononoke into a separate library
Summary: Moved the lz4 compression code into a separate module in `scm/common/pylz4` and redirected code referencing the former two files to the new module

Reviewed By: quark-zju, mitrandir77

Differential Revision: D9124508

fbshipit-source-id: e4796cf36d16c3a8c60314c75f26ee942d2f9e65
2018-08-08 10:08:11 -07:00
Jun Wu
9e08d19d8e configparser: add a new Rust library
Summary: This will be used to parse hgrc-like config files.

Reviewed By: mitrandir77

Differential Revision: D8777330

fbshipit-source-id: 73a114df36e23246a3fc1206be202fba8705453a
2018-07-11 17:36:06 -07:00
Durham Goode
28e570113e lib: remove cbincode from cargo workspace
Summary: This doesn't exist.

Reviewed By: quark-zju

Differential Revision: D8743699

fbshipit-source-id: b12c2beb600b2918bee8ca579dbf96bc8ce5288c
2018-07-05 18:50:43 -07:00
Jun Wu
d0c1b6d014 cargo: add a workspace
Summary:
Make `lib` a cargo workspace so building in subprojects would share a
`target` directory and `cargo doc` will build documentation for all
subprojects.

Reviewed By: DurhamG

Differential Revision: D8741175

fbshipit-source-id: 512325bcb23d51e866e764bdc76dddb22c59ef05
2018-07-05 16:06:35 -07:00