Summary: This just reuses the AsyncHistoryStore methods.
Reviewed By: DurhamG
Differential Revision: D13891142
fbshipit-source-id: 9553e9824eebc5eacf6a82f9d0f212a62ec8955f
Summary:
Similarly to AsyncDataStore, this is just a blocking wrapper around a
HistoryStore.
Reviewed By: DurhamG
Differential Revision: D13891140
fbshipit-source-id: 76acadfc1849770b47e2400ce8c70f7e32bba4df
Summary: This will be used to wrap a HistoryStore into an AsyncHistoryStore.
Reviewed By: DurhamG
Differential Revision: D13891139
fbshipit-source-id: 41a0ec740f05268259a654e769ff0909617102ff
Summary: Add metadata to each delta entry written to the datapack. Since the HTTP API never serves LFS files, and the only flag currently used simply indicates whether a file should use LFS, the flag field is intentionally set to `None`, leaving only the size in the metadata (which, since we're storing full file content, is the same as the content length).
Differential Revision: D13894292
fbshipit-source-id: 36db25adb0c46cd1c7fde841a69d3e6d48d08d06
Summary: Give MononokeClient the ability to fetch multiple files concurrently. Right now this functionality is not exposed via the Python bindings, so as far as the Mercurial Python code is concerned, nothing has changed. The multi-get functionality will be used later in the stack.
Differential Revision: D13893575
fbshipit-source-id: c9e514fbeb41bbb37f52f6df3920eb01a66df293
Summary: As `MononokeClient` grows, we're going to add more inherent methods on the struct. To avoid cluttering the `client` module, split out all the builder-related things into a separate module.
Reviewed By: singhsrb
Differential Revision: D13892198
fbshipit-source-id: 42918d8a775d8328cfad8a6ac0365cb336893d8f
Summary: Add a new `get_file()` method to `MononokeClient` that fetches Mercurial file content from the API server and writes it to a datapack in the cache. This functionality is exposed via the new `hg debuggetfile` debug command, which takes a filenode and file path and fetches the corresponding file.
Differential Revision: D13889829
fbshipit-source-id: 2b68bf114ee72d641de7a1043cca1975e34cf4e6
Summary:
Crate adding easy conversions between `http::Uri` and `url::Url`.
Rust has two main types for working with URLs: `http::Uri` and `url::Url`. `http::Uri` comes from the `http` crate, which is supposed to be a set of common types to be used throughout the Rust HTTP ecosystem, to ensure mutual compatibility between different HTTP crates and web frameworks. This is the type that HTTP clients like Hyper expect when specifying URLs.
Unfortunately, `http::Uri` is a very simple type that does not expose any means of mutating or otherwise manipulating the URL. It can only parse URLs from strings, forcing the users to construct URLs via error-prone string concatenation.
In contrast, the `url::Url` comes from the `rust-url` crate from the Servo project. This type does support easily constructing and manipulating URLs, making it very useful for assembling a URL from components.
The only way to convert between the two types is to first convert back to a string, and then re-parse as the desired type. Several issues [have](https://github.com/hyperium/hyper/issues/1219) [been](https://github.com/hyperium/hyper/issues/1102) [raised](https://github.com/hyperium/hyper/issues/1219) about this upstream, but there has been no consensus or action as of yet. To get around the problem for now, this crate adds convenience methods to perform the conversions.
Reviewed By: DurhamG
Differential Revision: D13887403
fbshipit-source-id: ecfaf3ea9d884621493b0fe44a6b5658d10108b4
Summary:
D13853115 adds `edenscm/` to `sys.path` and code still uses `import mercurial`.
That has nasty problems if both `import mercurial` and
`import edenscm.mercurial` are used, because Python would think `mercurial.foo`
and `edenscm.mercurial.foo` are different modules so code like
`try: ... except mercurial.error.Foo: ...`, or `isinstance(x, mercurial.foo.Bar)`
would fail to handle the `edenscm.mercurial` version. There are also some
module-level states (ex. `extensions._extensions`) that would cause trouble if
they have multiple versions in a single process.
Change imports to use `edenscm` so that ideally `mercurial` is no longer
imported at all. Add checks in extensions.py to catch unexpected extensions
importing modules from the old (wrong) locations when running tests.
Reviewed By: phillco
Differential Revision: D13868981
fbshipit-source-id: f4e2513766957fd81d85407994f7521a08e4de48
Summary: Some of the revisionstore imports were unused.
Reviewed By: kulshrax
Differential Revision: D13865074
fbshipit-source-id: 79c7c2ba869f2e1d72fa06aac70a4b027367c831
Summary: Similar to previous diff in this stack, make this type serializable so we can send it as part of an HTTP request.
Reviewed By: singhsrb
Differential Revision: D13858440
fbshipit-source-id: 9173a3e76bcfa6a6600d30ada39d65475f95bc5e
Summary: Make this type serializable so it can be sent as part of an HTTP request. By using Serde, we can easily support a variety of serialization formats without code changes.
Reviewed By: singhsrb
Differential Revision: D13858443
fbshipit-source-id: b6c83f38eaadbb2a28be6d66faf6a3610ede970f
Summary:
The conditional if statement did not prevent the logic inside the
condition from being compiled, which in this case fails on windows. Instead of
using an if, let's just define two functions and conditionally compile the
functions.
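
A minimal sketch of the difference, assuming the code in question sets file permissions (the function name `make_readable_by_all` is hypothetical). `if cfg!(unix) { ... }` only expands to a boolean, so both branches are still type-checked on every platform; `#[cfg(...)]` on an item removes the whole function from the build:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Unix build: `PermissionsExt` only exists on Unix, so this body would fail
/// to compile on Windows even if it were guarded by `if cfg!(unix) { ... }`.
#[cfg(unix)]
pub fn make_readable_by_all(path: &Path) -> io::Result<()> {
    use std::os::unix::fs::PermissionsExt;
    // rw for the owner, read-only for group and others.
    fs::set_permissions(path, fs::Permissions::from_mode(0o644))
}

/// Non-Unix build: a no-op fallback (an assumption made for this sketch).
#[cfg(not(unix))]
pub fn make_readable_by_all(_path: &Path) -> io::Result<()> {
    Ok(())
}
```

Callers use `make_readable_by_all(path)` unconditionally; only one of the two definitions exists in any given build.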
Reviewed By: ikostia
Differential Revision: D13855560
fbshipit-source-id: ac417e6bd8fb272106fe8f3b9a8b7db57214ad88
Summary:
Move top-level Python packages `mercurial`, `hgext` and `hgdemandimport` to
a new top-level package `edenscm`. This allows the Python packages provided by
the upstream Mercurial to be installed side-by-side.
To maintain compatibility, `edenscm/` gets added to `sys.path` in
`mercurial/__init__.py`.
Reviewed By: phillco, ikostia
Differential Revision: D13853115
fbshipit-source-id: b296b0673dc54c61ef6a591ebc687057ff53b22e
Summary:
As a last step towards getting rid of loosefiles, memcache will soon be changed
to produce packfiles. One of the missing pieces to achieve this is the ability to
read and write packfiles asynchronously, as memcache is purely async.
As a first step, we can wrap the packfile into a blocking context.
Reviewed By: DurhamG
Differential Revision: D13806738
fbshipit-source-id: 2211c2a984a453edbb1647830f7f5fb399a03023
Summary:
As a last step towards getting rid of loosefiles, memcache will soon be changed
to produce packfiles. One of the missing pieces to achieve this is the ability to
read and write packfiles asynchronously, as memcache is purely async.
As a first step, we can wrap the packfile into a blocking context.
Reviewed By: DurhamG
Differential Revision: D13804184
fbshipit-source-id: 01fcb57af1558feca662b1070969f553c479871a
Summary:
The tempfile Rust crate opens the file with RW permissions for the user only,
but once written out to disk, the file needs to be readable by everyone.
Unfortunately, Rust doesn't have a portable way of doing this, so we have to
resort to using `if cfg!(unix)` conditions for doing this.
Reviewed By: DurhamG
Differential Revision: D13703406
fbshipit-source-id: 688bc679b5c1a7943ceab723c1f649d555b61a7a
Summary:
This allows de-duplicating the logic for setting proper permissions on the
files. Most of the changes are code movement and rustfmt formatting.
Reviewed By: DurhamG
Differential Revision: D13703392
fbshipit-source-id: 28be85ef2d4b440202cf4885e50e62ac3c41f774
Summary: Allow the credentials for TLS mutual authentication (namely, the client certificate and private key) to come from separate PEM files. At Facebook, these are usually stored in the same file, but Mercurial's standard TLS configuration options allow these to be configured separately. As such, in order to support the standard options (which will happen in a later diff), provide the ability to handle separate files, but for now just pass the same path for both from Python to Rust.
Reviewed By: markbt
Differential Revision: D13791525
fbshipit-source-id: 556d99d77a4273b9b0bd91cac8940da136088e45
Summary: Use a builder struct rather than a constructor function to configure and initialize new `MononokeClient` instances. Doing it this way is helpful because later in this stack, we'll need to pass a lot of additional configuration to `MononokeClient`; adding all of these items as parameters to the constructor quickly becomes unwieldy. Using a builder keeps the number of parameters in check.
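
The pattern can be sketched as follows. The `Client` and `ClientBuilder` types and their fields are hypothetical stand-ins, not the actual `MononokeClient` API:

```rust
/// Hypothetical stand-in for the real client; fields are illustrative only.
#[derive(Debug)]
pub struct Client {
    base_url: String,
    creds: Option<String>,
}

impl Client {
    pub fn base_url(&self) -> &str {
        &self.base_url
    }

    pub fn creds(&self) -> Option<&str> {
        self.creds.as_ref().map(|s| s.as_str())
    }
}

/// Collects optional settings one at a time instead of through a long
/// constructor parameter list.
#[derive(Default)]
pub struct ClientBuilder {
    base_url: Option<String>,
    creds: Option<String>,
}

impl ClientBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn base_url(mut self, url: &str) -> Self {
        self.base_url = Some(url.to_string());
        self
    }

    pub fn creds(mut self, path: &str) -> Self {
        self.creds = Some(path.to_string());
        self
    }

    /// Validate required settings and produce the client.
    pub fn build(self) -> Result<Client, String> {
        let base_url = self
            .base_url
            .ok_or_else(|| "base_url is required".to_string())?;
        Ok(Client {
            base_url,
            creds: self.creds,
        })
    }
}
```

New settings become new builder methods, so existing call sites such as `ClientBuilder::new().base_url(url).build()` keep compiling.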
Differential Revision: D13780408
fbshipit-source-id: bfc43ecbe474d5285ae87d4df9cce244a7ff391d
Summary:
Split up the functionality in `MononokeClient` by moving all of the Mononoke API methods to their own separate trait. This maintains a distinction between functionality that is part of the API vs methods for setting up and configuring the client.
Originally, I had tried to avoid using a trait here because of limitations on trait methods (for example, we can't use `impl Trait` for return types). In practice, I don't think this limitation will be an issue since the API exposed by the client needs to be synchronous (since it will be called by FFI bindings to Python), and as such, there shouldn't be any complex Future return types in the API. (The client will still use async code internally, but the external API will be synchronous.)
Differential Revision: D13780089
fbshipit-source-id: 17e80f549d6ac7c41c60b2b8389eb1760531883e
Summary: Boxed slices are difficult to use in practice, so use `Vec<u8>` instead. (No need for `Bytes` here since there is no reference counting required.)
Reviewed By: DurhamG
Differential Revision: D13770055
fbshipit-source-id: 78f48ac32a4da9c105bf05eb44889c1f492721a8
Summary: Use `Bytes` instead of `Rc<Box<[u8]>>` since the former is a nicer type to represent a reference counted heap allocated byte buffer. (Note that `Rc<Box<[u8]>>` should have originally been `Rc<[u8]>` -- the former introduces an unnecessary allocation and layer of indirection.)
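
To illustrate the note about the extra indirection, here is a sketch using only the standard library (`Bytes` itself comes from the external `bytes` crate and is not shown; the function names are hypothetical):

```rust
use std::rc::Rc;

/// Two levels of indirection: the `Rc` points at a `Box`, which points at
/// the bytes in a second heap allocation.
pub fn nested(data: Vec<u8>) -> Rc<Box<[u8]>> {
    Rc::new(data.into_boxed_slice())
}

/// One level: the reference counts and the bytes live in one allocation.
pub fn flat(data: Vec<u8>) -> Rc<[u8]> {
    Rc::from(data)
}
```

Cloning either value only bumps a reference count; the difference is the extra pointer hop and allocation in the nested form.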
Differential Revision: D13769306
fbshipit-source-id: 5f3e788426e28c7e9ccc478f993c717b23663f56
Summary: Boxed bytes slices (e.g., `Box<[u8]>`, `Rc<[u8]>`) are not very ergonomic to use and are somewhat unusual in Rust code. Use the more common and easier to use `Bytes` type instead. Since this type supports shallow, referenced-counted copies, there shouldn't be any new O(n) copying behavior compared to `Rc<[u8]>`.
Reviewed By: markbt
Differential Revision: D13754730
fbshipit-source-id: d5fbc8e39c84c56d30174f4bb194ee21a14bf944
Summary: Use `failure::Fallible<T>` in place of `Result<T, failure::Error>`.
Reviewed By: singhsrb
Differential Revision: D13754688
fbshipit-source-id: cfbe418f5213884816d4837d1077cd90a17359b6
Summary:
Previously, `use` statements were inconsistently and arbitrarily grouped. This diff groups them in the following order:
- 3rd party crates from crates.io
- local crates
- std library imports (collapsed into a single multiline `use` statement)
- modules within current crate
This new ordering ensures that upon migration to Rust 2018, all imports from within the current crate will be grouped together with the `crate::` prefix.
Reviewed By: singhsrb
Differential Revision: D13754393
fbshipit-source-id: e774c09e0547066afa5f797c1a9c2e5ec4190834
Summary: Run the latest version of rustfmt over the code to ensure consistent style.
Reviewed By: singhsrb
Differential Revision: D13754394
fbshipit-source-id: 6cf5937bcb642530bdf41aaf83399366a9ba3c9a
Summary: There were some warnings about unused private fields in various structs in this crate. Add `#[allow(dead_code)]` as needed to suppress these warnings.
Reviewed By: singhsrb
Differential Revision: D13754234
fbshipit-source-id: ca95a2afbfc67ddb66e7c7436c81cde0fa59f06c
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.
Differential Revision: D13732298
fbshipit-source-id: 2577bc4c34da5b7a88ae2703f9b898bc2a83b816
Summary: The canonical URL type in Rust, `http::Uri`, does not support manipulating URLs easily. (e.g., concatenating path components, etc.) As such, switch to using the `Url` type from the `url` crate, which does support URL manipulation, and convert to `http::Uri` before passing the resulting URL to Hyper.
Reviewed By: phillco
Differential Revision: D13738139
fbshipit-source-id: c7de67f1596ebc1bdde89d3fe87086f49c32b5db
Summary:
Directory listing is different in every OS, and due to the current repack
implementation, this directly affects the order in which the packfiles are added
to the new one. Since the resulting packfile name depends on the hash of its
content, the name was influenced by the directory order.
By sorting the files in list_packs, the packfile name will be independent of
the directory listing and thus be the same for all the OSes.
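
A minimal sketch of the idea, not the actual `list_packs` implementation (the function name and signature below are assumptions):

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Collect all files in `dir` with the given extension, in sorted order.
pub fn list_packs_sorted(dir: &Path, extension: &str) -> io::Result<Vec<PathBuf>> {
    let mut packs = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.extension().map_or(false, |ext| ext == extension) {
            packs.push(path);
        }
    }
    // `read_dir` order is platform-dependent; sorting makes it deterministic.
    packs.sort();
    Ok(packs)
}
```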
Reviewed By: singhsrb
Differential Revision: D13700935
fbshipit-source-id: 01e055a0c1bcf7fb2dc4faf614dfb20cd4499017
Summary: For now, combine all files smaller than 100MB that accumulate to less than 4GB.
Reviewed By: DurhamG
Differential Revision: D13603760
fbshipit-source-id: 3fa74f1ced3d3ccd463af8f187ef5e0254e1820b
Summary: Use the newly introduced PackWriter to write the {data,history}packs.
Reviewed By: markbt
Differential Revision: D13603759
fbshipit-source-id: 528a6af7c4ac3321aeec0559805de12114224cfd
Summary:
The packfiles are currently being written via an unbuffered file. This is
inefficient as every write to the file results in a write(2) syscall.
By buffering these writes we can reduce the number of syscalls and thus
increase the throughput of pack writing operations.
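
As a sketch of the technique (the helper name `write_chunks` is hypothetical), wrapping the file in `std::io::BufWriter` batches many small writes into fewer syscalls:

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Write every chunk to `path` through an in-memory buffer.
pub fn write_chunks(path: &Path, chunks: &[&[u8]]) -> io::Result<()> {
    let mut writer = BufWriter::new(File::create(path)?);
    for chunk in chunks {
        // Lands in the buffer; the file is only written when the buffer fills.
        writer.write_all(chunk)?;
    }
    // Flush explicitly so that write errors are reported, not lost on drop.
    writer.flush()
}
```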
Reviewed By: markbt
Differential Revision: D13603758
fbshipit-source-id: 649186a852d427a1473695b1d32cc9cd87a74a75
Summary:
Update pest to 2.1.0.
This version has a new behaviour for parser error messages: the line feed at
the end of the line is shown in the error output.
Reviewed By: wez
Differential Revision: D13671099
fbshipit-source-id: b8d1142a44a56a0b21b3b72cf027f3f8a30f421e
Summary:
The revisionstore crate currently consists of several public submodules,
each exposing several public types. The APIs exposed by each of the modules
require using types from the other modules. As such, users of this crate are
forced to have complex nested imports to use any of its functionality.
This diff helps ease this problem by reexporting the public types exposed from
each of the public submodules at the top level, thereby allowing crate users to
`use` all of the required types without needing nested imports.
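
The re-export pattern looks roughly like this; the module and type names are placeholders, not the crate's real contents:

```rust
pub mod datapack {
    /// Placeholder type for illustration.
    #[derive(Debug, Default)]
    pub struct DataPack;
}

pub mod historypack {
    /// Placeholder type for illustration.
    #[derive(Debug, Default)]
    pub struct HistoryPack;
}

// Re-export at the top level so users can name the types without
// spelling out the submodule paths.
pub use self::datapack::DataPack;
pub use self::historypack::HistoryPack;
```

With this in place, a user of the crate can write `use the_crate::{DataPack, HistoryPack};` instead of importing from each submodule.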
Reviewed By: singhsrb
Differential Revision: D13686913
fbshipit-source-id: 9fb3cce8783787aa5f3f974c7168afada5952712
Summary:
The latter tries to read from the disk, while the former is purely in memory and
thus more efficient.
Reviewed By: DurhamG, markbt
Differential Revision: D13603757
fbshipit-source-id: 5fd120ba4065d6a65cb2982db9ab81db3ea26524
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.
Differential Revision: D13657313
fbshipit-source-id: ae249bc15037cc2be019ce7ce8a440c153aa31cc
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.
Differential Revision: D13657312
fbshipit-source-id: 55134ee93f1f3aaaeefe5644a4a1f2285603bc1c
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.
Differential Revision: D13657314
fbshipit-source-id: f1a379089972f7f0066c49ddedf606d36b7ac260
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.
Differential Revision: D13657310
fbshipit-source-id: cae73fc239a6ad30bb6ef56a664d1ef5a2a19b5f
Summary:
On some platforms, removing a file can fail if it's still mapped or opened. In
mercurial, this can happen during repack as the datapacks are removed while
still being mapped.
Reviewed By: DurhamG
Differential Revision: D13615938
fbshipit-source-id: fdc1ff9370e2767e52ee1828552f4598105f784f
Summary:
After repacking the data/history packs, we need to clean up the
repacked files. This was an omission from D13363853.
Reviewed By: markbt
Differential Revision: D13577592
fbshipit-source-id: 36e7d5b8e86affe47cdd10d33a769969f02b8a62
Summary:
The python version of the mutable packs set the permission to read-only after
writing them, while the rust version keeps them writeable. Let's make the rust
one more consistent.
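
A portable sketch of the behaviour using only the standard library (the helper name `write_then_freeze` is hypothetical and not the code from this diff):

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Write `contents` to `path`, then mark the file read-only.
pub fn write_then_freeze(path: &Path, contents: &[u8]) -> io::Result<()> {
    fs::write(path, contents)?;
    let mut perms = fs::metadata(path)?.permissions();
    // Clears the write bits on Unix; sets the read-only attribute on Windows.
    perms.set_readonly(true);
    fs::set_permissions(path, perms)
}
```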
Reviewed By: markbt
Differential Revision: D13573572
fbshipit-source-id: 61256994562aa09058a88a7935c16dfd7ddf9d18
Summary:
Use of `write!` requires checking for errors, however in this case, there is no
need to use `write!`, as we just want the error as a string.
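
For illustration, the two approaches side by side (function names are hypothetical):

```rust
use std::fmt::{self, Display, Write};

/// `write!` returns a `Result` that the caller has to handle.
pub fn error_text_with_write<E: Display>(err: &E) -> Result<String, fmt::Error> {
    let mut buf = String::new();
    write!(buf, "{}", err)?;
    Ok(buf)
}

/// `to_string` produces the same text with no `Result` to check.
pub fn error_text<E: Display>(err: &E) -> String {
    err.to_string()
}
```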
Reviewed By: ikostia
Differential Revision: D13596497
fbshipit-source-id: 5892025344936936188cf3a8ca227e71eff57d55
Summary:
When I was debugging an eden importer issue with Puneet, we saw errors caused
by important extensions (ex. remotefilelog, lz4revlog) not being loaded. It
turned out that configparser was checking the "exe dir" to decide where to
load "system configs". For example, if we run:
C:\open\fbsource\fbcode\scm\hg\build\pythonMSVC2015\python.exe eden_import_helper.py
The "exe dir" is "C:\open\fbsource\fbcode\scm\hg\build", and system config is
not there.
Instead of copying "mercurial.ini" to every possible "exe dir", this diff just
switches to a hard-coded system config path. It's now consistent with what we
do on POSIX systems.
The logic to copy "mercurial.ini" to "C:\open\fbsource\fbcode\scm\hg" or
"C:\tools\hg" becomes unnecessary and is removed.
Reviewed By: singhsrb
Differential Revision: D13542939
fbshipit-source-id: 5fb50d8e42d36ec6da28af29de89966628fe5549
Summary:
`test-check-fix-code.t` was failing due to copyright header missing
from certain files. This commit fixes the files by running
```
contrib/fix-code.py FILE
```
as suggested in the failure message.
Reviewed By: DurhamG
Differential Revision: D13538506
fbshipit-source-id: d8063c9a0e665377a9976abeccb68fbef6781950
Summary:
Unfortunately required symbols are not exposed by lz4-sys. So we just declare
them ourselves.
Make sure it compresses better:
In [1]: c=open('/bin/bash').read();
In [2]: from mercurial.rust import lz4
In [3]: len(lz4.compress(c))
Out[3]: 762906
In [4]: len(lz4.compresshc(c))
Out[4]: 626970
While it's much slower for larger data (and compresshc is slower than pylz4):
Benchmarking (easy to compress data, 20MB)...
pylz4.compress: 10328.03 MB/s
rustlz4.compress_py: 9373.84 MB/s
pylz4.compressHC: 1666.80 MB/s
rustlz4.compresshc_py: 8298.57 MB/s
pylz4.decompress: 3953.03 MB/s
rustlz4.decompress_py: 3935.57 MB/s
Benchmarking (hard to compress data, 0.2MB)...
pylz4.compress: 4357.88 MB/s
rustlz4.compress_py: 4193.34 MB/s
pylz4.compressHC: 3740.40 MB/s
rustlz4.compresshc_py: 2730.71 MB/s
pylz4.decompress: 5600.94 MB/s
rustlz4.decompress_py: 5362.96 MB/s
Benchmarking (hard to compress data, 20MB)...
pylz4.compress: 5156.72 MB/s
rustlz4.compress_py: 5447.00 MB/s
pylz4.compressHC: 33.70 MB/s
rustlz4.compresshc_py: 22.25 MB/s
pylz4.decompress: 2375.42 MB/s
rustlz4.decompress_py: 5755.46 MB/s
Note python-lz4 was using an ancient version of lz4. So there could be differences.
Reviewed By: DurhamG
Differential Revision: D13528200
fbshipit-source-id: 6be1c1dd71f57d40dcffcc8d212d40a853583254
Summary:
The `pybuf` provides a way to read `bytes`, `bytearray`, some `buffer` types in
a zero-copy way. The main benefit is to use same code to support different
input types. It's copied to a couple of places. Let's move it to `cpython-ext`.
Reviewed By: DurhamG
Differential Revision: D13516206
fbshipit-source-id: f58881c4bfe651a6fdb84cf317a74c3c8d7a4961
Summary: Make it possible to write content directly into a PyBytes buffer.
Reviewed By: DurhamG
Differential Revision: D13528202
fbshipit-source-id: 8c0a4ed030439a8dc40cdfbd72b1f6734a8b2036
Summary:
This allows decompressing into a pre-allocated buffer. After some experiments,
it seems `bytearray` will just break too many things, ex:
- bytearray is not hashable
- bytearray[index] returns an int
- a = bytearray('x'); b = a; b += '3' # will mutate 'a'
- ''.join([bytearray('')]) will raise TypeError
Therefore we have to use zero-copy `bytes` instead, which is less elegant. But
this API change is a step forward.
Reviewed By: DurhamG
Differential Revision: D13528201
fbshipit-source-id: 1cfaf5d55efdc0d6c0df85df9960fe9682028b08
Summary:
I need to convert `Vec<u8>` to a Python object in a zero-copy way for rustlz4
performance.
Assuming Python and Rust use the same memory allocator, it's possible to transfer
the control of a malloc-ed pointer from Rust to Python. Use this to implement
zero-copy. PyByteArrayObject is chosen because its struct contains such a pointer.
PyBytes cannot be used as it embeds the bytes, without using a pointer.
Sadly there are no CPython APIs to do this job. So we have to write to the raw
structures. That means the code will crash if python is replaced by
python-debug (due to Python object header change). However, that seems less of an
issue given the performance wins. If python-debug does become a problem, we can
try vendoring libpython directly.
I didn't implement a feature-rich `PyByteArray` Rust object. It's not easy to
do so outside the cpython crate. Most helper macros to declare types cannot be
reused, because they refer to `::python`, which is not available in the current
crate.
Reviewed By: DurhamG
Differential Revision: D13516209
fbshipit-source-id: 9aa089b309beb71d4d21f6c63fcb97dbc798b5f8
Summary:
This gives some sense about how fast it is.
Background: I was trying to get rid of python-lz4, by exposing this to Python.
However, I noticed it's 10x slower than python-lz4. Therefore I added some
benchmark here to test if it's the wrapper or the Rust lz4 code.
It does not seem to be this crate:
```
# Pure Rust
compress (100M) 77.170 ms
decompress (~100M) 67.043 ms
# python-lz4
In [1]: import lz4, os
In [2]: b=os.urandom(100000000);
In [3]: %timeit lz4.compress(b)
10 loops, best of 3: 87.4 ms per loop
```
Reviewed By: DurhamG
Differential Revision: D13516205
fbshipit-source-id: f55f94bbecc3b49667ed12174f7000b1aa29e7c4
Summary:
This exposes the underlying lookup functions from `Index`.
Alternatively we can allow access to `Index` and provide an `iter_started_from`
method on `Log` which takes a raw offset. I have been trying to avoid exposing
raw offsets in public interfaces, as they would change after `flush()` and cause
problems.
Reviewed By: markbt
Differential Revision: D13498303
fbshipit-source-id: 8b00a2a36a9383e3edb6fd7495a005bc985fd461
Summary:
This is the missing API before `indexedlog::Index` can fit in the
`changelog.partialmatch` case. It's actually more flexible as it can provide
some example commit hashes while the existing revlog.c or radixbuf
implementation just error out saying "ambiguous prefix".
It can also be "abused" for the semantics of sorted "sub-keys": by replacing
"key" with "key + subkey" when inserting into the index, looking up using "key"
would return a lazy result list (`PrefixIter`) sorted by "subkey". Note:
the radix tree is NOT efficient (both in time and space) when there are common
prefixes. So this use-case needs to be careful.
Reviewed By: markbt
Differential Revision: D13498301
fbshipit-source-id: 637856ebd761734d68b20c15866424b1d4518ad6
Summary: This will be used in prefix lookups.
Reviewed By: markbt
Differential Revision: D13498300
fbshipit-source-id: 3db7a21d6f35a18699d9dc3a0eca71a5410e0e61
Summary:
It makes testing duplicated - now `cargo test` would try running tests on 2 entry points:
lib.rs and indexedlog_dump.rs. Move it to a separate crate to solve the issue.
Reviewed By: markbt
Differential Revision: D13498266
fbshipit-source-id: 8abf07c1272dfa825ec7701fd8ea9e0d1310ec5f
Summary: `write!` result needs to be used.
Reviewed By: markbt
Differential Revision: D13471967
fbshipit-source-id: d48752bcac05dd33b112679d7faf990eb8ddd651
Summary: The former is deprecated and thus compiling revisionstore shows many warnings.
Reviewed By: markbt
Differential Revision: D13379278
fbshipit-source-id: d4b4662a1ad00997de4c46274deaf22f48487328
Summary:
Adds a new crate `cpython-result`, which provides a `ResultExt` trait, which
extends the failure `Result` type to allow conversion to `PyResult` by
converting the error to an appropriate Python Exception.
Reviewed By: quark-zju
Differential Revision: D12980782
fbshipit-source-id: 44a63d31f9ecf2f77efa3b37c68f9a99eaf6d6fa
Summary:
The mutationstore is a new store for recording records of commit mutations for
commits that are not in the local repository.
It uses an indexedlog to store the data. Each mutation entry records
information about the mutation that led to the creation of a particular commit,
which is recorded as the successor in the entry.
Entries can come from three possible places:
* `Commit` metadata for a commit not available locally
* `Obsmarkers` for repos that have been migrated from evolution tracking
* `Synthetic` for entries created synthetically, e.g. by a pullcreatemarkers
implementation.
The other commits referred to in an entry must predate the successor commit.
For entries that originated from commits, this is ensured, as the successor
commit hash includes the other commit hashes. For other entry types, it is
an error to refer to later commits, and any entry that causes a cycle will
be ignored.
Reviewed By: quark-zju
Differential Revision: D12980773
fbshipit-source-id: 040d3f7369a113e710ed8c9f61fabec6c5ec9258
Summary:
The derived debug for Node prints out each byte as a decimal number. Instead,
make the Debug output for nodes look like `Node("hexstring")`.
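
A sketch of such a manual `Debug` implementation, assuming a 20-byte node (the `Node` definition below is a simplified stand-in for the real type):

```rust
use std::fmt;

/// Simplified stand-in for the real `Node` type.
pub struct Node(pub [u8; 20]);

impl fmt::Debug for Node {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "Node(\"")?;
        // Two lowercase hex digits per byte, zero-padded.
        for byte in self.0.iter() {
            write!(f, "{:02x}", byte)?;
        }
        write!(f, "\")")
    }
}
```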
Reviewed By: DurhamG
Differential Revision: D12980775
fbshipit-source-id: 042cbf6eade8403759684969e1f69f7f4e335582
Summary:
Add a utility function for tests to generate a vector of random nodes. This
will be used in future tests.
Reviewed By: DurhamG
Differential Revision: D12980784
fbshipit-source-id: 73fc8643503e11a46a845671df94c912a5e49d23
Summary:
Add traits that extend `std::io::Read` and `std::io::Write` to implement new
`read_node` and `write_node` methods, allowing simple reading and writing of
binary nodes from and to streams.
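
One way such extension traits can be written is with blanket implementations. The trait names and the raw `[u8; 20]` node representation below are assumptions for this sketch:

```rust
use std::io::{self, Read, Write};

pub const NODE_LEN: usize = 20;

/// Adds `read_node` to every reader.
pub trait ReadNodeExt: Read {
    fn read_node(&mut self) -> io::Result<[u8; NODE_LEN]> {
        let mut buf = [0u8; NODE_LEN];
        // Fails with `UnexpectedEof` if fewer than NODE_LEN bytes remain.
        self.read_exact(&mut buf)?;
        Ok(buf)
    }
}

impl<R: Read + ?Sized> ReadNodeExt for R {}

/// Adds `write_node` to every writer.
pub trait WriteNodeExt: Write {
    fn write_node(&mut self, node: &[u8; NODE_LEN]) -> io::Result<()> {
        self.write_all(node)
    }
}

impl<W: Write + ?Sized> WriteNodeExt for W {}
```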
Reviewed By: DurhamG
Differential Revision: D12980778
fbshipit-source-id: fc6751cd43a1693a5a5a3ac93aea74aec5fda4fe
Summary:
The future of mercurial is rust, and one of the missing pieces is repacking of data/history packs. For now, let's implement a very basic packing strategy that just pulls all the packs into one, with one small optimization that puts all the delta chains close together in the output file.
At first, it's expected that this code will be driven by the existing python code, but more and more will be done in rust as time goes on.
Reviewed By: DurhamG
Differential Revision: D13363853
fbshipit-source-id: ad1ac2039e1732f7141d99abf7f01804a9bde097
Summary:
Add a new entry type - INLINE_LEAF, which embeds the EXT_KEY and LINK entries
to save space.
The index size for referred keys is significantly reduced with little overhead:
index insertion (owned key) 3.732 ms
index insertion (referred key) 3.604 ms
index flush 11.868 ms
index lookup (memory) 1.159 ms
index lookup (disk, no verify) 2.175 ms
index lookup (disk, verified) 4.303 ms
index size (5M owned keys) 216626039
index size (5M referred keys) 96616431
11.87s user 2.96s system 98% cpu 15.107 total
The breakdown of the "5M referred keys" size is:
type count bytes
radixes 1729472 33835772
inline_leafs 5000000 62780651
There are no other kinds of entries stored.
Previously, the index size of referred keys is:
index size (5M referred keys) 136245815 bytes
So it's 136MB -> 96MB, a decrease of about 40MB (29%).
Reviewed By: DurhamG
Differential Revision: D13036801
fbshipit-source-id: 27e68e4b6c332c1dc419abc6aba69271952e4b3d
Summary:
Replace the 20-byte "jump table" with 3-byte "flag + bitmap". This saves space
for indexes less than 4GB. There are some reserved bits in the "flag" so if we
run into space issues when indexes are larger than 4GB, we can try adding
6-byte integer, or VLQ back without breaking backwards-compatibility.
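
To illustrate the general bitmap technique (this is not the actual on-disk format; the 16-way fan-out, field names, and in-memory layout are assumptions), a radix node can store only its present children and use a popcount to find a child's position:

```rust
/// Hypothetical in-memory radix node with up to 16 children.
pub struct RadixNode {
    /// Bit `i` is set when the child for nibble `i` exists.
    bitmap: u16,
    /// Offsets of the present children only, in nibble order.
    children: Vec<u64>,
}

impl RadixNode {
    pub fn new(bitmap: u16, children: Vec<u64>) -> Self {
        assert_eq!(bitmap.count_ones() as usize, children.len());
        RadixNode { bitmap, children }
    }

    /// Look up the child offset for `nibble` (0..16).
    pub fn child(&self, nibble: u8) -> Option<u64> {
        assert!(nibble < 16);
        let bit = 1u16 << nibble;
        if self.bitmap & bit == 0 {
            return None;
        }
        // Count the set bits below `bit` to get the index in `children`.
        let index = (self.bitmap & (bit - 1)).count_ones() as usize;
        self.children.get(index).copied()
    }
}
```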
It seems to hurt flush performance a bit, because we have to scan the child
array twice. However, lookup (the most important performance) does not change
much. And the index is more compact.
After:
index flush 19.644 ms
index lookup (disk, no verify) 2.220 ms
index lookup (disk, verified) 4.067 ms
index size (5M owned keys) 216626039 bytes
index size (5M referred keys) 136245815 bytes
Before:
index flush 16.764 ms
index lookup (disk, no verify) 2.205 ms
index lookup (disk, verified) 4.030 ms
index size (5M owned keys) 240838647 bytes
index size (5M referred keys) 160458423 bytes
For the "referred key" case, it's 160->136MB, a 15% decrease.
A detailed breakdown of the index components:
After:
type count bytes (using owned keys)
radixes 1729472 33835772
links 5000000 27886336
leafs 5000000 44629384
keys 5000000 110000000
type count bytes (using referred keys)
radixes 1729472 33835772
links 5000000 27886336
leafs 5000000 44629384
ext_keys 5000000 29894315
Before:
type count bytes (using owned keys)
radixes 1729472 58048380
links 5000000 27886336
leafs 5000000 44903923
keys 5000000 110000000
type count bytes (using referred keys)
radixes 1729472 58048380
links 5000000 27886336
leafs 5000000 44629384
ext_keys 5000000 29894315
Leaf nodes are taking too much space. It seems the next big optimization might
be inlining ext_keys into leafs.
Reviewed By: DurhamG, markbt
Differential Revision: D13028196
fbshipit-source-id: 6043b16fd67a497eb52d20a17e153fcba5cb3e81
Summary:
Since the size test only runs once, we can use a larger number of keys. This is
closer to some production use-cases.
`cargo bench size` shows:
index size (5M owned keys) 240838647
index size (5M referred keys) 160458423
It currently uses 32 bytes per key for 5M referred keys.
Reviewed By: markbt
Differential Revision: D13027880
fbshipit-source-id: 726f5fb2da056e77ab93d82fda9f1afa500d0a8d
Summary:
Add benchmarks about index sizes, and a benchmark of insertion using key
references.
An example `cargo bench` result running on my devserver looks like:
index insertion (owned key) 3.551 ms
index insertion (referred key) 3.713 ms
index flush 20.648 ms
index lookup (memory) 1.087 ms
index lookup (disk, no verify) 2.041 ms
index lookup (disk, verified) 4.347 ms
index size (owned key) 886010
index size (referred key) 534298
Reviewed By: markbt
Differential Revision: D13027879
fbshipit-source-id: 70644c504026ffee2122d857d5035f5b7eea4f42
Summary:
For checksum values like xxhash, there is no benefit using big endian. Switch
to little endian so it's slightly faster on the major platforms we
care about.
This is a breaking change. However, the format is not used in production yet.
So there is no migration code.
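
For illustration, fixed-width little-endian encoding of a 64-bit checksum using the standard library (the helper names are hypothetical):

```rust
/// Serialize a checksum as 8 little-endian bytes.
pub fn encode_checksum_le(checksum: u64) -> [u8; 8] {
    checksum.to_le_bytes()
}

/// Parse a checksum from 8 little-endian bytes.
pub fn decode_checksum_le(bytes: [u8; 8]) -> u64 {
    u64::from_le_bytes(bytes)
}
```

On a little-endian CPU these conversions do not need to swap bytes.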
Reviewed By: markbt
Differential Revision: D13015465
fbshipit-source-id: ca83d19b3328370d089b03a33e848e64b728ef2a
Summary:
Previously, the format of a Log entry was hard-coded - length, xxhash, and
content. The xxhash always takes 8 bytes.
For small (ex. 40-byte) entries, xxhash32 is actually faster and takes less
disk space.
Introduce the "entry flags" concept so we can store some metadata about what
checksum function to use. The concept could be potentially used to support
other new format changes at per entry level in the future.
As we're here, also support data without checksums. That can be useful for
content with its own checksum, like a blob store with its own SHA1 integrity
check.
Performance-wise, log insertion is slower (but the majority of the insertion overhead
would be in the index part), while iteration is a little bit faster, perhaps because
the log can use less data.
Before:
log insertion 15.874 ms
log iteration (memory) 6.778 ms
log iteration (disk) 6.830 ms
After:
log insertion 18.114 ms
log iteration (memory) 6.403 ms
log iteration (disk) 6.307 ms
Reviewed By: DurhamG, markbt
Differential Revision: D13051386
fbshipit-source-id: 629c251633ecf85058ee7c3ce7a9f576dfac7bdf
Summary:
Xxhash result won't usually have leading zeros. So VLQ encoding is not an
efficient choice. Use non-VLQ encoding instead.
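
To see why, consider a common base-128 VLQ scheme with 7 payload bits per byte. The exact VLQ variant used by this crate is not stated here, so the encoder below is an assumption. A 64-bit value with its high bits set needs 10 bytes, against 8 bytes for a fixed-width encoding:

```rust
/// Encode `value` as a base-128 VLQ, least significant group first.
/// The high bit of each byte signals that more bytes follow.
pub fn encode_vlq(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80);
    }
}
```

Small values stay short, but hash output is effectively uniform over the full 64-bit range, so nearly every hash pays the worst case.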
Performance wise, this is noticably faster than before:
log insertion 14.161 ms
log insertion with index 102.724 ms
log flush 11.336 ms
log iteration (memory) 6.351 ms
log iteration (disk) 7.922 ms
10.18s user 3.66s system 97% cpu 14.218 total
log insertion 13.377 ms
log insertion with index 97.422 ms
log flush 11.792 ms
log iteration (memory) 6.890 ms
log iteration (disk) 7.139 ms
10.20s user 3.56s system 97% cpu 14.117 total
log insertion 14.573 ms
log insertion with index 94.216 ms
log flush 18.993 ms
log iteration (memory) 7.867 ms
log iteration (disk) 7.567 ms
9.85s user 3.73s system 96% cpu 14.073 total
log insertion 15.526 ms
log insertion with index 98.868 ms
log flush 19.600 ms
log iteration (memory) 7.533 ms
log iteration (disk) 7.150 ms
10.13s user 4.02s system 96% cpu 14.647 total
log insertion 14.629 ms
log insertion with index 100.449 ms
log flush 20.997 ms
log iteration (memory) 7.299 ms
log iteration (disk) 7.518 ms
10.14s user 3.65s system 96% cpu 14.274 total
This is a format-breaking change. Fortunately we haven't really used the old
format in production yet.
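To illustrate why VLQ is a poor fit for hash values, here is a minimal sketch of a common VLQ scheme (7 payload bits per byte, low bits first). It is an assumption that this matches the encoder used here, and `vlq_encode` is a hypothetical name.

```rust
/// Encode `value` as a VLQ: 7 payload bits per byte, low bits first,
/// with the high bit set on every byte except the last.
fn vlq_encode(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80);
    }
}
```

Small numbers are compact (`vlq_encode(1)` is 1 byte), but a 64-bit value with its top bit set needs 10 bytes, while `u64::to_le_bytes` always takes exactly 8.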
Reviewed By: DurhamG, markbt
Differential Revision: D13015463
fbshipit-source-id: 6e7e4f7a845ea8dbf0904b3902740b65cc7467d5
Summary:
Add some simple benchmarks for "log". The initial results from running on my devserver
look like:
log insertion 33.146 ms
log insertion with index 106.449 ms
log flush 9.623 ms
log iteration (memory) 10.644 ms
log iteration (disk) 11.517 ms
13.75s user 3.61s system 97% cpu 17.778 total
log insertion 27.906 ms
log insertion with index 107.683 ms
log flush 19.204 ms
log iteration (memory) 10.239 ms
log iteration (disk) 11.118 ms
12.89s user 3.55s system 97% cpu 16.924 total
log insertion 31.645 ms
log insertion with index 109.403 ms
log flush 9.416 ms
log iteration (memory) 10.226 ms
log iteration (disk) 10.757 ms
13.07s user 3.02s system 97% cpu 16.423 total
log insertion 31.848 ms
log insertion with index 109.332 ms
log flush 18.345 ms
log iteration (memory) 10.709 ms
log iteration (disk) 11.346 ms
13.12s user 3.70s system 97% cpu 17.276 total
log insertion 29.665 ms
log insertion with index 106.041 ms
log flush 16.159 ms
log iteration (memory) 10.367 ms
log iteration (disk) 11.110 ms
12.99s user 3.27s system 97% cpu 16.717 total
Reviewed By: markbt
Differential Revision: D13015464
fbshipit-source-id: 035fee6c8b6d0bea4cfe194eed3d58ba4b5ebcb8
Summary:
An upcoming diff will need the ability to iterate over all the keys in
the store. So let's expose that functionality.
Reviewed By: quark-zju
Differential Revision: D13062575
fbshipit-source-id: a173fcdbbf44e2d3f09f7229266cca6f3e67944b
Summary:
Introduces a nodemap structure that stores the mapping between two
nodes with bidirectional indexes.
Reviewed By: quark-zju
Differential Revision: D13047698
fbshipit-source-id: 967bf4b26a4b57e4fa2421a342edb21d3a5adbf6
Summary:
You can currently iterate over indexlog entries, but there's no way to
iterate over the keys without keeping a copy of the index function with you.
Let's add a key iterator function.
Reviewed By: quark-zju
Differential Revision: D13010744
fbshipit-source-id: 1fcaf959ae82417e5cbafae7c1927c3ae8f8e76a
Summary: Allow MononokeClient to support both HTTP and HTTPS. The protocol used is determined by the scheme of the server base URI passed in. For example, specifying `https://mononoke-api.internal.tfbnw.net` would use HTTPS, whereas specifying `http://localhost:12345` would use HTTP. This is useful for local testing.
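A dependency-free sketch of choosing the protocol from the URI scheme. The `Scheme` enum and `scheme_of` function are hypothetical and only illustrate the idea; the client itself may well use a proper URI type.

```rust
#[derive(Debug, PartialEq)]
enum Scheme {
    Http,
    Https,
}

/// Pick the protocol from the scheme of a base URI such as `http://localhost:12345`.
/// Returns `None` for a missing or unsupported scheme.
fn scheme_of(base_uri: &str) -> Option<Scheme> {
    let (scheme, _rest) = base_uri.split_once("://")?;
    match scheme.to_ascii_lowercase().as_str() {
        "http" => Some(Scheme::Http),
        "https" => Some(Scheme::Https),
        _ => None,
    }
}
```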
Reviewed By: DurhamG
Differential Revision: D13089197
fbshipit-source-id: 2da72ac98c60746200334e4bcc0e2568abe3073b
Summary:
This diff adds a new `mononokeapi` crate, which is a Rust client library for the Mononoke API server. The crate is intended for use beyond Mercurial, and as such attempts to expose functionality in a reasonably generic way.
Right now, the only method supported by this crate is `/health_check`, which is the API server's health check endpoint that simply returns the string "I_AM_ALIVE" on success. Future diffs will expand this crate to include more of the API server's actual functionality. For now, this version serves as a proof of concept of how the crate will be structured.
The crate currently uses the `hyper` crate for its HTTP client, with `native-tls` for TLS support. Given that the client credentials required for mutual authentication with the Mononoke VIP are encoded in a format that `native-tls` does not understand, some credential format conversion via the `openssl` crate is necessary.
Reviewed By: DurhamG
Differential Revision: D13055687
fbshipit-source-id: cc944abd579ce49928776646c0dcce567f99c3b6
Summary:
Make the Rust BookmarkStore implementation indexed-log backed.
Note that this no longer matches the existing Mercurial bookmark store
on-disk representation.
Reviewed By: DurhamG
Differential Revision: D13133605
fbshipit-source-id: 2e0a27738bcec607892b0edab6f759116929c8e1
Summary:
Before I implement a proper fix [1], let's just use the correct slashes.
[1]
Correct fix is de-verbatimization of canonicalized paths.
So, if `a-symlink->b` and `"C:\a".canonicalize()` produces `\\?\C:\b`, then doing `.push("c/d")` produces `\\?\C:\b\c/d`, where `c/d` is a *single* path component, because the path starts with `\\?\`. If there is no such prefix, it's fine to push forward-slash-separated things into Windows paths.
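One way to sidestep the problem, sketched under the assumption that the relative part is a plain `/`-separated string (`join_relative` is a hypothetical helper, not the fix in this diff), is to push each component separately so the platform separator is always used:

```rust
use std::path::{Path, PathBuf};

/// Append a `/`-separated relative path to `base` one component at a time,
/// so each piece is joined with the platform's own separator.
fn join_relative(base: &Path, rel: &str) -> PathBuf {
    let mut out = base.to_path_buf();
    for part in rel.split('/').filter(|p| !p.is_empty()) {
        out.push(part);
    }
    out
}
```

With this, `c/d` always ends up as two components, whether or not `base` starts with `\\?\`.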
Differential Revision: D13234288
fbshipit-source-id: 2ca0326bbd91ddc6ffd259153915037264292dc1
Summary: I only tested the original diff on Windows, apparently.
Reviewed By: mitrandir77
Differential Revision: D13188952
fbshipit-source-id: 9dc33cb0eedb8d3c09cb7a734528f71afd7cbe8a
Summary:
We need to canonicalize `current_exe` to resolve symlinks on OSX.
Unfortunately, there's no way to just resolve symlinks without
turning the path into a `\\?\` path on Windows, AFAIK.
Once the path is canonicalized on Windows, it starts with `\\?\` and
forward slashes are no longer recognized as valid separators.
Here's a demonstration:
```
> cat src\main.rs
use std::path::Path;
fn main() {
let p = Path::new("\\\\?\\C:\\Code\\fbsource\\fbcode\\scm\\hg\\mercurial/entrypoint.py");
for comp in p.components() { println!("{:?}", comp); }
println!("{:?} exists: {}", p, p.exists());
let p = Path::new("\\\\?\\C:\\Code\\fbsource\\fbcode\\scm\\hg\\mercurial\\entrypoint.py");
for comp in p.components() { println!("{:?}", comp); }
println!("{:?} exists: {}", p, p.exists());
let p = Path::new("C:\\Code\\fbsource\\fbcode\\scm\\hg\\mercurial/entrypoint.py");
for comp in p.components() { println!("{:?}", comp); }
println!("{:?} exists: {}", p, p.exists());
}
> cargo run
Prefix(PrefixComponent { raw: "\\\\?\\C:", parsed: VerbatimDisk(67) })
RootDir
Normal("Code")
Normal("fbsource")
Normal("fbcode")
Normal("scm")
Normal("hg")
Normal("mercurial/entrypoint.py")
"\\\\?\\C:\\Code\\fbsource\\fbcode\\scm\\hg\\mercurial/entrypoint.py" exists: false
Prefix(PrefixComponent { raw: "\\\\?\\C:", parsed: VerbatimDisk(67) })
RootDir
Normal("Code")
Normal("fbsource")
Normal("fbcode")
Normal("scm")
Normal("hg")
Normal("mercurial")
Normal("entrypoint.py")
"\\\\?\\C:\\Code\\fbsource\\fbcode\\scm\\hg\\mercurial\\entrypoint.py" exists: true
Prefix(PrefixComponent { raw: "C:", parsed: Disk(67) })
RootDir
Normal("Code")
Normal("fbsource")
Normal("fbcode")
Normal("scm")
Normal("hg")
Normal("mercurial")
Normal("entrypoint.py")
"C:\\Code\\fbsource\\fbcode\\scm\\hg\\mercurial/entrypoint.py" exists: true
```
Differential Revision: D13176266
fbshipit-source-id: 5f35a3263e058d179b237c80f28e4fdf44105576
Summary:
This is important on OSX where `current_exe` will return the symlink address if `hg.rust` is a symlink.
Therefore, if you create a symlink to the `hg.rust` in the repo (like tests do), repo Python code won't be picked up, and the system code will be.
Reviewed By: mitrandir77
Differential Revision: D13138333
fbshipit-source-id: ffdf27329609d77bee4b8a2eecc47e02cb2dd5c8
Summary:
Add "--dry-run" for fix-code.py and use it in test-check.
This avoids license header and version = "*" issues.
Reviewed By: ikostia
Differential Revision: D10213070
fbshipit-source-id: 9fdd49ead3dfcecf292d5f42c028f20e5dde65d3
Summary:
This is done by running `fix-code.py`. Note that those strings are
semvers so they do not pin down the exact version. An API-compatible upgrade
is still possible.
Reviewed By: ikostia
Differential Revision: D10213073
fbshipit-source-id: 82f90766fb7e02cdeb6615ae3cb7212d928ed48d
Summary:
The "misc" benchmark requires the base16 module to be public. It was made
private in a previous change. Let's make it public again so the benchmark can
run.
Reviewed By: singhsrb
Differential Revision: D13015031
fbshipit-source-id: 0dc1542803aae290de26651e367898eebfc95e83
Summary: This is the final step to make CAT authentication work.
Reviewed By: markbt
Differential Revision: D12975214
fbshipit-source-id: e445ca502f8abaac914140f3f30476d50b3c2fbc
Summary:
To reduce friction with OAuth tokens, we will support CAT tokens as well in Scm Daemon.
Icebreaker support was added in D12942971.
CAT tokens can be generated on dev servers without user interaction (via the tool based on TLS certs),
so we are going to use them in the next diff.
This will allow us to enable token-less cloud sync for everyone; scm daemon will use CATs.
Reviewed By: markbt
Differential Revision: D12962342
fbshipit-source-id: 173301387ee446622bf77b2d6bed6934b5ced2c3
Summary:
If a request comes back Unauthorized, scm daemon will fetch the token again and restart all the subscriptions
rather than trying to reconnect with the same token in an infinite loop.
We know OAuth tokens can be invalidated.
A CAT token (which we are going to support as well) is only valid for a limited time, about 1 day, so we need a smooth way to recover from Unauthorized and issue a fresh token.
Reviewed By: markbt
Differential Revision: D12960843
fbshipit-source-id: 630c446c490b0724df38c61507ee555dc7ed7241
Summary: This is a backport of my upstream patch https://phab.mercurial-scm.org/D4147
Differential Revision: D12970974
fbshipit-source-id: ed9d8db2e32818e6e5ab3f23f5a0097bfa2cc14e
Summary:
The vendored crates were changed by D12811597. Bump `zstd-sys` in `Cargo.toml` to be compatible.
While we're here, also bump the Rust compiler to 1.30.0 so it's consistent with the buck build.
Reviewed By: kulshrax
Differential Revision: D12952552
fbshipit-source-id: 6274bf829b98b16aeb6795209d12aba8b475b46d
Summary:
This diff implements getBlob on top of the mercurial rust
datapack code. It adds a C++ binding on top of the rust code to
make it easier to use and hooks it up in the hg backing store.
Still need to figure this out for our opensource and Windows builds:
* Need to teach them how to build and link the Rust code.
* Need to add a Windows version of the methods that accept paths;
this is just a matter of adding a WCHAR version of the functions.
Reviewed By: strager
Differential Revision: D10433450
fbshipit-source-id: 45ce34fb9c383ea6018a0ca858581e0fe11ef3b5
Summary:
I broke the Windows build because the return type of `path_to_local_bytes` is
different on Windows and Unix, and so must be dealt with differently. They are
different because on Windows we often need to make a copy, whereas on Unix we
can just use references to the byte data. Cows to the rescue: unify them
behind a Cow type.
While we're here, tidy up and unify the docs.
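A simplified sketch of the `Cow` approach. `path_bytes_sketch` is a hypothetical stand-in for `path_to_local_bytes`, and the Windows branch uses a lossy UTF-8 conversion purely as a placeholder for the real local-encoding conversion.

```rust
use std::borrow::Cow;
use std::path::Path;

/// Unix: paths are already bytes, so borrow them without copying.
#[cfg(unix)]
fn path_bytes_sketch(path: &Path) -> Cow<'_, [u8]> {
    use std::os::unix::ffi::OsStrExt;
    Cow::Borrowed(path.as_os_str().as_bytes())
}

/// Windows: a conversion is required, so return an owned copy.
/// The lossy UTF-8 conversion here is only a placeholder.
#[cfg(windows)]
fn path_bytes_sketch(path: &Path) -> Cow<'_, [u8]> {
    Cow::Owned(path.to_string_lossy().into_owned().into_bytes())
}
```

Callers see the same `Cow<[u8]>` type on both platforms and can treat it as a byte slice through deref, regardless of whether a copy was made.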
Reviewed By: quark-zju, ikostia
Differential Revision: D12833091
fbshipit-source-id: e02e308e6f81dd3d8ddf33e76c3073f51d3eccc1
Summary: It needs to be Send to be used in cpython.
Reviewed By: ikostia
Differential Revision: D10250289
fbshipit-source-id: ea57e356a0752764e50db9b6872b5cc4a456303f
Summary:
Make it more detailed for public APIs. Hide overly detailed information (file
format).
Reviewed By: DurhamG
Differential Revision: D10250140
fbshipit-source-id: d9d9af9d67984b80f07db13e69bbffdf77e6a30e
Summary:
The log module is the "entry point" of other features. Update it so things are
more detailed. I tried to make it more friendly for people without knowledge
about the implementation details.
This could probably be further improved by adding some examples. For now, I'm
focusing on the plain English parts.
To reviewers: Let me know how you feel reading it assuming no prior knowledge
of the implementation. Ways to make sentences shorter and more natural to native
speakers without losing important information are also very welcome.
Reviewed By: DurhamG
Differential Revision: D10250141
fbshipit-source-id: 35258c7197c1ce0a1d3d0554fab2f2d2866e123c
Summary:
Make important modules public. Make internal utility (base16) private. Add
some text to the crate-level document. It just refers to important structures.
Will revise document of those structures.
Reviewed By: DurhamG, kulshrax
Differential Revision: D10250143
fbshipit-source-id: c79859ee7d3d9cc4ee9a093ef5d12ec6599f2a42
Summary:
The `VLQEncode` and `VLQDecode` traits erroneously expected the (automatic)
`Sized` marker trait for `Read` and `Write`. This meant they couldn't be used
for trait object `Read`s or `Write`s without jumping through hoops or extra
`mut` keywords.
By not requiring `Sized` we can remove those workarounds.
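A minimal sketch of the difference. The `WriteVlq` trait below is a simplified, hypothetical stand-in for `VLQEncode`; the point is the `?Sized` bound on the blanket impl, which makes the trait usable through `dyn Write`.

```rust
use std::io::{self, Write};

/// Simplified stand-in for a VLQ-writing extension trait.
trait WriteVlq {
    fn write_vlq(&mut self, value: u64) -> io::Result<()>;
}

// `?Sized` lets this impl cover unsized writers such as `dyn Write`.
impl<W: Write + ?Sized> WriteVlq for W {
    fn write_vlq(&mut self, mut value: u64) -> io::Result<()> {
        loop {
            let byte = (value & 0x7f) as u8;
            value >>= 7;
            if value == 0 {
                return self.write_all(&[byte]);
            }
            self.write_all(&[byte | 0x80])?;
        }
    }
}

/// Works directly on a trait object; `out` does not need to be declared `mut`.
fn write_header(out: &mut dyn Write, len: u64) -> io::Result<()> {
    out.write_vlq(len)
}
```

Without `?Sized`, the call in `write_header` would only resolve through the `Write` impl for `&mut dyn Write`, which forces `out` to be declared `mut`.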
Reviewed By: quark-zju
Differential Revision: D12816459
fbshipit-source-id: 16353e8fefff5738bd24a9f41c9d7d250aea56fd
Summary:
If the rust pack stores are used to access truncated pack files, currently they
panic. Instead, return a proper error showing what's wrong.
Reviewed By: quark-zju
Differential Revision: D10868299
fbshipit-source-id: 57fe5ec1ee4ee2a7bb10d2d5c5ca7082dc34125d
Summary:
The `configlist` function converts a config value to a list of strings.
I have thought about using pest to parse it. However, pest might return errors
(ex. `a,",b` does not parse due to missing end quote), while the original logic
can happily parse everything (`a,",b` gets parsed into `['a', '"', 'b']`).
The code might be simplified to make it more obvious that `unwrap()` cannot
panic. But it handles so many corner cases that I'd like to port as-is for
correctness.
Reviewed By: DurhamG
Differential Revision: D9323743
fbshipit-source-id: 5f8be562b7437260b7551d87d751424558d76e8f
Summary: This is just the result of running `./contrib/fix-code.py $(hg files .)`
Reviewed By: ikostia
Differential Revision: D10213075
fbshipit-source-id: 88577c9b9588a5b44fcf1fe6f0082815dfeb363a
Summary:
The histpack format requires that entries in each file section be
written in topological order, so that future readers can compute ancestors by
just linearly scanning. Let's make the rust mutable history pack support this.
Technically the rust historypack reader does not require this for now, but the python
one does, so we need to enforce it.
Reviewed By: kulshrax
Differential Revision: D10441286
fbshipit-source-id: dfdb57182909270b760bd79a100873aa3903a2a5
Summary:
I first forked argparse in May, but I didn't end up making many changes to
it. I want to start over with the current state.
Reviewed By: wez
Differential Revision: D10378110
fbshipit-source-id: 7d4220d79a527c16cfcf2f199f19c0c2f417a7ab
Summary:
During an ancestor traversal, we were adding items to the queue if they
hadn't been processed yet. In a highly merge-y history this could result in adding
an exponential number of items to the queue since we weren't preventing items
from being added until they were actually consumed.
The fix is to just add the items to the seen set as we add them to the queue.
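A small sketch of the fixed traversal over a hypothetical in-memory parent map (the real code walks history pack entries, not a `HashMap`):

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first ancestor walk. Nodes are marked as seen when they are
/// enqueued, so each node enters the queue at most once.
fn ancestors(parents: &HashMap<u32, Vec<u32>>, start: u32) -> Vec<u32> {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();
    let mut order = Vec::new();

    seen.insert(start);
    queue.push_back(start);

    while let Some(node) = queue.pop_front() {
        order.push(node);
        for &parent in parents.get(&node).into_iter().flatten() {
            // `insert` returns false if the parent was already enqueued.
            if seen.insert(parent) {
                queue.push_back(parent);
            }
        }
    }
    order
}
```

In a diamond such as 4 -> {2, 3} -> 1, node 1 is reachable through both 2 and 3 but is enqueued only once.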
Reviewed By: quark-zju
Differential Revision: D10434655
fbshipit-source-id: 430b51adb2d24a99d8c780031f3dbf22c56b9347
Summary:
Noticed this while trying to load the blob for `fbcode/eden/AUTODEPS`;
we'd stack overflow in here because mpatch_fold wouldn't terminate.
Looking at the code in `scm/hg/hgext/extlib/cstore/uniondatapackstore.cpp`,
there is logic to short-circuit when there are no deltas, and throwing that
in here seems to do the right thing.
Reviewed By: quark-zju, ikostia
Differential Revision: D10351279
fbshipit-source-id: 0d340e506fbad2ef056d0b51c474287babf527ce
Summary: As per quark-zju's request in the earlier diff.
Reviewed By: quark-zju
Differential Revision: D10173168
fbshipit-source-id: 20ab1fbc597b8329bbfec5dabd501d202571bdec
Summary:
Following the conversation with quark-zju, this will help us in the future to conditionally and dynamically load
the `hgpython` `.dll`/`.so` only if we need it.
Reviewed By: quark-zju
Differential Revision: D10084949
fbshipit-source-id: c20ef014ad9922913ee36d1ec28b0555b64f7d1f
Summary: The old version cannot be found and it's making the build fail.
Reviewed By: markbt
Differential Revision: D10255834
fbshipit-source-id: d14572885423622ecfe3730bbda07ae1bee7363a
Summary:
This makes all crates' cache shared and unifies Cargo.lock, which
is used by the next diff.
Reviewed By: ikostia
Differential Revision: D10213071
fbshipit-source-id: 48a979c41423a8e8a9795ff102646cce13c39ff4
Summary:
The code block is not a valid Rust program. Mark it as "plain".
This fixes `cargo doc`.
Reviewed By: markbt
Differential Revision: D10137806
fbshipit-source-id: 1197d3a2ebc1450a0738686fa6cfa7c7b79dcb0d
Summary:
The `Node` type will be used in multiple places. Let's move it to a standalone
crate so new libraries depending on it won't need to pull in all of
revisionstore's dependencies.
Note: I'd also like the `types` crate to only define clean types. Given the
fact NULL_ID is not a great design in Mercurial (`Option<Node>` is a better
choice in Rust), it probably does not belong in the formal Rust `Node` type.
This diff is merely about moving things with minimal changes. NULL_ID will
be decoupled from `Node` in a follow-up.
Reviewed By: markbt
Differential Revision: D10132047
fbshipit-source-id: 5d05c5e0ac06a2d58556c4db11775503f9495626
Summary:
Before this patch, `%include` support on Windows is:
# Works fine - UNC path: `\\?\c:\1.rc`.
%include c:\1.rc
# Works fine - UNC path: `\\?\c:\1.rc`.
%include \1.rc
# Works fine - UNC path: `\\?\c:\1.rc`.
%include c:/1.rc
# Bad - UNC path: `\\?\c:/1.rc`.
%include /1.rc
People expect `%include /1.rc` to work on Windows. Fix it by normalizing
the path in `%include` handling.
More context:
Normally, `/` and `\` can be used interchangeably on Windows. But it's not true
for UNC paths. The config parser uses `std::fs::canonicalize` to normalize
paths. The following Python script demonstrates the difference:
>>> import os
>>> open('c:\\1.rc').close()
>>> os.path.exists('\\\\?\\c:\\1.rc')
True
>>> os.path.exists('\\\\?\\c:/1.rc')
False
Reviewed By: phillco
Differential Revision: D10036882
fbshipit-source-id: fd85e0bc86d1e5776701077751ac875e71d60568
Summary:
It's cleaner for the config parser to take care of environment variable
handling.
A side effect of this change is that `$HGPROF` only affects `profiling.type`,
not `profiling:foo.type`, which is more desirable since we don't want
`profiling:foo.type` to be overridden by `$HGPROF`.
Reviewed By: markbt
Differential Revision: D9828547
fbshipit-source-id: 27be3683beee60a4eee6040ca1b4160dc1a89f73
Summary:
Create a storage object that can be used to load bookmarks from a
mercurial file, modify and query the bookmarks in memory and then write back
to a mercurial bookmark file.
Reviewed By: quark-zju
Differential Revision: D9768564
fbshipit-source-id: ed469d0e588ae2200d614bf62a5a0b577e7c6f74
Summary:
Copy functions from Mononoke to implement the Display trait
for a Node.
Reviewed By: quark-zju
Differential Revision: D9768566
fbshipit-source-id: 6961026a9e4cdaf4a0f2592dc9284abebadb0aa3
Summary:
Preserve leading (but not trailing) new lines so the config (where `_` denotes a
space):
x_=__
__Foo
__
is parsed as `"\nFoo"`.
This is useful in template configs.
Reviewed By: ryanmce
Differential Revision: D9929764
fbshipit-source-id: e30659df94937c7c2121627f42ea425191003fb1
Summary:
Be more permissive about spaces. Namely:
- Spaces after a section name like `[foo] ` are allowed.
- Spaces in config names are allowed.
- Spaces at trailing lines are ignored and no longer insert an `\n` to the previous config.
This makes it closer to the older config parser behavior. But it's still
different in some cases: inputs like `[foo]]` and `[foo] # bar` still do not
parse.
Benchmark shows no obvious (within 10%) slowdown. So this is probably fine.
Reviewed By: strager
Differential Revision: D9620253
fbshipit-source-id: 8489ef8e83606d0557db56e8da0a017d55ff1514
Summary:
Maybe useful as a backup for the regular path and also for speeding up syncing.
Scm daemon knows the new and removed heads, so if, for example, there is 1 new and 1 removed head, it is most probably just an amend. Scm daemon can then try the fast path first, depending on the information in the notification, and fall back to the slow path if it fails.
This gives users a better experience before Mononoke: the fast path is much faster, and scm daemon makes 2 attempts anyway!
Reviewed By: quark-zju
Differential Revision: D9309856
fbshipit-source-id: d59f498160a45fab11760b5c1397b48470feb7f8
Summary: This would provide information about performance changes.
Reviewed By: singhsrb
Differential Revision: D9620252
fbshipit-source-id: 51d243b50b349c63e552bd1c43db17497025f73a
Summary:
Local bytes (`&[u8]` or `Vec<u8>`) are frequently wrapped into a `CString`,
because `CString` includes a trailing 0. Let's add a helper for that.
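Such a helper could look roughly like the following (`local_bytes_to_cstring_sketch` is a hypothetical name; the actual helper's name and error type may differ):

```rust
use std::ffi::{CString, NulError};

/// Copy local bytes into a `CString`, which appends the trailing NUL.
/// Fails if the input already contains an interior NUL byte.
fn local_bytes_to_cstring_sketch(bytes: &[u8]) -> Result<CString, NulError> {
    CString::new(bytes)
}
```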
Reviewed By: quark-zju
Differential Revision: D9482435
fbshipit-source-id: 096ba725d83acc9c5fc1fe836dce509fe36e49e9
Summary:
This improvement avoids an extra conversion to String (which can fail if
something cannot be encoded as UTF8).
Reviewed By: quark-zju
Differential Revision: D9447823
fbshipit-source-id: fa13ff9b833cc4edf9f5dc518b3f8712518c97fd
Summary:
liubov-dmitrieva encountered an issue where her home hgrc is not loaded. That's because
environment variables in HGRCPATH are not expanded. Fix it by calling
`expand_path` on the paths.
Reviewed By: phillco
Differential Revision: D9499239
fbshipit-source-id: cd4b7a26fd12f1c3148a21dbb5584bbeb3885286
Summary:
Originally both `mercurial.ini` and `hgrc.d` were looked up in the same location
as the main Mercurial executable, not in the `datadir`. See from `scmwindows.py`:
```
filename = util.executablepath()
# Use mercurial.ini found in directory with hg.exe
progrc = os.path.join(os.path.dirname(filename), "mercurial.ini")
rcpath.append(progrc)
# Use hgrc.d found in directory with hg.exe
progrcd = os.path.join(os.path.dirname(filename), "hgrc.d")
```
Reviewed By: quark-zju
Differential Revision: D9540052
fbshipit-source-id: d5921193dd14fcb46cf428aaa77d26a58aef7868
Summary:
Without this patch, all hg commands will fail with our current config:
hg: parse error: <filename>
--> 6:5
|
6 | commandexception
| ^---
|
= expected new_line
The config is:
[blackbox]
track = command
commandexception
...
Because "\r\n" was treated as the same as double "\n"s.
Reviewed By: ryanmce
Differential Revision: D9494909
fbshipit-source-id: 64ef173c69f3cf61d4e71116c581dbca72fb2c4b
Summary:
This just adds some more tests that, on Windows, exercise the
string encoding APIs directly, while paying attention to the ANSI Code Page.
Reviewed By: quark-zju
Differential Revision: D9441406
fbshipit-source-id: c0873dca9fc8775839a62da60af46ff29e700634
Summary: Another convenience method that I plan to use in the `hgmain` later.
Reviewed By: quark-zju
Differential Revision: D9441426
fbshipit-source-id: 007e4932a344b9d1c8d4d654152bcca5c2362431
Summary:
I think it's more readable to split the implementations into platform-specific
bits.
Reviewed By: quark-zju
Differential Revision: D9441424
fbshipit-source-id: 136d5a00aa4ed8cf4f0886bda0f77a40cba1f542
Summary:
We almost never need an `OEM` code page: Windows API calls use ANSI-encoded
strings if they are `A` calls and Wide strings if they are `W` calls.
Reviewed By: quark-zju
Differential Revision: D9441425
fbshipit-source-id: 979697c349389ea4f7569be9949be3b636f6063c
Summary:
In the later diffs I'll add some more functionality there, not strictly
related to encoding paths.
Reviewed By: quark-zju
Differential Revision: D9441427
fbshipit-source-id: 069ab30a24761038fa2c1a4f180bbc0699d38ef9
Summary:
This diff is the first in the series to make Eden work on Windows. It includes:
1. HG backing store and Object store, which provides the capability to talk to mercurial and fetch the file and folder contents on Windows.
2. Subprocess and Pipe definition for Windows.
3. The Visual studio solution and projects files to compile Eden and scm datapack.
A few important points:
1. Most of the changes to existing code are done under a macro EDEN_WIN so that they don't impact other platforms.
2. Sqlite is used for caching the fetched contents. We are not using Rocksdb on Windows.
3. The main function only calls some test code and exits after printing the output.
4. The initializeMononoke code is disabled for Windows because it needs Proxygen to talk HTTP. Will enable this once I get Proxygen and other dependencies working.
5. HgImporter passes Windows handles to hg_import_helper as command line args. The code to convert these handles into fds is in a separate diff.
Reviewed By: wez
Differential Revision: D8653992
fbshipit-source-id: 52a3c3750425fb92c2a7158c2c214a9372661e13
Summary:
This was meant to be in a prior diff but was forgotten. This also
exposes an issue where we aren't producing ancestors in topological order.
Reviewed By: quark-zju
Differential Revision: D9380009
fbshipit-source-id: 6a49f0f31c3e107353f9192ca15cda0b1b9c3693
Summary:
The config remapping and whitelisting features are hg specific, and are implemented
using the `append_filter` API exposed by `config.rs`. They are more like "extended
features", so move them to `hg.rs`.
Reviewed By: DurhamG
Differential Revision: D9323789
fbshipit-source-id: 89bc4416ee7276c2d1d4db8eba6404747cbb4ec4
Summary:
On Windows `%include` can have paths containing environment variables like
`%PROGRAMDATA%`. We already ship that kind of config file to users, so
let's add support for that.
The change assumes `%` is not used as part of a normal path, which is probably
good enough for practical uses. If `%` does need to be legally used in a
filename, we can add escaping support later.
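A sketch of `%NAME%` expansion under the same assumption (no escaping). `expand_percent_vars` and `demo_lookup` are hypothetical; the lookup is passed in as a function so the example does not depend on the real process environment.

```rust
/// Replace every `%NAME%` in `input` with `lookup(NAME)`.
/// Unknown variables and a lone `%` are left untouched.
fn expand_percent_vars<F>(input: &str, lookup: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    let mut out = String::new();
    let mut rest = input;
    while let Some(start) = rest.find('%') {
        out.push_str(&rest[..start]);
        let after = &rest[start + 1..];
        match after.find('%') {
            Some(end) => {
                let name = &after[..end];
                match lookup(name) {
                    Some(value) => out.push_str(&value),
                    None => {
                        // Keep the original text for unknown variables.
                        out.push('%');
                        out.push_str(name);
                        out.push('%');
                    }
                }
                rest = &after[end + 1..];
            }
            None => {
                // Unpaired `%`: copy the remainder verbatim.
                out.push('%');
                out.push_str(after);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}

/// Fixed lookup table used for demonstration instead of `std::env::var`.
fn demo_lookup(name: &str) -> Option<String> {
    match name {
        "PROGRAMDATA" => Some("C:\\ProgramData".to_string()),
        _ => None,
    }
}
```

In real use the lookup would be something like `|name| std::env::var(name).ok()`.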
Reviewed By: DurhamG
Differential Revision: D9283303
fbshipit-source-id: bcc80307fe19dfc40aea88b6a0a5f69681e835fc
Summary:
v0 history packs require more complicated and slow logic for looking up
a node. Instead of complicating our rust implementation, let's just not support
v0.
Reviewed By: quark-zju
Differential Revision: D9373395
fbshipit-source-id: 6d28a3684966b55a617619e3cae765b2944919a0
Summary:
When calling get_ancestors with 'partial' enabled, we want to return a
key error if the first key can't be found, but not if later keys can't be found.
Reviewed By: singhsrb
Differential Revision: D9367477
fbshipit-source-id: 0e9ad7ea82f83db7326392accab96bd31318f28e
Summary:
Previously HistoryIndex.write() accepted a vector and a hashmap that
contained Box<[u8]>. This diff changes it to be &Box<[u8]>, which allows us to
avoid a ton of allocations.
Reviewed By: quark-zju
Differential Revision: D9350962
fbshipit-source-id: 3f900c551584e3431202f3a30afd61aa10fbb78c
Summary:
I learned that Box::from() can be used to copy a slice into a box, so
let's replace my previous to_vec().into_boxed_slice() with this.
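For reference, the two forms side by side (the function names are just for illustration):

```rust
/// Old form: copy into a Vec, then convert the Vec into a boxed slice.
fn copy_via_vec(slice: &[u8]) -> Box<[u8]> {
    slice.to_vec().into_boxed_slice()
}

/// New form: `Box<[u8]>` implements `From<&[u8]>`, which copies the slice directly.
fn copy_via_box_from(slice: &[u8]) -> Box<[u8]> {
    Box::from(slice)
}
```

Both produce an owned copy with the same contents; the second simply states the intent more directly.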
Reviewed By: quark-zju
Differential Revision: D9350961
fbshipit-source-id: 94053b82cd64923dfabc9acf3a9dab6daca20cf3