Summary:
Make the struct and method names in this crate more clearly reflective of what they do:
- `Auth` -> `AuthGroup`
- `Auth::try_from` -> `AuthGroup::new`
- `AuthConfig` -> `AuthSection`
- `AuthConfig::new` -> `AuthSection::from_config`
- `AuthConfig::auth_for_url` -> `AuthSection::best_match_for`
Reviewed By: singhsrb
Differential Revision: D26436095
fbshipit-source-id: a5ec5d9c48d3b75a0ee166b74e5340f9c529eeae
Summary:
Move these conversion-related functions and traits out of the `hg` module so EdenFS can use them too. Changes:
* Moved `get_opt`, `get_or` and `get_or_default` directly into `ConfigSet`.
* Moved `FromConfigValue` and `ByteCount` into `configparser::convert`.
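The moved accessors behave roughly like this Python sketch (names match the commit; signatures are illustrative, since the real methods are generic Rust methods on `ConfigSet` using `FromConfigValue`):

```python
# Illustrative Python analogue of the three ConfigSet accessors.
def get_opt(config, section, name, parse=str):
    """Return the parsed value, or None if the key is unset."""
    raw = config.get((section, name))
    return None if raw is None else parse(raw)

def get_or(config, section, name, default, parse=str):
    """Return the parsed value, or `default` if the key is unset."""
    value = get_opt(config, section, name, parse)
    return default if value is None else value

def get_or_default(config, section, name, parse=str, default_factory=str):
    """Return the parsed value, or the type's default value if unset."""
    return get_or(config, section, name, default_factory(), parse)
```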
Reviewed By: quark-zju
Differential Revision: D26355403
fbshipit-source-id: 9096b7b737bc4a0cccee1a3883e89a323f864fac
Summary: Add Python bindings to the Rust `auth` crate, with the intention of replacing `httpconnection.readauthforuri`.
Reviewed By: quark-zju
Differential Revision: D26419447
fbshipit-source-id: dd13bea74961137790beb8c96120ebef99e3c313
Summary:
The auth crate is now able to check the presence and expiration of client certificates (D26009207 (9f7d4447fd)). When a problem is detected, it emits an `X509Error`, which specifies exactly what the problem is. Since this error always indicates a certificate issue, we can print out the message configured in `help.tlsauthhelp` (which is more specific than `help.tlshelp` from the previous diff).
Previously, Mercurial would attempt to use the certificate anyway, resulting in a difficult-to-understand error message. Although the previous diffs in this stack improved the error messages on any TLS failure, the `X509Error` messages are even more helpful.
Users can opt in to this certificate validation with `edenapi.validate-certs`. The functionality is gated on a config option to prevent Mercurial from crashing if certificates are misconfigured, but EdenAPI isn't being used.
Reviewed By: quark-zju
Differential Revision: D26385843
fbshipit-source-id: 9809f612f8aab3f2dd442d6dd8dc348f1af45296
Summary: Add a new `TlsError` Python exception type corresponding to `HttpClientError::Tls`.
Reviewed By: quark-zju
Differential Revision: D26385846
fbshipit-source-id: c0df543032461de650a4d24c26c6b8aaab1abbb9
Summary:
One of the primary use cases for hash_to_location is translating user provided
hashes. It is then perfectly valid for the hashes that are provided to not
exist. Where we would previously return an error for the full request if a
hash was invalid, we now omit the hash from the response.
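The behavior change can be sketched like this (names and types are illustrative, not the server's actual API):

```python
# Sketch: unknown hashes are omitted from the response instead of
# failing the entire request.
KNOWN = {"abc1": ("master", 10), "def2": ("master", 42)}  # hash -> location

def hash_to_location(hashes):
    # Old behavior: `KNOWN[h]` would raise on the first unknown hash,
    # turning one bad user-provided hash into a failure for the whole batch.
    return {h: KNOWN[h] for h in hashes if h in KNOWN}
```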
Reviewed By: quark-zju
Differential Revision: D26389472
fbshipit-source-id: c59529d43f44bed7cdb2af0e9babc96160e0c4a7
Summary:
The approach is very similar to what commitrevlogdata does. You could say
that it's cargo-culted.
I am not sure how appropriate it is to return CommitLocationToHashResponse
but I think that it's fine for now.
Reviewed By: quark-zju
Differential Revision: D26374219
fbshipit-source-id: 61d851d5a4fc4223c65078ef434a0c67314a90cd
Summary:
In our upcoming migration away from chef/static rc files, we'll be
marking certain files as "allowed". Our hope is that that list only includes
things like .hg/hgrc, ~/.hgrc, etc.
There are cases, however, where it's convenient to continue to use chef, for
instance when we condition on machine type. To support this, let's add an
allowed_config option, which will allow configs from non-supported locations.
This will also be useful when remediating issues that come up when we start
enforcing allow_location, without rolling back the entire thing.
Reviewed By: quark-zju
Differential Revision: D26233451
fbshipit-source-id: 71789e0361923a6f80de4aef7f012afc0269440d
Summary:
Previously, `write` could fail because the destination file existed as a
directory, or because the parent directory was missing. pyworker handled those
cases by calling `clear_conflicts` to remove conflicting directories and create
missing parent directories, then retrying `write`. Practically, all `write`
use cases (including checkout) always want the `clear_conflicts` behavior.
Therefore, move `clear_conflicts` into vfs `write` and make it private.
Reviewed By: quark-zju
Differential Revision: D26257829
fbshipit-source-id: 03d1da0767202edba61c47ae5654847c0ea3b33e
Summary:
This matches the Rust behavior and is useful for filtering
because the env logger syntax applies to the target:
% RUST_LOG=edenscm.hgext.debugshell=info lhg dbsh
In [1]: from edenscm import tracing
In [2]: tracing.info('foo')
[2021-02-05T19:21:41.082Z INFO edenscm.hgext.debugshell] message="foo"
Reviewed By: kulshrax
Differential Revision: D26282053
fbshipit-source-id: 8ee9e82b955835b24c49f9bf81c7a3aec7a65a33
Summary:
This affects `first`, `last`, `limit` revset functions.
Previously, they just iterated through the set naively. Now they have Rust fast
paths. For example:
In [1]: time repo.revs('last(parents(:1000000),100)')
CPU times: user 5.08 ms, sys: 1.02 ms, total: 6.1 ms
Wall time: 4.96 ms
Out[1]: <nameset+ <spans [0a087e42b29ba5c9ceb3588477d78f7f09ce2663:af5e72c2a0c2e78de462bba8bde63f2499aeb9b5+999900:999999]>>
In [2]: time repo.revs('first(parents(:1000000),100)')
CPU times: user 2.21 ms, sys: 40 µs, total: 2.25 ms
Wall time: 1.83 ms
Out[2]: <nameset+ <spans [06b96ec2a8b60d984606f36c30d3dbc899d804df:4cab7b68c0bbdc13eb2eded8fc8c4c8d520a7189+0:99]>>
In [5]: time repo.revs('first(reverse(parents(:1000000)),100)')
CPU times: user 2.2 ms, sys: 185 µs, total: 2.39 ms
Wall time: 1.67 ms
Out[5]: <nameset- <spans [0a087e42b29ba5c9ceb3588477d78f7f09ce2663:af5e72c2a0c2e78de462bba8bde63f2499aeb9b5+999900:999999]>>
In [6]: time repo.revs('last(reverse(parents(:1000000)),100)')
CPU times: user 2.01 ms, sys: 12 µs, total: 2.02 ms
Wall time: 1.68 ms
Out[6]: <nameset- <spans [06b96ec2a8b60d984606f36c30d3dbc899d804df:4cab7b68c0bbdc13eb2eded8fc8c4c8d520a7189+0:99]>>
In [7]: time repo.revs('limit(reverse(parents(:1000000)),100,10000)')
CPU times: user 1.48 ms, sys: 887 µs, total: 2.37 ms
Wall time: 1.89 ms
Out[7]: <nameset- <spans [5e23b6f07f1512a8991de5a0e883ff4d598ac1d7:f5f68207be4a458e99d4a9200296977aecf44ea2+989900:989999]>>
This, together with fast `_firstancestors`, could potentially answer
globalrev-like queries on the master branch without recording the globalrev
information server side (which has complexities like locking, etc). For
example, to convert a global rev `g` to commit:
c = repo.revs('limit(_firstancestors(master), 1, %s)', g).first()
To convert a commit `c` to global rev:
g = len(repo.revs('_firstancestors(%s)', c)) - 1
Reviewed By: sfilipco
Differential Revision: D26203558
fbshipit-source-id: 14d9247bbb07260f783e05b3fb1034406de48121
Summary:
Previously, `LazySet` was constructed with default `Hints`, which disabled fast
paths. Revise the API so `LazySet` requires an explicit `Hints` to address the
issue.
Reviewed By: sfilipco
Differential Revision: D26203561
fbshipit-source-id: c92cd1f7eb7b40ffaaf53abcf05e64f3d41b906d
Summary:
This just renames types so `IdSet` is the recommended name and `SpanSet`
remains an implementation detail.
Reviewed By: sfilipco
Differential Revision: D26203560
fbshipit-source-id: 7ca0262f3ad6d874363c73445f40f8c5bf3dc40e
Summary:
Optimize the `_firstancestors` revset function using Rust.
When calculating `len(_firstancestors(master))`, the new code took 2ms while
the old code needed 76s, a ~40000x improvement.
Reviewed By: sfilipco
Differential Revision: D26182242
fbshipit-source-id: 55f17b014e727d8e8e3099b7c287f8bf8479279b
Summary:
Optimize the `x~n` revset function using Rust.
Note: This changes the behavior a bit, `x~n` no longer returns `null`.
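A minimal sketch of the semantics (illustrative only; the real fast path lives in Rust):

```python
# x~n follows first parents n steps from x.
FIRST_PARENT = {"c3": "c2", "c2": "c1", "c1": None}

def nth_first_ancestor(x, n):
    for _ in range(n):
        x = FIRST_PARENT[x]
        if x is None:
            # Behavior change: running off the end of the first-parent
            # chain no longer yields `null`; the result is simply empty.
            raise LookupError("first-parent chain exhausted")
    return x
```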
Reviewed By: sfilipco
Differential Revision: D26142683
fbshipit-source-id: d6a45b7e67352d74986274e52002a769bbae772e
Summary: Optimize the "merge()" revset function using the merges() from Rust.
Reviewed By: sfilipco
Differential Revision: D26142169
fbshipit-source-id: 47f426625869b7889b28bb1a18544d4abae36cae
Summary:
The `bytes` crate still does not support zero-copy on mmapped buffers.
Switch to `minibytes::Bytes` so bytes returned from our main storage
backend indexedlog is a zero-copy slice backed by a mmap buffer.
Migrate vfs, revisionstore, pyworker to minibytes so they can preserve
zero-copy mmap buffers from indexedlog.
The edenapi/types crate is unchanged, since it's also used in Mononoke, which
uses `bytes::Bytes` everywhere. The conversion to `minibytes::Bytes` is cheap,
so it's probably not a performance issue.
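The zero-copy property has a direct Python analogue with `mmap` and `memoryview` (an illustration of the idea, not the Rust code):

```python
# A memoryview slice over an mmap'd file references the mapped pages
# directly rather than copying them into the heap, much like a
# minibytes::Bytes slice backed by an indexedlog mmap buffer.
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"indexedlog entry payload")
os.close(fd)

with open(path, "rb") as f:
    buf = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    view = memoryview(buf)[11:16]   # zero-copy slice into the mapping
    data = view.tobytes()
    view.release()
    buf.close()
os.unlink(path)
```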
Reviewed By: kulshrax
Differential Revision: D26218289
fbshipit-source-id: e4f1c631143b7676c6b48d3b4f97055299bfd334
Summary:
The original migration strategy with dynamicconfig was to fix configs
one by one until the dynamicconfig values matched the chef/static ones, then we
can turn off chef/static configs. This looks to be too much work, so we're going
to try a different strategy of just turning off all chef/static configs on a
small number of hosts and seeing what breaks.
The legacylist and disallowlist configs were part of the old strategy, and they
make it more complicated to fix dynamicconfig mismatches, so let's get rid of
them.
Reviewed By: quark-zju
Differential Revision: D26208548
fbshipit-source-id: 63171f1f16aa0498c0eefa994dffaeb8e0cc0d72
Summary:
Currently the data layer eats all errors from remote stores and treats
them as KeyErrors. This hides connection issues from users behind obscure
KeyErrors. Let's make it so that any non-key error reported by the remote store
is propagated up as a legitimate error.
This diff makes Http errors from EdenApi show up with a nicer error message,
suggesting that the user run fixmyserver.
Further fixes will probably be necessary to categorize other errors from the
remote store more nicely.
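The intended contract looks roughly like this (an illustrative sketch; the real change is in the Rust data layer):

```python
# Only a genuine miss is treated as "key not found"; any other
# remote-store failure now propagates to the caller.
def lookup(fetch, key):
    try:
        return fetch(key)
    except KeyError:
        return None  # real miss: caller may fall through to other stores
    # Connection/HTTP errors are deliberately NOT caught here anymore,
    # so they surface as legitimate errors instead of opaque KeyErrors.
```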
Reviewed By: quark-zju
Differential Revision: D26117726
fbshipit-source-id: 7d7dee6ec101c6a1d226185bb27423d977096050
Summary:
Similar to Rust tracing's instrument. It's a decorator for functions that will
generate spans when called.
Reviewed By: sfilipco
Differential Revision: D26021958
fbshipit-source-id: bbb648ab07d6db233cd16f56a5f0441df07e1c2e
Summary:
This allows Python to create span callsites understood by Rust tracing,
and operate on the spans (ex. enter, exit, record, etc.).
Reviewed By: sfilipco
Differential Revision: D26013747
fbshipit-source-id: 2783b29750e3279c5481422bddc83366ff7a3548
Summary:
This allows Python to create callsites for one-off events understood by Rust.
This diff adds the "EventCallsite" for logging one-off events.
Reviewed By: sfilipco
Differential Revision: D26013749
fbshipit-source-id: 2520928dc360852afe2780267036e9d22c212191
Summary:
We used to get those in the old (Python) LFS extension, but didn't have them in
the new one. However, this is helpful to correlate requests to LFS with data in
hg logs. It's also convenient to be able to identify whether a set of requests
are part of the same session or not.
This diff threads the client correlator from Python through to the LFS store,
similarly to how it's done in EdenAPI.
Reviewed By: DurhamG
Differential Revision: D25804930
fbshipit-source-id: a5d5508617fa4184344834bbd8e3423816aa7668
Summary:
The most common scenario where we see matcher errors is when we iterate through
a manifest and the user sends SIGTERM to the process. The matcher may be both
Rust and Python code. The Python code handles the interrupt and prevents future
function calls. The iterating Rust code will continue to call matcher functions
through this time so we get matcher errors from the terminated Python stack.
As long as we have Python matcher code, returning errors from the matcher is legitimate.
It is unclear to me whether the matcher trait should have `Result` return
values when all implementations are Rust. It is easy to imagine implementations
that can fail in different circumstances but the ones that we use if we just
port the Python code wouldn't fail.
All in all, I think that this is a reasonable step forward.
Reviewed By: quark-zju
Differential Revision: D25697099
fbshipit-source-id: f61c80bd0a8caa58040a447ed02d48a1ae84ad60
Summary: This makes it easier to investigate fast path issues.
Reviewed By: sfilipco
Differential Revision: D25598077
fbshipit-source-id: 27b7042fb9510321c25371f8c5d134e248b3d5d5
Summary: This makes it possible to add new commits in a repo without revlog.
Reviewed By: sfilipco
Differential Revision: D25602527
fbshipit-source-id: 56c27a5f00307bcf35efa4517c7664a865c47a43
Summary:
We want to start disallowing non-approved config files from being
loaded. To do that, let's update the config verifier to accept an optional list
of allowed locations. If it's provided, we delete any values that came from a
disallowed location.
This will enable us to prune our config sources down to rust configs,
configerator configs, .hg/hgrc, and ~/.hgrc.
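Conceptually (an illustrative sketch, not the actual configparser API):

```python
# After loading, values whose source file is not in the allowed list
# are deleted. Suffix/prefix matching here is a simplification.
ALLOWED = (".hg/hgrc", "user-hgrc", "builtin:", "configerator:")

def prune_disallowed(values):
    # values: list of (section, name, value, source) tuples
    return [
        v for v in values
        if any(v[3].endswith(a) or v[3].startswith(a) for a in ALLOWED)
    ]
```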
Reviewed By: quark-zju
Differential Revision: D25539738
fbshipit-source-id: 0ece1c7038e4a563c92140832edfa726e879e498
Summary:
Unlike streamcommitrawtext, the new API does not put Python logic to a
background thread. This will make it easier to reason about Python logic as
they do not need to be thread-safe, and we don't need to think about Python GIL
deadlocks in the Rust async world.
Reviewed By: sfilipco
Differential Revision: D25513057
fbshipit-source-id: 4b30d7bab27070badd205ac1a9d54bae7f1f8cec
Summary:
When a blob is redacted server side, the HTTP status code 410 is returned.
Unfortunately, that HTTP code wasn't supported and caused Mercurial to crash.
To fix this, we just need to store a placeholder when receiving this HTTP code
and simply return it up the stack when reading it.
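The fix amounts to something like the following sketch (names are illustrative):

```python
# A 410 response stores a placeholder blob instead of crashing, and
# subsequent reads return that placeholder up the stack.
REDACTED_PLACEHOLDER = b"this content is redacted"

def handle_response(status, body, store, key):
    if status == 200:
        store[key] = body
    elif status == 410:
        store[key] = REDACTED_PLACEHOLDER  # blob redacted server side
    else:
        raise RuntimeError("unexpected HTTP status %d" % status)
```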
Reviewed By: DurhamG
Differential Revision: D25433001
fbshipit-source-id: 66ec365fa2643bcf29e38b114ae1fc92aeaf1a7b
Summary: Make IdConvert async and migrate all its users.
Reviewed By: sfilipco
Differential Revision: D25350915
fbshipit-source-id: f05c89a43418f1180bf0ffa573ae2cdb87162c76
Summary: This will make it easier to make IdConvert async.
Reviewed By: sfilipco
Differential Revision: D25345239
fbshipit-source-id: 684a0843ae32270aa9b537ef9a2b17a28c027e51
Summary: This will make it easier to make IdConvert async.
Reviewed By: sfilipco
Differential Revision: D25345232
fbshipit-source-id: b8967ea51a6141a95070006a289dd724522f8e18
Summary:
Update DagAlgorithm and all its users to async. This makes it easier to make
IdConvert async.
Reviewed By: sfilipco
Differential Revision: D25345236
fbshipit-source-id: d6cf76723356bd0eb81822843b2e581de1e3290a
Summary: This will make it easier if the `Set` passed in requires async.
Reviewed By: sfilipco
Differential Revision: D25345230
fbshipit-source-id: a327d4e5d425b7eb5296b2fbe25c446492aa9ea7
Summary: This makes it easier to make DagAlgorithm async.
Reviewed By: sfilipco
Differential Revision: D25345234
fbshipit-source-id: 5ca4bac38f5aac4c6611146a87f423a244f1f5a2
Summary: This will make it easier to use `.await` if part of `dag` becomes async.
Reviewed By: sfilipco
Differential Revision: D25345237
fbshipit-source-id: 7f07cdaa9c2e0468667638066611fabe3a3f7f28
Summary: Use async function for the PrefixLookup trait.
Reviewed By: sfilipco
Differential Revision: D24840820
fbshipit-source-id: d22cac9f11b06e3127fa956e3f116cf232214125
Summary: This makes `dyn IdConvert` include `PrefixLookup`.
Reviewed By: sfilipco
Differential Revision: D24840819
fbshipit-source-id: 8d4e25c534f6e4397ec6f643eb3aa116bff12a2c
Summary:
Change the main API of NameSet to async. Use the `nonblocking` crate to bridge
the sync and async world for compatibility. Future changes will migrate
Iterator to async Stream.
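A rough Python analogue of that bridge (the Rust side uses the `nonblocking` crate; these names are illustrative):

```python
# The public API becomes async, and sync callers are served by a thin
# shim that drives the coroutine to completion on the spot.
import asyncio

async def contains_async(nameset, name):
    # Stand-in for the now-async NameSet operation.
    return name in nameset

def contains(nameset, name):
    # Sync compatibility shim, analogous to the nonblocking bridge.
    return asyncio.run(contains_async(nameset, name))
```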
Reviewed By: sfilipco
Differential Revision: D24806696
fbshipit-source-id: f72571407a5747a4eabe096dada288656c9d426e
Summary:
```
warning: attribute should be applied to a function or static
--> pyhgtime/src/lib.rs:70:13
|
70 | #[no_mangle]
| ^^^^^^^^^^^^
71 | static timezone: c_long;
| ------------------------ not a function or static
|
= note: `#[warn(unused_attributes)]` on by default
```
Reviewed By: ikostia
Differential Revision: D25345233
fbshipit-source-id: 4b6f753544af1e3e8479dceed908299f6dc57ad5
Summary:
In a future diff we'll use the indexedlog stores for local history. We
want those to exist forever, so let's move IndexedLogHgIdHistoryStore to use a
Store under the hood, and add an enum for distinguishing between the two types
at creation time.
Reviewed By: quark-zju
Differential Revision: D25429675
fbshipit-source-id: 5f2dc494e1175d4c1dc74992d3311d2e55d784ca
Summary:
We temporarily dropped repair support when transitioning to using Store instead
of a raw RotateLog. Let's add that back now.
Reviewed By: xavierd
Differential Revision: D25371622
fbshipit-source-id: e28fc425a6ffb50c93540672b0df75a172ebbe9c
Summary:
In a future diff we'll use the indexedlog stores for local data. We
want those to exist forever, so let's move IndexedLogHgIdDataStore to use a
Store under the hood, and add an enum for distinguishing between the two types
at creation time.
Reviewed By: xavierd
Differential Revision: D23915622
fbshipit-source-id: 296cf6dfcd53e5cf1ae7624fdccedf0a60a77f22
Summary:
Introduce a new API type, `TreeAttributes`, corresponding to the existing type `WireTreeAttributesRequest`, which exposes which optional attributes are available for fetching. An `Option<TreeAttributes>` parameter is added to the `trees` API, and if set to `None`, the client will make a request with `TreeAttributes::default()`.
The Python bindings accept a dictionary for the attributes parameter, and any fields present will overwrite the default settings from `TreeAttributes::default()`. Unrecognized attributes will be silently ignored.
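The bindings' merge behavior can be sketched like this (the field names below are made up for illustration, not the actual `TreeAttributes` fields):

```python
# Defaults, with a dict of overrides merged on top; unrecognized keys
# are silently ignored, per the commit description.
DEFAULTS = {"manifest_blob": True, "parents": True, "child_metadata": False}

def tree_attributes(overrides=None):
    attrs = dict(DEFAULTS)
    for key, value in (overrides or {}).items():
        if key in attrs:           # unknown attributes are dropped
            attrs[key] = bool(value)
    return attrs
```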
Reviewed By: kulshrax
Differential Revision: D25041255
fbshipit-source-id: 5c581c20aac06eeb0428fff42bfd93f6aecbb629
Summary:
The original issue was a rust-cpython bug, solved by D24698226, or https://github.com/dgrunwald/rust-cpython/pull/244.
Original commit changeset: 08f598df0892
Reviewed By: sfilipco
Differential Revision: D24759765
fbshipit-source-id: f9a1359cfce68c8754ddd1bcb8bfc54bf75af7ff
Summary: This is essentially a backout of D24365328 (9f664a8b30).
Reviewed By: DurhamG
Differential Revision: D24544495
fbshipit-source-id: 08f598df0892a8479fac563096f9782038e18dfe
Summary: Rather than silently dropping entries which cannot be fetched, this change has the `WireTreeEntry` type carry optional error information, allowing it to be (de)serialized to / from `Result<TreeEntry, EdenApiServerError>` instead of a bare `TreeEntry`. Currently, handling of these failures is up to the individual application code, but it might be useful to introduce utility functions to drop failed entries and log errors.
Reviewed By: kulshrax
Differential Revision: D24315399
fbshipit-source-id: 94e4593b77cf2dc12d0dcc93d174c8a4eda95344
Summary:
The generatorset has a pure Python implementation of a rewindable generator.
However, it is not thread-safe, and we want thread-safety since the set
iteration will be driven by async Rust in multiple threads.
One of the issues is that while thread 1 is inside `next(gen)`, thread 2 might
call `next(gen)`, which Python does not allow.
Fix them by switching to the Rust RGenerator.
Reviewed By: DurhamG
Differential Revision: D24365328
fbshipit-source-id: 2785e80c7c460a7f754ed23e3af99f4a5c9fbcdf
Summary:
The ContentStore/MetadataStore are made of several different stores; attempting
to expose all of them to Python to drive the repair logic from there would leak
implementation detail of how the stores are implemented.
Instead, let's simply expose a single `repair` function out of the
pyrevisionstore crate that takes care of repairing all of the underlying
stores. For now, this is just moving code around, but a future diff will
integrate the LFS stores.
Reviewed By: DurhamG
Differential Revision: D24449203
fbshipit-source-id: 1631ced9068716453cb404bf7e65cefbf2db5247
Summary:
Pure Python generator has a few limitations:
- Can only be iterated once.
- Cannot be iterated by multiple threads concurrently.
By converting revset iterators to Rust Stream, they run in different threads
and can context switch in `next()`, breaking the program.
Provide a native reentrant generator implementation to address the issue.
This is sound because Rust methods cannot be interrupted by Python.
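The idea can be sketched in Python (a simplified model, not RGenerator's actual code; a lock plays the role that uninterruptible Rust methods play in the real implementation):

```python
# A rewindable, thread-safe generator wrapper: produced items are cached
# under a lock, so the wrapper can be iterated repeatedly and from
# multiple threads without two callers racing on next(gen).
import threading

class Rewindable:
    def __init__(self, gen):
        self._gen = gen
        self._cache = []
        self._done = False
        self._lock = threading.Lock()

    def __iter__(self):
        i = 0
        while True:
            with self._lock:
                if i < len(self._cache):
                    item = self._cache[i]
                elif self._done:
                    return
                else:
                    try:
                        item = next(self._gen)
                    except StopIteration:
                        self._done = True
                        return
                    self._cache.append(item)
            yield item
            i += 1
```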
Reviewed By: DurhamG
Differential Revision: D24365331
fbshipit-source-id: 885dade922b7863a73203b206a96b492d55bccd0
Summary:
Update the return type of streamcommitrawtext from
`{"vertex": v, "raw_text": t}` to `(v, t)`. This makes it easier to use
in Python, as Python supports `for v, t in ...` but does not support
`for {"vertex": v, "raw_text": t} in ...`.
Reviewed By: DurhamG
Differential Revision: D24295457
fbshipit-source-id: 284a29b9deae2d8509d3afea0fcbcaadbfebbae8
Summary:
Include a `User-Agent` header in EdenAPI requests from Mercurial. This will allow us to see the version in Scuba, and in the future, will allow us to distinguish between requests send by Mercurial and those sent directly by EdenFS.
Keeping with the current output of `hg version`, the application is specified as "EdenSCM" rather than "Mercurial".
Reviewed By: singhsrb
Differential Revision: D24347021
fbshipit-source-id: e323cfc945c9d95d8b2a0490e22c2b2505a620dc
Summary:
Include the client correlator string from the `clienttelemetry` extension in each EdenAPI HTTP request via the `X-Client-Correlator` header.
The `ClientIdentityMiddleware` in `gotham_ext` already understands this header (as it is already used by the LFS server), and `gotham_ext`'s `ScubaMiddleware` will automatically include the provided correlator in the server's Scuba samples.
Reviewed By: farnz
Differential Revision: D24282244
fbshipit-source-id: 13d04e706eda38893cff6e740bd1d7bf104e43dd
Summary:
This change introduces two new metadata types, Category and Transience, and a mechanism for Category to provide a default Fault and Transience, which can be overridden by the user.
Also introduces a mechanism for attempting to log exceptions which occur during exception logging, falling back to the previous behavior of just swallowing the exception on failure.
Reviewed By: DurhamG
Differential Revision: D22677565
fbshipit-source-id: 1cf75ca1e2a65964a0ede1f072439378a46bd391
Summary:
Support constructing the "hybrid" commits backend, which is similar to
"doublewrite" but read commit text from edenapi via the `streamcommitrawtext`
method.
Reviewed By: sfilipco
Differential Revision: D23924149
fbshipit-source-id: cb15ee4be7953af7798d460557ba2ae2d4f24a52
Summary:
This can be used like:
In [1]: s=cl.inner.streamcommitrawtext(repo.nodes('.%%master')) # repo.nodes returns a generator, becomes stream
In [2]: s
Out[2]: <stream at 0x7f5eec742df0>
In [3]: list(s)
Out[3]: [{'vertex': ..., 'raw_text': ...}, ...]
In [4]: s.typename()
Out[4]: 'cpython_ext::convert::Serde<hgcommits::ParentlessHgCommit>'
Reviewed By: sfilipco
Differential Revision: D23911870
fbshipit-source-id: f54959a551d446ed5b8086a2235fe74e47b29e70
Summary:
The API is basically to resolve `input_stream` to `output_stream`, with a
stateful "resolver" that can resolve locally and remotely.
Reviewed By: sfilipco
Differential Revision: D23915775
fbshipit-source-id: 14a3a37fc897c8229514acac5c91c7e46b270896
Summary:
`cpython_ext` provides utilities to implement From/ToPyObject directly for
serde types. Let's use it to simplify the code and set up an example.
debugshell:
In [2]: s,f=api.commitdata(repo.name, list(repo.nodes('master')))
In [3]: list(s)
Out[3]:
[{'hgid': (7, 61, 22, ...), 'revlog_data': '...'}]
Note: `HgId` serialization should probably be changed to use `serde_bytes` somehow
so it does not translate to a Python tuple. That will be fixed later.
Reviewed By: kulshrax
Differential Revision: D23966987
fbshipit-source-id: 9278ccae6f543c387eafe401d4ef8d6ce96d370f
Summary:
The py_stream_class causes the code to be more verbose. It basically enforces
the bindings crate to define new types wrapping pure Rust types, and then
define py_stream_class.
In a future diff, I'm adding FromPyObject/ToPyObject support for types that
implements serde Deserialize/Serialize. py_stream_class gets in the way,
because the blanket type from cpython-ext cannot be used in the py_stream_class
macro. cpython-ext is not the proper place to define business-related stream
types.
Therefore, define a type-erased Python class, and implement
FromPyObject/ToPyObject automatically for TStream<anyhow::Result<T>> where
T implements FromPyObject or ToPyObject.
The FromPyObject now converts a Python iterator back to a stream. It's
no longer zero-cost. However, I'd imagine such use cases can be short-cut
using pure Rust code.
Background: Initially, I added some FromPyObject/ToPyObject impls to pure
Rust crates gated by a "pytypes" feature. While that works fine with cargo
build, buck does not support dynamic features and the fact that we support
both py2 and py3 makes it extremely hard to support cleanly in buck build.
For example, if minibytes::Bytes defines ToPyObject for Bytes, then any
crate using minibytes would have 2 different versions: a py2 version, a
py3 version, and they both depend on python. That seems to be a bad approach.
Reviewed By: sfilipco
Differential Revision: D23966984
fbshipit-source-id: eafb31ad458dcbdd8e970d8e419a10fbbe30595f
Summary: We now get progress bar output when fetching from memcache!
Reviewed By: kulshrax
Differential Revision: D24060663
fbshipit-source-id: ff5efa08bced2dac12f1e16c4a55fbc37fbc0837
Summary: Now that we have progress bars in Rust, add them to the EdenAPI client bindings and remove any existing progress bars around callsites in the Python code.
Reviewed By: quark-zju
Differential Revision: D24037797
fbshipit-source-id: eb26ccaae35ab23eb76f6f2b2be575a22e1f1e53
Summary: Add Python bindings to the Rust progress wrappers. This may seem pointless since the Rust code just calls right back into Python, but this is a useful step to get the Rust and Python code to use a common interface for progress. (Which, in turn, will allow switching to a Rust progress implementation down the line.)
Reviewed By: markbt
Differential Revision: D23999816
fbshipit-source-id: 9bca0f23170d3ca474a1cb5d547840e63572ec71
Summary: Add Rust wrappers around Mercurial's Python `progress` module, allowing Rust code to create and use Python progress bars. The wrapper types implement the traits from the `progress` crate, so they can be passed to pure Rust crates in `scm/lib`. In typical usage, the Rust bindings will create a `PyProgressFactory`, which will be passed to pure Rust code as a trait object or via generics.
Reviewed By: markbt
Differential Revision: D23982317
fbshipit-source-id: 4c0fde0b2423b6449c7c5155fdfd98f5da042b0d
Summary: Now that the `async_runtime` crate exists, use Mercurial's global `tokio::Runtime` instead of creating one for each EdenAPI store.
Reviewed By: quark-zju
Differential Revision: D23945569
fbshipit-source-id: 7d7ef6efbb554ca80131daeeb2467e57bbda6e72
Summary: Now that the EdenAPI server is using the `LoadMiddleware` from `gotham_ext`, each response will contain an `X-Load` header that contains the number of active requests that the server is currently handling.
Reviewed By: quark-zju
Differential Revision: D23922809
fbshipit-source-id: 973143de5ddccf074d28aa3ef38d73f9fc1501b6
Summary:
Treemanifest needs to be able to write to the shared stores from paths
other than just prefetch (like when it receives certain trees via a standard
pull). To make this possible we need to expose the Rust shared mutable stores.
This will also make just general integration with Python cleaner.
In the future we can get rid of the non-prefetch download paths and remove this.
Reviewed By: quark-zju
Differential Revision: D23772385
fbshipit-source-id: c1e67e3d21b354b85895dba8d82a7a9f0ffc5d73
Summary:
This simplifies the code a bit, and avoids creating tokio Runtime multiple
times.
Reviewed By: kulshrax
Differential Revision: D23799642
fbshipit-source-id: 21cee6124ef6f9ab6e165891d9ee87b2feb553ac
Summary:
Exercises the PyStream type from cpython-async.
`hg dbsh`:
In [1]: s,f=api._rustclient.commitdata('fbsource', list(repo.nodes('master^^::master')))
In [2]: s
Out[2]: <stream at 0x7ff2db700690>
In [3]: it=iter(s)
In [4]: next(it)
Out[4]: ('6\xf9\x18\xe4\x1c\x05\xfc\xb0\xd3\xb2\xe9\xec\x18E\xec\x0f\x1a:\xb7\xcd', ...)
In [5]: next(it)
Out[5]: ('}\x1f(\xe1o\xf1a\x9b\x81\xb9\x83}\x1b\xbbt\xd2e\xb1\xedb',...)
In [6]: next(it)
Out[6]: ('\xf1\xf0f\x97<\xf3\xdd\xe41w>\x92\xd1\xc0\x9ah\xdd\x87~^',...)
In [7]: next(it)
StopIteration:
In [8]: f.wait()
Out[8]: <bindings.edenapi.stats at 0x7ff2e006a3d8>
In [9]: str(Out[8])
Out[9]: '2.42 kB downloaded in 165 ms over 1 request (0.01 MB/s; latency: 165 ms)'
In [10]: iter(s)
ValueError: stream was consumed
Reviewed By: kulshrax
Differential Revision: D23799645
fbshipit-source-id: 732a5da4ccdee4646386b6080408c0d8958dd67f
Summary:
Exercises the PyFuture type from cpython-async.
`hg dbsh`:
In [1]: api._rustclient.commitdata('fbsource', list(repo.nodes('master^^::master')))
Out[1]:
([...], <future at 0x7f7b65d05060>)
In [2]: f=Out[1][-1]
In [3]: f.wait()
Out[3]: <bindings.edenapi.stats at 0x7f7b665e8228>
In [4]: f.wait()
ValueError: future was awaited
In [5]: str(Out[3])
Out[5]: '2.42 kB downloaded in 172 ms over 1 request (0.01 MB/s; latency: 171 ms)'
Reviewed By: kulshrax
Differential Revision: D23799643
fbshipit-source-id: d4fcef7dca58bc4902bb0809adc065493bb94bd3
Summary:
During an hg update we first prefetch all the data, then write all the
data to disk. There are cases where the prefetched data is not available during
the writing phase, in which case we fall back to fetching the files one-by-one.
This has truly atrocious performance.
Let's allow the worker threads to check for missing data then do bulk fetching
of it. In the case where the cache was completely lost for some reason, this
would reduce the number of serial fetches by 100x.
Note, the background workers already spawn their own ssh connections, so
they're already getting some level of parallelism even when they're doing 1-by-1
fetching. That's why we aren't seeing a 100x improvement in performance.
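The worker-side change can be sketched like this (illustrative names, not the pyworker code):

```python
# Collect all missing keys first, then fetch them in one batch instead
# of falling back to one-by-one fetches on each miss.
def prefetch_missing(cache, keys, fetch_batch):
    missing = [k for k in keys if k not in cache]
    if missing:
        # One bulk request replaces len(missing) serial round trips.
        cache.update(fetch_batch(missing))
    return [cache[k] for k in keys]
```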
Reviewed By: xavierd
Differential Revision: D23766424
fbshipit-source-id: d88a1e55b1c21e9cea7e50fc6dbfd8a27bd97bb0
Summary: This will be used to test Ctrl+C handling with native code.
Reviewed By: kulshrax
Differential Revision: D23759714
fbshipit-source-id: 50da40d475b80da26b7dbc654e010d77cb0ad2d1
Summary: This makes it easier to test the API via debugshell.
Reviewed By: kulshrax
Differential Revision: D23750677
fbshipit-source-id: e29284395f03c9848cf90dd2df187e437890c56e
Summary:
Adds the initial condition and creation logic for creating a Rust
treemanifest store. Fetching and some other code paths don't work just yet, but
subsequent diffs enable more and more functionality.
Reviewed By: quark-zju
Differential Revision: D23662052
fbshipit-source-id: a0e7090c9a3bf27a7738bf093f2d4eb6098b1ed6
Summary:
The Rust pack stores currently have logic to refresh their list of
packs if there's a key miss and if it's been a while since we last loaded the
list of packs. In some cases we want to manually trigger this refresh, like if
we're in the middle of a histedit and it invokes an external command that
produces pack files that the histedit should later consume (like an external
amend, that histedit then needs to work on top of).
Python pack stores solve this by allowing callers to mark the store for a
refresh. Let's add the same logic for Rust stores. Once pack files are gone we
can delete this.
This will be useful for the upcoming migration of treemanifest to Rust
contentstore. Filelog usage of the Rust contentstore avoided this issue by
recreating the entire contentstore object in certain situations, but refresh
seems useful and less expensive.
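Conceptually (an illustrative sketch, not the revisionstore code):

```python
# A pack store with explicit refresh marking: a caller flags the store,
# and the next read rescans the pack directory. The real store also
# refreshes on a key miss after a timeout; that is omitted for brevity.
class PackStore:
    def __init__(self, scan):
        self._scan = scan            # callable returning {key: value}
        self._packs = scan()
        self._needs_refresh = False

    def mark_for_refresh(self):
        # Called e.g. after an external command may have written new
        # packs mid-histedit.
        self._needs_refresh = True

    def get(self, key):
        if self._needs_refresh:
            self._packs = self._scan()
            self._needs_refresh = False
        return self._packs.get(key)
```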
Reviewed By: quark-zju
Differential Revision: D23657036
fbshipit-source-id: 7c6438024c3d642bd22256a8e58961a6ee4bc867
Summary:
Previously the MetadataStore would always construct a mutable pack, even
if the operation was readonly. This meant all read commands required write
access. It also means that random .tmp files get scattered all over the place
when the Rust structures are not properly dropped (like if Python doesn't
bother doing the final gc to call destructors for the Rust types).
Let's just only create mutable packs when we actually need them.
Reviewed By: quark-zju
Differential Revision: D23219961
fbshipit-source-id: a47f3d94f70adac1f2ee763f3170ed582ef01a14
Summary:
Previously the ContentStore would always construct a mutable pack, even
if the operation was readonly. This meant all read commands required write
access. It also meant that random .tmp files would get scattered all over the
place when the Rust structures were not properly destructed (like if Python
doesn't bother doing the final gc to call destructors for the Rust types).
Let's just only create mutable packs when we actually need them.
Reviewed By: quark-zju
Differential Revision: D23219962
fbshipit-source-id: 573844f81966d36ad324df03eecec3711c14eafe
Summary:
This imports the async-compression crate. We have an equivalent-ish in
common/rust, but it targets Tokio 0.1, whereas this community-supported crate
targets Tokio 0.2 (it offers a richer API, notably in the sense that we
can use it for Streams, whereas the async-compression crate we have is only for
AsyncWrite).
In the immediate term, I'd like to use this for transfer compression in
Mononoke's LFS Server. In the future, we might also use it in Mononoke where we
currently use our own async compression crate when all that stuff moves to
Tokio 0.2.
Finally, this also updates zstd: the version we link to from tp2 is actually
zstd 1.4.5, so it's a good idea to just get the same version of the zstd crate.
The zstd crate doesn't keep a great changelog, so it's hard to tell what has changed.
At a glance, it looks like the answer is not much, but I'm going to look to Sandcastle
to root out potential issues here.
Reviewed By: StanislavGlebik
Differential Revision: D23652335
fbshipit-source-id: e250cef7a52d640bbbcccd72448fd2d4f548a48a
Summary:
For repositories that have the old-style LFS extension enabled, the pointers
are stored in packfiles/indexedlog alongside a flag that signifies to the
upper layers that the blob is externally stored. With the new way of doing LFS,
pointers are stored separately.
When both are enabled, we are observing some interesting behavior where
different get and get_meta calls may return different blobs/metadata for the
same filenode. This may happen if a filenode is stored both in a packfile as an
LFS pointer, and in the LFS store. Guaranteeing that the revisionstore code is
deterministic in this situation is unfortunately way too costly (a get_meta
call would for instance have to fully validate the sha256 of the blob, and this
wouldn't guarantee that it wouldn't become corrupted on disk before calling
get).
The solution taken here is to simply ignore all the LFS pointers from
packfiles/indexedlog when remotefilelog.lfs is enabled. This way, there is no
risk of reading the metadata from the packfiles, and the blob from the
LfsStore. This brings, however, another complication for user-created blobs:
these are stored in packfiles and would thus become unreadable. The solution is
to simply perform a one-time full repack of the local store to make sure that
all the pointers are moved from the packfiles to the LfsStore.
In the code, the Python bindings use ExtStoredPolicy::Ignore directly, as
these are only used in the treemanifest code where no LFS pointers should be
present. The repack code uses ExtStoredPolicy::Use to be able to read the
pointers; it wouldn't be able to otherwise.
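The effect of the two policies can be sketched in Python. The flag value and entry shape here are illustrative, not the real revisionstore representation:

```python
from enum import Enum

class ExtStoredPolicy(Enum):
    """Mirrors the Rust enum: whether externally-stored (LFS) pointers
    found in packfiles are readable or treated as absent."""
    USE = "use"
    IGNORE = "ignore"

LFS_FLAG = 0x2000  # hypothetical flag value marking an externally stored blob

def read_entry(entry, policy):
    """Return the entry's data, or None if it is an ignored LFS pointer."""
    if entry.get("flags", 0) & LFS_FLAG and policy is ExtStoredPolicy.IGNORE:
        # Pretend the pointer is not there at all, so the LfsStore (or the
        # server) is consulted instead of mixing pointer and blob metadata.
        return None
    return entry["data"]

pointer = {"flags": LFS_FLAG, "data": b"version https://git-lfs... (pointer)"}
print(read_entry(pointer, ExtStoredPolicy.IGNORE))  # None
print(read_entry(pointer, ExtStoredPolicy.USE) is not None)  # True
```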
Reviewed By: DurhamG
Differential Revision: D22951598
fbshipit-source-id: 0e929708ba5a3bb2a02c0891fd62dae1ccf18204
Summary:
In order to keep the hgcache size bounded we need to keep track of pack
file size even during normal operations and delete excess packs.
This has the negative side effect of deleting necessary data if the operation is
legitimately huge, but we'd rather have extra downloading time than fill up the
entire disk.
Reviewed By: quark-zju
Differential Revision: D23486922
fbshipit-source-id: d21be095a8671d2bfc794c85918f796358dc4834
Summary:
As the repository grows the opportunity for large downloads increases.
Today all writes to data packs get sent straight to disk, but we have no way to
prevent this from eating all the disk.
Let's automatically flush datapacks when they reach a certain size (default
4GB). In a future diff this will let us automatically garbage collect data packs
to bound the maximum size of packs.
Rotatelog already has this behavior.
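The flush-past-a-threshold behavior can be sketched as follows. The class and its fields are illustrative, not the actual datapack implementation:

```python
class MutablePackSketch:
    """Illustrative mutable datapack that flushes itself past a byte threshold."""

    def __init__(self, max_bytes=4 * 1024**3):  # default 4GB, per the summary
        self.max_bytes = max_bytes
        self.pending = 0
        self.flushed_packs = 0

    def add(self, data):
        self.pending += len(data)
        if self.pending >= self.max_bytes:
            self.flush()  # automatic flush bounds disk usage on huge downloads

    def flush(self):
        if self.pending:
            self.flushed_packs += 1  # stand-in for writing a pack file to disk
            self.pending = 0

store = MutablePackSketch(max_bytes=10)
for chunk in (b"aaaa", b"bbbb", b"cccc", b"dddd"):
    store.add(chunk)
store.flush()
print(store.flushed_packs)  # 2: one automatic flush, one final explicit flush
```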
Reviewed By: quark-zju
Differential Revision: D23478780
fbshipit-source-id: 14f9f707e8bffc59260c2d04c18b1e4f6bdb2f90
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).
This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.
---
*Why now?* **:** The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land to the 2.0 branch.
---
Reviewed By: zertosh
Differential Revision: D23568779
fbshipit-source-id: 477200f35b280a4f6471d8e574e37e5f57917baf
Summary:
Now that the Rust revisionstore records undesired filename fetches,
let's log those results to Scuba in Python.
Reviewed By: StanislavGlebik
Differential Revision: D23462572
fbshipit-source-id: b55f2290e30e3a5c3b67d9f612b24bc3aad403a8
Summary:
Spawning processes turns out to be tricky.
Python 2:
- "fork & exec" in plain Python is potentially dangerous. See D22855986 (c35b8088ef).
Disabling GC might have solved it, but still seems fragile.
- "close_fds=True" works on Windows if there is no redirection.
- Does not work well with `disable_standard_handle_inheritability` from `hgmain`.
We patched it. See `contrib/python2-winbuild/0002-windows-make-subprocess-work-with-non-inheritable-st.patch`.
Python 3:
- "subprocess" uses native code for "fork & exec". It's safer.
- (>= 3.8) "close_fds=True" works on Windows even with redirection.
- "subprocess" exposes options to tweak low-level details on Windows.
Rust:
- No "close_fds=True" support for both Windows and Unix.
- Does not have the `disable_standard_handle_inheritability` issue on Windows.
- Impossible to cleanly support "close_fds=True" on Windows with existing stdlib.
https://github.com/rust-lang/rust/pull/75551 attempts to add that to stdlib.
D23124167 provides a short-term solution that can have corner cases.
Mercurial:
- `win32.spawndetached` uses raw Win32 APIs to spawn processes, bypassing
the `subprocess` Python stdlib.
- Its use of `CreateProcessA` is undesirable. We probably want `CreateProcessW`
(unless `CreateProcessA` speaks utf-8 natively).
We are still on Python 2 on Windows, and we'd need to spawn processes correctly
from Rust anyway, and D23124167 kind of fills the missing feature of `close_fds=True`
from Python. So let's expose the Rust APIs.
The binding APIs closely match the Rust API. So when we migrate from Python to
Rust, the translation is more straightforward.
Reviewed By: DurhamG
Differential Revision: D23124168
fbshipit-source-id: 94a404f19326e9b4cca7661da07a4b4c55bcc395
Summary: This would be used to avoid excessive memory usage during pull.
Reviewed By: DurhamG
Differential Revision: D23408833
fbshipit-source-id: 8edd95ab8201697074f65cc118d14755a230567d
Summary: This makes the code simpler.
Reviewed By: sfilipco
Differential Revision: D23269858
fbshipit-source-id: bb9ac0bd1696f7429ca1856e6c63e04fabc2757a
Summary: Provide a way to see segments.
Reviewed By: sfilipco
Differential Revision: D23196408
fbshipit-source-id: b1418f945a5a3364ac73b0f97466d973dd4b6300
Summary: This is more consistent with `id_map_snapshot`.
Reviewed By: sfilipco
Differential Revision: D23182519
fbshipit-source-id: 62b7fc8bfdc9d6b3a4639a6518ea084c7f3807dd
Summary:
Based on [user report](https://fb.workplace.com/groups/scm/permalink/3128221090560823/).
Note that slices in Rust behave differently: if the index exceeds the slice size, this will always panic. My fix was based on the assumption that the behavior should be similar to Python's.
Reviewed By: quark-zju
Differential Revision: D23263922
fbshipit-source-id: 3d2a1a1b59f14e43b1f1a2b7102982b11637c0b4
Summary: Migrate to concrete types so it can be typechecked.
Reviewed By: DurhamG
Differential Revision: D23095469
fbshipit-source-id: 27c6da30ca8a1329df544cd2ded7d9734593e48a
Summary: Expose the Rust API so `getdag` can choose to skip successors or predecessors.
Reviewed By: markbt
Differential Revision: D23036056
fbshipit-source-id: 30cd437c5420d2d10176e33ef9de98814046f4ce
Summary:
The new path does not calculate the complicated `successorssets`, and is
known to make operations in wez's repo significantly faster (that repo, I
suspect, is slowed by a very long mutation chain).
The new code is about 3x faster on my repo too:
```
# before
In [1]: list(repo.nodes('draft()'))
In [2]: %time len(m.mutation.obsoletenodes(repo))
CPU times: user 246 ms, sys: 42.2 ms, total: 288 ms
Wall time: 316 ms
Out[2]: 1127

# after
In [1]: list(repo.nodes('draft()'))
In [2]: %time len(m.mutation.obsoletenodes(repo))
CPU times: user 74.3 ms, sys: 7.92 ms, total: 82.3 ms
Wall time: 82.3 ms
Out[2]: 1127
```
Reviewed By: markbt
Differential Revision: D23036063
fbshipit-source-id: afd6ac122bb5d8d513b5cdc033e04d2c377286eb
Summary: This will be useful for the `obsolete()` set.
Reviewed By: sfilipco
Differential Revision: D23036072
fbshipit-source-id: 2f944ef31cf19f902622d90545fa02b7dda89221
Summary:
This trades a bit performance (calculating the snapshot) for correctness (no
pointer reuse issues) and convenience (set captures dag information with them
and enables use-cases like converting NameSet from another dag to the
current dag without requiring extra `dag` objects).
Reviewed By: sfilipco
Differential Revision: D23036067
fbshipit-source-id: 2e691f09ad401ba79dbc635e908d79e54dadca5e
Summary:
This API allows the underlying Dag to provide a snapshot. The snapshot can then
be used in places that do not want a lifetime (ex. NameSet).
Reviewed By: sfilipco
Differential Revision: D22970579
fbshipit-source-id: ededff82009fd5b4583f871eef084ec907b45d33
Summary:
The only intended use of the inverse DAG is to implement the Python dag
interface in `dagutil.py`. D22519589 (2d4d44cf3d) stack changed it so the Python dag
interface becomes optional. Therefore there is no need to keep the inverse DAG
interface, which is a bit tricky on sorting.
Reviewed By: sfilipco
Differential Revision: D22970581
fbshipit-source-id: 58a126b41d992e75beaf76ece25cb578ee84760b
Summary:
This will be used for migrating revlog DAG to segmented changelog. It does not
migrate commit text data (which can take 10+ minutes).
Reviewed By: DurhamG, sfilipco
Differential Revision: D22970582
fbshipit-source-id: 125a8726d48e15ceb06edb139d6d5b2fc132a32c
Summary: Update bindings to expose the DoubleWrite backend and the DescribeBackend API.
Reviewed By: sfilipco
Differential Revision: D22970574
fbshipit-source-id: bdb52ff21dd0b9ffa0be214b4a4824025f460092
Summary:
This new disallowlist will let us specify config section.key's which
should not be accepted from old rc files. This will let us incrementally disable
loading of those configs from the old files, which will then let us delete them
from the old rc's and eventually delete the old rc's entirely.
This diff also removes hgrc.local and hgrc.od from the list of configs we
verify, since those are not on the list of configs that need to be removed in
this initiative.
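The disallowlist idea can be sketched in Python. The entries and function are hypothetical stand-ins for logic that actually lives in the Rust configparser:

```python
# Hypothetical disallowed "section.key" names; the real list is part of the
# config verification machinery, not hardcoded like this.
DISALLOWED = {"ui.username", "extensions.rage"}

def filter_old_rc(pairs):
    """Drop disallowed section.key pairs loaded from an old rc file."""
    return {
        (section, key): value
        for (section, key), value in pairs.items()
        if "%s.%s" % (section, key) not in DISALLOWED
    }

loaded = {("ui", "username"): "alice", ("ui", "editor"): "vim"}
kept = filter_old_rc(loaded)
print(sorted(kept))  # only ('ui', 'editor') survives
```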
Reviewed By: quark-zju
Differential Revision: D23065595
fbshipit-source-id: 5cd742d099efd651174cab5e87bb7cdc4bae8054
Summary:
This threads the calls to load_dynamic and load_repo through the Rust
layer up to the Python bindings. This diff does 2 notable things:
1. It adds a reload API for reloading configs in place, versus creating a new
one. This will be used in localrepo.__init__ to construct a new config for the
repo while still maintaining the old pinned values from the copied ui.
2. It threads a repo path and readonly config list from Python down to the Rust
code. This allows load_dynamic and load_repo to operate on the repo path, and
allows the readonly filter to applied to all configs during reloading.
Reviewed By: quark-zju
Differential Revision: D22712623
fbshipit-source-id: a0f372f4971c5feac2f20e89a0fb3fe6d4a65d6f
Summary:
As part of moving all hg config loading and generation logic into Rust,
let's move the config generation logic from hgcommands and pyconfigparser to
configparser, unifying them at the same time.
Future diffs will move config loading in as well.
Reviewed By: quark-zju
Differential Revision: D22590208
fbshipit-source-id: d1760c404a6a5c57347df30713c20de55cfdb9a4
Summary:
A future diff will unify all config loading into configparser::hg, but
to do so we need dynamicconfig to live in configparser, so it can load
dynamicconfigs. Let's move everything in.
Reviewed By: quark-zju
Differential Revision: D22587237
fbshipit-source-id: 5613094175b6e1597aa113ee3e6d92ce7ec79f6d
Summary:
We had two spots that loaded system and user configs, one in the
pyconfigparser layer, and one in the pure Rust config layer. In an upcoming diff
I'd like to move dynamicconfig loading down into the pure Rust layer, so let's
unify these.
Reviewed By: quark-zju
Differential Revision: D22585554
fbshipit-source-id: 0cea7801ae1d5a3a3c12b80ee23b37f9e690e2bc
Summary:
In a future diff we'll increase the size of the rotatelog temporarily
during clones. To do so we need it to be configurable.
Reviewed By: quark-zju
Differential Revision: D23089539
fbshipit-source-id: ebfc3beaf3c0fe5b01b87d97c19455b0a24afa72
Summary:
In a future diff we'll increase the size of the rotatelog temporarily
during clones. To do so we need it to be configurable.
Reviewed By: quark-zju
Differential Revision: D23089541
fbshipit-source-id: 5010e417a83a2611283322f1dbb7023f4286f503
Summary: If parent_revs gets an out-of-bound rev, it should fail.
Reviewed By: sfilipco
Differential Revision: D23036071
fbshipit-source-id: 7fae0fd5adf07ac3c933a29d7d06289d8d740c60
Summary:
The primary change is in `eden/scm/lib/edenapi/types`:
* Split `DataEntry` into `FileEntry` and `TreeEntry`.
* Split `DataError` into `FileError` and `TreeError`. Remove `Redacted` error variant from `TreeError` and `MaybeHybridManifest` error variant from `FileError`.
* Split `DataRequest`, `DataResponse` into appropriate File and Tree types.
* Refactor `data.rs` into `file.rs` and `tree.rs`.
* Lift `InvalidHgId` error, used by both File and Tree, into `lib.rs`.
* Bugfix: change `MaybeHybridManifest` to be returned only for hash mismatches with empty paths, to match documented behavior.
Most of the remaining changes are straightforward fallout of this split. Notable changes include:
* `eden/scm/lib/edenapi/tools/read_res`: I've split the "data" commands into "file" and "tree", but I've left the identical arguments sharing the same argument structs. These can be refactored later if / when they diverge.
* `eden/scm/lib/types/src/hgid.rs`: Moved `compute_hgid` from `eden/scm/lib/edenapi/types/src/data.rs` to a new `from_content` constructor on the `HgId` struct.
* `eden/scm/lib/revisionstore/src/datastore.rs`: Split `add_entry` method on `HgIdMutableDeltaStore` trait into `add_file` and `add_tree` methods.
* `eden/scm/lib/revisionstore/src/edenapi`
* `mod.rs`: Split `prefetch` method on `EdenApiStoreKind` into `prefetch_files` and `prefetch_trees`, which are given a default implementation that fails with `unimplemented!`.
* `data.rs`: Replace blanket trait implementations for `EdenApiDataStore<T>` with specific implementations for `EdenApiDataStore<File>` and `EdenApiDataStore<Tree>` which call the appropriate fetch and add functions.
* `data.rs` `test_get_*`: Replace dummy hashes with real hashes. These tests were only passing due to the hash mismatches (incorrectly) being considered `MaybeHybridManifest` errors, and allowed to pass.
Reviewed By: kulshrax
Differential Revision: D22958373
fbshipit-source-id: 788baaad4d9be20686d527f819a7342678740bc3
Summary: Begin adding some initial type annotations for the Rust Python bindings.
Reviewed By: quark-zju
Differential Revision: D22993222
fbshipit-source-id: 2073db93b22f6bb04e30b767594d435c36ddb17f
Summary:
Introduce taggederror-util, which provides a new trait `AnyhowEdenExt`, which provides a method `eden_metadata` for anyhow errors and results. This method works much like `AnyhowExt::common_metadata`, but additionally supports extracting default error metadata from known `Tagged` types which are listed explicitly in the method implementation.
Extend `FilteredAnyhow` to support a configuration "metadata function", which allows swapping out `eden_metadata` for the standard `common_metadata`.
Modify Rust dispatch and Python bindings to use `AnyhowEdenExt` for metadata extraction and printing.
Modify `intentional_error` to rely on `AnyhowEdenExt` for tagging (removes `.tagged` call, no tags will be visible if `AnyhowEdenExt` is not used).
Reviewed By: DurhamG
Differential Revision: D22927203
fbshipit-source-id: 04b36fdfaa24af591118acb9e418d1ed7ae33f91
Summary:
This is more complex than previous libraries, mainly because `dag` defines APIs
(traits) used by other code, which might raise error types that `dag` itself is
not interested in. `BackendError::Other(anyhow::Error)` is currently used to
capture types that do not fit in `dag`'s predefined error types.
Reviewed By: sfilipco
Differential Revision: D22883865
fbshipit-source-id: 3699e14775f335620eec28faa9a05c3cc750e1d1
Summary:
Prefix some `Result` with `dag::Result`. Since `dag::Result` is just
`anyhow::Result` for now, this does not change anything but makes
it more compatible with upcoming changes.
Reviewed By: sfilipco
Differential Revision: D22883864
fbshipit-source-id: 95a26897ed026f1bb8000b7caddeb461dcaad0e7
Summary:
All dependencies of revlogindex have migrated to concreted error types.
Let's migrate revlogindex itself. This allows compile-time type checks
and makes the error returned by revlogindex APIs more predictable.
Reviewed By: sfilipco
Differential Revision: D22857554
fbshipit-source-id: 7d32599508ad682c6e9c827d4599e6ed0769899c
Summary: So reachableroots can be called from Python.
Reviewed By: sfilipco
Differential Revision: D22657186
fbshipit-source-id: 36b1b5ed1e32c88bb07e6c7c7e0a7ca89e0751a3
Summary:
This replaces the RustError that might happen during `addcommits`, and allows us to
handle it without having a stacktrace.
Reviewed By: DurhamG
Differential Revision: D22539564
fbshipit-source-id: 356814b9baf0b31528dfc92d62b0dcf352bc1e24
Summary: It's the same as `__add__`. It's consistent with the revset language.
Reviewed By: sfilipco
Differential Revision: D22638456
fbshipit-source-id: 928177d553220461192650f4792ac39cadd57dc2
Summary:
The hint indicates a set `X` is equivalent to `ancestors(X)`.
This allows us to make `heads` use `heads_ancestors` (which is faster in
segmented changelog) automatically without affecting correctness. It also
makes special queries like `ancestors(all())` super cheap because it'll just
return `all()` as-is.
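The fast path enabled by the hint can be sketched as follows. `NameSetSketch` and this traversal are illustrative; the real `dag` crate implements this in Rust:

```python
class NameSetSketch:
    """Illustrative set of revs carrying the 'X == ancestors(X)' hint."""
    def __init__(self, revs, is_ancestors=False):
        self.revs = frozenset(revs)
        self.is_ancestors = is_ancestors  # the hint

def ancestors(parents, s):
    if s.is_ancestors:
        return s  # e.g. ancestors(all()) returns all() as-is, no traversal
    closure, stack = set(s.revs), list(s.revs)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in closure:
                closure.add(p)
                stack.append(p)
    # The result is ancestor-closed by construction, so set the hint.
    return NameSetSketch(closure, is_ancestors=True)

parents = {2: [1], 1: [0], 0: []}
all_set = NameSetSketch({0, 1, 2}, is_ancestors=True)
print(ancestors(parents, all_set) is all_set)  # True: returned unchanged
print(sorted(ancestors(parents, NameSetSketch({2})).revs))  # [0, 1, 2]
```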
Reviewed By: sfilipco
Differential Revision: D22638463
fbshipit-source-id: 44d9bbcbb0d7e2975a0c8322181c88daa1ba4e37
Summary:
Replace the Python spanset with the Rust-backed idset.
The idset can represent multiple ranges and works better with Rust code.
The `idset` fast paths do not preserve order for the `or` operation, as
demonstrated in the test changes.
Reviewed By: DurhamG, kulshrax
Differential Revision: D22519584
fbshipit-source-id: 5d976a937e372a87e7f087d862e4b56d673f81d6
Summary: Similarly to the changes made for `get`, the same can be applied to prefetch.
Reviewed By: DurhamG
Differential Revision: D22565609
fbshipit-source-id: 0fbc1a0086fa44593a6aaffb746ed36b3261040c
Summary: Make `store` the first argument for all of the EdenAPI Python methods. I've found this arrangement to be more ergonomic when working with the client later in the stack.
Reviewed By: quark-zju
Differential Revision: D22703915
fbshipit-source-id: b0ca900d969ec86ee91e8c62d281c2102860e9ef
Summary:
When using LFS, it's possible that a pointer may be present in the local
LfsStore, but the blob would only be in the shared one. Such scenario can
happen after an upload, when the blob is moved to the shared store for
instance. In this case, during a `get` call, the local LFS store won't be able
to find the blob and thus would return Ok(None); the shared LFS store would not
be able to find the pointer itself and would thus return Ok(None) too. If the
server is not aware of the file node itself, the `ContentStore::get` would also
return Ok(None), even though all the information is present locally.
The main reason why this is happening is due to the `get` call operating
primarily on file node based keys, and for content-based stores (like LFS),
this means that the translation layer needs to be present in the same store,
which in some case may not be the case. By allowing stores to return a
`StoreKey` when progress was made in finding the key we can effectively solve
the problem described above, the local store would translate the file node key
onto a content key, and the shared store would read the blob properly.
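The key-translation idea can be sketched in Python: a store that holds only the pointer answers with a more precise content key instead of "not found", and the next store resolves it. Names and shapes here are hypothetical:

```python
def union_get(stores, key):
    """Try each store in order; a store may answer with the blob, or refine
    the key (e.g. filenode -> sha256) for the remaining stores to use."""
    for store in stores:
        blob, key = store(key)
        if blob is not None:
            return blob
    return None

def local_lfs(key):
    # Holds the pointer (filenode -> sha256 content key) but not the blob.
    pointers = {"filenode1": "sha256:abc"}
    return None, pointers.get(key, key)  # translate the key when possible

def shared_lfs(key):
    # Holds blobs addressed by content key only.
    blobs = {"sha256:abc": b"large file contents"}
    return blobs.get(key), key

print(union_get([local_lfs, shared_lfs], "filenode1"))  # b'large file contents'
```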
Reviewed By: DurhamG
Differential Revision: D22565607
fbshipit-source-id: 94dd74a462526778f7a7e232a97b21211f95239f
Summary: This change introduces a bail macro that allows tagging errors using the syntax `bail!(fault=Fault::Request, "my normal {}", bail_args)` or `bail!(Fault::Request, "my normal {}", bail_args)`.
Reviewed By: DurhamG
Differential Revision: D22646428
fbshipit-source-id: a6ec2940001b26db8ddc3a6d3620a1e17406c867
Summary:
The spanset has the assumption that `0..len(repo)` are valid revs.
That's not true with segmented changelog. So reduce the dependency on the
assumption.
Reviewed By: kulshrax
Differential Revision: D22519586
fbshipit-source-id: a493d26d6d69a36966f4a037f87a03593b697cbd
Summary:
It turns out the Python world needs the integer range API in many places.
Deprecating them is non-trivial. Therefore expose the API.
Reviewed By: DurhamG
Differential Revision: D22402201
fbshipit-source-id: de31d15c18e5f4e0f8826f71315b98ad58b1764e
Summary:
About 64 tests depend on the revlog `strip` behavior. `strip` is not used in
production client-repos. I tried to migrate them off `strip` but that seems
too much work for now. Instead let's just implement `strip` in the HgCommits
layer to be compatible to run the tests.
Reviewed By: DurhamG
Differential Revision: D22402195
fbshipit-source-id: f68d005e04690d8765d5268c698b6c96b981eb0a
Summary:
I dropped the special case of wdir handling. With the hope that we will handle
the virtual commits differently eventually (ex. drop special cases, insert real
commits to Rust DAG but do not flush them to disk, support multiple wdir
virtual commits, null is no longer an ancestor of every commit).
`test-listkeyspatterns.t` is changed because `0` no longer resolves to `null`.
Reviewed By: DurhamG
Differential Revision: D22368836
fbshipit-source-id: 14b9914506ef59bb69363b602d646ec89ce0d89a
Summary: Make the Python EdenAPI client's `health()` method return a dict of server metadata.
Reviewed By: DurhamG
Differential Revision: D22604932
fbshipit-source-id: 51ca60cc95a8dbd15635520b2a9bd72603643cb6
Summary:
Implements based Rust-Python binding layer for error metadata propagation.
We introduce a new type, `TaggedExceptionData`, which carries CommonMetadata and the original (without metadata) error message for a Rust Anyhow error. This class is passed to RustError and can be accessed in Python (somewhat awkwardly) via indexing:
```
except error.RustError as e:
fault = e.args[0].fault()
typename = e.args[0].typename()
message = e.args[0].message()
```
As far as I can tell, due to limitations in cpython-rs, this can't be made more ergonomic without introducing a Python shim around the Rust binding layer, which could adapt the cpython-rs classes to use whatever API we'd like.
Currently, anyhow errors that are not otherwise special-cased will be converted into RustError, with both the original error message and any attached metadata printed as shown below
```
abort: intentional error for debugging with message 'intentional_error'
error has type name taggederror::IntentionalError and fault None
```
We can of course re-raise the error if desired to maintain the previous behavior for handling a RustError.
If we'd like other, specialized Rust Python Exception types to carry metadata (such as `IndexedLogError`), we'll need to modify them to accept a `TaggedExceptionData` like `RustError`.
Renamed the "cause an error in pure rust command" function to `debugcauserusterror`, and instead used the name `debugthrowrustexception` for a command which causes an error in rust which is converted to a Python exception across the binding layer.
Introduced a simple integration test which exercises `debugthrowrustexception`.
Added a basic handler for RustError to scmutil.py
Reviewed By: DurhamG
Differential Revision: D22517796
fbshipit-source-id: 0409489243fe739a26958aad48f608890eb93aa0
Summary: Move the `tokio::Runtime` into `EdenApiRemoteStore` so that if initialization fails, we can propagate the error instead of panicking.
Reviewed By: xavierd
Differential Revision: D22564210
fbshipit-source-id: 9db1be99f2f77c6bb0f6e9dc445d624dc5990afe
Summary: Instead of restricting the allowed characters in a repo name, allow any UTF-8 string. The string will be percent-encoded before being used in URLs.
Reviewed By: quark-zju
Differential Revision: D22559830
fbshipit-source-id: f9caa51d263e06d424531e0947766f4fd37b035f
Summary: Adds support for sharding based on user name.
Reviewed By: quark-zju
Differential Revision: D22537540
fbshipit-source-id: 962f9582c8947335dc9d9d29c500d8c09df69878
Summary:
Previously you could only canary locally on a devserver by setting an
environment variable. Let's add a --canary flag to debugdynamicconfig that
accepts a host. Hg will ssh to that host and run the configerator cli to grab
the canaried config from that host.
Reviewed By: quark-zju
Differential Revision: D22535509
fbshipit-source-id: af1c21d8402c4e729769e50388d913bf52b66b89
Summary: Add an optional `edenapi` argument to metadatastore that allows using EdenAPI in place of the SSH remote store.
Reviewed By: quark-zju
Differential Revision: D22492535
fbshipit-source-id: eba034c9ba86c79c9a9dee6bab3ff615d0575b6f
Summary: Reimplement `EdenApiHgIdRemoteStore` as `EdenApiRemoteStore<T>`, where `T` is a marker type indicating whether this store fetches files or trees. This allows working with the stores in a more strongly-typed way, and avoid having to check what kind of store this is at runtime when fetching data.
Reviewed By: quark-zju
Differential Revision: D22492160
fbshipit-source-id: e17556093fa9b81d2301f281da36d75a03e33c5e
Summary:
Previously we would audit all configs and report them if the
dynamicconfig did not match the rc-file config. Now that dynamicconfigs are
widely deployed, let's switch this around to auditing only configs we know have
had issues. This will let us start adding new configs via dynamicconfigs instead
of via the legacy staticfiles and chef, before we've finished migrating all the
legacy configs over.
Reviewed By: quark-zju
Differential Revision: D22401865
fbshipit-source-id: 5c41c674d39c8113b2a40da61e020e8a33c39312
Summary:
We're seeing cases where cloning can take tens of GB of memory because
we keep all the history information in memory. Let's flush the history info
every 10 million adds to bound the memory usage.
10 million was chosen somewhat arbitrarily, but it results in pack files that
are 800MB, which corresponds roughly with 8GB of memory usage.
This requires updating repack to be aware that a single flush could produce
multiple packs. Note, since repack writes via this same path, it may result in
repack producing multiple pack files. In the degenerate case repack could
produce the same number (or more) of pack files than were input. If we set the
threshold high enough I think we'll be fine though. 800MB is probably
sufficient.
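The flush-every-N-adds behavior, and why one logical flush can now yield multiple packs, can be sketched as follows (class and interval are illustrative; the real interval is 10 million adds):

```python
FLUSH_INTERVAL = 3  # illustrative; the real store flushes every 10 million adds

class HistoryStoreSketch:
    """Illustrative history store that flushes pending entries in batches."""
    def __init__(self):
        self.pending = []
        self.packs = []  # each flush may emit a separate pack file

    def add(self, entry):
        self.pending.append(entry)
        if len(self.pending) >= FLUSH_INTERVAL:
            self.flush()  # bound memory by flushing mid-operation

    def flush(self):
        if self.pending:
            self.packs.append(list(self.pending))  # stand-in for a pack file
            self.pending.clear()

store = HistoryStoreSketch()
for i in range(7):
    store.add(i)
store.flush()
# A single logical "flush" of 7 adds produced 3 packs (3 + 3 + 1), which is
# why repack must tolerate one flush yielding multiple pack files.
print([len(p) for p in store.packs])  # [3, 3, 1]
```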
Reviewed By: xavierd
Differential Revision: D22438569
fbshipit-source-id: 425d5d3b7999b81e44d1dbe1f2a4ea453ab6ca4f
Summary: Per comments on D22429347, add a new `ExtractInnerRef` trait that is similar to `ExtractInner`, but returns a reference to the underlying value. A default implementation is provided for types whose inner value is `Clone + 'static`, so in practice most types will only need to implement `ExtractInnerRef`, whereas the callsite may choose whether it needs a reference or an owned value.
Reviewed By: quark-zju
Differential Revision: D22464158
fbshipit-source-id: 7b97329aedcddb0e51fd242b519e79eba2eed350
Summary: Add an `edenapistore` class that wraps an `EdenApiHgIdRemoteStore`. This class is purely used as a means to set up the stores from Python code, and is only used as a way to get an `Arc<EdenApiHgIdRemoteStore>` to the Rust content store. It has no functionality of its own.
Reviewed By: quark-zju
Differential Revision: D22449702
fbshipit-source-id: ad2094c79da523071b6ed8344c8dde706e448c95
Summary: This is effectively a complete rewrite of the EdenAPI Python bindings to use the new client.
Reviewed By: quark-zju
Differential Revision: D22442903
fbshipit-source-id: c3cf2b2b8291e24d6d4d3a3546ccc69472510567
Summary:
A common pattern in Mercurial's data storage layer Python bindings is to have a Python object that wraps a Rust object. These Python objects are often passed across the FFI boundary to Rust code, which then may need to access the underlying Rust value.
Previously, the objects that used this pattern did so in an ad-hoc manner, typically by providing an `into_inner` or `to_inner` inherent method. This diff introduces a new `ExtractInner` trait that standardizes this pattern into a single interface, which in turn allows this pattern to be used with generics.
Reviewed By: quark-zju
Differential Revision: D22429347
fbshipit-source-id: cab4c24b8b98c6ef8307f72a9b4726aabdc829cc
Summary: Update the EdenAPI Python bindings to use the new client. This is mostly just a stopgap measure to allow us to delete the old client code; nothing in production actually uses these bindings anymore, and the new client will primarily be used from Rust.
Reviewed By: quark-zju
Differential Revision: D22379476
fbshipit-source-id: 953e0ffc2ce682869ee234d672a154046b373c1e
Summary:
We've seen a handful of users complaining about clone failing and not being
able to recover from it. From looking at the various reports and the
stacktraces, I believe this is caused by a flaky connection on the user end
that causes the Python code to retry the getpack calls. Before retrying, the
code will figure out what still needs fetching and this is done via the
getmissing API. When LFS pointers were fetched, the LFS blobs aren't yet
present on disk, and thus the underlying ContentStore::get_missing will a set
of keys that contain some StoreKey::Content keys. The code would previously
fail at this point, but since the key also contains the original key, we can
simply return this, the pointers might be refetched but these are fairly small.
Taking a step back from this bug, the issue really is that the retry logic is
done in code that cannot understand content-keys, and moving it to a part of
the code that understands this would also resolve the issue.
I went with the simple approach for now, but since other remote stores
(EdenAPI, the LFS one, etc) would also benefit from the retry logic, we may
want to move the logic into Rust and remove the getmissing API from the Python
exposed ContentStore.
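The fix to the retry path can be sketched as follows. The dict shapes and helper names are hypothetical stand-ins for the real StoreKey variants:

```python
def getmissing(keys, have_pointers, have_blobs):
    """Report what still needs fetching, preserving the original key even
    when only the content (sha256) half is missing."""
    missing = []
    for key in keys:
        if key in have_blobs:
            continue
        if key in have_pointers:
            # Blob absent but pointer present: report a content key that
            # still carries the original filenode key.
            missing.append({"kind": "content", "original": key})
        else:
            missing.append({"kind": "hgid", "original": key})
    return missing

def keys_to_refetch(missing):
    # Before the fix, the Python retry logic failed on "content" keys; now
    # we fall back to the original key (the pointer may be refetched, but
    # pointers are small).
    return [m["original"] for m in missing]

missing = getmissing(["f1", "f2"], have_pointers={"f1"}, have_blobs=set())
print(keys_to_refetch(missing))  # ['f1', 'f2']
```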
Reviewed By: DurhamG
Differential Revision: D22425600
fbshipit-source-id: 69c2898cc302d2170cd0f206c89189c341db5278
Summary:
The Mercurial's concept of `null` revision (hardcoded as 20 zeros) is a
headache to special case. See https://www.mercurial-scm.org/wiki/RevsetVirtualRevisionPlan.
The Rust DAG layer cannot handle it. Make pydag drop the nullid or nullrev when
crossing the Python -> Rust boundary.
A cleaner way to handle `null` might be:
- Create a new vertex in the DAG in memory that has empty content.
Calculate its commit hash normally. The commit is isolated from other parts
of the commit graph. It has no parents and no children.
The vertex has an assigned Id, which is not zero if the repo is not empty.
- Assign the `null` special name (like how we do for `tip`) to the commit.
- Remove all hard-code special cases of the 20-zero `nullid`.
That would allow things like `hg up null`, `hg diff -r null -r X` to continue
to work without special-casing it in the commit graph layer.
Reviewed By: sfilipco
Differential Revision: D22240188
fbshipit-source-id: 707af47cbf36a7df60097a17d69094aae89d3250
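A minimal sketch of dropping `null` at the Python -> Rust boundary; the constants mirror Mercurial's conventions (20 zero bytes, rev -1), but the helper names are hypothetical:

```python
# Sketch of filtering out the hardcoded null revision before vertices
# and revs cross into the Rust DAG layer. Helper names are illustrative.
NULLID = b"\0" * 20   # Mercurial's hardcoded null node: 20 zero bytes
NULLREV = -1          # Mercurial's null revision number

def filter_nodes(nodes):
    """Drop nullid before handing nodes to the Rust DAG layer."""
    return [n for n in nodes if n != NULLID]

def filter_revs(revs):
    """Drop nullrev before handing revs to the Rust DAG layer."""
    return [r for r in revs if r != NULLREV]
```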
Summary:
Change pydag from using concrete `namedag` and `memnamedag` to trait objects:
- `commits`: High-level read-write commits storage, supports Rust `HgCommits`
(segmented changelog), `MemHgCommits`, and `RevlogCommits`.
- `dagalgo`: maps to the `DagAlgorithm` Rust trait.
- `idmap`: maps to the `IdConvert + PrefixLookup` Rust traits.
The idea is that we move the revlog / segmented changelog difference from Python
to behind Rust trait objects so the Python code looks overall cleaner, the Rust
revset alternative gets exercised early, and switching from revlog to segmented
changelog becomes easier.
Reviewed By: sfilipco
Differential Revision: D21796242
fbshipit-source-id: 3a4a3ff3d9e7e46059d1ed3461a55003c352e82d
Summary:
Similar to D7121487 (af8ecd5f80) but works for mutation store. This makes sure at the Rust
layer, mutation entries won't get lost after rebasing or metaeditting a set of
commits where a subset of the commits being edited has mutation relations.
Unlike the Python layer, the Rust layer works for mutation chains. Therefore
some of the tests change.
Reviewed By: markbt
Differential Revision: D22174991
fbshipit-source-id: d62f7c1071fc71f939ec8771ac5968b992aa253c
Summary: Move the old EdenAPI crate to `scm/lib/edenapi/old` to make room for the new crate. This old code will eventually be deleted once all references to it are removed from the codebase.
Reviewed By: quark-zju
Differential Revision: D22305173
fbshipit-source-id: 45d211340900192d0488543ba13d9bf84909ce53
Summary:
D21626209 (38d6c6a819) changed revlogindex to read `00changelog.i` on its own instead of
taking the data from Python. That turns out to be racy. The `00changelog.i`
might be changed between the Rust and Python reads and that caused issues.
This diff makes Python re-use the indexdata read by Rust so they are guaranteed
the same.
Reviewed By: DurhamG
Differential Revision: D22303305
fbshipit-source-id: 823bf3aefc970a4a6ce8ab58bccf972a78f6de70
Summary:
This will be used by the next change.
The reason we use a `buffer` or `memoryview` instead of Python `bytes` is to expose
the buffer in a zero-copy way. That is important for startup performance.
Reviewed By: DurhamG
Differential Revision: D22303306
fbshipit-source-id: 3f7c8dff3575b998e025cd5940faa0c183b11626
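The zero-copy property being relied on can be demonstrated with plain Python: slicing a `memoryview` aliases the underlying buffer instead of copying it, which is what matters when exposing large index data at startup:

```python
# A memoryview slice aliases the underlying buffer; bytes() makes a copy.
data = bytearray(b"x" * 1024)
view = memoryview(data)
slice_view = view[100:200]        # no copy: still backed by `data`
data[100] = ord("y")
assert slice_view[0] == ord("y")  # the slice observes the mutation
as_bytes = bytes(view[100:200])   # an explicit copy, when one is needed
```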
Summary:
Fetches configs from a remote endpoint and caches them locally. If the
remote endpoint fails to respond, we use the cached version.
Reviewed By: quark-zju
Differential Revision: D22010684
fbshipit-source-id: bd6d4349d185d7450a3d18f9db2709967edc2971
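The fallback behavior can be sketched like this; `fetch_remote` and the JSON cache format are assumptions for illustration, not the actual configparser implementation:

```python
# Sketch of fetch-with-cached-fallback: try the remote endpoint, refresh
# the local cache on success, and fall back to the cache on failure.
import json

def load_configs(fetch_remote, cache_path):
    """fetch_remote: callable returning a config dict (hypothetical).
    cache_path: where the last good copy is stored."""
    try:
        configs = fetch_remote()
    except OSError:
        # Remote endpoint failed to respond: use the cached version.
        with open(cache_path) as f:
            return json.load(f)
    with open(cache_path, "w") as f:
        json.dump(configs, f)
    return configs
```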
Summary:
Whenever data is redacted on the server, a specific tombstone is returned when
fetching it. Make sure that whenever we update the file on disk, we write a
nice message to the user instead of the tombstone itself.
While this code could have been moved into the Rust store code itself, I prefer
to leave it to its users to decide what to do with redacted data. EdenFS for
instance may want to prevent access to it instead of showing the redacted
message.
Reviewed By: kulshrax
Differential Revision: D21999345
fbshipit-source-id: 39a83cdf5ea4567628a13fbd59520b9677aba749
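A sketch of the rewrite step, assuming a hypothetical tombstone constant (the real tombstone value and user-facing message are defined elsewhere):

```python
# Rewrite the server's redaction tombstone into a friendlier message
# before the content is written to disk. Both constants are stand-ins.
TOMBSTONE = b"<<redacted-tombstone>>"          # illustrative value
REDACTED_MESSAGE = b"This file content has been redacted.\n"

def materialize(content):
    """Return what should be written to disk for this file content."""
    if content == TOMBSTONE:
        return REDACTED_MESSAGE
    return content
```

Keeping this substitution in the store's consumers (rather than the store itself) is what lets EdenFS make a different choice, such as denying access entirely.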
Summary: The tracing APIs and error context APIs can achieve similar effects.
Reviewed By: xavierd
Differential Revision: D22129585
fbshipit-source-id: 0626e3f4c1a552c69c046ff06ac36f5e98a6c3d8
Summary:
In the case where an expensive network operation is involved, holding the GIL
would mean that Mercurial cannot display progress bars, or simply cannot be
interrupted. This is a less than ideal user experience.
To fix this, let's release the GIL whenever we enter into the revisionstore
code.
Reviewed By: DurhamG
Differential Revision: D22140399
fbshipit-source-id: 131c8cf81e39128810e0f20d1922b5681a33d95a
Summary: Replace usages of whitelist/blacklist with include/exclude/filter/allow. These terms are more descriptive and less likely to contribute to racial stereotyping. More context: https://fb.workplace.com/groups/sourcecontrolteam/permalink/2926049127516414/
Reviewed By: kulshrax
Differential Revision: D22039298
fbshipit-source-id: 255c7389ee5ce5e54bbccdfb05ffa4cafc6958e5
Summary:
The anyhow error contains the context for the error which will lead to better
error message for the user (and us). What the code previously did was simply
using the Debug trait to print the error, and thus was missing context.
Reviewed By: DurhamG
Differential Revision: D21985745
fbshipit-source-id: 31c603d7f42e79a360541f39e4aaf0fcfbb9a14f
Summary:
With the internal streampager, progress bars must be sent on a separate stream so that
streampager can render them correctly.
Reviewed By: quark-zju
Differential Revision: D21906173
fbshipit-source-id: eb41b0bf22807d9cae518b3f676996ab1c642c6e
Summary:
Mostly copy-paste from code added in D19503373 and D19511574. Adjusted to match
the revlog index interface.
Reviewed By: sfilipco
Differential Revision: D21626201
fbshipit-source-id: 05d160e4c03d7e2482b6a4f2d68c3688ad78f568
Summary:
The NameSet is not really about Dag. It is about using Id and is static.
Rename it to clarify. In an upcoming change we'll have IdLazySet.
Reviewed By: sfilipco
Differential Revision: D21626204
fbshipit-source-id: 84f25008f7032f6e26a26fc656ccbcd2a5880ecf
Summary:
Previously, the NameSet had properties like "is_all", "is_topo_sorted", etc.
To make lazy sets efficient, it's important to have hints about min / max Ids
and maybe some other information.
Add a dedicated Hints structure for that.
Reviewed By: sfilipco
Differential Revision: D21626219
fbshipit-source-id: 845e88d3333f0f48f60f2739adae3dccc4a2dfc4
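A toy version of such a Hints structure might look like this; the field names are guesses for illustration, not the crate's actual fields:

```python
# Per-set metadata that lazy sets can consult to short-circuit work:
# unknown fields default to "no information", never to a wrong answer.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Hints:
    min_id: Optional[int] = None   # lower bound on member Ids, if known
    max_id: Optional[int] = None   # upper bound on member Ids, if known
    is_topo_sorted: bool = False
    is_all: bool = False

    def contains_id_maybe(self, id):
        """False means definitely absent; True means possibly present."""
        if self.min_id is not None and id < self.min_id:
            return False
        if self.max_id is not None and id > self.max_id:
            return False
        return True
```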
Summary:
Implements part of the dag IdMap related traits.
It does not get used yet, but eventually I'd like `pydag` to be able to work
with an abstracted dag including RevlogIndex.
Reviewed By: sfilipco
Differential Revision: D21626210
fbshipit-source-id: 53f19622f03fd71b76073dccf8dcc9b4778b40ca
Summary:
This will allow RevLogIndex to answer node -> rev and hex lookup queries.
Also change RevlogIndex::new to take file names so it can write back the
nodemap index when the index is lagging. That part of logic currently exists in
pyindexes + clindex.pyx, which are going to be replaced by revlogindex.
Practically, this will generate a `00changelog.nodemap` file in svfs, which is
temporarily unused, but will be used once clindex.pyx gets replaced.
Reviewed By: sfilipco
Differential Revision: D21626209
fbshipit-source-id: 297d9eff26a73c26558708f7a2290d4d8ba1e777
Summary:
Change `NodeRevMap`'s changelog type from `[u8]` to `[RevlogEntry]`.
This makes it consistent with `RevlogIndex`.
Reviewed By: sfilipco
Differential Revision: D21626203
fbshipit-source-id: 7457f48ccd7b3489264684a5db21d21e9eb7a937
Summary:
NodeRevMap helps convert a commit hash to a rev number. It's similar to
IdMap in the dag crate, but was designed for the revlog.
Move NodeRevMap to revlogindex so it becomes easier to implement the IdConvert
trait required by the dag crate.
Reviewed By: sfilipco
Differential Revision: D21626211
fbshipit-source-id: 14996f1234231b507efb5186ec30f84df5aaad10
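A dict-backed toy model of what NodeRevMap provides (hash <-> rev lookup plus hex-prefix resolution); the real structure is revlog-backed, so this shape is illustrative only:

```python
# Toy NodeRevMap: bidirectional commit-hash <-> rev-number lookup,
# with unique-prefix resolution layered on top.
class NodeRevMap:
    def __init__(self):
        self._node_to_rev = {}
        self._rev_to_node = []

    def add(self, node):
        rev = len(self._rev_to_node)
        self._rev_to_node.append(node)
        self._node_to_rev[node] = rev
        return rev

    def rev(self, node):
        return self._node_to_rev[node]

    def lookup_prefix(self, hex_prefix):
        """Return the unique node matching a hex prefix, or raise."""
        matches = [n for n in self._rev_to_node if n.startswith(hex_prefix)]
        if len(matches) != 1:
            raise LookupError("ambiguous or unknown prefix: %s" % hex_prefix)
        return matches[0]
```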
Summary:
The idea is that the pure Rust revlogindex crate can implement the DagAlgorithm
interface so we will have a consistent interface in the code base that works
for both the existing storage (revlog) and the new segmented changelog.
The other way to do this is to implement the `bindings.dag.namedag` interface
in pure Python for the revlog-based DAG, or supporting quite different
interfaces (ex. revlog DAG and the Rust segmented changelog DAG) in the code
base. At present, I think implementing the Rust DAG traits for revlog is the
most appealing, partially because we already have some key algorithms
implemented (ex. prefix lookup, common ancestors, etc).
Reviewed By: sfilipco
Differential Revision: D21626197
fbshipit-source-id: 733b1af1bcd5fc0784764fc7103412988894d43b
Summary:
Previously the return type was String, which, in Python 2, could turn into bytes or
unicode depending on the contents of the string. We always want bytes in Python
2, so let's use the Str type instead.
Reviewed By: quark-zju
Differential Revision: D21794189
fbshipit-source-id: 6493fbacab354a78476f522fc3c41b7336dbbdb1
Summary: This allows other kinds of DAG to implement the operations.
Reviewed By: sfilipco
Differential Revision: D21626220
fbshipit-source-id: 896c5ccebb1672324d346dfca6bcac9b4d3b4929
Summary: This makes things a bit more flexible.
Reviewed By: sfilipco
Differential Revision: D21626194
fbshipit-source-id: f3ad486bcd5a6478d9e00f674d48f99504cded8c
Summary: This makes it possible for other types to implement the hex prefix lookup.
Reviewed By: sfilipco
Differential Revision: D21626218
fbshipit-source-id: 96e8b8c37e5aae2bd60658a238333b61902936d1
Summary:
Types like IdDag are not really used. The use of the word "name" is sometimes
confusing in other contexts. Therefore export shorter names like Dag, MemDag,
and Vertex, avoiding "name" in NameDag, MemNameDag and NameSet. This makes external
code shorter and less ambiguous.
Reviewed By: sfilipco
Differential Revision: D21626212
fbshipit-source-id: 5bcf3cecfd38277149b41bf3ba9e6d4ef2a07b2b
Summary:
This decouples DagAlgorithm from the IdMap + IdDag backend, making it possible
to support other kinds of backends of DagAlgorithm (ex. a revlog backend).
Reviewed By: sfilipco
Differential Revision: D21626200
fbshipit-source-id: f53cc271a200062e9c02f739b6453e1d7de84e6d
Summary:
When we got rid of the delta logic, we also needed to get rid of some
unused functions.
Reviewed By: singhsrb
Differential Revision: D21725043
fbshipit-source-id: ac069e6b0468e2275f353a9970b8971b5a2cfa23
Summary:
If we release a new version of Mercurial, we want to ensure that its
builtin configs are used immediately. To do so, let's write a version number
into the generated config file, and if the version number doesn't match, we
force a synchronous regeneration of the config file.
For now, if regeneration fails, we just log it. In the future we'll probably
throw an exception and block the user since we want to ensure people are running
with modern configuration.
Reviewed By: quark-zju
Differential Revision: D21651317
fbshipit-source-id: 3edbaf6777f4ca2363d8617fad03c21204b468a2
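The version check can be sketched as follows, assuming a `# version=` stamp line in the generated file (the actual stamp format used in hgrc.dynamic may differ):

```python
# Decide whether the generated config must be regenerated: a missing or
# mismatched version stamp forces a synchronous regeneration.
def needs_regeneration(config_text, current_version):
    """Return True if the generated config was written by a different
    Mercurial version, or carries no version stamp at all."""
    for line in config_text.splitlines():
        if line.startswith("# version="):
            return line.split("=", 1)[1].strip() != current_version
    return True
```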
Summary:
During clone the hgrc.dynamic file doesn't exist and doesn't even have
a place for us to generate it to. Let's instead generate and apply the config in
memory.
In the future, if generate fetches data from the network, this will mean clone
would depend on the network, since if generate fails the clone would fail. In
some situations this is desirable, since users shouldn't be cloning without our
approved configs, but if it causes problems we could probably tweak generate to
support an offline mode.
Reviewed By: quark-zju
Differential Revision: D21643086
fbshipit-source-id: d9a758207738d5983213d95725061517e0aa17db
Summary:
Change pyrenderdag to accept non-revision-number graph vertexes so it can
render a graph even if the graph does not use revision numbers.
The next diff wants this behavior so it can just emit commit hashes to
the renderer without caring about revision numbers. The type is made
so it can still support revision numbers, since the legacy graphlog
interface would still use revision numbers.
Reviewed By: markbt
Differential Revision: D21554671
fbshipit-source-id: 20572683b831f7cecb03957c83f278ff3903eff0
Summary:
The previous code was wrong - it converts the PyObject to iterator every time
(ex. if the PyObject is a set, then it calls `set.__iter__` every time, and
will only get the first element of the set).
For example, it will enter an infinite loop for evaluating this:
    bindings.dag.nameset({'1', '2'})
Fix it by calling `__iter__`, to get the iterator object and use that instead
of the original PyObject.
Reviewed By: markbt
Differential Revision: D21554676
fbshipit-source-id: 0f2adae8f123530cee2d473da37ca1a93a941fde
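The bug is easy to reproduce in plain Python: creating a fresh iterator on every call never advances past the first element, while keeping the iterator object (the fix) does:

```python
# Buggy pattern vs. fixed pattern for iterating a set-like PyObject.
items = {"1", "2"}

# Buggy: a fresh iterator each time -> always the first element again,
# which is how the binding looped forever.
first_a = next(iter(items))
first_b = next(iter(items))
assert first_a == first_b  # never advances past the first element

# Fixed: obtain the iterator object once, then keep advancing it.
it = iter(items)
seen = {next(it), next(it)}
assert seen == items
```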
Summary:
This function reorders commits so the graph looks better.
It will be used to optimize graph rendering for cloud smartlog (and perhaps
smartlog in the future).
Reviewed By: markbt
Differential Revision: D21554675
fbshipit-source-id: d3f0f27c7935c49581cfa6e87d7c32eb5a075f75
Summary: Expose the API that returns a real graph.
Reviewed By: DurhamG
Differential Revision: D21486520
fbshipit-source-id: 4ebdb4011df8971c54930173c4e77503cd2dac47
Summary:
This allows us to replace the pyre2 C++ bindings so the fast regex engine can
work with Python 3, and simplify our build steps.
Reviewed By: DurhamG
Differential Revision: D20973179
fbshipit-source-id: e123ac18954991f2c701526108f5c2ecd2b31a3b
Summary:
Pass `configparser::config::ConfigSet` to `repack` in
`revisionstore/src/repack.rs` so that we can use various config values in `filter_incrementalpacks`.
* `repack.maxdatapacksize`, `repack.maxhistpacksize`
* The overall max pack size
* `repack.sizelimit`
* The size limit for any individual pack
* `repack.maxpacks`
* The maximum number of packs we want to have after repack (overrides sizelimit)
Reviewed By: xavierd
Differential Revision: D21484836
fbshipit-source-id: 0407d50dfd69f23694fb736e729819b7285f480f
Summary:
When Qing implemented all the get methods, the translate_lfs_missing function
didn't exist, and I forgot to add them in the right places when landing the
diff that added it. Fix this.
Reviewed By: sfilipco
Differential Revision: D21418043
fbshipit-source-id: baf67b0fe60ed20aeb2c1acd50a209d04dc91c5e
Summary: Make them reusable in other Python bindings, ex. pymutation.
Reviewed By: sfilipco
Differential Revision: D21486524
fbshipit-source-id: 258455c6a442353c77588fadcb560cb5a170926e
Summary: This makes it easier to visualize a MemNameDag.
Reviewed By: sfilipco
Differential Revision: D21486523
fbshipit-source-id: c65f1fc421bd654dc820faae3c93f2aa57f910d4
Summary:
This will allow clients to operate on MemNameDag.
Unfortunately, it isn't that easy to reuse code in `py_class!`. Since they are
just thin wrappers, I live with the copy-paste for now.
Reviewed By: sfilipco
Differential Revision: D21479015
fbshipit-source-id: ddcc7f5c7ede6bb1e9c73d058779805875b09200
Summary:
This makes it easier to add an "in-memory-only" NameDag with all the algorithms
implemented.
Reviewed By: sfilipco
Differential Revision: D21479020
fbshipit-source-id: c1a73e95f3291c273c800650f70db2a7eb0966d7
Summary:
Remove HgIdDataStore::get_delta and all implementations. Remove HgIdDataStore::get_delta_chain from the trait, remove all unnecessary implementations, and remove all implementations from the public Rust API. Leave the Python API and introduce "delta-wrapping".
MutableDataPack::get_delta_chain must remain in some form, as it is necessary to implement get using a sequence of Deltas. It has been moved to a private inherent impl.
DataPack::get_delta_chain must remain in some form for the same reasons, and in fact both implementations can probably be merged, but it is also used in repack.rs for the free function repack_datapack. There are a few ways to address this without making DataPack::get_delta_chain part of the public API. I've currently chosen to make the method pub(crate), i.e. visible only within the revisionstore crate. Alternatively, we could move the repack_datapack function to a method on DataPack, use a trait in a private module, or some other technique to restrict visibility to only where necessary.
UnionDataStore::get has been modified to call get on its sub-stores and return the first result that matches the given key.
MultiplexDeltaStore has been modified to implement get similarly to UnionDataStore.
Reviewed By: xavierd
Differential Revision: D21356420
fbshipit-source-id: d04e18a0781374a138395d1c21c3687897223d15
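The new UnionDataStore::get behavior can be sketched in a few lines; the class shape here is illustrative, not the revisionstore API:

```python
# First-match union over sub-stores: ask each store in order and return
# the first hit, or None when no sub-store has the key.
class UnionDataStore:
    def __init__(self, stores):
        self._stores = stores

    def get(self, key):
        for store in self._stores:
            value = store.get(key)
            if value is not None:
                return value
        return None
```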
Summary:
This allows us to understand what config is used during a transaction.
For example, is `selectivepull` enabled during a `pull`?
Reviewed By: DurhamG
Differential Revision: D21222146
fbshipit-source-id: a8c82f2b02e9657885947a706f728e28b1bfc1e2
Summary:
Adds Python logic for validating the dynamic configs. Any dynamic
configs that don't match the given list of rc files will be reported and removed.
Reviewed By: quark-zju
Differential Revision: D21310919
fbshipit-source-id: 07f584bba990f1b01347dfbc285e3ca814fe5c5a
Summary:
I wanted to figure out "who added this visible head", "what is the difference
between this metalog root and that root". Those are actually source control
operations (blame, diff). Add a git export feature so we can export metalog
to git to run those queries.
Choosing git here as we don't have native Rust utilities to create a more
efficient hg repo yet.
Ideally we can also make hg operate on a metalog directory as a "metalogrepo"
directly. However that seems to be quite difficult right now due to poor
abstractions.
Reviewed By: DurhamG
Differential Revision: D21213073
fbshipit-source-id: 4cc0331fbad6e1586907c0a66c18bcc25608ea49
Summary:
This makes metalog easier to use in debugshell context. For example, to
investigate the "bookmarks" in the past, the code gets simplified from:
    roots = b.metalog.metalog.listroots(repo.svfs.join('metalog'))
    past_ml = b.metalog.metalog(repo.svfs.join('metalog'), roots[10])
    past_ml.get("bookmarks")
to:
    roots = ml.roots()
    past_ml = ml.checkout(roots[10])
    past_ml.get("bookmarks")
Reviewed By: DurhamG
Differential Revision: D21162568
fbshipit-source-id: 7cc5581afe596a3d2696311a36ac11caa718428a
Summary: This allows the Python world to obtain the root ID for logging purposes.
Reviewed By: DurhamG
Differential Revision: D21179513
fbshipit-source-id: 3f289c06d3d470ff492de39fa985203b3facbf00