495 Commits

Author SHA1 Message Date
Andrey Chursin
11faa47a7d pycheckout: empty pycheckout crate
Reviewed By: DurhamG

Differential Revision: D26435596

fbshipit-source-id: 013070ff918c7b45105c7abdad0950ee96d46d49
2021-02-17 12:02:43 -08:00
Arun Kulshreshtha
c97db6b042 auth: rename structs
Summary:
Make the struct and method names in this crate more clearly reflective of what they do:

- `Auth` -> `AuthGroup`
- `Auth::try_from` -> `AuthGroup::new`
- `AuthConfig` -> `AuthSection`
- `AuthConfig::new` -> `AuthSection::from_config`
- `AuthConfig::auth_for_url` -> `AuthSection::best_match_for`

Reviewed By: singhsrb

Differential Revision: D26436095

fbshipit-source-id: a5ec5d9c48d3b75a0ee166b74e5340f9c529eeae
2021-02-12 17:52:29 -08:00
Zeyi (Rice) Fan
d552144478 configparser: move conversion related to a separated module
Summary:
Move these conversion-related functions and traits out of the `hg` module so EdenFS can use them too. Changes:

* Moved `get_opt`, `get_or` and `get_or_default` directly into `ConfigSet`.
* Moved `FromConfigValue` and `ByteCount` into `configparser::convert`.

Reviewed By: quark-zju

Differential Revision: D26355403

fbshipit-source-id: 9096b7b737bc4a0cccee1a3883e89a323f864fac
2021-02-12 12:33:47 -08:00
Arun Kulshreshtha
8412d537f8 pyauth: add bindings to auth crate
Summary: Add Python bindings to the Rust `auth` crate, with the intention of replacing `httpconnection.readauthforuri`.

Reviewed By: quark-zju

Differential Revision: D26419447

fbshipit-source-id: dd13bea74961137790beb8c96120ebef99e3c313
2021-02-12 10:04:27 -08:00
Arun Kulshreshtha
9fc3b326fb bindings/pyerror: add CertificateError exception
Summary:
The auth crate is now able to check the presence and expiration of client certificates (D26009207 (9f7d4447fd)). When a problem is detected, it emits an `X509Error`, which specifies exactly what the problem is. Since this error always indicates a certificate issue, we can print out the message configured in `help.tlsauthhelp` (which is more specific than `help.tlshelp` from the previous diff).

Previously, Mercurial would attempt to use the certificate anyway, resulting in a difficult to understand error message. Although the previous diffs in this stack improved the error messages on any TLS failure, the `X509Error` messages are even more helpful.

Users can opt in to this certificate validation with `edenapi.validate-certs`. The functionality is gated on a config option to prevent Mercurial from crashing if certificates are misconfigured, but EdenAPI isn't being used.

Reviewed By: quark-zju

Differential Revision: D26385843

fbshipit-source-id: 9809f612f8aab3f2dd442d6dd8dc348f1af45296
2021-02-11 19:29:22 -08:00
Arun Kulshreshtha
8b9cf7e7cb bindings/pyerror: add TlsError exception type
Summary: Add a new `TlsError` Python exception type corresponding to `HttpClientError::Tls`.

Reviewed By: quark-zju

Differential Revision: D26385846

fbshipit-source-id: c0df543032461de650a4d24c26c6b8aaab1abbb9
2021-02-11 19:29:21 -08:00
Stefan Filip
93c1231c55 segmented_changelog: update hash_to_location to gracefully handle unknown hashes
Summary:
One of the primary use cases for hash_to_location is translating user provided
hashes. It is then perfectly valid for the hashes that are provided to not
exist.  Where we would previously return an error for the full request if a
hash was invalid, we now omit the hash from the response.
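
A minimal sketch of the "omit unknown hashes" behavior described above. The function and data below are hypothetical stand-ins, not the real segmented_changelog API:

```python
# Hypothetical sketch: names and data shapes are illustrative only.
def hash_to_location(known, hashes):
    """Map each known hash to its location; skip hashes that are unknown
    instead of failing the whole request."""
    return {h: known[h] for h in hashes if h in known}


known_locations = {"aaa": ("master", 0), "bbb": ("master", 1)}
result = hash_to_location(known_locations, ["aaa", "zzz", "bbb"])
```

The caller can tell which hashes were unknown by checking which inputs are absent from the response.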

Reviewed By: quark-zju

Differential Revision: D26389472

fbshipit-source-id: c59529d43f44bed7cdb2af0e9babc96160e0c4a7
2021-02-11 12:17:35 -08:00
Stefan Filip
c9f3ae8fa4 edenapi: add commithashtolocation to python client
Summary: Same approach as locationtohash.

Reviewed By: quark-zju

Differential Revision: D26382616

fbshipit-source-id: a06c62c3eebcbe0b07b5c28fce4789a1334d55a4
2021-02-11 12:17:35 -08:00
Stefan Filip
cfed6bb108 edenapi: add commitlocationtohash to python client
Summary:
The approach is very similar to what commitrevlogdata does. You could say
that it's cargo culted.
I am not sure how appropriate it is to return CommitLocationToHashResponse
but I think that it's fine for now.

Reviewed By: quark-zju

Differential Revision: D26374219

fbshipit-source-id: 61d851d5a4fc4223c65078ef434a0c67314a90cd
2021-02-11 12:17:35 -08:00
Durham Goode
8c08a42d22 dynamicconfig: introduce configs.allowedconfigs
Summary:
In our upcoming migration away from chef/static rc files, we'll be
marking certain files as "allowed". Our hope is that that list only includes
things like .hg/hgrc, ~/.hgrc, etc.

There are cases, however, where it's convenient to continue to use chef, for
instance when we condition on machine type. To support this, let's add an
allowed_config option, which will allow configs from non-supported locations.

This will also be useful when remediating issues that come up when we start
enforcing allow_location, without rolling back the entire thing.

Reviewed By: quark-zju

Differential Revision: D26233451

fbshipit-source-id: 71789e0361923a6f80de4aef7f012afc0269440d
2021-02-10 19:30:35 -08:00
Andrey Chursin
14064f8582 vfs: move conflict handling from pyworker to vfs write
Summary:
Previously, `write` could fail because the destination file existed as a
directory, or the parent directory was missing. pyworker handled those cases
by calling `clear_conflicts` to remove conflicting directories and create
missing parent directories, then retrying `write`. In practice, for all `write`
use cases (including checkout) we always want the `clear_conflicts` behavior.
Therefore, move `clear_conflicts` to vfs `write` and make it private.
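
A rough Python sketch of a `write` that clears conflicts itself. This is an illustration only: the real code is Rust, and this version clears conflicts eagerly rather than retrying after a failure. It does not handle a parent path that exists as a regular file.

```python
import os
import shutil
import tempfile


def write(path, data):
    """Write data to path, clearing conflicts first (hypothetical sketch)."""
    # Create any missing parent directories.
    os.makedirs(os.path.dirname(path), exist_ok=True)
    # If the destination currently exists as a directory, remove it.
    if os.path.isdir(path) and not os.path.islink(path):
        shutil.rmtree(path)
    with open(path, "wb") as f:
        f.write(data)


root = tempfile.mkdtemp()
target = os.path.join(root, "a", "b")
os.makedirs(os.path.join(target, "stale"))  # destination is a directory
write(target, b"hello")
missing_parent = os.path.join(root, "x", "y", "z")  # parents do not exist yet
write(missing_parent, b"world")
```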

Reviewed By: quark-zju

Differential Revision: D26257829

fbshipit-source-id: 03d1da0767202edba61c47ae5654847c0ea3b33e
2021-02-09 17:04:30 -08:00
Jun Wu
ae8aa967bb pytracing: set target to module name by default
Summary:
This matches the Rust behavior and is useful for filtering
because the env logger syntax applies to target:

  % RUST_LOG=edenscm.hgext.debugshell=info lhg dbsh
  In [1]: from edenscm import tracing

  In [2]: tracing.info('foo')
  [2021-02-05T19:21:41.082Z INFO  edenscm.hgext.debugshell] message="foo"

Reviewed By: kulshrax

Differential Revision: D26282053

fbshipit-source-id: 8ee9e82b955835b24c49f9bf81c7a3aec7a65a33
2021-02-05 15:19:34 -08:00
Jun Wu
a64040a7e0 revset: optimize nameset._slice
Summary:
This affects `first`, `last`, `limit` revset functions.

Previously, they just iterated through the set naively. Now they have Rust fast
paths. For example:

  In [1]: time repo.revs('last(parents(:1000000),100)')
  CPU times: user 5.08 ms, sys: 1.02 ms, total: 6.1 ms
  Wall time: 4.96 ms
  Out[1]: <nameset+ <spans [0a087e42b29ba5c9ceb3588477d78f7f09ce2663:af5e72c2a0c2e78de462bba8bde63f2499aeb9b5+999900:999999]>>

  In [2]: time repo.revs('first(parents(:1000000),100)')
  CPU times: user 2.21 ms, sys: 40 µs, total: 2.25 ms
  Wall time: 1.83 ms
  Out[2]: <nameset+ <spans [06b96ec2a8b60d984606f36c30d3dbc899d804df:4cab7b68c0bbdc13eb2eded8fc8c4c8d520a7189+0:99]>>

  In [5]: time repo.revs('first(reverse(parents(:1000000)),100)')
  CPU times: user 2.2 ms, sys: 185 µs, total: 2.39 ms
  Wall time: 1.67 ms
  Out[5]: <nameset- <spans [0a087e42b29ba5c9ceb3588477d78f7f09ce2663:af5e72c2a0c2e78de462bba8bde63f2499aeb9b5+999900:999999]>>

  In [6]: time repo.revs('last(reverse(parents(:1000000)),100)')
  CPU times: user 2.01 ms, sys: 12 µs, total: 2.02 ms
  Wall time: 1.68 ms
  Out[6]: <nameset- <spans [06b96ec2a8b60d984606f36c30d3dbc899d804df:4cab7b68c0bbdc13eb2eded8fc8c4c8d520a7189+0:99]>>

  In [7]: time repo.revs('limit(reverse(parents(:1000000)),100,10000)')
  CPU times: user 1.48 ms, sys: 887 µs, total: 2.37 ms
  Wall time: 1.89 ms
  Out[7]: <nameset- <spans [5e23b6f07f1512a8991de5a0e883ff4d598ac1d7:f5f68207be4a458e99d4a9200296977aecf44ea2+989900:989999]>>

This, together with fast `_firstancestors`, could potentially answer
globalrev-like queries on the master branch without recording the globalrev
information server side (which has complexities like locking, etc). For
example, to convert a global rev `g` to commit:

  c = repo.revs('limit(_firstancestors(master), 1, %s)', g).first()

To convert a commit `c` to global rev:

  g = len(repo.revs('_firstancestors(%s)', c)) - 1
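
The two conversions can be illustrated with a toy model. This sketch assumes the first-ancestor chain is iterated oldest first, and the `first_parent` table and helper names are made up for the example:

```python
# Toy model: each commit maps to its first parent (None for the root).
first_parent = {"c0": None, "c1": "c0", "c2": "c1", "c3": "c2"}


def first_ancestors(commit):
    """Return commit and its first-parent ancestors, oldest first."""
    chain = []
    while commit is not None:
        chain.append(commit)
        commit = first_parent[commit]
    chain.reverse()
    return chain


def globalrev_to_commit(master, g):
    # Mirrors limit(_firstancestors(master), 1, g): skip g items, take one.
    return first_ancestors(master)[g]


def commit_to_globalrev(c):
    # Mirrors len(_firstancestors(c)) - 1.
    return len(first_ancestors(c)) - 1
```

In this model the root commit gets global rev 0, and the two functions are inverses of each other for commits on the first-parent chain of master.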

Reviewed By: sfilipco

Differential Revision: D26203558

fbshipit-source-id: 14d9247bbb07260f783e05b3fb1034406de48121
2021-02-05 12:00:41 -08:00
Jun Wu
95668fd91d dag: ensure LazySet has Hints set
Summary:
Previously, `LazySet` was constructed with default `Hints`. That disables fast
paths. Revise the API so LazySet requires an explicit `Hints` to address the
issue.

Reviewed By: sfilipco

Differential Revision: D26203561

fbshipit-source-id: c92cd1f7eb7b40ffaaf53abcf05e64f3d41b906d
2021-02-05 11:53:46 -08:00
Jun Wu
9b02ebc711 dag: make spanset module private
Summary:
This just renames types so `IdSet` is the recommended name and `SpanSet`
remains an implementation detail.

Reviewed By: sfilipco

Differential Revision: D26203560

fbshipit-source-id: 7ca0262f3ad6d874363c73445f40f8c5bf3dc40e
2021-02-05 11:53:45 -08:00
Jun Wu
cc123cc1ce revset: optimize _firstancestors using Rust fast path
Summary:
Optimize the `_firstancestors` revset function using Rust.

When calculating `len(_firstancestors(master))`, the new code takes 2ms while
the old code needed 76s, a ~40000x improvement.

Reviewed By: sfilipco

Differential Revision: D26182242

fbshipit-source-id: 55f17b014e727d8e8e3099b7c287f8bf8479279b
2021-02-05 11:53:45 -08:00
Jun Wu
01b0122b0b revset: optimize x~n using Rust fast path
Summary:
Optimize the `x~n` revset function using Rust.

Note: This changes the behavior a bit: `x~n` no longer returns `null`.

Reviewed By: sfilipco

Differential Revision: D26142683

fbshipit-source-id: d6a45b7e67352d74986274e52002a769bbae772e
2021-02-05 11:37:51 -08:00
Jun Wu
20603043e2 revset: optimize merge() using Rust fast path
Summary: Optimize the "merge()" revset function using the merges() from Rust.

Reviewed By: sfilipco

Differential Revision: D26142169

fbshipit-source-id: 47f426625869b7889b28bb1a18544d4abae36cae
2021-02-05 11:37:50 -08:00
Jun Wu
6956496602 lib: bytes::Bytes -> minibytes::Bytes
Summary:
The `bytes` crate still does not support zero-copy on mmapped buffers.
Switch to `minibytes::Bytes` so bytes returned from our main storage
backend indexedlog is a zero-copy slice backed by a mmap buffer.

Migrate vfs, revisionstore, pyworker to minibytes so they can preserve
zero-copy mmap buffers from indexedlog.

edenapi/types is unchanged, since it's also used in Mononoke, which uses
`bytes::Bytes` everywhere. The conversion to `minibytes::Bytes` is cheap
so it's probably not a performance issue.

Reviewed By: kulshrax

Differential Revision: D26218289

fbshipit-source-id: e4f1c631143b7676c6b48d3b4f97055299bfd334
2021-02-03 20:22:32 -08:00
Durham Goode
88900aaf93 dynamicconfig: remove legacylist and disallowedlist deprecation logic
Summary:
The original migration strategy with dynamicconfig was to fix configs
one by one until the dynamicconfig values matched the chef/static ones, then we
can turn off chef/static configs. This looks to be too much work, so we're going
to try a different strategy of just turning off all chef/static configs on a
small number of hosts and seeing what breaks.

The legacylist and disallowlist configs were part of the old strategy, and they
make it more complicated to fix dynamicconfig mismatches, so let's get rid of
them.

Reviewed By: quark-zju

Differential Revision: D26208548

fbshipit-source-id: 63171f1f16aa0498c0eefa994dffaeb8e0cc0d72
2021-02-03 09:53:00 -08:00
Durham Goode
13de9f801a doctor: repair treemanifest indexedlogs as well
Summary: Previously we only repaired the file indexedlogs.

Reviewed By: xavierd

Differential Revision: D26202423

fbshipit-source-id: b1c673ae69a357d66ab2baf5c36985a3b0597427
2021-02-02 16:21:01 -08:00
Durham Goode
7f555d2d06 http: improve error messages from http failures
Summary:
Currently the data layer eats all errors from remote stores and treats
them as KeyErrors. This hides connection issues from users behind obscure
KeyErrors. Let's make it so that any non-key error reported by the remote store
is propagated up as a legitimate error.

This diff makes Http errors from EdenApi show up with a nicer error message,
suggesting that the user run fixmyserver.

Further fixes will probably be necessary to categorize other errors from the
remote store more nicely.

Reviewed By: quark-zju

Differential Revision: D26117726

fbshipit-source-id: 7d7dee6ec101c6a1d226185bb27423d977096050
2021-01-29 09:40:19 -08:00
Jun Wu
74512659ef pytracing: add instrument
Summary:
Similar to Rust tracing's instrument. It's a decorator for functions that will
generate spans when called.
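
To illustrate the idea of a span-generating decorator, here is a plain Python sketch. It is not the pytracing implementation: the `spans` list is a hypothetical stand-in for a tracing backend.

```python
import functools

spans = []  # hypothetical stand-in for a tracing backend


def instrument(func):
    """Record an enter/exit pair around each call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        spans.append(("enter", func.__name__))
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so every enter has a matching exit.
            spans.append(("exit", func.__name__))
    return wrapper


@instrument
def add(a, b):
    return a + b


total = add(1, 2)
```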

Reviewed By: sfilipco

Differential Revision: D26021958

fbshipit-source-id: bbb648ab07d6db233cd16f56a5f0441df07e1c2e
2021-01-28 13:17:59 -08:00
Jun Wu
9b4732a145 pytracing: expose APIs about runtime span callsites and spans
Summary:
This allows Python to create span callsites understood by Rust tracing,
and operate on the spans (ex. enter, exit, record, etc.).

Reviewed By: sfilipco

Differential Revision: D26013747

fbshipit-source-id: 2783b29750e3279c5481422bddc83366ff7a3548
2021-01-28 13:17:59 -08:00
Jun Wu
1d314204e3 pytracing: expose APIs about runtime event callsites
Summary:
This allows Python to create callsites for one-off events understood by Rust.
This diff adds the "EventCallsite" for logging one-off events.

Reviewed By: sfilipco

Differential Revision: D26013749

fbshipit-source-id: 2520928dc360852afe2780267036e9d22c212191
2021-01-28 13:17:59 -08:00
Thomas Orozco
9d7b0267dd revisionstore: pass client correlator
Summary:
We used to get those in the old (Python) LFS extension, but didn't have them in
the new one. However, this is helpful to correlate requests to LFS with data in
hg logs. It's also convenient to be able to identify whether a set of requests
are part of the same session or not.

This diff threads the client correlator through to the LFS store from the
Python, similarly to how it's done in EdenAPI.

Reviewed By: DurhamG

Differential Revision: D25804930

fbshipit-source-id: a5d5508617fa4184344834bbd8e3423816aa7668
2021-01-11 10:46:20 -08:00
Stefan Filip
02606da6c5 pathmatcher: allow errors in match function definition
Summary:
The most common scenario where we see matcher errors is when we iterate through
a manifest and the user sends SIGTERM to the process. The matcher may be both
Rust and Python code. The Python code handles the interrupt and prevents future
function calls. The iterating Rust code will continue to call matcher functions
through this time so we get matcher errors from the terminated Python stack.

As long as we have Python matcher code, errors are valid.
It is unclear to me whether the matcher trait should have `Result` return
values when all implementations are Rust. It is easy to imagine implementations
that can fail in different circumstances but the ones that we use if we just
port the Python code wouldn't fail.
All in all, I think that this is a reasonable step forward.

Reviewed By: quark-zju

Differential Revision: D25697099

fbshipit-source-id: f61c80bd0a8caa58040a447ed02d48a1ae84ad60
2021-01-07 16:22:17 -08:00
Jun Wu
fab3b21289 pydag: expose Dag and IdMap ID via hints
Summary: This makes it easier to investigate fast path issues.

Reviewed By: sfilipco

Differential Revision: D25598077

fbshipit-source-id: 27b7042fb9510321c25371f8c5d134e248b3d5d5
2020-12-18 16:56:43 -08:00
Jun Wu
207f755dc0 hgcommits: make revlog optional for the hybrid backend
Summary: This makes it possible to add new commits in a repo without revlog.

Reviewed By: sfilipco

Differential Revision: D25602527

fbshipit-source-id: 56c27a5f00307bcf35efa4517c7664a865c47a43
2020-12-18 16:47:11 -08:00
Durham Goode
dce0faa9b1 configparser: add allowed_location criteria to config verifier
Summary:
We want to start disallowing non-approved config files from being
loaded. To do that, let's update the config verifier to accept an optional list
of allowed locations. If it's provided, we delete any values that came from a
disallowed location.

This will enable us to prune our config sources down to rust configs,
configerator configs, .hg/hgrc, and ~/.hgrc.
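
A simplified sketch of the allowed-location filtering, assuming each config value remembers the location it came from. The data layout and the `verify` name are illustrative, not the configparser API:

```python
def verify(values, allowed_locations=None):
    """values maps (section, name) -> (value, location).

    With no allow list, everything is kept. Otherwise values that came
    from a location outside the list are dropped.
    """
    if allowed_locations is None:
        return dict(values)
    return {
        key: entry
        for key, entry in values.items()
        if entry[1] in allowed_locations
    }


values = {
    ("ui", "username"): ("alice", "~/.hgrc"),
    ("ui", "merge"): ("vimdiff", "/etc/mercurial/hgrc"),
}
kept = verify(values, allowed_locations={"~/.hgrc", ".hg/hgrc"})
```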

Reviewed By: quark-zju

Differential Revision: D25539738

fbshipit-source-id: 0ece1c7038e4a563c92140832edfa726e879e498
2020-12-17 06:37:54 -08:00
Jun Wu
843ef6aab2 pydag: add API to get commit text in batch
Summary:
Unlike streamcommitrawtext, the new API does not put Python logic to a
background thread. This makes it easier to reason about the Python logic, as
it does not need to be thread-safe, and we don't need to think about Python GIL
deadlocks in the Rust async world.

Reviewed By: sfilipco

Differential Revision: D25513057

fbshipit-source-id: 4b30d7bab27070badd205ac1a9d54bae7f1f8cec
2020-12-14 19:10:44 -08:00
Xavier Deguillard
49de273c77 revisionstore: handle redacted blob in LFS
Summary:
When a blob is redacted server side, the HTTP code 410 is returned.
Unfortunately, that HTTP code wasn't supported and caused Mercurial to crash.
To fix this, we just need to store a placeholder when receiving this HTTP code
and simply return it up the stack when reading it.
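
A minimal sketch of the placeholder approach, assuming a dict-like store. The placeholder value and function name are hypothetical and do not reflect what revisionstore actually writes:

```python
REDACTED_PLACEHOLDER = b"<redacted>"  # hypothetical marker value


def store_response(store, key, status, body):
    """Store a fetched blob, or a placeholder if the server redacted it."""
    if status == 410:
        # 410 Gone: the blob was redacted server side.
        store[key] = REDACTED_PLACEHOLDER
    elif status == 200:
        store[key] = body
    else:
        raise RuntimeError(f"unexpected HTTP status {status}")


store = {}
store_response(store, "blob1", 200, b"contents")
store_response(store, "blob2", 410, b"")
```

Readers then get the placeholder back from the store instead of a crash.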

Reviewed By: DurhamG

Differential Revision: D25433001

fbshipit-source-id: 66ec365fa2643bcf29e38b114ae1fc92aeaf1a7b
2020-12-11 12:28:38 -08:00
Jun Wu
ad6f25addc dag: make IdConvert async
Summary: Make IdConvert async and migrate all its users.

Reviewed By: sfilipco

Differential Revision: D25350915

fbshipit-source-id: f05c89a43418f1180bf0ffa573ae2cdb87162c76
2020-12-10 12:37:35 -08:00
Jun Wu
461fa77fd7 dag: make Set::flatten async
Summary: This will make it easier to make IdConvert async.

Reviewed By: sfilipco

Differential Revision: D25345239

fbshipit-source-id: 684a0843ae32270aa9b537ef9a2b17a28c027e51
2020-12-10 12:37:34 -08:00
Jun Wu
53bdae78d9 dag: make ToIdSet async
Summary: This will make it easier to make IdConvert async.

Reviewed By: sfilipco

Differential Revision: D25345232

fbshipit-source-id: b8967ea51a6141a95070006a289dd724522f8e18
2020-12-10 12:37:34 -08:00
Jun Wu
f854d2e03e dag: make DagAlgorithm async
Summary:
Update DagAlgorithm and all its users to async. This makes it easier to make
IdConvert async.

Reviewed By: sfilipco

Differential Revision: D25345236

fbshipit-source-id: d6cf76723356bd0eb81822843b2e581de1e3290a
2020-12-10 12:37:34 -08:00
Jun Wu
0ac5bcef79 mutationstore: make calculate_obsolete async
Summary: This will make it easier if the `Set` passed in requires async.

Reviewed By: sfilipco

Differential Revision: D25345230

fbshipit-source-id: a327d4e5d425b7eb5296b2fbe25c446492aa9ea7
2020-12-10 12:37:34 -08:00
Jun Wu
a03e8f4c55 dag: make DagPersistent and DagAddHeads async
Summary: This makes it easier to make DagAlgorithm async.

Reviewed By: sfilipco

Differential Revision: D25345234

fbshipit-source-id: 5ca4bac38f5aac4c6611146a87f423a244f1f5a2
2020-12-10 12:37:33 -08:00
Jun Wu
6d9e8eb249 hgcommits: make traits async
Summary: This will make it easier to use `.await` if part of `dag` becomes async.

Reviewed By: sfilipco

Differential Revision: D25345237

fbshipit-source-id: 7f07cdaa9c2e0468667638066611fabe3a3f7f28
2020-12-10 12:37:33 -08:00
Jun Wu
2496281d78 dag: make PrefixLookup async
Summary: Use async function for the PrefixLookup trait.

Reviewed By: sfilipco

Differential Revision: D24840820

fbshipit-source-id: d22cac9f11b06e3127fa956e3f116cf232214125
2020-12-10 12:37:32 -08:00
Jun Wu
4d791fd823 dag: require PrefixLookup for IdConvert
Summary: This makes `dyn IdConvert` include `PrefixLookup`.

Reviewed By: sfilipco

Differential Revision: D24840819

fbshipit-source-id: 8d4e25c534f6e4397ec6f643eb3aa116bff12a2c
2020-12-10 12:37:32 -08:00
Jun Wu
32176eca42 dag: add async interface for NameSet
Summary:
Change the main API of NameSet to async. Use the `nonblocking` crate to bridge
the sync and async world for compatibility. Future changes will migrate
Iterator to async Stream.

Reviewed By: sfilipco

Differential Revision: D24806696

fbshipit-source-id: f72571407a5747a4eabe096dada288656c9d426e
2020-12-10 12:37:31 -08:00
Jun Wu
fa18f70f8f pyhgtime: resolve a compiler warning
Summary:
```
warning: attribute should be applied to a function or static
  --> pyhgtime/src/lib.rs:70:13
   |
70 |             #[no_mangle]
   |             ^^^^^^^^^^^^
71 |             static timezone: c_long;
   |             ------------------------ not a function or static
   |
   = note: `#[warn(unused_attributes)]` on by default
```

Reviewed By: ikostia

Differential Revision: D25345233

fbshipit-source-id: 4b6f753544af1e3e8479dceed908299f6dc57ad5
2020-12-10 12:37:31 -08:00
Durham Goode
a5f8e0a301 revisionstore: allow persistent IndexedLogHistory
Summary:
In a future diff we'll use the indexedlog stores for local history. We
want those to exist forever, so let's move IndexedLogHgIdHistoryStore to use a
Store under the hood, and add an enum for distinguishing between the two types
at creation time.

Reviewed By: quark-zju

Differential Revision: D25429675

fbshipit-source-id: 5f2dc494e1175d4c1dc74992d3311d2e55d784ca
2020-12-10 07:28:13 -08:00
Durham Goode
4f611364eb revisionstore: add repair support to IndexedLogHgIdDataStore
Summary:
We temporarily dropped repair support when transitioning to using Store instead
of a raw RotateLog. Let's add that back now.

Reviewed By: xavierd

Differential Revision: D25371622

fbshipit-source-id: e28fc425a6ffb50c93540672b0df75a172ebbe9c
2020-12-09 07:07:40 -08:00
Durham Goode
925e457ae7 revisionstore: allow persistent IndexedLogData
Summary:
In a future diff we'll use the indexedlog stores for local data. We
want those to exist forever, so let's move IndexedLogHgIdDataStore to use a
Store under the hood, and add an enum for distinguishing between the two types
at creation time.

Reviewed By: xavierd

Differential Revision: D23915622

fbshipit-source-id: 296cf6dfcd53e5cf1ae7624fdccedf0a60a77f22
2020-12-09 07:07:40 -08:00
Meyer Jacobs
293053b774 edenapi: expose "attributes" parameter in tree request API
Summary:
Introduce a new API type, `TreeAttributes`, corresponding to the existing type `WireTreeAttributesRequest`, which exposes which optional attributes are available for fetching. An `Option<TreeAttributes>` parameter is added to the `trees` API, and if set to `None`, the client will make a request with TreeAttributes::default().

The Python bindings accept a dictionary for the attributes parameter, and any fields present will overwrite the default settings from TreeAttributes::default(). Unrecognized attributes will be silently ignored.
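
The dictionary handling described above could look roughly like this. The field names and defaults are illustrative assumptions, not a statement of what `TreeAttributes::default()` contains:

```python
# Assumed defaults for illustration only.
DEFAULT_TREE_ATTRIBUTES = {
    "manifest_blob": True,
    "parents": True,
    "child_metadata": False,
}


def tree_attributes(overrides=None):
    """Start from the defaults, apply recognized overrides, ignore the rest."""
    attrs = dict(DEFAULT_TREE_ATTRIBUTES)
    for name, value in (overrides or {}).items():
        if name in attrs:
            attrs[name] = value
        # Unrecognized attribute names are silently ignored.
    return attrs


attrs = tree_attributes({"child_metadata": True, "no_such_field": True})
```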

Reviewed By: kulshrax

Differential Revision: D25041255

fbshipit-source-id: 5c581c20aac06eeb0428fff42bfd93f6aecbb629
2020-12-01 19:07:25 -08:00
Jun Wu
c6741e4c3a Back out "smartset: back out use Rust reentrant generator for generatorset"
Summary:
The original issue was a rust-cpython bug, solved by D24698226, or https://github.com/dgrunwald/rust-cpython/pull/244.

Original commit changeset: 08f598df0892

Reviewed By: sfilipco

Differential Revision: D24759765

fbshipit-source-id: f9a1359cfce68c8754ddd1bcb8bfc54bf75af7ff
2020-11-06 16:25:00 -08:00
Saurabh Singh
7fcc71e9d4 smartset: back out use Rust reentrant generator for generatorset
Summary: This is essentially a backout of D24365328 (9f664a8b30).

Reviewed By: DurhamG

Differential Revision: D24544495

fbshipit-source-id: 08f598df0892a8479fac563096f9782038e18dfe
2020-10-26 11:58:31 -07:00
Meyer Jacobs
0ccae1cef9 errors: introduce per-item errors to EdenAPI protocol
Summary: Rather than silently dropping entries which cannot be fetched, this change has the `WireTreeEntry` type carry optional error information, allowing it to be (de)serialized to / from `Result<TreeEntry, EdenApiServerError>` instead of a bare `TreeEntry`. Currently, handling of these failures is up to the individual application code, but it might be useful to introduce utility functions to drop failed entries and log errors.
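
A sketch of the kind of utility function suggested above, which drops failed entries and logs their errors. Entries are modeled here as either a value or an exception object, which is an assumption made for the example:

```python
def drop_failed(entries, log):
    """Return successful entries; append a message to log for each failure."""
    ok = []
    for entry in entries:
        if isinstance(entry, Exception):
            log.append(str(entry))
        else:
            ok.append(entry)
    return ok


errors = []
entries = ["tree-a", ValueError("missing tree-b"), "tree-c"]
good = drop_failed(entries, errors)
```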

Reviewed By: kulshrax

Differential Revision: D24315399

fbshipit-source-id: 94e4593b77cf2dc12d0dcc93d174c8a4eda95344
2020-10-25 18:39:34 -07:00
Jun Wu
9f664a8b30 smartset: use Rust reentrant generator for generatorset
Summary:
The generatorset has a pure Python implementation for rewindable generator.
However it is not thread-safe, and we want thread-safety since the set
iteration will be driven by async Rust in multiple threads.

One of the issues is, during thread 1 `next(gen)`, thread 2 might call
`next(gen)`, and that's not allowed by Python.

Fix them by switching to the Rust RGenerator.

Reviewed By: DurhamG

Differential Revision: D24365328

fbshipit-source-id: 2785e80c7c460a7f754ed23e3af99f4a5c9fbcdf
2020-10-21 17:17:08 -07:00
Xavier Deguillard
c62cd9e8c8 revisionstore: move the repair logic from doctor
Summary:
The ContentStore/MetadataStore are made of several different stores. Attempting
to expose all of them to Python to drive the repair logic from there would leak
implementation details of how the stores are implemented.

Instead, let's simply expose a single `repair` function out of the
pyrevisionstore crate that takes care of repairing all of the underlying
stores. For now, this is just moving code around, but a future diff will
integrate the LFS stores.

Reviewed By: DurhamG

Differential Revision: D24449203

fbshipit-source-id: 1631ced9068716453cb404bf7e65cefbf2db5247
2020-10-21 13:20:51 -07:00
Jun Wu
229713757d pythreading: add a reentrant generator
Summary:
Pure Python generators have a few limitations:
- Can only be iterated once.
- Cannot be iterated by multiple threads concurrently.

Once revset iterators are converted to Rust Streams, they run in different
threads and can context switch in `next()`, breaking the program.

Provide a native reentrant generator implementation to address the issue.
This is sound because Rust methods cannot be interrupted by Python.
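
For intuition only, the semantics (iterate many times, safely from several iterators) can be approximated in Python with a lock and a cache of produced items. This is a sketch of the behavior, not the native implementation, and the class name is made up:

```python
import threading


class RewindableGenerator:
    """Cache items from a generator so it can be iterated repeatedly."""

    def __init__(self, gen):
        self._gen = iter(gen)
        self._items = []
        self._done = False
        self._lock = threading.Lock()

    def __iter__(self):
        index = 0
        while True:
            with self._lock:
                if index < len(self._items):
                    item = self._items[index]  # already produced
                elif self._done:
                    return
                else:
                    try:
                        item = next(self._gen)  # only one thread at a time
                    except StopIteration:
                        self._done = True
                        return
                    self._items.append(item)
            index += 1
            yield item  # yield outside the lock


rg = RewindableGenerator(x * x for x in range(4))
first = list(rg)
second = list(rg)
```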

Reviewed By: DurhamG

Differential Revision: D24365331

fbshipit-source-id: 885dade922b7863a73203b206a96b492d55bccd0
2020-10-20 15:24:27 -07:00
Jun Wu
06249e957f pydag: update return type of streamcommitrawtext from dict to tuple
Summary:
Update the return type of streamcommitrawtext from
`{"vertex": v, "raw_text": t}` to `(v, t)`.  This makes it easier to use
in Python, as Python supports `for v, t in ...` but does not support
`for {"vertex": v, "raw_text": t} in ...`.
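
A small illustration of the difference, using made-up data in place of the real stream:

```python
# New shape: tuples unpack directly in the for statement.
new_items = [("v1", "text1"), ("v2", "text2")]
pairs = []
for v, t in new_items:
    pairs.append((v, t))

# Old shape: dicts need an explicit lookup inside the loop body.
old_items = [{"vertex": "v1", "raw_text": "text1"}]
old_pairs = []
for item in old_items:
    old_pairs.append((item["vertex"], item["raw_text"]))
```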

Reviewed By: DurhamG

Differential Revision: D24295457

fbshipit-source-id: 284a29b9deae2d8509d3afea0fcbcaadbfebbae8
2020-10-20 15:24:27 -07:00
Arun Kulshreshtha
d16a62ce06 edenapi: send user agent string
Summary:
Include a `User-Agent` header in EdenAPI requests from Mercurial. This will allow us to see the version in Scuba, and in the future, will allow us to distinguish between requests sent by Mercurial and those sent directly by EdenFS.

Keeping with the current output of `hg version`, the application is specified as "EdenSCM" rather than "Mercurial".

Reviewed By: singhsrb

Differential Revision: D24347021

fbshipit-source-id: e323cfc945c9d95d8b2a0490e22c2b2505a620dc
2020-10-16 11:05:24 -07:00
Arun Kulshreshtha
32c109d955 edenapi: send client correlator to server
Summary:
Include the client correlator string from the `clienttelemetry` extension in each EdenAPI HTTP request via the `X-Client-Correlator` header.

The `ClientIdentityMiddleware` in `gotham_ext` already understands this header (as it is already used by the LFS server), and `gotham_ext`'s `ScubaMiddleware` will automatically include the provided correlator in the server's Scuba samples.

Reviewed By: farnz

Differential Revision: D24282244

fbshipit-source-id: 13d04e706eda38893cff6e740bd1d7bf104e43dd
2020-10-13 13:25:52 -07:00
Meyer Jacobs
f9958ca35a taggederror: introduce category and transience metadata and precedence
Summary:
This change introduces two new metadata types, Category and Transience, and a mechanism for Category to provide a default Fault and Transience, which can be overridden by the user.

Also introduces a mechanism for attempting to log exceptions which occur during exception logging, falling back to the previous behavior of just swallowing the exception on failure.

Reviewed By: DurhamG

Differential Revision: D22677565

fbshipit-source-id: 1cf75ca1e2a65964a0ede1f072439378a46bd391
2020-10-12 17:17:34 -07:00
Jun Wu
8dfd6c26ea pydag: support hybrid commits backend
Summary:
Support constructing the "hybrid" commits backend, which is similar to
"doublewrite" but reads commit text from edenapi via the `streamcommitrawtext`
method.

Reviewed By: sfilipco

Differential Revision: D23924149

fbshipit-source-id: cb15ee4be7953af7798d460557ba2ae2d4f24a52
2020-10-06 19:13:03 -07:00
Jun Wu
9741de4136 pydag: expose API to read commit text using streams
Summary:
This can be used like:

  In [1]: s=cl.inner.streamcommitrawtext(repo.nodes('.%%master'))  # repo.nodes returns a generator, becomes stream

  In [2]: s
  Out[2]: <stream at 0x7f5eec742df0>

  In [3]: list(s)
  Out[3]: [{'vertex': ..., 'raw_text': ...}, ...]

  In [4]: s.typename()
  Out[4]: 'cpython_ext::convert::Serde<hgcommits::ParentlessHgCommit>'

Reviewed By: sfilipco

Differential Revision: D23911870

fbshipit-source-id: f54959a551d446ed5b8086a2235fe74e47b29e70
2020-10-06 19:13:02 -07:00
Jun Wu
6defe87dcb streams: add abstraction about downloading missing data from remote
Summary:
The API is basically to resolve `input_stream` to `output_stream`, with a
stateful "resolver" that can resolve locally and remotely.
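
A toy version of such a resolver, assuming keys are looked up one at a time (the real abstraction is stream-based and its API is not shown here). All names are hypothetical:

```python
class Resolver:
    """Resolve keys locally when possible, fetching from remote otherwise."""

    def __init__(self, local, fetch_remote):
        self.local = local                # state kept across calls
        self.fetch_remote = fetch_remote  # callable: key -> value
        self.remote_fetches = 0

    def resolve(self, input_stream):
        for key in input_stream:
            if key not in self.local:
                # Missing locally: download and remember it.
                self.local[key] = self.fetch_remote(key)
                self.remote_fetches += 1
            yield key, self.local[key]


remote = {"a": "A", "b": "B"}
resolver = Resolver({"a": "A"}, remote.__getitem__)
output = list(resolver.resolve(iter(["a", "b", "b"])))
```

Because the resolver is stateful, the second request for `"b"` is served locally.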

Reviewed By: sfilipco

Differential Revision: D23915775

fbshipit-source-id: 14a3a37fc897c8229514acac5c91c7e46b270896
2020-10-06 19:13:02 -07:00
Jun Wu
ee82a84a29 pyedenapi: use serde serialization to simplify type conversion
Summary:
`cpython_ext` provides utilities to implement From/ToPyObject directly for
serde types. Let's use it to simplify the code and set up an example.

debugshell:

  In [2]: s,f=api.commitdata(repo.name, list(repo.nodes('master')))

  In [3]: list(s)
  Out[3]:
  [{'hgid': (7, 61, 22, ...), 'revlog_data': '...'}]

Note: `HgId` serialization should probably be changed to use `serde_bytes` somehow
so it does not translate to a Python tuple. That will be fixed later.

Reviewed By: kulshrax

Differential Revision: D23966987

fbshipit-source-id: 9278ccae6f543c387eafe401d4ef8d6ce96d370f
2020-10-06 16:01:23 -07:00
Jun Wu
833ac3fb4c cpython-async: drop py_stream_class macro
Summary:
The py_stream_class causes the code to be more verbose. It basically enforces
the bindings crate to define new types wrapping pure Rust types, and then
define py_stream_class.

In a future diff, I'm adding FromPyObject/ToPyObject support for types that
implement serde Deserialize/Serialize. py_stream_class gets in the way,
because the blanket type from cpython-ext cannot be used in the py_stream_class
macro. cpython-ext is not the proper place to define business-related stream
types.

Therefore, define a type-erased Python class, and implement
FromPyObject/ToPyObject automatically for TStream<anyhow::Result<T>> where
T implements FromPyObject or ToPyObject.

The FromPyObject now converts a Python iterator back to a stream. It's
no longer zero-cost. However, I'd imagine such usecases can be short-cut
using pure Rust code.

Background: Initially, I added some FromPyObject/ToPyObject impls to pure
Rust crates gated by a "pytypes" feature. While that works fine with cargo
build, buck does not support dynamic features and the fact that we support
both py2 and py3 makes it extremely hard to support cleanly in buck build.
For example, if minibytes::Bytes defines ToPyObject for Bytes, then any
crate using minibytes would have 2 different versions: a py2 version, a
py3 version, and they both depend on python. That seems to be a bad approach.

Reviewed By: sfilipco

Differential Revision: D23966984

fbshipit-source-id: eafb31ad458dcbdd8e970d8e419a10fbbe30595f
2020-10-02 21:51:49 -07:00
Durham Goode
2a9263cfe2 memcache: add progress bar to Rust memcachestore
Summary: We now get progress bar output when fetching from memcache!

Reviewed By: kulshrax

Differential Revision: D24060663

fbshipit-source-id: ff5efa08bced2dac12f1e16c4a55fbc37fbc0837
2020-10-02 15:03:17 -07:00
Arun Kulshreshtha
33a380cf56 pyedenapi: add progress bars to EdenAPI client
Summary: Now that we have progress bars in Rust, add them to the EdenAPI client bindings and remove any existing progress bars around callsites in the Python code.

Reviewed By: quark-zju

Differential Revision: D24037797

fbshipit-source-id: eb26ccaae35ab23eb76f6f2b2be575a22e1f1e53
2020-09-30 19:53:21 -07:00
Arun Kulshreshtha
dfbe53cf11 revisionstore: add progress bars to EdenAPI stores
Summary: Make EdenAPI data stores optionally show progress bars.

Reviewed By: markbt

Differential Revision: D23982320

fbshipit-source-id: b3affd3b630258f15c3cdc64c213df8aa28af589
2020-09-30 13:01:15 -07:00
Arun Kulshreshtha
2016586fe2 pyprogress: add Python bindings for Rust progress crate
Summary: Add Python bindings to the Rust progress wrappers. This may seem pointless since the Rust code just calls right back into Python, but this is a useful step to get the Rust and Python code to use a common interface for progress. (Which, in turn, will allow switching to a Rust progress implementation down the line.)

Reviewed By: markbt

Differential Revision: D23999816

fbshipit-source-id: 9bca0f23170d3ca474a1cb5d547840e63572ec71
2020-09-30 13:01:15 -07:00
Arun Kulshreshtha
31107a525f pyprogress: add Rust wrapper for Python progress bars
Summary: Add Rust wrappers around Mercurial's Python `progress` module, allowing Rust code to create and use Python progress bars. The wrapper types implement the traits from the `progress` crate, so they can be passed to pure Rust crates in `scm/lib`. In typical usage, the Rust bindings will create a `PyProgressFactory`, which will be passed to pure Rust code as a trait object or via generics.

Reviewed By: markbt

Differential Revision: D23982317

fbshipit-source-id: 4c0fde0b2423b6449c7c5155fdfd98f5da042b0d
2020-09-30 11:20:32 -07:00
Arun Kulshreshtha
d3b39542f0 revisionstore: use async_runtime in EdenAPI stores
Summary: Now that the `async_runtime` crate exists, use Mercurial's global `tokio::Runtime` instead of creating one for each EdenAPI store.

Reviewed By: quark-zju

Differential Revision: D23945569

fbshipit-source-id: 7d7ef6efbb554ca80131daeeb2467e57bbda6e72
2020-09-26 16:50:06 -07:00
Arun Kulshreshtha
29b855b256 edenapi: add server load to ResponseMeta
Summary: Now that the EdenAPI server is using the `LoadMiddleware` from `gotham_ext`, each response will contain an `X-Load` header that contains the number of active requests that the server is currently handling.

Reviewed By: quark-zju

Differential Revision: D23922809

fbshipit-source-id: 973143de5ddccf074d28aa3ef38d73f9fc1501b6
2020-09-24 21:05:21 -07:00
Durham Goode
46d0991cd0 revisionstore: expose shared mutable stores to Python
Summary:
Treemanifest needs to be able to write to the shared stores from paths
other than just prefetch (like when it receives certain trees via a standard
pull). To make this possible we need to expose the Rust shared mutable stores.
This will also make just general integration with Python cleaner.

In the future we can get rid of the non-prefetch download paths and remove this.

Reviewed By: quark-zju

Differential Revision: D23772385

fbshipit-source-id: c1e67e3d21b354b85895dba8d82a7a9f0ffc5d73
2020-09-24 09:46:59 -07:00
Jun Wu
ebf708e17a pyedenapi: switch to async_runtime::block_on_future
Summary:
This simplifies the code a bit, and avoids creating tokio Runtime multiple
times.

Reviewed By: kulshrax

Differential Revision: D23799642

fbshipit-source-id: 21cee6124ef6f9ab6e165891d9ee87b2feb553ac
2020-09-21 13:28:07 -07:00
Jun Wu
186151e8f9 pyedenapi: return commit data in a stream fashion
Summary:
Exercises the PyStream type from cpython-async.

`hg dbsh`:

  In [1]: s,f=api._rustclient.commitdata('fbsource', list(repo.nodes('master^^::master')))

  In [2]: s
  Out[2]: <stream at 0x7ff2db700690>

  In [3]: it=iter(s)

  In [4]: next(it)
  Out[4]: ('6\xf9\x18\xe4\x1c\x05\xfc\xb0\xd3\xb2\xe9\xec\x18E\xec\x0f\x1a:\xb7\xcd', ...)

  In [5]: next(it)
  Out[5]: ('}\x1f(\xe1o\xf1a\x9b\x81\xb9\x83}\x1b\xbbt\xd2e\xb1\xedb',...)

  In [6]: next(it)
  Out[6]: ('\xf1\xf0f\x97<\xf3\xdd\xe41w>\x92\xd1\xc0\x9ah\xdd\x87~^',...)

  In [7]: next(it)
  StopIteration:

  In [8]: f.wait()
  Out[8]: <bindings.edenapi.stats at 0x7ff2e006a3d8>

  In [9]: str(Out[8])
  Out[9]: '2.42 kB downloaded in 165 ms over 1 request (0.01 MB/s; latency: 165 ms)'

  In [10]: iter(s)
  ValueError: stream was consumed
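
The one-shot behaviour in the session above can be sketched in pure Python. This is only an illustration: `OneShotStream` is a hypothetical name and is not the class exposed by cpython-async.

```python
class OneShotStream:
    """Hypothetical stand-in for a stream that can be iterated only once."""

    def __init__(self, items):
        self._items = iter(items)
        self._consumed = False

    def __iter__(self):
        # A second iter() call fails, mirroring "stream was consumed".
        if self._consumed:
            raise ValueError("stream was consumed")
        self._consumed = True
        return self._items


s = OneShotStream([b"a", b"b", b"c"])
first_pass = list(s)
try:
    iter(s)
    second_error = None
except ValueError as e:
    second_error = str(e)
```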

Reviewed By: kulshrax

Differential Revision: D23799645

fbshipit-source-id: 732a5da4ccdee4646386b6080408c0d8958dd67f
2020-09-21 13:28:07 -07:00
Jun Wu
cd7f831c6c pyedenapi: return a Future of Stats for commitdata
Summary:
Exercises the PyFuture type from cpython-async.

`hg dbsh`:

    In [1]: api._rustclient.commitdata('fbsource', list(repo.nodes('master^^::master')))
    Out[1]:
    ([...], <future at 0x7f7b65d05060>)

    In [2]: f=Out[1][-1]

    In [3]: f.wait()
    Out[3]: <bindings.edenapi.stats at 0x7f7b665e8228>

    In [4]: f.wait()
    ValueError: future was awaited

    In [5]: str(Out[3])
    Out[5]: '2.42 kB downloaded in 172 ms over 1 request (0.01 MB/s; latency: 171 ms)'
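
A rough pure-Python analogue of the "wait once" semantics shown above. `OneShotFuture` is a hypothetical name, and the background thread is only a stand-in for the Rust async runtime.

```python
import threading


class OneShotFuture:
    """Hypothetical future whose result can be taken exactly once."""

    def __init__(self, fn):
        self._result = None
        self._awaited = False
        self._thread = threading.Thread(target=self._run, args=(fn,))
        self._thread.start()

    def _run(self, fn):
        self._result = fn()

    def wait(self):
        # A second wait() fails, mirroring "future was awaited".
        if self._awaited:
            raise ValueError("future was awaited")
        self._awaited = True
        self._thread.join()
        return self._result


f = OneShotFuture(lambda: 42)
result = f.wait()
try:
    f.wait()
    second_error = None
except ValueError as e:
    second_error = str(e)
```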

Reviewed By: kulshrax

Differential Revision: D23799643

fbshipit-source-id: d4fcef7dca58bc4902bb0809adc065493bb94bd3
2020-09-21 13:28:07 -07:00
Durham Goode
63d19e1eca workers: bulk fetch data in worker thread
Summary:
During an hg update we first prefetch all the data, then write all the
data to disk. There are cases where the prefetched data is not available during
the writing phase, in which case we fall back to fetching the files one-by-one.
This has truly atrocious performance.

Let's allow the worker threads to check for missing data then do bulk fetching
of it. In the case where the cache was completely lost for some reason, this
would reduce the number of serial fetches by 100x.

Note, the background workers already spawn their own ssh connections, so
they're already getting some level of parallelism even when they're doing 1-by-1
fetching. That's why we aren't seeing a 100x improvement in performance.
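
A minimal sketch of the idea, not the worker code itself: find the keys missing from the local store, then fetch them in batches instead of one at a time. The function names, the dict-based store and the batch size of 100 are all assumptions made for illustration.

```python
def fetch_missing_in_bulk(keys, local_store, remote_fetch, batch_size=100):
    """Fetch keys absent from local_store in batches; return the round-trip count."""
    missing = [k for k in keys if k not in local_store]
    round_trips = 0
    for start in range(0, len(missing), batch_size):
        batch = missing[start:start + batch_size]
        local_store.update(remote_fetch(batch))
        round_trips += 1
    return round_trips


def fake_remote(batch):
    # Toy remote: returns a dict with made-up content for each requested key.
    return {k: b"data:" + k.encode() for k in batch}


store = {"f0": b"cached"}
keys = ["f%d" % i for i in range(250)]
trips = fetch_missing_in_bulk(keys, store, fake_remote)
```

Here 249 missing keys cost 3 round trips rather than 249 serial fetches.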

Reviewed By: xavierd

Differential Revision: D23766424

fbshipit-source-id: d88a1e55b1c21e9cea7e50fc6dbfd8a27bd97bb0
2020-09-21 11:27:12 -07:00
Jun Wu
c4e2f5cb0f bindings: add sleep for testing blocking Rust functions
Summary: This will be used to test Ctrl+C handling with native code.

Reviewed By: kulshrax

Differential Revision: D23759714

fbshipit-source-id: 50da40d475b80da26b7dbc654e010d77cb0ad2d1
2020-09-18 13:28:33 -07:00
Jun Wu
6cb78fa90c pyedenapi: expose API querying hg commit data
Summary: This makes it easier to test the API via debugshell.

Reviewed By: kulshrax

Differential Revision: D23750677

fbshipit-source-id: e29284395f03c9848cf90dd2df187e437890c56e
2020-09-18 13:28:33 -07:00
Durham Goode
cbe4499da8 treemanifest: add option for instantiating a Rust treemanifest store
Summary:
Adds the initial condition and creation logic for creating a Rust
treemanifest store. Fetching and some other code paths don't work just yet, but
subsequent diffs enable more and more functionality.

Reviewed By: quark-zju

Differential Revision: D23662052

fbshipit-source-id: a0e7090c9a3bf27a7738bf093f2d4eb6098b1ed6
2020-09-17 10:16:03 -07:00
Durham Goode
6ae1cf9619 revisionstore: add refresh function
Summary:
The rust pack stores currently have logic to refresh their list of
packs if there's a key miss and if it's been a while since we last loaded the
list of packs. In some cases we want to manually trigger this refresh, like if
we're in the middle of a histedit and it invokes an external command that
produces pack files that the histedit should later consume (like an external
amend, that histedit then needs to work on top of).

Python pack stores solve this by allowing callers to mark the store for a
refresh. Let's add the same logic for rust stores. Once pack files are gone we
can delete this.

This will be useful for the upcoming migration of treemanifest to Rust
contentstore. Filelog usage of the Rust contentstore avoided this issue by
recreating the entire contentstore object in certain situations, but refresh
seems useful and less expensive.
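
As a sketch of the two refresh paths described above (automatic rescan on a stale key miss, plus a manual refresh request), assuming a hypothetical `PackStore` whose pack list is a plain dict. None of these names come from revisionstore.

```python
import time


class PackStore:
    """Hypothetical pack store that rescans its pack list on demand."""

    def __init__(self, list_packs, min_rescan_interval=10.0, clock=time.monotonic):
        self._list_packs = list_packs  # callable returning {key: value}
        self._interval = min_rescan_interval
        self._clock = clock
        self._needs_refresh = False
        self._rescan()

    def _rescan(self):
        self._packs = dict(self._list_packs())
        self._last_scan = self._clock()
        self._needs_refresh = False

    def refresh(self):
        # The caller knows new packs may exist (e.g. an external command ran).
        self._needs_refresh = True

    def get(self, key):
        if self._needs_refresh:
            self._rescan()
        stale = self._clock() - self._last_scan >= self._interval
        if key not in self._packs and stale:
            # Key miss and the list is old enough: rescan automatically.
            self._rescan()
        return self._packs.get(key)


on_disk = {"a": 1}
store = PackStore(lambda: on_disk, min_rescan_interval=3600.0)
on_disk["b"] = 2             # an external command writes a new pack
before = store.get("b")      # the list is too fresh for an automatic rescan
store.refresh()
after = store.get("b")       # the manual refresh forces a rescan
```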

Reviewed By: quark-zju

Differential Revision: D23657036

fbshipit-source-id: 7c6438024c3d642bd22256a8e58961a6ee4bc867
2020-09-17 10:16:03 -07:00
Durham Goode
dd387dd0d1 mutablepacks: only create mutable history packs when needed
Summary:
Previously the MetadataStore would always construct a mutable pack, even
if the operation was readonly. This meant all read commands required write
access. It also means that random .tmp files get scattered all over the place
when the rust structures are not properly destructed (like if python doesn't
bother doing the final gc to call destructors for the Rust types).

Let's just only create mutable packs when we actually need them.
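
The lazy creation can be illustrated roughly as follows. `LazyMutablePack` is a hypothetical class, and a temporary file stands in for the real pack format: nothing touches the disk until the first write.

```python
import os
import tempfile


class LazyMutablePack:
    """Hypothetical pack that creates its .tmp file only on first write."""

    def __init__(self, directory):
        self._directory = directory
        self._file = None  # read-only users never create a file

    def add(self, data):
        if self._file is None:
            self._file = tempfile.NamedTemporaryFile(
                dir=self._directory, suffix=".tmp", delete=False
            )
        self._file.write(data)

    def close(self):
        if self._file is not None:
            self._file.close()
            self._file = None


with tempfile.TemporaryDirectory() as d:
    pack = LazyMutablePack(d)
    files_after_open = os.listdir(d)    # still empty: no write happened yet
    pack.add(b"x")
    files_after_write = os.listdir(d)   # the first write created the file
    pack.close()
```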

Reviewed By: quark-zju

Differential Revision: D23219961

fbshipit-source-id: a47f3d94f70adac1f2ee763f3170ed582ef01a14
2020-09-16 21:39:25 -07:00
Durham Goode
1f5835e70a mutablepacks: only create mutable data packs when needed
Summary:
Previously the ContentStore would always construct a mutable pack, even
if the operation was readonly. This meant all read commands required write
access. It also means that random .tmp files get scattered all over the place
when the rust structures are not properly destructed (like if python doesn't
bother doing the final gc to call destructors for the Rust types).

Let's just only create mutable packs when we actually need them.

Reviewed By: quark-zju

Differential Revision: D23219962

fbshipit-source-id: 573844f81966d36ad324df03eecec3711c14eafe
2020-09-16 21:39:25 -07:00
Thomas Orozco
21290702e1 third-party/rust: import async-compression + update zstd
Summary:
This imports the async-compression crate. We have an equivalent-ish in
common/rust, but it targets Tokio 0.1, whereas this community-supported crate
targets Tokio 0.2 (it offers a richer API, notably in the sense that we
can use it for Streams, whereas the async-compression crate we have is only for
AsyncWrite).

In the immediate term, I'd like to use this for transfer compression in
Mononoke's LFS Server. In the future, we might also use it in Mononoke where we
currently use our own async compression crate when all that stuff moves to
Tokio 0.2.

Finally, this also updates zstd: the version we link to from tp2 is actually
zstd 1.4.5, so it's a good idea to just get the same version of the zstd crate.

The zstd crate doesn't keep a great changelog, so it's hard to tell what has changed.
At a glance, it looks like the answer is not much, but I'm going to look to Sandcastle
to root out potential issues here.

Reviewed By: StanislavGlebik

Differential Revision: D23652335

fbshipit-source-id: e250cef7a52d640bbbcccd72448fd2d4f548a48a
2020-09-15 07:59:53 -07:00
Xavier Deguillard
ed4021b8e3 revisionstore: disallow reading LFS pointers from packfiles
Summary:
For repositories that have the old-style LFS extension enabled, the pointers
are stored in packfiles/indexedlog alongside a flag that signifies to the
upper layers that the blob is externally stored. With the new way of doing LFS,
pointers are stored separately.

When both are enabled, we are observing some interesting behavior where
different get and get_meta calls may return different blobs/metadata for the
same filenode. This may happen if a filenode is stored in both a packfile as an
LFS pointer, and in the LFS store. Guaranteeing that the revisionstore code is
deterministic in this situation is unfortunately way too costly (a get_meta
call would for instance have to fully validate the sha256 of the blob, and this
wouldn't guarantee that it wouldn't become corrupted on disk before calling
get).

The solution taken here is to simply ignore all the LFS pointers from
packfiles/indexedlog when remotefilelog.lfs is enabled. This way, there is no
risk of reading the metadata from the packfiles, and the blob from the
LFSStore. This however brings another complication for user-created blobs:
these are stored in packfiles and would thus become unreadable. The solution is
to simply perform a one-time full repack of the local store to make sure that
all the pointers are moved from the packfiles to the LFSStore.

In the code, the Python bindings are using ExtStoredPolicy::Ignore directly as
these are only used in the treemanifest code where no LFS pointers should be
present, the repack code uses ExtStoredPolicy::Use to be able to read the
pointers, it wouldn't be able to otherwise.

Reviewed By: DurhamG

Differential Revision: D22951598

fbshipit-source-id: 0e929708ba5a3bb2a02c0891fd62dae1ccf18204
2020-09-09 18:27:42 -07:00
Stefan Filip
de9b34e83a bindings: add pyhgmetrics to bind the hg-metrics crate
Summary: Exposing the hg-metrics crate to the Python application.

Reviewed By: quark-zju

Differential Revision: D23577875

fbshipit-source-id: 1d919160f8514ae8bfcb0171a0c9d1d9d0de80e6
2020-09-09 17:35:48 -07:00
David Tolnay
e83e05ff25 Update formatter to rustfmt 2.0
Reviewed By: zertosh

Differential Revision: D23591028

fbshipit-source-id: f458503fc2b9c25023fa1643eca5e166882a4811
2020-09-09 07:52:34 -07:00
Durham Goode
2919268555 revisionstore: auto-delete when we have too much pack data
Summary:
In order to keep the hgcache size bounded we need to keep track of pack
file size even during normal operations and delete excess packs.

This has the negative side effect of deleting necessary data if the operation is
legitimately huge, but we'd rather have extra downloading time than fill up the
entire disk.
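
One possible shape of such a size bound, sketched under assumptions: the commit does not say which packs are deleted first, so the oldest-first order here is a guess, and `packs_to_delete` is a hypothetical helper.

```python
def packs_to_delete(packs, limit_bytes):
    """packs: list of (name, size_bytes, mtime) tuples.

    Return the names to delete, oldest first, until the remaining
    total size fits within limit_bytes.
    """
    total = sum(size for _, size, _ in packs)
    doomed = []
    for name, size, _ in sorted(packs, key=lambda p: p[2]):
        if total <= limit_bytes:
            break
        doomed.append(name)
        total -= size
    return doomed


packs = [("new", 40, 300), ("old", 50, 100), ("mid", 30, 200)]
doomed = packs_to_delete(packs, limit_bytes=60)
```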

Reviewed By: quark-zju

Differential Revision: D23486922

fbshipit-source-id: d21be095a8671d2bfc794c85918f796358dc4834
2020-09-08 11:33:50 -07:00
Durham Goode
651a0690be revisionstore: auto-commit datapacks when they get large
Summary:
As the repository grows the opportunity for large downloads increases.
Today all writes to data packs get sent straight to disk, but we have no way to
prevent this from eating all the disk.

Let's automatically flush datapacks when they reach a certain size (default
4GB). In a future diff this will let us automatically garbage collect data packs
to bound the maximum size of packs.
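
A toy version of the size-triggered flush. `AutoFlushingPack` is hypothetical, and an in-memory list stands in for the pack file that the real code writes to disk.

```python
class AutoFlushingPack:
    """Hypothetical pack that commits itself once it reaches max_bytes."""

    def __init__(self, max_bytes, flush):
        self._max_bytes = max_bytes
        self._flush = flush  # callable that persists a list of entries
        self._entries = []
        self._size = 0

    def add(self, data):
        self._entries.append(data)
        self._size += len(data)
        if self._size >= self._max_bytes:
            self.flush()

    def flush(self):
        if self._entries:
            self._flush(self._entries)
            self._entries = []  # start a fresh pack
            self._size = 0


flushed = []
pack = AutoFlushingPack(max_bytes=10, flush=flushed.append)
for chunk in [b"aaaa", b"bbbb", b"cccc", b"dd"]:
    pack.add(chunk)
```

The third chunk pushes the pack over the threshold, so the first three chunks are flushed together and the last one starts a new pack.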

Rotatelog already has this behavior.

Reviewed By: quark-zju

Differential Revision: D23478780

fbshipit-source-id: 14f9f707e8bffc59260c2d04c18b1e4f6bdb2f90
2020-09-08 11:33:50 -07:00
David Tolnay
e62b176170 Prepare for rustfmt 2.0
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).

This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.

 ---

*Why now?* **:** The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land to the 2.0 branch.

 ---

Reviewed By: zertosh

Differential Revision: D23568779

fbshipit-source-id: 477200f35b280a4f6471d8e574e37e5f57917baf
2020-09-07 20:47:59 -07:00
Durham Goode
8b91cccc8b remotefilelog: log undesired filename fetches
Summary:
Now that the Rust revisionstore records undesired filename fetches,
let's log those results to Scuba in Python.

Reviewed By: StanislavGlebik

Differential Revision: D23462572

fbshipit-source-id: b55f2290e30e3a5c3b67d9f612b24bc3aad403a8
2020-09-04 14:55:15 -07:00
Jun Wu
a90c8ea775 bindings: export rust process handling to Python
Summary:
Spawning processes turns out to be tricky.

Python 2:

- "fork & exec" in plain Python is potentially dangerous. See D22855986 (c35b8088ef).
  Disabling GC might have solved it, but still seems fragile.
- "close_fds=True" works on Windows if there is no redirection.
- Does not work well with `disable_standard_handle_inheritability` from `hgmain`.
  We patched it. See `contrib/python2-winbuild/0002-windows-make-subprocess-work-with-non-inheritable-st.patch`.

Python 3:

- "subprocess" uses native code for "fork & exec". It's safer.
- (>= 3.8) "close_fds=True" works on Windows even with redirection.
- "subprocess" exposes options to tweak low-level details on Windows.

Rust:

- No "close_fds=True" support for both Windows and Unix.
- Does not have the `disable_standard_handle_inheritability` issue on Windows.
- Impossible to cleanly support "close_fds=True" on Windows with existing stdlib.
  https://github.com/rust-lang/rust/pull/75551 attempts to add that to stdlib.
  D23124167 provides a short-term solution that can have corner cases.

Mercurial:

- `win32.spawndetached` uses raw Win32 APIs to spawn processes, bypassing
  the `subprocess` Python stdlib.
- Its use of `CreateProcessA` is undesirable. We probably want `CreateProcessW`
  (unless `CreateProcessA` speaks utf-8 natively).

We are still on Python 2 on Windows, and we'd need to spawn processes correctly
from Rust anyway, and D23124167 kind of fills the missing feature of `close_fds=True`
from Python. So let's expose the Rust APIs.

The binding APIs closely match the Rust API. So when we migrate from Python to
Rust, the translation is more straightforward.

Reviewed By: DurhamG

Differential Revision: D23124168

fbshipit-source-id: 94a404f19326e9b4cca7661da07a4b4c55bcc395
2020-08-31 17:34:48 -07:00
Jun Wu
01c551bb30 hgcommits: add flush_commit_data API
Summary: This would be used to avoid excessive memory usage during pull.

Reviewed By: DurhamG

Differential Revision: D23408833

fbshipit-source-id: 8edd95ab8201697074f65cc118d14755a230567d
2020-08-31 11:57:53 -07:00
Jun Wu
0705bd3b8d pydag: use dag::delegate to simplify code
Summary: This makes the code simpler.

Reviewed By: sfilipco

Differential Revision: D23269858

fbshipit-source-id: bb9ac0bd1696f7429ca1856e6c63e04fabc2757a
2020-08-26 15:32:26 -07:00
Jun Wu
ded7c2e380 hgcommits: add explain_internals to print human-readable segments
Summary: Provide a way to see segments.

Reviewed By: sfilipco

Differential Revision: D23196408

fbshipit-source-id: b1418f945a5a3364ac73b0f97466d973dd4b6300
2020-08-26 15:32:24 -07:00
Jun Wu
bd7769b34a dag: rename snapshot_dag to dag_snapshot
Summary: This is more consistent with `id_map_snapshot`.

Reviewed By: sfilipco

Differential Revision: D23182519

fbshipit-source-id: 62b7fc8bfdc9d6b3a4639a6518ea084c7f3807dd
2020-08-26 15:32:22 -07:00
Pavel Aslanov
69e57b232d fix panic in slice index
Summary:
Based on [user report](https://fb.workplace.com/groups/scm/permalink/3128221090560823/).
Note that slices in Rust behave differently: if an index exceeds the slice size, this will always panic. My fix is based on the assumption that the behavior should be similar to Python.
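
For reference, the Python behavior being matched is that out-of-range slice bounds are clamped rather than rejected. The sketch below is not the fix itself; `clamp_slice` is a hypothetical helper that handles non-negative bounds only.

```python
def clamp_slice(start, stop, length):
    """Clamp [start, stop) into [0, length], as Python slicing does
    for non-negative bounds."""
    start = min(max(start, 0), length)
    stop = min(max(stop, start), length)
    return start, stop


items = [10, 20, 30]
start, stop = clamp_slice(1, 99, len(items))
clamped = items[start:stop]
```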

Reviewed By: quark-zju

Differential Revision: D23263922

fbshipit-source-id: 3d2a1a1b59f14e43b1f1a2b7102982b11637c0b4
2020-08-24 05:24:58 -07:00
Jun Wu
749602e534 hgcommits: add gitsegments backend
Summary:
The backend translates git commit graph to segments. It's useful for
benchmarking on git commit graphs.

Reviewed By: DurhamG

Differential Revision: D23095470

fbshipit-source-id: 21a28869e91ef8f38bbf9925443eb4ac26f05e3d
2020-08-21 13:00:45 -07:00
Jun Wu
d352133d6d hgcommits: use concrete error types
Summary: Migrate to concrete types so it can be typechecked.

Reviewed By: DurhamG

Differential Revision: D23095469

fbshipit-source-id: 27c6da30ca8a1329df544cd2ded7d9734593e48a
2020-08-21 13:00:45 -07:00
Jun Wu
f26dfc7d46 pymutationstore: make getdag support selecting successors or predecessors
Summary: Expose the Rust API so `getdag` can choose to skip successors or predecessors.

Reviewed By: markbt

Differential Revision: D23036056

fbshipit-source-id: 30cd437c5420d2d10176e33ef9de98814046f4ce
2020-08-21 13:00:45 -07:00
Jun Wu
45db3bbf96 mutationstore: add a native path to calculate 'obsolete()'
Summary:
The new path does not calculate the complicated `successorssets`, and is
known to make wez's repo operations significantly faster (which, I suspect is
slowed by a very long chain).

The new code is about 3x faster on my repo too:

  # before
  In [1]: list(repo.nodes('draft()'))
  In [2]: %time len(m.mutation.obsoletenodes(repo))
  CPU times: user 246 ms, sys: 42.2 ms, total: 288 ms
  Wall time: 316 ms
  Out[2]: 1127

  # after
  In [1]: list(repo.nodes('draft()'))
  In [2]: %time len(m.mutation.obsoletenodes(repo))
  CPU times: user 74.3 ms, sys: 7.92 ms, total: 82.3 ms
  Wall time: 82.3 ms
  Out[2]: 1127

Reviewed By: markbt

Differential Revision: D23036063

fbshipit-source-id: afd6ac122bb5d8d513b5cdc033e04d2c377286eb
2020-08-21 13:00:45 -07:00
Jun Wu
adf027742e nameset: add flatten API
Summary: This will be useful for the `obsolete()` set.

Reviewed By: sfilipco

Differential Revision: D23036072

fbshipit-source-id: 2f944ef31cf19f902622d90545fa02b7dda89221
2020-08-21 13:00:45 -07:00
Jun Wu
0ac5f05097 nameset: use real dag snapshot instead of a pointer in hints
Summary:
This trades a bit of performance (calculating the snapshot) for correctness (no
pointer reuse issues) and convenience (set captures dag information with them
and enables use-cases like converting NameSet from another dag to the
current dag without requiring extra `dag` objects).

Reviewed By: sfilipco

Differential Revision: D23036067

fbshipit-source-id: 2e691f09ad401ba79dbc635e908d79e54dadca5e
2020-08-21 13:00:45 -07:00
Jun Wu
f666cb1cf0 dag: add DagAlgorithm::snapshot_dag
Summary:
This API allows the underlying Dag to provide a snapshot. The snapshot can then
be used in places that do not want a lifetime (ex. NameSet).

Reviewed By: sfilipco

Differential Revision: D22970579

fbshipit-source-id: ededff82009fd5b4583f871eef084ec907b45d33
2020-08-21 13:00:45 -07:00
Jun Wu
741d050f10 dag: drop inverse DAG
Summary:
The only intended use of the inverse DAG is to implement the Python dag
interface in `dagutil.py`. D22519589 (2d4d44cf3d) stack changed it so the Python dag
interface becomes optional. Therefore there is no need to keep the inverse DAG
interface, which is a bit tricky on sorting.

Reviewed By: sfilipco

Differential Revision: D22970581

fbshipit-source-id: 58a126b41d992e75beaf76ece25cb578ee84760b
2020-08-21 13:00:45 -07:00
Jun Wu
fa25f42fea pydag: add an API to migrate from one DAG to segmented DAG
Summary:
This will be used for migrating revlog DAG to segmented changelog. It does not
migrate commit text data (which can take 10+ minutes).

Reviewed By: DurhamG, sfilipco

Differential Revision: D22970582

fbshipit-source-id: 125a8726d48e15ceb06edb139d6d5b2fc132a32c
2020-08-21 13:00:45 -07:00
Jun Wu
1024afc05a pydag: update bindings
Summary: Update bindings to expose the DoubleWrite backend and the DescribeBackend API.

Reviewed By: sfilipco

Differential Revision: D22970574

fbshipit-source-id: bdb52ff21dd0b9ffa0be214b4a4824025f460092
2020-08-21 13:00:45 -07:00
Durham Goode
33a634167e dynamicconfig: support a disallowlist config
Summary:
This new disallowlist will let us specify config section.key's which
should not be accepted from old rc files. This will let us incrementally disable
loading of those configs from the old files, which will then let us delete them
from the old rc's and eventually delete the old rc's entirely.

This diff also removes hgrc.local and hgrc.od from the list of configs we
verify, since those are not on the list of configs that need to be removed in
this initiative.
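
A small sketch of how a disallowlist of `section.key` names might be applied to values read from an old rc file. The function name and the dict representation are assumptions, and the config names are arbitrary examples.

```python
def filter_legacy_config(legacy, disallowlist):
    """legacy: {(section, key): value}; disallowlist: set of "section.key" strings."""
    allowed = {}
    for (section, key), value in legacy.items():
        if "%s.%s" % (section, key) in disallowlist:
            continue  # this key is no longer accepted from the old rc file
        allowed[(section, key)] = value
    return allowed


legacy = {
    ("ui", "username"): "alice",
    ("remotefilelog", "cachepath"): "/tmp/cache",
}
kept = filter_legacy_config(legacy, {"remotefilelog.cachepath"})
```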

Reviewed By: quark-zju

Differential Revision: D23065595

fbshipit-source-id: 5cd742d099efd651174cab5e87bb7cdc4bae8054
2020-08-16 16:56:00 -07:00
Durham Goode
2da121cb60 configs: add rust support for loading dynamic and repo configs
Summary:
This threads the calls to load_dynamic and load_repo through the Rust
layer up to the Python bindings. This diff does 2 notable things:

1. It adds a reload API for reloading configs in place, versus creating a new
one. This will be used in localrepo.__init__ to construct a new config for the
repo while still maintaining the old pinned values from the copied ui.
2. It threads a repo path and readonly config list from Python down to the Rust
code. This allows load_dynamic and load_repo to operate on the repo path, and
allows the readonly filter to be applied to all configs during reloading.

Reviewed By: quark-zju

Differential Revision: D22712623

fbshipit-source-id: a0f372f4971c5feac2f20e89a0fb3fe6d4a65d6f
2020-08-16 16:56:00 -07:00
Durham Goode
0b123ba41d configs: move Rust dynamicconfig generation into configparser::hg
Summary:
As part of moving all hg config loading and generation logic into Rust,
let's move the config generation logic from hgcommands and pyconfigparser to
configparser, unifying them at the same time.

Future diffs will move config loading in as well.

Reviewed By: quark-zju

Differential Revision: D22590208

fbshipit-source-id: d1760c404a6a5c57347df30713c20de55cfdb9a4
2020-08-16 16:55:59 -07:00
Durham Goode
7ff28d3e1c configs: move dynamicconfig into configparser
Summary:
A future diff will unify all config loading into configparser::hg, but
to do so we need dynamicconfig to live in configparser, so it can load
dynamicconfigs. Let's move everything in.

Reviewed By: quark-zju

Differential Revision: D22587237

fbshipit-source-id: 5613094175b6e1597aa113ee3e6d92ce7ec79f6d
2020-08-16 16:55:59 -07:00
Durham Goode
a40331be8d configs: unify system+user config loading into pure rust layer
Summary:
We had two spots that loaded system and user configs, one in the
pyconfigparser layer, and one in the pure rust config layer. In an upcoming diff
I'd like to move dynamicconfig loading down into the pure rust layer, so let's
unify these.

Reviewed By: quark-zju

Differential Revision: D22585554

fbshipit-source-id: 0cea7801ae1d5a3a3c12b80ee23b37f9e690e2bc
2020-08-16 16:55:59 -07:00
Durham Goode
3129f032a4 contentstore: make history rotatelog size configurable
Summary:
In a future diff we'll increase the size of the rotatelog temporarily
during clones. To do so we need it to be configurable.

Reviewed By: quark-zju

Differential Revision: D23089539

fbshipit-source-id: ebfc3beaf3c0fe5b01b87d97c19455b0a24afa72
2020-08-16 16:44:16 -07:00
Durham Goode
b821ab3766 contentstore: make data rotatelog size configurable
Summary:
In a future diff we'll increase the size of the rotatelog temporarily
during clones. To do so we need it to be configurable.

Reviewed By: quark-zju

Differential Revision: D23089541

fbshipit-source-id: 5010e417a83a2611283322f1dbb7023f4286f503
2020-08-16 16:44:16 -07:00
Jun Wu
2db783bed8 revlogindex: make parent_revs fallible
Summary: If parent_revs gets an out-of-bound rev, it should fail.

Reviewed By: sfilipco

Differential Revision: D23036071

fbshipit-source-id: 7fae0fd5adf07ac3c933a29d7d06289d8d740c60
2020-08-14 22:00:26 -07:00
Meyer Jacobs
b9ce375f36 edenapi: Split DataEntry into FileEntry and TreeEntry
Summary:
The primary change is in `eden/scm/lib/edenapi/types`:
* Split `DataEntry` into `FileEntry` and `TreeEntry`.
* Split `DataError` into `FileError` and `TreeError`. Remove `Redacted` error variant from `TreeError` and `MaybeHybridManifest` error variant from `FileError`.
* Split `DataRequest`, `DataResponse` into appropriate File and Tree types.
* Refactor `data.rs` into `file.rs` and `tree.rs`.
* Lift `InvalidHgId` error, used by both File and Tree, into `lib.rs`.
* Bugfix: change `MaybeHybridManifest` to be returned only for hash mismatches with empty paths, to match documented behavior.

Most of the remaining changes are straightforward fallout of this split. Notable changes include:
* `eden/scm/lib/edenapi/tools/read_res`: I've split the "data" commands into "file" and "tree", but I've left the identical arguments sharing the same argument structs. These can be refactored later if / when they diverge.
* `eden/scm/lib/types/src/hgid.rs`: Moved `compute_hgid` from `eden/scm/lib/edenapi/types/src/data.rs` to a new `from_content` constructor on the `HgId` struct.
* `eden/scm/lib/revisionstore/src/datastore.rs`: Split `add_entry` method on `HgIdMutableDeltaStore` trait into `add_file` and `add_tree` methods.
*  `eden/scm/lib/revisionstore/src/edenapi`
    * `mod.rs`: Split `prefetch` method on `EdenApiStoreKind` into `prefetch_files` and `prefetch_trees`, which are given a default implementation that fails with `unimplemented!`.
    * `data.rs`: Replace blanket trait implementations for `EdenApiDataStore<T>` with specific implementations for `EdenApiDataStore<File>` and `EdenApiDataStore<Tree>` which call the appropriate fetch and add functions.
    * `data.rs` `test_get_*`: Replace dummy hashes with real hashes. These tests were only passing due to the hash mismatches (incorrectly) being considered `MaybeHybridManifest` errors, and allowed to pass.

Reviewed By: kulshrax

Differential Revision: D22958373

fbshipit-source-id: 788baaad4d9be20686d527f819a7342678740bc3
2020-08-13 10:01:40 -07:00
Adam Simpkins
0cb0a0bb2a begin adding some type stubs for the Rust Python bindings
Summary: Begin adding some initial type annotations for the Rust Python bindings.

Reviewed By: quark-zju

Differential Revision: D22993222

fbshipit-source-id: 2073db93b22f6bb04e30b767594d435c36ddb17f
2020-08-11 21:45:04 -07:00
Meyer Jacobs
b9f3c9c692 taggederror: Introduce taggederror-util for more ergonomic error tagging for eden error types.
Summary:
Introduce taggederror-util, which provides a new trait `AnyhowEdenExt`, which provides a method `eden_metadata` for anyhow errors and results. This method works much like `AnyhowExt::common_metadata`, but additionally supports extracting default error metadata from known `Tagged` types which are listed explicitly in the method implementation.

Extend `FilteredAnyhow` to support a configuration "metadata function", which allows swapping out `eden_metadata` for the standard `common_metadata`.

Modify Rust dispatch and Python bindings to use `AnyhowEdenExt` for metadata extraction and printing.

Modify `intentional_error` to rely on `AnyhowEdenExt` for tagging (removes `.tagged` call, no tags will be visible if `AnyhowEdenExt` is not used).

Reviewed By: DurhamG

Differential Revision: D22927203

fbshipit-source-id: 04b36fdfaa24af591118acb9e418d1ed7ae33f91
2020-08-06 19:37:25 -07:00
Jun Wu
6fd7a2e582 dag: use concrete error types
Summary:
This is more complex than previous libraries, mainly because `dag` defines APIs
(traits) used by other code, which might raise error types that are of no
interest to `dag` itself. `BackendError::Other(anyhow::Error)` is currently used to
capture types that do not fit in `dag`'s predefined error types.

Reviewed By: sfilipco

Differential Revision: D22883865

fbshipit-source-id: 3699e14775f335620eec28faa9a05c3cc750e1d1
2020-08-06 12:31:57 -07:00
Jun Wu
8d0f48c4da dag: rename some anyhow::Result to dag::Result
Summary:
Prefix some `Result` with `dag::Result`. Since `dag::Result` is just
`anyhow::Result` for now, this does not change anything but makes
it more compatible with upcoming changes.

Reviewed By: sfilipco

Differential Revision: D22883864

fbshipit-source-id: 95a26897ed026f1bb8000b7caddeb461dcaad0e7
2020-08-06 12:31:57 -07:00
Jun Wu
ff9c979b07 revlogindex: use concrete error types
Summary:
All dependencies of revlogindex have migrated to concrete error types.
Let's migrate revlogindex itself. This allows compile-time type checks
and makes the error returned by revlogindex APIs more predictable.

Reviewed By: sfilipco

Differential Revision: D22857554

fbshipit-source-id: 7d32599508ad682c6e9c827d4599e6ed0769899c
2020-08-06 12:31:57 -07:00
Jun Wu
64d4f5743f dag: delegate reachable_root to inner implementations
Summary: Otherwise the default implementation will be used.

Reviewed By: sfilipco

Differential Revision: D22657206

fbshipit-source-id: dea31149efe41cb3d9e30b33c138e437dce8011e
2020-07-30 20:32:37 -07:00
Jun Wu
bdf0655def pydag: expose Rust reachableroots
Summary: So reachableroots can be called from Python.

Reviewed By: sfilipco

Differential Revision: D22657186

fbshipit-source-id: 36b1b5ed1e32c88bb07e6c7c7e0a7ca89e0751a3
2020-07-30 20:32:37 -07:00
Jun Wu
fcc78319a0 revlogindex: use dedicated error type for missing commits
Summary:
This replaces RustError that might happen during `addcommits`, and allows us to
handle it without having a stacktrace.

Reviewed By: DurhamG

Differential Revision: D22539564

fbshipit-source-id: 356814b9baf0b31528dfc92d62b0dcf352bc1e24
2020-07-30 20:32:33 -07:00
Jun Wu
e35b18923a pydag: implement nameset.__or__
Summary: It's the same as `__add__`. It's consistent with the revset language.

Reviewed By: sfilipco

Differential Revision: D22638456

fbshipit-source-id: 928177d553220461192650f4792ac39cadd57dc2
2020-07-30 20:32:32 -07:00
Jun Wu
a02c93864f dag: add ANCESTORS hint
Summary:
The hint indicates a set `X` is equivalent to `ancestors(X)`.

This allows us to make `heads` use `heads_ancestors` (which is faster in
segmented changelog) automatically without affecting correctness. It also
makes special queries like `ancestors(all())` super cheap because it'll just
return `all()` as-is.
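
A minimal Python sketch of the short-circuit described above. It is not the `dag` crate's Rust implementation; `HintedSet`, `ancestors_hint`, and the `PARENTS` table are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HintedSet:
    """Hypothetical stand-in for a dag set that carries optimization hints."""
    items: frozenset
    ancestors_hint: bool = False  # True means: this set equals ancestors(this set)


def ancestors(parents, s):
    """Return the ancestors of `s` (inclusive), short-circuiting on the hint."""
    if s.ancestors_hint:
        return s  # already closed under "parent of"; no traversal needed
    seen = set()
    stack = list(s.items)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    # The result of an ancestors query is itself closed, so it carries the hint.
    return HintedSet(frozenset(seen), ancestors_hint=True)


# Linear history: a <- b <- c
PARENTS = {"a": [], "b": ["a"], "c": ["b"]}
all_commits = HintedSet(frozenset(PARENTS), ancestors_hint=True)
same = ancestors(PARENTS, all_commits)  # returned as-is, no graph walk
```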

Reviewed By: sfilipco

Differential Revision: D22638463

fbshipit-source-id: 44d9bbcbb0d7e2975a0c8322181c88daa1ba4e37
2020-07-30 20:32:30 -07:00
Jun Wu
49a25c9525 smartset: replace spanset with idset
Summary:
Replace the Python spanset with the Rust-backed idset.
The idset can represent multiple ranges and works better with Rust code.

The `idset` fast paths do not preserve order for the `or` operation, as
demonstrated in the test changes.
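
A rough sketch of why a range-based set loses the operand order on `or`. It assumes the set is stored as merged inclusive spans iterated in descending order; `normalize`, `union`, and `iterate` are hypothetical helpers, not the real idset API.

```python
def normalize(spans):
    """Merge overlapping or adjacent inclusive (low, high) spans.

    The result is sorted in descending order, which is the iteration order
    assumed for this sketch.
    """
    merged = []
    for low, high in sorted(spans):
        if merged and low <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], high))
        else:
            merged.append((low, high))
    return merged[::-1]


def union(a, b):
    """Union of two span lists; the operand order is not remembered."""
    return normalize(a + b)


def iterate(spans):
    """Yield every id in the spans, highest first."""
    for low, high in spans:
        yield from range(high, low - 1, -1)


low_revs = [(1, 2)]
high_revs = [(5, 6)]
left_first = list(iterate(union(low_revs, high_revs)))
right_first = list(iterate(union(high_revs, low_revs)))
```

Both unions iterate identically, whereas an order-preserving `or` would yield the left operand's ids first.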

Reviewed By: DurhamG, kulshrax

Differential Revision: D22519584

fbshipit-source-id: 5d976a937e372a87e7f087d862e4b56d673f81d6
2020-07-30 20:00:41 -07:00
Xavier Deguillard
e9b3f79b70 revisionstore: return missing keys from prefetch
Summary: Similarly to the changes made for `get`, the same can be applied to prefetch.

Reviewed By: DurhamG

Differential Revision: D22565609

fbshipit-source-id: 0fbc1a0086fa44593a6aaffb746ed36b3261040c
2020-07-28 10:51:38 -07:00
Arun Kulshreshtha
0a2172f589 pyedenapi: rearrange parameters
Summary: Make `store` the first argument for all of the EdenAPI Python methods. I've found this arrangement to be more ergonomic when working with the client later in the stack.

Reviewed By: quark-zju

Differential Revision: D22703915

fbshipit-source-id: b0ca900d969ec86ee91e8c62d281c2102860e9ef
2020-07-27 14:54:57 -07:00
Xavier Deguillard
3a97764d70 revisionstore: add a new StoreResult type
Summary:
When using LFS, it's possible that a pointer may be present in the local
LfsStore, but the blob would only be in the shared one. Such scenario can
happen after an upload, when the blob is moved to the shared store for
instance. In this case, during a `get` call, the local LFS store won't be able
to find the blob and thus would return Ok(None), the shared LFS store would not
be able to find the pointer itself and would thus return Ok(None) too. If the
server is not aware of the file node itself, the `ContentStore::get` would also
return Ok(None), even though all the information is present locally.

The main reason why this is happening is due to the `get` call operating
primarily on file node based keys, and for content-based stores (like LFS),
this means that the translation layer needs to be present in the same store,
which may not always be the case. By allowing stores to return a
`StoreKey` when progress was made in finding the key we can effectively solve
the problem described above, the local store would translate the file node key
onto a content key, and the shared store would read the blob properly.
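
The flow above can be sketched in Python as follows. This is an illustration under assumptions, not the revisionstore code: `SketchStore`, `get_from_stores`, and the tuple-shaped keys and results are hypothetical stand-ins for `StoreKey` and `StoreResult`.

```python
class SketchStore:
    """Toy store holding LFS pointers (file node -> content hash) and blobs."""

    def __init__(self, pointers, blobs):
        self.pointers = pointers
        self.blobs = blobs

    def get(self, key):
        """Return ("found", blob) or ("notfound", key).

        The returned key may be a translated content key if this store knew
        the pointer but not the blob.
        """
        kind, value = key
        if kind == "hgid":
            if value not in self.pointers:
                return ("notfound", key)
            key = ("content", self.pointers[value])
        content_hash = key[1]
        if content_hash in self.blobs:
            return ("found", self.blobs[content_hash])
        return ("notfound", key)


def get_from_stores(stores, key):
    """Try each store in order, carrying forward any progress made on the key."""
    for store in stores:
        status, payload = store.get(key)
        if status == "found":
            return payload
        key = payload
    return None


# The pointer lives in the local store, the blob only in the shared one.
local = SketchStore(pointers={"node1": "sha-abc"}, blobs={})
shared = SketchStore(pointers={}, blobs={"sha-abc": b"blob contents"})
blob = get_from_stores([local, shared], ("hgid", "node1"))
```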

Reviewed By: DurhamG

Differential Revision: D22565607

fbshipit-source-id: 94dd74a462526778f7a7e232a97b21211f95239f
2020-07-24 10:45:40 -07:00
Meyer Jacobs
586ada8de6 taggederror: introduce bail macro replacement which allows tagging
Summary: This change introduces a bail macro that allows tagging errors using the syntax `bail!(fault=Fault::Request, "my normal {}", bail_args)` or `bail!(Fault::Request, "my normal {}", bail_args)`.

Reviewed By: DurhamG

Differential Revision: D22646428

fbshipit-source-id: a6ec2940001b26db8ddc3a6d3620a1e17406c867
2020-07-22 15:37:14 -07:00
Jun Wu
8cc63c6d37 smartset: reduce dependency on len(repo)
Summary:
The spanset has the assumption that `0..len(repo)` are valid revs.
That's not true with segmented changelog. So reduce the dependency on the
assumption.

Reviewed By: kulshrax

Differential Revision: D22519586

fbshipit-source-id: a493d26d6d69a36966f4a037f87a03593b697cbd
2020-07-20 17:27:54 -07:00
Jun Wu
680c0592b7 pydag: add an unsaferange API
Summary:
It turns out the Python world needs the integer range API in many places.
Deprecating them is non-trivial. Therefore expose the API.

Reviewed By: DurhamG

Differential Revision: D22402201

fbshipit-source-id: de31d15c18e5f4e0f8826f71315b98ad58b1764e
2020-07-20 17:27:52 -07:00
Jun Wu
7b7ae0bd09 hgcommits: implement strip_commits for testing
Summary:
About 64 tests depend on the revlog `strip` behavior. `strip` is not used in
production client-repos.  I tried to migrate them off `strip` but that seems
too much work for now. Instead let's just implement `strip` in the HgCommits
layer to be compatible to run the tests.

Reviewed By: DurhamG

Differential Revision: D22402195

fbshipit-source-id: f68d005e04690d8765d5268c698b6c96b981eb0a
2020-07-17 22:23:05 -07:00
Jun Wu
eb4c007145 changelog: use Rust RevlogIndex for partialmatch
Summary:
I dropped the special case of wdir handling. With the hope that we will handle
the virtual commits differently eventually (ex. drop special cases, insert real
commits to Rust DAG but do not flush them to disk, support multiple wdir
virtual commits, null is no longer an ancestor of every commit).

`test-listkeyspatterns.t` is changed because `0` no longer resolves to `null`.

Reviewed By: DurhamG

Differential Revision: D22368836

fbshipit-source-id: 14b9914506ef59bb69363b602d646ec89ce0d89a
2020-07-17 22:23:04 -07:00
Arun Kulshreshtha
bfd46b50b0 pyedenapi: return dict from health method
Summary: Make the Python EdenAPI client's `health()` method return a dict of server metadata.

Reviewed By: DurhamG

Differential Revision: D22604932

fbshipit-source-id: 51ca60cc95a8dbd15635520b2a9bd72603643cb6
2020-07-17 16:04:32 -07:00
Meyer Jacobs
e3b86cf77d debug: introduce binding layer for propagating error metadata to Python
Summary:
Implements a basic Rust-Python binding layer for error metadata propagation.

We introduce a new type, `TaggedExceptionData`, which carries CommonMetadata and the original (without metadata) error message for a Rust Anyhow error. This class is passed to RustError and can be accessed in Python (somewhat awkwardly) via indexing:
```
except error.RustError as e:
    fault = e.args[0].fault()
    typename = e.args[0].typename()
    message = e.args[0].message()
```
As far as I can tell, due to limitations in cpython-rs, this can't be made more ergonomic without introducing a Python shim around the Rust binding layer, which could adapt the cpython-rs classes to use whatever API we'd like.

Currently, anyhow errors that are not otherwise special-cased will be converted into RustError, with both the original error message and any attached metadata printed as shown below
```
  abort: intentional error for debugging with message 'intentional_error'
  error has type name taggederror::IntentionalError and fault None
```
We can of course re-raise the error if desired to maintain the previous behavior for handling a RustError.

If we'd like other, specialized Rust Python Exception types to carry metadata (such as `IndexedLogError`), we'll need to modify them to accept a `TaggedExceptionData` like `RustError`.

Renamed the "cause an error in pure rust command" function to `debugcauserusterror`, and instead used the name `debugthrowrustexception` for a command which causes an error in rust which is converted to a Python exception across the binding layer.

Introduced a simple integration test which exercises `debugthrowrustexception`.

Added a basic handler for RustError to scmutil.py

Reviewed By: DurhamG

Differential Revision: D22517796

fbshipit-source-id: 0409489243fe739a26958aad48f608890eb93aa0
2020-07-16 19:30:00 -07:00
Arun Kulshreshtha
bffb24216d revisionstore: move tokio runtime into EdenApiRemoteStore
Summary: Move the `tokio::Runtime` into `EdenApiRemoteStore` so that if initialization fails, we can propagate the error instead of panicking.

Reviewed By: xavierd

Differential Revision: D22564210

fbshipit-source-id: 9db1be99f2f77c6bb0f6e9dc445d624dc5990afe
2020-07-16 13:32:19 -07:00
Arun Kulshreshtha
3327e15201 edenapi: percent-encode repo names
Summary: Instead of restricting the allowed characters in a repo name, allow any UTF-8 string. The string will be percent-encoded before being used in URLs.
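
A small sketch of the encoding step using Python's standard library. The client itself is Rust; `repo_url`, the base URL, and the path layout are assumptions made for illustration.

```python
from urllib.parse import quote


def repo_url(base_url, repo_name, endpoint):
    """Build a URL whose path contains a percent-encoded repo name."""
    # safe="" so that "/" inside a repo name is escaped as well.
    encoded = quote(repo_name, safe="")
    return "{}/{}/{}".format(base_url.rstrip("/"), encoded, endpoint)


url = repo_url("https://example.invalid/edenapi", "my repo/ü", "health")
```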

Reviewed By: quark-zju

Differential Revision: D22559830

fbshipit-source-id: f9caa51d263e06d424531e0947766f4fd37b035f
2020-07-16 13:32:19 -07:00
Durham Goode
76df783c93 configs: implement user sharding
Summary: Adds support for sharding based on user name.

Reviewed By: quark-zju

Differential Revision: D22537540

fbshipit-source-id: 962f9582c8947335dc9d9d29c500d8c09df69878
2020-07-16 09:07:53 -07:00
Durham Goode
28ddd1d1cc configs: add hg debugdynamicconfig --canary devvmXXX.prnY support
Summary:
Previously you could only canary locally on a devserver by setting an
environment variable. Let's add a --canary flag to debugdynamicconfig that
accepts a host.  Hg will ssh to that host and run the configerator cli to grab
the canaried config from that host.

Reviewed By: quark-zju

Differential Revision: D22535509

fbshipit-source-id: af1c21d8402c4e729769e50388d913bf52b66b89
2020-07-15 01:14:30 -07:00
Arun Kulshreshtha
676aaeb367 pyrevisionstore: make metadatastore constructor accept an edenapi store
Summary: Add an optional `edenapi` argument to metadatastore that allows using EdenAPI in place of the SSH remote store.

Reviewed By: quark-zju

Differential Revision: D22492535

fbshipit-source-id: eba034c9ba86c79c9a9dee6bab3ff615d0575b6f
2020-07-13 17:35:31 -07:00
Arun Kulshreshtha
670ed17ba6 revisionstore: add EdenApiFileStore and EdenApiTreeStore
Summary: Reimplement `EdenApiHgIdRemoteStore` as `EdenApiRemoteStore<T>`, where `T` is a marker type indicating whether this store fetches files or trees. This allows working with the stores in a more strongly-typed way, and avoids having to check what kind of store this is at runtime when fetching data.

Reviewed By: quark-zju

Differential Revision: D22492160

fbshipit-source-id: e17556093fa9b81d2301f281da36d75a03e33c5e
2020-07-13 17:35:31 -07:00
Durham Goode
11972bf57e configs: switch to auditing the specific list of known problematic configs
Summary:
Previously we would audit all configs and report them if the
dynamicconfig did not match the rc-file config. Now that dynamicconfigs are
widely deployed, let's switch this around to auditing only configs we know have
had issues. This will let us start adding new configs via dynamicconfigs instead
of via the legacy staticfiles and chef, before we've finished migrating all the
legacy configs over.

Reviewed By: quark-zju

Differential Revision: D22401865

fbshipit-source-id: 5c41c674d39c8113b2a40da61e020e8a33c39312
2020-07-13 08:53:18 -07:00
Durham Goode
6774dfe154 packs: flush history packs every 10 million adds
Summary:
We're seeing cases where cloning can take 10's of GB of memory because
we pend all the history information in memory. Let's flush the history info
every 10 million adds to bound the memory usage.

10 million was chosen somewhat arbitrarily, but it results in pack files that
are 800MB, which corresponds roughly with 8GB of memory usage.

This requires updating repack to be aware that a single flush could produce
multiple packs. Note, since repack writes via this same path, it may result in
repack producing multiple pack files. In the degenerate case repack could
produce the same number (or more) of pack files than was inputted. If we set the
threshold high enough I think we'll be fine though. 800MB is probably
sufficient.
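
A simplified sketch of bounded buffering with periodic flushes. `HistoryPackWriter` here is a hypothetical Python stand-in, and the threshold in the example is tiny so the behaviour is easy to see.

```python
class HistoryPackWriter:
    """Buffers history entries and writes a pack whenever the buffer fills."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold
        self.pending = []  # entries held in memory
        self.packs = []    # each inner list stands in for one pack file

    def add(self, entry):
        self.pending.append(entry)
        if len(self.pending) >= self.flush_threshold:
            self._write_pack()

    def _write_pack(self):
        if self.pending:
            self.packs.append(list(self.pending))
            self.pending.clear()

    def flush(self):
        """Write any remaining entries and return every pack produced so far."""
        self._write_pack()
        produced, self.packs = self.packs, []
        return produced


writer = HistoryPackWriter(flush_threshold=3)
for i in range(7):
    writer.add(i)
packs = writer.flush()
pack_sizes = [len(p) for p in packs]
```

Because `flush` can return more than one pack, callers such as repack have to handle a list of results rather than a single file.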

Reviewed By: xavierd

Differential Revision: D22438569

fbshipit-source-id: 425d5d3b7999b81e44d1dbe1f2a4ea453ab6ca4f
2020-07-13 08:10:14 -07:00
Arun Kulshreshtha
14a7fe636f cpython-ext: Add ExtractInnerRef trait
Summary: Per comments on D22429347, add a new `ExtractInnerRef` trait that is similar to `ExtractInner`, but returns a reference to the underlying value. A default implementation is provided for types whose inner value is `Clone + 'static`, so in practice most types will only need to implement `ExtractInnerRef`, whereas the callsite may choose whether it needs a reference or an owned value.

Reviewed By: quark-zju

Differential Revision: D22464158

fbshipit-source-id: 7b97329aedcddb0e51fd242b519e79eba2eed350
2020-07-09 19:05:55 -07:00
Arun Kulshreshtha
a5ae136439 pyrevisionstore: add EdenAPI store bindings
Summary: Add an `edenapistore` class that wraps an `EdenApiHgIdRemoteStore`. This class is purely used as a means to set up the stores from Python code, and is only used as a way to get an `Arc<EdenApiHgIdRemoteStore>` to the Rust content store. It has no functionality of its own.

Reviewed By: quark-zju

Differential Revision: D22449702

fbshipit-source-id: ad2094c79da523071b6ed8344c8dde706e448c95
2020-07-09 19:05:55 -07:00
Arun Kulshreshtha
2f97de536f pyedenapi: rewrite bindings
Summary: This is effectively a complete rewrite of the EdenAPI Python bindings to use the new client.

Reviewed By: quark-zju

Differential Revision: D22442903

fbshipit-source-id: c3cf2b2b8291e24d6d4d3a3546ccc69472510567
2020-07-09 19:05:55 -07:00
Arun Kulshreshtha
7ae097e8da cpython-ext: add ExtractInner trait
Summary:
A common pattern in Mercurial's data storage layer Python bindings is to have a Python object that wraps a Rust object. These Python objects are often passed across the FFI boundary to Rust code, which then may need to access the underlying Rust value.

Previously, the objects that used this pattern did so in an ad-hoc manner, typically by providing an `into_inner` or `to_inner` inherent method. This diff introduces a new `ExtractInner` trait that standardizes this pattern into a single interface, which in turn allows this pattern to be used with generics.

Reviewed By: quark-zju

Differential Revision: D22429347

fbshipit-source-id: cab4c24b8b98c6ef8307f72a9b4726aabdc829cc
2020-07-09 19:05:55 -07:00
Arun Kulshreshtha
14392eb035 pyedenapi: use new EdenAPI crate
Summary: Update the EdenAPI Python bindings to use the new client. This is mostly just a stopgap measure to allow us to delete the old client code; nothing in production actually uses these bindings anymore, and the new client will primarily be used from Rust.

Reviewed By: quark-zju

Differential Revision: D22379476

fbshipit-source-id: 953e0ffc2ce682869ee234d672a154046b373c1e
2020-07-09 13:08:27 -07:00
Xavier Deguillard
71a9ae11d9 pyrevisionstore: do not abort on partial fetches
Summary:
We've seen a handful of users complaining about clone failing and not being
able to recover from it. From looking at the various reports and the
stacktraces, I believe this is caused by a flaky connection on the user end
that causes the Python code to retry the getpack calls. Before retrying, the
code will figure out what still needs fetching and this is done via the
getmissing API. When LFS pointers were fetched, the LFS blobs aren't yet
present on disk, and thus the underlying ContentStore::get_missing will return a set
of keys that contain some StoreKey::Content keys. The code would previously
fail at this point, but since the key also contains the original key, we can
simply return this, the pointers might be refetched but these are fairly small.

Taking a step back from this bug, the issue really is that the retry logic is
done in code that cannot understand content-keys, and moving it to a part of
the code that understands this would also resolve the issue.

I went with the simple approach for now, but since other remote stores
(EdenAPI, the LFS one, etc) would also benefit from the retry logic, we may
want to move the logic into Rust and remove the getmissing API from the Python
exposed ContentStore.
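
A hedged sketch of the simple approach: map every missing key back to a file-node key that the Python retry loop understands. The tuple encoding of keys and the name `keys_to_refetch` are assumptions, not the actual bindings.

```python
def keys_to_refetch(missing):
    """Translate missing store keys into plain file-node keys.

    Each item is either ("hgid", key) or ("content", (content_hash, key)),
    a hypothetical encoding of the two kinds of store key.
    """
    result = []
    for kind, payload in missing:
        if kind == "hgid":
            result.append(payload)
        else:
            # A content key also carries the original key; return that one.
            _content_hash, original_key = payload
            result.append(original_key)
    return result


missing = [("hgid", "file-a"), ("content", ("sha-abc", "file-b"))]
refetch = keys_to_refetch(missing)
```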

Reviewed By: DurhamG

Differential Revision: D22425600

fbshipit-source-id: 69c2898cc302d2170cd0f206c89189c341db5278
2020-07-07 19:44:01 -07:00
Jun Wu
ae2f6a939c pydag: ignore nullrev or nullid from Python
Summary:
Mercurial's concept of the `null` revision (hardcoded as 20 zeros) is a
headache to special case. See https://www.mercurial-scm.org/wiki/RevsetVirtualRevisionPlan.

The Rust DAG layer cannot handle it. Make pydag drop the nullid or nullrev when
crossing the Python -> Rust boundary.

A cleaner way to handle `null` might be:

- Create a new vertex in the DAG in memory that has empty content.
  Calculate its commit hash normally. The commit is isolated from other parts
  of the commit graph. It has no parents and no children.
  The vertex has an assigned Id, which is not zero if the repo is not empty.
- Assign the `null` special name (like how we do for `tip`) to the commit.
- Remove all hard-code special cases of the 20-zero `nullid`.

That would allow things like `hg up null`, `hg diff -r null -r X` to continue
to work without special casing it in the commit graph layer.
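
The filtering that this diff does at the Python -> Rust boundary could look roughly like the sketch below. `drop_null` is a hypothetical helper; the two constants follow Mercurial's conventions for the null node and null revision.

```python
NULL_ID = b"\0" * 20  # Mercurial's hardcoded null node: 20 zero bytes
NULL_REV = -1         # Mercurial's null revision number


def drop_null(values):
    """Remove nullid / nullrev before handing nodes or revs to the DAG layer."""
    return [v for v in values if v != NULL_ID and v != NULL_REV]


nodes = drop_null([b"\x01" * 20, NULL_ID])
revs = drop_null([NULL_REV, 0, 5])
```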

Reviewed By: sfilipco

Differential Revision: D22240188

fbshipit-source-id: 707af47cbf36a7df60097a17d69094aae89d3250
2020-07-06 15:51:00 -07:00
Jun Wu
0c91746cc7 pydag: use trait object abstractions
Summary:
Change pydag from using concrete `namedag` and `memnamedag` to trait objects:

- `commits`: High-level read-write commits storage, supports Rust `HgCommits`
  (segmented changelog), `MemHgCommits`, and `RevlogCommits`.
- `dagalgo`: maps to the `DagAlgorithm` Rust trait.
- `idmap`: maps to the `IdConvert + PrefixLookup` Rust traits.

The idea is that we move the revlog / segmented changelog difference from Python
to behind Rust trait objects so the Python code looks overall cleaner, the Rust
revset alternative gets exercised early, and switching from revlog to segmented
changelog becomes easier.

Reviewed By: sfilipco

Differential Revision: D21796242

fbshipit-source-id: 3a4a3ff3d9e7e46059d1ed3461a55003c352e82d
2020-07-06 15:51:00 -07:00
Jun Wu
137fa3cd34 revlogindex: implement writing to revlog data
Summary: Extend RevlogIndex to support writing to revlog data.

Reviewed By: sfilipco

Differential Revision: D21854227

fbshipit-source-id: 11b6bf3b706b316f23c33ab07144530c9db92d58
2020-07-06 15:50:58 -07:00
Jun Wu
868c2b0108 mutationstore: copy entries automatically on flush
Summary:
Similar to D7121487 (af8ecd5f80) but works for mutation store. This makes sure at the Rust
layer, mutation entries won't get lost after rebasing or metaediting a set of
commits where a subset of the commits being edited has mutation relations.

Unlike the Python layer, the Rust layer works for mutation chains. Therefore
some of the tests change.

Reviewed By: markbt

Differential Revision: D22174991

fbshipit-source-id: d62f7c1071fc71f939ec8771ac5968b992aa253c
2020-07-02 13:22:34 -07:00
Arun Kulshreshtha
fdba0b98c2 edenapi: rename to edenapi_old
Summary: Move old EdenAPI crate to `scm/lib/edenapi/old` to make room for the new crate. This old code will eventually be deleted once all references to it are removed from the codebase.

Reviewed By: quark-zju

Differential Revision: D22305173

fbshipit-source-id: 45d211340900192d0488543ba13d9bf84909ce53
2020-06-30 21:10:41 -07:00
Jun Wu
4868f5bf5b revlog: avoid racy reads on 00changelog.i
Summary:
D21626209 (38d6c6a819) changed revlogindex to read `00changelog.i` on its own instead of
taking the data from Python. That turns out to be racy. The `00changelog.i`
might be changed between the Rust and Python reads and that caused issues.

This diff makes Python re-use the indexdata read by Rust so they are guaranteed
to be the same.

Reviewed By: DurhamG

Differential Revision: D22303305

fbshipit-source-id: 823bf3aefc970a4a6ce8ab58bccf972a78f6de70
2020-06-30 10:19:03 -07:00
Jun Wu
08093da71d pybytes: convert Rust Bytes to Python buffer or memoryview
Summary:
This will be used by the next change.

The reason we use a `buffer` or `memoryview` instead of Python `bytes` is to expose
the buffer in a zero-copy way. That is important for startup performance.
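
The difference can be seen in plain Python. In this sketch a `bytearray` stands in for memory owned by Rust, which is an assumption made purely for illustration.

```python
index_data = bytearray(b"revlog index bytes")  # stand-in for Rust-owned memory

view = memoryview(index_data)  # zero-copy: shares the underlying memory
copied = bytes(index_data)     # copies every byte into a new object

index_data[0] = ord("R")       # mutate the original buffer in place

view_first = view[:1].tobytes()  # the view observes the change
copy_first = copied[:1]          # the copy does not
```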

Reviewed By: DurhamG

Differential Revision: D22303306

fbshipit-source-id: 3f7c8dff3575b998e025cd5940faa0c183b11626
2020-06-30 10:19:03 -07:00
Durham Goode
977155ef99 configs: fetch remote configs during dynamic config generation
Summary:
Fetches configs from a remote endpoint and caches them locally. If the
remote endpoint fails to respond, we use the cached version.
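
A minimal sketch of the fetch-then-cache fallback, assuming the fetch is a callable that raises `OSError` on network failure. `load_remote_config` and the cache file name are hypothetical.

```python
import os
import tempfile


def load_remote_config(fetch, cache_path):
    """Return the remote config text, falling back to the cached copy."""
    try:
        text = fetch()
    except OSError:
        # Endpoint unreachable: use the last cached copy (raises if none exists).
        with open(cache_path, "r", encoding="utf-8") as f:
            return f.read()
    with open(cache_path, "w", encoding="utf-8") as f:
        f.write(text)
    return text


def _endpoint_ok():
    return "[ui]\nusername = alice\n"


def _endpoint_down():
    raise ConnectionError("remote endpoint did not respond")


cache_file = os.path.join(tempfile.mkdtemp(), "remote_cache")
fresh = load_remote_config(_endpoint_ok, cache_file)
fallback = load_remote_config(_endpoint_down, cache_file)
```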

Reviewed By: quark-zju

Differential Revision: D22010684

fbshipit-source-id: bd6d4349d185d7450a3d18f9db2709967edc2971
2020-06-30 09:50:44 -07:00
Xavier Deguillard
a4ac0a8449 pyworker: support redacted content
Summary:
Whenever data is redacted on the server, a specific tombstone is returned when
fetching it. Make sure that whenever we update the file on disk, we write a
nice message to the user instead of the tombstone itself.

While this code could have been moved into the Rust store code itself, I prefer
to leave it to its users to decide what to do with redacted data. EdenFS for
instance may want to prevent access to it instead of showing the redacted
message.

Reviewed By: kulshrax

Differential Revision: D21999345

fbshipit-source-id: 39a83cdf5ea4567628a13fbd59520b9677aba749
2020-06-23 18:47:44 -07:00
Jun Wu
1020f76e7d stackdesc: remove the crate
Summary: The tracing APIs and error context APIs can achieve similar effects.

Reviewed By: xavierd

Differential Revision: D22129585

fbshipit-source-id: 0626e3f4c1a552c69c046ff06ac36f5e98a6c3d8
2020-06-23 14:06:54 -07:00
Xavier Deguillard
afc1a33c21 pyrevisionstore: aggressively release the GIL
Summary:
In the case where an expensive network operation is involved, holding the GIL
would mean that Mercurial cannot display progress bars, or simply cannot be
interrupted. This is a less than ideal user experience.

To fix this, let's release the GIL whenever we enter into the revisionstore
code.

Reviewed By: DurhamG

Differential Revision: D22140399

fbshipit-source-id: 131c8cf81e39128810e0f20d1922b5681a33d95a
2020-06-19 19:54:21 -07:00
Carolyn Busch
2c7f30d0c4 configparser: replace whitelist/blacklist term
Summary: Replace usages of whitelist/blacklist with include/exclude/filter/allow. These terms are more descriptive and less likely to contribute to racial stereotyping. More context: https://fb.workplace.com/groups/sourcecontrolteam/permalink/2926049127516414/

Reviewed By: kulshrax

Differential Revision: D22039298

fbshipit-source-id: 255c7389ee5ce5e54bbccdfb05ffa4cafc6958e5
2020-06-15 12:47:08 -07:00
Xavier Deguillard
8b9401b0b2 pyerror: format the original error, not the downcasted one
Summary:
The anyhow error contains the context for the error which will lead to better
error message for the user (and us). What the code previously did was simply
using the Debug trait to print the error, and thus was missing context.

Reviewed By: DurhamG

Differential Revision: D21985745

fbshipit-source-id: 31c603d7f42e79a360541f39e4aaf0fcfbb9a14f
2020-06-12 12:38:06 -07:00
Mark Thomas
860594a0e6 streampager: fix progress rendering
Summary:
With the internal streampager, progress bars must be sent on a separate stream so that
streampager can render them correctly.

Reviewed By: quark-zju

Differential Revision: D21906173

fbshipit-source-id: eb41b0bf22807d9cae518b3f676996ab1c642c6e
2020-06-10 19:29:28 -07:00
Jun Wu
de8b085e6e revlogindex: port gca and range algorithms from bindag
Summary:
Mostly copy-paste from code added in D19503373 and D19511574. Adjusted to match
the revlog index interface.

Reviewed By: sfilipco

Differential Revision: D21626201

fbshipit-source-id: 05d160e4c03d7e2482b6a4f2d68c3688ad78f568
2020-06-03 13:26:26 -07:00
Jun Wu
223faebe5f dag: rename DagSet to IdStaticSet
Summary:
The NameSet is not really about Dag. It is about using Id and is static.
Rename it to clarify. In an upcoming change we'll have IdLazySet.

Reviewed By: sfilipco

Differential Revision: D21626204

fbshipit-source-id: 84f25008f7032f6e26a26fc656ccbcd2a5880ecf
2020-06-03 13:26:24 -07:00
Jun Wu
fb56b1962d dag: move optimization hints to a dedicate structure
Summary:
Previously, the NameSet has properties like "is_all", "is_topo_sorted", etc.
To make lazy sets efficient, it's important to have hints about min / max Ids
and maybe some other information.

Add a dedicated Hints structure for that.

Reviewed By: sfilipco

Differential Revision: D21626219

fbshipit-source-id: 845e88d3333f0f48f60f2739adae3dccc4a2dfc4
2020-06-02 14:00:36 -07:00
Jun Wu
48c003fb11 revlogindex: impl IdConvert and PrefixLookup for RevlogIndex
Summary:
Implements part of the dag IdMap related traits.

It does not get used yet, but eventually I'd like `pydag` to be able to work
with an abstracted dag including RevlogIndex.

Reviewed By: sfilipco

Differential Revision: D21626210

fbshipit-source-id: 53f19622f03fd71b76073dccf8dcc9b4778b40ca
2020-06-02 14:00:35 -07:00
Jun Wu
38d6c6a819 revlogindex: include NodeRevMap in RevlogIndex
Summary:
This will allow RevlogIndex to answer node -> rev and hex lookup queries.

Also change RevlogIndex::new to take file names so it can write back the
nodemap index when the index is lagging. That part of logic currently exists in
pyindexes + clindex.pyx, which are going to be replaced by revlogindex.

Practically, this will generate a `00changelog.nodemap` file in svfs, which is
temporarily unused, but will be used once clindex.pyx gets replaced.

Reviewed By: sfilipco

Differential Revision: D21626209

fbshipit-source-id: 297d9eff26a73c26558708f7a2290d4d8ba1e777
2020-06-02 14:00:34 -07:00
Jun Wu
6b595410ce revlogindex: make changelog data type consistent
Summary:
Change `NodeRevMap`'s changelog type from `[u8]` to `[RevlogEntry]`.
This makes it consistent with `RevlogIndex`.

Reviewed By: sfilipco

Differential Revision: D21626203

fbshipit-source-id: 7457f48ccd7b3489264684a5db21d21e9eb7a937
2020-06-01 10:56:55 -07:00
Jun Wu
445e9f9fa7 pyindexes: move NodeRevMap to revlogindex
Summary:
NodeRevMap helps converting from a commit hash to a rev number. It's similar to
IdMap in the dag crate, but was designed for the revlog.

Move NodeRevMap to revlogindex so it becomes easier to implement the IdConvert
trait required by the dag crate.

Reviewed By: sfilipco

Differential Revision: D21626211

fbshipit-source-id: 14996f1234231b507efb5186ec30f84df5aaad10
2020-06-01 10:56:55 -07:00
Jun Wu
40fbbff9af pyrevlogindex: move non-Python logic to a pure Rust crate
Summary:
The idea is that the pure Rust revlogindex crate can implement the DagAlgorithm
interface so we will have a consistent interface in the code base that works
for both the existing storage (revlog) and the new segmented changelog.

The other way to do this is to implement the `bindings.dag.namedag` interface
in pure Python for the revlog-based DAG, or supporting quite different
interfaces (ex. revlog DAG and the Rust segmented changelog DAG) in the code
base. At present, I think implementing the Rust DAG traits for revlog is the
most appealing, partially because we already have some key algorithms
implemented (ex. prefix lookup, common ancestors, etc).

Reviewed By: sfilipco

Differential Revision: D21626197

fbshipit-source-id: 733b1af1bcd5fc0784764fc7103412988894d43b
2020-06-01 10:56:54 -07:00
Durham Goode
995a2852c1 configs: return bytes for config parsers validation results
Summary:
Previously the return type was String which, in Python 2, could turn into bytes or
unicode depending on the contents of the string. We always want bytes in Python
2, so let's use the Str type instead.

Reviewed By: quark-zju

Differential Revision: D21794189

fbshipit-source-id: 6493fbacab354a78476f522fc3c41b7336dbbdb1
2020-06-01 09:45:19 -07:00
Jun Wu
64dc05ab9d dag: move add_heads, flush, add_heads_and_flush to traits
Summary: This allows other kinds of DAG to implement the operations.

Reviewed By: sfilipco

Differential Revision: D21626220

fbshipit-source-id: 896c5ccebb1672324d346dfca6bcac9b4d3b4929
2020-05-27 12:16:47 -07:00
Jun Wu
4934987796 dag: implement PrefixLookup for Dag, MemDag and MemIdMap
Summary: This makes things a bit more flexible.

Reviewed By: sfilipco

Differential Revision: D21626194

fbshipit-source-id: f3ad486bcd5a6478d9e00f674d48f99504cded8c
2020-05-27 12:16:46 -07:00
Jun Wu
26217dcdb5 dag: move hex prefix lookup to a trait
Summary: This makes it possible for other types to implement the hex prefix lookup.

Reviewed By: sfilipco

Differential Revision: D21626218

fbshipit-source-id: 96e8b8c37e5aae2bd60658a238333b61902936d1
2020-05-27 12:16:46 -07:00
Jun Wu
38cc83e1bf dag: add short aliases for main public types
Summary:
Types like IdDag are not really used. The use of the word "name" is sometimes
confusing in other context. Therefore export shorter names like Dag, MemDag,
Vertex, avoid "name" in NameDag, MemNameDag and NameSet. This makes external
code shorter and less ambiguous.

Reviewed By: sfilipco

Differential Revision: D21626212

fbshipit-source-id: 5bcf3cecfd38277149b41bf3ba9e6d4ef2a07b2b
2020-05-27 12:16:45 -07:00
Jun Wu
e0d11803f2 dag: move DagAlgorithm to an independent trait
Summary:
This decouples DagAlgorithm from the IdMap + IdDag backend, making it possible
to support other kinds of backends of DagAlgorithm (ex. a revlog backend).

Reviewed By: sfilipco

Differential Revision: D21626200

fbshipit-source-id: f53cc271a200062e9c02f739b6453e1d7de84e6d
2020-05-27 12:16:45 -07:00
Durham Goode
8ed66bc3fc pyrevisionstore: fix unused code warnings
Summary:
When we got rid of the delta logic, we also needed to get rid of some
unused functions.

Reviewed By: singhsrb

Differential Revision: D21725043

fbshipit-source-id: ac069e6b0468e2275f353a9970b8971b5a2cfa23
2020-05-26 18:09:22 -07:00
Durham Goode
9f6f200a08 configs: version dynamic configs
Summary:
If we release a new version of Mercurial, we want to ensure that its
builtin configs are used immediately. To do so, let's write a version number
into the generated config file, and if the version number doesn't match, we
force a synchronous regeneration of the config file.

For now, if regeneration fails, we just log it. In the future we'll probably
throw an exception and block the user since we want to ensure people are running
with modern configuration.
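
A sketch of the version check under assumptions: the `# version=` header format and the function names are invented for illustration and may not match the real generated file.

```python
VERSION_PREFIX = "# version="  # hypothetical header format


def generate(running_version, body):
    """Produce a generated config that records which version wrote it."""
    return "{}{}\n{}".format(VERSION_PREFIX, running_version, body)


def needs_regeneration(generated_text, running_version):
    """True if the file was written by a different version (or has no header)."""
    lines = generated_text.splitlines()
    first_line = lines[0] if lines else ""
    if not first_line.startswith(VERSION_PREFIX):
        return True
    return first_line[len(VERSION_PREFIX):] != running_version


config = generate("4.2", "[ui]\nmerge = internal:merge\n")
stale = needs_regeneration(config, "4.3")
current = needs_regeneration(config, "4.2")
```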

Reviewed By: quark-zju

Differential Revision: D21651317

fbshipit-source-id: 3edbaf6777f4ca2363d8617fad03c21204b468a2
2020-05-20 13:35:28 -07:00
Durham Goode
f0d7044aff configs: apply dynamicconfig during clone
Summary:
During clone the hgrc.dynamic file doesn't exist and doesn't even have
a place for us to generate it to. Let's instead generate and apply the config in
memory.

In the future, if generate fetches data from the network, this will mean clone
would depend on the network, since if generate fails the clone would fail. In
some situations this is desirable, since users shouldn't be cloning without our
approved configs, but if it causes problems we could probably tweak generate to
support an offline mode.

Reviewed By: quark-zju

Differential Revision: D21643086

fbshipit-source-id: d9a758207738d5983213d95725061517e0aa17db
2020-05-19 19:51:27 -07:00
Durham Goode
861f813f25 configs: convert facebook_overrides.rc
Summary: Converts facebook_overrides.rc to our dynamic config generator

Reviewed By: quark-zju

Differential Revision: D21625721

fbshipit-source-id: 2a374939d90f1fb7f9173268e2a7fa636d672393
2020-05-19 13:23:19 -07:00
Jun Wu
a27bf2fc42 pyrenderdag: support non-revision-numbers graph vertexes
Summary:
Change pyrenderdag to accept non-revision-number graph vertexes so it can
render a graph even if the graph does not use revision numbers.

The next diff wants this behavior so it can just emit commit hashes to
the renderer without caring about revision numbers. The type is made
so it can still support revision numbers, since the legacy graphlog
interface would still use revision numbers.

Reviewed By: markbt

Differential Revision: D21554671

fbshipit-source-id: 20572683b831f7cecb03957c83f278ff3903eff0
2020-05-14 12:03:44 -07:00
Jun Wu
96ac755c06 pydag: fix lazy set iteration
Summary:
The previous code was wrong - it converts the PyObject to an iterator every time
(ex. if the PyObject is a set, then it calls `set.__iter__` every time, and
will only get the first element of the set).

For example, it will enter an infinite loop for evaluating this:

  bindings.dag.nameset({'1', '2'})

Fix it by calling `__iter__` once to get the iterator object, and using that
instead of the original PyObject.
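
The bug can be reproduced in pure Python. This is only an analogue of the
binding code, not the binding code itself; `first_n_buggy` and `first_n_fixed`
are hypothetical names.

```python
def first_n_buggy(obj, n):
    """Buggy: builds a fresh iterator for every element requested."""
    out = []
    for _ in range(n):
        # iter(obj) restarts from the beginning each time, so this never
        # advances past the first element and never sees StopIteration.
        out.append(next(iter(obj)))
    return out


def first_n_fixed(obj, n):
    """Fixed: build the iterator once and keep advancing it."""
    it = iter(obj)
    out = []
    for _ in range(n):
        try:
            out.append(next(it))
        except StopIteration:
            break
    return out
```

With a two-element input, the buggy version keeps returning the first element
for as long as it is asked, which is why an unbounded consumer loops forever.
The fixed version stops after both elements.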

Reviewed By: markbt

Differential Revision: D21554676

fbshipit-source-id: 0f2adae8f123530cee2d473da37ca1a93a941fde
2020-05-14 12:03:44 -07:00
Jun Wu
aeac1551d2 dag: implement beautify
Summary:
This function reorders commits so the graph looks better.
It will be used to optimize graph rendering for cloud smartlog (and perhaps
smartlog in the future).

Reviewed By: markbt

Differential Revision: D21554675

fbshipit-source-id: d3f0f27c7935c49581cfa6e87d7c32eb5a075f75
2020-05-14 12:03:43 -07:00
Jun Wu
0ac5c6d4f3 pymutationstore: expose the getdag API
Summary: Expose the API that returns a real graph.

Reviewed By: DurhamG

Differential Revision: D21486520

fbshipit-source-id: 4ebdb4011df8971c54930173c4e77503cd2dac47
2020-05-13 09:45:24 -07:00
Jun Wu
e817197b09 bindings: add bindings to regex
Summary:
This allows us to replace the pyre2 C++ bindings so the fast regex engine can
work with Python 3, and simplify our build steps.

Reviewed By: DurhamG

Differential Revision: D20973179

fbshipit-source-id: e123ac18954991f2c701526108f5c2ecd2b31a3b
2020-05-12 16:32:50 -07:00
Ellis Hoag
1d0d626a36 Pass config object down to repack
Summary:
Pass `configparser::config::ConfigSet` to `repack` in
`revisionstore/src/repack.rs` so that we can use various config values in `filter_incrementalpacks`.

* `repack.maxdatapacksize`, `repack.maxhistpacksize`
  * The overall max pack size
* `repack.sizelimit`
  * The size limit for any individual pack
* `repack.maxpacks`
  * The maximum number of packs we want to have after repack (overrides sizelimit)

Reviewed By: xavierd

Differential Revision: D21484836

fbshipit-source-id: 0407d50dfd69f23694fb736e729819b7285f480f
2020-05-11 16:41:30 -07:00
Xavier Deguillard
2001c3fd69 revisionstore: add translate_lfs_missing to remote store get
Summary:
When Qing implemented all the get methods, the translate_lfs_missing function
didn't exist, and I forgot to add it in the right places when landing the
diff that added it. Fix this.

Reviewed By: sfilipco

Differential Revision: D21418043

fbshipit-source-id: baf67b0fe60ed20aeb2c1acd50a209d04dc91c5e
2020-05-11 10:34:01 -07:00
Jun Wu
d8abb30eeb pydag: expose some memnamedag APIs
Summary: Make them reusable in other Python bindings, ex. pymutation.

Reviewed By: sfilipco

Differential Revision: D21486524

fbshipit-source-id: 258455c6a442353c77588fadcb560cb5a170926e
2020-05-11 09:50:01 -07:00
Jun Wu
6835eb4b9d pydag: expose render into string feature for memnamedag
Summary: This makes it easier to visualize a MemNameDag.

Reviewed By: sfilipco

Differential Revision: D21486523

fbshipit-source-id: c65f1fc421bd654dc820faae3c93f2aa57f910d4
2020-05-11 09:50:01 -07:00
Jun Wu
010bcac66a pydag: expose MemNameDag APIs
Summary:
This will allow clients to operate on MemNameDag.

Unfortunately, it isn't that easy to reuse code in `py_class!`. Since they are
just thin wrappers, I live with the copy-paste for now.

Reviewed By: sfilipco

Differential Revision: D21479015

fbshipit-source-id: ddcc7f5c7ede6bb1e9c73d058779805875b09200
2020-05-11 09:50:01 -07:00
Jun Wu
f014f86b7a dag: move NameDag algorithms to a trait
Summary:
This makes it easier to add an "in-memory-only" NameDag with all the algorithms
implemented.

Reviewed By: sfilipco

Differential Revision: D21479020

fbshipit-source-id: c1a73e95f3291c273c800650f70db2a7eb0966d7
2020-05-11 09:49:56 -07:00
Meyer Jacobs
d49ac73f4c datastore: remove HgIdDataStore ::get_delta and ::get_delta_chain
Summary:
Remove HgIdDataStore::get_delta and all implementations. Remove HgIdDataStore::get_delta_chain from the trait, remove all unnecessary implementations, and remove all implementations from the public Rust API. Leave the Python API and introduce "delta-wrapping".

MutableDataPack::get_delta_chain must remain in some form, as it is necessary to implement get using a sequence of Deltas. It has been moved to a private inherent impl.

DataPack::get_delta_chain must remain in some form for the same reasons, and in fact both implementations can probably be merged, but it is also used in repack.rs for the free function repack_datapack. There are a few ways to address this without making DataPack::get_delta_chain part of the public API. I've currently chosen to make the method pub(crate), i.e. visible only within the revisionstore crate. Alternatively, we could move the repack_datapack function to a method on DataPack, or use a trait in a private module, or some other technique to restrict visibility to only where necessary.

UnionDataStore::get has been modified to call get on its sub-stores and return the first which matches the given key.

MultiplexDeltaStore has been modified to implement get similarly to UnionDataStore.
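
The "first matching sub-store wins" lookup can be sketched in Python as
follows. `DictStore` and `UnionStore` are hypothetical stand-ins, and using
`None` for "not found" is an assumption of this sketch, not the Rust
signature.

```python
class DictStore:
    """Hypothetical sub-store: a plain mapping from key to bytes."""

    def __init__(self, data):
        self._data = dict(data)

    def get(self, key):
        return self._data.get(key)  # None means "not found"


class UnionStore:
    """Queries sub-stores in order and returns the first hit."""

    def __init__(self, stores):
        self._stores = list(stores)

    def get(self, key):
        for store in self._stores:
            value = store.get(key)
            if value is not None:
                return value
        return None  # no sub-store has the key
```

The order of the sub-stores matters: if two of them hold the same key, the
earlier one shadows the later one.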

Reviewed By: xavierd

Differential Revision: D21356420

fbshipit-source-id: d04e18a0781374a138395d1c21c3687897223d15
2020-05-07 11:04:01 -07:00
Jun Wu
44c8c7a9e3 transaction: write hgrc to metalog
Summary:
This allows us to understand what config is used during a transaction.
For example, is `selectivepull` enabled during a `pull`?

Reviewed By: DurhamG

Differential Revision: D21222146

fbshipit-source-id: a8c82f2b02e9657885947a706f728e28b1bfc1e2
2020-05-06 12:15:36 -07:00
Durham Goode
e67d609e1d configs: validate dynamic configs
Summary:
Adds Python logic for validating the dynamic configs. Any dynamic
configs that don't match the given list of rc files will be reported and removed.

Reviewed By: quark-zju

Differential Revision: D21310919

fbshipit-source-id: 07f584bba990f1b01347dfbc285e3ca814fe5c5a
2020-05-05 18:19:09 -07:00
Jun Wu
5b881f086f pyzstore: further reduce cpython_ext::Bytes usage
Summary: This avoids data copies.

Reviewed By: DurhamG

Differential Revision: D21213075

fbshipit-source-id: 9575173f163d71543affabd9861931c11086f40a
2020-05-01 14:24:52 -07:00
Jun Wu
73ff6559e6 zstore: add simple caching
Summary: Add simple caching so zstore can avoid some zstd calculation.
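
As a rough illustration of the idea (not zstore's actual design), the sketch
below caches decompressed blobs so that repeated reads skip the decompression
step. It uses `zlib` from the Python standard library in place of zstd, and
`CachedBlobStore` is a hypothetical name.

```python
import zlib
from functools import lru_cache


class CachedBlobStore:
    """Stores compressed blobs and caches the decompressed results."""

    def __init__(self, maxsize=128):
        self._blobs = {}
        self.decompress_calls = 0  # counts real decompressions, for demonstration
        # Wrap the bound method so each store instance has its own cache.
        self._cached = lru_cache(maxsize=maxsize)(self._decompress)

    def insert(self, key, data):
        # Assumes keys are never rewritten (e.g. content-addressed),
        # so cached entries cannot go stale.
        self._blobs[key] = zlib.compress(data)

    def _decompress(self, key):
        self.decompress_calls += 1
        return zlib.decompress(self._blobs[key])

    def get(self, key):
        return self._cached(key)
```

Reading the same key twice performs only one decompression; the second read
is served from the cache.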

Reviewed By: DurhamG

Differential Revision: D21213076

fbshipit-source-id: 5e3152949cf4e6d6193c3ef3401f24e2efac5620
2020-05-01 14:24:52 -07:00
Carolyn Busch
4eeab3b81b Update cpython to 0.5
Summary:
D21270958 updated the cpython, python27-sys, and python3-sys crates to 0.5. Update
the Mercurial cargo dependencies to match.

Reviewed By: xavierd

Differential Revision: D21281875

fbshipit-source-id: ccad68749a25d11240351b5faeef27cb9c693456
2020-04-28 11:47:41 -07:00
Jun Wu
d479053954 metalog: support exporting to a git repo
Summary:
I wanted to figure out "who added this visible head", "what is the difference
between this metalog root and that root". Those are actually source control
operations (blame, diff). Add a git export feature so we can export metalog
to git to run those queries.

Choosing git here as we don't have native Rust utilities to create a more
efficient hg repo yet.

Ideally we can also make hg operate on a metalog directory as a "metalogrepo"
directly. However that seems to be quite difficult right now due to poor
abstractions.

Reviewed By: DurhamG

Differential Revision: D21213073

fbshipit-source-id: 4cc0331fbad6e1586907c0a66c18bcc25608ea49
2020-04-27 20:25:25 -07:00
Jun Wu
3df5fcf779 pymetalog: add handy APIs for debugshell
Summary:
This makes metalog easier to use in debugshell context. For example, to
investigate the "bookmarks" in the past, the code gets simplified from:

  roots = b.metalog.metalog.listroots(repo.svfs.join('metalog'))
  past_ml = b.metalog.metalog(repo.svfs.join('metalog'), roots[10])
  past_ml.get("bookmarks")

to:

  roots = ml.roots()
  past_ml = ml.checkout(roots[10])
  past_ml.get("bookmarks")

Reviewed By: DurhamG

Differential Revision: D21162568

fbshipit-source-id: 7cc5581afe596a3d2696311a36ac11caa718428a
2020-04-27 20:04:18 -07:00
Jun Wu
a0207c4542 metalog: expose root id API
Summary: This allows the Python world to obtain the root ID for logging purposes.

Reviewed By: DurhamG

Differential Revision: D21179513

fbshipit-source-id: 3f289c06d3d470ff492de39fa985203b3facbf00
2020-04-27 19:50:58 -07:00