Summary: Add appropriate EdenAPI calls to allow for HTTP tree fetching in `treemanifest`. Enabling `remotefilelog.http` essentially reroutes `_prefetch` and `getdesignatednodes` to their HTTP equivalents.
Reviewed By: DurhamG
Differential Revision: D23877319
fbshipit-source-id: 8a71934b47d07d2655fa46c103a14fb99e2f7b1f
Summary: Use the functionality from D23910534 (721f5af278) to set a timeout for EdenAPI requests, configured via the `edenapi.timeout` option.
Reviewed By: DurhamG
Differential Revision: D23911552
fbshipit-source-id: 4a6e3de1094d0faa1daaf6fe4b9b7aafb37a25a8
Summary:
Cast `names` and `nodes` to lists. The reason that they are sets is deduplication, but otherwise the code doesn't rely on them being sets (and in fact casts them to lists at multiple points later).
The main motivation for this is to allow these to be passed to Rust code later. The Rust bindings make a distinction between sequence types and unordered types, so passing a set in place of a list would result in a type error.
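A minimal sketch of the pattern, using made-up values. The helper name `dedupe_to_list` is hypothetical, and sorting is only used here to make the resulting order deterministic:

```python
def dedupe_to_list(items):
    """Deduplicate and return a list (a sequence type), not a set.

    Hypothetical helper for illustration only.
    """
    return sorted(set(items))


names = dedupe_to_list(["foo/bar", "foo", "foo/bar"])
nodes = dedupe_to_list([b"\x02" * 20, b"\x01" * 20, b"\x02" * 20])

# Both values are plain lists now, so a binding that expects a sequence
# would accept them where it would reject a set.
print(names)
```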
Reviewed By: DurhamG
Differential Revision: D23893108
fbshipit-source-id: 9ce2addb824867bcb2d24ba14c589b8791a156e8
Summary: Add the ability to set a timeout on HTTP requests. Equivalent to [`CURLOPT_TIMEOUT_MS`](https://curl.haxx.se/libcurl/c/CURLOPT_TIMEOUT_MS.html).
Reviewed By: DurhamG
Differential Revision: D23910534
fbshipit-source-id: a7aec792ec3c122a01aa44fcfe2e2df6e3a111fc
Summary: These concatenated strings and bytes and needed to be fixed.
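For illustration, a small sketch of this kind of failure on Python 3 and one way to avoid it. The function and values are hypothetical and the UTF-8 encoding is an assumption; they are not taken from the fixed code:

```python
def describe(node_hex, name):
    """Build a text message from a str and a path that may be bytes.

    Hypothetical example: bytes must be decoded before being joined
    with str on Python 3.
    """
    if isinstance(name, bytes):
        name = name.decode("utf-8")
    return "fetching " + name + " at " + node_hex


# Mixing the two types directly raises TypeError on Python 3.
try:
    "fetching " + b"foo/bar"
except TypeError as e:
    mixed_error = str(e)

message = describe("8a71934b", b"foo/bar")
```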
Reviewed By: kulshrax
Differential Revision: D23907301
fbshipit-source-id: 0008d7d54469266ecbae8ddaaa7625820f62cb7e
Summary:
There are several places in the HTTP client where we log and discard errors. (Typically, these are "this should never happen" type situations.)
Previously, these were logged at the `trace` log level, meaning that in practice no one would ever know if we did hit these errors.
Let's upgrade them to `error` so that they'll be printed out. (In theory, users should never see these error messages unless something has gone horribly wrong.)
Reviewed By: DurhamG
Differential Revision: D23888268
fbshipit-source-id: 9007205f946ebb0127238c76812cf62524878047
Summary:
This test relies on precise timing of racing processes, which makes it flaky
in our automated tests. Since it is mainly about revlog-based repos, which we
only have on the servers now and are going to delete soon, let's
delete this test to make our tests more stable.
Reviewed By: kulshrax
Differential Revision: D23908316
fbshipit-source-id: 3fd110a8267d3bc16bbcb4545b9ff921274f7588
Summary: Linkrevs are on their way out. Let's switch to linknodes.
Reviewed By: quark-zju
Differential Revision: D23765176
fbshipit-source-id: 0dc1e0db11d732ce1edd24d863f32f08a5a5ce42
Summary:
The rust contentstore doesn't allow runtime manipulation of the list of
stores, which is required in order to insert the bundle store into the store.
Let's continue using the old python union store in these cases. This still lets
us delete the python pack code later, but we'll have to keep around the python
union store until we come up with a better solution.
Reviewed By: quark-zju
Differential Revision: D23689630
fbshipit-source-id: 0c60e6f268e81804149afa24409f64b5303e1e34
Summary:
The remaining test failures are mostly around bundle support, which
I'll fix in a later diff.
Reviewed By: quark-zju
Differential Revision: D23664037
fbshipit-source-id: 2bdde3cb4fcded6e0cf3afdc23269662544821df
Summary:
The high level prefetch trees API had a depth parameter, but the Rust
prefetch path doesn't support that. In the long run we probably want to get rid
of the depth parameter (or make it more useful), so for now let's get rid of it
from the function signature. You can still set the depth via config, and the few
places that needed depth are changed to use the config.
Reviewed By: quark-zju
Differential Revision: D23772384
fbshipit-source-id: a037d7207d4076a47368366ef7fd2dc1cfbf5cfb
Summary:
Treemanifest needs to be able to write to the shared stores from paths
other than just prefetch (like when it receives certain trees via a standard
pull). To make this possible we need to expose the Rust shared mutable stores.
This will also make general integration with Python cleaner.
In the future we can get rid of the non-prefetch download paths and remove this.
Reviewed By: quark-zju
Differential Revision: D23772385
fbshipit-source-id: c1e67e3d21b354b85895dba8d82a7a9f0ffc5d73
Summary:
Introduce separate wire types to allow protocol evolution and client API changes to happen independently.
* Duplicate `*Request`, `*Entry`, `Key`, `Parents`, `RepoPathBuf`, `HgId`, and `revisionstore_types::Metadata` types into the `wire` module. The versions in the `wire` module are required to have proper `serde` annotations, `Serialize` / `Deserialize` implementations, etc. These have been removed from the original structs.
* Introduce infallible conversions from "API types" to "wire types" with the `ToWire` trait and fallible conversions from "wire types" to "API types" with the `ToApi` trait. API -> wire conversions should never fail in a binary that builds successfully, but wire -> API conversions can fail in the case that the server and client are using different versions of the library. This will cause, for instance, a newly-introduced enum variant used by the client to be deserialized into the catch-all `Unknown` variant on the server, which won't generally have a corresponding representation in the API type.
* Cleanup: remove `*Response` types, which are no longer used anywhere.
* Introduce a `map` method on `Fetch` struct which allows a fallible conversion function to be used to convert a `Fetch<T>` to a `Fetch<U>`. This function is used in the edenapi client implementation to convert from wire types to API types.
* Modify `edenapi_server` to convert from API types to wire types.
* Modify `edenapi_cli` to convert back to wire types before serializing responses to disk.
* Modify `make_req` to use `ToWire` for converting API structs from the `json` module to wire structs.
* Modify `read_res` to use `ToApi` to convert deserialized wire types to API types with the necessary methods for investigating the contents (`.data()`, primarily). It will print an error message to stderr if it encounters a wire type which cannot be converted into the corresponding API type.
* Add some documentation about protocol conventions to the root of the `wire` module.
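The conversion asymmetry described above can be sketched as follows. This is a Python analogy, not the Rust code from this diff; every type and function name below is hypothetical:

```python
from enum import Enum


class ApiFileType(Enum):
    """Hypothetical "API type": only values this build understands."""
    REGULAR = "regular"
    EXECUTABLE = "executable"


class WireFileType(Enum):
    """Hypothetical "wire type": adds a catch-all for unknown values."""
    REGULAR = "regular"
    EXECUTABLE = "executable"
    UNKNOWN = "unknown"


class WireToApiError(Exception):
    pass


def to_wire(api_value):
    """API -> wire: every API value has a wire representation."""
    return WireFileType(api_value.value)


def wire_from_raw(raw):
    """Deserialize; values from a newer peer fall into UNKNOWN."""
    try:
        return WireFileType(raw)
    except ValueError:
        return WireFileType.UNKNOWN


def to_api(wire_value):
    """Wire -> API: fails when there is no corresponding API value."""
    if wire_value is WireFileType.UNKNOWN:
        raise WireToApiError("no API representation for %r" % wire_value)
    return ApiFileType(wire_value.value)


roundtrip = to_api(to_wire(ApiFileType.REGULAR))
from_newer_peer = wire_from_raw("symlink")  # variant this build lacks
```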
Reviewed By: kulshrax
Differential Revision: D23224705
fbshipit-source-id: 88f8addc403f3a8da3cde2aeee765899a826446d
Summary: Add log messages for debugging using the `tracing` crate, which allows them to be enabled via `env_logger`.
Reviewed By: quark-zju
Differential Revision: D23858076
fbshipit-source-id: a8ef1afac6c9ecbfb5d6d78232aa0d03a2fe2054
Summary: Log HTTP stats to stderr to assist with ad-hoc debugging. Will not be printed unless `RUST_LOG` is set appropriately.
Reviewed By: quark-zju
Differential Revision: D23858077
fbshipit-source-id: 39acf3de3fd0ca4403a986eb5373a6a79f1d004a
Summary: Similar to D23819023 (c96de76ac0) but works on Python 2, too.
Reviewed By: DurhamG
Differential Revision: D23858273
fbshipit-source-id: b15be07c8657bc8cb37960b631f2b31e4a78892b
Summary:
test-commitcloud-sync.t is a new change and just needs to be made cross
platform.
I have no idea how test-common-commands-fb.t ever worked. When HGRCPATH is set,
I expect the system hgrc to not be loaded, and therefore we can't run hg-clone.
Let's just unset it, since this is meant to test if the new Mercurial can
execute a clone. Ideally we'd redirect the system hgrc to the in-repo
staticfiles, but that's more effort.
Reviewed By: singhsrb
Differential Revision: D23869645
fbshipit-source-id: 66669d9fd9c3a23b01bc43b365723185b7b2ed33
Summary:
Move some commit cloud operations under the infinitepush read path:
* `hg cloud check` command
* `hg cloud sync` command when the local repo is clean
* `hg cloud switch` command will normally use the read path for the dest workspace because we clean up the repo before performing the switch
* `hg cloud rejoin` command we use in fbclone will normally go through the read path as it runs in a fresh repo
If something is broken, there is always a way to rerun any of these commands with the `--dest` flag pointing to the write path.
```
./hg cloud check -r 0c9596fd1 --remote --dest infinitepush-write
./hg cloud sync --dest infinitepush-write
./hg cloud switch -w other --dest infinitepush-write
```
These use cases are limited, and the lag of the forward filler shouldn't be noticeable for them, but we will be able to collect more signal on how Mononoke performs with Commit Cloud.
Sitevar to control the routing of read traffic:
https://www.internalfb.com/intern/sv/HG_SSH_WRAPPER_MONONOKE_ROLLOUT/#revisions_list
Reviewed By: mitrandir77
Differential Revision: D23840914
fbshipit-source-id: 40fbe2e72756e7a4cf8bc5be6a0b94f6cf4906b4
Summary:
With the segmented changelog backend, revs can change even if len(repo)
didn't change, so cached revs might not get invalidated properly. Let's cache
head nodes instead.
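A toy illustration of why a cached rev can go stale while a cached node stays valid. `ToyChangelog` is a hypothetical stand-in and does not reflect Mercurial's real changelog API:

```python
class ToyChangelog(object):
    """Toy stand-in for a changelog; index in the list == rev number."""

    def __init__(self, nodes):
        self._nodes = list(nodes)

    def __len__(self):
        return len(self._nodes)

    def rev(self, node):
        return self._nodes.index(node)

    def node(self, rev):
        return self._nodes[rev]

    def renumber(self, nodes):
        """Reassign rev numbers without changing the number of commits."""
        assert sorted(nodes) == sorted(self._nodes)
        self._nodes = list(nodes)


cl = ToyChangelog([b"a", b"b", b"c"])
cached_head_rev = cl.rev(b"c")   # fragile: a rev number
cached_head_node = b"c"          # stable: a node

cl.renumber([b"a", b"c", b"b"])  # len(cl) is unchanged

stale_node = cl.node(cached_head_rev)  # the old rev now means b"b"
fresh_rev = cl.rev(cached_head_node)   # resolved from the node on demand
```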
Reviewed By: DurhamG
Differential Revision: D23856176
fbshipit-source-id: c5154c536298c348b847a12de8c4f582f877f96e
Summary:
On Ubuntu the output is a bit different:
```
$ hg cloud sync --use-bgssh
commitcloud: synchronizing 'server' with 'user/test/default'
- remote: /bin/sh: trashssh: command not found
- abort: no suitable response from remote hg!
+ remote: /bin/sh: 1: trashssh: not found
+ abort: no suitable response from remote hg: '[Errno 32] Broken pipe'!
```
Glob them out to make the test pass.
Reviewed By: DurhamG
Differential Revision: D23824735
fbshipit-source-id: 7f96149ee16daff31fd0a1c68975b5edfa27cc46
Summary:
It seems OSX python2 has SIGINT handler set to SIG_IGN by default when running
inside tests. Detect that and reset SIGINT handler to raise KeyboardInterrupt.
This fixes test-ctrl-c.t on OSX.
While we're here, improve test-ctrl-c.t so it checks a few more things and runs
quicker.
Reviewed By: DurhamG
Differential Revision: D23853455
fbshipit-source-id: 05c47650bc80f9880f724828d307c32786265e2c
Summary:
Phabstatus for smartlog uses `PeekaheadList` rather than `PeekaheadRevsetIterator` as
all of the commits are known ahead of time, and we don't need to collect together
batches as we iterate across the revset.
However, we should still batch up requests to Phabricator, as users with very high
numbers of commits in their smartlog may hit timeouts.
Add a batching mechanism to `PeekaheadList` that splits the list into chunks to
return with each peekahead.
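A rough sketch of the chunking idea. The class below is hypothetical and is not the actual `PeekaheadList` implementation; the method name and chunk size are assumptions:

```python
class ChunkedPeekahead(object):
    """Iterate over a known list while exposing the current chunk."""

    def __init__(self, items, chunksize):
        self._items = list(items)
        self._chunksize = chunksize
        self._pos = 0

    def peekahead(self):
        """Return the chunk that contains the current position."""
        start = self._pos - (self._pos % self._chunksize)
        return self._items[start:start + self._chunksize]

    def __iter__(self):
        for i, item in enumerate(self._items):
            self._pos = i
            yield item


commits = ChunkedPeekahead(["c1", "c2", "c3", "c4", "c5"], chunksize=2)
batches = []
for commit in commits:
    batch = commits.peekahead()
    # A real caller would issue one request per new batch.
    if batch not in batches:
        batches.append(batch)
```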
Reviewed By: liubov-dmitrieva
Differential Revision: D23840071
fbshipit-source-id: 68596c7eb4f7404ce6109e69914f328565e34582
Summary:
This provides a way to fix the local cache of backed up heads if it is in an
invalid state.
Most importantly, it will allow early dogfooding of write traffic from Mononoke
without the reverse filler in place, for developers or for the team.
Assuming the repo is backfilled, you can just run `hg cloud backup -f` to fix
any inconsistency when switching between the two backends.
Reviewed By: markbt
Differential Revision: D23840162
fbshipit-source-id: bbd331162d65ba193c4774e37324f15ed0635f82
Summary:
For Python 3 we must ensure that the displayer messages have all been converted
to unicode before providing them to the Rust graph renderer.
This is because the Python 3 version of `encoding.unifromlocal` is a no-op, so
the result may still be `bytes` that need to be converted to `str`.
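A minimal sketch of such a conversion step. The helper name and the choice of UTF-8 with replacement are assumptions; the real code would use Mercurial's own encoding handling:

```python
def ensure_text(value, encoding="utf-8"):
    """Return `value` as str, decoding it first if it is bytes.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, bytes):
        return value.decode(encoding, "replace")
    return value


lines = [b"commit 1", "commit 2", b"caf\xc3\xa9"]
texts = [ensure_text(line) for line in lines]
```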
Reviewed By: quark-zju
Differential Revision: D23827233
fbshipit-source-id: 8f2b707ceceb210c0a2b5b589b99d4016452c61c
Summary:
D23759711 (be51116cf4) changed the way signal handlers work, which apparently causes
this test to fail. The SIGCHLD signal of the child changing state is received
during os.waitpid, which apparently counts as a signal during a system call,
which throws an OSError.
I'm not sure what the real fix should be. Sleeping gets us past the issue, since
presumably the signal is handled before the system call.
Reviewed By: quark-zju
Differential Revision: D23832606
fbshipit-source-id: 70fca19e419da55bbf546b8530406c9b3a9a6d77
Summary:
This simplifies the code a bit, and avoids creating tokio Runtime multiple
times.
Reviewed By: kulshrax
Differential Revision: D23799642
fbshipit-source-id: 21cee6124ef6f9ab6e165891d9ee87b2feb553ac
Summary:
Exercises the PyStream type from cpython-async.
`hg dbsh`:
In [1]: s,f=api._rustclient.commitdata('fbsource', list(repo.nodes('master^^::master')))
In [2]: s
Out[2]: <stream at 0x7ff2db700690>
In [3]: it=iter(s)
In [4]: next(it)
Out[4]: ('6\xf9\x18\xe4\x1c\x05\xfc\xb0\xd3\xb2\xe9\xec\x18E\xec\x0f\x1a:\xb7\xcd', ...)
In [5]: next(it)
Out[5]: ('}\x1f(\xe1o\xf1a\x9b\x81\xb9\x83}\x1b\xbbt\xd2e\xb1\xedb',...)
In [6]: next(it)
Out[6]: ('\xf1\xf0f\x97<\xf3\xdd\xe41w>\x92\xd1\xc0\x9ah\xdd\x87~^',...)
In [7]: next(it)
StopIteration:
In [8]: f.wait()
Out[8]: <bindings.edenapi.stats at 0x7ff2e006a3d8>
In [9]: str(Out[8])
Out[9]: '2.42 kB downloaded in 165 ms over 1 request (0.01 MB/s; latency: 165 ms)'
In [10]: iter(s)
ValueError: stream was consumed
Reviewed By: kulshrax
Differential Revision: D23799645
fbshipit-source-id: 732a5da4ccdee4646386b6080408c0d8958dd67f
Summary:
Exercises the PyFuture type from cpython-async.
`hg dbsh`:
In [1]: api._rustclient.commitdata('fbsource', list(repo.nodes('master^^::master')))
Out[1]:
([...], <future at 0x7f7b65d05060>)
In [2]: f=Out[1][-1]
In [3]: f.wait()
Out[3]: <bindings.edenapi.stats at 0x7f7b665e8228>
In [4]: f.wait()
ValueError: future was awaited
In [5]: str(Out[3])
Out[5]: '2.42 kB downloaded in 172 ms over 1 request (0.01 MB/s; latency: 171 ms)'
Reviewed By: kulshrax
Differential Revision: D23799643
fbshipit-source-id: d4fcef7dca58bc4902bb0809adc065493bb94bd3
Summary:
Add a `PyFuture<F>` type that can be used as the return type in binding functions.
It converts a Rust Future to a Python object with a `wait` method so Python
can access the value stored in the future.
Unlike `TStream`, it's currently only designed to support Rust->Python one
way conversion so it looks simpler.
Reviewed By: kulshrax
Differential Revision: D23799644
fbshipit-source-id: da4a322527ad9bb4c2dbaa1c302147b784d1ee41
Summary:
The exposed type can be used as a Python iterator:
    for value in stream:
        ...
The Python type can be used as input and output parameters in binding functions:
    # Rust
    type S = TStream<anyhow::Result<X>>;
    def f1() -> PyResult<S> { ... }
    def f2(x: S) -> PyResult<S> { Ok(x.stream().map_ok(...).into()) }
    # Python
    stream1 = f1()
    stream2 = f2(stream1)
This crate is similar to `cpython-ext`: it does not define actual business
logic exposed by `bindings` module. So it's put in `lib`, not
`bindings/modules`.
Reviewed By: markbt
Differential Revision: D23799641
fbshipit-source-id: c13b0c788a6465679b562976728f0002fd872bee
Summary:
See the previous diff for context. Move the error handling and ipdb logic to
the background thread so it can show proper traceback.
Reviewed By: kulshrax
Differential Revision: D23819022
fbshipit-source-id: 8ddae019ab939d8fb2c89afca2a7769094ebe26a
Summary:
With D23759710 (34d8dca79a), the main command was moved to a background thread, but the
error handling wasn't. That can cause less useful tracebacks like:
    Traceback (most recent call last):
      File "dispatch.py", line 698, in _callcatch
        return scmutil.callcatch(ui, func)
      File "scmutil.py", line 147, in callcatch
        return func()
      File "util.py", line 4358, in wrapped
        raise value
Set `e.__traceback__` so `raise e` preserves the traceback information.
This only works on Python 3. On Python 2 it is possible to use
`raise exctype, excvalue, tb`. But that's invalid Python 3 code. I'm
going to fix Python 2 traceback differently.
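A Python 3 only sketch of carrying a traceback from a background thread to the caller. The wrapper below is hypothetical and much simpler than the real one in `util.py`; note that Python 3 usually populates `__traceback__` on its own when an exception is caught, so the explicit assignment just mirrors the approach described above:

```python
import sys
import threading
import traceback


def run_in_background(func):
    """Run func in a thread and re-raise its exception in the caller."""
    outcome = {}

    def target():
        try:
            outcome["value"] = func()
        except BaseException as e:
            # Attach the traceback captured in this thread to the
            # exception object so it travels with it.
            e.__traceback__ = sys.exc_info()[2]
            outcome["error"] = e

    thread = threading.Thread(target=target)
    thread.start()
    thread.join()
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


def failing():
    raise ValueError("boom")


try:
    run_in_background(failing)
except ValueError as e:
    # Frame names, outermost first, including the background frames.
    frames = [frame.name for frame in traceback.extract_tb(e.__traceback__)]
```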
Reviewed By: kulshrax
Differential Revision: D23819023
fbshipit-source-id: 953ac8bd6108f4c0dae193607bee3f931c2bd13e
Summary:
The parameter `mtimethreshold` should be used instead of a constant of 14 days.
This fixes an issue where sigtrace output takes a lot of space in hg rage
output.
Reviewed By: DurhamG
Differential Revision: D23819021
fbshipit-source-id: e639b01d729463a4822fa93604ce3a038fbd4a9a
Summary:
`filter` returns a filter object, so the second time we iterate, it is empty.
I believe this only happens on Python 3, so the migration to py3 broke it.
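The behavior is easy to reproduce on Python 3. The values below are made up, and wrapping the result in `list` is one possible fix, not necessarily the one used here:

```python
evens = filter(lambda x: x % 2 == 0, range(6))
first = list(evens)   # consumes the iterator
second = list(evens)  # already exhausted

# Materializing the result once makes repeated iteration safe.
evens_list = list(filter(lambda x: x % 2 == 0, range(6)))
again_first = list(evens_list)
again_second = list(evens_list)
```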
Reviewed By: markbt
Differential Revision: D23815206
fbshipit-source-id: 1a6503b2bbfd44959307c189d17dec9b5d5ff991
Summary:
During an hg update we first prefetch all the data, then write all the
data to disk. There are cases where the prefetched data is not available during
the writing phase, in which case we fall back to fetching the files one-by-one.
This has truly atrocious performance.
Let's allow the worker threads to check for missing data then do bulk fetching
of it. In the case where the cache was completely lost for some reason, this
would reduce the number of serial fetches by 100x.
Note, the background workers already spawn their own ssh connections, so
they're already getting some level of parallelism even when they're doing 1-by-1
fetching. That's why we aren't seeing a 100x improvement in performance.
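A simplified sketch of the check-then-bulk-fetch idea. The dict-based cache and the `bulk_fetch` callback are hypothetical stand-ins for the real stores and fetch path:

```python
def read_all(keys, cache, bulk_fetch):
    """Return the content for every key, fetching misses in one batch.

    `cache` is a dict and `bulk_fetch` takes a list of keys and returns
    a dict; both are hypothetical stand-ins for the real stores.
    """
    missing = [key for key in keys if key not in cache]
    if missing:
        cache.update(bulk_fetch(missing))
    return [cache[key] for key in keys]


requests = []


def fake_bulk_fetch(keys):
    # Record each request so the number of round trips is visible.
    requests.append(list(keys))
    return {key: b"data for " + key for key in keys}


cache = {b"a": b"data for a"}  # pretend most of the prefetch was lost
contents = read_all([b"a", b"b", b"c"], cache, fake_bulk_fetch)
```

Here the two missing keys are fetched with a single request instead of one request per key.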
Reviewed By: xavierd
Differential Revision: D23766424
fbshipit-source-id: d88a1e55b1c21e9cea7e50fc6dbfd8a27bd97bb0
Summary:
Automigration gets messed up with the `hg cloud rejoin` command in the fbclone code because it is triggered by the pull command.
As a result, fbclone ends up joining a hostname workspace instead of the default one in some cases.
* make sure that the migration never runs if background commit cloud operations are disabled
* also, skip the migration in the pull command in fbclone
Either one of those would be enough to fix the issue, but I prefer to make both
changes.
Reviewed By: markbt
Differential Revision: D23813184
fbshipit-source-id: 3b49a3f079e889634e3c4f98b51557ca0679090b
Summary:
The children revset iterated over everything in the subset, which in
many cases was the entire repo. This can take hundreds of milliseconds. Let's
use the new _makerangeset to only iterate over descendants of the parentset.
Reviewed By: quark-zju
Differential Revision: D23794344
fbshipit-source-id: 9ac9bc014d56a95b5ac65534769389167b0f4508
Summary:
Now that Mercurial itself can properly handle SIGINT, there isn't a need for a Python wrapper around the Rust EdenAPI client (since the main purpose of the wrapper was to ensure proper SIGINT handling--something that could only be done in Python).
Note that while this does remove some code that prints out certificate warnings, that code was actually broken after the big refactor of the Rust bindings. (The exception types referenced no longer exist, so the code would simply result in a `NameError` if it actually tried to catch an exception from the Rust client.)
Reviewed By: singhsrb
Differential Revision: D23801363
fbshipit-source-id: 3359c181fd05dbec24d77fa1b7d9c8bd821b49a6
Summary:
This extends the Ctrl+C special handling from edenapi to the entire Python
command so Ctrl+C should be able to exit the program even if it's running
some blocking Rust functions.
`edenapi` no longer needs to spawn threads for fetching.
Reviewed By: singhsrb
Differential Revision: D23759710
fbshipit-source-id: cbaaa8e5f93d8d74a8692117a00d9de20646d232
Summary:
Move a bunch of code (SCM daemon related options) into a separate file, out of
cloud sync.
Also introduce an additional check that the `hg cloud sync` command the SCM daemon
runs is intended for the currently connected workspace.
In theory, when we switch a subscription, the SCM daemon gets notified, but races are possible, so it is better to have this additional check to ensure the SCM daemon triggers cloud sync only where it is supposed to.
Reviewed By: markbt
Differential Revision: D23783616
fbshipit-source-id: b91a8b79189b7810538c15f8e61080b41abde386
Summary:
The config is not actually used any more (with rust-commits, it is forced on; without rust-commits,
there is no point in keeping it on). Therefore, remove it.
Reviewed By: singhsrb
Differential Revision: D23771570
fbshipit-source-id: ad3e89619ac5e193ef552c25fc064ca9eddba0c6
Summary:
See the previous diff for context. This allows the code to run from non-main
thread.
Reviewed By: singhsrb
Differential Revision: D23759712
fbshipit-source-id: 044193a9d7193488c700d769da9ad68987356d69
Summary:
The idea is to extend D22703916 (61712e381c)'s way of calling functions from just edenapi to
the entire command for better Ctrl+C handling. Some code paths (ex. pager,
crecord) use `signal.signal` and `signal.signal` does not work from non-main
thread.
To work around the `signal.signal` limitation, we pre-register all signals we care
about in the main thread to a special handler. The special handler reads a
global variable to decide what to do. Other threads can modify that global
variable to affect what the special signal handler does, therefore indirectly
"registering" their handlers.
Reviewed By: kulshrax
Differential Revision: D23759711
fbshipit-source-id: 8ba389072433e68a36360db6a1b17638e40faefa
Summary:
Before this change, for a long-running function wrapped by 'threaded',
it might:
    background thread> start
    main thread> receive SIGINT, raise KeyboardInterrupt
    main thread> raise at 'thread.join(1)'
    main thread> exiting, but wait for threads to complete (Py_Finalize)
    background thread> did not receive KeyboardInterrupt, continue running
    main thread> continue waiting for background thread
Teach `thread.join(1)` to forward the `KeyboardInterrupt` (or its subclass
`error.SignalInterrupt`) to the background thread, so the background thread
_might_ stop. Also, mark the background thread as a daemon so it won't
be waited on at exit.
Reviewed By: kulshrax
Differential Revision: D23759713
fbshipit-source-id: 91893d034f1ad256007ab09b7a8b974325157ea5
Summary:
Move the wrapper to util.py. It'll be used in dispatch.py to make the entire
command Ctrl+C friendly.
Reviewed By: singhsrb
Differential Revision: D23759715
fbshipit-source-id: fa2098362413dcfd0b68e05455aad543a6980907
Summary: This will be used to test Ctrl+C handling with native code.
Reviewed By: kulshrax
Differential Revision: D23759714
fbshipit-source-id: 50da40d475b80da26b7dbc654e010d77cb0ad2d1
Summary: This makes it easier to test the API via debugshell.
Reviewed By: kulshrax
Differential Revision: D23750677
fbshipit-source-id: e29284395f03c9848cf90dd2df187e437890c56e