Commit Graph

2813 Commits

Author SHA1 Message Date
Arun Kulshreshtha
613fbc858f hg-http: optionally print stats
Summary: Log HTTP stats to stderr to assist with ad-hoc debugging. Will not be printed unless `RUST_LOG` is set appropriately.

Reviewed By: quark-zju

Differential Revision: D23858077

fbshipit-source-id: 39acf3de3fd0ca4403a986eb5373a6a79f1d004a
2020-09-23 17:19:28 -07:00
Arun Kulshreshtha
31ceb7f0d1 hg-http: use autocargo
Summary: Onboard the crate onto autocargo.

Reviewed By: quark-zju

Differential Revision: D23858075

fbshipit-source-id: 7179ae0f9ca8a1d4e664d7eb5cb614940e2b2c30
2020-09-23 16:40:49 -07:00
Jun Wu
2f5752eda5 util: raise with traceback in Python 2
Summary: Similar to D23819023 (c96de76ac0) but works on Python 2, too.

Reviewed By: DurhamG

Differential Revision: D23858273

fbshipit-source-id: b15be07c8657bc8cb37960b631f2b31e4a78892b
2020-09-23 12:37:44 -07:00
Durham Goode
58667a3fe7 tests: fix test-commitcloud-sync.t and test-common-commands-fb.t
Summary:
test-commitcloud-sync.t is a new change and just needs to be made cross
platform.

I have no idea how test-common-commands-fb.t ever worked.  When HGRCPATH is set,
I expect the system hgrc to not be loaded, and therefore we can't run hg-clone.
Let's just unset it, since this is meant to test if the new Mercurial can
execute a clone. Ideally we'd redirect the system hgrc to the in-repo
staticfiles, but that's more effort.

Reviewed By: singhsrb

Differential Revision: D23869645

fbshipit-source-id: 66669d9fd9c3a23b01bc43b365723185b7b2ed33
2020-09-23 10:58:19 -07:00
Liubov Dmitrieva
c5328d9d0e move some operations under read path
Summary:
Move some commit cloud operations under infinitepush read path:

those are:
* `hg cloud check` command
* `hg cloud sync` command when the local repo is clean
* `hg cloud switch` command will normally use the read path for the dest workspace because we clean up the repo before performing the switch
* `hg cloud rejoin` command we use in fbclone will normally go through the read path as it runs in a fresh repo

If something is broken, there is always a way to rerun any of these commands with the '--dest' flag pointing to the write path.

```
./hg cloud check -r 0c9596fd1 --remote --dest infinitepush-write
./hg cloud sync --dest infinitepush-write
./hg cloud switch -w other --dest infinitepush-write
```

Those use cases are limited and the lag of the forward filler shouldn't be noticeable for them, but we will be able to collect more signal on how Mononoke performs with Commit Cloud.

Sitevar to control the routing of read traffic:
https://www.internalfb.com/intern/sv/HG_SSH_WRAPPER_MONONOKE_ROLLOUT/#revisions_list

Reviewed By: mitrandir77

Differential Revision: D23840914

fbshipit-source-id: 40fbe2e72756e7a4cf8bc5be6a0b94f6cf4906b4
2020-09-23 08:42:13 -07:00
Jun Wu
1b7c3b6a13 localrepo: cache headnodes instead of headrevs
Summary:
With segmented changelog backend, the revs can be changed, even if len(repo)
didn't change. Caching revs might not get invalidated properly. Let's cache
head nodes instead.

Reviewed By: DurhamG

Differential Revision: D23856176

fbshipit-source-id: c5154c536298c348b847a12de8c4f582f877f96e
2020-09-22 18:11:05 -07:00
Jun Wu
27f4f7e94c test-commitcloud-sync: fix test on Ubuntu
Summary:
On Ubuntu the output is a bit different:

```
   $ hg cloud sync --use-bgssh
   commitcloud: synchronizing 'server' with 'user/test/default'
-  remote: /bin/sh: trashssh: command not found
-  abort: no suitable response from remote hg!
+  remote: /bin/sh: 1: trashssh: not found
+  abort: no suitable response from remote hg: '[Errno 32] Broken pipe'!
```

Glob them out to make the test pass.

Reviewed By: DurhamG

Differential Revision: D23824735

fbshipit-source-id: 7f96149ee16daff31fd0a1c68975b5edfa27cc46
2020-09-22 15:21:56 -07:00
Jun Wu
bcbacfebf4 dispatch: ensure SIGINT triggers KeyboardInterrupt
Summary:
It seems OSX python2 has SIGINT handler set to SIG_IGN by default when running
inside tests. Detect that and reset SIGINT handler to raise KeyboardInterrupt.
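
A minimal sketch of the detect-and-reset idea, using only the standard `signal` module. The helper name `ensure_sigint_raises` is hypothetical and is not the function used in dispatch:

```python
import signal
import threading


def ensure_sigint_raises():
    """Reset SIGINT to raise KeyboardInterrupt if it is currently ignored.

    Returns True if the handler was replaced, False otherwise.
    """
    if threading.current_thread() is not threading.main_thread():
        # signal.signal() may only be called from the main thread.
        return False
    if signal.getsignal(signal.SIGINT) == signal.SIG_IGN:
        # default_int_handler is the stock handler that raises KeyboardInterrupt.
        signal.signal(signal.SIGINT, signal.default_int_handler)
        return True
    return False


replaced = ensure_sigint_raises()
```

The helper leaves any custom (non-ignored) handler alone, so it only changes behavior in the SIG_IGN case described above.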

This fixes test-ctrl-c.t on OSX.

As we're here, improve test-ctrl-c.t so it checks a few more things and runs
quicker.

Reviewed By: DurhamG

Differential Revision: D23853455

fbshipit-source-id: 05c47650bc80f9880f724828d307c32786265e2c
2020-09-22 15:10:12 -07:00
Mark Thomas
ee0299cda0 phabstatus: batch peekahead for smartlog
Summary:
Phabstatus for smartlog uses `PeekaheadList` rather than `PeekaheadRevsetIterator` as
all of the commits are known ahead of time, and we don't need to collect together
batches as we iterate across the revset.

However, we should still batch up requests to Phabricator, as users with very high
numbers of commits in their smartlog may hit timeouts.

Add a batching mechanism to `PeekaheadList` that splits the list into chunks to
return with each peekahead.
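
The chunking itself can be sketched as below. This is an illustration only: `chunked` and the batch size of 3 are hypothetical and not taken from the `PeekaheadList` implementation.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Hypothetical smartlog revisions; each batch would become one request.
revs = list(range(7))
batches = list(chunked(revs, 3))
```

Because the whole list is known up front, the batches can be computed by slicing, without buffering while iterating over a revset.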

Reviewed By: liubov-dmitrieva

Differential Revision: D23840071

fbshipit-source-id: 68596c7eb4f7404ce6109e69914f328565e34582
2020-09-22 07:26:18 -07:00
Liubov Dmitrieva
c68e928d6f add --force option to hg cloud backup command to reinitialise the local cache of backed up heads from the server
Summary:
This provides a way to fix the local cache of backed up heads if it is in an
invalid state.

Most importantly, it will allow early dogfooding of write traffic from Mononoke
without the reverse filler in place for developers or for the team.

You could just run `hg cloud backup -f`, assuming the repo is backfilled, to fix
any inconsistency when switching between the two backends.

Reviewed By: markbt

Differential Revision: D23840162

fbshipit-source-id: bbd331162d65ba193c4774e37324f15ed0635f82
2020-09-22 07:12:28 -07:00
Mark Thomas
c55cb1914a cmdutil: ensure rust graph renderer messages are unicode
Summary:
For Python 3 we must ensure that the displayer messages have all been converted
to unicode before providing them to the Rust graph renderer.

This is because the Python 3 version of `encoding.unifromlocal` is a no-op, so
the result may still be `bytes` that need to be converted to `str`.

Reviewed By: quark-zju

Differential Revision: D23827233

fbshipit-source-id: 8f2b707ceceb210c0a2b5b589b99d4016452c61c
2020-09-22 04:04:12 -07:00
Durham Goode
737c07ca24 tests: fix test-fb-hgext-extutil.py on OSX
Summary:
D23759711 (be51116cf4) changed the way signal handlers work, which apparently causes
this test to fail. The SIGCHLD signal of the child changing state is received
during os.waitpid, which apparently counts as a signal during a system call,
which throws an OSError.

I'm not sure what the real fix should be. Sleeping gets us past the issue, since
presumably the signal is handled before the system call.

Reviewed By: quark-zju

Differential Revision: D23832606

fbshipit-source-id: 70fca19e419da55bbf546b8530406c9b3a9a6d77
2020-09-22 03:37:28 -07:00
Jun Wu
ebf708e17a pyedenapi: switch to async_runtime::block_on_future
Summary:
This simplifies the code a bit, and avoids creating tokio Runtime multiple
times.

Reviewed By: kulshrax

Differential Revision: D23799642

fbshipit-source-id: 21cee6124ef6f9ab6e165891d9ee87b2feb553ac
2020-09-21 13:28:07 -07:00
Jun Wu
186151e8f9 pyedenapi: return commit data in a stream fashion
Summary:
Exercises the PyStream type from cpython-async.

`hg dbsh`:

  In [1]: s,f=api._rustclient.commitdata('fbsource', list(repo.nodes('master^^::master')))

  In [2]: s
  Out[2]: <stream at 0x7ff2db700690>

  In [3]: it=iter(s)

  In [4]: next(it)
  Out[4]: ('6\xf9\x18\xe4\x1c\x05\xfc\xb0\xd3\xb2\xe9\xec\x18E\xec\x0f\x1a:\xb7\xcd', ...)

  In [5]: next(it)
  Out[5]: ('}\x1f(\xe1o\xf1a\x9b\x81\xb9\x83}\x1b\xbbt\xd2e\xb1\xedb',...)

  In [6]: next(it)
  Out[6]: ('\xf1\xf0f\x97<\xf3\xdd\xe41w>\x92\xd1\xc0\x9ah\xdd\x87~^',...)

  In [7]: next(it)
  StopIteration:

  In [8]: f.wait()
  Out[8]: <bindings.edenapi.stats at 0x7ff2e006a3d8>

  In [9]: str(Out[8])
  Out[9]: '2.42 kB downloaded in 165 ms over 1 request (0.01 MB/s; latency: 165 ms)'

  In [10]: iter(s)
  ValueError: stream was consumed

Reviewed By: kulshrax

Differential Revision: D23799645

fbshipit-source-id: 732a5da4ccdee4646386b6080408c0d8958dd67f
2020-09-21 13:28:07 -07:00
Jun Wu
cd7f831c6c pyedenapi: return a Future of Stats for commitdata
Summary:
Exercises the PyFuture type from cpython-async.

`hg dbsh`:

    In [1]: api._rustclient.commitdata('fbsource', list(repo.nodes('master^^::master')))
    Out[1]:
    ([...], <future at 0x7f7b65d05060>)

    In [2]: f=Out[1][-1]

    In [3]: f.wait()
    Out[3]: <bindings.edenapi.stats at 0x7f7b665e8228>

    In [4]: f.wait()
    ValueError: future was awaited

    In [5]: str(Out[3])
    Out[5]: '2.42 kB downloaded in 172 ms over 1 request (0.01 MB/s; latency: 171 ms)'

Reviewed By: kulshrax

Differential Revision: D23799643

fbshipit-source-id: d4fcef7dca58bc4902bb0809adc065493bb94bd3
2020-09-21 13:28:07 -07:00
Jun Wu
7f1c05dd74 cpython-async: expose Rust Future to Python
Summary:
Add a `PyFuture<F>` type that can be used as return type in binding function.
It converts Rust Future to a Python object with a `wait` method so Python
can access the value stored in the future.

Unlike `TStream`, it's currently only designed to support one-way Rust->Python
conversion, so it looks simpler.

Reviewed By: kulshrax

Differential Revision: D23799644

fbshipit-source-id: da4a322527ad9bb4c2dbaa1c302147b784d1ee41
2020-09-21 13:28:07 -07:00
Jun Wu
41b200c8d8 cpython-async: expose Rust Stream to Python
Summary:
The exposed type can be used as a Python iterator:

  for value in stream:
      ...

The Python type can be used as input and output parameters in binding functions:

  # Rust
  type S = TStream<anyhow::Result<X>>;
  def f1() -> PyResult<S> { ... }
  def f2(x: S) -> PyResult<S> { Ok(x.stream().map_ok(...).into()) }

  # Python
  stream1 = f1()
  stream2 = f2(stream1)

This crate is similar to `cpython-ext`: it does not define actual business
logic exposed by `bindings` module. So it's put in `lib`, not
`bindings/modules`.

Reviewed By: markbt

Differential Revision: D23799641

fbshipit-source-id: c13b0c788a6465679b562976728f0002fd872bee
2020-09-21 13:28:07 -07:00
Jun Wu
71e99bf8e7 dispatch: run ipdb in the command thread
Summary:
See the previous diff for context.  Move the error handling and ipdb logic to
the background thread so it can show proper traceback.

Reviewed By: kulshrax

Differential Revision: D23819022

fbshipit-source-id: 8ddae019ab939d8fb2c89afca2a7769094ebe26a
2020-09-21 13:15:15 -07:00
Jun Wu
c96de76ac0 util: set traceback if error happens in threaded execution
Summary:
With D23759710 (34d8dca79a), the main command was moved to a background thread, but the
error handling wasn't. That can cause less useful tracebacks like:

  Traceback (most recent call last):
    File "dispatch.py", line 698, in _callcatch
      return scmutil.callcatch(ui, func)
    File "scmutil.py", line 147, in callcatch
      return func()
    File "util.py", line 4358, in wrapped
      raise value

Set `e.__traceback__` so `raise e` preserves the traceback information.
This only works on Python 3. On Python 2 it is possible to use
`raise exctype, excvalue, tb`. But that's invalid Python 3 code. I'm
going to fix Python 2 traceback differently.
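
A rough Python 3 sketch of the idea, not the actual `util.py` code: the `threaded` wrapper below is a simplified stand-in, and it uses `with_traceback`, which sets `__traceback__` on the exception before it is re-raised.

```python
import sys
import threading
import traceback


def threaded(func):
    """Run func in a background thread and re-raise its error in the caller."""

    def wrapped(*args, **kwargs):
        outcome = {}

        def target():
            try:
                outcome["value"] = func(*args, **kwargs)
            except BaseException:
                outcome["exc_info"] = sys.exc_info()

        thread = threading.Thread(target=target)
        thread.start()
        thread.join()
        if "exc_info" in outcome:
            _exctype, value, tb = outcome["exc_info"]
            # Attach the worker's traceback so the frames inside func survive.
            raise value.with_traceback(tb)
        return outcome["value"]

    return wrapped


@threaded
def fail():
    raise ValueError("boom")


try:
    fail()
except ValueError as e:
    frame_names = [frame.name for frame in traceback.extract_tb(e.__traceback__)]
```

With the traceback attached, `frame_names` includes the frames from the background thread (`target` and `fail`) and not only the `raise` site in `wrapped`.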

Reviewed By: kulshrax

Differential Revision: D23819023

fbshipit-source-id: 953ac8bd6108f4c0dae193607bee3f931c2bd13e
2020-09-21 13:15:15 -07:00
Jun Wu
15fb0f4f51 util: fix mtime used in gcdir
Summary:
The parameter `mtimethreshold` should be used instead of a constant of 14 days.
This fixes an issue where sigtrace output takes a lot of space in hg rage
output.

Reviewed By: DurhamG

Differential Revision: D23819021

fbshipit-source-id: e639b01d729463a4822fa93604ce3a038fbd4a9a
2020-09-21 13:15:15 -07:00
Liubov Dmitrieva
2b0829b9f5 fix issue with incorrect update reference request in some cases
Summary:
filter returns a filter object, so the second time we iterate, it is empty

This is only in Python3 I believe, so migration to py3 broke it.
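
A small illustration of the pitfall (the `heads` values are made up):

```python
heads = ["a1", "", "b2", ""]

# Python 3: filter() returns a lazy, single-use iterator.
nonempty = filter(None, heads)
first_pass = list(nonempty)
second_pass = list(nonempty)  # already exhausted, so this is empty

# Materializing the result once makes repeated iteration safe.
nonempty_list = [h for h in heads if h]
```

On Python 2, `filter` returned a list, so the second iteration would still have seen both entries.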

Reviewed By: markbt

Differential Revision: D23815206

fbshipit-source-id: 1a6503b2bbfd44959307c189d17dec9b5d5ff991
2020-09-21 13:15:15 -07:00
Durham Goode
63d19e1eca workers: bulk fetch data in worker thread
Summary:
During an hg update we first prefetch all the data, then write all the
data to disk. There are cases where the prefetched data is not available during
the writing phase, in which case we fall back to fetching the files one-by-one.
This has truly atrocious performance.

Let's allow the worker threads to check for missing data then do bulk fetching
of it. In the case where the cache was completely lost for some reason, this
would reduce the number of serial fetches by 100x.

Note, the background workers already spawn their own ssh connections, so
they're already getting some level of parallelism even when they're doing 1-by-1
fetching. That's why we aren't seeing a 100x improvement in performance.

Reviewed By: xavierd

Differential Revision: D23766424

fbshipit-source-id: d88a1e55b1c21e9cea7e50fc6dbfd8a27bd97bb0
2020-09-21 11:27:12 -07:00
Liubov Dmitrieva
584de33443 fix workspace name for fbclone
Summary:
Automigration gets messed up with the `hg cloud rejoin` command in the fbclone code because it is triggered by the pull command.

As a result, fbclone ends up joining a hostname workspace instead of the default one in some cases.

* make sure that the migration never runs if background commit cloud operations are disabled
* also, skip the migration in the pull command in fbclone

One of those would be enough to fix the issue but I prefer to make both
changes.

Reviewed By: markbt

Differential Revision: D23813184

fbshipit-source-id: 3b49a3f079e889634e3c4f98b51557ca0679090b
2020-09-21 05:09:40 -07:00
Durham Goode
7b4bbc2f64 revset: avoid full repo scan in children revset
Summary:
The children revset iterated over everything in the subset, which in
many cases was the entire repo. This can take hundreds of milliseconds. Let's
use the new _makerangeset to only iterate over descendants of the parentset.

Reviewed By: quark-zju

Differential Revision: D23794344

fbshipit-source-id: 9ac9bc014d56a95b5ac65534769389167b0f4508
2020-09-20 21:43:50 -07:00
Arun Kulshreshtha
683520106e edenapi: remove python wrapper
Summary:
Now that Mercurial itself can properly handle SIGINT, there isn't a need for a Python wrapper around the Rust EdenAPI client (since the main purpose of the wrapper was to ensure proper SIGINT handling--something that could only be done in Python).

Note that while this does remove some code that prints out certificate warnings, that code was actually broken after the big refactor of the Rust bindings. (The exception types referenced no longer exist, so the code would simply result in a `NameError` if it actually tried to catch an exception from the Rust client.)

Reviewed By: singhsrb

Differential Revision: D23801363

fbshipit-source-id: 3359c181fd05dbec24d77fa1b7d9c8bd821b49a6
2020-09-19 14:23:55 -07:00
Jun Wu
34d8dca79a dispatch: run command in non-main thread
Summary:
This extends the Ctrl+C special handling from edenapi to the entire Python
command so Ctrl+C should be able to exit the program even if it's running
some blocking Rust functions.

`edenapi` no longer needs to spawn threads for fetching.

Reviewed By: singhsrb

Differential Revision: D23759710

fbshipit-source-id: cbaaa8e5f93d8d74a8692117a00d9de20646d232
2020-09-18 18:47:24 -07:00
Liubov Dmitrieva
01615ae4de improve scm daemon checks and check workspace name as well
Summary:
Move a bunch of code (SCM daemon related options) into a separate file, out of
cloud sync.

Also introduce an additional check that the `hg cloud sync` command run by the SCM
daemon is intended for the currently connected workspace.

In theory, when we switch a subscription, the SCM daemon gets notified, but races are possible and it is better to have this additional check, so that the SCM daemon triggers cloud sync only where it is supposed to.

Reviewed By: markbt

Differential Revision: D23783616

fbshipit-source-id: b91a8b79189b7810538c15f8e61080b41abde386
2020-09-18 14:01:11 -07:00
Jun Wu
664fa0b8ec config: remove experimental.head-based-commit-transaction
Summary:
The config is not actually used any more (with rust-commits, it is forced on, without rust-commits,
there is no point to keep it on). Therefore removed.

Reviewed By: singhsrb

Differential Revision: D23771570

fbshipit-source-id: ad3e89619ac5e193ef552c25fc064ca9eddba0c6
2020-09-18 13:28:34 -07:00
Jun Wu
6d3f17bb16 codemod: signal.signal -> util.signal
Summary:
See the previous diff for context. This allows the code to run from non-main
thread.

Reviewed By: singhsrb

Differential Revision: D23759712

fbshipit-source-id: 044193a9d7193488c700d769da9ad68987356d69
2020-09-18 13:28:34 -07:00
Jun Wu
be51116cf4 util: add util.signal that works for non-main threads
Summary:
The idea is to extend D22703916 (61712e381c)'s way of calling functions from just edenapi to
the entire command for better Ctrl+C handling. Some code paths (ex. pager,
crecord) use `signal.signal` and `signal.signal` does not work from non-main
thread.

To work around the `signal.signal` limitation, we pre-register all signals we care
about in the main thread to a special handler. The special handler reads a
global variable to decide what to do. Other threads can modify that global
variable to affect what the special signal handler does, therefore indirectly
"register" their handlers.

Reviewed By: kulshrax

Differential Revision: D23759711

fbshipit-source-id: 8ba389072433e68a36360db6a1b17638e40faefa
2020-09-18 13:28:34 -07:00
Jun Wu
b5a01b9c05 util: improve interruption handling for 'threaded'
Summary:
Before this change, for a long-running function wrapped by 'threaded',
it might:

  background thread> start
  main thread> receive SIGINT, raise KeyboardInterrupt
  main thread> raise at 'thread.join(1)'
  main thread> exiting, but wait for threads to complete (Py_Finalize)
  background thread> did not receive KeyboardInterrupt, continue running
  main thread> continue waiting for background thread

Teach `thread.join(1)` to forward the `KeyboardInterrupt` (or its subclass
`error.SignalInterrupt`) to the background thread, so the background thread
_might_ stop. Besides, label the background thread as daemon so it won't
be waited upon exit.
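
One way such forwarding can be done in CPython is the C-API call `PyThreadState_SetAsyncExc`, reached through `ctypes`. The sketch below assumes that mechanism; the helper `interrupt_thread` is hypothetical and CPython-specific.

```python
import ctypes
import threading
import time


def interrupt_thread(thread, exctype=KeyboardInterrupt):
    """Ask CPython to raise exctype in `thread`.

    Returns True if exactly one thread state was affected.
    """
    affected = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(thread.ident), ctypes.py_object(exctype)
    )
    return affected == 1


stopped = threading.Event()


def long_running():
    try:
        while True:
            # Short sleeps return to Python bytecode often, which is where
            # the asynchronous exception gets delivered.
            time.sleep(0.01)
    except KeyboardInterrupt:
        stopped.set()


# daemon=True so interpreter exit does not wait for this thread.
worker = threading.Thread(target=long_running, daemon=True)
worker.start()
time.sleep(0.05)
delivered = interrupt_thread(worker)
worker.join(5)
```

The exception is only raised once the target thread executes Python bytecode again, which is why a thread blocked inside native code only _might_ stop.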

Reviewed By: kulshrax

Differential Revision: D23759713

fbshipit-source-id: 91893d034f1ad256007ab09b7a8b974325157ea5
2020-09-18 13:28:34 -07:00
Jun Wu
51a9d37730 edenapi: edenapi._spawnthread -> util.threaded
Summary:
Move the wrapper to util.py. It'll be used in dispatch.py to make the entire
command Ctrl+C friendly.

Reviewed By: singhsrb

Differential Revision: D23759715

fbshipit-source-id: fa2098362413dcfd0b68e05455aad543a6980907
2020-09-18 13:28:33 -07:00
Jun Wu
c4e2f5cb0f bindings: add sleep for testing blocking Rust functions
Summary: This will be used to test Ctrl+C handling with native code.

Reviewed By: kulshrax

Differential Revision: D23759714

fbshipit-source-id: 50da40d475b80da26b7dbc654e010d77cb0ad2d1
2020-09-18 13:28:33 -07:00
Jun Wu
6cb78fa90c pyedenapi: expose API querying hg commit data
Summary: This makes it easier to test the API via debugshell.

Reviewed By: kulshrax

Differential Revision: D23750677

fbshipit-source-id: e29284395f03c9848cf90dd2df187e437890c56e
2020-09-18 13:28:33 -07:00
Jun Wu
80bf264e24 debugshell: add "api" object
Summary: It is handy to test edenapi methods directly.

Reviewed By: kulshrax

Differential Revision: D23750709

fbshipit-source-id: 33c15cecaa0372ba9e4688502e7d8f3fdda7c3b8
2020-09-18 13:28:33 -07:00
Jun Wu
478e1fe524 commands: add debugrebuildchangelog
Summary:
Add a command to rebuild the changelog without recloning other parts of the
repo. This can be used as a way to recover from corrupted changelog. It
currently uses revlog because revlog is still the only supported format during
streamclone.

In the future this can be used for defragmentation.

Reviewed By: DurhamG

Differential Revision: D23720215

fbshipit-source-id: 6db0453d18dbf553660d55d528f990a4029d9da4
2020-09-18 13:28:33 -07:00
Liubov Dmitrieva
70dc57f48b improve help
Summary:
Improve help to reflect that the system is also meant for managing backups

add missing commands
reshuffle a bit

Reviewed By: markbt

Differential Revision: D23782794

fbshipit-source-id: d7fd3fa06ca7acd649cef557f3fe020295259e3d
2020-09-18 07:03:25 -07:00
Liubov Dmitrieva
d94f354708 implement a command to reclaim workspaces
Summary: The command will be provided as a hint if a username change has been detected in the configuration.

Reviewed By: markbt

Differential Revision: D23769942

fbshipit-source-id: 3e84ecef6dd68267022b92bf10f5e68dfc07f270
2020-09-18 04:18:11 -07:00
Saurabh Singh
a703572183 fb-scratch: stop building the package
Summary:
`scratch` provided by `fb-scratch` was replaced by `mkscratch` provided by
the Mercurial package. See linked task for details.

Reviewed By: quark-zju

Differential Revision: D23773840

fbshipit-source-id: de0582069ce1a09c3cd9fc6b02d2d149f70d0d78
2020-09-17 18:32:19 -07:00
Durham Goode
41b0cf71e8 mutation: remove exponential algorithm from obsoletenodes
Summary:
Computing all successorsets is exponential with the number of splits
that have happened. This can slow things down tremendously.

The obsoletenodes path only needs to know "is there a visible successor" in
order to determine if a draft commit is obsolete. Let's use allsuccessors
instead of successorset.
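
To illustrate why this is cheaper: answering "is there a visible successor" only needs a single traversal of the successor graph, visiting each node once. The graph encoding and the function name below are hypothetical, not the mutation module's API.

```python
from collections import deque


def has_visible_successor(node, successors, visible):
    """Return True if any (transitive) successor of `node` is visible."""
    seen = {node}
    queue = deque(successors.get(node, ()))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        if current in visible:
            return True
        queue.extend(successors.get(current, ()))
    return False


# Hypothetical history: "a" was split into "b" and "c"; "c" was amended to "d".
successors = {"a": ["b", "c"], "c": ["d"]}
visible = {"d"}

a_obsolete = has_visible_successor("a", successors, visible)
b_obsolete = has_visible_successor("b", successors, visible)
```

Each node is visited at most once, so the cost stays linear in the number of successors regardless of how many splits happened.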

Reviewed By: quark-zju

Differential Revision: D23771025

fbshipit-source-id: 666875e681c2e3306fc301357c95f1ab5bb40a87
2020-09-17 18:29:40 -07:00
Liubov Dmitrieva
57e4688aa4 introduce commands for renaming workspaces and rehost workspace
Summary:
`hg cloud join --merge` doesn't really solve the rename problem because it doesn't
preserve:

1. old heads
2. history

I added a proper API in Commit Cloud Service for renaming workspaces and now we
can use it to provide a rename command and a 'rehost' command, which is a version
of renaming that binds the current workspace to the current devserver.
The rehost command is meant to be used after dev server migration. I am planning to
add this to the dev server migration wiki.

The next diff will cover how we can use the rename command to fix a username in workspace names after the username has been changed.

Reviewed By: markbt

Differential Revision: D23757722

fbshipit-source-id: dc11cb226eb76d347cdab70b3c72566448dcd098
2020-09-17 17:45:05 -07:00
Durham Goode
f68177a983 treemanifest: flush shared stores when flushing local stores
Summary:
The Rust contentstore has no way to flush the shared stores, except
when the object is destructed. In treemanifest, the lifetime of the shared store
seems to differ from that of files, and we're not seeing them flushed
appropriately during certain commands. Let's make the flush api also flush the
shared stores.

Reviewed By: quark-zju

Differential Revision: D23662976

fbshipit-source-id: a542c3e45d5b489fcb5faf2726854cb49df16f4c
2020-09-17 14:27:50 -07:00
Durham Goode
84f72950ad treemanifest: make Python repack work with Rust treemanifest stores
Summary:
Now that treemanifests can use Rust stores, we need to update the
Python repack code to support that.

Reviewed By: quark-zju

Differential Revision: D23662361

fbshipit-source-id: c802852c476425eef74181ead04f70b11ff9a27c
2020-09-17 14:27:50 -07:00
Durham Goode
7aca64d8f9 treemanifest: integrate treemanifest prefetch with Rust store prefetch
Summary:
This makes Rust contentstore prefetch route through the remotetreestore
prefetch logic to reach the lower level tree fetching, and makes the higher
level Python fetching route through the Rust contentstore to do prefetching. The
consequence of this is that there's a relatively unified code path for both
Python and Rust, and hopefully we can delete the janky Python bits once we're
completely migrated to Rust.

The way this diff works is pretty hacky. The code comment explains it, but the
tl;dr is that Rust prefetch works by providing references to the mutable stores,
while Python prefetch assumes they are stored and accessible on the repository.
In order for the old python tree fetching logic to work with both models, we
monkey patch the Rust mutable store references we receive into the function that
will later be called to request the repository's mutable stores. This is awful.

A cleaner fix might be to thread the mutable stores all the way through the
python fetching logic, then move the Python accessing of the repository's
mutable stores to the higher layer, near where Rust would provide it. That's a
lot of code churn though, so I'd like to do that in a later diff once we stop
using the non-rust logic entirely.

Reviewed By: quark-zju

Differential Revision: D23662351

fbshipit-source-id: 76007b6089ddf0e558581cd179a112311f8b58e3
2020-09-17 14:27:49 -07:00
Durham Goode
c268c02298 treemanifest: refactor remotetreestore prefetching
Summary:
As part of moving treemanifest to use the Rust tree store, we need to
move prefetch to be able to be initiated from Rust. Rust requires a certain
signature for the prefetch function which accepts multiple keys.

In preparation for this requirement, let's refactor the current remotetreestore
fetching path to have a separate function. In a later diff we'll route Rust
prefetch requests through this function so the python and rust code shares the
same base tree discovery logic.

Reviewed By: quark-zju

Differential Revision: D23662196

fbshipit-source-id: 127045c279dc22914f7e1f3a619f6620586010ba
2020-09-17 14:27:49 -07:00
Durham Goode
a88287fd45 rebase: move inmemory fallback outside of except
Summary:
Python 3 reports exceptions in except clauses by showing the original
exception, then saying another exception happened during the original exception
and hiding the second exception stack trace.

To make update exceptions more debuggable, let's move the handling outside the
except clause.
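
The pattern looks roughly like this. It is a generic sketch and not the rebase code: `run_with_fallback`, `primary` and `fallback` are hypothetical names.

```python
def run_with_fallback(primary, fallback):
    """Try primary(); on failure, run fallback() outside the except block."""
    failed = False
    try:
        return primary()
    except RuntimeError:
        # Only record the failure here; the real work happens after the block.
        failed = True
    if failed:
        return fallback()


def primary():
    raise RuntimeError("in-memory attempt failed")


def fallback():
    raise ValueError("on-disk attempt failed")


try:
    run_with_fallback(primary, fallback)
except ValueError as e:
    # The fallback error is not chained to the original RuntimeError.
    chained = e.__context__
```

Because `fallback()` runs after the except block has finished, its exception is reported on its own rather than as "another exception occurred during handling" of the first one.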

Reviewed By: quark-zju

Differential Revision: D23761667

fbshipit-source-id: bec758a3c7c0b88a5a569f794730058bf6f1eaad
2020-09-17 14:21:49 -07:00
Liubov Dmitrieva
94761154da reduce repetitions in the http client code
Summary: Better engineering: reduce repetitions in the http client code

Reviewed By: markbt

Differential Revision: D23731119

fbshipit-source-id: cf1cb939231fa38ae23f4a2d86a867c3881d16b4
2020-09-17 11:07:07 -07:00
Liubov Dmitrieva
62c2bd52c9 detect changed usernames in configs
Summary:
This is the initial step: track the username in use when the workspace was created, so that we can give users appropriate advice on how to fix their workspace names if the username in the configuration has changed.

Another diff will provide the advice itself.

I will build the rename workspace command based on D23703790.

Reviewed By: markbt

Differential Revision: D23730312

fbshipit-source-id: a49dabba7ec4acf35f6ff99ed23cff5d6f46e2e4
2020-09-17 11:07:06 -07:00
Jun Wu
813647f917 template: remove rev:node legacy template
Summary:
`experimental.template-new-builtin = true` has been rolled out to 100%
and seems to work fine. Therefore, remove code that supports
`template-new-builtin = false`.

Reviewed By: singhsrb

Differential Revision: D23745353

fbshipit-source-id: 178af269381c9d3e20522ba4484d63051589342b
2020-09-17 10:58:39 -07:00
Durham Goode
7c8f9167e1 tests: avoid turning $TESTTMP into a repo
Summary:
Some tests run `hg init` right inside the test directory, turning the
entire $TESTTMP into a repo. In future diffs we'll start to rely more on hgcache
being present during tests, which creates a directory in $TESTTMP. Let's make
sure all repos are created as sub-directories of $TESTTMP.

Reviewed By: kulshrax

Differential Revision: D23662077

fbshipit-source-id: 2b2b974ebfd1bd19ad6acd1ebe3e68dd03a09869
2020-09-17 10:16:03 -07:00
Durham Goode
cbe4499da8 treemanifest: add option for instantiating a Rust treemanifest store
Summary:
Adds the initial condition and creation logic for creating a Rust
treemanifest store. Fetching and some other code paths don't work just yet, but
subsequent diffs enable more and more functionality.

Reviewed By: quark-zju

Differential Revision: D23662052

fbshipit-source-id: a0e7090c9a3bf27a7738bf093f2d4eb6098b1ed6
2020-09-17 10:16:03 -07:00
Durham Goode
556ae539fa repack: prevent Rust repack from repacking an entry twice
Summary: The old logic would just double pack some bits. Let's prevent that.

Reviewed By: xavierd

Differential Revision: D23661933

fbshipit-source-id: 155291fa08ec2c060619329bd1cb6040769feb63
2020-09-17 10:16:03 -07:00
Durham Goode
6ae1cf9619 revisionstore: add refresh function
Summary:
The rust pack stores currently have logic to refresh their list of
packs if there's a key miss and if it's been a while since we last loaded the
list of packs. In some cases we want to manually trigger this refresh, like if
we're in the middle of a histedit and it invokes an external command that
produces pack files that the histedit should later consume (like an external
amend, that histedit then needs to work on top of).

Python pack stores solve this by allowing callers to mark the store for a
refresh. Let's add the same logic for rust stores. Once pack files are gone we
can delete this.

This will be useful for the upcoming migration of treemanifest to Rust
contentstore. Filelog usage of the Rust contentstore avoided this issue by
recreating the entire contentstore object in certain situations, but refresh
seems useful and less expensive.

Reviewed By: quark-zju

Differential Revision: D23657036

fbshipit-source-id: 7c6438024c3d642bd22256a8e58961a6ee4bc867
2020-09-17 10:16:03 -07:00
Durham Goode
055fc0d20b packstore: avoid subtracting from an Instant
Summary:
Instants do not represent actual time and can only be compared against
each other. When we subtract arbitrary Durations from them, we run the risk of
overflowing the underlying storage, since the Instant may be represented by a
low number (such as the age of the process).

This caused crashes in test_refresh (in the next diff) on Windows.

Let's instead represent the "must rescan" state as a None last_scanned time, and avoid any arbitrary subtraction. It's generally much cleaner too.
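
The same idea in a Python analogue (the real change is in Rust; `PackScanner` and its methods are hypothetical): "must rescan" is a `None` timestamp, so no arithmetic is ever needed to force a rescan.

```python
import time
from typing import Optional


class PackScanner:
    """Tracks when the pack directory was last scanned."""

    def __init__(self, interval: float) -> None:
        self.interval = interval
        # None means "must rescan"; no fake timestamp in the past is needed.
        self.last_scanned: Optional[float] = None

    def force_rescan(self) -> None:
        self.last_scanned = None

    def mark_scanned(self, now: Optional[float] = None) -> None:
        self.last_scanned = time.monotonic() if now is None else now

    def should_rescan(self, now: Optional[float] = None) -> bool:
        if self.last_scanned is None:
            return True
        now = time.monotonic() if now is None else now
        # Only compares two monotonic readings against each other.
        return now - self.last_scanned >= self.interval


scanner = PackScanner(interval=10.0)
initially = scanner.should_rescan(now=1.0)
scanner.mark_scanned(now=1.0)
soon_after = scanner.should_rescan(now=2.0)
scanner.force_rescan()
after_force = scanner.should_rescan(now=2.0)
```

The `now` parameter exists only to make the sketch deterministic; callers would normally rely on the monotonic clock.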

Reviewed By: quark-zju

Differential Revision: D23752511

fbshipit-source-id: db89b14a701f238e1c549e497a5d751447115fb2
2020-09-17 10:16:03 -07:00
Durham Goode
d832ea7afa treemanifest: change local tree sending to depend on phases
Summary:
When sending trees and files we try to avoid sending trees that are
available from the main server. To do so, we currently check to see if the
tree/file is from the local store (i.e. .hg/store instead of $HGCACHE).

In a future diff we'll be moving trees to use the Rust store, which doesn't
expose the difference between shared and local stores. So we need to stop
depending on logic to test the local store.

Instead we can test if the commit is public or not, and only send the tree/file
if the commit is not public. This is technically a revert of the 2018 D7992502 (5e95b0e32e)
diff, which stopped depending on phases because we'd receive public commits from
svn that were not public on the server yet. Since svn is gone, I think it's
safe to go back to that way.

This code was usually to help when the client was further ahead than another
client and in some commit cloud edge cases, but 1) we don't do much/any p2p
exchange anymore, and 2) we did some work this year to ensure clients have more
up-to-date remote bookmarks during exchange (as a way of making phases and
discovery more reliable), so hopefully we can rely on phases more now.

Reviewed By: quark-zju

Differential Revision: D23639017

fbshipit-source-id: 34c13aa2b5ef728ea53ffe692081ef443e7e57b8
2020-09-16 21:39:25 -07:00
Durham Goode
dd387dd0d1 mutablepacks: only create mutable history packs when needed
Summary:
Previously the MetadataStore would always construct a mutable pack, even
if the operation was readonly. This meant all read commands required write
access. It also means that random .tmp files get scattered all over the place
when the rust structures are not properly destructed (like if python doesn't
bother doing the final gc to call destructors for the Rust types).

Let's just only create mutable packs when we actually need them.
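
A minimal Python sketch of the lazy-creation idea, for illustration only. The real change lives in the Rust MetadataStore; `LazyPackStore` and its methods are hypothetical names, and the on-disk format here is made up.

```python
import os
import tempfile


class LazyPackStore:
    """Hypothetical store that defers creating its mutable pack file."""

    def __init__(self, directory):
        self.directory = directory
        self._mutable = None  # nothing is created on disk for read-only use

    def _mutable_pack(self):
        # Create the temporary pack file only on the first write.
        if self._mutable is None:
            self._mutable = tempfile.NamedTemporaryFile(
                dir=self.directory, suffix=".tmp", delete=False
            )
        return self._mutable

    def get(self, key):
        # Reads never touch the mutable pack, so they need no write access.
        return None

    def add(self, key, value):
        pack = self._mutable_pack()
        pack.write(key + b"\0" + value + b"\n")
        pack.flush()


store_dir = tempfile.mkdtemp()
store = LazyPackStore(store_dir)
store.get(b"foo")
files_after_read = os.listdir(store_dir)   # still empty
store.add(b"foo", b"bar")
files_after_write = os.listdir(store_dir)  # one .tmp file now exists
```

With this shape, a read-only command leaves no stray .tmp file behind even if destructors never run.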

Reviewed By: quark-zju

Differential Revision: D23219961

fbshipit-source-id: a47f3d94f70adac1f2ee763f3170ed582ef01a14
2020-09-16 21:39:25 -07:00
Durham Goode
1f5835e70a mutablepacks: only create mutable data packs when needed
Summary:
Previously the ContentStore would always construct a mutable pack, even
if the operation was readonly. This meant all read commands required write
access. It also means that random .tmp files get scattered all over the place
when the rust structures are not properly destructed (like if python doesn't
bother doing the final gc to call destructors for the Rust types).

Let's just only create mutable packs when we actually need them.

Reviewed By: quark-zju

Differential Revision: D23219962

fbshipit-source-id: 573844f81966d36ad324df03eecec3711c14eafe
2020-09-16 21:39:25 -07:00
Durham Goode
57b422b49a py3: fix hg when there's no stdin
Summary:
Some tools, like ShipIt, close stdin before they launch the subprocess.
This causes sys.stdin to be None, which breaks our pycompat buffer read. Let's
handle that.

Reviewed By: quark-zju

Differential Revision: D23734233

fbshipit-source-id: 0adc23cd5a8040716321f6ede0157bc8362d56e0
2020-09-16 19:41:04 -07:00
Durham Goode
ce26d74022 py3: fix crecord help screen
Summary:
Turns out crecord had a help screen. It was broken in Python 3. This
fixes it.

Reviewed By: singhsrb

Differential Revision: D23720798

fbshipit-source-id: 4aade9abb88355c19ee4445de116fdb40d5366bd
2020-09-16 09:34:25 -07:00
Durham Goode
72f8d0cfd8 py3: fix reset
Summary: The test now passes

Reviewed By: quark-zju

Differential Revision: D23720599

fbshipit-source-id: fb8b76dcbacbd8b2e2f2a1f0d5f16abc59f78ff8
2020-09-16 09:30:39 -07:00
Carolyn Busch
97edbc3bc9 setup: convert buildinfosrc to bytes
Reviewed By: singhsrb

Differential Revision: D23719096

fbshipit-source-id: e60522245476dac301e8449743bfd1756cfe3fbc
2020-09-16 09:30:39 -07:00
Durham Goode
420dcf9c63 py3: fix copy tracing
Summary: filter returns an iterator in Python 3, but we need a list.
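
For illustration (not the copy tracing code itself), the Python 3 behavior and the usual fix look like this:

```python
items = ["a", "", "b"]

# Python 3: filter() returns a lazy one-shot object without len() or indexing.
lazy = filter(None, items)

# Wrapping it in list() restores the Python 2 behavior.
as_list = list(filter(None, items))
```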

Reviewed By: singhsrb

Differential Revision: D23720661

fbshipit-source-id: 8de3f5844bfe8b85b37c44423733fd2a09967397
2020-09-16 09:27:36 -07:00
Durham Goode
5c5f355ffd py3: fix chistedit
Summary: This was horribly broken, and we have no tests.

Reviewed By: singhsrb

Differential Revision: D23720984

fbshipit-source-id: 4ad47c767b0d18f700c855a7bb43f38f5c5ef317
2020-09-16 09:22:18 -07:00
Durham Goode
9f80bd1d6f py3: fix unicode characters in patches
Summary:
When I added the surrogateescape patch for the email parser decoder
used during patches, I incorrectly added a corresponding encoder on the other
end when we get the data out of the parser. It turns out the parser is
smart/dumb. When using get_payload() it attempts a few different decodings of
the data and ends up replacing all the non-ascii characters with replacement
bits (question marks). Instead we should use get_payload(decode=True), which
bizarrely actually encodes the data into bytes, correctly detecting the presence
of surrogates and using the correct ascii+surrogateescape encoding.
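
A small standalone demonstration of the difference, using only the stdlib email parser (the message below is made up; this is not the patch code):

```python
from email.parser import Parser

raw = b"Subject: demo\n\nh\xc3\xa9llo\n"  # UTF-8 body, no charset declared

# Decode with surrogateescape so non-ASCII bytes survive as lone surrogates.
msg = Parser().parsestr(raw.decode("ascii", "surrogateescape"))

lossy = msg.get_payload()             # str; non-ASCII bytes become U+FFFD
exact = msg.get_payload(decode=True)  # bytes; the original bytes come back
```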

Reviewed By: singhsrb

Differential Revision: D23720111

fbshipit-source-id: ed40a15056c39730c91067b830f194fbe41e5788
2020-09-16 09:21:20 -07:00
Jun Wu
2abf0ada42 version: print EdenSCM instead of Mercurial
Summary: Per team discussion.

Reviewed By: singhsrb

Differential Revision: D23719401

fbshipit-source-id: a1e9a1e9a10369c307413354054a65e6520d13e5
2020-09-15 21:03:59 -07:00
Xavier Deguillard
3ac8b21b2e tests: make test-fb-hgext-remotefilelog-ruststores-lfs.t reliable
Summary:
This test is flaky due to `hg up` not always reading data from the stores, and
thus not always failing to read the LFS blob. A better way to force a read
from the stores is to simply use `hg log -p`.

Reviewed By: DurhamG, singhsrb

Differential Revision: D23718823

fbshipit-source-id: 98bc37a76e93a67d031ba7bfa124b1db816983a1
2020-09-15 18:57:58 -07:00
Jun Wu
22d38872fb setup: skip Py3 only thrift files for Py2 build
Summary: The files use Python 3 only syntax and are not really used. Skip them so the Python 2 build won't hit invalid syntax issues.

Reviewed By: chadaustin

Differential Revision: D23717662

fbshipit-source-id: f911a83937be9ccc40194f321e3b41625a68e703
2020-09-15 17:37:50 -07:00
Jun Wu
3095de7357 Back out "use python 3 for the eden_scm getdeps build"
Summary:
Running `setup.py` with Python 3 for Python 2 build will cause issues as
`setup.py` writes `.pyc` files in Python 3 format.

Reviewed By: chadaustin

Differential Revision: D23717661

fbshipit-source-id: 38cfabdfdf20424a21f8a5bdaf826e74da2304ac
2020-09-15 17:37:50 -07:00
Johan Schuijt-Li
deb57a25ed mononoke: deprecate preamble in favor of metadata
Summary:
In preparation of moving away from SSH as an intermediate entry point for
Mononoke, let Mononoke work with newly introduced Metadata. This removes any
assumptions we now make about how certain data is presented to us, making the
current "ssh preamble" no longer central.

Metadata is primarily based around identities and provides some
backwards-compatible entry points to make sure we can satisfy downstream
consumers of commits like hooks and logs.

Similarly, we now do our own reverse DNS resolving instead of relying on what's
been provided by the client. This is done in an async manner and we don't rely
on the result, so Mononoke can keep functioning in case DNS is offline.

Reviewed By: farnz

Differential Revision: D23596262

fbshipit-source-id: 3a4e97a429b13bae76ae1cdf428de0246e684a27
2020-09-15 10:28:38 -07:00
Thomas Orozco
d7081f6aba lfs: add client support for received compressed responses
Summary:
As it says in the title, this adds support for receiving compressed responses
in the revisionstore LFS client. This is controlled by a flag, which I'll
roll out through dynamicconfig.

The hope is that this should greatly improve our throughput to corp, where
our bandwidth is fairly scarce.

Reviewed By: StanislavGlebik

Differential Revision: D23652306

fbshipit-source-id: 53bf86d194657564bc3bd532e1a62208d39666df
2020-09-15 07:59:53 -07:00
Thomas Orozco
21290702e1 third-party/rust: import async-compression + update zstd
Summary:
This imports the async-compression crate. We have an equivalent-ish in
common/rust, but it targets Tokio 0.1, whereas this community-supported crate
targets Tokio 0.2 (it offers a richer API, notably in the sense that we
can use it for Streams, whereas the async-compression crate we have is only for
AsyncWrite).

In the immediate term, I'd like to use this for transfer compression in
Mononoke's LFS Server. In the future, we might also use it in Mononoke where we
currently use our own async compression crate when all that stuff moves to
Tokio 0.2.

Finally, this also updates zstd: the version we link to from tp2 is actually
zstd 1.4.5, so it's a good idea to just get the same version of the zstd crate.

The zstd crate doesn't keep a great changelog, so it's hard to tell what has changed.
At a glance, it looks like the answer is not much, but I'm going to look to Sandcastle
to root out potential issues here.

Reviewed By: StanislavGlebik

Differential Revision: D23652335

fbshipit-source-id: e250cef7a52d640bbbcccd72448fd2d4f548a48a
2020-09-15 07:59:53 -07:00
Stanislau Hlebik
d0c212f0b1 clienttelemetry: allow logging arbitrary config values
Summary: That might be used to pass more data to the server

Reviewed By: markbt

Differential Revision: D23704722

fbshipit-source-id: a6e41d615f6548f2f8fd036814c59573a45f93bc
2020-09-15 06:48:28 -07:00
Chad Austin
dca9f7bbfb use python 3 for the eden_scm getdeps build
Summary:
EdenFS is adding a Python 3 Thrift client intended for use by other
projects, and the Mercurial Python 2 build doesn't understand Python 3
syntax files, so switch the default getdeps build to Python 3.

Reviewed By: quark-zju

Differential Revision: D23587932

fbshipit-source-id: 6f47f1605987f9b37f888d29b49a848370d2eb0e
2020-09-14 21:39:51 -07:00
generatedunixname89002005307016
827498fc82 suppress errors in eden - batch 1
Differential Revision: D23685952

fbshipit-source-id: e545fd2625c36a8f811179091b3043c95281ff7a
2020-09-14 15:56:35 -07:00
Durham Goode
a674b25157 hgcache: add config driven cache nuking
Summary:
We've often had cases where we need to nuke people's caches for various
reasons. It's a huge pain since we haven't had a way to communicate with all hg
clients. Now that we have configerator dynamicconfigs, we can use that to reach
all clients.

This diff adds support for configs like:
```
[hgcache-purge]
foo=2020-08-20
```
The key, 'foo' in this case, is an identifier used to only run this purge once.
The value is a date after which this purge will no longer run. This is useful
for bounding the damage from forgetting about a purge and having it delete caches
over and over in the future for new repos or repos where the run once marker
file is deleted for some reason.
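
The run-once and cutoff rules can be sketched roughly as follows. This is an illustration under assumptions, not the actual implementation: `purges_to_run`, `mark_done`, the marker file layout, and treating the cutoff day itself as still active are all hypothetical.

```python
import datetime
import os
import tempfile


def purges_to_run(purge_config, marker_dir, today):
    """Return purge keys that are still active and have not run here yet."""
    due = []
    for key, cutoff in sorted(purge_config.items()):
        cutoff_date = datetime.date.fromisoformat(cutoff)
        if today > cutoff_date:
            continue  # past the cutoff: this purge never runs again
        if os.path.exists(os.path.join(marker_dir, key)):
            continue  # run-once marker present: already purged
        due.append(key)
    return due


def mark_done(marker_dir, key):
    # Record that this purge ran so it is not repeated.
    with open(os.path.join(marker_dir, key), "w"):
        pass


markers = tempfile.mkdtemp()
config = {"foo": "2020-08-20"}
first = purges_to_run(config, markers, datetime.date(2020, 8, 1))
mark_done(markers, "foo")
second = purges_to_run(config, markers, datetime.date(2020, 8, 2))
# A fresh repo after the cutoff date is left alone.
expired = purges_to_run(config, tempfile.mkdtemp(), datetime.date(2020, 9, 1))
```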

Reviewed By: quark-zju

Differential Revision: D23044205

fbshipit-source-id: 8394fcf9ba6df09f391b5317bad134f369e9b416
2020-09-14 11:01:02 -07:00
Liubov Dmitrieva
a37a294fda improve fbclone experience
Summary:
`hg cloud rejoin` is used in fbclone

By providing a bit more information about the workspaces available we can improve user
experience and try to eliminate the confusion multiple workspaces cause.

Reviewed By: mitrandir77

Differential Revision: D23623063

fbshipit-source-id: 7598c1b58597032c9cfcef0b44b0ec1b00510ffa
2020-09-11 03:45:55 -07:00
Durham Goode
474b043a34 grep: fix biggrep integration when corpus rev is not present
Summary:
The corpus rev that biggrep has indexed may not be available in the
local client. Later on in the function it will pull that revision, but earlier
in the function the new logic I added a few weeks ago is just crashing.

That logic was trying to diff against the earlier revision, but that's pretty
arbitrary. Let's just diff against one of the revs at random
(deterministically) and get rid of the need for the hash to exist in the repo
early in the command.

Reviewed By: sfilipco

Differential Revision: D23635801

fbshipit-source-id: 1c284d710b8df9539a696e900183bc10d5d71869
2020-09-10 18:01:38 -07:00
Durham Goode
f5a2347fbb py3: fix Mononoke Python 3 test failures
Summary:
Fixes a few issues with Mononoke tests in Python 3.

1. We need to use different APIs to account for the unicode vs bytes difference
for path hash encoding.
2. We need to set the language environment for tests that create utf8 file
paths.
3. We need the redaction message and marker to be bytes.  Oddly this test still
fails with jq CLI errors, but it makes it past the original error.

Reviewed By: quark-zju

Differential Revision: D23582976

fbshipit-source-id: 44959903aedc5dc9c492ec09a17b9c8e3bdf9457
2020-09-09 18:31:04 -07:00
Xavier Deguillard
ed4021b8e3 revisionstore: disallow reading LFS pointers from packfiles
Summary:
For repositories that have the old-style LFS extension enabled, the pointers
are stored in packfiles/indexedlog along with a flag that signifies to the
upper layers that the blob is externally stored. With the new way of doing LFS,
pointers are stored separately.

When both are enabled, we are observing some interesting behavior where
different get and get_meta calls may return different blobs/metadata for the
same filenode. This may happen if a filenode is stored both in a packfile as an
LFS pointer, and in the LFS store. Guaranteeing that the revisionstore code is
deterministic in this situation is unfortunately way too costly (a get_meta
call would for instance have to fully validate the sha256 of the blob, and this
wouldn't guarantee that it wouldn't become corrupted on disk before calling
get).

The solution taken here is to simply ignore all the LFS pointers from
packfiles/indexedlog when remotefilelog.lfs is enabled. This way, there is no
risk of reading the metadata from the packfiles, and the blob from the
LFSStore. This however brings another complication for user-created blobs:
these are stored in packfiles and would thus become unreadable. The solution is
to simply perform a one-time full repack of the local store to make sure that
all the pointers are moved from the packfiles to the LFSStore.

In the code, the Python bindings use ExtStoredPolicy::Ignore directly, as
these are only used in the treemanifest code where no LFS pointers should be
present. The repack code uses ExtStoredPolicy::Use to be able to read the
pointers, which it wouldn't be able to do otherwise.

Reviewed By: DurhamG

Differential Revision: D22951598

fbshipit-source-id: 0e929708ba5a3bb2a02c0891fd62dae1ccf18204
2020-09-09 18:27:42 -07:00
Stefan Filip
1c172c9008 lfs: use hg-http built client for network requests
Summary: This client provides automatic metrics collection.

Reviewed By: kulshrax

Differential Revision: D23577871

fbshipit-source-id: 137299222a20bc8e4d52c3321febbb91d861b236
2020-09-09 17:35:49 -07:00
Stefan Filip
046db98222 edenapi: use hg-http built client for network requests
Summary:
hg-http's built client should provide integration with Mercurial's stats
collection mechanisms.

Reviewed By: kulshrax

Differential Revision: D23577867

fbshipit-source-id: 93c777021bc347511322269d678d6879710eed3e
2020-09-09 17:35:48 -07:00
Stefan Filip
c1ab6a4e92 http-client: add stats reporting hook
Summary:
Add `with_stats_reporting` to HttpClient. It takes a closure that will be
called with all `Stats` objects generated. We then use this function in
the hg-http crate to integrate with the metrics backend used in Mercurial.

Reviewed By: kulshrax

Differential Revision: D23577869

fbshipit-source-id: 5ac23f00183f3c3d956627a869393cd4b27610d4
2020-09-09 17:35:48 -07:00
Stefan Filip
008d0c82df metrics: use the hgmetrics bindings for incrementing counters
Summary: Rust based metrics so that even Rust libraries can write metrics.

Reviewed By: quark-zju

Differential Revision: D23577870

fbshipit-source-id: b19904968d9372c8ce19775fb37c7af53a370ea5
2020-09-09 17:35:48 -07:00
Stefan Filip
de9b34e83a bindings: add pyhgmetrics to bind the hg-metrics crate
Summary: Exposing the hg-metrics crate to the Python application.

Reviewed By: quark-zju

Differential Revision: D23577875

fbshipit-source-id: 1d919160f8514ae8bfcb0171a0c9d1d9d0de80e6
2020-09-09 17:35:48 -07:00
Stefan Filip
7f72a04c0e metrics: crate for collecting metrics
Summary:
We start off simple here. Python only really has counters so we only implement
counters. There are a lot of options on how to improve this and things get
slightly complicated when we look at the whole ecosystem and fb303. Anyway,
simple start.

Reviewed By: quark-zju

Differential Revision: D23577874

fbshipit-source-id: d50f5b2ba302d900b254200308bff7446121ae1d
2020-09-09 17:35:48 -07:00
Stefan Filip
ead17552cf metrics: treat slash '/' as metric delimiter
Summary:
Slash is probably the standard metric delimiter nowadays. Since we don't have
that many metrics I think that it makes sense to look at slash as the
standard metric delimiter going forward.
This diff updates parsing of metric names to treat both '_' and '/' as
delimiters.
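
As a rough illustration of the parsing rule (the function name and metric names are made up, and the real parser may differ in detail):

```python
import re


def split_metric_name(name):
    """Split a metric name on either '_' or '/'."""
    return [part for part in re.split(r"[_/]", name) if part]


legacy = split_metric_name("http_client_requests")
slashed = split_metric_name("http/client/requests")
```

Both spellings yield the same components, so existing underscore-delimited metrics keep working.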

Reviewed By: quark-zju

Differential Revision: D23577876

fbshipit-source-id: 03997b1285df9c52d6e2837b5af5372deb69b133
2020-09-09 17:35:48 -07:00
Stefan Filip
4ad9091598 thrift: update thrift types
Summary: autogenerated by `make local`

Reviewed By: quark-zju

Differential Revision: D23577872

fbshipit-source-id: 6ca98fd865c3b3bc3a00d8126ce20b59110f8118
2020-09-09 17:35:48 -07:00
Liubov Dmitrieva
321f4dfb31 add hg cloud switch command to simplify switching between
Summary:
The command is easier to use than `hg cloud join --switch`.

Also highlight the workspace name in the output of `hg cloud status`

Reviewed By: mitrandir77

Differential Revision: D23601507

fbshipit-source-id: 74eb17c9366a9dbe96881c8e3e0705619fadb3d6
2020-09-09 14:04:57 -07:00
Pavel Aslanov
897ec3d6d8 verify that received files have the correct size
Summary:
The streaming clone implementation did not check that received files have the correct size. This change addresses it.

Before this change, if the connection was interrupted for whatever reason, the client would treat the fetch of a changeset as successful and proceed with cloning operations, but later checks would report corruption of the internal state of hg data. This is based on a user [report](https://fb.workplace.com/groups/scm/permalink/3177150312334567/)
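
The check amounts to refusing short reads. A minimal sketch, assuming a hypothetical helper `read_exact` (not the actual streaming clone code):

```python
import io


def read_exact(stream, expected_size):
    """Read expected_size bytes, failing loudly on a short read."""
    chunks = []
    remaining = expected_size
    while remaining > 0:
        chunk = stream.read(min(remaining, 65536))
        if not chunk:
            # The stream ended early, e.g. because the connection dropped.
            got = expected_size - remaining
            raise IOError("expected %d bytes, got %d" % (expected_size, got))
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


complete = read_exact(io.BytesIO(b"abcdef"), 6)
try:
    read_exact(io.BytesIO(b"abc"), 6)
    truncated_detected = False
except IOError:
    truncated_detected = True
```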

Reviewed By: quark-zju, krallin

Differential Revision: D23572058

fbshipit-source-id: d740b45ca217cd6db0a65e01aabc2ba9a4835221
2020-09-09 11:32:38 -07:00
Saurabh Singh
384c4f61fa fix the Windows build
Reviewed By: sfilipco

Differential Revision: D23601358

fbshipit-source-id: c5a33286b7468882bbedb3e8fe85f66a8f9db0e2
2020-09-09 10:39:35 -07:00
Arun Kulshreshtha
de7f7ab4fe http-client: rename crate
Summary: The Mercurial codebase uses hyphens in crate names rather than underscores. This is similar to the convention favored by the larger Rust community, though it is different from Mononoke, which uses underscores. While we'll probably need to eventually settle on a consistent convention for all projects in the Eden SCM repo, for now, `http_client` should be made consistent with the adjacent crates.

Reviewed By: sfilipco

Differential Revision: D23585721

fbshipit-source-id: d2e690d86815be02d7b8d645198bcd28e8cbd6e0
2020-09-09 10:12:50 -07:00
David Tolnay
e83e05ff25 Update formatter to rustfmt 2.0
Reviewed By: zertosh

Differential Revision: D23591028

fbshipit-source-id: f458503fc2b9c25023fa1643eca5e166882a4811
2020-09-09 07:52:34 -07:00
Lukasz Piatkowski
379065faab eden/scm: remove leftover of tokio-core after tokio 0.2 migration (#52)
Summary: Pull Request resolved: https://github.com/facebookexperimental/eden/pull/52

Reviewed By: krallin

Differential Revision: D23594074

Pulled By: lukaspiatkowski

fbshipit-source-id: 776c02418f4951321887f566bac8b76c9da8bcc1
2020-09-09 02:32:49 -07:00
Zeyi (Rice) Fan
5e02a93e91 eden-client: move to use tokio 0.2 socket transport
Summary: No more tokio-core! More `async/await`.

Reviewed By: kulshrax

Differential Revision: D23586509

fbshipit-source-id: b2e766ddb7575bc96963432f0c8582b4370b19aa
2020-09-08 20:24:26 -07:00
Zeyi (Rice) Fan
a6a73ec6b6 switch to tokio 0.2 transport
Summary:
This diff adds a `SocketTransport` implementation that no longer uses legacy `tokio-core` based futures but `tokio-tower` and `tower-service` for processing Thrift requests.

The old implementation is renamed to `SocketTransportLegacy` for better transitioning.

Reviewed By: dtolnay

Differential Revision: D20019196

fbshipit-source-id: 3bee684e9254bf1a81669ef0d2c2262a55e75daa
2020-09-08 17:53:57 -07:00
Saurabh Singh
858dbc6861 tests: fix 'test-remotefilelog-undesired-file-logging.t'
Reviewed By: DurhamG

Differential Revision: D23589645

fbshipit-source-id: 350bab980baa811824d7c4fd36d689a5a3395dd8
2020-09-08 17:36:35 -07:00
Durham Goode
2919268555 revisionstore: auto-delete when we have too much pack data
Summary:
In order to keep the hgcache size bounded we need to keep track of pack
file size even during normal operations and delete excess packs.

This has the negative side effect of deleting necessary data if the operation is
legitimately huge, but we'd rather have extra downloading time than fill up the
entire disk.
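
One way to picture the size bound, as a sketch only: the function name is hypothetical, and deleting the oldest packs first is an assumption rather than something this summary states.

```python
def packs_to_delete(packs, max_total_bytes):
    """packs: list of (name, size_bytes, mtime). Return names to delete."""
    total = sum(size for _, size, _ in packs)
    doomed = []
    # Assumed policy: drop the oldest packs until we are under the limit.
    for name, size, _ in sorted(packs, key=lambda p: p[2]):
        if total <= max_total_bytes:
            break
        doomed.append(name)
        total -= size
    return doomed


packs = [("new.pack", 40, 300), ("old.pack", 50, 100), ("mid.pack", 30, 200)]
over_limit = packs_to_delete(packs, 80)
within_limit = packs_to_delete(packs, 200)
```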

Reviewed By: quark-zju

Differential Revision: D23486922

fbshipit-source-id: d21be095a8671d2bfc794c85918f796358dc4834
2020-09-08 11:33:50 -07:00
Durham Goode
717d10958f revisionstore: refactor pack iteration code
Summary:
In a future diff we'll add logic to delete old pack files. We'll want
to use this pack iteration code, so let's move it to a function.

Reviewed By: quark-zju

Differential Revision: D23486920

fbshipit-source-id: 5f872e946ffe816289c925dd2e03c292e29da5af
2020-09-08 11:33:50 -07:00
Durham Goode
651a0690be revisionstore: auto-commit datapacks when they get large
Summary:
As the repository grows the opportunity for large downloads increases.
Today all writes to data packs get sent straight to disk, but we have no way to
prevent this from eating all the disk.

Let's automatically flush datapacks when they reach a certain size (default
4GB). In a future diff this will let us automatically garbage collect data packs
to bound the maximum size of packs.

Rotatelog already has this behavior.
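
A toy model of the size-triggered commit, with in-memory buffers standing in for pack files. `RotatingPackWriter` is a hypothetical name and the 8-byte limit is only for the demo (the default mentioned above is 4GB).

```python
class RotatingPackWriter:
    """Finalize the current pack once it reaches max_bytes, then start anew."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.committed = []          # finalized packs (stand-ins for files)
        self._current = bytearray()  # the pack currently being written

    def add(self, data):
        self._current += data
        if len(self._current) >= self.max_bytes:
            self.commit()

    def commit(self):
        # Finalize the current pack, if it has any content.
        if self._current:
            self.committed.append(bytes(self._current))
            self._current = bytearray()


writer = RotatingPackWriter(max_bytes=8)
for blob in (b"aaaa", b"bbbb", b"cc"):
    writer.add(blob)
committed_automatically = list(writer.committed)
writer.commit()  # explicit commit at the end of the operation
committed_finally = list(writer.committed)
```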

Reviewed By: quark-zju

Differential Revision: D23478780

fbshipit-source-id: 14f9f707e8bffc59260c2d04c18b1e4f6bdb2f90
2020-09-08 11:33:50 -07:00
Thomas Orozco
2948993c38 remotefilelog: add killswitch for client certs
Summary:
See D23538897 for context. This adds a killswitch so we can rollout client
certs gradually through dynamicconfig.

Reviewed By: StanislavGlebik

Differential Revision: D23563905

fbshipit-source-id: 52141365d89c3892ad749800db36af08b79c3d0c
2020-09-08 10:39:07 -07:00
Thomas Orozco
d1c4772da3 remotefilelog: use client certs when connecting to LFS
Summary:
Like it says in the title, this updates remotefilelog to present client
certificates when connecting to LFS (this was historically the case in the
previous LFS extension). This has a few upsides:

- It lets us understand who is connecting, which makes debugging easier;
- It lets us enforce ACLs.
- It lets us apply different rate limits to different use cases.

Config-wise, those certs were historically set up for Ovrsource, and the auth
mechanism will ignore them if not found, so this should be safe. That said, I'd
like to add a killswitch for this nonetheless. I'll reach out to Durham to see if I
can use dynamic config for that.

Also, while I was in there, I cleaned up a few functions that were taking
ownership of things but didn't need it.

Reviewed By: DurhamG

Differential Revision: D23538897

fbshipit-source-id: 5658e7ae9f74d385fb134b88d40add0531b6fd10
2020-09-08 10:39:07 -07:00
David Tolnay
e62b176170 Prepare for rustfmt 2.0
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).

This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.

 ---

*Why now?* **:** The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land to the 2.0 branch.

 ---

Reviewed By: zertosh

Differential Revision: D23568779

fbshipit-source-id: 477200f35b280a4f6471d8e574e37e5f57917baf
2020-09-07 20:47:59 -07:00
Mateusz Kwapich
6e5a6c3d71 metaedit: JSON input mode
Summary:
This makes it easy for `metaedit` to be used by automation. Provided
with a simple JSON file with hash->{user, message} mapping metaedit will
do all of its work without any prompts.
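
For illustration, a file of that shape could be produced like this. The hash, user, and message are made up, and the exact schema and the command-line option that consumes the file are not shown here, so treat the field layout as an assumption based on the description above.

```python
import json

# Hypothetical input: commit hash -> new metadata for that commit.
edits = {
    "0123456789abcdef0123456789abcdef01234567": {
        "user": "Jane Doe <jane@example.com>",
        "message": "fix typo in README",
    },
}

payload = json.dumps(edits, indent=2)
```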

Reviewed By: quark-zju

Differential Revision: D23545527

fbshipit-source-id: 18763ecacff9143b9ad492faf654b176b0f86d1f
2020-09-07 13:33:58 -07:00
Jun Wu
89eb6520d2 scmutil: remove meaningfulparents
Summary:
The "meaningfulparents" concept is coupled with rev numbers.
Remove it. This changes default templates to not show parents, and `{parents}`
template to show parents.

Reviewed By: DurhamG

Differential Revision: D23408970

fbshipit-source-id: f1a8060122ee6655d9f64147b35a321af839266e
2020-09-05 15:06:44 -07:00
Durham Goode
8b91cccc8b remotefilelog: log undesired filename fetches
Summary:
Now that the Rust revisionstore records undesired filename fetches,
let's log those results to Scuba in Python.

Reviewed By: StanislavGlebik

Differential Revision: D23462572

fbshipit-source-id: b55f2290e30e3a5c3b67d9f612b24bc3aad403a8
2020-09-04 14:55:15 -07:00
Durham Goode
9772ab1718 revisionstore: record remote fetches that match a pattern
Summary:
We want to be able to record when fetches to certain paths happen.
Let's add recording infrastructure to the new ReportingRemoteDataStore.

A future diff will make the seen accessible from Python for scuba logging.

Reviewed By: xavierd

Differential Revision: D23462574

fbshipit-source-id: 5d749f2429e26e8e7fe4fb5adc29140b4309eac9
2020-09-04 14:55:15 -07:00
Durham Goode
84cbc26b1e revisionstore: add reporting wrapper for remote data store
Summary:
We want to monitor what paths are fetched from our remote servers.
Since all of our remote stores are hidden behind the RemoteDataStore interface,
let's create a wrapper around that. A future diff will insert the actual
monitoring and reporting.

Reviewed By: quark-zju

Differential Revision: D23462571

fbshipit-source-id: e6031f19db23f7d1b09767efb9613d7528fb457d
2020-09-04 14:55:14 -07:00
Jun Wu
dabb68c1e5 checkmessagehook: make error message more obvious
Summary: This hopefully makes it more obvious so it looks less like an hg crash.

Reviewed By: kulshrax

Differential Revision: D23509569

fbshipit-source-id: 7174780bc7e9841e3f89a482280c49427b62fb74
2020-09-04 14:55:14 -07:00
Jun Wu
4131dcf012 context: avoid memorizing revs
Summary:
The revs can change after flush. For example, during pushrebase, some ctx might
initially have a non-master Id assigned, and later got assigned an Id in the
master group:

```
ipdb> p self.__dict__
{'_repo': <edenscm.hgext.fastannotate.protocol.localreposetup.<locals>.fastannotaterepo object at 0x7f2415b3f8e0>, '_rev': 72057594038527478, '_node': b'\xb6\x12\xcd\x81b#\xa3\x01\xe2pP\x84\x05{\xd2He\xbe\xcc\xf0'}
ipdb> p self._node
b'\xb6\x12\xcd\x81b#\xa3\x01\xe2pP\x84\x05{\xd2He\xbe\xcc\xf0'
ipdb> p self._repo.changelog.rev(self._node)
7198913
ipdb> p self._rev
72057594038527478
```

Note that `self._rev` becomes inconsistent with `changelog.rev(self._node)`.

The error looks like:

  $ hg push -r . --to master --debug --trace --traceback --verbose
  ...
  pushing rev 556400239977 to destination ...
  ...
  1 commits found
  list of changesets:
  556400239977b9ed523eae5ad28773784c975f7f
  sending unbundle command
  ...
  added 79 commits with 0 changes to 0 files
  moving remote bookmark 'remote/master' to 84829e9242e4
  ...
  using eden update code path
  Traceback (most recent call last):
    ...
    File "/opt/fb/mercurial/edenscm/mercurial/merge.py", line 2220, in update
      return eden_update.update(
    File "/opt/fb/mercurial/edenscm/mercurial/eden_update.py", line 126, in update
      stats, actions = _handle_update_conflicts(
    ...
    File "/opt/fb/mercurial/edenscm/mercurial/context.py", line 503, in _changeset
      return self._repo.changelog.changelogrevision(self.rev())
      # self = <changectx 84829e9242e4>
    File "/opt/fb/mercurial/edenscm/mercurial/changelog2.py", line 312, in changelogrevision
      return changelogrevision(self.revision(nodeorrev))
      # nodeorrev = 72057594038527521
    File "/opt/fb/mercurial/edenscm/mercurial/changelog2.py", line 365, in revision
      node = self.node(nodeorrev)
      # nodeorrev = 72057594038527521
    File "/opt/fb/mercurial/edenscm/mercurial/changelog2.py", line 280, in node
      raise IndexError("revlog index out of range")
  Traceback (most recent call last):
    File "/opt/fb/mercurial/edenscm/mercurial/changelog2.py", line 278, in node
      return self.idmap.id2node(rev)
  error.CommitLookupError: 'N599585 cannot be found'

Change the `context` object to not memorize revs.
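
The idea, sketched with stand-in classes (`FakeChangelog` and `Context` are hypothetical and much simpler than the real ones): keep only the stable node and resolve the rev on demand.

```python
class FakeChangelog:
    """Stand-in for a changelog whose node -> rev mapping can change."""

    def __init__(self):
        self._rev_by_node = {}

    def assign(self, node, rev):
        self._rev_by_node[node] = rev

    def rev(self, node):
        return self._rev_by_node[node]


class Context:
    """Keeps only the node and looks the rev up on every call."""

    def __init__(self, changelog, node):
        self._changelog = changelog
        self._node = node

    def rev(self):
        return self._changelog.rev(self._node)


changelog = FakeChangelog()
node = b"\x01" * 20  # placeholder 20-byte node
changelog.assign(node, 72057594038527478)  # non-master Id
ctx = Context(changelog, node)
before = ctx.rev()
changelog.assign(node, 7198913)  # reassigned into the master group
after = ctx.rev()
```

Because nothing is cached, `ctx.rev()` stays consistent with the changelog after the reassignment.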

Reviewed By: DurhamG

Differential Revision: D23468702

fbshipit-source-id: b623bcec99b09d61169371e08c69fc6d6f38935c
2020-09-04 13:22:18 -07:00
Jun Wu
e74133f0fa dag: limit max segment level to 4
Summary:
This is based on fbsource data, building level 5 proves to be not useful.

This would save 300ms in the write path.

Reviewed By: sfilipco

Differential Revision: D23494505

fbshipit-source-id: ca795b4900af40dbfdaa463d36f3169413bf6a62
2020-09-04 12:20:54 -07:00
Jun Wu
b4adf0602f dag: remove non-master "Name -> Id" index on request
Summary:
Previously the IdMap's "Name -> Id" index simply ignores the "reassign
non-master" request. It turns out stale entries in that index can cause
issues as demonstrated by the previous diff.

Update IdMap to actually remove both indexes of non-master group on
remove_non_master so it cannot have stale entries.

To optimize the index, the format of IdMap is changed from:

  [ 8 bytes Id (Big Endian) ] [ Name ]

to:

  [ 8 bytes Id (Big Endian) ] [ 1 byte Group ] [ Name ]

So the index can use reference to the slice, instead of embedding the bytes, to
reduce index size.
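
The new entry layout can be illustrated with Python's struct module. The real code is Rust; the helper names and the group values (0 for master, 1 for non-master) are assumptions for this sketch.

```python
import struct

HEADER = ">QB"  # 8-byte big-endian Id followed by a 1-byte Group
HEADER_SIZE = struct.calcsize(HEADER)  # 9 bytes


def encode_entry(id_, group, name):
    return struct.pack(HEADER, id_, group) + name


def decode_entry(entry):
    id_, group = struct.unpack_from(HEADER, entry)
    return id_, group, entry[HEADER_SIZE:]


entry = encode_entry(7198913, 0, b"\xb6\x12\xcd\x81")
decoded = decode_entry(entry)
```

Since the group byte sits directly before the name, an index keyed on group plus name can point at a slice of the entry rather than storing its own copy of those bytes.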

The filesystem directory name for IdMap used by NameDag is bumped to `idmap2`
so it won't read the incompatible old `idmap` data.

Reviewed By: sfilipco

Differential Revision: D23494508

fbshipit-source-id: 3cb7782577750ba5bd13515b370f787519ed3894
2020-09-04 12:20:53 -07:00
Jun Wu
c5d6c9d0f2 dag: add a test showing non-master rebuild issues
Summary: Some vertexes can disappear from the graph!

Reviewed By: sfilipco

Differential Revision: D23494506

fbshipit-source-id: ecbf2a4169e5fc82596e89a4bfe4c442a82e9cd2
2020-09-04 12:20:53 -07:00
Jun Wu
4aea3657e1 dag: move some test utilities to a TestDag struct
Summary: The TestDag struct will be used to do some more complicated tests.

Reviewed By: sfilipco

Differential Revision: D23494507

fbshipit-source-id: 11350f9e448725ae49f50a7b6f19efc57ad84448
2020-09-04 12:20:53 -07:00
Thomas Orozco
3ba2c2b429 mononoke/hg_sync: make it work on Mercurial Python 3
Summary:
A few things here:

- The heads must be bytes.
- The arguments to wireproto must be strings (we used to encode / decode them,
  but we shouldn't).
- The bookmark must be a string (otherwise it gets serialized as `"b\"foo\""`
  and then it deserializes to that instead of `foo`).

Reviewed By: StanislavGlebik

Differential Revision: D23499846

fbshipit-source-id: c8a657f24c161080c2d829eb214d17bc1c3d13ef
2020-09-04 11:56:44 -07:00
Jun Wu
c9e6995675 py2: fix crecord compatibility
Summary:
D23460476 (c84653c7a9) breaks Python 2:

Python 2: bytes + bytearray -> bytearray
Python 3: bytes + bytearray -> bytes

Fix it.

Python 2: b"%s" % bytearray -> bytes
Python 3: b"%s" % bytearray -> bytes
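
The Python 3 side can be checked directly (the Python 2 results above cannot be reproduced on a Python 3 interpreter):

```python
joined = b"ab" + bytearray(b"cd")     # bytes on Python 3
formatted = b"%s" % bytearray(b"cd")  # bytes on Python 3

joined_type = type(joined).__name__
formatted_type = type(formatted).__name__
```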

Reviewed By: singhsrb

Differential Revision: D23514590

fbshipit-source-id: 7fd5f2372444732f13909c42251f000f05955228
2020-09-03 18:51:10 -07:00
Stefan Filip
c09f80882c edenapi: use async-runtime to schedule futures
Summary:
Replacing places where the tokio runtime is instantiated inside the edenapi
client crate.

Reviewed By: quark-zju

Differential Revision: D23468596

fbshipit-source-id: ef68718c7d5b89b6477a2946daaa51618b53d06a
2020-09-03 15:45:34 -07:00
Jun Wu
cea2bf8728 dag: limit segment level at open time
Summary:
At open time, it's pointless to attempt to create new levels. So let's just
read the existing max_level and do not try to build max_level + 1.

This turns out to save 300ms in profiling result.

Reviewed By: sfilipco

Differential Revision: D23494509

fbshipit-source-id: 4ea326a3cc21792790ea0b87e5bf608a94ae382b
2020-09-03 13:48:43 -07:00
Jun Wu
f238529a97 multilog: use per-log meta to pick up updated indexes
Summary:
With MultiLog, per-log meta was previously entirely ignored. However, they can
be useful for updated indexes. For example, application defines a new index,
and opens a Log via MultiLog. The application would expect the new index is
built only once. Without MultiLog, per-log meta is updated at open time in
place. With MultiLog, the updated index meta is not written back to the
multimeta so the new index would be rebuilt multiple times undesirably.

Update MultiLog to reuse the per-log meta if it's compatible so it can pick up
new indexes.

Reviewed By: sfilipco

Differential Revision: D23488212

fbshipit-source-id: c8b3e6b5589dbda2e76a143d15085862a93dae22
2020-09-03 13:48:43 -07:00
Jun Wu
f79e7657af multilog: stop writing poisoned per-log meta
Summary:
The poisoned meta makes investigation harder. ex. `debugdumpindexlog` won't
work on those logs.

Reviewed By: sfilipco

Differential Revision: D23488213

fbshipit-source-id: b33894d8c605694b6adf5afdaed45707fbd7357e
2020-09-03 13:48:43 -07:00
Jun Wu
99511f8743 dag: benchmark dag_ops on different IdDagStores
Summary:
Change dag_ops benchmarks to use different IdDagStores. An example run shows:

  benchmarking dag::iddagstore::indexedlog_store::IndexedLogStore
  building segments (old)                           856.803 ms
  building segments (new)                           127.831 ms
  ancestors                                          54.288 ms
  children (spans)                                  619.966 ms
  children (1 id)                                    12.596 ms
  common_ancestors (spans)                            3.050 s
  descendants (small subset)                         35.652 ms
  gca_one (2 ids)                                   164.296 ms
  gca_one (spans)                                     3.132 s
  gca_all (2 ids)                                   270.542 ms
  gca_all (spans)                                     2.817 s
  heads                                             247.504 ms
  heads_ancestors                                    40.106 ms
  is_ancestor                                       108.719 ms
  parents                                           243.317 ms
  parent_ids                                         10.752 ms
  range (2 ids)                                       7.370 ms
  range (spans)                                      23.933 ms
  roots                                             620.150 ms

  benchmarking dag::iddagstore::in_process_store::InProcessStore
  building segments (old)                           790.429 ms
  building segments (new)                            55.007 ms
  ancestors                                           8.618 ms
  children (spans)                                  196.562 ms
  children (1 id)                                     2.488 ms
  common_ancestors (spans)                          545.344 ms
  descendants (small subset)                          8.093 ms
  gca_one (2 ids)                                    24.569 ms
  gca_one (spans)                                   529.080 ms
  gca_all (2 ids)                                    38.462 ms
  gca_all (spans)                                   540.486 ms
  heads                                             103.930 ms
  heads_ancestors                                     6.763 ms
  is_ancestor                                        16.208 ms
  parents                                           103.889 ms
  parent_ids                                          0.822 ms
  range (2 ids)                                       1.748 ms
  range (spans)                                       6.157 ms
  roots                                             197.924 ms

  benchmarking dag::iddagstore::bytes_store::BytesStore
  building segments (old)                           724.467 ms
  building segments (new)                            90.207 ms
  ancestors                                          23.812 ms
  children (spans)                                  348.237 ms
  children (1 id)                                     4.609 ms
  common_ancestors (spans)                            1.315 s
  descendants (small subset)                         20.819 ms
  gca_one (2 ids)                                    72.423 ms
  gca_one (spans)                                     1.346 s
  gca_all (2 ids)                                   116.025 ms
  gca_all (spans)                                     1.470 s
  heads                                             155.667 ms
  heads_ancestors                                    19.486 ms
  is_ancestor                                        51.529 ms
  parents                                           157.285 ms
  parent_ids                                          5.427 ms
  range (2 ids)                                       4.448 ms
  range (spans)                                      13.874 ms
  roots                                             365.568 ms

Overall, InProcessStore > BytesStore > IndexedLogStore. The InProcessStore
uses `Vec<BTreeMap<Id, StoreId>>` for the level-head index, which is more
efficient on the "Level" lookup (Vec), and more cache efficient (BTree).
BytesStore outperforms IndexedLogStore because it does not need to verify
checksum on every read access - the checksum was verified at store creation
(IdDag::from_bytes).

Note: The `BytesStore` is something optimized for serialization, and hasn't been sent.

Reviewed By: sfilipco

Differential Revision: D23438174

fbshipit-source-id: 6e5f15188e3b935659ccde25fac573e9b963b78f
2020-09-02 18:54:12 -07:00
Jun Wu
84ad7a5351 dag: implement GetLock for all IdDagStores
Summary: This allows them to use the SyncableIdDag APIs.

Reviewed By: sfilipco

Differential Revision: D23438170

fbshipit-source-id: 7ec7288cfb8186b88f85f0212a913cb0dffe7345
2020-09-02 18:54:12 -07:00
Jun Wu
cfff0e9144 dag: make IdDag::prepare_filesystem_sync generic
Summary: Other IdDagStores can also use the API. This will be used in benchmarks.

Reviewed By: sfilipco

Differential Revision: D23438180

fbshipit-source-id: 565552b66372dcfbb268c397883f627491d6e154
2020-09-02 18:54:12 -07:00
Jun Wu
8874e07f9b dag: IdDagStore::reload -> GetLock::reload
Summary:
Similar to `IdDagStore::sync` -> `GetLock::persist`, `reload` is more related
to filesystem/internal state exchange, and should be protected by a lock. So
let's move the API there, and require a lock.

Reviewed By: sfilipco

Differential Revision: D23438169

fbshipit-source-id: 4228106b7739a1a758677adfddd213ad54aa4b6a
2020-09-02 18:54:12 -07:00
Jun Wu
d633576880 dag: remove NameDag::reload
Summary:
`NameDag::reload` is used in `flush` to get a "fresh" NameDag.
In a future diff the `IdDag::reload` API gets changed, so let's
remove NameDag's use of it.

Instead, let's just re-`open` the path again to get a fresh NameDag.
It's a bit more expensive but probably okay, and easier to understand.
`get_new_segment_size()` was added as an internal API to preserve tests.

This also solves an issue where `NameDag` cannot recover properly if its
`flush` fails, because the old `NameDag` state is not lost.

After removing `NameDag::reload`, `IdMap::reload` is no longer used publicly
and was made private.

Reviewed By: sfilipco

Differential Revision: D23438179

fbshipit-source-id: 0a32556a2cd786919c233d7efcae1cb9cbc5fb09
2020-09-02 18:54:11 -07:00
Jun Wu
8e16e4260f dag: IdDagStore::sync -> GetLock::persist
Summary:
The word "sync" is bi-directional: flush + reload. It was indexedlog::Log's
behavior. However, in the IdDag context "sync" is confusing - it is actually
only used to write data out, with protection from a lock. Rename to `persist`
to clarify it's memory -> disk. Besides, require a reference to a lock object
as a lightweight proof that some lock is held.

Reviewed By: sfilipco

Differential Revision: D23438175

fbshipit-source-id: 3d9ccd7431691d1c4e2ee74f3c80d95f5e7243b5
2020-09-02 18:54:11 -07:00
Jun Wu
3ad58ff945 dag: make SyncableIdMap use &mut IdMap instead of IdMap
Summary:
This removes the need to clone `IdMap`.

SyncableIdMap is a bit tricky. I added some comments to clarify things.

Reviewed By: sfilipco

Differential Revision: D23438176

fbshipit-source-id: fe66071da07067ed6c53a6437790af1d81b28586
2020-09-02 18:54:11 -07:00
Jun Wu
23f9bec22b dag: move IdDagStore impls to separate files
Summary: This makes `iddagstore.rs` cleaner.

Reviewed By: sfilipco

Differential Revision: D23438177

fbshipit-source-id: 465cec2231a084a36b20da8e413cb9272f64a00a
2020-09-02 18:54:10 -07:00
Jun Wu
4e9200db44 dag: test IndexedLogIdDagStore
Summary:
Make the test cover IndexedLogIdDagStore. The only change is the parent index
returns children in a different order.

Reviewed By: sfilipco

Differential Revision: D23438173

fbshipit-source-id: bcfabcd329e45bbc5e7e773103fa42307c23c35d
2020-09-02 18:54:10 -07:00
Stefan Filip
1ddf5aaa0e tools: add location-to-hash command to read_res
Summary:
There aren't too many things that we can do with the responses that we get back
from the server. Things are somewhat application specific for this endpoint.
One option that is not available right now and might make sense to add is
limiting the number of entries that are printed for a given location.

Reviewed By: kulshrax

Differential Revision: D23456220

fbshipit-source-id: eb24602c3dea39b568859b82fc27b7f6acc77600
2020-09-02 17:20:43 -07:00
Stefan Filip
932450fb15 handlers: update location-to-hash endpoint with count parameter
Summary:
To reduce the size over the wire on cases where we would be traversing the
changelog on the client, we want to allow the endpoint to return a whole parent
chain with their hashes.

Reviewed By: kulshrax

Differential Revision: D23456216

fbshipit-source-id: d048462fa8415d0466dd8e814144347df7a3452a
2020-09-02 17:20:42 -07:00
Stefan Filip
7122cdded7 types: rename Location to CommitLocation
Summary:
Renaming all the LocationToHash related structures to CommitLocationToHash.
This is done for consistency. I realized the issue when the command for reading
the request from cbor was not what I was expecting it to be. The reason was that
the commit prefix was used inconsistently for LocationToHash.

Reviewed By: kulshrax

Differential Revision: D23456221

fbshipit-source-id: 0181dcaf81368b978902d8ca79c5405838e4b184
2020-09-02 17:20:42 -07:00
Durham Goode
537d5858bd archive: block full archives in large repositories
Summary:
The default archive behavior archives the entire working copy. That is
undesirable and easy to accidentally trigger in a large repository. Let's
prevent it and require users to specify what they want archived.

Reviewed By: quark-zju

Differential Revision: D23464818

fbshipit-source-id: c39a631d618c2007e442e691cda542400cf8f4c3
2020-09-02 11:38:08 -07:00
Stefan Filip
c2079c3464 revisionstore: use async-runtime crate for lfs
Summary:
Replacing uses of the custom Runtime in lfs with the global runtime in the
`async-runtime` crate.

Reviewed By: xavierd

Differential Revision: D23468347

fbshipit-source-id: 61d2858634a37eb2d7d807104702d24889ec047a
2020-09-02 10:01:08 -07:00
Thomas Orozco
de260c7e9d py3: fix debugstacktrace
Summary:
debugstacktrace is broken right now on Python 3: it wants to write to stderr,
which expects `bytes`, but it tries to write a `str`. This fixes it.
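
For illustration, a minimal Python 3 sketch of the str-versus-bytes mismatch
described above. The `write_text` helper and the in-memory stream are
hypothetical stand-ins, not the code changed by this diff.

```python
import io


def write_text(stream, text: str) -> None:
    """Encode text before writing it to a bytes-only stream."""
    stream.write(text.encode("utf-8"))


# io.BytesIO stands in for a stream that only accepts bytes.
buf = io.BytesIO()
try:
    buf.write("traceback line\n")  # str into a bytes stream: TypeError on Python 3
    str_write_failed = False
except TypeError:
    str_write_failed = True

# Encoding first makes the write succeed.
write_text(buf, "traceback line\n")
```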

Reviewed By: DurhamG

Differential Revision: D23447984

fbshipit-source-id: 5896ae858f6022276fa47e08636c700159a2a678
2020-09-02 00:53:28 -07:00
Jun Wu
a0223bc7e7 dag: make iddagstore test generic
Summary: Make it possible to test other IdDagStores.

Reviewed By: sfilipco

Differential Revision: D23438178

fbshipit-source-id: e5fc1b20833c71dd7569c77c31c76a26a6e357fe
2020-09-01 23:58:04 -07:00
Jun Wu
c84653c7a9 py3: fix a crecord encoding issue
Summary: This only happens if specified context shows up.

Reviewed By: ytsheng

Differential Revision: D23460476

fbshipit-source-id: 788e236bd8e28918afa6b1e0a4e1be297b6f5a66
2020-09-01 21:24:53 -07:00
Jun Wu
211739f00c dag: remove SpanSetAsc
Summary:
Now that SpanSet can easily support `push_front`, we can just use SpanSet
efficiently without SpanSetAsc.

Reviewed By: sfilipco

Differential Revision: D23385246

fbshipit-source-id: b2e0086f014977fa990d5142e6eee844293e7ca5
2020-09-01 21:02:08 -07:00
Jun Wu
64bdf70811 dag: add SpanSet::intersection_span_min
Summary: To remove SpanSetAsc, its API needs to be implemented on SpanSet.

Reviewed By: sfilipco

Differential Revision: D23385250

fbshipit-source-id: ebd9d537287b5c1cde6e2c52ffb6da57dbd71852
2020-09-01 21:02:08 -07:00
Jun Wu
16eaceafe9 dag: use VecDeque for SpanSet
Summary: This will make it possible to `push_front` and remove SpanSetAsc special case.

Reviewed By: sfilipco

Differential Revision: D23385249

fbshipit-source-id: 63ac67e9bce7cb281236399b3fb86eba23bbf8a0
2020-09-01 20:53:32 -07:00
Jun Wu
71f101054a dag: implement binary_search_by for VecDeque
Summary:
This makes it easier to replace Vec<Span> with VecDeque<Span> in SpanSet for
efficient push_front, and to deprecate SpanSetAsc (which uses Id in a somewhat
hacky way - they are not real Ids).
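
As a rough illustration of the idea (not the Rust code in this diff), a
comparator-driven binary search over a double-ended queue could look like the
Python sketch below. The function name mirrors Rust's `binary_search_by`, and
the `(found, index)` return value is an assumed stand-in for Rust's
`Result<usize, usize>`.

```python
from collections import deque
from typing import Callable, Deque, Tuple, TypeVar

T = TypeVar("T")


def binary_search_by(items: Deque[T], compare: Callable[[T], int]) -> Tuple[bool, int]:
    """Return (True, index) on a match, else (False, insertion_index).

    `compare(item)` returns a negative number if the item sorts before the
    target, zero on a match, and a positive number if it sorts after.
    """
    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        order = compare(items[mid])
        if order == 0:
            return True, mid
        if order < 0:
            lo = mid + 1  # target is to the right of mid
        else:
            hi = mid  # target is to the left of mid
    return False, lo


sorted_items = deque([1, 3, 5, 7])
hit = binary_search_by(sorted_items, lambda x: x - 5)
miss = binary_search_by(sorted_items, lambda x: x - 4)
```

Note that indexing into the middle of a Python `deque` is not constant time,
so this sketch only shows the search logic, not the performance of a
ring-buffer-backed VecDeque.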

Reviewed By: sfilipco

Differential Revision: D23385245

fbshipit-source-id: b612cd816223a301e2705084057bd24865beccf0
2020-09-01 20:38:29 -07:00
Jun Wu
d8225764a5 py3: speed up simplemerge
Summary:
One user reports a very slow rebase (tens of minutes and still running). The
commit is not very large. Python 2 can complete the rebase in 6 seconds.
I tracked it down to this code path. Making the change makes the Python 3
rebase fast too (< 10 seconds). I haven't tracked down exactly why Python
3 is slow yet (maybe N^2 `a += b`?).

Some numbers about the slow merge:

  ipdb> p len(m3.atext)
  17984924
  ipdb> p len(m3.btext)
  17948110
  ipdb> p len(m3.a)
  613353
  ipdb> p len(m3.b)
  612129
  ipdb> p len(m3.base)
  612135
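
On the `a += b` guess in the summary: the cause was not confirmed there, and the
sketch below is not the change made in this diff. It only contrasts the two
accumulation patterns, under the assumption that repeated `+=` on `bytes`
copies the accumulated buffer each time. Function names are hypothetical.

```python
def concat_quadratic(chunks):
    """Repeated `+=` on bytes: each step may copy everything built so far."""
    out = b""
    for chunk in chunks:
        out += chunk
    return out


def concat_linear(chunks):
    """Collect the pieces and join once: a single final copy."""
    return b"".join(chunks)


lines = [b"a\n", b"b\n", b"c\n"] * 4
```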

Reviewed By: singhsrb

Differential Revision: D23441221

fbshipit-source-id: 14b725439f4ecd3352edca512cdde32958b2ce29
2020-09-01 20:32:10 -07:00
Jun Wu
2d02d3b0f7 dag: validate SpanSet order and no mergeable adjacent spans
Summary:
Previously the `is_valid()` function only checked ordering.
Make it also check "no mergeable adjacent spans" and `span.low<=span.high`.
To provide better debug messages, the function does assertions
directly without returning a bool.
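
A minimal Python sketch of the three checks, for illustration only. It models
spans as inclusive `(low, high)` tuples in ascending order, which is an
assumption made for readability and may not match the ordering the Rust
SpanSet uses internally. `assert_valid` is a hypothetical name.

```python
def assert_valid(spans):
    """Assert directly (instead of returning a bool) so failures are descriptive."""
    for low, high in spans:
        assert low <= high, f"span {low}..={high} has low > high"
    for (_, prev_high), (low, _) in zip(spans, spans[1:]):
        assert prev_high < low, f"spans out of order or overlapping near {low}"
        assert prev_high + 1 < low, f"adjacent spans at {prev_high} and {low} should be merged"


def fails_validation(spans):
    """Helper for demonstration: report whether assert_valid rejects the input."""
    try:
        assert_valid(spans)
    except AssertionError:
        return True
    return False
```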

Reviewed By: sfilipco

Differential Revision: D23385247

fbshipit-source-id: 84829e9242e47e68dc2a4b2a6775b13331eba959
2020-09-01 20:27:03 -07:00
Jun Wu
4bf5817dad dag: always merge adjacent spans in SpanSet
Summary:
Previously, `SpanSet::from_sorted_spans` allows having adjacent spans like
`[1..=2, 3..=4]`, while `SpanSet::from_spans` would merge them into `[1..=4]`.
Change it so `SpanSet::from_sorted_spans` merges them too.  This simplifies
the `contains` logic and could make some Sets more efficient.
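
To illustrate the merging rule, here is a small Python sketch. Spans are
inclusive `(low, high)` tuples sorted in ascending order, and
`merge_sorted_spans` is a hypothetical name; the real implementation is in
Rust and may order spans differently.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (low, high); stand-in for the dag crate's Span


def merge_sorted_spans(spans: List[Span]) -> List[Span]:
    """Merge overlapping or adjacent spans from an ascending-sorted list."""
    merged: List[Span] = []
    for low, high in spans:
        if merged and low <= merged[-1][1] + 1:
            # Touches or overlaps the previous span: extend it.
            prev_low, prev_high = merged[-1]
            merged[-1] = (prev_low, max(prev_high, high))
        else:
            merged.append((low, high))
    return merged
```

With this rule, `[(1, 2), (3, 4)]` collapses to `[(1, 4)]`, so a value is
covered by at most one span and lookups never need to consider a neighbor.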

Reviewed By: sfilipco

Differential Revision: D23385248

fbshipit-source-id: 85b5ba9533f15034779e93255085a4fa09c6328a
2020-09-01 20:04:12 -07:00
Jun Wu
afa787bd5c rage: do not report 'serve' commands in sigtrace section
Summary:
There were some rage pastes that have a very long "sigtrace" section (ex. P141069793).
It turns out the sigtrace has lots of "serve" commands that are started in a
non-forking mode, producing very long traces like:

  Tracing Data:
  Process 726702 Thread 2610476:
     Start Dur.ms | Name                                              Source
         0    ... | Run Command                                       hgcommands::run line 296
                  | - pid = 726702                                    :
                  | - uid = 117869                                    :
                  | - nice = 0                                        :
                  | - args = ["/opt/fb/mercurial/hg.real","...        :
                  | - parent_pids = [2610476,1]                       :
                  | - parent_names = ["/opt/fb/mercurial/hg.real",""] :
                  | - exit_code = 0                                   :
                  | - max_rss = 0                                     :
        35    ... | Main Python Command                               (perftrace)
        35    +22  \ Repo Setup                                       edenscm.mercurial.hg line 168
                    | - local = true                                  :
        70   +802  \ Main Python Command                              (perftrace)
        72   +799   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
        74   +537   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
       940   +914  \ Main Python Command                              (perftrace)
       943   +910   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
       943   +617   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
      1875   +866  \ Main Python Command                              (perftrace)
      1877   +863   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
      1878   +604   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
      2759  +2208  \ Main Python Command (719 times)                  (perftrace)
      3155   +860  \ Main Python Command                              (perftrace)
      3158   +856   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
      3158   +543   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
      4068   +883  \ Main Python Command                              (perftrace)
      4071   +879   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
      4071   +591   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
      4967   +913  \ Main Python Command                              (perftrace)
      4969   +910   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
      4969   +621   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
      6630   +922  \ Main Python Command                              (perftrace)
      6633   +918   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
      6633   +640   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
      7615   +856  \ Main Python Command                              (perftrace)
      7622   +849   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
      7622   +581   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
      8487   +951  \ Main Python Command                              (perftrace)
      8490   +947   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
      8490   +671   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    139275   +794  \ Main Python Command                              (perftrace)
    139278   +790   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
    139278   +539   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    140132   +837  \ Main Python Command                              (perftrace)
    140135   +832   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
    140135   +544   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    140992   +814  \ Main Python Command                              (perftrace)
    140994   +811   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
    140994   +546   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    306862   +864  \ Main Python Command                              (perftrace)
    306865   +860   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
    306865   +586   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    307801   +858  \ Main Python Command                              (perftrace)
    307804   +854   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
    307804   +587   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    308690   +874  \ Main Python Command                              (perftrace)
    308693   +869   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
    308693   +610   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    506391   +924  \ Main Python Command                              (perftrace)
    506396   +917   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
    506396   +645   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    507401   +898  \ Main Python Command                              (perftrace)
    ....

Our chg usage does not start non-forking servers. Those are apparently started by something related to emacs:

  args = ['--config', 'ui.interactive=True', '--config', 'ui.editor=emacsclient', '--config', 'extensions.shelve=', 'serve', '--cmdserver', ...]

Hide them in sigtrace to make rage paste shorter.

Reviewed By: DurhamG

Differential Revision: D23459991

fbshipit-source-id: 7ccc27dbe5ef03e0b97dbfec57213e5478003b1c
2020-09-01 19:57:41 -07:00
Jun Wu
5f0a6f35af py3: fix conflictinfo compatibility
Summary: File content needs to be encoded.

Reviewed By: DurhamG

Differential Revision: D23463706

fbshipit-source-id: e8e512668452618e3b139d7d94ec8776f2b6b25b
2020-09-01 18:31:35 -07:00
Jun Wu
062a83cc16 restack: fix bookmark movement with partial successful auto restack
Summary:
See the test change. Partially successful auto restack should have bookmarks
moved.

Reviewed By: DurhamG

Differential Revision: D23441932

fbshipit-source-id: 07e509a70bcc5cf81f702d40ec1b8dc4a5a781ff
2020-09-01 18:05:44 -07:00
Jun Wu
8191be83c1 tests: add a test for auto rebase bookmark movement issue
Summary: Reported By: asukhachev.

Reviewed By: DurhamG

Differential Revision: D23441931

fbshipit-source-id: b07f47e6796d4d0363250b3b1463f829bb5d0efa
2020-09-01 18:05:44 -07:00
Jun Wu
b3df065db5 debugshell: improve "%trace" UX
Summary: Print hints about how to enable detailed Python tracing.

Reviewed By: kulshrax

Differential Revision: D23437210

fbshipit-source-id: 009425a83945f9b5af2a6280c2572a782c6b349a
2020-09-01 13:49:13 -07:00
Thomas Orozco
0ab9638ef6 py3: fix lfs debuglfsreceive{,all}
Summary:
Those commands are broken right now: they try to write bytes but don't use
`writebytes`.

Reviewed By: DurhamG

Differential Revision: D23450968

fbshipit-source-id: 5d554771459f81718d90e5bad9a4c439cbb05d97
2020-09-01 11:04:16 -07:00
Thomas Orozco
46ab9553bc py3: fix lfs uploads not working anymore
Summary:
When Python 3 wants to upload a file-like object, it does something a bit
awkward: it sets the `Transfer-Encoding` to `chunked`, but doesn't actually
chunk the data. Also, for some reason, it still sets the `Content-Length`. I'm
not sure where that is coming from.

The thing is, when you set `Transfer-Encoding` to `chunked`, you do need to
chunk, or the other end is going to get very confused.

Unfortunately, this is not what happens here (note that the "send" logs are
from enabling http tracing in Python here, and those logs are basically one
line before `.send()` into a socket, so the chunking doesn't appear to happen
elsewhere):

```
[torozco@devbig051]~/opsfiles_bin % echo "aaaa" | ~/fbcode/buck-out/gen/eden/scm/__hg-py3__/hg-py3.sh debuglfssend https://mononoke-lfs.internal.tfbnw.net/opsfiles_bin
send: b'PUT /opsfiles_bin/upload/11a77c3d96c06974b53d7f40a577e6813739eb5c811b2a86f59038ea90add772/5 HTTP/1.1\r\nAccept-Encoding: identity\r\nContent-length: 5\r\nx-client-correlator: tQT3yBfFEzhVtqI5\r\naccept: application/mercurial-0.1\r\ncontent-type: application/x-www-form-urlencoded\r\nhost: mononoke-lfs.internal.tfbnw.net\r\ntransfer-encoding: chunked\r\nuser-agent: mercurial/4.4.2_dev git/2.15.1\r\n\r\n'
sendIng a read()able
send: b'aaaa\n'
reply: 'HTTP/1.1 400 Bad request\r\n'
header: Content-Type: text/html; charset=utf-8
header: Access-Control-Allow-Origin: *
header: proxy-status: client_read_error; e_upip="AcLKajO63Vab0hC4kzGZQsqck3P_YOu7HsBzshC-NCbuo31tlWWqCiVw5xVLh44LYYe7qioCPqYSb8-1cBpdvFDZb_t5oYRP1Q"; e_proxy="AcJjRKHG02qo6Bv6fEPCUVF7DpCyrq3rmSnXhRLWakKWREEvVpk4jc-tzDyG6l9jvn3vNo8PYPG_5hLtC3L1"
header: Date: Tue, 01 Sep 2020 13:10:35 GMT
header: Connection: close
header: Content-Length: 2959
```

What's a bit confusing to me here is where this Content-length header comes
from. Indeed, normally Python 3 will:

- Not infer a content-length for file-like objects (which is what we have)
  https://fburl.com/ms94eq31
- Set Transfer-Encoding if no Content-Length is present:
  https://fburl.com/f81g8v2j

So, it's a bit unexpected that we a) have a Content-Length (we shouldn't), and
b) also have a Transfer-Encoding header. That said, setting the
Content-Length does fix the problem, so that's what this diff does.
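
For reference, the sketch below shows how a body would be framed if it really
were sent with `Transfer-Encoding: chunked` (HTTP/1.1 chunked coding: hex
length, CRLF, data, CRLF, then a zero-length terminator). This is not what the
diff does (it sets Content-Length instead), and `encode_chunked` is a
hypothetical helper, not part of Python's http.client.

```python
def encode_chunked(chunks):
    """Frame an iterable of bytes objects as an HTTP/1.1 chunked body."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would terminate the body early
        out += b"%x\r\n" % len(chunk)  # chunk size in hexadecimal
        out += chunk + b"\r\n"
    out += b"0\r\n\r\n"  # last-chunk marker and end of body
    return bytes(out)


framed = encode_chunked([b"aaaa\n"])
```

Compare with the log above, where the raw `b'aaaa\n'` went out with no size
line and no terminator.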

Reviewed By: DurhamG

Differential Revision: D23450969

fbshipit-source-id: e1f535ff3d0b49c0c914130593d9aebe89ba18ca
2020-09-01 11:04:16 -07:00
Stanislau Hlebik
2e2e2432a7 sparse: warn if dirstate includes marker files
Summary:
As a follow up to the previous diff, let's also warn if dirstate includes
marker files that should not be included in any sparse profiles.

Reviewed By: DurhamG

Differential Revision: D23414361

fbshipit-source-id: 3d171328bf0ba5754e5bacde85f09abb4fed8603
2020-08-31 23:21:41 -07:00
Jun Wu
56d0255228 extutil: drop runbgcommand
Summary: Callsites were migrated to `util.spawndetached`.

Reviewed By: DurhamG

Differential Revision: D23124753

fbshipit-source-id: f0345461a3f79f9bb6ff3a58e00cdf0ed1893645
2020-08-31 17:34:49 -07:00
Jun Wu
2cdca65aed remotefilelog: runshellcommand -> spawndetached
Summary: There seems to be no need to use a shell.

Reviewed By: DurhamG

Differential Revision: D23124756

fbshipit-source-id: 7de1c23e2325fe88dc4c6a2c90563d06f109ed2f
2020-08-31 17:34:49 -07:00
Jun Wu
ffb93ca839 commandcloud: runbgcommand -> spawndetached
Summary:
The Rust process utility avoids issues with interaction with Python and can do file
redirection on Windows.

Reviewed By: DurhamG

Differential Revision: D23124755

fbshipit-source-id: f72b88bafd19b3b41e53afbf6a4095d0d6bcb93a
2020-08-31 17:34:49 -07:00
Jun Wu
6e2a90ddb5 hooks: add predefined hook to run fsync
Reviewed By: DurhamG

Differential Revision: D22993217

fbshipit-source-id: 2cfb6b26479cd7dad02419fb76fa5d3ca5dd66db
2020-08-31 17:34:49 -07:00
Jun Wu
a01693df0e util: use Rust pyprocess to implement spawndetached
Summary:
The Rust bindings handle the cross-platform differences and avoid issues
with Python / Rust interaction. Use them.

As we're here, extend the API to support cwd and env.

Reviewed By: DurhamG

Differential Revision: D23124171

fbshipit-source-id: fdc13f6eaeb25c05b53d385eb220af33dad984e1
2020-08-31 17:34:48 -07:00
Jun Wu
a90c8ea775 bindings: export rust process handling to Python
Summary:
Spawning processes turns out to be tricky.

Python 2:

- "fork & exec" in plain Python is potentially dangerous. See D22855986 (c35b8088ef).
  Disabling GC might have solved it, but still seems fragile.
- "close_fds=True" works on Windows if there is no redirection.
- Does not work well with `disable_standard_handle_inheritability` from `hgmain`.
  We patched it. See `contrib/python2-winbuild/0002-windows-make-subprocess-work-with-non-inheritable-st.patch`.

Python 3:

- "subprocess" uses native code for "fork & exec". It's safer.
- (>= 3.8) "close_fds=True" works on Windows even with redirection.
- "subprocess" exposes options to tweak low-level details on Windows.

Rust:

- No "close_fds=True" support for both Windows and Unix.
- Does not have the `disable_standard_handle_inheritability` issue on Windows.
- Impossible to cleanly support "close_fds=True" on Windows with existing stdlib.
  https://github.com/rust-lang/rust/pull/75551 attempts to add that to stdlib.
  D23124167 provides a short-term solution that can have corner cases.

Mercurial:

- `win32.spawndetached` uses raw Win32 APIs to spawn processes, bypassing
  the `subprocess` Python stdlib.
- Its use of `CreateProcessA` is undesirable. We probably want `CreateProcessW`
  (unless `CreateProcessA` speaks utf-8 natively).

We are still on Python 2 on Windows, and we'd need to spawn processes correctly
from Rust anyway, and D23124167 kind of fills the missing feature of `close_fds=True`
from Python. So let's expose the Rust APIs.

The binding APIs closely match the Rust API. So when we migrate from Python to
Rust, the translation is more straightforward.

Reviewed By: DurhamG

Differential Revision: D23124168

fbshipit-source-id: 94a404f19326e9b4cca7661da07a4b4c55bcc395
2020-08-31 17:34:48 -07:00
Jun Wu
b7f2ee577a spawn-ext: extend Command::spawn to avoid inheriting fds
Summary:
The Rust upstream took the "set F_CLOEXEC on every opened file" approach and
provided no support for closing fds at spawn time to make spawn lightweight [1].

However, that does not play well in our case:
- On Windows:
  - stdin/stdout/stderr are not created by Rust, and are inheritable by
    default (other processes like `cargo` or `dotslash` might leak them too).
  - a few other handles like "Null", "Afd" are inheritable. It's
    unclear how they get created, though.
  - Fortunately, files opened by Python or C in edenscm (ex. packfiles) seem to
    be not inheritable and do not require special handling.
- On Linux:
  - Files opened by Python or C likely lack F_CLOEXEC and need special
    handling.

Implement logic to close file handles (or set F_CLOEXEC) explicitly.

[1]: https://github.com/rust-lang/rust/issues/12148
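
The actual change lives in the Rust spawn-ext crate. As a Python-level analogy
only, the sketch below marks file descriptors as non-inheritable before
spawning. `mark_not_inheritable` is a hypothetical helper. It uses the
standard `os.set_inheritable`, which on Unix corresponds to setting the
close-on-exec flag.

```python
import os


def mark_not_inheritable(fds):
    """Clear the inheritable flag on each fd, skipping fds that are not open."""
    for fd in fds:
        try:
            os.set_inheritable(fd, False)
        except OSError:
            pass  # not an open descriptor; nothing to do


# Demonstration: make a pipe end inheritable, then undo that explicitly.
read_fd, write_fd = os.pipe()
os.set_inheritable(read_fd, True)
mark_not_inheritable([read_fd])
still_inheritable = os.get_inheritable(read_fd)
os.close(read_fd)
os.close(write_fd)
```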

Reviewed By: DurhamG

Differential Revision: D23124167

fbshipit-source-id: 32f3a1b9e3ae3a9475609df282151c9d6c4badd4
2020-08-31 17:34:48 -07:00
Jun Wu
b3fd513ea4 util: make gethgcmd more reliable
Summary:
It uses `sys.argv`, which might be rewritten by `debugshell`. Capture
`sys.argv` to make hgcmd more reliable.

Reviewed By: DurhamG

Differential Revision: D22993215

fbshipit-source-id: 5fa319e8023b656c6cdf96cb3229ea9f2c9b9b99
2020-08-31 17:34:48 -07:00
Jun Wu
333177101f hooks: add a hook point after write commands
Summary: This allows us to run commands after changes were made to the repo.

Reviewed By: DurhamG

Differential Revision: D22993218

fbshipit-source-id: d9943dcda94da42970fb9107f48f4caa14b6a9d4
2020-08-31 17:34:48 -07:00
David Tolnay
75c2118e01 Remove crate_root from Rust dependency info
Reviewed By: danobi

Differential Revision: D23430948

fbshipit-source-id: c4b374021325fc247121ceecd0e82a0291aa75d6
2020-08-31 14:43:24 -07:00
Jun Wu
9aa9d022ae util: stop using time.perf_counter() for timer()
Summary:
Some code paths (ex. metalog.commit) use `util.timer()` as a way to get
seconds since epoch, and get 0 for tests. Other use-cases of `util.timer()`
are ad-hoc time measurements for displaying speed / progress. They do not need
high precision or a strong guarantee that the clock does not go backwards. Drop
`time.perf_counter()` to meet the first use-case's expectation.
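
The difference can be sketched as follows. This assumes the replacement is a wall-clock source such as `time.time`; the `timer` and `measure` functions here are hypothetical stand-ins, not the `util` module itself. `time.perf_counter()` has an undefined reference point, so only differences between its values are meaningful, whereas `time.time()` returns seconds since the epoch.

```python
import time


def timer():
    """Hypothetical stand-in: wall-clock seconds since the epoch."""
    return time.time()


def measure(func):
    """Ad-hoc duration measurement, as used for speed / progress display."""
    start = timer()
    result = func()
    return result, timer() - start
```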

Reviewed By: singhsrb

Differential Revision: D23431253

fbshipit-source-id: 8bf2d1ed32e284e17285742e1d0fd7178f181fb3
2020-08-31 13:04:54 -07:00
Jun Wu
9f33746b31 histedit: do not show revision numbers
Summary:
With the segments backend, the revision numbers will be longer than commit
hashes and are confusing.

Reviewed By: DurhamG

Differential Revision: D23408971

fbshipit-source-id: e2057fa644fc7b6be4291f879eee3235bb4e687b
2020-08-31 11:57:53 -07:00
Jun Wu
96548cade8 remotefilelog: do not assume range(len(cl)) are valid revs in _linkrev
Summary: `range(len(cl))` contains invalid revs with segments backend.

Reviewed By: DurhamG

Differential Revision: D23411209

fbshipit-source-id: 2f83a5402bb46824cf38871926c1954507b64b56
2020-08-31 11:57:53 -07:00
Jun Wu
ff2d572717 changelog2: avoid excessive memory usage during large pulls
Summary:
Pulling from older repos (ex. years ago) could require GBs of commit text data.
Flush commit data if it exceeds a certain size.

This is for revlog compatibility.
In the future we will probably just make commit text lazy to avoid this kind of issue.
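
The size-triggered flush can be sketched like this. `CommitBuffer`, its `flush` callback, and the 1 MiB default are hypothetical; the actual threshold and API used by changelog2 are not stated here.

```python
class CommitBuffer:
    """Buffer commit text and flush once the pending size passes a limit."""

    def __init__(self, flush, max_bytes=1 << 20):
        self._flush = flush          # callback receiving a list of (node, text)
        self._max_bytes = max_bytes  # assumed threshold, for illustration only
        self._pending = []
        self._size = 0

    def add(self, node, text):
        self._pending.append((node, text))
        self._size += len(text)
        if self._size >= self._max_bytes:
            self.flush()

    def flush(self):
        if self._pending:
            self._flush(self._pending)
            # Start a new list so the flushed batch is left untouched.
            self._pending = []
            self._size = 0
```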

Reviewed By: DurhamG

Differential Revision: D23408834

fbshipit-source-id: 273384f5a05be07877bb1c9871c17b53ba436233
2020-08-31 11:57:53 -07:00
Jun Wu
01c551bb30 hgcommits: add flush_commit_data API
Summary: This would be used to avoid excessive memory usage during pull.

Reviewed By: DurhamG

Differential Revision: D23408833

fbshipit-source-id: 8edd95ab8201697074f65cc118d14755a230567d
2020-08-31 11:57:53 -07:00
Jun Wu
fee02d78e0 changelog2: only call addcommits once in addgroup
Summary:
`addcommits` is designed to be more efficient when called with a batch of
commits. So let's buffer the commits to add, then call it only once.

This avoids some N^2 behaviors. For example, the NameDag internally will
prepare a "snapshot" of itself, which involves copying the pending Rust vecs
about the segments and the id <-> hash map.
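
To see where the N^2 cost comes from, consider this toy model. `FakeDag` is a hypothetical stand-in that copies all pending entries on every `addcommits` call, loosely mimicking the snapshot behavior described above; it is not the real NameDag.

```python
class FakeDag:
    """Toy model: each addcommits call snapshots everything pending so far."""

    def __init__(self):
        self.pending = []
        self.copied = 0  # total number of entries copied by snapshots

    def addcommits(self, commits):
        self.pending.extend(commits)
        snapshot = list(self.pending)  # cost grows with the pending size
        self.copied += len(snapshot)


# One call per commit: 1 + 2 + ... + N copies.
one_by_one = FakeDag()
for commit in range(100):
    one_by_one.addcommits([commit])

# One batched call: N copies.
batched = FakeDag()
batched.addcommits(list(range(100)))
```

With 100 commits, the per-commit loop copies 5050 entries in total while the batched call copies 100.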

The change takes `pull` from unusably slow to usable:

Original Python Revlog backend:

```
In [1]: %trace repo.pull(bookmarknames=['master'],quiet=False)
 5191   +466      | Apply Changegroup                                   edenscm.mercurial.bundle2 line 516
                  | - Commits = 125                                     :
                  | - Range = a1d1b3ade136:2e3fe78af189                 :
 5191   +466      | changegroup.cg1unpacker.apply                       edenscm.mercurial.changegroup line 313
 5192   +416      | Progress Bar: commits                               (progressbar)
 5192   +415      | changelog.changelog.addgroup                        edenscm.mercurial.changelog line 536
 5192   +409      | revlog.revlog.addgroup                              edenscm.mercurial.revlog line 2116
 5215   +371      | changelog.changelog._addrevision (125 times)        edenscm.mercurial.changelog line 558
```

DoubleWrite (Segments + Revlog) backend, Before:

```
In [2]: %trace repo.pull(bookmarknames=['master'],quiet=False)
  2396 +154059   | Apply Changegroup                            edenscm.mercurial.bundle2 line 516
                 | - Commits = 323                              :
                 | - Range = cb0b100180ba:5fb57c74f72e          :
  2396 +154059   | changegroup.cg1unpacker.apply                edenscm.mercurial.changegroup line 313
  2397 +151433    \ Progress Bar: commits                       (progressbar)
  2397 +151433     | changelog2.changelog.addgroup              edenscm.mercurial.changelog2 line 334
```

DoubleWrite (Segments + Revlog) backend, After:

```
In [2]: %trace repo.pull(bookmarknames=['master'],quiet=False)
 4629   +512      | Apply Changegroup                                       edenscm.mercurial.bundle2 line 516
                  | - Commits = 45                                          :
                  | - Range = cf23c6972934:1ff0c5f0e7ad                     :
 4629   +512      | changegroup.cg1unpacker.apply                           edenscm.mercurial.changegroup line 313
 4630   +494      | changelog2.changelog.addgroup                           edenscm.mercurial.changelog2 line 334
```

Reviewed By: DurhamG

Differential Revision: D23390435

fbshipit-source-id: dd97a5008dedd844d4134b87bfef190fa739a80b
2020-08-31 11:57:52 -07:00
Jun Wu
e5a4533622 revlog: drop addrevisioncb from addgroup
Summary:
The users of addrevisioncb are gone.
This also removes the "alwayscache" parameter of "_addrevision".

Reviewed By: DurhamG

Differential Revision: D23390437

fbshipit-source-id: 7edd9dd0b93d4cb9d4f35d088a1aef719b450ec1
2020-08-31 11:57:52 -07:00
Jun Wu
1199790982 upgrade: remove the upgrade module
Summary: It is about legacy revlog formats that are no longer relevant.

Reviewed By: DurhamG

Differential Revision: D23390436

fbshipit-source-id: 58c2c432804181bcc6517d6c988777b843fc9ba4
2020-08-31 11:57:52 -07:00
Stanislau Hlebik
2d5000293e sparse: disallow changing profiles if it includes bad file
Summary:
We have a few safeguards against creating full checkouts. However we have
sparse profiles that are not full, but that include very large directories
which normally should not be included.

This diff adds logic that checks whether a new sparse profile includes any of
the "marker" files, i.e. files from a folder that should not be included. The
operation aborts if that is the case; however, there is always a way to work
around that.

Reviewed By: DurhamG

Differential Revision: D23414200

fbshipit-source-id: 626f392319eb1be8b35f39cadafb61f3c1dfefe3
2020-08-31 11:38:16 -07:00
Stanislau Hlebik
7bbf044a49 sparse: fix --sparse to work on eden
Summary:
"hg diff" has --sparse option which diffs only files inside a sparse checkout.
The problem is that it doesn't work on eden checkouts because eden repo doesn't
have sparsematch() function.

This diff makes it so that if sparsematch() function doesn't exist then
--sparse option is just ignored.
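
A minimal sketch of that fallback, assuming a plain attribute check is enough. `SparseRepo`, `EdenRepo`, `get_sparse_matcher`, and `files_to_diff` are hypothetical names used only for illustration.

```python
def get_sparse_matcher(repo):
    """Return the repo's sparse matcher, or None if the repo has none."""
    sparsematch = getattr(repo, "sparsematch", None)
    if sparsematch is None:
        return None
    return sparsematch()


def files_to_diff(repo, files, sparse=False):
    """Filter files by the sparse matcher; ignore --sparse when unsupported."""
    matcher = get_sparse_matcher(repo) if sparse else None
    if matcher is None:
        return list(files)
    return [f for f in files if matcher(f)]


class SparseRepo:
    """Stand-in for a repo with a sparse profile."""

    def sparsematch(self):
        return lambda path: path.startswith("included/")


class EdenRepo:
    """Stand-in for an eden repo, which has no sparsematch()."""
```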

The motivation for this change is
https://fb.workplace.com/groups/corehg/?post_id=687768245151742. There are some
diff calls that are triggered by arc lint that race with "hg update" and might download
loads of data on people's laptops. This diff doesn't fix the race, but it:
1) Makes sure we don't download too much data that are not in sparse profiles.
2) arc lint doesn't care about files outside of sparse profiles anyway, so
running --sparse makes sense.

Reviewed By: DurhamG

Differential Revision: D23396918

fbshipit-source-id: 2a386fdbeab85187e2c2acab69cb86b74124d46f
2020-08-28 23:47:40 -07:00
Jun Wu
fbc9b865b6 changegroup: do not calculate how many files received commits include
Summary:
This is practically just 0 in our production setup during `pull`s. In the
future when the commit data become lazy, it's no longer possible to read the
files locally. So let's just not scan the commits.

Reviewed By: DurhamG

Differential Revision: D23390438

fbshipit-source-id: 4c54c4aac5fd840205296ab86955ec1b8ab76607
2020-08-28 13:40:18 -07:00
root@sandcastle5869.frc3.facebook.com
5f749ee470 suppress errors in eden - batch 1
Differential Revision: D23401295

fbshipit-source-id: 01fe0ff888d074c503a445c6d97f17bf0ec2b79c
2020-08-28 12:46:36 -07:00
Durham Goode
08c938e859 dirstate: block addition of paths containing "." and ".."
Summary:
Mergedrivers can call dirstate.add directly and are adding paths with
"." and "..". Let's block those paths.
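
The check can be sketched as a per-component test. This assumes repo paths use `/` as the separator; `has_dot_component` and `checked_add` are hypothetical names, and the real dirstate raises its own error type.

```python
def has_dot_component(path):
    """True if any path component is exactly "." or ".."."""
    return any(part in (".", "..") for part in path.split("/"))


def checked_add(tracked, path):
    """Add `path` to the `tracked` set, rejecting "." and ".." components."""
    if has_dot_component(path):
        raise ValueError("cannot add path containing '.' or '..': %r" % path)
    tracked.add(path)
```

Names that merely contain dots, such as `a/b.c` or `a/..b`, are still accepted.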

Reviewed By: quark-zju

Differential Revision: D23375469

fbshipit-source-id: 64e9f20169cfd50325ecd8ebcc1dd3be7a5cb202
2020-08-28 09:42:25 -07:00
Durham Goode
2f5130c882 py3: fix extdiff
Summary:
extdiff uses shutil.rmtree which calls os.rmdir with new Python 3
options. Since we patch os.rmdir, we need to support those options.

Reviewed By: quark-zju

Differential Revision: D23350968

fbshipit-source-id: 081d179dcd67b51ffdeb6b85899adf4e574a8d0f
2020-08-27 19:15:22 -07:00
Jun Wu
f271d882e6 hgcommands: make commands! macro define modules
Summary: Similar to D18528858 so module names do not need to be spelled twice.

Reviewed By: markbt

Differential Revision: D23091380

fbshipit-source-id: a2a261abc9c78c8805cea62b38498ba65398796d
2020-08-27 19:02:27 -07:00
Arun Kulshreshtha
cb3f95d06e configparser: make code compile without "fb" feature
Summary: This crate would fail to build without the "fb" feature because `serde_json` was listed as an optional dependency (but is used in a way that isn't conditional on the `fb` feature). This diff makes the dependency non-optional, and also silences several dead code warnings that are emitted when building without the "fb" feature.

Reviewed By: quark-zju

Differential Revision: D23386786

fbshipit-source-id: b00a8b0b8b0b978c1cfab2838629fcb388a076e9
2020-08-27 18:28:46 -07:00
Jun Wu
d586a40ada hgcommands: add debugfsync
Summary:
The `debugfsync` command calls fsync on newly modified files in svfs.
Right now it only includes locations that we know have a constant number
of files.

The fsync logic is put in a separate crate to avoid slow compiles.

Reviewed By: DurhamG

Differential Revision: D23124169

fbshipit-source-id: 438296002eed14db599d6ec225183bf824096940
2020-08-27 18:26:03 -07:00
Xavier Deguillard
eb57ebb4d8 eden: decrease verbosity of "fetching tree" message
Summary:
A warning means that every tree fetched will be printed in the edenfs log,
which is way too much. Let's decrease this to a debug message.

Reviewed By: genevievehelsel

Differential Revision: D23385778

fbshipit-source-id: d77f1cac3efb945d4b95750822f2f12f48c75ffe
2020-08-27 18:16:51 -07:00
Jun Wu
c2d36d03c4 changegroup: avoid using rev numbers
Summary: `len(repo)` can no longer predict the next rev number. Use nodes instead.

Reviewed By: DurhamG

Differential Revision: D23307791

fbshipit-source-id: cc20e53f039eee2a714748352e8e98aab253095a
2020-08-27 18:14:29 -07:00
Jun Wu
d8e775f423 tracing-collector: limit maximum count of spans
Summary:
Some functions might be called very frequently. For example,
`phases.phasecache.loadphaserevs` might be called 100k+ times.
That makes the tracing data harder to process.

Limit the count of spans to 1k by default so the data is cheaper to process,
and some highly repetitive cases can now be reasoned about. Note the limit
is only put on static Span Ids. If a span uses dynamic metadata or asks for
a different Span Id each time, it will not be limited.
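
The per-Id limit can be modeled with a simple counter. This is a Python sketch of the idea only; the actual collector is written in Rust, and `SpanLimiter` and `should_record` are hypothetical names.

```python
from collections import Counter


class SpanLimiter:
    """Allow at most `max_per_id` recorded spans for each static span id."""

    def __init__(self, max_per_id=1000):
        self.max_per_id = max_per_id
        self.counts = Counter()

    def should_record(self, span_id):
        self.counts[span_id] += 1
        return self.counts[span_id] <= self.max_per_id
```

A function called 100k times under one static Id contributes at most 1000 spans, while spans that get a fresh Id each time are never dropped.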

In debugshell,

  td = %trace repo.revs('smartlog()')
  len(td.serialize())

dropped from 6MB to 0.87MB.

It's also possible to reason about:

  td = %trace len(repo.revs('ancestors(.)'))

in debugshell (taking 30s, 98KB serialized, vs 21s without tracing), while
previously the result would be too large to show (`%trace` just hangs).

Reviewed By: DurhamG

Differential Revision: D23307793

fbshipit-source-id: 3c1e9885ce7a275c2abd8935a4e4539a4f14ce83
2020-08-27 18:14:29 -07:00
Jun Wu
9f4dac104f dag: truncate output in <SpanSet as Debug>::fmt
Summary: Set a default limit so the output won't be too long.

Reviewed By: DurhamG

Differential Revision: D23307792

fbshipit-source-id: 7e2ed99e96bbde06436a034e78f899fc2e3e03f8
2020-08-27 18:14:29 -07:00
Jun Wu
54cd73b41b profiling: do not profile debugshell command
Summary:
The debugshell command can be long running and contains uninteresting stuff.
Do not profile it.

Practically this hides showing the background statprof thread when using `%trace`.

Reviewed By: DurhamG

Differential Revision: D23278597

fbshipit-source-id: bad97de22e1be2be8b866bee705ea3a6755aa54b
2020-08-27 18:14:29 -07:00
Jun Wu
d92c80ebcc dispatch: enter ipdb for "NameError 'ipdb' is not defined"
Summary:
This allows entering ipdb for code like: `ipdb` or `ipdb()`. It can be handy to
debug something.

Reviewed By: DurhamG

Differential Revision: D23278599

fbshipit-source-id: 4355dd1944617aeb795450935789f01f66f094eb
2020-08-27 18:14:28 -07:00
Jun Wu
28fa0e1cfe debugshell: add %trace and %hg magics
Summary: This makes it possible to get tracing results, or run hg commands directly.

Reviewed By: DurhamG

Differential Revision: D23278601

fbshipit-source-id: e7dc92080d2881cb4155a481df5ca93f324828fc
2020-08-27 18:14:28 -07:00
Jun Wu
ed78542610 dispatch: add --trace flag
Summary:
The `--trace` flag enables tracing Python modules.
For compatibility reasons, it also enables `--traceback`.

It can be used with debugshell to make `%trace` more useful.

Reviewed By: sfilipco

Differential Revision: D23278600

fbshipit-source-id: d6d0b34bd5c48111f8cd33d7df115f349b0e95b6
2020-08-27 18:14:28 -07:00
Jun Wu
3bbdfd3743 revset: successors(x) should only show visible commits
Summary:
I found this when I aborted a rebase of Dxxx and tried rebasing again, and it
complained about "nothing to rebase". It was caused by Dxxx resolving into
a hidden commit.

Reviewed By: sfilipco

Differential Revision: D23307794

fbshipit-source-id: f7a956b5300240089b6a4648f28cf4a152ee2433
2020-08-27 18:14:28 -07:00
Arun Kulshreshtha
0b9ca4e83b hgcommands: remove unused imports in dynamicconfig module
Summary: Remove unused imports.

Reviewed By: quark-zju

Differential Revision: D23356940

fbshipit-source-id: 31b81eac11946aa8b24ec23c98ddb14716fbea3a
2020-08-27 14:06:52 -07:00
Genevieve Helsel
3eb96cfb62 fix dictionary changed size during iteration in patch
Summary:
We shouldn't delete from a dictionary while iterating over it. Instead, we should iterate over a copy and then delete from the original.

`.items()` returns a view of the dict, while wrapping it in `list` makes a shallow copy (a snapshot list of the key/value pairs).
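
A minimal example of the pattern; `drop_empty` is a hypothetical function, not the code in patch.

```python
def drop_empty(d):
    """Delete entries with falsy values while iterating over a snapshot."""
    for key, value in list(d.items()):
        if not value:
            del d[key]
    return d
```

Iterating over `d.items()` directly here can raise `RuntimeError: dictionary changed size during iteration` once an entry has been deleted.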

Reviewed By: DurhamG

Differential Revision: D23283668

fbshipit-source-id: a168eef1ed2a1ce02fe71b3f6e3aed090965d2a4
2020-08-27 13:14:36 -07:00
Durham Goode
fe56f44ca0 treemanifest: prevent fetching nullid
Summary:
Mononoke throws an error if we request the nullid. In the long term we
want to get rid of the concept of the nullid entirely, so let's just add some
Python level blocks to prevent us from attempting to fetch it. This way we can
start to limit how much Rust has to know about these concepts.
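
Such a Python-level block might look like the following sketch. The nullid is 20 zero bytes; `without_nullid` is a hypothetical helper, not the function changed in treemanifest.

```python
NULLID = b"\0" * 20


def without_nullid(nodes):
    """Drop the nullid from a list of nodes before fetching them."""
    return [node for node in nodes if node != NULLID]
```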

Reviewed By: sfilipco

Differential Revision: D23332359

fbshipit-source-id: 8a67703ba1197ead00d4984411f7ae0325612605
2020-08-27 09:59:40 -07:00
Durham Goode
4d4e425624 configs: add fbitwhoami tiers to dynamicconfig inputs
Summary:
Corp has a different concept of tier than prod. Let's load the corp
tier into our tier set as well.

Reviewed By: quark-zju

Differential Revision: D23354056

fbshipit-source-id: c9543b8253f042c7b1224578e0687b4bdf21738e
2020-08-27 09:24:28 -07:00
Durham Goode
c190d283ec py3: don't use universal newlines for patch import
Summary:
The Python 3 email library internally stores the message as text, even
though our input and requested output is bytes. Let's make our own wrapper
around the parser to use ascii surrogateescape encoding so we can get the
actual bytes out later and not get universal newlines.

Based off the upstream 7b12a2d2eedc995405187cdf9a35736a14d60706,
which is basically a copy of the BytesParser implementation (https://github.com/python/cpython/blob/3.8/Lib/email/parser.py) with
newline=chr(10) added.
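
A sketch of such a wrapper, modeled on the description above rather than copied from this commit. `parse_patch` is a hypothetical helper. It relies on `email.parser.BytesParser` keeping its text parser in `self.parser`, as in the CPython 3.8 source linked above.

```python
import email.parser
import io


class BytesParser(email.parser.BytesParser):
    """Like email.parser.BytesParser, but without universal newlines."""

    def parse(self, fp, headersonly=False):
        # newline=chr(10) disables universal newline translation, so "\r\n"
        # in the input survives; surrogateescape keeps non-ASCII bytes
        # recoverable later.
        fp = io.TextIOWrapper(
            fp, encoding="ascii", errors="surrogateescape", newline=chr(10)
        )
        try:
            return self.parser.parse(fp, headersonly)
        finally:
            fp.detach()


def parse_patch(data):
    """Parse a bytes message with the wrapper above (hypothetical helper)."""
    return BytesParser().parse(io.BytesIO(data))
```

With this wrapper, a body containing `\r\n` line endings comes back from `get_payload()` with those line endings intact.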

Reviewed By: quark-zju

Differential Revision: D23363965

fbshipit-source-id: 880f0642cce96edfdd22da5908c0b573887bed12
2020-08-27 09:21:04 -07:00
Liubov Dmitrieva
06c1d37383 move try up in the rejoin command
Summary:
`hg cloud rejoin` command is used in fbclone and it is supposed to print a
message on RegistrationError but this has been broken recently.

Reviewed By: markbt

Differential Revision: D23342773

fbshipit-source-id: 4f3318848953656dea65a2b5d4d832694f6b353c
2020-08-27 06:53:28 -07:00
Liubov Dmitrieva
bd63a78f96 add more information to hg cloud leave command
Summary:
There are users who prefer to run `hg cloud leave` if they notice they are
connected to commit cloud sync.

Providing more information and adding a prompt might help them change their minds.

For some users who left, a new fbclone will connect them back. So on the next leave they can learn more about Commit Cloud Workspaces.

Reviewed By: markbt

Differential Revision: D23346091

fbshipit-source-id: 72f170f7133cd64b772ec75ae29a85dc8809e351
2020-08-26 22:43:20 -07:00
Durham Goode
8f9c0899cc update: fix performance of updating to null commit
Summary:
When updating to the null commit, the logic that computes the update
distance was broken. The null commit is pre-resolved to -1, which, when passed
raw to a revset, gets resolved as the tip commit. In large repositories this can
take a long time and use a lot of memory, since it's computing the difference
between tip and null.

Let's fix it to not pass the raw rev number, and also to handle the case of a 0
distance update.

Reviewed By: quark-zju

Differential Revision: D23358402

fbshipit-source-id: 3b0a1fe1bbcb07effba4d0ab2c092e66bdc02e67
2020-08-26 22:14:59 -07:00
Jun Wu
12d23ba64d revisionstore: fix GitHub build (#46)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/46

See https://github.com/facebookexperimental/eden/runs/1034006668:

   error: unused import: `env::set_var`
      --> src/lfs.rs:1539:15
       |
  1539 |     use std::{env::set_var, str::FromStr};
       |               ^^^^^^^^^^^^
       |
  note: the lint level is defined here
      --> src/lib.rs:125:9
       |
  125  | #![deny(warnings)]
       |         ^^^^^^^^
       = note: `#[deny(unused_imports)]` implied by `#[deny(warnings)]`

  error: unnecessary braces around method argument
      --> src/lfs.rs:2439:36
       |
  2439 |         remote.batch_upload(&objs, { move |sha256| local_lfs.blobs.get(&sha256) })?;
       |                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: remove these braces
       |
  note: the lint level is defined here
      --> src/lib.rs:125:9
       |
  125  | #![deny(warnings)]
       |         ^^^^^^^^
       = note: `#[deny(unused_braces)]` implied by `#[deny(warnings)]`

  error: aborting due to 2 previous errors

  error: could not compile `revisionstore`.

I dropped `#![deny(warnings)]` as I don't think warnings like the above ones
should break the build. (denying specific warnings that we care about explicitly
might be a better approach)

Reviewed By: singhsrb

Differential Revision: D23362178

fbshipit-source-id: 02258f57727edfac9818cd29dda5e451c7ca80a7
2020-08-26 20:40:25 -07:00
Arun Kulshreshtha
30e2cf4413 cargo_from_buck: reenable autocargo for edenapi
Summary: Now that it is possible to control which features are enabled on manually-managed dependencies, we can reenable autocargo for `edenapi`. See D23216925, D23327844, and D23329351 (840e6dd6f6) for context.

Reviewed By: dtolnay

Differential Revision: D23335122

fbshipit-source-id: 8ce250c3a106d2a02f457f7ed531623dd866232f
2020-08-26 19:16:48 -07:00
Jun Wu
d60e80796a py3: fix absorb -i
Summary: The command does not crash but `-` lines are ignored.

Reviewed By: DurhamG

Differential Revision: D23357655

fbshipit-source-id: f48568bc193f947503bc19f3e192b33346c317e1
2020-08-26 17:21:01 -07:00
Jun Wu
039419d281 configparser: fix non-fb dependencies (#45)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/45

Fix referring to 'version' without proper codegen by making 'version' compile
without codegen. This fixes configparser test when version/src/lib.rs was not
generated.

Make unneeded deps without 'fb' feature optional.

This would hopefully fix the "EdenSCM Rust Libraries" GitHub workflow.

Reviewed By: DurhamG

Differential Revision: D23269864

fbshipit-source-id: f9e691fe0a75159c4530177b8a96dad47d2494a9
2020-08-26 16:31:00 -07:00
Jun Wu
0705bd3b8d pydag: use dag::delegate to simplify code
Summary: This makes the code simpler.

Reviewed By: sfilipco

Differential Revision: D23269858

fbshipit-source-id: bb9ac0bd1696f7429ca1856e6c63e04fabc2757a
2020-08-26 15:32:26 -07:00