Commit Graph

60098 Commits

Author SHA1 Message Date
Jun Wu
a01693df0e util: use Rust pyprocess to implement spawndetached
Summary:
The Rust bindings handle the cross-platform differences and avoids issues
with Python / Rust interaction. Use it.

As we're here, extend the API to support cwd and env.

Reviewed By: DurhamG

Differential Revision: D23124171

fbshipit-source-id: fdc13f6eaeb25c05b53d385eb220af33dad984e1
2020-08-31 17:34:48 -07:00
Jun Wu
a90c8ea775 bindings: export rust process handling to Python
Summary:
Spawning processes turns out to be tricky.

Python 2:

- "fork & exec" in plain Python is potentially dangerous. See D22855986 (c35b8088ef).
  Disabling GC might have solved it, but still seems fragile.
- "close_fds=True" works on Windows if there is no redirection.
- Does not work well with `disable_standard_handle_inheritability` from `hgmain`.
  We patched it. See `contrib/python2-winbuild/0002-windows-make-subprocess-work-with-non-inheritable-st.patch`.

Python 3:

- "subprocess" uses native code for "fork & exec". It's safer.
- (>= 3.8) "close_fds=True" works on Windows even with redirection.
- "subprocess" exposes options to tweak low-level details on Windows.

Rust:

- No "close_fds=True" support for both Windows and Unix.
- Does not have the `disable_standard_handle_inheritability` issue on Windows.
- Impossible to cleanly support "close_fds=True" on Windows with existing stdlib.
  https://github.com/rust-lang/rust/pull/75551 attempts to add that to stdlib.
  D23124167 provides a short-term solution that can have corner cases.

Mercurial:

- `win32.spawndetached` uses raw Win32 APIs to spawn processes, bypassing
  the `subprocess` Python stdlib.
- Its use of `CreateProcessA` is undesirable. We probably want `CreateProcessW`
  (unless `CreateProcessA` speaks utf-8 natively).

We are still on Python 2 on Windows, and we'd need to spawn processes correctly
from Rust anyway, and D23124167 kind of fills the missing feature of `close_fds=True`
from Python. So let's expose the Rust APIs.

The binding APIs closely match the Rust API. So when we migrate from Python to
Rust, the translation is more straightforward.

Reviewed By: DurhamG

Differential Revision: D23124168

fbshipit-source-id: 94a404f19326e9b4cca7661da07a4b4c55bcc395
2020-08-31 17:34:48 -07:00
Jun Wu
b7f2ee577a spawn-ext: extend Command::spawn to avoid inheriting fds
Summary:
The Rust upstream took the "set F_CLOEXEC on every opened file" approach and
provided no support for closing fds at spawn time to make spawn lightweight [1].

However, that does not play well in our case:
- On Windows:
  - stdin/stdout/stderr are not created by Rust, and inheritable by
    default (other process like `cargo`, or `dotslash` might leak them too).
  - a few other handles like "Null", "Afd" are inheritable. It's
    unclear how they get created, though.
  - Fortunately, files opened by Python or C in edenscm (ex. packfiles) seem to
    be not inheritable and do not require special handling.
- On Linux:
  - Files opened by Python or C are likely lack of F_CLOEXEC and need special
    handling.

Implement logic to close file handlers (or set F_CLOEXEC) explicitly.

[1]: https://github.com/rust-lang/rust/issues/12148

Reviewed By: DurhamG

Differential Revision: D23124167

fbshipit-source-id: 32f3a1b9e3ae3a9475609df282151c9d6c4badd4
2020-08-31 17:34:48 -07:00
Jun Wu
b3fd513ea4 util: make gethgcmd more reliable
Summary:
It uses `sys.argv`, which might be rewritten by `debugshell`. Capture
`sys.argv` to make hgcmd more reliable.

Reviewed By: DurhamG

Differential Revision: D22993215

fbshipit-source-id: 5fa319e8023b656c6cdf96cb3229ea9f2c9b9b99
2020-08-31 17:34:48 -07:00
Jun Wu
333177101f hooks: add a hook point after write commands
Summary: This allows us to run commands after changes were made to the repo.

Reviewed By: DurhamG

Differential Revision: D22993218

fbshipit-source-id: d9943dcda94da42970fb9107f48f4caa14b6a9d4
2020-08-31 17:34:48 -07:00
svcscm
05a99d0549 Updating submodules
Summary:
GitHub commits:

d34550f88c
09f60921e3
cbc7632511
bc8a9f69af
acbdb7449b
1dc6251581
e72282f72e
268329cded
dd0145feae
e5a417434d
4fc01683c3
a9397bf721
608ef20f8a

Reviewed By: 2d2d2d2d2d

fbshipit-source-id: 96bd1fc40f66db272ba9eead14846202e623e1ed
2020-08-31 17:26:08 -07:00
Zeyi (Rice) Fan
c693932e8e return returncode correctly
Reviewed By: xavierd

Differential Revision: D23434438

fbshipit-source-id: 813f987cf62e72c0b6704b31e6e9168006735b6f
2020-08-31 16:26:46 -07:00
svcscm
9599bb5f18 Updating submodules
Summary:
GitHub commits:

cddd02a930
962263bed9

Reviewed By: 2d2d2d2d2d

fbshipit-source-id: 393000dbefb4d10164d8e2b76eb87def72e17751
2020-08-31 15:22:04 -07:00
David Tolnay
75c2118e01 Remove crate_root from Rust dependency info
Reviewed By: danobi

Differential Revision: D23430948

fbshipit-source-id: c4b374021325fc247121ceecd0e82a0291aa75d6
2020-08-31 14:43:24 -07:00
svcscm
1bc22d542f Updating submodules
Summary:
GitHub commits:

5f5f414e9a
b6d2974b18
193e618d57
792d2f906e
b0787925bd
4d116de925
9ed1186cd9
54ec33e474
8d1c8c0e38

Reviewed By: 2d2d2d2d2d

fbshipit-source-id: a9c16f83d1d6c68e5089810d4d28edc1eac66e97
2020-08-31 14:03:34 -07:00
Jun Wu
9aa9d022ae util: stop using time.perf_counter() for timer()
Summary:
Some code paths (ex. metalog.commit) use `util.timer()` as a way to get
seconds since epoch, and get 0 for tests. Other use-cases of `util.timer()`
are ad-hoc time measure for displaying speed / progress. They do not need high
precision or strong guarantee that the clock does not go backwards. Drop the
`time.perf_counter()` to meet the first use-case's expectation.

Reviewed By: singhsrb

Differential Revision: D23431253

fbshipit-source-id: 8bf2d1ed32e284e17285742e1d0fd7178f181fb3
2020-08-31 13:04:54 -07:00
Jun Wu
9f33746b31 histedit: do not show revision numbers
Summary:
With segments backend, the revision numbers will be longer than commit hashes
and are confusing.

Reviewed By: DurhamG

Differential Revision: D23408971

fbshipit-source-id: e2057fa644fc7b6be4291f879eee3235bb4e687b
2020-08-31 11:57:53 -07:00
Jun Wu
96548cade8 remotefilelog: do not assume range(len(cl)) are valid revs in _linkrev
Summary: `range(len(cl))` contains invalid revs with segments backend.

Reviewed By: DurhamG

Differential Revision: D23411209

fbshipit-source-id: 2f83a5402bb46824cf38871926c1954507b64b56
2020-08-31 11:57:53 -07:00
Jun Wu
ff2d572717 changelog2: avoid excessive memory usage during large pulls
Summary:
Pulling from older repos (ex. years ago) could require GBs of commit text data.
Flush commit data if they exceed certain size.

This is for revlog compatibility.
In the future we probably just make commit text lazy to avoid this kind of issues.

Reviewed By: DurhamG

Differential Revision: D23408834

fbshipit-source-id: 273384f5a05be07877bb1c9871c17b53ba436233
2020-08-31 11:57:53 -07:00
Jun Wu
01c551bb30 hgcommits: add flush_commit_data API
Summary: This would be used to avoid excessive memory usage during pull.

Reviewed By: DurhamG

Differential Revision: D23408833

fbshipit-source-id: 8edd95ab8201697074f65cc118d14755a230567d
2020-08-31 11:57:53 -07:00
Jun Wu
fee02d78e0 changelog2: only call addcommits once in addgroup
Summary:
`addcommits` is designed to be more efficiently if called with a batch of
commits. So let's buffer the commits to add then only call it once.

This avoids some N^2 behaviors, for example, the NameDag internally will
prepare "snapshot" of itself which involves coping the pending Rust vecs
about the segments and id <-> hash map.

The change makes `pull` usable from unusably slow:

Original Python Revlog backend:

```
In [1]: %trace repo.pull(bookmarknames=['master'],quiet=False)
 5191   +466      | Apply Changegroup                                   edenscm.mercurial.bundle2 line 516
                  | - Commits = 125                                     :
                  | - Range = a1d1b3ade136:2e3fe78af189                 :
 5191   +466      | changegroup.cg1unpacker.apply                       edenscm.mercurial.changegroup line 313
 5192   +416      | Progress Bar: commits                               (progressbar)
 5192   +415      | changelog.changelog.addgroup                        edenscm.mercurial.changelog line 536
 5192   +409      | revlog.revlog.addgroup                              edenscm.mercurial.revlog line 2116
 5215   +371      | changelog.changelog._addrevision (125 times)        edenscm.mercurial.changelog line 558
```

DoubleWrite (Segments + Revlog) backend, Before:

```
In [2]: %trace repo.pull(bookmarknames=['master'],quiet=False)
  2396 +154059   | Apply Changegroup                            edenscm.mercurial.bundle2 line 516
                 | - Commits = 323                              :
                 | - Range = cb0b100180ba:5fb57c74f72e          :
  2396 +154059   | changegroup.cg1unpacker.apply                edenscm.mercurial.changegroup line 313
  2397 +151433    \ Progress Bar: commits                       (progressbar)
  2397 +151433     | changelog2.changelog.addgroup              edenscm.mercurial.changelog2 line 334
```

DoubleWrite (Segments + Revlog) backend, After:

```
In [2]: %trace repo.pull(bookmarknames=['master'],quiet=False)
 4629   +512      | Apply Changegroup                                       edenscm.mercurial.bundle2 line 516
                  | - Commits = 45                                          :
                  | - Range = cf23c6972934:1ff0c5f0e7ad                     :
 4629   +512      | changegroup.cg1unpacker.apply                           edenscm.mercurial.changegroup line 313
 4630   +494      | changelog2.changelog.addgroup                           edenscm.mercurial.changelog2 line 334
```

Reviewed By: DurhamG

Differential Revision: D23390435

fbshipit-source-id: dd97a5008dedd844d4134b87bfef190fa739a80b
2020-08-31 11:57:52 -07:00
Jun Wu
e5a4533622 revlog: drop addrevisoncb from addgroup
Summary:
The users of addrevisoncb are gone.
This also removes the "alwayscache" parameter of "_addrevision".

Reviewed By: DurhamG

Differential Revision: D23390437

fbshipit-source-id: 7edd9dd0b93d4cb9d4f35d088a1aef719b450ec1
2020-08-31 11:57:52 -07:00
Jun Wu
1199790982 upgrade: remove the upgrade module
Summary: It is about legacy revlog formats that are no longer relevant.

Reviewed By: DurhamG

Differential Revision: D23390436

fbshipit-source-id: 58c2c432804181bcc6517d6c988777b843fc9ba4
2020-08-31 11:57:52 -07:00
Stanislau Hlebik
2d5000293e sparse: disallow changing profiles if it includes bad file
Summary:
We have a few safeguards against creating full checkouts. However we have
sparse profiles that are not full, but that include very large directories
which normally should not be included.

This diff adds a logic that checks if a new sparse profile has any of the "marker"
files i.e. some files from a folder that should not be included. Operation
aborts if that the case, however there's always a way to workaround that.

Reviewed By: DurhamG

Differential Revision: D23414200

fbshipit-source-id: 626f392319eb1be8b35f39cadafb61f3c1dfefe3
2020-08-31 11:38:16 -07:00
svcscm
9e1c4eac92 Updating submodules
Summary:
GitHub commits:

041d632a47

Reviewed By: 2d2d2d2d2d

fbshipit-source-id: c4f7bf28b14cf48e0eaa9b001dd0368328813c89
2020-08-31 10:32:44 -07:00
Stanislau Hlebik
71e1d6493e pass context to getOrLoadChild
Summary:
Scuba logging that tracks undesired file fetches has some blind spots i.e. a
lot of fetches have null pid and null cmd line. This diff tries to fix another
part of the problem.

TreeInode::getOrLoadChild() has TODO `pass a fetch context down through
getOrLoadChild to track this load`. This diff fixes this TODO, and also starts
to pass context from EdenDispatcher:lookup method.

Note that it adds quite a lot of new `ObjectFetchContext::getNullContext()`
calls, and potentially those might be responsible for blind spots in logging.
I'll try to address this problem in the next diffs.

Reviewed By: kmancini

Differential Revision: D23418218

fbshipit-source-id: 319d7436494d8dce3580289aae9963aa13bfc191
2020-08-31 10:05:02 -07:00
Stanislau Hlebik
ea4e64864c fix one case of logging of null ClientPid
Summary:
Scuba logging that tracks undesired file fetches has some blind spots i.e. a
lot of fetches have null pid and null cmd line. This diff fixes at least part
of the problem.

TreePrefetchContext which is used from TreePrefetchLease didn't logged client
pid at all (in fact, it logged almost nothing). This diff fixes at least one blind spot, however it doesn't look like this is the only one.

Reviewed By: kmancini

Differential Revision: D23417451

fbshipit-source-id: 107884e94c6b40de999328ec2ef78fe22174c1ca
2020-08-31 10:05:02 -07:00
Genevieve Helsel
5157cf6f34 fix eden thrift legacy dependencies
Summary: `legacy.py` depends on other files in this directory, so lets add them all to the link tree

Reviewed By: fanzeyi

Differential Revision: D23356917

fbshipit-source-id: e4bfd82ebbd9d143a5454a43bb47e8dd55b4485f
2020-08-31 07:55:27 -07:00
Genevieve Helsel
89791f6b89 mock kerberos checks in doctor tests
Summary: These tests fail locally since adding these checks in eden doctor, so we need to mock them.

Reviewed By: chadaustin

Differential Revision: D23326597

fbshipit-source-id: 87a0e6ab0472e3ae56f89503e928c5a00a16ab04
2020-08-31 07:51:17 -07:00
svcscm
84f5d59ca6 Updating submodules
Summary:
GitHub commits:

459508756b

Reviewed By: bigfootjon

fbshipit-source-id: 00b90faed838e20114b6b86c6f1220f7fddd0df7
2020-08-30 19:22:40 -07:00
svcscm
7e4a373f87 Updating submodules
Summary:
GitHub commits:

382d87a414
a22d69aad7
e473d6c8d0
473b9f42b0
7f8bdbff06

Reviewed By: bigfootjon

fbshipit-source-id: edac2a71159cf0864497a09efadf24bd6f8fc22f
2020-08-30 03:23:15 -07:00
svcscm
fbb2ab60e2 Updating submodules
Summary:
GitHub commits:

4c4e6843ea
5ffc7bc0fe

Reviewed By: bigfootjon

fbshipit-source-id: af0dfb3fc44517e56719832fa1961eba9f714892
2020-08-29 02:55:06 -07:00
Stanislau Hlebik
7bbf044a49 sparse: fix --sparse to work on eden
Summary:
"hg diff" has --sparse option which diffs only files inside a sparse checkout.
The problem is that it doesn't work on eden checkouts because eden repo doesn't
have sparsematch() function.

This diff makes it so that if sparsematch() function doesn't exist then
--sparse option is just ignored.

The motivation for this change is
https://fb.workplace.com/groups/corehg/?post_id=687768245151742. There are some
diff calls that are triggered by arc lint that race with "hg update" and might download
loads of data on people's laptops. This diff doesn't fix the race, but it:
1) Makes sure we don't download too much data that are not in sparse profiles.
2) arc lint doesn't care about files outside of sparse profiles anyway, so
running --sparse make sense.

Reviewed By: DurhamG

Differential Revision: D23396918

fbshipit-source-id: 2a386fdbeab85187e2c2acab69cb86b74124d46f
2020-08-28 23:47:40 -07:00
svcscm
f447d4d04d Updating submodules
Summary:
GitHub commits:

7ba0dc60d7

Reviewed By: bigfootjon

fbshipit-source-id: 1977d472336b939dc952d4ee45233c6c83ea42f6
2020-08-28 23:47:40 -07:00
svcscm
eea1a70175 Updating submodules
Summary:
GitHub commits:

76f0201355
24eab67fc9

Reviewed By: bigfootjon

fbshipit-source-id: 2709e5f3edf3314b8aa74b0f1a75d427fb63725a
2020-08-28 19:57:23 -07:00
Xavier Deguillard
4f9e1750c2 cli: enable doctor on Windows
Summary:
Most of the fixes are pretty trivial as the code was using functions not
present on Windows, either work around them, or switch to ones that are
multi-platform.

Of note, it looks like `hg doctor` doesn't properly detect when Mercurial and
EdenFS are out of sync, disabling the  tests until we figure out why.

Reviewed By: genevievehelsel, fanzeyi

Differential Revision: D23409708

fbshipit-source-id: 3314c197d43364dda13891a6874caab4c29e76ca
2020-08-28 19:49:37 -07:00
svcscm
6b498dfa02 Updating submodules
Summary:
GitHub commits:

86a15594f4
a4191b64fa
22ddfcb288
963314ffd6
20d994faf4

Reviewed By: bigfootjon

fbshipit-source-id: d872e2a2dbf3e57a909ae7464b49fa4c97c4746a
2020-08-28 19:49:37 -07:00
svcscm
08230e8225 Updating submodules
Summary:
GitHub commits:

cc04af2926
8dc2d39f8e
50ef7df1a4

Reviewed By: bigfootjon

fbshipit-source-id: 2001f26ded598a966f37cc965d5bcbcd54aaf673
2020-08-28 16:54:23 -07:00
Stefan Filip
a86409361c thrift: sync repos.thrift to fbcode
Summary:
`fbcode` sync after adding the `update_algorithm` option to
`RawSegmentedChangelogConfig`.

Reviewed By: quark-zju

Differential Revision: D23376694

fbshipit-source-id: df8a378303ec00d5d7d7ac3fff7a4a8cffc99328
2020-08-28 16:37:57 -07:00
svcscm
a94fd1f718 Updating submodules
Summary:
GitHub commits:

4eec8b81bf
db7a54ac3c
1e59800718
a3f588cc0a
127ebf2ef2
982e74de54

Reviewed By: bigfootjon

fbshipit-source-id: aa5a21c34f309a1053d1b9807006e5ed694e6cb2
2020-08-28 16:26:07 -07:00
Jun Wu
fbc9b865b6 changegroup: do not calculate how many files received commits include
Summary:
This is practically just 0 in our production setup during `pull`s. In the
future when the commit data become lazy, it's no longer possible to read the
files locally. So let's just don't scan the commits.

Reviewed By: DurhamG

Differential Revision: D23390438

fbshipit-source-id: 4c54c4aac5fd840205296ab86955ec1b8ab76607
2020-08-28 13:40:18 -07:00
root@sandcastle5869.frc3.facebook.com
5f749ee470 suppress errors in eden - batch 1
Differential Revision: D23401295

fbshipit-source-id: 01fe0ff888d074c503a445c6d97f17bf0ec2b79c
2020-08-28 12:46:36 -07:00
svcscm
3327171011 Updating submodules
Summary:
GitHub commits:

f00d5462f3

Reviewed By: bigfootjon

fbshipit-source-id: a3a16b3449b12e9654f8a91d7e0b2347a39d83f4
2020-08-28 12:46:36 -07:00
svcscm
2b469854e2 Updating submodules
Summary:
GitHub commits:

dc667acde0
defe385bcc
6244fa0ce9

Reviewed By: bigfootjon

fbshipit-source-id: 3d2288c094b624ad20da97e074a1570681431c0c
2020-08-28 10:35:03 -07:00
Durham Goode
08c938e859 dirstate: block addition of paths containing "." and ".."
Summary:
Mergedrivers can call dirstate.add directly and are adding paths with
"." and "..". Let's block those paths.

Reviewed By: quark-zju

Differential Revision: D23375469

fbshipit-source-id: 64e9f20169cfd50325ecd8ebcc1dd3be7a5cb202
2020-08-28 09:42:25 -07:00
svcscm
3adb4358ae Updating submodules
Summary:
GitHub commits:

7f7fb182e4

Reviewed By: bigfootjon

fbshipit-source-id: bf332ef50b1642d41c6212fbc5785a0fc59271f2
2020-08-28 09:42:25 -07:00
svcscm
b84ee302fb Updating submodules
Summary:
GitHub commits:

4c6cb73618
6f42044542

Reviewed By: bigfootjon

fbshipit-source-id: daeff573e04be9bed7630b950f04bed0535306f2
2020-08-27 19:26:19 -07:00
Durham Goode
2f5130c882 py3: fix extdiff
Summary:
extdiff uses shutil.rmtree which calls os.rmdir with new python 3
options. Since we pathc os.rmdir, we need to support those options.

Reviewed By: quark-zju

Differential Revision: D23350968

fbshipit-source-id: 081d179dcd67b51ffdeb6b85899adf4e574a8d0f
2020-08-27 19:15:22 -07:00
Jun Wu
f271d882e6 hgcommands: make commands! macro define modules
Summary: Similar to D18528858 so module names do not need to be spelled twice.

Reviewed By: markbt

Differential Revision: D23091380

fbshipit-source-id: a2a261abc9c78c8805cea62b38498ba65398796d
2020-08-27 19:02:27 -07:00
svcscm
9e32b45b04 Updating submodules
Summary:
GitHub commits:

8e34195422
c2485f2d81
bf087a7fb2
6dea19cad4
b0dbab25ee

Reviewed By: bigfootjon

fbshipit-source-id: e19acce6b5742cb085ac9ab0bd5c9cde7b96469e
2020-08-27 18:50:50 -07:00
Arun Kulshreshtha
cb3f95d06e configparser: make code compile without "fb" feature
Summary: This crate would fail to build without the "fb" feature because `serde_json` was listed as an optional dependency (but is used in a way that isn't conditional on the `fb` feature). This diff makes the dependency non-optional, and also silences several dead code warnings that are emitted when building without the "fb" feature.

Reviewed By: quark-zju

Differential Revision: D23386786

fbshipit-source-id: b00a8b0b8b0b978c1cfab2838629fcb388a076e9
2020-08-27 18:28:46 -07:00
Jun Wu
d586a40ada hgcommands: add debugfsync
Summary:
The `debugfsync` command calls fsync on newly modified files in svfs.
Right now it only includes locations that we know have constant number
of files.

The fsync logic is put in a separate crate to avoid slow compiles.

Reviewed By: DurhamG

Differential Revision: D23124169

fbshipit-source-id: 438296002eed14db599d6ec225183bf824096940
2020-08-27 18:26:03 -07:00
Xavier Deguillard
eb57ebb4d8 eden: decrease verbosity of "fetching tree" message
Summary:
A warning means that every tree fetched will be printed in the edenfs log,
which is way too much. Let's decrease this to a debug message.

Reviewed By: genevievehelsel

Differential Revision: D23385778

fbshipit-source-id: d77f1cac3efb945d4b95750822f2f12f48c75ffe
2020-08-27 18:16:51 -07:00
Jun Wu
c2d36d03c4 changegroup: avoid using rev numbers
Summary: `len(repo)` can no longer predicate the next rev number. Use nodes instead.

Reviewed By: DurhamG

Differential Revision: D23307791

fbshipit-source-id: cc20e53f039eee2a714748352e8e98aab253095a
2020-08-27 18:14:29 -07:00
Jun Wu
d8e775f423 tracing-collector: limit maximum count of spans
Summary:
Some functions might be called very frequently. For example,
`phases.phasecache.loadphaserevs` might be called 100k+ times.
That makes the tracing data harder to process.

Limit the count of spans to 1k by default so the data is cheaper to process,
and some highly repetitive cases can now be reasoned about. Note the limit
is only put on static Span Ids. If a span uses dynamic metadata or ask for
different Span Ids each time, they will not be limited.

In debugshell,

  td = %trace repo.revs('smartlog()')
  len(td.serialize())

dropped from 6MB to 0.87MB.

It's also possible to reason about:

  td = %trace len(repo.revs('ancestors(.)'))

in debugshell (taking 30s, 98KB serialized, vs 21s without tracing), while
previously the result would be too large to show (`%trace` just hangs).

Reviewed By: DurhamG

Differential Revision: D23307793

fbshipit-source-id: 3c1e9885ce7a275c2abd8935a4e4539a4f14ce83
2020-08-27 18:14:29 -07:00