Summary:
`addcommits` is designed to be more efficiently if called with a batch of
commits. So let's buffer the commits to add then only call it once.
This avoids some N^2 behaviors, for example, the NameDag internally will
prepare "snapshot" of itself which involves coping the pending Rust vecs
about the segments and id <-> hash map.
The change makes `pull` usable from unusably slow:
Original Python Revlog backend:
```
In [1]: %trace repo.pull(bookmarknames=['master'],quiet=False)
5191 +466 | Apply Changegroup edenscm.mercurial.bundle2 line 516
| - Commits = 125 :
| - Range = a1d1b3ade136:2e3fe78af189 :
5191 +466 | changegroup.cg1unpacker.apply edenscm.mercurial.changegroup line 313
5192 +416 | Progress Bar: commits (progressbar)
5192 +415 | changelog.changelog.addgroup edenscm.mercurial.changelog line 536
5192 +409 | revlog.revlog.addgroup edenscm.mercurial.revlog line 2116
5215 +371 | changelog.changelog._addrevision (125 times) edenscm.mercurial.changelog line 558
```
DoubleWrite (Segments + Revlog) backend, Before:
```
In [2]: %trace repo.pull(bookmarknames=['master'],quiet=False)
2396 +154059 | Apply Changegroup edenscm.mercurial.bundle2 line 516
| - Commits = 323 :
| - Range = cb0b100180ba:5fb57c74f72e :
2396 +154059 | changegroup.cg1unpacker.apply edenscm.mercurial.changegroup line 313
2397 +151433 \ Progress Bar: commits (progressbar)
2397 +151433 | changelog2.changelog.addgroup edenscm.mercurial.changelog2 line 334
```
DoubleWrite (Segments + Revlog) backend, After:
```
In [2]: %trace repo.pull(bookmarknames=['master'],quiet=False)
4629 +512 | Apply Changegroup edenscm.mercurial.bundle2 line 516
| - Commits = 45 :
| - Range = cf23c6972934:1ff0c5f0e7ad :
4629 +512 | changegroup.cg1unpacker.apply edenscm.mercurial.changegroup line 313
4630 +494 | changelog2.changelog.addgroup edenscm.mercurial.changelog2 line 334
```
Reviewed By: DurhamG
Differential Revision: D23390435
fbshipit-source-id: dd97a5008dedd844d4134b87bfef190fa739a80b
Summary:
The users of addrevisoncb are gone.
This also removes the "alwayscache" parameter of "_addrevision".
Reviewed By: DurhamG
Differential Revision: D23390437
fbshipit-source-id: 7edd9dd0b93d4cb9d4f35d088a1aef719b450ec1
Summary: It is about legacy revlog formats that are no longer relevant.
Reviewed By: DurhamG
Differential Revision: D23390436
fbshipit-source-id: 58c2c432804181bcc6517d6c988777b843fc9ba4
Summary:
We have a few safeguards against creating full checkouts. However we have
sparse profiles that are not full, but that include very large directories
which normally should not be included.
This diff adds a logic that checks if a new sparse profile has any of the "marker"
files i.e. some files from a folder that should not be included. Operation
aborts if that the case, however there's always a way to workaround that.
Reviewed By: DurhamG
Differential Revision: D23414200
fbshipit-source-id: 626f392319eb1be8b35f39cadafb61f3c1dfefe3
Summary:
Scuba logging that tracks undesired file fetches has some blind spots i.e. a
lot of fetches have null pid and null cmd line. This diff tries to fix another
part of the problem.
TreeInode::getOrLoadChild() has TODO `pass a fetch context down through
getOrLoadChild to track this load`. This diff fixes this TODO, and also starts
to pass context from EdenDispatcher:lookup method.
Note that it adds quite a lot of new `ObjectFetchContext::getNullContext()`
calls, and potentially those might be responsible for blind spots in logging.
I'll try to address this problem in the next diffs.
Reviewed By: kmancini
Differential Revision: D23418218
fbshipit-source-id: 319d7436494d8dce3580289aae9963aa13bfc191
Summary:
Scuba logging that tracks undesired file fetches has some blind spots i.e. a
lot of fetches have null pid and null cmd line. This diff fixes at least part
of the problem.
TreePrefetchContext which is used from TreePrefetchLease didn't logged client
pid at all (in fact, it logged almost nothing). This diff fixes at least one blind spot, however it doesn't look like this is the only one.
Reviewed By: kmancini
Differential Revision: D23417451
fbshipit-source-id: 107884e94c6b40de999328ec2ef78fe22174c1ca
Summary: `legacy.py` depends on other files in this directory, so lets add them all to the link tree
Reviewed By: fanzeyi
Differential Revision: D23356917
fbshipit-source-id: e4bfd82ebbd9d143a5454a43bb47e8dd55b4485f
Summary: These tests fail locally since adding these checks in eden doctor, so we need to mock them.
Reviewed By: chadaustin
Differential Revision: D23326597
fbshipit-source-id: 87a0e6ab0472e3ae56f89503e928c5a00a16ab04
Summary:
"hg diff" has --sparse option which diffs only files inside a sparse checkout.
The problem is that it doesn't work on eden checkouts because eden repo doesn't
have sparsematch() function.
This diff makes it so that if sparsematch() function doesn't exist then
--sparse option is just ignored.
The motivation for this change is
https://fb.workplace.com/groups/corehg/?post_id=687768245151742. There are some
diff calls that are triggered by arc lint that race with "hg update" and might download
loads of data on people's laptops. This diff doesn't fix the race, but it:
1) Makes sure we don't download too much data that are not in sparse profiles.
2) arc lint doesn't care about files outside of sparse profiles anyway, so
running --sparse make sense.
Reviewed By: DurhamG
Differential Revision: D23396918
fbshipit-source-id: 2a386fdbeab85187e2c2acab69cb86b74124d46f
Summary:
Most of the fixes are pretty trivial as the code was using functions not
present on Windows, either work around them, or switch to ones that are
multi-platform.
Of note, it looks like `hg doctor` doesn't properly detect when Mercurial and
EdenFS are out of sync, disabling the tests until we figure out why.
Reviewed By: genevievehelsel, fanzeyi
Differential Revision: D23409708
fbshipit-source-id: 3314c197d43364dda13891a6874caab4c29e76ca
Summary:
This is practically just 0 in our production setup during `pull`s. In the
future when the commit data become lazy, it's no longer possible to read the
files locally. So let's just don't scan the commits.
Reviewed By: DurhamG
Differential Revision: D23390438
fbshipit-source-id: 4c54c4aac5fd840205296ab86955ec1b8ab76607
Summary:
Mergedrivers can call dirstate.add directly and are adding paths with
"." and "..". Let's block those paths.
Reviewed By: quark-zju
Differential Revision: D23375469
fbshipit-source-id: 64e9f20169cfd50325ecd8ebcc1dd3be7a5cb202
Summary:
extdiff uses shutil.rmtree which calls os.rmdir with new python 3
options. Since we pathc os.rmdir, we need to support those options.
Reviewed By: quark-zju
Differential Revision: D23350968
fbshipit-source-id: 081d179dcd67b51ffdeb6b85899adf4e574a8d0f
Summary: Similar to D18528858 so module names do not need to be spelled twice.
Reviewed By: markbt
Differential Revision: D23091380
fbshipit-source-id: a2a261abc9c78c8805cea62b38498ba65398796d
Summary: This crate would fail to build without the "fb" feature because `serde_json` was listed as an optional dependency (but is used in a way that isn't conditional on the `fb` feature). This diff makes the dependency non-optional, and also silences several dead code warnings that are emitted when building without the "fb" feature.
Reviewed By: quark-zju
Differential Revision: D23386786
fbshipit-source-id: b00a8b0b8b0b978c1cfab2838629fcb388a076e9
Summary:
The `debugfsync` command calls fsync on newly modified files in svfs.
Right now it only includes locations that we know have constant number
of files.
The fsync logic is put in a separate crate to avoid slow compiles.
Reviewed By: DurhamG
Differential Revision: D23124169
fbshipit-source-id: 438296002eed14db599d6ec225183bf824096940
Summary:
A warning means that every tree fetched will be printed in the edenfs log,
which is way too much. Let's decrease this to a debug message.
Reviewed By: genevievehelsel
Differential Revision: D23385778
fbshipit-source-id: d77f1cac3efb945d4b95750822f2f12f48c75ffe
Summary: `len(repo)` can no longer predicate the next rev number. Use nodes instead.
Reviewed By: DurhamG
Differential Revision: D23307791
fbshipit-source-id: cc20e53f039eee2a714748352e8e98aab253095a
Summary:
Some functions might be called very frequently. For example,
`phases.phasecache.loadphaserevs` might be called 100k+ times.
That makes the tracing data harder to process.
Limit the count of spans to 1k by default so the data is cheaper to process,
and some highly repetitive cases can now be reasoned about. Note the limit
is only put on static Span Ids. If a span uses dynamic metadata or ask for
different Span Ids each time, they will not be limited.
In debugshell,
td = %trace repo.revs('smartlog()')
len(td.serialize())
dropped from 6MB to 0.87MB.
It's also possible to reason about:
td = %trace len(repo.revs('ancestors(.)'))
in debugshell (taking 30s, 98KB serialized, vs 21s without tracing), while
previously the result would be too large to show (`%trace` just hangs).
Reviewed By: DurhamG
Differential Revision: D23307793
fbshipit-source-id: 3c1e9885ce7a275c2abd8935a4e4539a4f14ce83
Summary: Set a default limit so the output won't be too long.
Reviewed By: DurhamG
Differential Revision: D23307792
fbshipit-source-id: 7e2ed99e96bbde06436a034e78f899fc2e3e03f8
Summary:
The debugshell command can be long running and contains uninteresting stuff.
Do not profile it.
Practically this hides showing the background statprof thread when using `%trace`.
Reviewed By: DurhamG
Differential Revision: D23278597
fbshipit-source-id: bad97de22e1be2be8b866bee705ea3a6755aa54b
Summary:
This allows entering ipdb for code like: `ipdb` or `ipdb()`. It can be handy to
debug something.
Reviewed By: DurhamG
Differential Revision: D23278599
fbshipit-source-id: 4355dd1944617aeb795450935789f01f66f094eb
Summary: This makes it possible to get tracing results, or run hg commands directly.
Reviewed By: DurhamG
Differential Revision: D23278601
fbshipit-source-id: e7dc92080d2881cb4155a481df5ca93f324828fc
Summary:
The `--trace` flag enables tracing Python modules.
For compatibility reasons, it also enables `--traceback`.
It can be used with debugshell to make `%trace` more useful.
Reviewed By: sfilipco
Differential Revision: D23278600
fbshipit-source-id: d6d0b34bd5c48111f8cd33d7df115f349b0e95b6
Summary:
I found this when I aborted an rebase Dxxx and trying rebasing again and it
complained about "nothing to rebase". It was caused by Dxxx resolving into
a hidden commit.
Reviewed By: sfilipco
Differential Revision: D23307794
fbshipit-source-id: f7a956b5300240089b6a4648f28cf4a152ee2433
Summary:
`PerfCounters` was the only application-specific type exposed as a parameter to the post-request callbacks, and it was only being used in one place. To facilitate making the post-request callback functionality more general, this diff makes the callback in question capture the `CoreContext` in its environment, thereby giving it access to the `PerfCounters` without requiring it to be passed as an argument.
This should not change the behavior since regardless of how the callback obtains a reference, it will still refer to the same underlying `PerfCounters` from the request's `CoreContext`.
Reviewed By: DurhamG
Differential Revision: D23298417
fbshipit-source-id: 898f14e5b35b827e98eaf1731db436261baa43bb
Summary:
We shouldn't delete from a dictionary while iterating over it, instead we should iterate over a copy and then delete from the original.
`.items()` returns a view of the dict, while wrapping it in `list` makes a deep copy.
Reviewed By: DurhamG
Differential Revision: D23283668
fbshipit-source-id: a168eef1ed2a1ce02fe71b3f6e3aed090965d2a4
Summary:
Mononoke throws an error if we request the nullid. In the long term we
want to get rid of the concept of the nullid entirely, so let's just add some
Python level blocks to prevent us from attempting to fetch it. This way we can
start to limit how much Rust has to know about these concepts.
Reviewed By: sfilipco
Differential Revision: D23332359
fbshipit-source-id: 8a67703ba1197ead00d4984411f7ae0325612605
Summary: I refactored this method to be a member fuction of `EdenFSProcess` and I thought this instance of the method was deleted, but I came across it while working in this area again.
Reviewed By: fanzeyi
Differential Revision: D23113075
fbshipit-source-id: 2c257cca2da3a4bfefb974753eb00c7580c5a104
Summary:
Enabling hg dynamicconfigs in D23309090 (d643f48c8c) changed the output of `hg
manifest --debug` and broke HgImportTest. Set TESTTMP to avoid
production configs.
Reviewed By: DurhamG
Differential Revision: D23335847
fbshipit-source-id: 7ffd0394aa7a8466b266000b18f8742ed4a6b53f
Summary:
Corp has a different concept of tier than prod. Let's load the corp
tier into our tier set as well.
Reviewed By: quark-zju
Differential Revision: D23354056
fbshipit-source-id: c9543b8253f042c7b1224578e0687b4bdf21738e