Summary:
We're seeing users report lfs fetching hanging for 24+ hours. Stack
traces seem to show it hanging on the lfs fetch. Let's read bytes off the wire
in smaller chunks and add a timeout to each read (default timeout is 10s).
Reviewed By: xavierd
Differential Revision: D22853074
fbshipit-source-id: 3cd9152c472acb1f643ba8c65473268e67d59505
Summary:
We encountered an issue where gc kicked in after forking the Python
process. This cause it to trigger some Rust drop logic which hung because some
cross thread locks were not in a good state. Let's just disable gc during the
fork and only reenable it in the parent process.
Reviewed By: quark-zju
Differential Revision: D22855986
fbshipit-source-id: c3e99fb000bcd4cc141848e6362bb7773d0aad3d
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/40
Those tools are being used in some integration tests, make them public so that the tests might pass
Reviewed By: ikostia
Differential Revision: D22844813
fbshipit-source-id: 7b7f379c31a5b630c6ed48215e2791319e1c48d9
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/41
As of D22098359 (7f1588131b) the default locale used by integration tests is en_US.UTF-8, but as the comment in code mentiones:
```
The en_US.UTF-8 locale doesn't behave the same on all systems and trying to run
commands like "sed" or "tr" on non-utf8 data will result in "Illegal byte
sequence" error.
That is why we are forcing the "C" locale.
```
Additionally I've changed the test-walker-throttle.t test to use "/bin/date" directly. Previously it was using "/usr/bin/date", but the "/bin/date" is a more standard path as it works on MacOS.
Reviewed By: krallin
Differential Revision: D22865007
fbshipit-source-id: afd1346e1753df84bcfc4cf88651813c06933f79
Summary: It fails now, unknown reason, will work on it later
Reviewed By: mitrandir77, ikostia
Differential Revision: D22865324
fbshipit-source-id: c0513bfa2ce9f6baffebff472053e8a5d889c9ba
Summary:
Follow up from D22819791.
We want to use bookmark update delay only in scs, so let's configure it this
way
Reviewed By: krallin
Differential Revision: D22847143
fbshipit-source-id: b863d7fa4bf861ffe5d53a6a2d5ec44e7f60eb1a
Summary:
This is the (almost) final diff to introduce WarmBookmarksCache in repo_client.
A lot of this code is to pass through the config value, but a few things I'd
like to point out:
1) Warm bookmark cache is enabled from config, but it can be killswitched using
a tunable.
2) WarmBookmarksCache in scs derives all derived data, but for repo_client I
decided to derive just hg changeset. The main motivation is to not change the
current behaviour, and to make mononoke server more resilient to failures in
other derived data types.
3) Note that WarmBookmarksCache doesn't obsolete SessionBookmarksCache that was
introduced earlier, but rather it complements it. If WarmBookmarksCache is
enabled, then SessionBookmarksCache reads the bookmarks from it and not from
db.
4) There's one exception in point #3 - if we just did a push then we read
bookmarks from db rather than from bookmarks cache (see
update_publishing_bookmarks_after_push() method). This is done intentionally -
after push is finished we want to return the latest updated bookmarks to the
client (because the client has just moved a bookmark after all!).
I'd argue that the current code is a bit sketchy already - it doesn't read from
master but from replica, which means we could still see outdated bookmarks.
Reviewed By: krallin
Differential Revision: D22820879
fbshipit-source-id: 64a0aa0311edf17ad4cb548993d1d841aa320958
Summary:
Add a cmdlib argument to control cachelib zstd compression. The default behaviour is unchanged, in that the CachelibBlobstore will attempted compression when putting to the cache if the object is larger than the cachelib max size.
To make the cache behaviour more testable, this change also adds an option to do an eager put to cache without the spawn. The default remains to do a lazy fire and forget put into the cache with tokio::spawn.
The motivation for the change is that when running the walker the compression putting to cachelib can dominate CPU usage for part of the walk, so it's best to turn it off and let those items be uncached as the walker is unlikely to visit them again (it only revisits items that were not fully derived).
Reviewed By: StanislavGlebik
Differential Revision: D22797872
fbshipit-source-id: d05f63811e78597bf3874d7fd0e139b9268cf35d
Summary: populate_healer would panic on launch because there were 2 aguments assigned to -d: debug and destination-blobstore-id
Reviewed By: StanislavGlebik
Differential Revision: D22843091
fbshipit-source-id: e300af85b4e9d4f757b4311f2b7d776f59c7527d
Summary:
Although new changelog revlogs do not use deltas since years ago, early
revisions in our production changelog still use mpatch delta format
because they are stream-cloned.
Teach revlogindex to support them.
Reviewed By: sfilipco
Differential Revision: D22657204
fbshipit-source-id: 7aa3b76a9a6b184294432962d36e6a862c4fe371
Summary:
Now the rust-commits features are moved to changelog2, and changelog is no
longer used for rust-commits features. Let's just remove all rust-commits
features from changelog, and collapse related configs into just rust-commits.
Reviewed By: DurhamG
Differential Revision: D22657194
fbshipit-source-id: d74ae40a24fb365981679feab7c2403f84df2b3e
Summary:
Restore the behavior to before D22368827 (da42f2c17e). This also significantly speeds up
graph log like `smartlog` because the fast native path of `reachableroots`
can be used.
Reviewed By: DurhamG
Differential Revision: D22657197
fbshipit-source-id: e3236938d8acfd0935ec45e761763bf0477f2152
Summary: So reachableroots can be called from Python.
Reviewed By: sfilipco
Differential Revision: D22657186
fbshipit-source-id: 36b1b5ed1e32c88bb07e6c7c7e0a7ca89e0751a3
Summary:
The default reachable_roots implementation is good enough for segmented
changelog, but not efficient for revlogindex use-case.
Reviewed By: sfilipco
Differential Revision: D22657193
fbshipit-source-id: a81bc255d42d46c50e61fe954f027f1160dacb6c
Summary:
I thought it was just `roots & (::heads)`. It is actually more complex than
that.
Reviewed By: sfilipco
Differential Revision: D22657201
fbshipit-source-id: bd0b49fc4cdd2c516384cf70c1c5f79af4da1342
Summary:
The `changelog2.changelog` type does not inherit from `revlog`.
It is basically taking implementation from `changelog` with `userust` branches
returning true.
Reviewed By: DurhamG
Differential Revision: D22657195
fbshipit-source-id: dc718d180c7ef3d64f822c3a8c968ef6027047d5
Summary: This will help us verify that the C index is no longer necessary.
Reviewed By: DurhamG
Differential Revision: D22657196
fbshipit-source-id: 16ed74acc5400661572880adf3d8d3267c8b53e2
Summary:
This makes the Rust code path take care of commit writing.
The feature cannot be enabled yet because the `nodemap` backed by the C index
is no longer aware of new in-memory commits. The next diff migrates nodemap to
be backed by Rust and can turn on this feature altogether.
Reviewed By: DurhamG
Differential Revision: D22657191
fbshipit-source-id: 5f1a60f0b391b06fcd61d10676e2e095f8b7c9d6
Summary:
The debug print abuses the `linkmapper`. The Rust commit add logic does not
use `linkmapper`. So let's remove the debug message to be consistent with
the Rust logic.
Reviewed By: DurhamG
Differential Revision: D22657189
fbshipit-source-id: 2e92087dbb5bfce2f00711dcd62881aba64b0279
Summary:
Those tests are going to break with the latest changelog. We're moving away
from revlog so let's just remove the tests.
Reviewed By: DurhamG
Differential Revision: D22657198
fbshipit-source-id: 6d1540050d70c58636577fa3325daca511273a2b
Summary:
`tr.writepending()` removes callbacks saying "temp files are already written".
However, `tr.writepending()` might be called multiple times and the content
being written can be changed.
For example, `test-hook.t` has a test case that uses both `prechangegroup` and
`pretxnchangegroup` external process hooks. The `prechangegroup` hook runs
before the changelog gets changed, and the `pretxnchangegroup` runs after the
changelog gets changed.
Without this diff, the latter will not see the changelog change after migrating
to Rust (which buffers pending commits in memory).
The revlog changelog "addpending" is kept the original behavior - only call once
for avoiding potential performance regression.
Reviewed By: DurhamG
Differential Revision: D22657199
fbshipit-source-id: 8f96a0beaeebd45e73de3973e3ee8dd1426295fb
Summary:
In the future it's harder to provide changed "revs". Let's use commit hash
instead.
Reviewed By: DurhamG
Differential Revision: D22657203
fbshipit-source-id: b46055fe31d174a6eae47570ebec4a73c7d603f6
Summary:
Without this a few tests will fail with upcoming changes.
For example, test-clone-uncompressed.t will say "requesting all changes"
instead of "no changes found" for the "hg clone --stream" command.
Reviewed By: DurhamG
Differential Revision: D22657190
fbshipit-source-id: 349caf58e5bfdb5310b6b5585e4727e208197573
Summary:
Commands like `debugindex` relies on this function to return a revlog object
with low-level APIs. Do not return changelog as-is.
Reviewed By: DurhamG
Differential Revision: D22657202
fbshipit-source-id: b6ae84a157d3411cef6f67ee842f44134fe9b35e
Summary:
This replaces RustError that might happen during `addcommits`, and allow us to
handle it without having a stacktrace.
Reviewed By: DurhamG
Differential Revision: D22539564
fbshipit-source-id: 356814b9baf0b31528dfc92d62b0dcf352bc1e24
Summary:
The zstore-commit-data code paths are in Python. We want to move them to behind
the Rust HgCommits abstractions. So stop making Python interact with the
low-level details.
Reviewed By: DurhamG
Differential Revision: D22638457
fbshipit-source-id: 435db8425a29ce4eae24a6202ad928f85a5f5ee2
Summary: It's the same as `__add__`. It's consistent with the revset language.
Reviewed By: sfilipco
Differential Revision: D22638456
fbshipit-source-id: 928177d553220461192650f4792ac39cadd57dc2
Summary:
Follow up of D22638454.
This makes revlogindex marks its compatible DAG so "all()" fast paths can be used properly.
Reviewed By: sfilipco
Differential Revision: D22638459
fbshipit-source-id: 074e95b9fccbc486b69a947fec5172662e7dd3b7
Summary:
No need to exhaust the entire IdLazySet if there are hints.
This is important to make `small & lazy` fast.
Reviewed By: sfilipco
Differential Revision: D22638462
fbshipit-source-id: 63a71986e6e254769c42eb6250c042ea6aa5808b
Summary:
When multiple DAGs (ex. a local DAG and a commit-cloud DAG) are involved,
certain fast paths become unsound. Namely, the fast paths of the FULL hint
should check DAG compatibility. For example:
localrepodag.all() & remotedag.all()
should not simply return `localrepodag.all()` or `remotedag.all()`.
Fix it by checking DAG pointers.
A StaticSet might be created without using a DAG, add an optimization
to change `all & static` to `static & all`. So StaticSet without DAG
wouldn't require full DAG scans when intersecting with other sets.
Reviewed By: sfilipco
Differential Revision: D22638454
fbshipit-source-id: 72396417e9c1238d5411829da8f16f2c6d4c2f3a
Summary:
Improve `fmt::Debug` so it fits better in the Rust and Python eco-system:
- Support Rust formatter flags. For example `{:#5.3?}`. `5` defines limit of a
large set to show, `3` defines hex commit hash length. `#` specifies the
alternate form.
- Show commit hashes together with integer Ids for IdStaticSet.
- Use HG rev range syntax (`a:b`) to represent ranges for IdStaticSet.
- Limit spans to show for IdStaticSet, similar to StaticSet.
- Show only 8 chars of a long hex commit hash by default.
- Minor renames like `dag` -> `spans`, `difference` -> `diff`.
Python bindings uses `fmt::Debug` as `__repr__` and will be affected.
Reviewed By: sfilipco
Differential Revision: D22638455
fbshipit-source-id: 957784fec9c99c8fc5600b040d964ce5918e1bb4
Summary:
Hard link adds complexity for revlog writes. It's not that useful in production
setup. The Rust revlog `flush` API does not break hardlinked files. So let's
just avoid using hard links during local repo clone.
Reviewed By: DurhamG
Differential Revision: D22638460
fbshipit-source-id: 038f4d5c48e9972b14c9e59a9d7ef72b6bc5308d
Summary:
This makes intersection set stop early. It's useful to stop iteration on some
lazy sets. For example, the below `ancestors(tip) & span` or
`descendants(1) & span` sets can take seconds to calculate without this
optimization.
```
In [1]: cl.dag.ancestors([cl.tip()]) & cl.tonodes(bindings.dag.spans.unsaferange(len(cl)-10,len(cl)))
Out[1]: <and <lazy-id> <dag [...]>>
In [3]: %time len(cl.dag.ancestors([cl.tip()]) & cl.tonodes(bindings.dag.spans.unsaferange(len(cl)-10,len(cl))))
CPU times: user 364 µs, sys: 0 ns, total: 364 µs
Wall time: 362 µs
In [7]: %time len(cl.dag.descendants([repo[1].node()]) & cl.tonodes(bindings.dag.spans.unsaferange(0,100)))
CPU times: user 0 ns, sys: 574 µs, total: 574 µs
Wall time: 583 µs
```
Reviewed By: sfilipco
Differential Revision: D22638458
fbshipit-source-id: b9064ce2ff1aecc2d7d00025928dfcb3c0d78e0c
Summary:
Similar to the segmented changelog version using `ANCESTORS`. This makes
`heads(all())` calculates `heads_ancestors(all())` automatically and gets
the speed-up.
Reviewed By: sfilipco
Differential Revision: D22638464
fbshipit-source-id: 014412f1c226925e50387f18c1282b3cb96d434b
Summary:
Optimize it to not covert revs to `Vec<u32>`, and have a fast path to
initialize `states` with `Unspecified`. This makes it about 2x faster and match
the C revlog `headrevs` performance when calculating `headsancestors(all())`:
```
In [2]: %timeit cl.index.clearcaches(); len(cl.index.headrevs())
10 loops, best of 3: 66.9 ms per loop
In [3]: %timeit len(cl.dageval(lambda: headsancestors(all())))
10 loops, best of 3: 64.9 ms per loop
```
Reviewed By: sfilipco
Differential Revision: D22638461
fbshipit-source-id: 965eb16e3a78ae02a65a8a44559f3a64c16f6884
Summary:
Change `parents` from using the default implementation that returns `StaticSet`
of commit hashes, to a customized implementation that returns `IdStaticSet`.
This avoids unnecessary commit hash lookups, and makes `heads(all())` 30x
faster, matching `headsancestors(all())` (but is still 2x slower than the C
revlog index `headsrevs` implementation).
Reviewed By: sfilipco
Differential Revision: D22638453
fbshipit-source-id: 4fef78080b990046b91fee110c48e36301d83b4f
Summary:
The hint indicates a set `X` is equivalent to `ancestors(X)`.
This allows us to make `heads` use `heads_ancestors` (which is faster in
segmented changelog) automatically without affecting correctness. It also
makes special queries like `ancestors(all())` super cheap because it'll just
return `all()` as-is.
Reviewed By: sfilipco
Differential Revision: D22638463
fbshipit-source-id: 44d9bbcbb0d7e2975a0c8322181c88daa1ba4e37
Summary:
Re-implement the `findcommonheads` logic using `changelog` APIs that are going
to have native support from Rust.
This decouples from revlog-based Python DAG logic, namely `dagutil.revlogdag`,
and `ancestor.incrementalmissingancestors`, unblocking Rust DAG progress, and
cleans up the algorithm to not use revision numbers.
The core algorithm is unchanged. The sampling logic is simplified and tweaked
a bit (ex. no 'initial' / 'quick initial' special cases). The debug and
progress messages are more verbose, and variable names are chosen to match
the docstrings.
I improved the doc a bit, and added some TODO notes about where I think can be
improved.
Reviewed By: sfilipco
Differential Revision: D22519582
fbshipit-source-id: ac8cc8bebad91b4045d69f402e69b7ca28146414
Summary:
It has been long replaced by setdiscovery. This removes another dependency on
`dagutil.revlogdag`.
Reviewed By: DurhamG
Differential Revision: D22519585
fbshipit-source-id: ee261173ba584ffcb3371ec640b233609aafcf77
Summary:
`changegroup` uses `dagutil.revlogdag` just to "linearize" commits to optimize
file revision deltas. This is less relevant in production setup because:
- The file delta calculation with remotefilelog is quite different.
- We don't have lots of branches that make the optimization useful.
- In the future segmented changelog makes commits more linearized.
The Python `dagutil.revlogdag` is ideally removed. This is a step towards that.
Reviewed By: DurhamG
Differential Revision: D22519589
fbshipit-source-id: ac44873893df8658da0617e06cae1805d72417aa
Summary:
The changegroup logic uses those APIs, which uses low-level revlog details like
the C index. Bypass them if the Rust DAG is used.
Reviewed By: DurhamG
Differential Revision: D22519583
fbshipit-source-id: 228c7ba0a8ea77c0cf85db39d1194274d6331416
Summary:
Those methods are less fancy. Use the Rust path to avoid depending on revlog
internals.
Reviewed By: DurhamG
Differential Revision: D22519588
fbshipit-source-id: 0fede55ee04373c069ae7a6dd727f4d7208ee321