Summary:
Without this, a few tests will fail with upcoming changes.
For example, test-clone-uncompressed.t will say "requesting all changes"
instead of "no changes found" for the "hg clone --stream" command.
Reviewed By: DurhamG
Differential Revision: D22657190
fbshipit-source-id: 349caf58e5bfdb5310b6b5585e4727e208197573
Summary:
Commands like `debugindex` rely on this function to return a revlog object
with low-level APIs. Do not return changelog as-is.
Reviewed By: DurhamG
Differential Revision: D22657202
fbshipit-source-id: b6ae84a157d3411cef6f67ee842f44134fe9b35e
Summary:
This replaces the RustError that might happen during `addcommits`, and allows us to
handle it without a stacktrace.
Reviewed By: DurhamG
Differential Revision: D22539564
fbshipit-source-id: 356814b9baf0b31528dfc92d62b0dcf352bc1e24
Summary:
The zstore-commit-data code paths are in Python. We want to move them to behind
the Rust HgCommits abstractions. So stop making Python interact with the
low-level details.
Reviewed By: DurhamG
Differential Revision: D22638457
fbshipit-source-id: 435db8425a29ce4eae24a6202ad928f85a5f5ee2
Summary: It's the same as `__add__`. It's consistent with the revset language.
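As a sketch, the aliasing pattern looks like this (hypothetical class, not the actual smartset code):

```python
class SmartSet:
    """Hypothetical stand-in for a smartset-like container."""

    def __init__(self, items):
        self._items = list(items)

    def __add__(self, other):
        # Union, keeping this set's items first (simplified).
        seen = set(self._items)
        return SmartSet(self._items + [x for x in other._items if x not in seen])

    # `x | y` behaves exactly like `x + y`, matching the revset `|` operator.
    __or__ = __add__
```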
Reviewed By: sfilipco
Differential Revision: D22638456
fbshipit-source-id: 928177d553220461192650f4792ac39cadd57dc2
Summary:
Follow up of D22638454.
This makes revlogindex mark its compatible DAG so the "all()" fast paths can be used properly.
Reviewed By: sfilipco
Differential Revision: D22638459
fbshipit-source-id: 074e95b9fccbc486b69a947fec5172662e7dd3b7
Summary:
No need to exhaust the entire IdLazySet if there are hints.
This is important to make `small & lazy` fast.
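The idea can be sketched like this (illustrative, not the actual IdLazySet code): intersect a small concrete set with a lazy ascending iterator, and use a bound hint to stop instead of draining the lazy side.

```python
def intersect_small_lazy(small, lazy_iter, max_hint=None):
    """Compute `small & lazy` without exhausting the lazy iterator.

    `lazy_iter` yields ids in ascending order. Once the lazy side passes
    the largest id `small` can contain, nothing later can match, so we
    stop early instead of draining the whole lazy set.
    """
    small = set(small)
    bound = max_hint if max_hint is not None else (max(small) if small else None)
    result = []
    for item in lazy_iter:
        if bound is not None and item > bound:
            break  # the hint proves no later item can match; stop early
        if item in small:
            result.append(item)
    return result
```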
Reviewed By: sfilipco
Differential Revision: D22638462
fbshipit-source-id: 63a71986e6e254769c42eb6250c042ea6aa5808b
Summary:
When multiple DAGs (ex. a local DAG and a commit-cloud DAG) are involved,
certain fast paths become unsound. Namely, the fast paths of the FULL hint
should check DAG compatibility. For example:
localrepodag.all() & remotedag.all()
should not simply return `localrepodag.all()` or `remotedag.all()`.
Fix it by checking DAG pointers.
A StaticSet might be created without using a DAG. Add an optimization that
rewrites `all & static` as `static & all`, so a StaticSet without a DAG
doesn't require full DAG scans when intersecting with other sets.
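Both fixes can be sketched with toy classes (illustrative only, not the actual set types):

```python
class Dag:
    """Minimal stand-in for a commit DAG."""
    def __init__(self, ids):
        self.ids = set(ids)

class AllSet:
    """A set with the FULL hint: it contains every id of its DAG."""
    def __init__(self, dag):
        self.dag = dag

class StaticSet:
    """A concrete set of ids, possibly created without any DAG."""
    def __init__(self, ids, dag=None):
        self.ids = set(ids)
        self.dag = dag

def intersect(x, y):
    if isinstance(x, AllSet) and isinstance(y, AllSet):
        # The FULL fast path is only sound when both sides come from the
        # *same* DAG: compare DAG identity, not just the hint.
        if x.dag is y.dag:
            return StaticSet(x.dag.ids, x.dag)
        return StaticSet(x.dag.ids & y.dag.ids)
    if isinstance(x, AllSet) and isinstance(y, StaticSet):
        # Rewrite `all & static` as `static & all` so a DAG-less StaticSet
        # does not force a full DAG scan.
        return intersect(y, x)
    if isinstance(x, StaticSet) and isinstance(y, AllSet):
        return StaticSet({i for i in x.ids if i in y.dag.ids}, y.dag)
    return StaticSet(x.ids & y.ids)
```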
Reviewed By: sfilipco
Differential Revision: D22638454
fbshipit-source-id: 72396417e9c1238d5411829da8f16f2c6d4c2f3a
Summary:
Improve `fmt::Debug` so it fits better in the Rust and Python eco-system:
- Support Rust formatter flags, for example `{:#5.3?}`: `5` limits how many
entries of a large set to show, `3` sets the hex commit hash length, and `#`
selects the alternate form.
- Show commit hashes together with integer Ids for IdStaticSet.
- Use HG rev range syntax (`a:b`) to represent ranges for IdStaticSet.
- Limit spans to show for IdStaticSet, similar to StaticSet.
- Show only 8 chars of a long hex commit hash by default.
- Minor renames like `dag` -> `spans`, `difference` -> `diff`.
The Python bindings use `fmt::Debug` as `__repr__` and will be affected.
Reviewed By: sfilipco
Differential Revision: D22638455
fbshipit-source-id: 957784fec9c99c8fc5600b040d964ce5918e1bb4
Summary:
Hard links add complexity to revlog writes and aren't that useful in a production
setup. The Rust revlog `flush` API does not break hardlinked files. So let's
just avoid using hard links during local repo clone.
Reviewed By: DurhamG
Differential Revision: D22638460
fbshipit-source-id: 038f4d5c48e9972b14c9e59a9d7ef72b6bc5308d
Summary:
This makes intersection set stop early. It's useful to stop iteration on some
lazy sets. For example, the below `ancestors(tip) & span` or
`descendants(1) & span` sets can take seconds to calculate without this
optimization.
```
In [1]: cl.dag.ancestors([cl.tip()]) & cl.tonodes(bindings.dag.spans.unsaferange(len(cl)-10,len(cl)))
Out[1]: <and <lazy-id> <dag [...]>>
In [3]: %time len(cl.dag.ancestors([cl.tip()]) & cl.tonodes(bindings.dag.spans.unsaferange(len(cl)-10,len(cl))))
CPU times: user 364 µs, sys: 0 ns, total: 364 µs
Wall time: 362 µs
In [7]: %time len(cl.dag.descendants([repo[1].node()]) & cl.tonodes(bindings.dag.spans.unsaferange(0,100)))
CPU times: user 0 ns, sys: 574 µs, total: 574 µs
Wall time: 583 µs
```
Reviewed By: sfilipco
Differential Revision: D22638458
fbshipit-source-id: b9064ce2ff1aecc2d7d00025928dfcb3c0d78e0c
Summary:
Similar to the segmented changelog version using `ANCESTORS`. This makes
`heads(all())` calculate `heads_ancestors(all())` automatically and get
the speed-up.
Reviewed By: sfilipco
Differential Revision: D22638464
fbshipit-source-id: 014412f1c226925e50387f18c1282b3cb96d434b
Summary:
Optimize it to not convert revs to `Vec<u32>`, and add a fast path that
initializes `states` with `Unspecified`. This makes it about 2x faster and matches
the C revlog `headrevs` performance when calculating `headsancestors(all())`:
```
In [2]: %timeit cl.index.clearcaches(); len(cl.index.headrevs())
10 loops, best of 3: 66.9 ms per loop
In [3]: %timeit len(cl.dageval(lambda: headsancestors(all())))
10 loops, best of 3: 64.9 ms per loop
```
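The head-finding idea can be sketched in Python (illustrative; the real implementations operate on the C index and the Rust `states` vector): a single pass marks every rev as a head, then unmarks anything that appears as a parent.

```python
def head_revs(parents):
    """Compute the head revisions of a revlog-like DAG.

    `parents` maps each rev number to a list of its parent revs.
    A rev is a head iff no other rev lists it as a parent.
    """
    is_head = [True] * len(parents)  # start with everything marked as a head
    for rev in range(len(parents)):
        for p in parents[rev]:
            is_head[p] = False  # any rev with a child is not a head
    return [rev for rev, h in enumerate(is_head) if h]
```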
Reviewed By: sfilipco
Differential Revision: D22638461
fbshipit-source-id: 965eb16e3a78ae02a65a8a44559f3a64c16f6884
Summary:
Change `parents` from using the default implementation that returns `StaticSet`
of commit hashes, to a customized implementation that returns `IdStaticSet`.
This avoids unnecessary commit hash lookups, and makes `heads(all())` 30x
faster, matching `headsancestors(all())` (though still 2x slower than the C
revlog index `headrevs` implementation).
Reviewed By: sfilipco
Differential Revision: D22638453
fbshipit-source-id: 4fef78080b990046b91fee110c48e36301d83b4f
Summary:
The hint indicates a set `X` is equivalent to `ancestors(X)`.
This allows us to make `heads` use `heads_ancestors` (which is faster in
segmented changelog) automatically without affecting correctness. It also
makes special queries like `ancestors(all())` super cheap because it'll just
return `all()` as-is.
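A toy sketch of how such a hint can short-circuit queries (illustrative names, not the actual hint machinery):

```python
def ancestors_of(parents, revs):
    """All ancestors of `revs`, inclusive (toy DAG walk)."""
    stack, seen = list(revs), set()
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents[r])
    return seen

def ancestors_query(parents, s, hints=()):
    # If X is already ancestor-closed (X == ancestors(X)), the query is
    # super cheap: return X as-is without walking the DAG.
    if "ANCESTORS" in hints:
        return set(s)
    return ancestors_of(parents, s)
```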
Reviewed By: sfilipco
Differential Revision: D22638463
fbshipit-source-id: 44d9bbcbb0d7e2975a0c8322181c88daa1ba4e37
Summary:
Re-implement the `findcommonheads` logic using `changelog` APIs that are going
to have native support from Rust.
This decouples from revlog-based Python DAG logic, namely `dagutil.revlogdag`,
and `ancestor.incrementalmissingancestors`, unblocking Rust DAG progress, and
cleans up the algorithm to not use revision numbers.
The core algorithm is unchanged. The sampling logic is simplified and tweaked
a bit (ex. no 'initial' / 'quick initial' special cases). The debug and
progress messages are more verbose, and variable names are chosen to match
the docstrings.
I improved the doc a bit, and added some TODO notes about where I think it can be
improved.
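For illustration, a heavily simplified version of the discovery loop (this omits the ancestor-closure inference and adaptive sampling of the real implementation; `server_known` stands in for the `known` wire-protocol call):

```python
import random

def find_common_heads(local_nodes, server_known, sample_size=3, seed=0):
    """Very simplified discovery sketch: repeatedly sample undecided local
    commits, ask the server which ones it knows, and shrink the undecided
    set until every commit is classified as common or missing.
    """
    rng = random.Random(seed)
    undecided = set(local_nodes)
    common, missing = set(), set()
    while undecided:
        sample = rng.sample(sorted(undecided), min(sample_size, len(undecided)))
        for node, known in zip(sample, server_known(sample)):
            (common if known else missing).add(node)
        undecided -= set(sample)
    return common, missing
```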
Reviewed By: sfilipco
Differential Revision: D22519582
fbshipit-source-id: ac8cc8bebad91b4045d69f402e69b7ca28146414
Summary:
It has been long replaced by setdiscovery. This removes another dependency on
`dagutil.revlogdag`.
Reviewed By: DurhamG
Differential Revision: D22519585
fbshipit-source-id: ee261173ba584ffcb3371ec640b233609aafcf77
Summary:
`changegroup` uses `dagutil.revlogdag` just to "linearize" commits to optimize
file revision deltas. This is less relevant in a production setup because:
- The file delta calculation with remotefilelog is quite different.
- We don't have lots of branches that make the optimization useful.
- In the future segmented changelog makes commits more linearized.
Ideally, the Python `dagutil.revlogdag` will be removed entirely; this is a step towards that.
Reviewed By: DurhamG
Differential Revision: D22519589
fbshipit-source-id: ac44873893df8658da0617e06cae1805d72417aa
Summary:
The changegroup logic uses those APIs, which use low-level revlog details like
the C index. Bypass them if the Rust DAG is used.
Reviewed By: DurhamG
Differential Revision: D22519583
fbshipit-source-id: 228c7ba0a8ea77c0cf85db39d1194274d6331416
Summary:
Those methods are less fancy. Use the Rust path to avoid depending on revlog
internals.
Reviewed By: DurhamG
Differential Revision: D22519588
fbshipit-source-id: 0fede55ee04373c069ae7a6dd727f4d7208ee321
Summary: This avoids depending on the C index if the Rust DAG is available.
Reviewed By: DurhamG
Differential Revision: D22519587
fbshipit-source-id: a89d91184feaeef6641d2b04353601297bf5d4d5
Summary:
The check does not practically work because the client sends `common=[null]`
if the common set is empty.
D22519582 changes the client-side logic to send `common=[]` instead of
`common=[null]` in such cases. Therefore remove the constraint to keep
tests passing. 13 tests depend on this change.
Reviewed By: StanislavGlebik
Differential Revision: D22612285
fbshipit-source-id: 48fbc94c6ab8112f0d7bae1e276f40c2edd47364
Summary:
Replace the Python spanset with the Rust-backed idset.
The idset can represent multiple ranges and works better with Rust code.
The `idset` fast paths do not preserve order for the `or` operation, as
demonstrated in the test changes.
Reviewed By: DurhamG, kulshrax
Differential Revision: D22519584
fbshipit-source-id: 5d976a937e372a87e7f087d862e4b56d673f81d6
Summary:
Now that packfiles are marked with FILE_DELETE_ON_CLOSE, they can no longer be
opened on Windows, and thus trying to stat them fails with a permission denied
error, which in turn fails repack.
This should only happen when using EdenFS on Windows, which only a handful
(though a growing number) of people are using.
Reviewed By: quark-zju
Differential Revision: D22801408
fbshipit-source-id: f4229e90ce076a65994fb9d193d00c309377323a
Summary:
The tilde got dropped as part of the changes in D22672240 (be3683b1d4)
(an easy mistake to make!) and that renders this function less
useful.
Thankfully the caps display isn't a critical function; it's only used for
some diagnostic printing.
Reviewed By: chadaustin
Differential Revision: D22847590
fbshipit-source-id: 716d7c7bd674260687fbc09e3dc94538359f98b3
Summary: Move client hostname reverse DNS lookup from inside of the LFS server's `RequestContext` to an async method on `ClientIdentity`, allowing it to be used elsewhere. The behavior of `RequestContext::dispatch_post_request` should remain unchanged.
Reviewed By: krallin
Differential Revision: D22835610
fbshipit-source-id: 15c1183f64324f216bd639630396c9c6f19bcaaa
Summary: When a TLS connection fails due to a missing client certificate, the `curl` command may fail with either code 35 or 56 depending on the TLS version used. With TLS v1.3, the error is explicitly reported as a missing client certificate, whereas in TLS v1.2, it is reported as a generic handshake failure. This is because TLS v1.3 defines an explicit [`certificate_required`](https://tools.ietf.org/html/rfc8446#section-4.4.2.4) alert, which is [not present](https://github.com/openssl/openssl/issues/6804) in earlier TLS versions.
Reviewed By: krallin
Differential Revision: D22834527
fbshipit-source-id: a15d6a169d35ece6ed5a54b37b8ca9bbc506b3da
Summary:
`log()` sends fsck progress bars to standard output, but it also prints the same
message to the log at level DBG2. For example:
```
V0713 07:05:45.971511 3510654 StartupLogger.cpp:96] [====================>] 100%: fsck on /home/ailinzhang/eden-state/clients/dev-fbsource6/local
```
Since we don't want the log file cluttered with fsck progress bars, we use
`logVerbose()` with level DBG7.
Reviewed By: kmancini
Differential Revision: D22727965
fbshipit-source-id: 0700503af511030df2abbca4ad2fa1540995e919
Summary:
We have some users issuing 10k+ diff queries to phabricator, which is
causing problems with their db. Since we usually only care about the latest
draft commits, let's bound the size of the requests we send.
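For illustration, the bounding amounts to something like this sketch (function name and limit are made up, not the actual code):

```python
def bounded_diff_query(diff_ids, query, limit=1000):
    """Send only the most recent `limit` diff ids instead of all of them.

    `query` stands in for the call to phabricator. Bounding each request
    keeps the load on their db predictable.
    """
    recent = sorted(diff_ids)[-limit:]  # newest ids sort last
    return query(recent)
```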
Reviewed By: quark-zju
Differential Revision: D22834195
fbshipit-source-id: d41b449a89d6dfb2d6d33e0be6ed0ff31893ab5e
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large
commits.
We'd like to use WarmBookmarkCache in repo client, and to do that we need to be
able to tell Publishing and PullDefault bookmarks apart. Let's teach
WarmBookmarksCache about it.
Reviewed By: krallin
Differential Revision: D22812478
fbshipit-source-id: 2642be5c06155f0d896eeb47867534e600bbc535
Summary:
This method will be used in the next diff to add a test, but it might be more
useful later as well.
Note that the `update()` method in BookmarkTransaction already handles publishing bookmarks correctly.
Reviewed By: farnz
Differential Revision: D22817143
fbshipit-source-id: 11cd7ba993c83b3c8bca778560af4a360f892b03
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large
commits.
The code for managing cached_publishing_bookmarks_maybe_stale was already a bit
tricky, and with the introduction of WarmBookmarksCache it would've gotten even worse.
Let's move this logic to a separate SessionBookmarkCache struct.
Reviewed By: krallin
Differential Revision: D22816708
fbshipit-source-id: 02a7e127ebc68504b8f1a7401beb063a031bc0f4
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/38
The tool is used in some integration tests; make it public so that the tests can pass.
Reviewed By: ikostia
Differential Revision: D22815283
fbshipit-source-id: 76da92afb8f26f61ea4f3fb949044620a57cf5ed
Summary:
Better Engineering: remove dead code related to the secret tool.
The secret tool is an FB-specific (keychain-like) tool that was used to transfer
OAuth tokens between devservers without the user's involvement. We have migrated
to certs on devservers, so it is no longer needed.
It is also FB-specific, so it doesn't make sense for open source either.
Reviewed By: mitrandir77
Differential Revision: D22827264
fbshipit-source-id: cd89168ad75ca041d2a0f18d63474dd1eaad483d
Summary: It looks nicer if we highlight the current workspace in the list.
Reviewed By: mitrandir77
Differential Revision: D22826619
fbshipit-source-id: 416b77fb57d8dfe19057e248e12d411dfc5f9412
Summary:
Some users on macs and dev servers are connected to their default workspace; other
users, after D22802064, will be connected to their machine workspace. Assume a
user decides to reclone the repo. Currently, as a result of rejoin, the second group of
users will automatically see all their commits from the default workspace and will
be a bit surprised. It makes sense to adapt the rejoin logic to choose the default
workspace if a workspace name is not given.
Reviewed By: markbt
Differential Revision: D22817941
fbshipit-source-id: 764034c9f2d774051c5523cb2db093af525f27d7
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large
commits.
The problem with large changesets is deriving hg changesets for them. It might take
a significant amount of time, and that means all the clients are stuck on the
listkeys() or heads() calls waiting for derivation. WarmBookmarksCache can help here by returning bookmarks
for which hg changesets were already derived.
This is the second refactoring to introduce WarmBookmarksCache.
Now let's cache not only pull default, but also publishing bookmarks. There are two reasons to do it:
1) (Less important) It simplifies the code slightly
2) (More important) Without this change 'heads()' fetches all bookmarks directly from BlobRepo thus
bypassing any caches that we might have. So in order to make WarmBookmarksCache useful we need to avoid
doing that.
Reviewed By: farnz
Differential Revision: D22816707
fbshipit-source-id: 9593426796b5263344bd29fe5a92451770dabdc6
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large commits.
This diff just does a small refactoring that makes introducing
WarmBookmarksCache easier. In particular, I'd later like the cached_pull_default_bookmarks_maybe_stale cache to store
not only PullDefault bookmarks, but also Publishing bookmarks, so that both the
listkeys() and heads() methods can be served from this cache. In order to do
that we need to store not only bookmark name, but also bookmark kind (i.e. is
it Publishing or PullDefault).
To do that let's store the actual Bookmarks and hg changeset objects instead of
raw bytes.
Reviewed By: farnz
Differential Revision: D22816710
fbshipit-source-id: 6ec3af8fe365d767689e8f6552f9af24cbcd0cb9
Summary:
Most of our APIs throw an error when the path doesn't exist. I would like to
argue that's not the right choice for list_file_history.
Errors should only be returned in abnormal situations, and with the
`history_across_deletions` param there's no other easy way to check if the file
ever existed other than calling this API - so it's not abnormal to call
it with a path that doesn't exist in the repo.
Reviewed By: StanislavGlebik
Differential Revision: D22820263
fbshipit-source-id: 002bda2ef5ee9d6632259a333b7f3652cfb7aa6b
Summary:
Added a new query function to get the largest log id from bookmarks_update_log.
In the repo_import tool, once we move a bookmark to reveal commits to users, we want to check if hg_sync has received the commits. To do this, we extract the largest log id from bookmarks_update_log to compare it with the mutable_counter value related to hg_sync. If the counter value is larger than or equal to the log id, we can move the bookmark to the next batch of commits.
Since this query wasn't implemented before, this diff adds this functionality.
Next step: add query for mutable_counter
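In SQL terms the check looks roughly like this sketch, using an in-memory sqlite schema as a stand-in (table and counter names are illustrative, not the production schema):

```python
import sqlite3

def hg_sync_caught_up(conn, counter_name="latest-replayed-request"):
    """Check whether hg_sync has processed all bookmark moves (sketch).

    Compares MAX(id) from bookmarks_update_log against the mutable_counter
    value that hg_sync advances as it replays entries.
    """
    (max_log_id,) = conn.execute(
        "SELECT COALESCE(MAX(id), 0) FROM bookmarks_update_log"
    ).fetchone()
    (counter,) = conn.execute(
        "SELECT value FROM mutable_counters WHERE name = ?", (counter_name,)
    ).fetchone()
    return counter >= max_log_id
```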
Reviewed By: krallin
Differential Revision: D22816538
fbshipit-source-id: daaa4e5159d561e698c6e1874dd8822546c699c7
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/37
mononoke_hg_sync_job is used in integration tests, make it public
Reviewed By: krallin
Differential Revision: D22795881
fbshipit-source-id: 7a32c8e8adf723a49922dbb9e7723ab01c011e60
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/36
This command is used in some integration tests, make it public.
Reviewed By: krallin
Differential Revision: D22792846
fbshipit-source-id: 39ac89b1a674ea63dc924cafa07107dbf8e5a098
Summary:
Unfortunately, using sqlite causes `edenfsctl prefetch` to take several orders
of magnitude more time than with rocksdb, and tuning sqlite to be faster (at the
expense of reliability) still doesn't come close to rocksdb. From profiling,
this is due to sqlite only allowing a single thread to read/write to it,
serializing all the work that EdenFS is doing.
A rework of the storage will be necessary to get both good performance and
reliability, but that's a long-term project. For now, make Windows use rocksdb
by default.
Reviewed By: wez
Differential Revision: D22819084
fbshipit-source-id: 62f397858ed547da30ef8a6346b767461dc53493
Summary:
As opposed to FUSE, ProjectedFS sends notifications for file/directory creation
after the fact, and for directories that means they will be visible on disk before
EdenFS may be aware of them. While EdenFS usually processes these quickly, a heavily
multi-threaded application that tries to concurrently create a directory
hierarchy may end up sending notifications to EdenFS in a somewhat out-of-order
fashion.
Since this should be a very rare occurrence, we make this a very slow path by
being optimistic and calling `getInode` first, and only if that fails do we
aggressively create all the parent directories. During a buck build of ~1k
jobs, this happened only 3 times.
If we fully think this through, this change doesn't fully fix the race, as a
similar race can now happen when create and remove/rename operations are
concurrent. However, a client performing these operations concurrently is
either aware that this is racy and should handle it properly, or is most
likely buggy. Both of these should significantly reduce the likelihood of this
happening, thus I'm leaving this unfixed for now.
To better understand how frequently this happens, I've added a stat counter.
For now, it isn't published to ODS, but this will be tackled later.
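The optimistic fallback can be sketched like this (Python sketch; `lookup` and `create_placeholder` are made-up stand-ins for the real inode APIs):

```python
import os

def ensure_inode(path, lookup, create_placeholder):
    """Handle a possibly out-of-order directory-creation notification.

    Optimistically try the cheap lookup first; only if it fails (the rare,
    out-of-order case) fall back to aggressively creating every missing
    parent before creating the inode itself.
    """
    try:
        return lookup(path)  # fast path: the inode is already known
    except KeyError:
        parent = os.path.dirname(path)
        if parent and parent != path:
            ensure_inode(parent, lookup, create_placeholder)  # slow path
        return create_placeholder(path)
```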
Reviewed By: wez
Differential Revision: D22783484
fbshipit-source-id: ea3aafc2f77b65d3967f697f68114921d5909137
Summary: Make the output more informative
Reviewed By: markbt
Differential Revision: D22803543
fbshipit-source-id: 35dd4ff0a1f1003690b250d5284e48e6abb4f4b1