Summary:
In several places in `library.sh` we had `--mononoke-config-path
mononoke-config`. This meant that we could not run such commands from
non-`$TESTTMP` directories. Let's fix that.
Reviewed By: StanislavGlebik
Differential Revision: D22901668
fbshipit-source-id: 657bce27ce6aee8a88efb550adc2ee5169d103fa
Summary: The more contexts the better. Makes debugging errors much more pleasant.
Reviewed By: StanislavGlebik
Differential Revision: D22890940
fbshipit-source-id: 48f89031b4b5f9b15f69734d784969e2986b926d
Summary:
I've seen this error a couple of times while messing around with my clones. Not
having the path makes it difficult to fully understand what's going on, so make
sure we log it.
Reviewed By: fanzeyi
Differential Revision: D22899098
fbshipit-source-id: c9a60b71ea20514158e62fe8fa9c409d6f0f37ff
Summary:
An extremely thin wrapper around existing APIs: just a way to create merge commits from the command line.
This is needed to make the merge strategy work:
```
C
|
M3
| \
. \
| \
M2 \
| \ \
. \ \
| \ \
M1 \ \
| \ \ \
. TM3 \ \
. / | |
. D3 (e7a8605e0d) TM2 |
. | / /
. D2 (33140b117c) TM1
. | /
. D1 (733961456f)
| |
| \
| DAG to merge
|
main DAG
```
When we're creating `M2` as a result of merging `TM2` into the main DAG, some files are deleted in the `TM3` branch, but not deleted in the `TM2` branch. Executing the merge by running `hg merge` causes these files to be absent in `M2`. To make Mercurial work, we would need to execute `hg revert` for each such file prior to `hg merge`. Bonsai merge semantics, however, produce the correct behavior for us. Let's therefore just expose a way to create bonsai merges via the `megarepotool`.
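As a rough illustration of why bonsai semantics help here, consider this minimal Python sketch (not Mononoke's actual Rust implementation; the helper name and manifest shapes are hypothetical): a bonsai-style merge keeps any file present in either parent manifest unless the merge commit explicitly deletes it, so files deleted only on the `TM3` side survive a merge of the `TM2` side without any `hg revert` dance.

```python
def bonsai_merge_manifest(p1, p2, explicit_deletes=frozenset()):
    """Hypothetical sketch of bonsai merge semantics: the merged manifest
    is the union of both parents' files, minus only the deletions recorded
    explicitly in the merge commit itself."""
    merged = {**p2, **p1}  # on path conflicts, p1's content wins in this sketch
    for path in explicit_deletes:
        merged.pop(path, None)
    return merged


# A file deleted in one branch but present in the other is kept:
main_side = {"main/file.txt": "m1"}        # the file was deleted here
import_side = {"imported/file.txt": "i1"}  # ...but still exists here
merged = bonsai_merge_manifest(main_side, import_side)
```

A working-copy-based `hg merge` would instead propagate the deletion unless each such file were reverted first.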
Reviewed By: StanislavGlebik
Differential Revision: D22890787
fbshipit-source-id: 1508b3ede36f9b7414dc4d9fe9730c37456e2ef9
Summary:
This adds a CLI for the functionality, added in the previous diff. In addition, this adds an integration test, which tests this deletion functionality.
The output of this tool is meant to be stored in a file. It simulates a simple DAG, and it should be fairly easy to automatically parse the "to-merge" commits out of this output. In theory, it could have been enough to just print the "to-merge" commits alone, but it felt like it may sometimes be convenient to quickly examine the delete commits.
Reviewed By: StanislavGlebik
Differential Revision: D22866930
fbshipit-source-id: 572b754225218d2889a3859bcb07900089b34e1c
Summary:
This implements a new strategy of creating pre-merge delete commits.
As a reminder, the higher-level goal is to gradually merge two independent DAGs together. One of them is the main repo DAG, the other is an "import". It is assumed that the import DAG is already "moved", meaning that all files are at the right paths to be merged.
The strategy is as follows: create a stack of delete commits with gradually decreasing working copy size. Merge them into `master` in reverse order.
Reviewed By: StanislavGlebik
Differential Revision: D22864996
fbshipit-source-id: bfc60836553c656b52ca04fe5f88cdb1f15b2c18
Summary:
On Windows, paths are separated by \, but the test was comparing them against
/. We can simply ask Mercurial to return / with the slashpath template filter.
Reviewed By: chadaustin
Differential Revision: D22871407
fbshipit-source-id: 421bd14f752f29265b12eb25609d4f65e593dda8
Summary:
Cache invalidation is hard, and on Windows we avoided doing a lot of it. It
turns out this was the wrong decision, as it's fairly easy to find cases where
the filesystem view is different from the manifest state.
Since the Linux code is most likely correct in where the invalidation is done,
let's also do the same on Windows, removing a whole lot of #ifdefs. It is very
likely that as a result of this diff we end up invalidating more than needed,
thus slowing down EdenFS, but at this point I'd prefer to err on the side of
correctness, performance will come later.
While invalidating files should use PrjDeleteFile, for directories we simply
need to mark them as placeholders: directories created by a user won't have a
placeholder, and thus ProjectedFS would bypass EdenFS when listing them.
Reviewed By: chadaustin
Differential Revision: D22833202
fbshipit-source-id: d807557f5e44279c49ab701b7a797253ef1f0717
Summary: While testing something for another change, I came across this overlooked typo.
Reviewed By: wez
Differential Revision: D22894060
fbshipit-source-id: 8aa48ef5da714650c974adcf8a34a542fdd4ed9e
Summary:
Avoid some overhead and complexity by storing BufVec as a
unique_ptr<IOBuf>. The complexity can be reintroduced if we ever find
FUSE splice support to be a performance win for us.
Reviewed By: kmancini
Differential Revision: D22710795
fbshipit-source-id: e58eedc0fb5cea9e9743ccd20d3e4e2b7cc5d198
Summary:
Previously we logged a process to Scuba when it performed 2000 (fetchThreshold_) fetches, but then in Scuba all processes had fetch_count = 2000. In order to see approximately how many fetches a process really did, we now log the same process to Scuba every time it does 2000 more fetches.
Note: this change could make the total count of fetch-heavy events in Scuba inaccurate, as we log the same process more than once. So when users want to see how many fetch-heavy events happened, instead of setting "type = fetch_heavy", they should set exactly "fetch_count = 2000".
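A rough Python sketch of the new logging behavior (the class and `log_fn` callback are hypothetical stand-ins for the C++ counter and the Scuba logger):

```python
class FetchHeavyLogger:
    """Log the process every time it crosses another `threshold` fetches,
    instead of only once at the first threshold (sketch of the behavior
    described above)."""

    def __init__(self, threshold, log_fn):
        self.threshold = threshold
        self.log_fn = log_fn
        self.fetch_count = 0

    def record_fetch(self):
        self.fetch_count += 1
        if self.fetch_count % self.threshold == 0:
            # Each event still reports fetch_count == threshold, which is
            # why counting distinct fetch-heavy processes should filter on
            # fetch_count == threshold exactly.
            self.log_fn({"type": "fetch_heavy", "fetch_count": self.threshold})
```

With a threshold of 2000, a process doing 5000 fetches now produces two events rather than one.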
Reviewed By: chadaustin
Differential Revision: D22867679
fbshipit-source-id: ae3c768a8d3b03628db6a77263e715303a814e3d
Summary:
With upcoming write quorum work, it'll be interesting to know all the failures that prevent a put from succeeding, not just the most recent, as the most recent may be from a blobstore whose reliability is not yet established.
Store and return all errors, so that we can see exactly why a put failed.
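A toy Python sketch of the idea (the real multiplexed blobstore is Rust; the store/error shapes and `min_successes` knob here are hypothetical): collect every backend's failure instead of keeping only the last one.

```python
def multiplexed_put(blobstores, key, value, min_successes=1):
    """Try the put on every blobstore; if fewer than `min_successes`
    succeed, fail with *all* collected errors rather than just the most
    recent one (hypothetical sketch of the behavior described above)."""
    errors = {}
    successes = 0
    for store_id, store in blobstores.items():
        try:
            store[key] = value
            successes += 1
        except Exception as exc:
            errors[store_id] = exc
    if successes < min_successes:
        raise RuntimeError(f"put of {key!r} failed: {errors!r}")
    return errors  # non-fatal errors, still useful for diagnostics


class FailingStore(dict):
    def __setitem__(self, key, value):
        raise IOError("backend unavailable")
```

Raising `min_successes` is where a write-quorum policy would plug in.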
Reviewed By: ahornby
Differential Revision: D22896745
fbshipit-source-id: a3627a04a46052357066d64135f9bf806b27b974
Summary:
"Chunking hint" is a string (expected to be in a file) of the following format:
```
prefix1, prefix2, prefix3
prefix4,
prefix5, prefix6
```
Each line represents a single chunk: if a path starts with any of the prefixes in the line, it belongs to the corresponding chunk. Prefixes are comma-separated. Any path that does not start with any prefix in the hint goes into an extra chunk.
This hint will be used in a new pre-merge-delete approach, to be introduced further in the stack.
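A small Python sketch of how such a hint could be parsed and applied (the function names are hypothetical; the real tool is Rust):

```python
def parse_chunking_hint(text):
    """One chunk per non-empty line; prefixes are comma-separated and a
    trailing comma is tolerated."""
    chunks = []
    for line in text.splitlines():
        prefixes = [p.strip() for p in line.split(",") if p.strip()]
        if prefixes:
            chunks.append(prefixes)
    return chunks


def chunk_index(path, chunks):
    """Index of the first chunk with a matching prefix; paths matching no
    prefix go into an extra chunk at the end."""
    for i, prefixes in enumerate(chunks):
        if any(path.startswith(p) for p in prefixes):
            return i
    return len(chunks)
```

With the example hint above, `prefix4/...` paths land in chunk 1 and unmatched paths in the extra chunk 3.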
Reviewed By: StanislavGlebik
Differential Revision: D22864999
fbshipit-source-id: bbc87dc14618c603205510dd40ee5c80fa81f4c3
Summary:
We need to use a different type of pre-merge deletes, it seems, as the one proposed requires a huge number of commits. Namely, if we have `T` files in total in the working copy and we're happy to delete at most `D` files per commit, while merging at most `S` files per deletion stack:
```
#stacks = T/S
#delete_commits_in_stack = (T-S)/D
#delete_commits_total = T/S * (T-S)/D = (T^2 - TS)/SD ~ T^2/SD
T ~= 3*10^6
If D~=10^4 and S~=10^4:
#delete_commits_total ~= 9*10^12 / 10^8 = 9*10^4
If D~=10^5 and S~=10^5:
#delete_commits_total ~= 9*10^12 / 10^10 = 9*10^2
```
So either 90K or 900 delete commits. 90K is clearly too big. 900 may be tolerable, but it's still hard to manage and make sense of. What's more, there seems to be a way to produce fewer of these, see further in the stack.
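A quick numeric check of the estimate (using `S` for the per-stack merge size; the exact totals keep the `-TS` term the approximation drops):

```python
T = 3 * 10**6  # total files in the working copy

def total_delete_commits(T, S, D):
    # #stacks * #delete_commits_in_stack = (T/S) * ((T - S)/D) ~ T^2/(S*D)
    return (T // S) * ((T - S) // D)

# S ~= D ~= 10^4: on the order of 9*10^4 delete commits
assert total_delete_commits(T, S=10**4, D=10**4) == 89_700

# S ~= D ~= 10^5: on the order of 9*10^2 delete commits
assert total_delete_commits(T, S=10**5, D=10**5) == 870
```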
Reviewed By: StanislavGlebik
Differential Revision: D22864998
fbshipit-source-id: e615613a34e0dc0d598f3178dde751e9d8cde4da
Summary: Since local store compaction is not a hard requirement for graceful restart, make this issue non-blocking. We've seen some users' restarts fail because they had compaction issues due to lack of space on their devices. If we fail during the compaction stage, we should continue the restart anyway. This is also because there is a chance that the local store will clear columns that are no longer in use.
Reviewed By: chadaustin
Differential Revision: D22828433
fbshipit-source-id: 9a2aaec64e77c2d00089834fda8f8cffda472735
Summary:
We're going to add an SQL blobstore to our existing multiplex, which won't have all the blobs initially.
In order to populate it safely, we want to have normal operations filling it with the latest data, and then backfill from Manifold; once we're confident all the data is in there, we can switch to normal mode, and never have an excessive number of reads for blobs that we know aren't in the new blobstore.
Reviewed By: krallin
Differential Revision: D22820501
fbshipit-source-id: 5f1c78ad94136b97ae3ac273a83792ab9ac591a9
Summary:
Related diff: D22816538 (3abc4312af)
In the repo_import tool, once we move a bookmark to reveal commits to users, we want to check if hg_sync has received the commits. To do this, we extract the largest log id from bookmarks_update_log to compare it with the mutable_counters value related to hg_sync. If the counter value is larger than or equal to the log id, we can move the bookmark to the next batch of commits. Otherwise, we sleep, retry fetching the mutable_counters value, and compare the two again.
mutable_counters is an SQL table that can track bookmark update log instances with a counter.
This diff adds the functionality to extract the mutable_counters value for hg_sync.
======================
SQL query fix:
In the previous diff (D22816538 (3abc4312af)) we didn't cover the case where we might not get an ID, in which case we should return None. This diff fixes that.
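The waiting logic can be sketched like this (Python, with hypothetical fetcher callbacks standing in for the SQL queries against `bookmarks_update_log` and `mutable_counters`):

```python
import time

def wait_for_hg_sync(get_largest_log_id, get_counter, sleep_secs=0.0, max_tries=10):
    """Return True once hg_sync's mutable counter catches up to the
    largest bookmarks_update_log id (hypothetical sketch). The counter
    fetch may return None when no row exists - the case the SQL query
    fix above handles."""
    target = get_largest_log_id()
    for _ in range(max_tries):
        counter = get_counter()
        if counter is not None and counter >= target:
            return True
        time.sleep(sleep_secs)
    return False
```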
Reviewed By: StanislavGlebik
Differential Revision: D22864223
fbshipit-source-id: f3690263b4eebfe151e50b01a13b0193009e3bfa
Summary: The walker had a couple of unused stats fields in state.rs. Remove them.
Reviewed By: farnz
Differential Revision: D22863812
fbshipit-source-id: effc37abe29fafb51cb1421ff4962c5414b69be1
Summary:
Prefetch had some legacy logic that tried to look at the server to
determine what it needed to fetch. That's expensive, so let's just replace it
with looking at draft() commits. It also had some naive logic that looped over
every file in the manifest and tried to match a pattern. Let's instead use
mf.matches, which efficiently avoids traversing unnecessary directories.
This makes prefetch much faster.
Reviewed By: kulshrax
Differential Revision: D22853075
fbshipit-source-id: cf98aa147203c2d0e811b98998b8dc89173943a6
Summary:
An earlier diff, D21772132 (713fbeec24), added an option to default hgcache data store
writes to indexedlog, but it only did it for data, not history. Let's also do it
for history.
Reviewed By: quark-zju
Differential Revision: D22870952
fbshipit-source-id: 649361b2d946359b9fbdd038867e1058077bd101
Summary: It is used in lowercase in all other places.
Reviewed By: farnz
Differential Revision: D22867435
fbshipit-source-id: 50c78027eeacd341144d190f36cc5570d64f92c3
Summary: This makes it a little bit easier to use.
Reviewed By: sfilipco
Differential Revision: D22853717
fbshipit-source-id: aa3c1ed2a9a2d1020a48a4493a644093d8b07e67
Summary:
TL;DR:
A codemod did something a bit unclean, so they added a lint. This will keep bugging us if we make changes here, so let's satisfy the linter.
More info:
`x.y_ref() = ...` and `*x.y_ref() = ...` are pretty much the same, except `*x.y_ref() = ...` can throw for optional fields.
A codemod added a bunch of `*x.y_ref() = ...`, but afterwards they didn't want people to copy-paste this for optional fields, so they added a lint that pops up on non-optional fields too :(
https://fb.workplace.com/groups/thriftusers/permalink/509303206445763/
Reviewed By: chadaustin
Differential Revision: D22823686
fbshipit-source-id: b3b1b8a3b6b1f1245176be19c961476e4554a8e5
Summary:
Previously, a fetch-heavy event's cmdline was delimited by '\x00' when logged to Scuba (for example: `grep--color=auto-rtest.`).
Now we replace \x00 with a space, so the command name and args are separated by spaces (`grep --color=auto -r test .`).
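In Python terms the fix amounts to the following (a sketch; the real change is in EdenFS C++ code, and the helper name is hypothetical):

```python
def format_cmdline(raw: bytes) -> str:
    """/proc/<pid>/cmdline delimits arguments with NUL bytes; replace
    them with spaces so the logged command line stays readable."""
    return raw.replace(b"\x00", b" ").decode("utf-8", errors="replace").rstrip()
```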
Reviewed By: kmancini
Differential Revision: D22772868
fbshipit-source-id: 4ab42e78c7bc786767eee3413b9586739a12e8ac
Summary:
This helps in understanding what's going on when some files disappear and/or
aren't flushed properly.
Reviewed By: fanzeyi
Differential Revision: D22833201
fbshipit-source-id: 09beb5796cb40c0a93107ee6a3a3497abb2578f0
Summary:
This is expected to fix flakiness in test-walker-corpus.t.
The problem was that if a FileContent node was reached via an Fsnode, it did not have a path associated with it. This is a race condition that I've not managed to reproduce locally, but I think it is highly likely to be the reason for the flaky failures in CI.
Reviewed By: ikostia
Differential Revision: D22866956
fbshipit-source-id: ef10d92a8a93f57c3bf94b3ba16a954bf255e907
Summary:
There have been lots of issues with user experience related to authentication
and its help messages.
Just one of it:
certs are configured to be used for authentication and they are invalid but the `hg cloud auth`
command will provide help message about the certs but then ask to copy and
paste a token from the code about interactive token obtaining.
Another thing, is certs are configired to use, it was not hard to
set up a token for Scm Daemon that can be still on tokens even if cloud
sync uses certs.
Now it is possible with `hg auth -t <token>` command
Now it should be more cleaner and all the messages should be cleaner as well.
Also certs related help message has been improved.
Also all tests were cleaned up from the authentication except for the main
test. This is to simplify the tests.
Reviewed By: mitrandir77
Differential Revision: D22866731
fbshipit-source-id: 61dd4bffa6fcba39107be743fb155be0970c4266
Summary:
We shouldn't add any TLS-related configs to the default configuration.
TLS is not used by default; tokens are currently the default, and TLS is another
option. It is cleaner to cover the defaults in the code itself, rather than add
complexity to the configuration here.
Reviewed By: mitrandir77
Differential Revision: D22864541
fbshipit-source-id: 0c0723c77c2a961a0915617d636b83bc65ac8541
Summary:
We're seeing users report LFS fetches hanging for 24+ hours. Stack
traces seem to show them hanging on the LFS fetch. Let's read bytes off the wire
in smaller chunks and add a timeout to each read (the default timeout is 10s).
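The idea, sketched in Python with a plain socket (the actual change is in Mercurial's http client; the helper name and chunk size are assumptions, the 10s default comes from the summary): a per-read timeout bounds how long any single stalled read can hang.

```python
import socket

def read_exactly(sock, total, chunk_size=8192, timeout=10.0):
    """Read `total` bytes in small chunks with a per-read timeout, so a
    stalled peer raises socket.timeout instead of hanging forever
    (hypothetical sketch)."""
    sock.settimeout(timeout)
    buf = bytearray()
    while len(buf) < total:
        chunk = sock.recv(min(chunk_size, total - len(buf)))
        if not chunk:
            raise ConnectionError("peer closed before %d bytes arrived" % total)
        buf.extend(chunk)
    return bytes(buf)
```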
Reviewed By: xavierd
Differential Revision: D22853074
fbshipit-source-id: 3cd9152c472acb1f643ba8c65473268e67d59505
Summary:
We encountered an issue where gc kicked in after forking the Python
process. This caused it to trigger some Rust drop logic, which hung because some
cross-thread locks were not in a good state. Let's just disable gc during the
fork and only re-enable it in the parent process.
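A minimal Python sketch of the approach (the wrapper below is hypothetical, not the actual Mercurial code):

```python
import gc
import os

def fork_with_gc_paused():
    """Disable the garbage collector around fork() and re-enable it only
    in the parent, so no gc-triggered finalizers run in the child while
    cross-thread locks may be in a bad state (hypothetical sketch)."""
    was_enabled = gc.isenabled()
    if was_enabled:
        gc.disable()
    pid = None
    try:
        pid = os.fork()
    finally:
        # pid == 0 is the child: leave gc off there.
        if was_enabled and pid != 0:
            gc.enable()
    return pid
```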
Reviewed By: quark-zju
Differential Revision: D22855986
fbshipit-source-id: c3e99fb000bcd4cc141848e6362bb7773d0aad3d
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/40
Those tools are being used in some integration tests; make them public so that the tests can pass.
Reviewed By: ikostia
Differential Revision: D22844813
fbshipit-source-id: 7b7f379c31a5b630c6ed48215e2791319e1c48d9
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/41
As of D22098359 (7f1588131b) the default locale used by integration tests is en_US.UTF-8, but as the comment in the code mentions:
```
The en_US.UTF-8 locale doesn't behave the same on all systems and trying to run
commands like "sed" or "tr" on non-utf8 data will result in "Illegal byte
sequence" error.
That is why we are forcing the "C" locale.
```
Additionally, I've changed the test-walker-throttle.t test to use "/bin/date" directly. Previously it was using "/usr/bin/date", but "/bin/date" is a more standard path, as it also works on macOS.
Reviewed By: krallin
Differential Revision: D22865007
fbshipit-source-id: afd1346e1753df84bcfc4cf88651813c06933f79
Summary: It fails now for an unknown reason; will work on it later.
Reviewed By: mitrandir77, ikostia
Differential Revision: D22865324
fbshipit-source-id: c0513bfa2ce9f6baffebff472053e8a5d889c9ba
Summary:
Follow-up from D22819791.
We want to use the bookmark update delay only in scs, so let's configure it this
way.
Reviewed By: krallin
Differential Revision: D22847143
fbshipit-source-id: b863d7fa4bf861ffe5d53a6a2d5ec44e7f60eb1a
Summary:
This is the (almost) final diff to introduce WarmBookmarksCache in repo_client.
A lot of this code is to pass through the config value, but a few things I'd
like to point out:
1) Warm bookmark cache is enabled from config, but it can be killswitched using
a tunable.
2) WarmBookmarksCache in scs derives all derived data, but for repo_client I
decided to derive just hg changeset. The main motivation is to not change the
current behaviour, and to make mononoke server more resilient to failures in
other derived data types.
3) Note that WarmBookmarksCache doesn't obsolete SessionBookmarksCache that was
introduced earlier, but rather it complements it. If WarmBookmarksCache is
enabled, then SessionBookmarksCache reads the bookmarks from it and not from
db.
4) There's one exception to point #3 - if we just did a push, then we read
bookmarks from the db rather than from the bookmarks cache (see the
update_publishing_bookmarks_after_push() method). This is done intentionally -
after a push is finished, we want to return the latest updated bookmarks to the
client (because the client has just moved a bookmark, after all!).
I'd argue that the current code is a bit sketchy already - it doesn't read from
master but from replica, which means we could still see outdated bookmarks.
Reviewed By: krallin
Differential Revision: D22820879
fbshipit-source-id: 64a0aa0311edf17ad4cb548993d1d841aa320958
Summary:
Add a cmdlib argument to control cachelib zstd compression. The default behaviour is unchanged, in that the CachelibBlobstore will attempt compression when putting to the cache if the object is larger than the cachelib max size.
To make the cache behaviour more testable, this change also adds an option to do an eager put to the cache without the spawn. The default remains a lazy fire-and-forget put into the cache with tokio::spawn.
The motivation for the change is that when running the walker the compression putting to cachelib can dominate CPU usage for part of the walk, so it's best to turn it off and let those items be uncached as the walker is unlikely to visit them again (it only revisits items that were not fully derived).
Reviewed By: StanislavGlebik
Differential Revision: D22797872
fbshipit-source-id: d05f63811e78597bf3874d7fd0e139b9268cf35d
Summary: populate_healer would panic on launch because there were two arguments assigned to -d: debug and destination-blobstore-id.
Reviewed By: StanislavGlebik
Differential Revision: D22843091
fbshipit-source-id: e300af85b4e9d4f757b4311f2b7d776f59c7527d
Summary:
Although new changelog revlogs have not used deltas for years, early
revisions in our production changelog still use the mpatch delta format
because they were stream-cloned.
Teach revlogindex to support them.
Reviewed By: sfilipco
Differential Revision: D22657204
fbshipit-source-id: 7aa3b76a9a6b184294432962d36e6a862c4fe371
Summary:
Now the rust-commits features are moved to changelog2, and changelog is no
longer used for rust-commits features. Let's just remove all rust-commits
features from changelog, and collapse related configs into just rust-commits.
Reviewed By: DurhamG
Differential Revision: D22657194
fbshipit-source-id: d74ae40a24fb365981679feab7c2403f84df2b3e
Summary:
Restore the behavior to before D22368827 (da42f2c17e). This also significantly speeds up
graph log like `smartlog` because the fast native path of `reachableroots`
can be used.
Reviewed By: DurhamG
Differential Revision: D22657197
fbshipit-source-id: e3236938d8acfd0935ec45e761763bf0477f2152
Summary: So reachableroots can be called from Python.
Reviewed By: sfilipco
Differential Revision: D22657186
fbshipit-source-id: 36b1b5ed1e32c88bb07e6c7c7e0a7ca89e0751a3
Summary:
The default reachable_roots implementation is good enough for segmented
changelog, but not efficient for revlogindex use-case.
Reviewed By: sfilipco
Differential Revision: D22657193
fbshipit-source-id: a81bc255d42d46c50e61fe954f027f1160dacb6c
Summary:
I thought it was just `roots & (::heads)`. It is actually more complex than
that.
Reviewed By: sfilipco
Differential Revision: D22657201
fbshipit-source-id: bd0b49fc4cdd2c516384cf70c1c5f79af4da1342
Summary:
The `changelog2.changelog` type does not inherit from `revlog`.
It basically takes the implementation from `changelog`, with the `userust`
branches returning true.
Reviewed By: DurhamG
Differential Revision: D22657195
fbshipit-source-id: dc718d180c7ef3d64f822c3a8c968ef6027047d5
Summary: This will help us verify that the C index is no longer necessary.
Reviewed By: DurhamG
Differential Revision: D22657196
fbshipit-source-id: 16ed74acc5400661572880adf3d8d3267c8b53e2
Summary:
This makes the Rust code path take care of commit writing.
The feature cannot be enabled yet because the `nodemap` backed by the C index
is no longer aware of new in-memory commits. The next diff migrates nodemap to
be backed by Rust and can turn on this feature altogether.
Reviewed By: DurhamG
Differential Revision: D22657191
fbshipit-source-id: 5f1a60f0b391b06fcd61d10676e2e095f8b7c9d6
Summary:
The debug print abuses the `linkmapper`. The Rust commit add logic does not
use `linkmapper`. So let's remove the debug message to be consistent with
the Rust logic.
Reviewed By: DurhamG
Differential Revision: D22657189
fbshipit-source-id: 2e92087dbb5bfce2f00711dcd62881aba64b0279
Summary:
Those tests are going to break with the latest changelog. We're moving away
from revlog, so let's just remove the tests.
Reviewed By: DurhamG
Differential Revision: D22657198
fbshipit-source-id: 6d1540050d70c58636577fa3325daca511273a2b
Summary:
`tr.writepending()` removes callbacks saying "temp files are already written".
However, `tr.writepending()` might be called multiple times and the content
being written can be changed.
For example, `test-hook.t` has a test case that uses both `prechangegroup` and
`pretxnchangegroup` external process hooks. The `prechangegroup` hook runs
before the changelog gets changed, and the `pretxnchangegroup` runs after the
changelog gets changed.
Without this diff, the latter will not see the changelog change after migrating
to Rust (which buffers pending commits in memory).
The revlog changelog "addpending" keeps its original behavior - only called once
- to avoid a potential performance regression.
Reviewed By: DurhamG
Differential Revision: D22657199
fbshipit-source-id: 8f96a0beaeebd45e73de3973e3ee8dd1426295fb
Summary:
In the future it will be harder to provide changed "revs". Let's use commit
hashes instead.
Reviewed By: DurhamG
Differential Revision: D22657203
fbshipit-source-id: b46055fe31d174a6eae47570ebec4a73c7d603f6
Summary:
Without this a few tests will fail with upcoming changes.
For example, test-clone-uncompressed.t will say "requesting all changes"
instead of "no changes found" for the "hg clone --stream" command.
Reviewed By: DurhamG
Differential Revision: D22657190
fbshipit-source-id: 349caf58e5bfdb5310b6b5585e4727e208197573
Summary:
Commands like `debugindex` rely on this function to return a revlog object
with low-level APIs. Do not return the changelog as-is.
Reviewed By: DurhamG
Differential Revision: D22657202
fbshipit-source-id: b6ae84a157d3411cef6f67ee842f44134fe9b35e
Summary:
This replaces the RustError that might happen during `addcommits`, and allows us
to handle it without having a stacktrace.
Reviewed By: DurhamG
Differential Revision: D22539564
fbshipit-source-id: 356814b9baf0b31528dfc92d62b0dcf352bc1e24
Summary:
The zstore-commit-data code paths are in Python. We want to move them to behind
the Rust HgCommits abstractions. So stop making Python interact with the
low-level details.
Reviewed By: DurhamG
Differential Revision: D22638457
fbshipit-source-id: 435db8425a29ce4eae24a6202ad928f85a5f5ee2
Summary: It's the same as `__add__`. It's consistent with the revset language.
Reviewed By: sfilipco
Differential Revision: D22638456
fbshipit-source-id: 928177d553220461192650f4792ac39cadd57dc2
Summary:
Follow-up of D22638454.
This makes revlogindex mark its compatible DAG so that "all()" fast paths can be used properly.
Reviewed By: sfilipco
Differential Revision: D22638459
fbshipit-source-id: 074e95b9fccbc486b69a947fec5172662e7dd3b7
Summary:
No need to exhaust the entire IdLazySet if there are hints.
This is important to make `small & lazy` fast.
Reviewed By: sfilipco
Differential Revision: D22638462
fbshipit-source-id: 63a71986e6e254769c42eb6250c042ea6aa5808b
Summary:
When multiple DAGs (ex. a local DAG and a commit-cloud DAG) are involved,
certain fast paths become unsound. Namely, the fast paths of the FULL hint
should check DAG compatibility. For example:
localrepodag.all() & remotedag.all()
should not simply return `localrepodag.all()` or `remotedag.all()`.
Fix it by checking DAG pointers.
A StaticSet might be created without using a DAG; add an optimization
to change `all & static` to `static & all`, so a StaticSet without a DAG
wouldn't require full DAG scans when intersecting with other sets.
Reviewed By: sfilipco
Differential Revision: D22638454
fbshipit-source-id: 72396417e9c1238d5411829da8f16f2c6d4c2f3a
Summary:
Improve `fmt::Debug` so it fits better in the Rust and Python eco-system:
- Support Rust formatter flags. For example `{:#5.3?}`. `5` defines limit of a
large set to show, `3` defines hex commit hash length. `#` specifies the
alternate form.
- Show commit hashes together with integer Ids for IdStaticSet.
- Use HG rev range syntax (`a:b`) to represent ranges for IdStaticSet.
- Limit spans to show for IdStaticSet, similar to StaticSet.
- Show only 8 chars of a long hex commit hash by default.
- Minor renames like `dag` -> `spans`, `difference` -> `diff`.
The Python bindings use `fmt::Debug` as `__repr__` and will be affected.
Reviewed By: sfilipco
Differential Revision: D22638455
fbshipit-source-id: 957784fec9c99c8fc5600b040d964ce5918e1bb4
Summary:
Hard links add complexity to revlog writes, and they're not that useful in a
production setup. The Rust revlog `flush` API does not break hardlinked files,
so let's just avoid using hard links during local repo clones.
Reviewed By: DurhamG
Differential Revision: D22638460
fbshipit-source-id: 038f4d5c48e9972b14c9e59a9d7ef72b6bc5308d
Summary:
This makes intersection set stop early. It's useful to stop iteration on some
lazy sets. For example, the below `ancestors(tip) & span` or
`descendants(1) & span` sets can take seconds to calculate without this
optimization.
```
In [1]: cl.dag.ancestors([cl.tip()]) & cl.tonodes(bindings.dag.spans.unsaferange(len(cl)-10,len(cl)))
Out[1]: <and <lazy-id> <dag [...]>>
In [3]: %time len(cl.dag.ancestors([cl.tip()]) & cl.tonodes(bindings.dag.spans.unsaferange(len(cl)-10,len(cl))))
CPU times: user 364 µs, sys: 0 ns, total: 364 µs
Wall time: 362 µs
In [7]: %time len(cl.dag.descendants([repo[1].node()]) & cl.tonodes(bindings.dag.spans.unsaferange(0,100)))
CPU times: user 0 ns, sys: 574 µs, total: 574 µs
Wall time: 583 µs
```
Reviewed By: sfilipco
Differential Revision: D22638458
fbshipit-source-id: b9064ce2ff1aecc2d7d00025928dfcb3c0d78e0c
Summary:
Similar to the segmented changelog version using `ANCESTORS`. This makes
`heads(all())` calculate `heads_ancestors(all())` automatically and get
the speed-up.
Reviewed By: sfilipco
Differential Revision: D22638464
fbshipit-source-id: 014412f1c226925e50387f18c1282b3cb96d434b
Summary:
Optimize it to not convert revs to `Vec<u32>`, and add a fast path to
initialize `states` with `Unspecified`. This makes it about 2x faster, matching
the C revlog `headrevs` performance when calculating `headsancestors(all())`:
```
In [2]: %timeit cl.index.clearcaches(); len(cl.index.headrevs())
10 loops, best of 3: 66.9 ms per loop
In [3]: %timeit len(cl.dageval(lambda: headsancestors(all())))
10 loops, best of 3: 64.9 ms per loop
```
Reviewed By: sfilipco
Differential Revision: D22638461
fbshipit-source-id: 965eb16e3a78ae02a65a8a44559f3a64c16f6884
Summary:
Change `parents` from using the default implementation that returns `StaticSet`
of commit hashes, to a customized implementation that returns `IdStaticSet`.
This avoids unnecessary commit hash lookups, and makes `heads(all())` 30x
faster, matching `headsancestors(all())` (but is still 2x slower than the C
revlog index `headrevs` implementation).
Reviewed By: sfilipco
Differential Revision: D22638453
fbshipit-source-id: 4fef78080b990046b91fee110c48e36301d83b4f
Summary:
The hint indicates a set `X` is equivalent to `ancestors(X)`.
This allows us to make `heads` use `heads_ancestors` (which is faster in
segmented changelog) automatically without affecting correctness. It also
makes special queries like `ancestors(all())` super cheap because it'll just
return `all()` as-is.
Reviewed By: sfilipco
Differential Revision: D22638463
fbshipit-source-id: 44d9bbcbb0d7e2975a0c8322181c88daa1ba4e37
Summary:
Re-implement the `findcommonheads` logic using `changelog` APIs that are going
to have native support from Rust.
This decouples from revlog-based Python DAG logic, namely `dagutil.revlogdag`,
and `ancestor.incrementalmissingancestors`, unblocking Rust DAG progress, and
cleans up the algorithm to not use revision numbers.
The core algorithm is unchanged. The sampling logic is simplified and tweaked
a bit (ex. no 'initial' / 'quick initial' special cases). The debug and
progress messages are more verbose, and variable names are chosen to match
the docstrings.
I improved the doc a bit, and added some TODO notes about where I think can be
improved.
Reviewed By: sfilipco
Differential Revision: D22519582
fbshipit-source-id: ac8cc8bebad91b4045d69f402e69b7ca28146414
Summary:
It has been long replaced by setdiscovery. This removes another dependency on
`dagutil.revlogdag`.
Reviewed By: DurhamG
Differential Revision: D22519585
fbshipit-source-id: ee261173ba584ffcb3371ec640b233609aafcf77
Summary:
`changegroup` uses `dagutil.revlogdag` just to "linearize" commits to optimize
file revision deltas. This is less relevant in production setup because:
- The file delta calculation with remotefilelog is quite different.
- We don't have lots of branches that make the optimization useful.
- In the future segmented changelog makes commits more linearized.
The Python `dagutil.revlogdag` is ideally removed. This is a step towards that.
Reviewed By: DurhamG
Differential Revision: D22519589
fbshipit-source-id: ac44873893df8658da0617e06cae1805d72417aa
Summary:
The changegroup logic uses those APIs, which uses low-level revlog details like
the C index. Bypass them if the Rust DAG is used.
Reviewed By: DurhamG
Differential Revision: D22519583
fbshipit-source-id: 228c7ba0a8ea77c0cf85db39d1194274d6331416
Summary:
Those methods are less fancy. Use the Rust path to avoid depending on revlog
internals.
Reviewed By: DurhamG
Differential Revision: D22519588
fbshipit-source-id: 0fede55ee04373c069ae7a6dd727f4d7208ee321
Summary: This avoids depending on the C index if the Rust DAG is available.
Reviewed By: DurhamG
Differential Revision: D22519587
fbshipit-source-id: a89d91184feaeef6641d2b04353601297bf5d4d5
Summary:
The check does not practically work because the client sends `common=[null]`
if the common set is empty.
D22519582 changes the client-side logic to send `common=[]` instead of
`common=[null]` in such cases. Therefore remove the constraint to keep
tests passing. 13 tests depend on this change.
Reviewed By: StanislavGlebik
Differential Revision: D22612285
fbshipit-source-id: 48fbc94c6ab8112f0d7bae1e276f40c2edd47364
Summary:
Replace the Python spanset with the Rust-backed idset.
The idset can represent multiple ranges and works better with Rust code.
The `idset` fast paths do not preserve order for the `or` operation, as
demonstrated in the test changes.
Reviewed By: DurhamG, kulshrax
Differential Revision: D22519584
fbshipit-source-id: 5d976a937e372a87e7f087d862e4b56d673f81d6
Summary:
Now that packfiles are marked with FILE_DELETE_ON_CLOSE, they can no longer be
opened on Windows, and thus trying to stat them fails with a permission denied
error, which breaks repack.
This really should only happen when using EdenFS on Windows, which only a
small (though growing) number of people are using.
Reviewed By: quark-zju
Differential Revision: D22801408
fbshipit-source-id: f4229e90ce076a65994fb9d193d00c309377323a
Summary:
The tilde got dropped as part of the changes in D22672240 (be3683b1d4)
(an easy mistake to make!) and that renders this function less
useful.
Thankfully the caps display isn't a critical function; it's only used for
some diagnostic printing.
Reviewed By: chadaustin
Differential Revision: D22847590
fbshipit-source-id: 716d7c7bd674260687fbc09e3dc94538359f98b3
Summary: Move client hostname reverse DNS lookup from inside of the LFS server's `RequestContext` to an async method on `ClientIdentity`, allowing it to be used elsewhere. The behavior of `RequestContext::dispatch_post_request` should remain unchanged.
Reviewed By: krallin
Differential Revision: D22835610
fbshipit-source-id: 15c1183f64324f216bd639630396c9c6f19bcaaa
Summary: When a TLS connection fails due to a missing client certificate, the `curl` command may fail with either code 35 or 56 depending on the TLS version used. With TLS v1.3, the error is explicitly reported as a missing client certificate, whereas in TLS v1.2, it is reported as a generic handshake failure. This is because TLS v1.3 defines an explicit [`certificate_required`](https://tools.ietf.org/html/rfc8446#section-4.4.2.4) alert, which is [not present](https://github.com/openssl/openssl/issues/6804) in earlier TLS versions.
Reviewed By: krallin
Differential Revision: D22834527
fbshipit-source-id: a15d6a169d35ece6ed5a54b37b8ca9bbc506b3da
Summary:
`log()` passes fsck bars to standard output, but it will also print the same message to the log with level DBG2. (example below)
```
V0713 07:05:45.971511 3510654 StartupLogger.cpp:96] [====================>] 100%: fsck on /home/ailinzhang/eden-state/clients/dev-fbsource6/local
```
Since we don't want the log file to be cluttered with fsck bars, we use `logVerbose()` with level DBG7.
Reviewed By: kmancini
Differential Revision: D22727965
fbshipit-source-id: 0700503af511030df2abbca4ad2fa1540995e919
Summary:
We have some users issuing 10k+ diff queries to Phabricator, which is
causing problems with their db. Since we usually only care about the latest
draft commits, let's bound the size of the requests we send.
Reviewed By: quark-zju
Differential Revision: D22834195
fbshipit-source-id: d41b449a89d6dfb2d6d33e0be6ed0ff31893ab5e
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large
commits.
We'd like to use WarmBookmarkCache in repo client, and to do that we need to be
able to tell Publishing and PullDefault bookmarks apart. Let's teach
WarmBookmarksCache about it.
Reviewed By: krallin
Differential Revision: D22812478
fbshipit-source-id: 2642be5c06155f0d896eeb47867534e600bbc535
Summary:
This method will be used in the next diff to add a test, but it might be more
useful later as well.
Note that the `update()` method in BookmarkTransaction already handles publishing bookmarks correctly.
Reviewed By: farnz
Differential Revision: D22817143
fbshipit-source-id: 11cd7ba993c83b3c8bca778560af4a360f892b03
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large
commits.
The code for managing cached_publishing_bookmarks_maybe_stale was already a bit
tricky, and with WarmBookmarksCache introduction it would've gotten even worse.
Let's move this logic to a separate SessionBookmarkCache struct.
Reviewed By: krallin
Differential Revision: D22816708
fbshipit-source-id: 02a7e127ebc68504b8f1a7401beb063a031bc0f4
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/38
The tool is used in some integration tests; make it public so that the tests can pass.
Reviewed By: ikostia
Differential Revision: D22815283
fbshipit-source-id: 76da92afb8f26f61ea4f3fb949044620a57cf5ed
Summary:
Better Engineering: remove dead code related to the secret tool.
The secret tool is an FB-specific tool (keychain-like) that was used to transfer OAuth tokens between
devservers without user involvement. We have since migrated to certs on devservers, so it is no longer needed.
It is also FB-specific and doesn't make sense for open source.
Reviewed By: mitrandir77
Differential Revision: D22827264
fbshipit-source-id: cd89168ad75ca041d2a0f18d63474dd1eaad483d
Summary: It looks nicer if we highlight the current workspace in the list.
Reviewed By: mitrandir77
Differential Revision: D22826619
fbshipit-source-id: 416b77fb57d8dfe19057e248e12d411dfc5f9412
Summary:
Some users on Macs and dev servers are connected to their default workspace; after
D22802064, other users will be connected to their machine workspace. Assume a
user decides to reclone the repo. Currently, as a result of rejoin, the second group of users will
automatically see all their commits from the default workspace and will be a bit
surprised. It makes sense to adapt the rejoin logic to choose the default
workspace if no workspace name is given.
Reviewed By: markbt
Differential Revision: D22817941
fbshipit-source-id: 764034c9f2d774051c5523cb2db093af525f27d7
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large
commits.
The problem with large changesets is deriving hg changesets for them. It might take
a significant amount of time, and that means all clients are stuck on
listkeys() or heads() calls waiting for derivation. WarmBookmarksCache can help here by returning bookmarks
for which hg changesets were already derived.
This is the second refactoring to introduce WarmBookmarksCache.
Now let's cache not only pull default, but also publishing bookmarks. There are two reasons to do it:
1) (Less important) It simplifies the code slightly
2) (More important) Without this change 'heads()' fetches all bookmarks directly from BlobRepo thus
bypassing any caches that we might have. So in order to make WarmBookmarksCache useful we need to avoid
doing that.
Reviewed By: farnz
Differential Revision: D22816707
fbshipit-source-id: 9593426796b5263344bd29fe5a92451770dabdc6
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large commits.
This diff just does a small refactoring that makes introducing
WarmBookmarksCache easier. In particular, I'd later like the cached_pull_default_bookmarks_maybe_stale cache to store
not only PullDefault bookmarks but also Publishing bookmarks, so that both the
listkeys() and heads() methods can be served from it. In order to do
that we need to store not only the bookmark name, but also the bookmark kind (i.e. whether
it is Publishing or PullDefault).
To do that let's store the actual Bookmarks and hg changeset objects instead of
raw bytes.
Reviewed By: farnz
Differential Revision: D22816710
fbshipit-source-id: 6ec3af8fe365d767689e8f6552f9af24cbcd0cb9
Summary:
Most of our APIs throw an error when the path doesn't exist. I would like to
argue that's not the right choice for list_file_history.
Errors should only be returned in abnormal situations, and with the
`history_across_deletions` param there's no other easy way to check if the file
ever existed other than calling this API - so it's not abnormal to call
it with a path that doesn't exist in the repo.
Reviewed By: StanislavGlebik
Differential Revision: D22820263
fbshipit-source-id: 002bda2ef5ee9d6632259a333b7f3652cfb7aa6b
Summary:
Added a new query function to get the largest log id from bookmarks_update_log.
In repo_import tool once we move a bookmark to reveal commits to users, we want to check if hg_sync has received the commits. To do this, we extract the largest log id from bookmarks_update_log to compare it with the mutable_counter value related to hg_sync. If the counter value is larger or equal to the log id, we can move the bookmark to the next batch of commits.
Since this query wasn't implemented before, this diff adds this functionality.
Next step: add a query for mutable_counter.
Reviewed By: krallin
Differential Revision: D22816538
fbshipit-source-id: daaa4e5159d561e698c6e1874dd8822546c699c7
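The query being described could be sketched like this. Illustrative only: the table schema is an assumption here, and the actual Mononoke query code is Rust against MySQL/sqlite, not a shell script.

```shell
# Illustrative sketch with an assumed schema: fetch the largest log id
# from bookmarks_update_log, to be compared against the hg_sync counter.
db=$(mktemp)
sqlite3 "$db" "CREATE TABLE bookmarks_update_log (id INTEGER PRIMARY KEY);
               INSERT INTO bookmarks_update_log (id) VALUES (1), (2), (3);"
max_id=$(sqlite3 "$db" "SELECT MAX(id) FROM bookmarks_update_log;")
echo "$max_id"
rm -f "$db"
```

If the hg_sync mutable counter is greater than or equal to this id, the sync job has caught up and the bookmark can move to the next batch.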
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/37
mononoke_hg_sync_job is used in integration tests, make it public
Reviewed By: krallin
Differential Revision: D22795881
fbshipit-source-id: 7a32c8e8adf723a49922dbb9e7723ab01c011e60
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/36
This command is used in some integration tests, make it public.
Reviewed By: krallin
Differential Revision: D22792846
fbshipit-source-id: 39ac89b1a674ea63dc924cafa07107dbf8e5a098
Summary:
Unfortunately, using sqlite causes `edenfsctl prefetch` to take several orders
of magnitude more time than with rocksdb, and tuning sqlite to be faster (at the
expense of reliability) still doesn't come close to rocksdb. From profiling,
this is due to sqlite only allowing a single thread to read/write to it,
serializing all the work that EdenFS is doing.
A rework of the storage will be necessary to get both good
performance and reliability, but that's a long-term project; for now, make
Windows use rocksdb by default.
Reviewed By: wez
Differential Revision: D22819084
fbshipit-source-id: 62f397858ed547da30ef8a6346b767461dc53493
Summary:
As opposed to FUSE, ProjectedFS sends notifications for file/directory creation
after the fact, and for directories that means they will be visible on disk before
EdenFS is aware of them. While EdenFS usually processes these quickly, a heavily
multi-threaded application that tries to concurrently create a directory
hierarchy may end up sending notifications to EdenFS in a somewhat out-of-order
fashion.
Since this should be a very rare occurrence, we make this a very slow path by
being optimistic and calling `getInode` first, and then only if that fails, we
aggressively create all the parent directories. During a buck build of ~1k
jobs, this happened only 3 times.
If we fully think this through, this change doesn't fully fix the race, as a
similar race can now happen when create and remove/rename operations are
concurrent. However, a client performing these operations concurrently is
either aware that this is racy and should handle it properly, or is most
likely buggy. Both of these should significantly reduce the likelihood of this
happening; thus, I'm leaving this unfixed for now.
To better understand how frequently this happens, I've added a stat counter.
For now, these aren't published to ODS, but this will be tackled later.
Reviewed By: wez
Differential Revision: D22783484
fbshipit-source-id: ea3aafc2f77b65d3967f697f68114921d5909137
Summary: Make the output more informative
Reviewed By: markbt
Differential Revision: D22803543
fbshipit-source-id: 35dd4ff0a1f1003690b250d5284e48e6abb4f4b1
Summary: The test is passing, enable it.
Reviewed By: genevievehelsel
Differential Revision: D22798421
fbshipit-source-id: aec5302aad38d3413385bf5f0242800d685fb5ef
Summary: The test is passing, enable it.
Reviewed By: genevievehelsel
Differential Revision: D22798424
fbshipit-source-id: 76b99457aacf5a81c2b9b3ebaedd0e6e1cf2a1e8
Summary: The test is passing, enable it.
Reviewed By: genevievehelsel
Differential Revision: D22798422
fbshipit-source-id: 0f5a106be51e319a0d317900cc88de1131b95e4d
Summary: Out of the 5 tests, 3 are passing, let's enable them.
Reviewed By: genevievehelsel
Differential Revision: D22798418
fbshipit-source-id: 5bd3fd90945e5556de7838a5dc61ad00865a6d55
Summary: Half of them are passing, let's make sure they run.
Reviewed By: genevievehelsel
Differential Revision: D22798423
fbshipit-source-id: b762213ffad431de6f54acb8ceffa54f28f5909d
Summary: The test is passing, enable it.
Reviewed By: genevievehelsel
Differential Revision: D22798417
fbshipit-source-id: c45531a75a3db61e33f95ef80eb007f12dd02f2a
Summary: The test is passing, enable it.
Reviewed By: genevievehelsel
Differential Revision: D22794679
fbshipit-source-id: 8d5f5a9b9fd7750034baeda814bfeff29882d409
Summary: These are all passing, enable them.
Reviewed By: genevievehelsel
Differential Revision: D22794158
fbshipit-source-id: c381a906e096c3642248a2521e8e7772d74f992e
Summary: All of the tests are passing, enable them.
Reviewed By: genevievehelsel
Differential Revision: D22794159
fbshipit-source-id: 8b7e33f3abbde88e06488f7fe9ea7217d204e70e
Summary: The test is passing, enable it.
Reviewed By: chadaustin
Differential Revision: D22794161
fbshipit-source-id: 94d167b88782d386ca60b9215f1c2aef56a88a4d
Summary: The only ones not passing are the FUSE ones, which is expected. Enable all the others.
Reviewed By: genevievehelsel
Differential Revision: D22794162
fbshipit-source-id: 31f51fd5fff8e3ced75924533fa536208dabf11d
Summary: The test is passing, enable it.
Reviewed By: chadaustin
Differential Revision: D22794155
fbshipit-source-id: 286c9f1129d279487019206d58521951c768cbb1
Summary: 2 out of 3 tests are passing, let's run these.
Reviewed By: chadaustin
Differential Revision: D22794160
fbshipit-source-id: e62e1956980689b083107ebb3b8340880e0d72a6
Summary:
Besides the 3 listed, all the other 13 are passing, let's make sure we run them
to not regress.
Reviewed By: chadaustin
Differential Revision: D22794164
fbshipit-source-id: 5660cc36365de96a2b9e62e3462c01b39904d2f3
Summary:
To avoid mistakes when refactoring, static_assert that the template
variant of sendReply and make_iovec are only used with POD types.
Reviewed By: genevievehelsel
Differential Revision: D22710742
fbshipit-source-id: 4557761f3946fe8969ce31a42502f64cc3298e1d
Summary:
Previously, we kept a list of weak pointers to RequestData instances
to track live FUSE calls, but the only thing we need is the FUSE
request header, so directly track that instead.
Reviewed By: kmancini
Differential Revision: D22710716
fbshipit-source-id: 3820acb314ac2db85b86de128fd082bc4871d9c6
Summary:
Now that we are explicit about whether the kernel caches must be
invalidated, we can remove a use of folly::RequestContext.
Reviewed By: kmancini
Differential Revision: D22710518
fbshipit-source-id: 4bd5267bf5dd3135adf33e4f4fa1ea2649816564
Summary:
Avoid the cost of dynamically querying whether we are in a FUSE
request handler or not by passing a flag.
Reviewed By: kmancini
Differential Revision: D22710480
fbshipit-source-id: 010bb8efee8074441aa20aab0eb12277452c5252
Summary:
Avoid the cost of dynamically querying whether we are in a FUSE
request handler or not by passing a flag.
Reviewed By: kmancini
Differential Revision: D22710452
fbshipit-source-id: 818035b72b793fa895147d9df3bb668d5b9c55f3
Summary:
Avoid the cost of dynamically querying whether we are in a FUSE
request handler or not by passing a flag.
Reviewed By: kmancini
Differential Revision: D22710422
fbshipit-source-id: 65b0737ad5f8ca74d12f2c657691d3751df4aa54
Summary:
Avoid the cost of dynamically querying whether we are in a FUSE
request handler or not by passing a flag.
Reviewed By: genevievehelsel
Differential Revision: D22710397
fbshipit-source-id: 7c62f45dfc227416c91070842a349b9d0c626cba
Summary:
We are unifying C++ APIs for accessing optional and unqualified fields:
https://fb.workplace.com/groups/1730279463893632/permalink/2541675446087359/.
This diff migrates code from accessing data members generated from unqualified
Thrift fields directly to the `field_ref` API, i.e. replacing
```
thrift_obj.field
```
with
```
*thrift_obj.field_ref()
```
The `_ref` suffixes will be removed in the future once data members are private
and names can be reclaimed.
The output of this codemod has been reviewed in D20039637.
The new API is documented in
https://our.intern.facebook.com/intern/wiki/Thrift/FieldAccess/.
drop-conflicts
Reviewed By: iahs
Differential Revision: D22764126
fbshipit-source-id: 67b1bc6d4a9135f594d78325cee8a194255bdcb8
Summary:
This list blocks ovrsource from being used with eden + mononoke (including scs)
This keeps us from being able to do metadata prefetching for ovrsource.
Reviewed By: wez
Differential Revision: D22539163
fbshipit-source-id: 24f491a82ea554469369f9aed846a1ea75fafc5f
Summary: This is unused, and won't ever be used.
Reviewed By: genevievehelsel
Differential Revision: D22744352
fbshipit-source-id: 9d20db608f972288eaf33e3ea0a79ffe5e13e03e
Summary:
If the LFS server is down, we are going to retry fetching the filenode from the
Mercurial server directly, which is expected not to return a pointer.
Previously, this was achieved by adding a hack into `get_missing`, but since
the function is no longer called during prefetch calls, we cannot rely on it
anymore. Instead, we can wrap the regular remote store and translate all the
StoreKey::Content keys onto their corresponding hgid keys.
Reviewed By: DurhamG
Differential Revision: D22565604
fbshipit-source-id: 2532a1fc3dfd9ba5600957ed5cf905255cb5b3fd
Summary:
The ContentStore code has a strong assumption that all the data fetched ends up
in the hgcache. Unfortunately, this assumption breaks somewhat if an LFS
pointer is in the local store but the blob isn't alongside it. This can happen
for instance when unbundling a bundle that contains a pointer: the pointer will
be written to the local store, but the blob will be fetched into the shared
store.
We can break this assumption a bit in the LFS store code by writing the fetched
blob alongside the pointer; this allows the `get` operation to find both the
pointer and the blob in the same store.
Reviewed By: DurhamG
Differential Revision: D22714708
fbshipit-source-id: 01aedf04d692c787b7cddb0f7a76828ea37dcf29
Summary:
The pointer for a blob might very well be in the local store, so let's search
in it.
Reviewed By: DurhamG
Differential Revision: D22565608
fbshipit-source-id: 925dd5718fc19e11a1ccaa0887bf5c477e85b2e5
Summary: Similarly to the changes made for `get`, the same can be applied to prefetch.
Reviewed By: DurhamG
Differential Revision: D22565609
fbshipit-source-id: 0fbc1a0086fa44593a6aaffb746ed36b3261040c
Summary:
To gradually merge one repo into the other, we need to produce multiple slices of the working copy. The sum of these slices has to be equal to
the whole of the original repo's working copy. To create each of these slices, all files but the ones in the slice need to be deleted from the working copy.
Before this diff, megarepolib would do this in a single delete commit. This however may be impractical, as it would produce huge commits, which we'd be unable
to process adequately. So this diff essentially introduces gradual deletion for each slice, and calls each such sequence of delete commits a "deletion stack". This is how it looks (a copy from the docstring):
```
M1
. \
. D11 (ac5fca16ae)
. |
. D12 (4c57c974e3)
. |
M2 \
. \ |
. D21 (1135339320) |
. | |
. D22 (60419d261b) |
. | |
o \|
| |
o PM
^ ^
| \
main DAG merged repo's DAG
```
Where:
- `M1`, `M2` - merge commits, each of which merges only a chunk
of the merged repo's DAG
- `PM` is a pre-merge master of the merged repo's DAG
- `D11 (ac5fca16ae)`, `D12 (4c57c974e3)`, `D21 (1135339320)` and `D22 (60419d261b)` are commits, each of which deletes
a chunk of the working copy. Delete commits are organized
into delete stacks, so that `D11 (ac5fca16ae)` and `D12 (4c57c974e3)` progressively delete
more and more files.
Reviewed By: StanislavGlebik
Differential Revision: D22778907
fbshipit-source-id: ad0bc31f5901727b6df32f7950053ecdde6f599c
Summary:
Once we start moving the bookmark across the imported commits (D22598159 (c5e880c239)), we need to check dependent systems to avoid overloading them when parsing the commits. In this diff we added the functionality to check Phabricator. We use an external service (jf graphql - find discussion here: https://fburl.com/nr1f19gs) to fetch commits from Phabricator. Each commit id starts with "r", followed by a call sign (e.g. FBS for fbsource) and the commit hash (https://fburl.com/qa/9pf0vtkk). If we try to fetch an invalid commit id (e.g. one without a call sign), we should receive an error. Otherwise, we should receive a JSON response.
An imported commit should have the following query result: https://fburl.com/graphiql/20txxvsn - nodes has one result with the imported field true.
If the commit hasn't been recognised by Phabricator yet, the nodes array will be empty.
If the commit has been recognised, but not yet parsed, the imported field will be false.
If Phabricator hasn't parsed the batch, we will check it again after sleeping for a couple of seconds.
If it has parsed the batch of commits, we move the bookmark to the next batch.
Reviewed By: krallin
Differential Revision: D22762800
fbshipit-source-id: 5c02262923524793f364743e3e1b3f46c921db8d
Summary: On macOS, if you kill a process without waiting on it, you will receive a warning on the terminal saying that the process was killed. To suppress that output, which was messing with the integration tests, use a combination of kill and wait (the custom "killandwait" bash function). It will wait for the process to stop, which is probably what most integration tests would prefer anyway.
Reviewed By: krallin
Differential Revision: D22790485
fbshipit-source-id: d2a08a5e617e692967f8bd566e48f5f9b50cb94d
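A minimal sketch of such a helper; this is hypothetical, and the actual killandwait in the test library may differ in detail:

```shell
# Hypothetical sketch of a kill-and-wait helper.
killandwait() {
  local pid="$1"
  kill "$pid" 2> /dev/null
  # Waiting on the pid reaps it, so the shell never prints the
  # "Killed"/"Terminated" job notice that macOS terminals show.
  wait "$pid" 2> /dev/null
}

sleep 30 &
bg=$!
killandwait "$bg"
# kill -0 probes for existence; after killandwait the process is gone.
if kill -0 "$bg" 2> /dev/null; then result=running; else result=stopped; fi
echo "$result"
```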
Summary: Using "/usr/bin/date" rather than just "date" is very limiting; not all systems have common command line tools installed in the same place. Just use "date".
Reviewed By: krallin
Differential Revision: D22762186
fbshipit-source-id: 747da5a388932fb5b9f4c068014c01ee90a91f9b
Summary: On macOS the default localisation configuration (UTF-8) won't allow operations on arbitrary bytes of data via some commands, because not all byte sequences are valid UTF-8. That is why, when handling arbitrary bytes, it is better to use the "C" locale, which can be achieved by setting the LC_ALL env variable to "C".
Reviewed By: krallin
Differential Revision: D22762189
fbshipit-source-id: aa917886c79fba5ea61ff7168767fc4b052a35a1
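A small illustration of the idea (not taken from the diff itself): byte 0xFF can never appear in valid UTF-8, but with LC_ALL=C every byte is an ordinary character, so text tools handle it without complaint.

```shell
# Write a line containing the invalid-UTF-8 byte 0xFF.
printf 'a\377b\n' > /tmp/lc_all_demo.bin
# In the C locale, "." matches any byte, so the line matches.
matched=$(LC_ALL=C grep -c '.' /tmp/lc_all_demo.bin)
echo "$matched"
```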
Summary: Use brew on MacOS GitHub CI runs to update bash from 3.* to 5.*.
Reviewed By: krallin
Differential Revision: D22762195
fbshipit-source-id: b3a4c9df7f8ed667e88b28aacf7d87c6881eb775
Summary: MacOS uses FreeBSD version of command line tools. This diff uses brew to install the GNU tooling on GitHub CI and uses it to run the integration tests.
Reviewed By: krallin
Differential Revision: D22762198
fbshipit-source-id: 1f67674392bf6eceea9d2de02e929bb3f9f7cadd
Summary:
The simplest fix so far is to erase accessed bookmarks state before switching
New cloud join will
Reviewed By: markbt
Differential Revision: D22791409
fbshipit-source-id: 9675ec03c275e42e640d3a95dd5eda2ae084b92b
Summary:
On unexpected errors like missing blobstore keys the walker will now log the preceding node (source) and an interesting step to this node (not necessarily the immediately preceding, e.g. the affected changeset).
Validate mode produces route information with interesting tracking enabled, scrub currently does not to save time+memory. Blobstore errors in scrub mode can be reproduced in validate mode when the extra context from the graph route is needed.
Reviewed By: farnz
Differential Revision: D22600962
fbshipit-source-id: 27d46303a2f2c07219950c20cc7f1f78773163e5
Summary:
Now that we can configure ACL checking on a per-repo basis, use the
`enforce_acl_check` config option as a killswitch to quickly disable ACL
enforcement, if required.
Further, remove the `acl_check` config flag that was always set to True.
As part of this change I've refactored the integration test a little and
replaced the phrase "ACL check" with "ACL enforcement", as we always check the
ACL inside of the LFS server.
Reviewed By: krallin
Differential Revision: D22764510
fbshipit-source-id: 8e09c743a9cd78d54b1423fd2a5cfc9bf7383d7a
Summary: It was once used by EdenFS, but is now dead code, no need to keep it around.
Reviewed By: singhsrb
Differential Revision: D22784582
fbshipit-source-id: d01cf5a99a010530166cabb0de55a5ea3c51c9c7
Summary: These were only used in tests, no need to keep it.
Reviewed By: chadaustin
Differential Revision: D22744353
fbshipit-source-id: 57596d641ab85f15e8c945327d7849a64aa73ef8
Summary:
Both unices and Windows needs to invalidate the cache in the same place, let's
avoid ifdef and consolidate the function name to clean things up a bit.
Reviewed By: chadaustin
Differential Revision: D22741709
fbshipit-source-id: 04060c0080eff9840abd22747ea48404fa50fd86
Summary:
We're experimenting with enabling NO_OPEN support in our internal
build of the osxfuse kext. This commit includes the relevant capability bits
for the kernel interface (which are compatible with the linux FUSE
implementation) as well as adjusts our FUSE client code to detect and use
those bits on macOS.
Reviewed By: xavierd
Differential Revision: D22744378
fbshipit-source-id: 21376439a85b0b0f5a71916dd1af618d9627695e
Summary:
When Mercurial is outside a repo and you run a repo-required command,
it will try to construct the localrepo and fail. Unfortunately the dynamicconfig
loading logic treated it like a repo-creation event, and tried to load the
dynamicconfig in memory, expecting the repo to be created.
Since we do an http request during this, if we were offline it would hang
for 30 seconds every single time. Let's avoid this code path unless creating a
repo.
A separate diff will lower the 30 second timeout.
Reviewed By: kulshrax
Differential Revision: D22710370
fbshipit-source-id: e341e9230d2fdba80059ca086f0f12494a10c5d6
Summary: Make `store` the first argument for all of the EdenAPI Python methods. I've found this arrangement to be more ergonomic when working with the client later in the stack.
Reviewed By: quark-zju
Differential Revision: D22703915
fbshipit-source-id: b0ca900d969ec86ee91e8c62d281c2102860e9ef
Summary: Add a small Python wrapper class around the Rust EdenAPI client. The intention of this class is to allow for proper SIGINT handling during FFI calls as well as better handling of Python exceptions coming from EdenAPI.
Reviewed By: quark-zju
Differential Revision: D22703916
fbshipit-source-id: 33d80f616c55a607075d23dda448064115970b55
Summary: Some versions of sqlite don't allow using the LIKE operation on BLOB data, so first cast it to TEXT. This test was failing on Linux runs on GitHub.
Reviewed By: krallin
Differential Revision: D22761041
fbshipit-source-id: 567d68050297c3a2ac781b252d3e9b21ea5b2201
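The workaround can be sketched like this; the table and column names are made up for illustration, not taken from the failing test:

```shell
# Store a BLOB value, then match it with LIKE by first casting to TEXT,
# which works even on sqlite builds where LIKE won't match BLOB operands.
db=$(mktemp)
sqlite3 "$db" "CREATE TABLE bookmarks (name BLOB);
               INSERT INTO bookmarks VALUES (CAST('master_bookmark' AS BLOB));"
hits=$(sqlite3 "$db" "SELECT COUNT(*) FROM bookmarks
                      WHERE CAST(name AS TEXT) LIKE 'master%';")
echo "$hits"
rm -f "$db"
```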
Summary: Have a comprehensive list of OSS tests that do not pass yet.
Reviewed By: krallin
Differential Revision: D22762196
fbshipit-source-id: 19ab920c4c143179db65a6d8ee32974db16c5e3d
Summary:
We've seen flaky tests here on stress test runs; let's subtract an epsilon of one second from time.time() to ensure that the comparison is only made at one-second precision (time.time()'s worst-case precision).
An example error: `AssertionError: 1595414015.5022793 not greater than or equal to 1595414015.5025299`
`AssertionError: 1595759609.439305 not greater than or equal to 1595759609.441682`
Reviewed By: chadaustin
Differential Revision: D22687496
fbshipit-source-id: 8e87148d620577e3198d2845d785a87a909cd1d3
Summary:
Update the LFS server to use the `enforce_lfs_acl_check` to enforce
ACL checks for specific repos and also reject clients with missing idents.
In the next diff, I will use the existing LFS server config's
`enforce_acl_check` flag as a killswitch.
Reviewed By: krallin
Differential Revision: D22762451
fbshipit-source-id: 61d26944127711f3503e04154e8c079ae75dc815
Summary:
Even though we never asked for pre-rename notifications, ProjectedFS would
send them to us anyway, and since we now fail on unrecognized
notifications, that meant renames would always fail. Handle them as no-ops.
I'm not sure yet if we want to move the actual rename logic into the pre-rename
callback, or keep it as is. The benefit of the pre-rename code is that it can
fail the user request, while the current code will fail but not prevent the
rename from happening.
Reviewed By: wez
Differential Revision: D22738388
fbshipit-source-id: 487e1f90b503bc59cff7315dd38d2a3039552eaf
Summary: This will make it easier to support PrjFS async programming model.
Reviewed By: chadaustin
Differential Revision: D22738408
fbshipit-source-id: 2e2e4f6f2f718b0226cf9fab66589c50b6db49db
Summary:
The pattern when using it is to catch the exception, then call the function,
which then would re-throw it and catch the subtyped exception. We can avoid a
re-throw entirely by having the function take the exception that was thrown in
the first place and dynamic cast it.
Reviewed By: chadaustin
Differential Revision: D22736772
fbshipit-source-id: 0efa3134bccf3ba8bdcd51d67e03c7ee4483a99f
Summary:
We're still reading the entire file at once, but this paves the way to not do
that.
While the change looks big, a lot of it is just moving code around. The main
gist of it is removing EdenMount::readFile and writing the proper future
combinator in EdenDispatcher::getFileData.
Reviewed By: wez
Differential Revision: D22361748
fbshipit-source-id: 6391a29d25a4c9e61b91952c40c21ad52e728c8b
Summary:
This would enable the enumeration callback to use ProjectedFS asynchronous
completion.
Reviewed By: chadaustin
Differential Revision: D22361747
fbshipit-source-id: b2d31533ee5128e9dd3da7f91d5225331cf8e926
Summary:
We've had a long latent bug where for some reason a file that should be in the
repo isn't visible on the filesystem. Turning on logging and running `ls` shows
that EdenFS tells ProjectedFS about the file, but it's somehow not shown. The
reason being that ProjectedFS keeps track of all the removed files by adding a
tombstone for these, and will not list them until EdenFS removes this
tombstone.
Looking at the various places where we invalidate FUSE's cache, but not
ProjectedFS, only one stands out, and it has to do with files not being present
in the working copy, but added in the update destination. Adding the right
flush there appears to solve the simple repro listed below.
I'll remove all the ifdef around flushing in a later diff to avoid this issue
from re-appearing again.
Reviewed By: fanzeyi
Differential Revision: D22739916
fbshipit-source-id: 3a4fbc825cd21b36cbd2616882fd50e3d9741f63
Summary:
Let's not take a lease by default so that derived_data_tailer can make progress even if all other services are failing to derive.
One note - we don't remove the lease completely; rather, we use another lease that's separate from the lease used by other Mononoke services. The motivation here is to make sure we don't derive unodes 4 times - blame, deleted_file_manifest and fastlog all want to derive unodes, and with no lease at all they would all derive the same data a few times. Deriving unodes a few times seems undesirable, so I suggest using an InProcessLease instead of no lease.
Reviewed By: krallin
Differential Revision: D22761222
fbshipit-source-id: 9595705d955f3bb2fe7efd649814fc74f9f45d54
Summary:
Add log sequence numbers to the scuba sample builder. This provides an ordering
over the logs made by an individual instance of Mononoke, allowing them to be
sorted.
Reviewed By: krallin
Differential Revision: D22728880
fbshipit-source-id: 854bde51c7bfc469677ad08bb738e5097cb05ad5
Summary:
We have two deficiencies to correct in here; modernise the code without changing behaviour first to make it easier to later fix them.
Deficiency 1 is that we always call the `on_put` handler; we need a mode that doesn't do that unless a blobstore returns an error, for jobs not waiting on a human.
Deficiency 2 is that we accept a write after one blobstore accepts it; there's a risk of that being the only copy if a certain set of race conditions are met
Reviewed By: StanislavGlebik
Differential Revision: D22701961
fbshipit-source-id: 0990d3229153cec403717fcd4383abcdf7a52e58
Summary:
Switching workspaces is good to have before deprecating old-style backups.
Otherwise it will be too confusing for users who would like to keep their work
on different hosts separate.
Reviewed By: markbt
Differential Revision: D22692393
fbshipit-source-id: abae7667ce24465e69613f3cdd4cd01471fc7704
Summary:
as in title.
Since we haven't tested it much yet, I've added a note that this feature is
experimental.
Reviewed By: krallin
Differential Revision: D22760648
fbshipit-source-id: 33f858b0021939dabbe1894b08bd495464ad0f63
Summary:
Move changeset_fetcher building to a separate function, because
build_skiplist_index is already rather large and I'm going to make it larger in
the next diff.
Reviewed By: krallin
Differential Revision: D22760556
fbshipit-source-id: 800baba052f46ed817f011f71dd28d40e98245fe
Summary:
Currently our skiplists store a skip edge for almost all public commits. This
is problematic for a few reasons:
1) It uses more memory
2) It increases the startup time
3) It makes startup flakier. We've noticed a few times that our backend storage
returns errors more often when we try to download large blobs.
Let's change the way we build the skiplist. Let's not index every public
changeset we have, but rather index more selectively. See the comments for more details.
Reviewed By: farnz
Differential Revision: D22500300
fbshipit-source-id: 7e9c887595ba11da80233767dad4ec177d933f72
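One way the selective indexing above could look (this is a guess at the strategy, not the actual rule from the diff; `should_index`, `step`, and the generation-based criterion are all hypothetical): only give skip edges to commits at regular generation intervals, and let lookups walk plain parent edges until they reach an indexed node.

```rust
// Hypothetical sparse indexing rule: only commits whose generation number
// is a multiple of `step` get a skip edge; all others fall back to
// ordinary parent traversal until an indexed node is reached.
fn should_index(generation: u64, step: u64) -> bool {
    step > 0 && generation % step == 0
}

// Indexed nodes among the first `n` generations: roughly n / step, so
// skiplist memory and startup cost shrink by the same factor.
fn indexed_count(n: u64, step: u64) -> u64 {
    (0..n).filter(|g| should_index(*g, step)).count() as u64
}
```

The trade-off is a bounded amount of extra parent-edge walking (at most `step - 1` hops) in exchange for a skiplist that is `step` times smaller to store and load.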
Summary:
This adds logging for data fetches that come from the thrift globfiles call to
help debug the cause of unexpected data fetches. (See D22448048 for more
motivation)
Reviewed By: genevievehelsel
Differential Revision: D22489512
fbshipit-source-id: 040cf1277205d08ea864a1f30f8d3b25ee7c4508
Summary:
This adds logging for files fetched in prefetch, like was already added for
blob and tree fetches.
This is needed to log the fetches caused by the glob files thrift call. The
purpose of this to help debug the cause of unexpected data fetches (See
D22448048 for more motivation).
Reviewed By: genevievehelsel
Differential Revision: D22561619
fbshipit-source-id: 5ae78b99fb0c7d863d8223b93492b0d0210ddf9e
Summary:
This adds logging for data fetches that come from the thrift getSHA1 call to
help debug the cause of unexpected data fetches. (See D22448048 for more
motivation)
Reviewed By: genevievehelsel
Differential Revision: D22489514
fbshipit-source-id: eb2d82c206af857cc79439d2854d682641292db8
Summary:
This adds logging for data fetches that come from the thrift getFileInformation call to
help debug the cause of unexpected data fetches. (See D22448048 for more
motivation)
Reviewed By: genevievehelsel
Differential Revision: D22489513
fbshipit-source-id: ab6283476d05b06b9f9e37c6b4fd81c1282046ff
Summary:
This adds logging for data fetches that come from the thrift checkout call to
help debug the cause of unexpected data fetches. (See D22448048 for more
motivation)
Reviewed By: chadaustin
Differential Revision: D22489504
fbshipit-source-id: 3b732a1e5627c2130f561ec0138a1df270e1925d
Summary:
We have seen that some of the unexpected data fetches do not originate from
FUSE. This adds parity to the logging for data fetches that come from the thrift
interface. Adding this logging improves the overall observability of eden, and
will help us debug the cause of unexpected data fetching.
This introduces plumbing to allow logging data fetches that originate from
thrift requests.
Reviewed By: chadaustin
Differential Revision: D22448048
fbshipit-source-id: a39dde72467c4922c07c569c14fb499341d40258
Summary:
What does this do: allocate a ThriftLogHelper and ensure it lives for the
duration of its corresponding thrift request.
Why: By ensuring that the log helper's lifetime matches that of the thrift
request, we simplify adding to the logger. We won't have to think about the
lifetime of each of the members of the logger; we can simply stick them in the
log helper and let shared_ptr do its magic.
More specific why: This refactor helps make adding Thrift data fetch logging
cleaner (see following changes). We need to ensure the ObjectFetchContext
stays around for the duration of the thrift request since we pass it around
by reference. Sticking this in ThriftLogHelper avoids adding another disjoint
piece of plumbing and makes the code easier to maintain going forward.
Reviewed By: genevievehelsel
Differential Revision: D22632546
fbshipit-source-id: 1baa79419386947e52a386d89a65f032f1988622
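The lifetime-matching idea can be illustrated with a small RAII guard in Rust (the document's C++ uses shared_ptr; this sketch uses `Drop` to the same effect). `ThriftLogGuard` and its fields are hypothetical names, not the real ThriftLogHelper API.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::time::Instant;

// Guard whose lifetime matches the request: whatever the logger needs
// (e.g. a fetch context) can live inside it, and the final log happens
// exactly once, when the guard is dropped at request completion.
struct ThriftLogGuard {
    method: &'static str,
    started: Instant,
    finished: Arc<AtomicBool>,
}

impl ThriftLogGuard {
    fn new(method: &'static str) -> Self {
        Self {
            method,
            started: Instant::now(),
            finished: Arc::new(AtomicBool::new(false)),
        }
    }

    // Lets a test (or metrics sink) observe when the request finished.
    fn finished_flag(&self) -> Arc<AtomicBool> {
        self.finished.clone()
    }
}

impl Drop for ThriftLogGuard {
    fn drop(&mut self) {
        // Runs when the request is done, however it ends; a real helper
        // would emit the log sample here using method name and elapsed time.
        let _ = (self.method, self.started.elapsed());
        self.finished.store(true, Ordering::SeqCst);
    }
}
```

Anything stored in the guard is guaranteed to outlive every borrow taken during the request, which is exactly why passing an `ObjectFetchContext` by reference becomes safe in the C++ version.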
Summary:
This adds `megarepolib` support for pre-merge "delete" commits creation.
Please see the `create_sharded_delete_commits` docstring for an explanation of what
these "delete" commits are.
This functionality is not used anywhere yet (I intend to use it from
`megarepotool`), so I've added what I think is a reasonable test coverage.
Reviewed By: StanislavGlebik
Differential Revision: D22724946
fbshipit-source-id: a8144c47b92cb209bb1d0799f8df93450c3ef29f
Summary: This is unused, no need to keep it around.
Reviewed By: wez
Differential Revision: D22361749
fbshipit-source-id: 3c353776437b59c6c7735652f7eb1ce052215e11
Summary:
Instead of overloading the FileMetadata struct that is used for directory
listing, let's use one tailored for our use. This will enable more flexibility
when deciding what to provide ProjectedFS. For instance, we could send the
InodeNumber so we don't need to do expensive path resolution for every notification.
Reviewed By: wez
Differential Revision: D22361751
fbshipit-source-id: 4801be45d8afc3af51e0a9564d9acb0a8e32255a
Summary:
Futures are intended to be chained together and not synchronously waited one
after the other. While this may be one of the goals, it also paves the way to
enable ProjectedFS asynchronous notification handling.
While doing this, a bunch of code was moved from EdenMount.cpp to the
dispatcher itself; the rationale behind this is to follow what the unix
EdenDispatcher does, with the long-term plan to merge the two as much as possible.
Reviewed By: wez
Differential Revision: D22361750
fbshipit-source-id: fa679a8b94ff6f8b5a33782fdb6b129ab066c4d8
Summary:
Whenever EdenFS reads a file from the overlay, there is a risk of this
triggering a prjfs notification, especially when there is a discrepancy between
the actual state of the overlay, and what EdenFS thinks the overlay should look
like.
A simple example of this includes removing a materialized file in the overlay
when EdenFS isn't running, and running `hg status` when it's back up. The
status call will try to compute the sha1 of the file, and thus will call into
the `CreateFile` API, which will then trigger an EdenFS callback to be called.
This callback will then try to recursively acquire the inode lock, which is
already held by the sha1 computation code. And that's a textbook example of a
deadlock.
To avoid this in the first place, let's simply refuse recursive callbacks. The
sha1 code will either have to be modified to handle errors, or we need to scan
the overlay when mounting the repo.
Reviewed By: fanzeyi
Differential Revision: D22288431
fbshipit-source-id: 2b4b31eddf7debcde55c22dd704d198ec44e59b4
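The "refuse recursive callbacks" fix can be sketched as a thread-local reentrancy guard. This is a minimal Rust illustration of the technique, not the actual EdenFS (C++) code; `handle_callback` and the error message are hypothetical.

```rust
use std::cell::Cell;

thread_local! {
    // Set while a ProjFS callback is being handled on this thread.
    static IN_CALLBACK: Cell<bool> = Cell::new(false);
}

// Hypothetical callback entry point: a callback that re-enters on the same
// thread (e.g. sha1 computation reading a file triggers CreateFile, which
// would re-acquire the inode lock) is refused instead of deadlocking.
fn handle_callback<F>(f: F) -> Result<(), String>
where
    F: FnOnce() -> Result<(), String>,
{
    IN_CALLBACK.with(|flag| {
        if flag.get() {
            return Err("recursive ProjFS callback refused".to_string());
        }
        flag.set(true);
        let result = f();
        flag.set(false);
        result
    })
}
```

The guard is per-thread, so independent callbacks on other threads proceed normally; only the same-thread recursion that would deadlock on the inode lock is rejected.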
Summary:
At mount time, EdenFS will try to create the .eden/config file which will
indicate to the client that this is an EdenFS repo, this config also contains
the location of the socket to talk to EdenFS.
On unices, while the .eden directory is slightly different, its content is
written into the overlay; on Windows, however, the overlay is the repository
itself. What this means is that creating this config file will end
up triggering a ProjFS callback, which can potentially lead to deadlocks if
we're not careful.
A future change will simply prevent these recursive callbacks from happening,
and thus a solution needs to be found for the .eden/config file itself. Since
the file itself is tiny (about 400B) and should only be accessed once[0], the
simple approach of keeping it in memory and special-casing it works perfectly.
[0]: Once a file was read fully by ProjFS, it's present in the overlay and
EdenFS will no longer be requested to provide it.
Reviewed By: chadaustin
Differential Revision: D22310734
fbshipit-source-id: 6b2dba2164496ebd251104d7875b51569be2471f
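The special-casing can be sketched as an in-memory file map consulted before the overlay. This is a hedged Rust sketch; `Mount`, `in_memory_files`, and the config file content shown are illustrative, not EdenFS's real structures.

```rust
use std::collections::HashMap;

// Hypothetical read path: the tiny .eden/config file is served straight
// from memory, so reading it can never bounce through ProjFS and trigger
// a recursive callback.
struct Mount {
    in_memory_files: HashMap<String, Vec<u8>>,
}

impl Mount {
    fn new(socket_path: &str) -> Self {
        // Illustrative config content only; the real file also records
        // other client information.
        let config = format!("[Config]\nsocket = {}\n", socket_path);
        let mut in_memory_files = HashMap::new();
        in_memory_files.insert(".eden/config".to_string(), config.into_bytes());
        Mount { in_memory_files }
    }

    // Special-case in-memory files; a real implementation would fall back
    // to the overlay / backing store for every other path.
    fn read(&self, path: &str) -> Option<&[u8]> {
        self.in_memory_files.get(path).map(|v| v.as_slice())
    }
}
```

Since the file is only a few hundred bytes and read once per client, holding it in memory costs essentially nothing.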
Summary:
Files need to be decoded before they're passed to Mercurial
prefetching.
Reviewed By: singhsrb
Differential Revision: D22736198
fbshipit-source-id: f27a85442755d9fcf6c3a572c02a23d59f314d20
Summary:
When using LFS, it's possible that a pointer may be present in the local
LfsStore, but the blob would only be in the shared one. Such a scenario can
happen after an upload, when the blob is moved to the shared store, for
instance. In this case, during a `get` call, the local LFS store won't be able
to find the blob and thus would return Ok(None); the shared LFS store would not
be able to find the pointer itself and would thus return Ok(None) too. If the
server is not aware of the file node itself, the `ContentStore::get` would also
return Ok(None), even though all the information is present locally.
The main reason why this is happening is that the `get` call operates
primarily on filenode-based keys, and for content-based stores (like LFS),
this means that the translation layer needs to be present in the same store,
which in some cases may not be true. By allowing stores to return a
`StoreKey` when progress was made in finding the key, we can effectively solve
the problem described above: the local store translates the filenode key
into a content key, and the shared store reads the blob properly.
Reviewed By: DurhamG
Differential Revision: D22565607
fbshipit-source-id: 94dd74a462526778f7a7e232a97b21211f95239f
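The key-translation mechanism can be sketched as follows. This is a simplified Rust illustration, not the real `ContentStore` types; the enum variants, the `Fetch` type, and the example hash value are all assumptions for the sketch.

```rust
// Hypothetical key type: a store can make partial progress by translating
// a filenode-based key into a content-based one.
#[derive(Clone, PartialEq, Debug)]
enum StoreKey {
    HgFilenode(String),
    Content(String),
}

// What a store's `get` can now report: instead of a bare "not found",
// it may hand back a (possibly translated) key for the next store to try.
enum Fetch {
    Found(Vec<u8>),
    NotFound(StoreKey),
}

// Local store: holds the LFS pointer but not the blob, so it translates
// the filenode key into the content key recorded in the pointer.
fn local_get(key: StoreKey) -> Fetch {
    match key {
        // "sha256:abc" stands in for the content hash read from the pointer.
        StoreKey::HgFilenode(_) => Fetch::NotFound(StoreKey::Content("sha256:abc".to_string())),
        k => Fetch::NotFound(k),
    }
}

// Shared store: only understands content keys, and that's where the blob is.
fn shared_get(key: StoreKey) -> Fetch {
    match key {
        StoreKey::Content(_) => Fetch::Found(b"blob".to_vec()),
        k => Fetch::NotFound(k),
    }
}
```

Chaining the two stores with the translated key now succeeds locally, where previously both returned Ok(None) and the lookup fell through to the server.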
Summary:
In some rare situations, it is possible to have an LFS pointer in both the
packfile and the LFS store. In this case, the flags for this filenode may or
may not be empty, making the assert too strong.
Reviewed By: DurhamG
Differential Revision: D22565605
fbshipit-source-id: b82282b3f47af2a9e607f09a7a7d271ecc4e521a
Summary:
Disregard two default systemd services on Fedora 32 that cause our
systemd tests to fail.
I believe genevievehelsel is planning on removing this code soon anyway.
Reviewed By: genevievehelsel
Differential Revision: D22713393
fbshipit-source-id: b703b23a3158cb007dc2e1cb53fae36be7282719
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/32
This parsing uses the standard "subject name" field of an x509 certificate to create MononokeIdentity.
Reviewed By: farnz
Differential Revision: D22627150
fbshipit-source-id: 7f4bfc87dc2088bed44f95dd224ea8cdecc61886
Summary: If fsnodes point to non-existent content we should be able to detect that.
Reviewed By: farnz
Differential Revision: D22723866
fbshipit-source-id: 31510aada5e21109b498a26e28e0f6f3b7358ec4
Summary: Previously, `BackingStore` and all its sub-classes' `getBlob` and `getTree` methods accepted both `ObjectFetchContext` and `ImportPriority` as arguments. Now, `ImportPriority` is removed because we can get the priority from `ObjectFetchContext`.
Reviewed By: kmancini
Differential Revision: D22650629
fbshipit-source-id: e1b0c57a059f11504b28b2c17d698bb58f51e1ee