Summary:
*shakes fist at C++ copy-constructors*
We weren't guaranteeing that case-insensitive status was being propagated on
copies or moves, which meant that e.g. `lookup("workspace")` would be treated as
case-sensitive when mounted case-insensitively.
Reviewed By: xavierd
Differential Revision: D23857218
fbshipit-source-id: 67e33a8455a0a85e5885389b5bb38b20ef043894
Summary: This can be included in all the supported platforms, thus don't #ifdef it.
Reviewed By: wez
Differential Revision: D23858664
fbshipit-source-id: e61da33a97d87cbfab5bb96d2bdaa865d2c01801
Summary:
This enables autodeps, and brings us one step closer to building EdenFS with
Buck on Windows.
Reviewed By: fanzeyi
Differential Revision: D23857794
fbshipit-source-id: c8587a6f7b9e4d9575a62f592c1d0737dff2a8f0
Summary:
Now that prjfs/EdenDispatcher is no longer directly tied to ProjectedFS (sort
of, this is still a bit WIP), we can move it out of the prjfs directory into
the inodes one. This allows breaking the dependency cycle mentioned in
the previous diff where prjfs and inodes depend on each other.
For now, this is merely a copy/paste of the code enclosed in a big #ifdef _WIN32;
we might be able to do better later, but for now this is probably good enough.
Reviewed By: wez
Differential Revision: D23857539
fbshipit-source-id: 77c620bac1656d01d7daee4dbf8b10694a589751
Summary:
Looking at the dependency graph for EdenFS on Windows, we would notice
that the eden_prjfs target depends on eden_inodes, and vice versa,
causing a cycle. While CMake is perfectly happy with that, Buck isn't.
The solution to removing this cycle is to move the code that needs the
dependency on eden_inodes into the eden_inodes target, and that's the
EdenDispatcher. However, since PrjfsChannel needs to hold a dispatcher to call
into it, it needs to know the methods exposed by the dispatcher. To achieve
this, a simple abstract class is added; this is the same as what is done for
FUSE.
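The interface split described above can be sketched as a toy Python analogue (the real code is a C++ abstract class; all names here are illustrative, not the actual API):

```python
from abc import ABC, abstractmethod

class Dispatcher(ABC):
    # The channel only knows this abstract interface, so the concrete
    # dispatcher can live in a different build target (eden_inodes)
    # without creating a dependency cycle with the channel's target.
    @abstractmethod
    def lookup(self, path):
        ...

class EdenDispatcher(Dispatcher):
    # Concrete implementation, defined in the inodes layer.
    def lookup(self, path):
        return "inode for %s" % path

class Channel:
    # Holds a dispatcher through the abstract interface only.
    def __init__(self, dispatcher):
        self.dispatcher = dispatcher
```

The channel target then depends only on the interface, breaking the cycle.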
Reviewed By: wez
Differential Revision: D23857540
fbshipit-source-id: c495c67d43724f648e5ffa17776e4d5d4513698a
Summary: The Windows code is now free of sformat, and only depends on fmt::format.
Reviewed By: fanzeyi
Differential Revision: D23786940
fbshipit-source-id: 4c9d4a8c30e5a3d5eb7971cdc3139539c5893a8d
Summary:
Now that all the logic is in PrjfsChannel.cpp, this header no longer needs to
be included. This helps in making this file as free as possible of
Windows-specific bits, which will allow it to be moved into the top-level
inodes directory.
Reviewed By: fanzeyi
Differential Revision: D23786941
fbshipit-source-id: a4a3a47ea00808ca7a839709068efab06436167f
Summary:
Most of the non-test callers have an ObjectFetchContext available; let's use
it instead of a static null context. This will help in understanding why some
files/trees are being fetched.
Reviewed By: chadaustin
Differential Revision: D23745752
fbshipit-source-id: b2e145d9e559bde0542adbc5b20ff56ccfc59ece
Summary:
This allows removal of a NullContext which will provide more details as to why
fetches are triggered.
Reviewed By: chadaustin
Differential Revision: D23745755
fbshipit-source-id: c26f648b8695b86caf8fe15c4bc6c4128d345aa1
Summary:
Similarly to the other callbacks, this makes the main function return to
ProjectedFS as soon as the future is created, which will allow it to be
interrupted in a subsequent diff.
Reviewed By: fanzeyi
Differential Revision: D23745754
fbshipit-source-id: 2d77d0eacfe0d37eb9075bf9f0660e4f4af77e8f
Summary:
This merely moves Windows bits out of the EdenDispatcher and into the
PrjfsChannel, making the former less dependent on Windows, and paving the way
to handling this callback fully asynchronously.
One caveat currently is that while the callback supports specifying an offset
and length, the underlying backing store only allows reading the entire file at
once, thus these arguments are simply ignored in the dispatcher.
Reviewed By: fanzeyi
Differential Revision: D23745753
fbshipit-source-id: 266e1f448f9db536d746da1462a2a590ffad19a6
Summary:
Move some commit cloud operations onto the infinitepush read path.
Those are:
* the `hg cloud check` command
* the `hg cloud sync` command when the local repo is clean
* the `hg cloud switch` command, which will normally use the read path for the dest workspace because we clean up the repo before performing the switch
* the `hg cloud rejoin` command we use in fbclone, which will normally go through the read path as it runs in a fresh repo
If something is broken, there is always a way to rerun any of these commands with the '--dest' flag pointing to the write path.
```
./hg cloud check -r 0c9596fd1 --remote --dest infinitepush-write
./hg cloud sync --dest infinitepush-write
./hg cloud switch -w other --dest infinitepush-write
```
Those use cases are limited and the lag of the forward filler shouldn't be noticeable for them, but we will be able to collect more signal on how Mononoke performs with Commit Cloud.
Sitevar to control the routing of read traffic:
https://www.internalfb.com/intern/sv/HG_SSH_WRAPPER_MONONOKE_ROLLOUT/#revisions_list
Reviewed By: mitrandir77
Differential Revision: D23840914
fbshipit-source-id: 40fbe2e72756e7a4cf8bc5be6a0b94f6cf4906b4
Summary:
At the moment CommitSyncConfig can be set in two ways:
1) Set in the normal mononoke config. That means that config can be updated
only after a service is restarted. This is an outdated way, and won't be used
in prod.
2) Set in separate config which can be updated on demand. This is what we are
planning to use in production.
create_commit_syncer_from_matches was used to build a CommitSyncer object via
the normal mononoke config (i.e. the outdated option #1). According to the comments it
was done so that we could read configs from disk instead of configerator, but
this doesn't make sense because you can already read configerator configs
from disk. So create_commit_syncer_from_matches doesn't look particularly
useful, and besides it also makes further refactorings harder. Let's remove it.
Reviewed By: ikostia
Differential Revision: D23811090
fbshipit-source-id: 114d88d9d9207c831d98dfa1cbb9e8ede5adeb1d
Summary:
With the segmented changelog backend, the revs can change even if len(repo)
didn't. Cached revs might not get invalidated properly. Let's cache
head nodes instead.
Reviewed By: DurhamG
Differential Revision: D23856176
fbshipit-source-id: c5154c536298c348b847a12de8c4f582f877f96e
Summary:
On Ubuntu the output is a bit different:
```
$ hg cloud sync --use-bgssh
commitcloud: synchronizing 'server' with 'user/test/default'
- remote: /bin/sh: trashssh: command not found
- abort: no suitable response from remote hg!
+ remote: /bin/sh: 1: trashssh: not found
+ abort: no suitable response from remote hg: '[Errno 32] Broken pipe'!
```
Glob them out to make the test pass.
Reviewed By: DurhamG
Differential Revision: D23824735
fbshipit-source-id: 7f96149ee16daff31fd0a1c68975b5edfa27cc46
Summary:
It seems OSX python2 has SIGINT handler set to SIG_IGN by default when running
inside tests. Detect that and reset SIGINT handler to raise KeyboardInterrupt.
This fixes test-ctrl-c.t on OSX.
While we're here, improve test-ctrl-c.t so it checks a few more things and runs
quicker.
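A minimal sketch of the detect-and-reset idea (the helper name is hypothetical, not the actual patch):

```python
import signal

def ensure_sigint_raises():
    # If SIGINT was inherited as SIG_IGN (as seen with OSX python2
    # under the test runner), restore Python's default handler so
    # Ctrl-C raises KeyboardInterrupt again.
    if signal.getsignal(signal.SIGINT) == signal.SIG_IGN:
        signal.signal(signal.SIGINT, signal.default_int_handler)
```

After this runs, a delivered SIGINT raises KeyboardInterrupt as tests expect.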
Reviewed By: DurhamG
Differential Revision: D23853455
fbshipit-source-id: 05c47650bc80f9880f724828d307c32786265e2c
Summary:
There's a [flaw](https://fb.workplace.com/groups/scm.mononoke/permalink/1220069065022333) in the current `synced_commit_mapping` data model. In a nutshell, the flaw is in the assumption that the `RewrittenAs` relationship is `1:1`, while in fact it is `1:n` with `n` on the large repo side.
To address this flaw I propose to:
- relax the DB constraints to represent the semantically correct model
- select all the synced candidates from the DB
- for places in code, which require a single mapping for a candidate, use the provided hint to resolve any ambiguity
More concretely:
- instead of a single `CommitSyncOutcome` struct, I propose to have the "canonical" `PluralCommitSyncOutcome` and the "resolved" `CommitSyncOutcome`
- every variant of `PluralCommitSyncOutcome` that is not `RewrittenAs` just maps to an identical variant of `CommitSyncOutcome`
- have a `CandidateSelectionHint` passed from the clients, which would help resolve `PluralCommitSyncOutcome::RewrittenAs` into a `CommitSyncOutcome::RewrittenAs`
- if the hint does not help to resolve `PluralCommitSyncOutcome::RewrittenAs` into an unambiguous `CommitSyncOutcome::RewrittenAs`, just fail the request and require human intervention to deal with things
- within the hint, have the following variants for the resolution algorithm:
- `Only` which fails the resolution if there's more than one candidate
- `Exact` behaves like `Only` if there's one candidate, otherwise selects a provided candidate
- `OnlyOr(Ancestor|Descendant)Of(Commit|Bookmark)` behave like `Only` if there's one candidate, otherwise select a candidate in the expected topological relationship
Note some important decisions, that may be surprising at first:
- if there's just one candidate, resolutions with all types of hints succeed, even if this candidate does not fit the hint otherwise (for example, if the hint is `Exact(A)`, and the list of candidates is `[B]`, the resolution succeeds).
- for bookmark-related hints, if the bookmark does not exist at the time of resolution, the hint just "downgrades" itself to be `Only`
Both of these emphasize the fact that if the mapping has only one `RewrittenAs` candidate for a given changeset, the behavior does not change.
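A toy Python model of the resolution rules above (illustrative only; the real implementation is Rust with richer hint variants, and `resolve_candidates` is a made-up name):

```python
def resolve_candidates(candidates, hint=None):
    # - a single candidate always resolves, regardless of the hint
    # - Exact(X) picks X from multiple candidates when present
    # - no hint behaves like `Only`: fail on any ambiguity
    if len(candidates) == 1:
        return candidates[0]
    if hint is not None and hint in candidates:
        return hint
    raise ValueError("ambiguous mapping; human intervention required")
```

This captures the key property that a single-candidate mapping behaves exactly as before.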
Reviewed By: StanislavGlebik
Differential Revision: D23670180
fbshipit-source-id: 1cee1f65fc8020e0ae8a7da789b2532d2e436b77
Summary: SomeFailedOthersNone should not consider write mostly blobstores None if all other stores Error
Reviewed By: farnz
Differential Revision: D23840334
fbshipit-source-id: 9838bead6fec0d5f920e4a788387025d0dacf80b
Summary: Add a test for SomeFailedOthersNone when write mostly blobstore is None
Reviewed By: farnz
Differential Revision: D23840685
fbshipit-source-id: 81834663169b3a522b9c08e0a36f0b91354916c7
Summary:
Now that the win directory only contains the mount directory, we can rename it
to be more faithful to its intent. Since this is about ProjectedFS, let's
rename it to "prjfs".
Reviewed By: chadaustin
Differential Revision: D23828561
fbshipit-source-id: cb31fe4652fd4356dc2579028d3ae2c7935371a7
Summary: This will make it easier to build with Buck.
Reviewed By: fanzeyi
Differential Revision: D23827754
fbshipit-source-id: bf3bf4d607a08b9831f9dfea172b2e923a219561
Summary:
Now that sqliteoverlay has a TARGETS file, we can remove the manual dependencies and use
autodeps to keep the right dependencies in the TARGETS files. This will help in
getting EdenFS to build on Windows with Buck.
Reviewed By: chadaustin
Differential Revision: D23820312
fbshipit-source-id: 34bfd13d2ae6d11a404a9b913562c7d45a4b3de7
Summary:
Phabstatus for smartlog uses `PeekaheadList` rather than `PeekaheadRevsetIterator`, as
all of the commits are known ahead of time and we don't need to collect together
batches as we iterate across the revset.
However, we should still batch up requests to Phabricator, as users with very high
numbers of commits in their smartlog may hit timeouts.
Add a batching mechanism to `PeekaheadList` that splits the list into chunks to
return with each peekahead.
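The chunking itself can be sketched as a generic helper (hypothetical name, not the actual `PeekaheadList` code):

```python
def peekahead_batches(items, batch_size):
    # Split a fully-known list into fixed-size chunks so each
    # "peekahead" issues one bounded request to the backend,
    # avoiding timeouts on very large smartlogs.
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]
```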
Reviewed By: liubov-dmitrieva
Differential Revision: D23840071
fbshipit-source-id: 68596c7eb4f7404ce6109e69914f328565e34582
Summary:
This provides a way to fix the local cache of backed up heads if it is in an
invalid state.
Most importantly, it will allow early dogfooding of write traffic from Mononoke
without the reverse filler in place, for developers or for the team.
You could just run `hg cloud backup -f`, assuming the repo is backfilled, to fix
any inconsistency when switching between the two backends.
Reviewed By: markbt
Differential Revision: D23840162
fbshipit-source-id: bbd331162d65ba193c4774e37324f15ed0635f82
Summary:
Let's add a command that validates that the created catchup commit is correct.
For now it validates that unodes are the same between the catchup commit and the commit
that we are merging in.
Later we can add more invariants that we want to check.
Reviewed By: krallin
Differential Revision: D23782369
fbshipit-source-id: 61d19aa73777d5fbb3e1b127bdcf39f5e6309b52
Summary: Add error context to file content scrub so that we can tell if an Error has propagated via the scrub stream loading.
Reviewed By: StanislavGlebik
Differential Revision: D23838144
fbshipit-source-id: 40a8a090510959cab1020182c19076b8a3317b1b
Summary:
Implemented an S3 blobstore.
Isilon implements S3 as a 1:1 mapping into the filesystem, which limits the maximum number of blobs in a single directory. To overcome this, let's shard the keys using base64 encoding, making a 2-level dir structure with 2-char dir names.
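A rough sketch of such a sharding scheme (the function name and exact layout are assumptions; the real blobstore is written in Rust):

```python
import base64

def shard_key(key):
    # Encode the key with URL-safe base64 and use the first two 2-char
    # prefixes as directory names, giving a 2-level layout that bounds
    # the number of blobs stored in any single directory.
    enc = base64.urlsafe_b64encode(key.encode()).decode()
    return "/".join([enc[0:2], enc[2:4], enc])
```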
Reviewed By: krallin
Differential Revision: D23562541
fbshipit-source-id: c87aca2410381a07babb191cbd8cf28233556e03
Summary:
For Python 3 we must ensure that the displayer messages have all been converted
to unicode before providing them to the Rust graph renderer.
This is because the Python 3 version of `encoding.unifromlocal` is a no-op, so
the result may still be `bytes` that need to be converted to `str`.
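A minimal illustration of the coercion needed (a hypothetical helper, not the actual Mercurial code):

```python
def ensure_str(value, encoding="utf-8"):
    # unifromlocal is a no-op on Python 3, so values handed to the Rust
    # graph renderer may still be bytes; coerce them to str first.
    if isinstance(value, bytes):
        return value.decode(encoding, "replace")
    return value
```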
Reviewed By: quark-zju
Differential Revision: D23827233
fbshipit-source-id: 8f2b707ceceb210c0a2b5b589b99d4016452c61c
Summary:
D23759711 (be51116cf4) changed the way signal handlers work, which apparently causes
this test to fail. The SIGCHLD signal from the child changing state is received
during os.waitpid, which counts as a signal arriving during a system call and
thus throws an OSError.
I'm not sure what the real fix should be. Sleeping gets us past the issue, since
presumably the signal is handled before the system call.
Reviewed By: quark-zju
Differential Revision: D23832606
fbshipit-source-id: 70fca19e419da55bbf546b8530406c9b3a9a6d77
Summary:
When running the repo import tool, it's possible that we need to do additional setup steps before being able to run the tool, which otherwise would only come up when we run it.
Firstly, if the repo we import into doesn't have a callsign (e.g. FBS, WWW...), but we want to check Phabricator, our tool would hang when checking Phabricator, because we need the callsign for checking. Therefore, we need to inform the user to set the callsign for the repo.
Secondly, in case the repo push-redirects to a larger repo, we generate a bookmark for the commits imported into the large repo. However, we need to inform the Phabricator team to include the large repo's bookmark before we can import the commits, because this bookmark publishes the imported commits on Phabricator.
This diff adds a subcommand to check these additional steps, so we wouldn't find these out during the actual import run.
Reviewed By: StanislavGlebik
Differential Revision: D23783462
fbshipit-source-id: 3cdf4035548213d8cee9717fb985c22741a6749b
Summary:
In the later diffs we are going to change how CommitSyncer is initialized. In
order to make it simpler, let's refactor cross_repo_sync_test to move
CommitSyncer creation into a single function.
There are a few tests that have very peculiar initialization - for example they
have movers that fail. For those tests I combined the new function for creation
of CommitSyncer with manual initialization of CommitSyncRepos struct.
Reviewed By: krallin
Differential Revision: D23811507
fbshipit-source-id: 682ab30aa09c9189fcd02850a19f1ddf021c0329
Summary:
This simplifies the code a bit, and avoids creating tokio Runtime multiple
times.
Reviewed By: kulshrax
Differential Revision: D23799642
fbshipit-source-id: 21cee6124ef6f9ab6e165891d9ee87b2feb553ac
Summary:
Exercises the PyStream type from cpython-async.
`hg dbsh`:
```
In [1]: s,f=api._rustclient.commitdata('fbsource', list(repo.nodes('master^^::master')))
In [2]: s
Out[2]: <stream at 0x7ff2db700690>
In [3]: it=iter(s)
In [4]: next(it)
Out[4]: ('6\xf9\x18\xe4\x1c\x05\xfc\xb0\xd3\xb2\xe9\xec\x18E\xec\x0f\x1a:\xb7\xcd', ...)
In [5]: next(it)
Out[5]: ('}\x1f(\xe1o\xf1a\x9b\x81\xb9\x83}\x1b\xbbt\xd2e\xb1\xedb',...)
In [6]: next(it)
Out[6]: ('\xf1\xf0f\x97<\xf3\xdd\xe41w>\x92\xd1\xc0\x9ah\xdd\x87~^',...)
In [7]: next(it)
StopIteration:
In [8]: f.wait()
Out[8]: <bindings.edenapi.stats at 0x7ff2e006a3d8>
In [9]: str(Out[8])
Out[9]: '2.42 kB downloaded in 165 ms over 1 request (0.01 MB/s; latency: 165 ms)'
In [10]: iter(s)
ValueError: stream was consumed
```
Reviewed By: kulshrax
Differential Revision: D23799645
fbshipit-source-id: 732a5da4ccdee4646386b6080408c0d8958dd67f
Summary:
Exercises the PyFuture type from cpython-async.
`hg dbsh`:
```
In [1]: api._rustclient.commitdata('fbsource', list(repo.nodes('master^^::master')))
Out[1]:
([...], <future at 0x7f7b65d05060>)
In [2]: f=Out[1][-1]
In [3]: f.wait()
Out[3]: <bindings.edenapi.stats at 0x7f7b665e8228>
In [4]: f.wait()
ValueError: future was awaited
In [5]: str(Out[3])
Out[5]: '2.42 kB downloaded in 172 ms over 1 request (0.01 MB/s; latency: 171 ms)'
```
Reviewed By: kulshrax
Differential Revision: D23799643
fbshipit-source-id: d4fcef7dca58bc4902bb0809adc065493bb94bd3
Summary:
Add a `PyFuture<F>` type that can be used as a return type in binding functions.
It converts a Rust Future to a Python object with a `wait` method so Python
can access the value stored in the future.
Unlike `TStream`, it's currently only designed to support Rust->Python one
way conversion so it looks simpler.
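The consume-once contract can be modeled in a few lines of Python (illustrative only; the real type wraps a Rust future):

```python
class Future:
    # Toy model of the once-awaitable future: wait() returns the value
    # the first time and raises on any later call, mirroring the
    # "ValueError: future was awaited" behavior shown in the prior diff.
    def __init__(self, value):
        self._value = value
        self._awaited = False

    def wait(self):
        if self._awaited:
            raise ValueError("future was awaited")
        self._awaited = True
        return self._value
```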
Reviewed By: kulshrax
Differential Revision: D23799644
fbshipit-source-id: da4a322527ad9bb4c2dbaa1c302147b784d1ee41
Summary:
The exposed type can be used as a Python iterator:
```
for value in stream:
    ...
```
The Python type can be used as input and output parameters in binding functions:
```
# Rust
type S = TStream<anyhow::Result<X>>;
def f1() -> PyResult<S> { ... }
def f2(x: S) -> PyResult<S> { Ok(x.stream().map_ok(...).into()) }

# Python
stream1 = f1()
stream2 = f2(stream1)
```
This crate is similar to `cpython-ext`: it does not define actual business
logic exposed by `bindings` module. So it's put in `lib`, not
`bindings/modules`.
Reviewed By: markbt
Differential Revision: D23799641
fbshipit-source-id: c13b0c788a6465679b562976728f0002fd872bee
Summary:
See the previous diff for context. Move the error handling and ipdb logic to
the background thread so it can show proper traceback.
Reviewed By: kulshrax
Differential Revision: D23819022
fbshipit-source-id: 8ddae019ab939d8fb2c89afca2a7769094ebe26a
Summary:
With D23759710 (34d8dca79a), the main command was moved to a background thread, but the
error handling wasn't. That can cause a less useful traceback like:
```
Traceback (most recent call last):
  File "dispatch.py", line 698, in _callcatch
    return scmutil.callcatch(ui, func)
  File "scmutil.py", line 147, in callcatch
    return func()
  File "util.py", line 4358, in wrapped
    raise value
```
Set `e.__traceback__` so `raise e` preserves the traceback information.
This only works on Python 3. On Python 2 it is possible to use
`raise exctype, excvalue, tb`. But that's invalid Python 3 code. I'm
going to fix Python 2 traceback differently.
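A hedged sketch of the pattern (the helper names `call_in_background` and `boom` are illustrative, not the real dispatch code):

```python
import sys
import threading
import traceback

def call_in_background(func):
    # Run func on a background thread; if it raises, re-raise the same
    # exception on the calling thread with the worker's traceback
    # attached via __traceback__ (Python 3 only).
    result = {}

    def worker():
        try:
            func()
        except Exception:
            result["exc_info"] = sys.exc_info()

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    if "exc_info" in result:
        _, value, tb = result["exc_info"]
        value.__traceback__ = tb
        raise value

def boom():
    raise ValueError("inner")
```

Catching the re-raised exception shows frames from the background thread, including `boom`.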
Reviewed By: kulshrax
Differential Revision: D23819023
fbshipit-source-id: 953ac8bd6108f4c0dae193607bee3f931c2bd13e
Summary:
The parameter `mtimethreshold` should be used instead of a constant of 14 days.
This fixes an issue where sigtrace output takes a lot of space in hg rage
output.
Reviewed By: DurhamG
Differential Revision: D23819021
fbshipit-source-id: e639b01d729463a4822fa93604ce3a038fbd4a9a
Summary:
In Python 3, filter returns a lazy filter object, so the second time we iterate over it, it is empty.
This is Python 3-only behavior, so the migration to py3 broke it.
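The Python 3 behavior can be demonstrated in isolation:

```python
# Python 3: filter() returns a lazy iterator, exhausted after one pass.
items = filter(lambda x: x % 2 == 0, [1, 2, 3, 4])
first = list(items)   # [2, 4]
second = list(items)  # [] -- the iterator is already consumed

# Fix: materialize the result once so it can be iterated repeatedly.
evens = list(filter(lambda x: x % 2 == 0, [1, 2, 3, 4]))
```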
Reviewed By: markbt
Differential Revision: D23815206
fbshipit-source-id: 1a6503b2bbfd44959307c189d17dec9b5d5ff991
Summary:
Now that `hg whereami` properly reads the SNAPSHOT file on Windows, the doctor
tests properly detect that Mercurial and EdenFS disagree about the current
commit, thus we can enable the remaining 2 tests.
Reviewed By: genevievehelsel
Differential Revision: D23819924
fbshipit-source-id: 21be19aff913e5e485d72e8cd730e6851ecaba2e
Summary:
During an hg update we first prefetch all the data, then write all the
data to disk. There are cases where the prefetched data is not available during
the writing phase, in which case we fall back to fetching the files one-by-one.
This has truly atrocious performance.
Let's allow the worker threads to check for missing data and then do bulk fetching
of it. In the case where the cache was completely lost for some reason, this
would reduce the number of serial fetches by 100x.
Note, the background workers already spawn their own ssh connections, so
they're already getting some level of parallelism even when they're doing 1-by-1
fetching. That's why we aren't seeing a 100x improvement in performance.
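The fetch strategy can be sketched as follows (hypothetical helper names; the real code lives in Mercurial's worker threads):

```python
def fetch_with_batching(keys, cache, bulk_fetch, batch_size=100):
    # Instead of falling back to one-by-one fetches, collect the keys
    # missing from the cache and fetch them from the server in bulk
    # batches, then serve everything from the cache.
    missing = [k for k in keys if k not in cache]
    for i in range(0, len(missing), batch_size):
        batch = missing[i:i + batch_size]
        cache.update(bulk_fetch(batch))
    return [cache[k] for k in keys]
```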
Reviewed By: xavierd
Differential Revision: D23766424
fbshipit-source-id: d88a1e55b1c21e9cea7e50fc6dbfd8a27bd97bb0
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/60
For the tests that output different data to stdout in OSS vs FB create helpers that remove the differences.
Reviewed By: farnz
Differential Revision: D23814134
fbshipit-source-id: c6656528021c9a90b98e3c89a9bbe8c5178c6919
Summary:
Add `scsc land-stack` to facilitate testing of stack landing via the source control service.
Use this to test that landing of stacks works.
Reviewed By: aslpavel
Differential Revision: D23813366
fbshipit-source-id: 1f7b682fa5e33a232cb1da5c702a703223658942
Summary:
Update the conversion of `BookmarkMovementError` to `MononokeError` to reflect
that most movement errors are caused by invalid requests.
Reviewed By: aslpavel
Differential Revision: D23814794
fbshipit-source-id: 48503353aaae7b3cd03e5221a8ad014eef2e9414
Summary:
Implement the `repo_land_stack` method by working out which commits are in the
stack to be landed, and then pushrebasing them onto the target bookmark.
Reviewed By: aslpavel
Differential Revision: D23813370
fbshipit-source-id: babe34f0e9f1db055adb2e5d1debefd8ebcf6f86
Summary:
Sometimes the `AsyncIntoResponse` trait needs additional data (e.g. the set of commit
identity schemes the client is interested in) to convert the item into the response
type.
Currently we use a tuple of `(item, &additional_data)` to work around this, however
this will become less readable as we add new items with more additional data.
Split this use-case out into a new trait: `AsyncIntoResponseWith`. This defines
an associated type which is the type of the additional data needed, and provides a
new method `into_response_with`, where a reference to the additional data can be
provided.
Note that conversions for tuple types that are logical `(name, value)` or `(id,
value)` pairs are still ok. It is specifically the case where we have `(item,
&additional_data)` that we are converting here (i.e. the additional data merely
informs the conversion, it is not part of the resulting response value).
Reviewed By: aslpavel
Differential Revision: D23813371
fbshipit-source-id: c0dcfe826288ad53ad572ae4dd956540605998f5
Summary: Make it clear which error is which, and what the number of expected and actual items are.
Reviewed By: StanislavGlebik
Differential Revision: D23813369
fbshipit-source-id: 5b94c5a67438c475235876669ec2be3fd1866700
Summary: Intern ids to reduce space used in the walk state. This is significant on large repos.
Reviewed By: farnz
Differential Revision: D23691524
fbshipit-source-id: b42f926d88083d06ffc44508db44747f9a14e0a5
Summary:
Passing option is not necessary since live_commit_sync_config is always
available.
Reviewed By: ahornby
Differential Revision: D23811021
fbshipit-source-id: ee11f88d57814d9abac8650e52febd9e431770da
Summary:
Automigration gets messed up by the `hg cloud rejoin` command in the fbclone code because it is triggered by the pull command.
As a result, fbclone ends up joining a hostname workspace instead of the default in some cases.
* make sure that the migration never runs if background commit cloud operations are disabled
* also, skip the migration in the pull command in fbclone
One of those would be enough to fix the issue, but I prefer to make both
changes.
Reviewed By: markbt
Differential Revision: D23813184
fbshipit-source-id: 3b49a3f079e889634e3c4f98b51557ca0679090b
Summary:
I've re-backfilled some of blame values for configerator. But old values might
still be in memcache. To make sure that's not the case let's bump the memcache
key.
Reviewed By: krallin
Differential Revision: D23810971
fbshipit-source-id: c333a51ffb2babf7da808b276f9cfa31baaa105c
Summary:
The children revset iterated over everything in the subset, which in
many cases was the entire repo. This can take hundreds of milliseconds. Let's
use the new _makerangeset to only iterate over descendants of the parentset.
Reviewed By: quark-zju
Differential Revision: D23794344
fbshipit-source-id: 9ac9bc014d56a95b5ac65534769389167b0f4508
Summary:
Now that Mercurial itself can properly handle SIGINT, there isn't a need for a Python wrapper around the Rust EdenAPI client (since the main purpose of the wrapper was to ensure proper SIGINT handling--something that could only be done in Python).
Note that while this does remove some code that prints out certificate warnings, that code was actually broken after the big refactor of the Rust bindings. (The exception types referenced no longer exist, so the code would simply result in a `NameError` if it actually tried to catch an exception from the Rust client.)
Reviewed By: singhsrb
Differential Revision: D23801363
fbshipit-source-id: 3359c181fd05dbec24d77fa1b7d9c8bd821b49a6
Summary: Small change to make it more readable and reduce likelihood of allocation (although the collect might be optimized away anyway)
Reviewed By: farnz
Differential Revision: D23760762
fbshipit-source-id: 5c47352386de128b65052d63b3f3ff1081a462e3
Summary: Make `StreamBody` accept a `Stream` of `Bytes` instead of a `TryStream` of `Bytes`. This means that applications returning streaming responses will be forced to deal with errors prior to returning the response.
Reviewed By: krallin
Differential Revision: D23780216
fbshipit-source-id: dbad61947ef23bbfc4edf3d286ad0218c1859d87
Summary:
Using the `EndOnErr` combinator introduced in the previous diff, log any errors that occur during a streaming response to stderr.
Note that **the intent of this diff is to implement the most basic possible example of doing something with these errors**, with the goal of allowing us to modify `StreamBody` to only accept infallible `Stream`s.
I'd imagine that in all likelihood we'd want to do something smarter with the errors than just print them, but I figure that can be added in later diffs since it seems like doing something else (like logging the error to Scuba or adding to the RequestContext) might require additional changes that are beyond the scope of this diff.
At the very least, this seems like an improvement from before, where these errors would just get passed straight through to Hyper.
Reviewed By: krallin
Differential Revision: D23780217
fbshipit-source-id: 2f885f9fdc6af3dd28d95be1daa1d82c732453fa
Summary: Add a new `EndOnErr` combinator for `TryStream`s (exposed via the `GothamTryStreamExt::end_on_err` method) which fuses the underlying `TryStream` upon hitting an error, and passes the error to the user-provided callback. This is useful for contexts like the LFS server, where mid-stream errors are considered unrecoverable and must result in termination of the response.
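A toy Python analogue of the combinator's semantics (the real implementation is a Rust `TryStream` combinator; this sketch models fallible results as `(ok, value)` pairs):

```python
def end_on_err(items, on_err):
    # Yield successful values from a stream of (ok, value) results; on
    # the first error, hand the error to the callback and stop
    # ("fuse") the stream -- mid-stream errors are unrecoverable.
    for ok, value in items:
        if ok:
            yield value
        else:
            on_err(value)
            return
```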
Reviewed By: krallin
Differential Revision: D23778490
fbshipit-source-id: 05caa52ca62d085cc63cc8feb4619188fe0fac61
Summary: Use the new `ForwardErr` stream combinator to log errors that occur during a streaming response. Right now, they are just printed to stderr, but in the future we should also do other things such as logging them to Scuba. This diff supersedes the approach from D22720957.
Reviewed By: krallin
Differential Revision: D23780215
fbshipit-source-id: 8d2267f1166e665a62a167a6d95bb0b1797b5767
Summary: Implement `ContentMeta` for streams wrapped with the `ForwardErr` combinator, so that they may be used as input to `StreamBody`. (Presently, this won't actually work since `StreamBody` expects a `TryStream`, but this will change later in the stack.)
Reviewed By: krallin
Differential Revision: D23777842
fbshipit-source-id: 234bcdf104afbf2c9832fbe54d85744bfb470363
Summary:
This diff adds a new `ForwardErr` combinator that allows redirecting the errors from a `TryStream` into a channel, allowing them to be processed asynchronously without disrupting the stream itself. This effectively splits the `TryStream` into two `Stream`s containing the successful items and errors respectively.
To make it easy to use the combinator, I've added a new `GothamTryStreamExt` extension trait (in the vein of the old `futures_ext` crate) that allows the user to simply call `forward_err` on any `TryStream`. The trait name is a bit of a misnomer (suggestions welcome!) in that there isn't anything Gotham-specific about it, but the name `TryStreamExt` is taken and I didn't want to set up a successor to `futures_ext` just for the sake of one combinator. (Though we will likely expand the trait in the future.)
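A toy Python analogue of the splitting behavior (again modeling a fallible stream as `(ok, value)` pairs; the real combinator is Rust and sends errors into a channel):

```python
def forward_err(items, err_sink):
    # Route errors into a side channel while continuing to yield
    # successful values, effectively splitting one fallible stream
    # into a stream of values plus a stream of errors.
    for ok, value in items:
        if ok:
            yield value
        else:
            err_sink.append(value)
```

Unlike an end-on-error combinator, this keeps the value stream flowing past errors.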
Reviewed By: krallin
Differential Revision: D23777501
fbshipit-source-id: 8628cdc2e104cd9b972afda745858f9cb9e85245
Summary:
This extends the Ctrl+C special handling from edenapi to the entire Python
command so Ctrl+C should be able to exit the program even if it's running
some blocking Rust functions.
`edenapi` no longer needs to spawn threads for fetching.
Reviewed By: singhsrb
Differential Revision: D23759710
fbshipit-source-id: cbaaa8e5f93d8d74a8692117a00d9de20646d232
Summary:
on macOS we cannot safely use `fork`.
This commit replaces the use of `fork` in the startup logger subsystem.
This was a little tricky to untangle; originally (prior to any of
the `fork` removal efforts in this diff stack), the startup flow was
to spawn a set of processes via fork:
```
edenfs (setuid)
\-----edenfs (privhelper, as root)
\------edenfs (daemonized)
```
The forked children take advantage of being able to implicitly pass state to
the child processes from the parent. That data flow needs to become explicit
when removing the fork which makes some things a little awkward.
With fork removed:
* `edenfs` unconditionally spawns `edenfs_privhelper` while it has
root privs and before most of the process has been initialized.
* That same `edenfs` process will then spawn a child `edenfs`
process which starts from scratch, but which needs to
run as the real server instance
* The original `edenfs` instance needs to linger for a while
to remain connected to the controlling tty to pass back the
startup state to the user, before terminating.
This commit deletes the check that `edenfs` is started originally
as root; previously the logic relied on the forked startup logger
continuing past the `daemonizeIfRequested` call and simply deferring
the check until after folly::init. With these changes we can't
easily perform such a check without adding some extra gymnastics
to pass the state around; the place where that is checked is in
the spawned child of the original edenfs, which is not a privileged
process and doesn't know the original euid. I don't believe this
to be a great loss as we tuck `edenfs` away under the libexec dir.
Reviewed By: chadaustin
Differential Revision: D23696569
fbshipit-source-id: 55b95daf022601a4699274d696af419f0a11f6f2
Summary:
On macOS we cannot safely use `fork` to spawn processes while other threads may initialize objc classes.
This commit replaces the use of `fork` in the privhelper startup with
`SpawnedProcess` instead. We need to take care with this as we are generally
installed setuid root and we'd like to avoid being tricked into running an
arbitrary child process as root.
This commit defines a separate executable called `edenfs_privhelper` that
contains just the privhelper server code.
We need to be careful about locating this executable; to avoid invoking an
arbitrary process while we have root privileges we require that the privhelper
be a sibling to the edenfs executable and carry out some additional ownership
verification so that we can tell that the owner of edenfs also controls
edenfs_privhelper.
To facilitate this, I've added an `executablePath` function to PathFuncs; it
returns the path to the current executable image.
To make the integration test scenario simpler, I've added the edenfs_executable
binary definition alongside that of the edenfs binary in the buck and cmake
build systems. This causes the binaries to be siblings in-situ in the build
tree and avoids the need to move things into place in the test harness.
Reviewed By: chadaustin
Differential Revision: D23653343
fbshipit-source-id: 3c2539a5e0e11cee88960db49c885ce0366d314e
Summary: This diff fixes the eden redirection tests so they run on Windows.
Reviewed By: xavierd
Differential Revision: D22958766
fbshipit-source-id: 45d26587831ed74d6bd7912b22c7c955b077f571
Summary:
Move a bunch of code into a separate file (scm daemon related options), moving it
out of cloud sync.
Also introduce an additional check that the `hg cloud sync` command the scm daemon
runs is intended for the currently connected workspace.
In theory, when we switch a subscription the SCM daemon gets notified, but races are possible and it is better to have this additional check, so the SCM daemon triggers cloud sync where it is supposed to.
Reviewed By: markbt
Differential Revision: D23783616
fbshipit-source-id: b91a8b79189b7810538c15f8e61080b41abde386
Summary:
The config is not actually used any more (with rust-commits it is forced on; without rust-commits
there is no point in keeping it on). Therefore, remove it.
Reviewed By: singhsrb
Differential Revision: D23771570
fbshipit-source-id: ad3e89619ac5e193ef552c25fc064ca9eddba0c6
Summary:
See the previous diff for context. This allows the code to run from non-main
thread.
Reviewed By: singhsrb
Differential Revision: D23759712
fbshipit-source-id: 044193a9d7193488c700d769da9ad68987356d69
Summary:
The idea is to extend D22703916 (61712e381c)'s way of calling functions from just edenapi to
the entire command for better Ctrl+C handling. Some code paths (e.g. pager,
crecord) use `signal.signal`, and `signal.signal` does not work from a non-main
thread.
To work around the `signal.signal` limitation, we pre-register all signals we care
about in the main thread to a special handler. The special handler reads a
global variable to decide what to do. Other threads can modify that global
variable to affect what the special signal handler does, and therefore indirectly
"register" their handlers.
Reviewed By: kulshrax
Differential Revision: D23759711
fbshipit-source-id: 8ba389072433e68a36360db6a1b17638e40faefa
Summary:
Before this change, for a long-running function wrapped by 'threaded',
it might:
  background thread> start
  main thread> receive SIGINT, raise KeyboardInterrupt
  main thread> raise at 'thread.join(1)'
  main thread> exiting, but wait for threads to complete (Py_Finalize)
  background thread> did not receive KeyboardInterrupt, continue running
  main thread> continue waiting for background thread
Teach `thread.join(1)` to forward the `KeyboardInterrupt` (or its subclass
`error.SignalInterrupt`) to the background thread, so the background thread
_might_ stop. Also, mark the background thread as a daemon so it won't
be waited on at exit.
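A minimal sketch of that join loop (hypothetical names, not the actual `threaded` wrapper; forwarding the interrupt into the worker, as the diff does, additionally needs CPython's thread APIs and is omitted here):

```python
import threading

def threaded(func):
    """Run func on a daemon thread while the main thread joins in
    short slices, staying responsive to KeyboardInterrupt. Because
    the worker is a daemon, interpreter exit won't wait for it."""
    def wrapper(*args, **kwargs):
        result = {}
        def target():
            result["value"] = func(*args, **kwargs)
        t = threading.Thread(target=target)
        t.daemon = True
        t.start()
        while t.is_alive():
            # A 1s timeout lets a KeyboardInterrupt raised in the
            # main thread surface here instead of blocking forever.
            t.join(1)
        return result.get("value")
    return wrapper
```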
Reviewed By: kulshrax
Differential Revision: D23759713
fbshipit-source-id: 91893d034f1ad256007ab09b7a8b974325157ea5
Summary:
Move the wrapper to util.py. It'll be used in dispatch.py to make the entire
command Ctrl+C friendly.
Reviewed By: singhsrb
Differential Revision: D23759715
fbshipit-source-id: fa2098362413dcfd0b68e05455aad543a6980907
Summary: This will be used to test Ctrl+C handling with native code.
Reviewed By: kulshrax
Differential Revision: D23759714
fbshipit-source-id: 50da40d475b80da26b7dbc654e010d77cb0ad2d1
Summary: This makes it easier to test the API via debugshell.
Reviewed By: kulshrax
Differential Revision: D23750677
fbshipit-source-id: e29284395f03c9848cf90dd2df187e437890c56e
Summary: It is handy to test edenapi methods directly.
Reviewed By: kulshrax
Differential Revision: D23750709
fbshipit-source-id: 33c15cecaa0372ba9e4688502e7d8f3fdda7c3b8
Summary:
Add a command to rebuild the changelog without recloning other parts of the
repo. This can be used as a way to recover from a corrupted changelog. It
currently uses revlog because revlog is still the only format supported during
streamclone.
In the future this can be used for defragmentation.
Reviewed By: DurhamG
Differential Revision: D23720215
fbshipit-source-id: 6db0453d18dbf553660d55d528f990a4029d9da4
Summary:
This commit moves a compile-time template parameter
to be a runtime boolean parameter.
There's a bit of fan-out that, while I don't think it is
super awesome, isn't super terrible either.
The case sensitivity value is read from the checkout config
added in the prior diff in this stack.
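The shape of the change (a constructor flag instead of a template parameter) can be illustrated with a toy path map; this is illustrative only, not the actual DirEntry code:

```python
class PathMap:
    """Case sensitivity is a runtime choice: one class serves both
    case-sensitive and case-insensitive-case-preserving checkouts."""

    def __init__(self, case_sensitive=True):
        self.case_sensitive = case_sensitive
        self._entries = {}

    def _key(self, name):
        return name if self.case_sensitive else name.lower()

    def insert(self, name, value):
        # Store the original name so enumeration preserves case.
        self._entries[self._key(name)] = (name, value)

    def lookup(self, name):
        entry = self._entries.get(self._key(name))
        return entry[1] if entry else None
```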
Reviewed By: xavierd
Differential Revision: D23751192
fbshipit-source-id: 46f6fe25bfa6666305096ad9c416b510cd3aac8f
Summary:
This diff teaches the CheckoutConfig how to determine
whether a given checkout should be case-sensitive (the default)
or case-insensitive-case-preserving.
This option is passed through to the fuse channel initialization
so that the kernel will respect it; however, our DirEntry layer
doesn't yet know that it should respect this.
There's currently no UI to set this option. My game plan
is to suggest the following steps to folks that want to try
this out:
```
$ eden stop
$ vim ~/local/.eden/clients/ovrsource/config.toml
```
and then add this line to the `[repository]` section:
```
case-sensitive = false
```
and finally:
```
$ eden start
```
Reviewed By: xavierd
Differential Revision: D23751184
fbshipit-source-id: 6facb23c460cfff6e37d0091b51b97ab06f62c91
Summary:
Improve the help text to reflect that the system is also meant for managing
backups, add missing commands, and reshuffle a bit.
Reviewed By: markbt
Differential Revision: D23782794
fbshipit-source-id: d7fd3fa06ca7acd649cef557f3fe020295259e3d
Summary:
Compressed responses from LFS are slower than they should right now. Normally,
we'd expect something along the lines of normal response time + compression
time, but right now it's a lot more than this.
The reason for this is that our compressed streams are eager, i.e. they will
consume and compress as much of the underlying stream as possible before
sending off the data. This is problematic for LFS, because we try very hard to
serve everything out of RAM directly (and very often succeed), so that means
we compress the whole stream before sending it off.
This means we might spend e.g. 500ms compressing (this is how long it takes
zstd to compress the object I was testing on, which is a ~80MiB binary that
compresses down to 33% of that), and _then_ we'll spend some time transferring
the compressed data, when we could have started transferring immediately while
we were compressing.
To fix this, let's simply tell our compressed stream to stop waiting for
more data once in a while (every 4 MiB, which seems very frequent but actually
really isn't).
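The idea can be sketched with zlib: emit what's been compressed so far every few MiB, rather than compressing the whole payload before the first byte goes out. (The Mononoke code uses different stream types; this is just an illustration.)

```python
import zlib

FLUSH_EVERY = 4 * 1024 * 1024  # flush roughly every 4 MiB of input

def compress_stream(chunks):
    """Yield compressed output incrementally so transfer can overlap
    with compression, instead of buffering the whole payload first."""
    comp = zlib.compressobj()
    pending = 0
    for chunk in chunks:
        out = comp.compress(chunk)
        if out:
            yield out
        pending += len(chunk)
        if pending >= FLUSH_EVERY:
            # Z_SYNC_FLUSH pushes everything buffered so far to the
            # output while keeping the stream open.
            yield comp.flush(zlib.Z_SYNC_FLUSH)
            pending = 0
    yield comp.flush()
```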
Reviewed By: StanislavGlebik
Differential Revision: D23782756
fbshipit-source-id: a0d523d84f92e215eb366f551063383fc835fdd6
Summary:
I saw this throw the LFS server into an infinite loop when I tested it. We're
not using this right now, so I'm not investing time into root-causing the
issue, and instead let's just take this out.
Reviewed By: StanislavGlebik
Differential Revision: D23782757
fbshipit-source-id: f320fc72c3ff279042c2fe9fcb9c4904e9e1bfdf
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/51
This diff extends the capabilities of CargoBuilder in getdeps so that individual manifests can be built even without workspaces. Thanks to that, a build for edenapi/tools can be made and its artifacts can be used in Mononoke integration tests.
Reviewed By: StanislavGlebik
Differential Revision: D23574887
fbshipit-source-id: 8a974a6b5235d36a44fe082aad55cd380d84dd09
Summary: The command will be provided as a hint if a username change has been detected in the configuration.
Reviewed By: markbt
Differential Revision: D23769942
fbshipit-source-id: 3e84ecef6dd68267022b92bf10f5e68dfc07f270
Summary:
This makes deletion commits a bit less confusing, but it also has another
benefit.
Without the sort, some directories might have been changed multiple times in
deletion commits, e.g. if a directory had 5 files and these files were deleted
in 5 different deletion commits, then the directory would be changed 5 times.
This was not good, because it made some data derivation (in particular,
fastlog) slower, since it had to regenerate the same data over and over again.
Reviewed By: ikostia
Differential Revision: D23780066
fbshipit-source-id: d5c52b13f58dcaf2012d9c12bf77398561cf10ef
Summary:
Spotted a TODO in fsnode get_fsnode_id. There was only one user of the function, which didn't really need to call it, as it had the blob already.
As well as being a bit tidier, this also saves a clone of the fsnode.
Reviewed By: StanislavGlebik
Differential Revision: D23758689
fbshipit-source-id: e0a8c124c929fda3af4c96a76d441a79e5bfbd5b
Summary:
Save memory in walker state tracking by not memoizing hash values. For large repos this is significant.
I was expecting a small slowdown from this, but so far the walk rate looks pretty much the same. Speculation: this may be due to the num-cpus lock-sharding fix in dashmap 3.11.10, which means there are many more shards than when the memo was tested with 3.11.9, so saving time inside locks is less significant.
Reviewed By: StanislavGlebik
Differential Revision: D23680550
fbshipit-source-id: 351b5ec39885fc30996207c7dccc22c749e30321
Summary:
The `gotham_ext::response` module was getting a bit large, so this diff moves `ContentMeta`, `ContentStream`, and `CompressedContentStream` into a new submodule, alongside the contents of the old `content_encoding` module. This way, the `response` module remains entirely centered around the `TryIntoResponse` trait (and the various body structs that implement that trait).
Later diffs in this stack will be adding an additional layer between the content streams and the body structs, at which point it probably doesn't make sense to have these right next to each other. Splitting them out now will allow for better code organization going forward.
Reviewed By: krallin
Differential Revision: D23777492
fbshipit-source-id: 86e598dcb37578d3b22217a2a65f1bde84d72215
Summary:
`scratch` provided by `fb-scratch` was replaced by `mkscratch` provided by
the Mercurial package. See linked task for details.
Reviewed By: quark-zju
Differential Revision: D23773840
fbshipit-source-id: de0582069ce1a09c3cd9fc6b02d2d149f70d0d78
Summary:
Computing all successor sets is exponential in the number of splits
that have happened. This can slow things down tremendously.
The obsoletenodes path only needs to know "is there a visible successor" in
order to determine if a draft commit is obsolete. Let's use allsuccessors
instead of successorsets.
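The difference matters because enumerating successor *sets* multiplies across splits, while "is there a visible successor" is a plain graph reachability check, roughly (hypothetical data shapes):

```python
def has_visible_successor(node, successors, visible):
    """Walk the successor graph once; linear in the number of edges,
    unlike enumerating every combination of successor sets."""
    seen = set()
    stack = [node]
    while stack:
        n = stack.pop()
        for s in successors.get(n, ()):
            if s not in seen:
                seen.add(s)
                if s in visible:
                    return True
                stack.append(s)
    return False
```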
Reviewed By: quark-zju
Differential Revision: D23771025
fbshipit-source-id: 666875e681c2e3306fc301357c95f1ab5bb40a87
Summary:
`hg cloud join --merge` doesn't really solve the rename problem because it doesn't
preserve:
1. old heads
2. history
I added a proper API in the Commit Cloud Service for renaming workspaces, and now we
can use it to provide a rename command and a 'rehost' command, which is a version
of renaming that binds the current workspace to the current devserver.
The rehost command is meant to be used after devserver migration. I am planning to
add this to the devserver migration wiki.
The next diff will cover how we can use the rename command to fix the username in workspace names after the username has changed.
Reviewed By: markbt
Differential Revision: D23757722
fbshipit-source-id: dc11cb226eb76d347cdab70b3c72566448dcd098
Summary:
The Rust contentstore has no way to flush the shared stores, except
when the object is destructed. In treemanifest, the lifetime of the shared store
seems to be different from the file one, and we're not seeing them flushed
appropriately during certain commands. Let's make the flush API also flush the
shared stores.
Reviewed By: quark-zju
Differential Revision: D23662976
fbshipit-source-id: a542c3e45d5b489fcb5faf2726854cb49df16f4c
Summary:
Now that treemanifests can use Rust stores, we need to update the
Python repack code to support that.
Reviewed By: quark-zju
Differential Revision: D23662361
fbshipit-source-id: c802852c476425eef74181ead04f70b11ff9a27c
Summary:
This makes Rust contentstore prefetch route through the remotetreestore
prefetch logic to reach the lower level tree fetching, and makes the higher
level Python fetching route through the Rust contentstore to do prefetching. The
consequence of this is that there's a relatively unified code path for both
Python and Rust, and hopefully we can delete the janky Python bits once we're
completely migrated to Rust.
The way this diff works is pretty hacky. The code comment explains it, but the
tl;dr is that Rust prefetch works by providing references to the mutable stores,
while Python prefetch assumes they are stored and accessible on the repository.
In order for the old Python tree fetching logic to work with both models, we
monkey-patch the Rust mutable store references we receive into the function that
will later be called to request the repository's mutable stores. This is awful.
A cleaner fix might be to thread the mutable stores all the way through the
Python fetching logic, then move the Python access of the repository's
mutable stores to the higher layer, near where Rust would provide it. That's a
lot of code churn though, so I'd like to do that in a later diff once we stop
using the non-Rust logic entirely.
Reviewed By: quark-zju
Differential Revision: D23662351
fbshipit-source-id: 76007b6089ddf0e558581cd179a112311f8b58e3
Summary:
As part of moving treemanifest to use the Rust tree store, prefetch needs to
be able to be initiated from Rust. Rust requires a certain signature for the
prefetch function, which accepts multiple keys.
In preparation for this requirement, let's refactor the current remotetreestore
fetching path to have a separate function. In a later diff we'll route Rust
prefetch requests through this function so the Python and Rust code share the
same base tree discovery logic.
Reviewed By: quark-zju
Differential Revision: D23662196
fbshipit-source-id: 127045c279dc22914f7e1f3a619f6620586010ba
Summary:
Python 3 reports exceptions in except clauses by showing the original
exception, then saying another exception happened during the original exception
and hiding the second exception stack trace.
To make update exceptions more debuggable, let's move the handling outside the
except clause.
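The pattern is simply to capture the exception and act on it after the except block has exited (a sketch; `run_update` and `handle_failure` are placeholder names):

```python
def run_update(update, handle_failure):
    """Run the failure handling outside the except clause so a
    second exception isn't reported as having been raised 'during
    handling of' the original one in Python 3."""
    err = None
    try:
        update()
    except Exception as e:
        err = e
    if err is not None:
        handle_failure(err)
```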
Reviewed By: quark-zju
Differential Revision: D23761667
fbshipit-source-id: bec758a3c7c0b88a5a569f794730058bf6f1eaad
Summary:
Rather than switch all of Eden at once to the thrift-py3 client,
rename get_thrift_client to get_thrift_client_legacy so uses of the
new client can be introduced piecemeal.
(I did try migrating everything at once but it's been quite painful.)
Reviewed By: fanzeyi
Differential Revision: D22423399
fbshipit-source-id: 9e6d938b90fff9fc3266ba20bc77e880e7f5b1aa
Summary:
This is the initial step to track the username when the workspace has been created and to provide users appropriate advice on how to fix their workspace names if the username in the configuration has changed.
In another diff I will provide the advice itself.
I will build the rename workspace command based on D23703790.
Reviewed By: markbt
Differential Revision: D23730312
fbshipit-source-id: a49dabba7ec4acf35f6ff99ed23cff5d6f46e2e4
Summary:
`experimental.template-new-builtin = true` has been rolled out to 100%
and seems to work fine. Therefore, remove code that supports
`template-new-builtin = false`.
Reviewed By: singhsrb
Differential Revision: D23745353
fbshipit-source-id: 178af269381c9d3e20522ba4484d63051589342b
Summary:
Some tests run `hg init` right inside the test directory, turning the
entire $TESTTMP into a repo. In future diffs we'll start to rely more on hgcache
being present during tests, which creates a directory in $TESTTMP. Let's make
sure all repos are created as sub-directories of $TESTTMP.
Reviewed By: kulshrax
Differential Revision: D23662077
fbshipit-source-id: 2b2b974ebfd1bd19ad6acd1ebe3e68dd03a09869
Summary:
Adds the initial condition and creation logic for creating a Rust
treemanifest store. Fetching and some other code paths don't work just yet, but
subsequent diffs enable more and more functionality.
Reviewed By: quark-zju
Differential Revision: D23662052
fbshipit-source-id: a0e7090c9a3bf27a7738bf093f2d4eb6098b1ed6
Summary: The old logic would just double pack some bits. Let's prevent that.
Reviewed By: xavierd
Differential Revision: D23661933
fbshipit-source-id: 155291fa08ec2c060619329bd1cb6040769feb63
Summary:
The Rust pack stores currently have logic to refresh their list of
packs if there's a key miss and it's been a while since we last loaded the
list of packs. In some cases we want to manually trigger this refresh, like if
we're in the middle of a histedit and it invokes an external command that
produces pack files that the histedit should later consume (like an external
amend that histedit then needs to work on top of).
Python pack stores solve this by allowing callers to mark the store for a
refresh. Let's add the same logic for Rust stores. Once pack files are gone we
can delete this.
This will be useful for the upcoming migration of treemanifest to the Rust
contentstore. Filelog usage of the Rust contentstore avoided this issue by
recreating the entire contentstore object in certain situations, but refresh
seems useful and less expensive.
Reviewed By: quark-zju
Differential Revision: D23657036
fbshipit-source-id: 7c6438024c3d642bd22256a8e58961a6ee4bc867
Summary:
Instants do not represent actual time and can only be compared against
each other. When we subtract arbitrary Durations from them, we run the risk of
overflowing the underlying storage, since the Instant may be represented by a
low number (such as the age of the process).
This caused crashes in test_refresh (in the next diff) on Windows.
Let's instead represent the "must rescan" state as a None last_scanned time, and avoid any arbitrary subtraction. It's generally much cleaner too.
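In Python terms the fix looks like this (the real code is Rust using Option&lt;Instant&gt;; the names here are illustrative):

```python
import time
from typing import Optional

REFRESH_INTERVAL = 60.0  # seconds between automatic rescans

class PackList:
    """'Must rescan' is modeled as last_scanned == None, rather than
    a fake timestamp computed by subtracting from a monotonic clock
    (which can underflow early in the process lifetime)."""

    def __init__(self):
        self.last_scanned: Optional[float] = None

    def force_refresh(self):
        self.last_scanned = None

    def needs_rescan(self) -> bool:
        if self.last_scanned is None:
            return True
        return time.monotonic() - self.last_scanned >= REFRESH_INTERVAL

    def mark_scanned(self):
        self.last_scanned = time.monotonic()
```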
Reviewed By: quark-zju
Differential Revision: D23752511
fbshipit-source-id: db89b14a701f238e1c549e497a5d751447115fb2
Summary: Previously, we used the call sign of the repo we import from when checking whether any of the commits are parsed by Phabricator. However, we also use this callsign for other repos when checking Phabricator, which is incorrect. E.g. if fbsource back-syncs to ovrsource, we would have used the FBS callsign when checking Phabricator for both fbsource and ovrsource, but we should use the OVRSOURCE callsign for the ovrsource repo. This diff corrects this by saving the callsigns of the small repos in their SmallRepoBackSyncVars.
Reviewed By: StanislavGlebik
Differential Revision: D23758355
fbshipit-source-id: b322acb2ec589eabed5362bfd6b963e2dd1d6ea9
Summary:
I had originally made the logic around ECHILD very strict,
thinking it impossible to have a situation where it may arise,
but it turns out that our daemonization makes this happen all
the time.
This commit treats an ECHILD return from waitpid as equivalent
to a successful waitpid result and successful child process termination.
Reviewed By: chadaustin
Differential Revision: D23683107
fbshipit-source-id: 7867d636afd8ee79b9f100454f84e7ef480109d8
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/58
This makes test-bookmarks-filler.t pass. Additionally, remove a few tests from exclusion lists as they started to pass.
Reviewed By: ikostia
Differential Revision: D23757401
fbshipit-source-id: eddcda5fd1806d77d0046b6ced3695df6b3d775d
Summary:
We are running out of space on integration test runs on Linux. To avoid that, this change adds some cleanups.
1. Adding `docker rmi $(docker image ls -aq)` frees up 4 GB.
2. Cleaning up `eden_scm` build directory frees up 3 GB.
3. Cleaning up `mononoke` build directory frees up 1 GB.
This diff also includes a fix for run_tests_getdeps.py, where the --rerun flag would run all the "PASSING" tests instead of only the failed ones.
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/57
Reviewed By: krallin
Differential Revision: D23742159
Pulled By: lukaspiatkowski
fbshipit-source-id: 3b5e89ad29c753d585c1a6f01a9a1d6c1e616fbf
Summary: fixes build and test errors on OSS introduced by D23596262 (deb57a25ed)
Reviewed By: ikostia
Differential Revision: D23757086
fbshipit-source-id: 7973ce36b3589cbe21590bd7e19a9828be72128f
Summary: Since the repo_import tool is automated, we need a way to recover when the tool breaks, without restarting the whole process. To do this, I defined a new struct (RecoveryFields) that allows us to keep track of the state. The most important fields are the import_stage (ImportStage), which we need for keeping track of the process and indicating the first stage of recovery, and the cs_ids we use throughout the process. For each stage in importing, we save the state after we have finished that part. This way we can also recover from interrupts. To do process recovery, we only need to use the `recover-process <file_path>` subcommand, where file_path stores the saved state of importing. For a normal run we use the `import [<args>]` subcommand.
Reviewed By: krallin
Differential Revision: D23678367
fbshipit-source-id: c0e0b270ea2ccc499368e54f37550cfa58c03970
Summary:
This change allows us to use warm bookmark cache for all clients except
for external sync job (i.e. the job we use to keep configerator-hg in sync with
configerator-git).
This is useful because we'd like to use the warm bookmark cache for configerator,
but it doesn't work with the external sync job. We'd like to use it because the
warm bookmark cache doesn't advance a bookmark until the revision has shown up in
configerator-hg - this proved to be useful when rolling out configerator for
devservers, since there were tools that talked to hg, and they were failing if
hg was behind.
Currently the hg external sync job doesn't work with the warm bookmark cache
because it tries to incorrectly move master. What I mean by that is that the hg
external sync job sends an unbundle request which contains a pushkey part that
says "move master from commit A to commit B". If commit A is outdated because of
the warm bookmark cache, then this update will just fail, because the master
bookmark actually points to commit C.
Let's just never use the warm bookmark cache for the external sync job.
Reviewed By: aslpavel
Differential Revision: D23754603
fbshipit-source-id: c8eec54bca2224688d4a829ded372c6fc4d7930f
Summary:
Pass the elements to the hasher to avoid needing to alloc a vec from them.
This saves building the vec inside MPathElement, and, when used on top of the smallvec-based MPathElement, also saves allocating a Vec from the SmallVec for each element.
Reviewed By: aslpavel
Differential Revision: D23703342
fbshipit-source-id: dd81c6d69b90f128d697ba847dde34058ad1ea6e
Summary:
Use smallvec for the internal storage of MPathElement.
The previous Bytes had a stack size of 32 bytes plus the text it pointed to.
Using SmallVec we can store up to 24 bytes without allocation, keeping the same size as the previous Bytes object.
Given that most path elements are directory names, and directory names are usually short, this is expected to save both space and allocations.
Reviewed By: farnz
Differential Revision: D23703344
fbshipit-source-id: 39ffc3bd3bb765bd1dbb757b4b1a7782382db909
Summary:
When sending trees and files we try to avoid sending trees that are
available from the main server. To do so, we currently check to see if the
tree/file is from the local store (i.e. .hg/store instead of $HGCACHE).
In a future diff we'll be moving trees to use the Rust store, which doesn't
expose the difference between shared and local stores. So we need to stop
depending on logic to test the local store.
Instead we can test if the commit is public or not, and only send the tree/file
if the commit is not public. This is technically a revert of the 2018 D7992502 (5e95b0e32e)
diff, which stopped depending on phases because we'd receive public commits from
svn that were not public on the server yet. Since svn is gone, I think it's
safe to go back to that way.
This code was mostly there to help when the client was further ahead than another
client and in some commit cloud edge cases, but 1) we don't do much/any p2p
exchange anymore, and 2) we did some work this year to ensure clients have more
up-to-date remote bookmarks during exchange (as a way of making phases and
discovery more reliable), so hopefully we can rely on phases more now.
Reviewed By: quark-zju
Differential Revision: D23639017
fbshipit-source-id: 34c13aa2b5ef728ea53ffe692081ef443e7e57b8
Summary:
Previously the MetadataStore would always construct a mutable pack, even
if the operation was readonly. This meant all read commands required write
access. It also meant that random .tmp files got scattered all over the place
when the Rust structures were not properly destructed (like if Python doesn't
bother doing the final gc to call destructors for the Rust types).
Let's just only create mutable packs when we actually need them.
Reviewed By: quark-zju
Differential Revision: D23219961
fbshipit-source-id: a47f3d94f70adac1f2ee763f3170ed582ef01a14
Summary:
Previously the ContentStore would always construct a mutable pack, even
if the operation was readonly. This meant all read commands required write
access. It also meant that random .tmp files got scattered all over the place
when the Rust structures were not properly destructed (like if Python doesn't
bother doing the final gc to call destructors for the Rust types).
Let's just only create mutable packs when we actually need them.
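The lazy-creation shape, in miniature (illustrative Python; the real code is Rust):

```python
class Store:
    """Defer creating the mutable pack until a write happens, so
    read-only commands never need write access or leave .tmp files."""

    def __init__(self, make_mutable_pack):
        self._make_mutable_pack = make_mutable_pack
        self._mutable = None

    def get(self, key, shared_pack):
        # Pure read path: never touches the mutable pack.
        return shared_pack.get(key)

    def add(self, key, value):
        if self._mutable is None:
            self._mutable = self._make_mutable_pack()
        self._mutable[key] = value
```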
Reviewed By: quark-zju
Differential Revision: D23219962
fbshipit-source-id: 573844f81966d36ad324df03eecec3711c14eafe
Summary:
Some tools, like ShipIt, close stdin before they launch the subprocess.
This causes sys.stdin to be None, which breaks our pycompat buffer read. Let's
handle that.
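The guard is straightforward (a sketch, not the actual pycompat code):

```python
import sys

def read_stdin_bytes():
    """Tools like ShipIt may launch us with stdin closed, in which
    case sys.stdin is None; return empty bytes instead of crashing."""
    if sys.stdin is None:
        return b""
    return sys.stdin.buffer.read()
```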
Reviewed By: quark-zju
Differential Revision: D23734233
fbshipit-source-id: 0adc23cd5a8040716321f6ede0157bc8362d56e0
Summary: This moves the Windows specific bits outside of the dispatcher code.
Reviewed By: chadaustin
Differential Revision: D23655613
fbshipit-source-id: 05b5bb9ed124ae37b6ae43c2a646967724337962
Summary: Similarly to the other one, this will make it possible to interrupt.
Reviewed By: fanzeyi
Differential Revision: D23643100
fbshipit-source-id: 0daab1cec94d0e177bb707d97bf928b05d5d24a3
Summary: Similarly to the other callback, this will make it possible to interrupt.
Reviewed By: fanzeyi
Differential Revision: D23643101
fbshipit-source-id: 9f9a48e752a850c63255b8867b980163cb6a92c9
Summary:
The opendir callback tends to be the most expensive of all, due to having to
fetch the content of all the files. This leads to some frustrating UX, as the
`ls` operation cannot be interrupted. By making this asynchronous, the slow
operation can be interrupted. The future isn't cancelled and thus will
continue to fetch in the background; this will be tackled in a future diff.
Reviewed By: fanzeyi
Differential Revision: D23630462
fbshipit-source-id: f1c4a9fbd9daa18ca4b8f4837c5241a37ccfbcf9
Summary:
Previously, the notification callback code was pretty ad-hoc in how it dealt
with the request context and handled asynchronous callbacks. In order to share
more code with FUSE, let's add a catchErrors method to the PrjfsRequestContext,
similarly to what is done in the FUSE code. Once timeouts and notifications
are added, the catchErrors code will be moved into the parent class and all
of this code will be common between ProjectedFS and FUSE.
Reviewed By: fanzeyi
Differential Revision: D23626748
fbshipit-source-id: 70fae3d4a276be374f58559cc1fb05c8e56e5c2d
Summary:
Completing callbacks asynchronously is as simple as using PrjCompleteCommand
instead of returning the result from the callback. This allows callbacks to be
interrupted/cancelled, which will lead to a better user experience.
For now, the code in the notification callback is very ad-hoc, but most of it
will be refactored to be reused by the other callbacks.
Reviewed By: fanzeyi
Differential Revision: D23611372
fbshipit-source-id: 17d1b8a4cd05706141abbf1e861d74f471537fba
Summary:
This removes a bunch of Windows-specific code from the dispatcher, which brings
us closer to moving it to a non-Windows-specific directory. It also makes the
code more ready for async notification handling, as the dispatcher no longer
waits on the future to complete before returning.
Reviewed By: fanzeyi
Differential Revision: D23611371
fbshipit-source-id: b1a2a6ce0a0be4747423ed75bc8a7aa4b5fa99f4
Summary:
Now that all the pieces are in place, we can plumb the request context in. For
now, this adds it to only one callback while I figure out more about it and
tweak it until I have something satisfactory. There are some rough edges that
I'm not entirely happy about, but as I change the notification callback to be
more async, I'm hoping to make it more convenient to use and less clunky.
Reviewed By: fanzeyi
Differential Revision: D23505508
fbshipit-source-id: d5f12e22a8f67dfa061b8ad82ea718582c323b45
Summary:
There was a time when getRegexExportedValues was not in open source
fb303, but that time has long passed.
Reviewed By: kmancini
Differential Revision: D23745295
fbshipit-source-id: 4702068f0bb7350467e42439444b3f4d75aeec76
Summary:
Since Stub.h now only contains NOT_IMPLEMENTED, let's move it to its own
header outside of the win directory.
Reviewed By: genevievehelsel
Differential Revision: D23696244
fbshipit-source-id: 2dfc3204707e043ee6c89595668c484e0fa8c0d0
Summary:
With this gone, we will be able to rename and move Stub.h outside of the win
directory.
Reviewed By: genevievehelsel
Differential Revision: D23696243
fbshipit-source-id: ea05b10951fa38a77ce38cd6a09a293364dbeec9
Summary:
While the code isn't compiled, this makes the thrift definitions available to
the rest of the code, eliminating the need for a stub for
SerializedInodeMap on Windows.
Reviewed By: genevievehelsel
Differential Revision: D23696242
fbshipit-source-id: 8a42dd2ed16887f3b7d161511e07aaa35fd1b968
Summary:
getuid and getgid are defined as returning uid_t and gid_t. Defining these
types here will prevent downstream consumers from having to redefine them
for Windows.
(Note: this ignores all push blocking failures!)
Reviewed By: yfeldblum, Orvid
Differential Revision: D23693492
fbshipit-source-id: 1ec9221509bffdd5f6d241c4bc08d7809cdb6162
Summary:
There was a bug: if an entry was skipped, we didn't update the counter.
That means we might skip the same entry over and over again.
Let's fix that.
Reviewed By: ikostia
Differential Revision: D23728790
fbshipit-source-id: f323d14c4deba5736ceb8ada7cb7ee48a69c1272
Summary:
Turns out crecord had a help screen. It was broken in Python 3. This
fixes it.
Reviewed By: singhsrb
Differential Revision: D23720798
fbshipit-source-id: 4aade9abb88355c19ee4445de116fdb40d5366bd
Summary: filter returns a generator in Python 3, but we need a list.
Reviewed By: singhsrb
Differential Revision: D23720661
fbshipit-source-id: 8de3f5844bfe8b85b37c44423733fd2a09967397
Summary: This was horribly broken, and we have no tests.
Reviewed By: singhsrb
Differential Revision: D23720984
fbshipit-source-id: 4ad47c767b0d18f700c855a7bb43f38f5c5ef317
Summary:
When I added the surrogateescape patch for the email parser decoder
used during patches, I incorrectly added a corresponding encoder on the other
end when we get the data out of the parser. It turns out the parser is
smart/dumb: when using get_payload() it attempts a few different decodings of
the data and ends up replacing all the non-ascii characters with replacement
characters (question marks). Instead we should use get_payload(decode=True), which
bizarrely actually encodes the data into bytes, correctly detecting the presence
of surrogates and using the correct ascii+surrogateescape encoding.
Reviewed By: singhsrb
Differential Revision: D23720111
fbshipit-source-id: ed40a15056c39730c91067b830f194fbe41e5788
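For illustration (not the actual hg code), the parser behavior described above can be reproduced with the standard library: `get_payload()` re-decodes the surrogate-escaped body lossily, while `get_payload(decode=True)` returns the original bytes.

```python
import email

# A message whose body contains a non-ASCII byte (0xE9)
raw = b"From: a@example.com\nContent-Type: text/plain\n\nhello \xe9 world\n"
msg = email.message_from_bytes(raw)

# get_payload() re-decodes the surrogate-escaped body, substituting
# replacement characters for the non-ASCII bytes
lossy = msg.get_payload()

# get_payload(decode=True) returns bytes with the original byte preserved
exact = msg.get_payload(decode=True)
```

Here `lossy` contains U+FFFD where the 0xE9 byte was, while `exact` round-trips the byte untouched via ascii+surrogateescape.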
Summary:
Previously we added backpressure to the x-repo-sync job so that it waits
until the backsync queue gets empty. However, it's also useful to wait until the hg sync
queue drains for the large repo. This diff makes it possible to do so.
Reviewed By: aslpavel
Differential Revision: D23728201
fbshipit-source-id: 6b198c8d9c35179169a46f2b804f83838edeff1e
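A minimal sketch of such a drain wait (the function and callback names are hypothetical, not the job's actual API):

```python
import asyncio

async def wait_for_hg_sync_queue(get_queue_size, poll_interval=0.01):
    """Hypothetical sketch: block until the sync queue for the
    large repo reports empty, polling periodically."""
    while await get_queue_size() > 0:
        await asyncio.sleep(poll_interval)

# Simulated queue that drains over a few polls
remaining = [3]

async def fake_queue_size():
    size = remaining[0]
    remaining[0] = max(0, size - 1)
    return size

asyncio.run(wait_for_hg_sync_queue(fake_queue_size))
assert remaining[0] == 0
```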
Summary: For fsnodes output the filecontent child nodes first as they can be drained without expanding to more nodes.
Reviewed By: farnz
Differential Revision: D23702268
fbshipit-source-id: 26aeca20d47030dbb9145d406db885fe0fce932c
Summary: Use sorted_vector_map when parsing fsnodes, as inputs are stored sorted; BTree insertion can otherwise be costly when traversing large repos.
Reviewed By: aslpavel
Differential Revision: D23691500
fbshipit-source-id: 1f7a5faf2ef3cb4a72a635d0d8e89037bf4d96b3
Summary:
We are currently logging only the outermost underlying error or context, not any of the lower-level causes. This makes mononoke_blobstore_trace less useful!
This changes it to use anyhow's alternate format, which includes causes.
Reviewed By: krallin
Differential Revision: D23708577
fbshipit-source-id: fa2e71734841e2b75d824c456dccf61c1fb13fd2
Summary:
Instead of using force_set and force_delete, let's use create(), update() and
delete() calls.
Reviewed By: ikostia
Differential Revision: D23704245
fbshipit-source-id: 40bcfd906c4f61a860e5ec8312cddc0d80ea94ae
Summary:
Original commit changeset: f4816b303e19
The newer version of Watchman that understands the new mount type name
won't release in time, so back this out for now.
Reviewed By: fanzeyi
Differential Revision: D23720167
fbshipit-source-id: 588541e1d9093533611d1f32b319d2562318506a
Summary:
This test is flaky due to `hg up` not always reading data from the stores, and
thus not always failing to read the LFS blob. A better way to force a read
from the stores is to simply use `hg log -p`.
Reviewed By: DurhamG, singhsrb
Differential Revision: D23718823
fbshipit-source-id: 98bc37a76e93a67d031ba7bfa124b1db816983a1
Summary: The files use Python 3 only syntax and are not really used. Skip them so the Python 2 build won't hit invalid syntax issues.
Reviewed By: chadaustin
Differential Revision: D23717662
fbshipit-source-id: f911a83937be9ccc40194f321e3b41625a68e703
Summary:
Running `setup.py` with Python 3 for a Python 2 build will cause issues, as
`setup.py` writes `.pyc` files in the Python 3 format.
Reviewed By: chadaustin
Differential Revision: D23717661
fbshipit-source-id: 38cfabdfdf20424a21f8a5bdaf826e74da2304ac
Summary:
tpx doesn't support heavyweight tags or rate limiting, and integration
tests regularly fail with timeouts on my devbig, so bump the process
start and process stop timeouts.
Reviewed By: genevievehelsel
Differential Revision: D23553924
fbshipit-source-id: fa9b8710395d61b087963d18718137e4525ae03d
Summary:
30 seconds is not enough time on heavily contended systems, including
CI. Bump the shutdown timeout to 120 seconds.
Also, correctly send SIGKILL to the daemon process when it's been
started with sudo.
Reviewed By: simpkins
Differential Revision: D22422784
fbshipit-source-id: dc7be0962705f1feb9643990309f570e352b68a0
Summary:
This is the function that was used in repo_import tool to wait until hg sync
has processed all of the entries in the queue. Let's move it to the hg sync
helper lib so that it can be used in other places. E.g. I'd like to use it in
the next diffs in mononoke_x_repo_sync_job.
Reviewed By: krallin
Differential Revision: D23708280
fbshipit-source-id: ea846081d89b55b0d2f5407c971e13869cedfd8b
Summary:
This stack updates EdenFS to be able to check all of the locations where a
user's certificate may reside.
THRIFT_TLS_CL_CERT_PATH is usually set to the location of the user's x509
certs, so it seems best to check this location. In order to be able to check
it, we need to be able to resolve the environment variable in our config
parsing.
Reviewed By: wez, genevievehelsel
Differential Revision: D23359815
fbshipit-source-id: 2008cc52ab64d23dbcfda41292a60a4bf77a80df
Summary:
In preparation of moving away from SSH as an intermediate entry point for
Mononoke, let Mononoke work with newly introduced Metadata. This removes any
assumptions we now make about how certain data is presented to us, making the
current "ssh preamble" no longer central.
Metadata is primarily based around identities and provides some
backwards-compatible entry points to make sure we can satisfy downstream
consumers of commits like hooks and logs.
Similarly, we now do our own reverse DNS resolution instead of relying on what's
been provided by the client. This is done asynchronously and we don't rely
on the result, so Mononoke can keep functioning in case DNS is offline.
Reviewed By: farnz
Differential Revision: D23596262
fbshipit-source-id: 3a4e97a429b13bae76ae1cdf428de0246e684a27
Summary:
ghostbooleans
Apparently I didn't test the positive case in my previous diff that introduced this check :(
Reviewed By: xavierd
Differential Revision: D23698179
fbshipit-source-id: 95a28cc13bff5e325214b6a398e19c821b5ae17f
Summary: We only care about the files we need when recording prefetch profiles (since we don't want to fetch top level directories). So let's skip recording `Tree` object types.
Reviewed By: kmancini
Differential Revision: D23693533
fbshipit-source-id: 9af5437ff6571a34597425ca5f657e7126671ba9
Summary: Support multiple heads in `BonsaiDerived::find_all_underived_ancestors`. This change will be needed to remove the manual step of fetching all changesets in the `backfill_derived_data` utility.
Reviewed By: StanislavGlebik
Differential Revision: D23705295
fbshipit-source-id: 32aa97a77f0a4461cbe4bf1864477e3e121e1879
Summary:
As it says in the title, this adds support for receiving compressed responses
in the revisionstore LFS client. This is controlled by a flag, which I'll
roll out through dynamicconfig.
The hope is that this should greatly improve our throughput to corp, where
our bandwidth is fairly scarce.
Reviewed By: StanislavGlebik
Differential Revision: D23652306
fbshipit-source-id: 53bf86d194657564bc3bd532e1a62208d39666df
Summary:
This adds support for compressing responses in the LFS Server, based on what
the client sent in `Accept-Encoding`. The compression changes are fairly
simple. Most of the code changes are around the fact that when we compress,
we don't send a Content-Length (because we don't know how long the content will
be).
Note that this is largely implemented in StreamBody. This means it can be used
for free by the EdenAPI server as well. The reason it's in there is because we
need to avoid setting the Content-Length when compression is going to be used
(`StreamBody` is what takes charge of doing this). This also exposes a
callback to get access to the stream post-compression, which also needs to be
exposed in `StreamBody`, since that's where compression happens.
Reviewed By: aslpavel
Differential Revision: D23652334
fbshipit-source-id: 8f462d69139991c3e1d37f392d448904206ec0d2
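A rough sketch of the header logic described above (the function name and the encoding preference order are illustrative, not the server's actual code):

```python
def build_response_headers(accept_encoding, body_length):
    """Hedged sketch: if the client accepts a compression we support,
    set Content-Encoding and omit Content-Length, since the compressed
    size is unknown until the stream finishes."""
    offered = {e.strip() for e in accept_encoding.split(",") if e.strip()}
    for encoding in ("zstd", "gzip"):  # hypothetical preference order
        if encoding in offered:
            # Compressed response: length unknown up front
            return {"Content-Encoding": encoding}
    # Identity response: length is known, so advertise it
    return {"Content-Length": str(body_length)}
```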
Summary:
This imports the async-compression crate. We have an equivalent-ish in
common/rust, but it targets Tokio 0.1, whereas this community-supported crate
targets Tokio 0.2 (it offers a richer API, notably in the sense that we
can use it for Streams, whereas the async-compression crate we have is only for
AsyncWrite).
In the immediate term, I'd like to use this for transfer compression in
Mononoke's LFS Server. In the future, we might also use it in Mononoke where we
currently use our own async compression crate when all that stuff moves to
Tokio 0.2.
Finally, this also updates zstd: the version we link to from tp2 is actually
zstd 1.4.5, so it's a good idea to just get the same version of the zstd crate.
The zstd crate doesn't keep a great changelog, so it's hard to tell what has changed.
At a glance, it looks like the answer is not much, but I'm going to look to Sandcastle
to root out potential issues here.
Reviewed By: StanislavGlebik
Differential Revision: D23652335
fbshipit-source-id: e250cef7a52d640bbbcccd72448fd2d4f548a48a
Summary: That might be used to pass more data to the server
Reviewed By: markbt
Differential Revision: D23704722
fbshipit-source-id: a6e41d615f6548f2f8fd036814c59573a45f93bc
Summary: New-style async/await can mutate variables, so we no longer need synchronization for these counters
Reviewed By: ikostia
Differential Revision: D23704765
fbshipit-source-id: eb2341cb0c82b8a49c28ad3c8fd811ed3af73436
Summary:
This would let us allow only certain bookmarks to be remapped from a small
repo to a large repo.
Reviewed By: krallin
Differential Revision: D23701341
fbshipit-source-id: cf17a1a21b7594a94c5fb117065f7d9298c8d1af
Summary:
Previously we used target repo for a commit from a source repo. This diff fixes
it.
Reviewed By: krallin
Differential Revision: D23685171
fbshipit-source-id: 4aa105aec244ebcff92b7b71a6cb22dd8a10d2e5
Summary: Add a test to detect any unexpected changes in MPathElement size
Reviewed By: farnz
Differential Revision: D23703345
fbshipit-source-id: 74354f0861b048ee4611304fc99f0289bce4a7a5
Summary:
Facebook
We need them since we are going to sync ovrsource commits into fbsource
Reviewed By: krallin
Differential Revision: D23701667
fbshipit-source-id: 61db00c7205d5d4047a4040992e7195f769005d3
Summary: Noticed there had been an upstream 3.11.10 release with a fix for a performance regression in 3.11.9, PR was https://github.com/xacrimon/dashmap/issues/100
Reviewed By: krallin
Differential Revision: D23690797
fbshipit-source-id: aff3951e5b7dbb7c21d6259e357d06654c6a6923
Summary:
EdenFS is adding a Python 3 Thrift client intended for use by other
projects, and the Mercurial Python 2 build doesn't understand Python 3
syntax files, so switch the default getdeps build to Python 3.
Reviewed By: quark-zju
Differential Revision: D23587932
fbshipit-source-id: 6f47f1605987f9b37f888d29b49a848370d2eb0e
Summary: These headers aren't needed and at best slow down compile time; remove them.
Reviewed By: chadaustin
Differential Revision: D23693491
fbshipit-source-id: 4aebdfbbe56897623f62017bd498dc5c90ea6532
Summary:
This was only used in EdenMount.h, to declare a method that was not compiled on
Windows, let's ifdef that method instead.
Reviewed By: chadaustin
Differential Revision: D23693494
fbshipit-source-id: 1eda62f2ae3a38a30aa0b517911635ef3d3896c2
Summary:
The ProcessNameCache code is compiled on Windows now; this definition could
cause issues with different cpp files compiling different versions of
ProcessNameCache. To avoid this, let's remove it from Stub.h, which also
removes a bunch of #ifdef.
Reviewed By: chadaustin
Differential Revision: D23693490
fbshipit-source-id: 8f3f7b1128235b9a60f850e688b9e98910c202fc
Summary: This is not needed, remove it.
Reviewed By: chadaustin
Differential Revision: D23693489
fbshipit-source-id: 0d7674f3001410b2d9ff02ef95049c5391d8528c
Summary: This code is the same as the service/oss/main.cpp, no need to keep this one around.
Reviewed By: chadaustin
Differential Revision: D23689607
fbshipit-source-id: bb72a0623dcdb36beca40c3766e8d6817b99dea2
Summary:
This stack updates EdenFS to be able to check all of the locations where a
user's certificate may reside.
There can be multiple places where a cert may reside (we can't always
definitively choose one place to look based on the platform). Thus we
need to be able to configure multiple locations for certs in our eden
config.
This means we need to be able to parse a list of options for a key in our
config parsing.
**Disclaimer this is really icky**
Our `FieldConverter` interface takes a string to parse. So this means
that after parsing the config file for each value we have to re-serialize it
into a string to pass it in here. Previously we only supported string and
bool values so this re-serialization was not too terrible. Now that we want
to support arrays this re-serialization is extra gross. To minimize the grossness,
I am reusing cpptoml for serializing / deserializing around the `FieldConverter`
interface.
Long term it would be better if FieldConverter took a cpptoml::base or
something more generic instead of a string so we don't have to do this.
But that will be a big refactor, and I don't currently have bandwidth for it :(
Reviewed By: wez
Differential Revision: D23359928
fbshipit-source-id: 7c89de485706dd13a05adf19df28425d2c1756a8
Summary:
This test can't be non-flaky, because it relies on the kernel deciding
when to drop inodes from cache, and we've investigated it multiple
times. Given it tests a rarely-used function that would be better
expressed as a unit test in C++, just remove it for now.
Reviewed By: wez
Differential Revision: D23665455
fbshipit-source-id: 522e47113857eff399be4f2bb60e26e801d61e9a
Summary: For ease of consumption, remove the descriptive line and the extra newline at the bottom of the generated prefetch profile. Also, sort the files for smaller generated diffs upon iteration.
Reviewed By: kmancini
Differential Revision: D23683153
fbshipit-source-id: e2bd510d5fbd7095f199e70b2556b84e0984a914
Summary:
We've often had cases where we need to nuke people's caches for various
reasons. It's a huge pain since we haven't had a way to communicate with all hg
clients. Now that we have configerator dynamicconfigs, we can use that to reach
all clients.
This diff adds support for configs like:
```
[hgcache-purge]
foo=2020-08-20
```
The key, 'foo' in this case, is an identifier used to only run this purge once.
The value is a date after which this purge will no longer run. This is useful
for bounding the damage from forgetting about a purge and having it delete caches
over and over in the future for new repos or repos where the run-once marker
file is deleted for some reason.
Reviewed By: quark-zju
Differential Revision: D23044205
fbshipit-source-id: 8394fcf9ba6df09f391b5317bad134f369e9b416
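A minimal sketch of the gating described above (the function name and arguments are hypothetical, not the extension's actual code):

```python
from datetime import date

def should_run_purge(key, expiry_str, already_run):
    """Hypothetical sketch: run a purge at most once per key,
    and never after its expiry date."""
    if key in already_run:
        # The run-once marker exists; don't purge again
        return False
    # The config value is a date like "2020-08-20"; skip expired purges
    return date.today() <= date.fromisoformat(expiry_str)
```

The expiry check is what bounds the damage: even if the run-once marker is lost, an expired entry can never trigger the purge again.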
Summary:
getConfigStat had a bug where it, instead of clearing the bits of
*configStat, cleared the bits of the pointer itself. This caused the
stat struct for missing files to be uninitialized memory, causing
configs to reload. Write a test and fix the bug.
Reviewed By: xavierd
Differential Revision: D23645087
fbshipit-source-id: ad42f7ec1b313f668604e3a7f6c8200f6b94b23d
Summary:
While hacking on some code, I ran into a situation where some
zero-initialized stat structs weren't actually being zeroed. This was
either a compiler bug or a situation where the build system was not
correctly rebuilding everything after my changes, and I did not have
enough disassembly available to investigate.
Either way, since this code assumes zero bits in some nonobvious ways,
explicitly assert they are.
Reviewed By: xavierd
Differential Revision: D23644819
fbshipit-source-id: eb6bff9ff997379113db1e1bf9d6a0a538f10f0b
Summary:
This just makes it more convenient to use in circumstances where a function
expects `impl LCAHint` and we have an `Arc<dyn LCAHint>`.
Reviewed By: farnz
Differential Revision: D23650380
fbshipit-source-id: dacbfcafe6f3806317a81ed4677af190027f010b
Summary:
We noticed spurious config file reloads, so add some logging to help
track that down.
Reviewed By: xavierd
Differential Revision: D23644447
fbshipit-source-id: 9953a17de402660c7f6491fb9abd8d702fa290e8
Summary:
GNU `df` (and any other coreutil that relies on gnulib's ME_REMOTE
flag) detects remote filesystems with some heuristics. One of which is
whether the device type contains a colon. Since edenfs is a remote
filesystem, include a colon, so it's properly detected as such.
Reviewed By: genevievehelsel
Differential Revision: D23520233
fbshipit-source-id: f4816b303e198d4e2a446efdcc5b49a593e09a05
Summary:
We intend to rename the edenfs device type to include a colon (and
possibly the backing repo basename). In preparation, update code that
detects edenfs mounts to include anything that starts with "edenfs:".
Reviewed By: genevievehelsel
Differential Revision: D23520008
fbshipit-source-id: 280f7617d5c96e23d548041b3482bca388076a7b
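The detection rule can be sketched as (illustrative, not the actual code):

```python
def is_edenfs_device(device_type):
    """Sketch of the detection described above: accept the legacy
    name and anything starting with the new "edenfs:" prefix."""
    return device_type == "edenfs" or device_type.startswith("edenfs:")
```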
Summary:
For some unknown reason, we weren't setting this on Windows, which meant that
whenever edenfs would need to call edenfsctl (like at mount time when fixing up
redirections), it would always use the binary found in the PATH. While in most
cases this is OK, this is not the intended behavior for tests that are expected
to use the just compiled binary, not the one in the PATH.
Reviewed By: genevievehelsel
Differential Revision: D23653027
fbshipit-source-id: f1cc977e44b10c379d2b90bc7972bfec1fccad23
Summary: Previously, `SignalStream` would assume that an error would always end the `Stream`, and would therefore stop and report the amount of transferred data upon encountering any error. This isn't always the desired behavior, as it is possible for `TryStream`s to return mid-stream errors without immediately ending. `SignalStream` should allow for this kind of usage.
Reviewed By: farnz
Differential Revision: D23643337
fbshipit-source-id: 2c7ffd9d02c05bc09c6ec0e282c0b2cca166e079
Summary:
Similarly to FuseRequestContext, this will be allocated whenever ProjectedFS
calls onto EdenFS to keep track of timing, stats, etc.
For now, this just holds the callback data passed in, we need to copy it as
ProjectedFS will deallocate it when the EdenFS callback returns, but since we
intend to complete these asynchronously, the callback data needs to outlive the
callback, hence the copy. It's likely that this is copying too much, and only
part of it actually needs to be copied, this will be tackled later.
Reviewed By: wez
Differential Revision: D23505511
fbshipit-source-id: ece00183e3194611d3d63465878470d6e53b790c
Summary: A following diff will make use of the channel.
Reviewed By: wez
Differential Revision: D23505510
fbshipit-source-id: c044fff51c8771b1ead86333317e5c617184075c
Summary:
Currently the getbundle implementation works this way: it accepts two lists of
changesets, a list of `heads` and a list of `common`. getbundle needs to return
enough changesets to the client so that the client has `heads` and all of their
ancestors. `common` is a hint to the server that tells which nodes the client
already has and doesn't need sent.
The current getbundle implementation is inefficient when any of the heads have a low
generation number while we also have a lot of excludes with high generation
numbers. In that case it needs to find the ancestors of the excludes that are
at or below the generation of the low-generation head.
This diff is a hacky attempt to improve that. It does so by splitting the heads
into heads with high generation numbers and heads with low generation numbers,
sending separate getbundle requests for each of them and then combining the
results.
Example
```
O <- head 1
|
O <- exclude 1
|
... O <- head 2 <- low generation number
| |
O O <- exclude 2
```
If we have a request like `(heads: [head1, head2], excludes: [exclude1, exclude2])` then this optimization splits it into two requests: `(heads: [head1], excludes: [exclude1, exclude2])` and `(heads: [head2], excludes: [exclude2])`, and then combines the results. Note that it might result in overfetching, i.e. returning many more commits to the client than it requested. This might make a request much slower, and to combat that I suggest adding a tunable getbundle_low_gen_optimization_max_traversal_limit that prevents overfetching.
Reviewed By: krallin
Differential Revision: D23599866
fbshipit-source-id: fcd773eb6a0fb4e8d2128b819f7a13595aca65fa
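A toy sketch of the splitting described above. Here generation numbers alone approximate which excludes matter for the low-generation heads; all names and the helper itself are illustrative, not the server's code:

```python
def split_getbundle_request(heads, excludes, gen, low_gen_threshold):
    """Hypothetical sketch of the head-splitting optimization.
    `gen` maps each node to its generation number."""
    high = [h for h in heads if gen[h] > low_gen_threshold]
    low = [h for h in heads if gen[h] <= low_gen_threshold]
    requests = []
    if high:
        # High-generation heads keep the full exclude list
        requests.append((high, list(excludes)))
    if low:
        # For low-generation heads, drop excludes far above them;
        # approximated here by generation number alone
        max_low_gen = max(gen[h] for h in low)
        requests.append((low, [e for e in excludes if gen[e] <= max_low_gen]))
    return requests

# The example from the summary: head1/exclude1 high, head2/exclude2 low
gen = {"head1": 100, "exclude1": 99, "head2": 10, "exclude2": 9}
reqs = split_getbundle_request(
    ["head1", "head2"], ["exclude1", "exclude2"], gen, low_gen_threshold=50
)
assert reqs == [
    (["head1"], ["exclude1", "exclude2"]),
    (["head2"], ["exclude2"]),
]
```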
Summary:
This completely converts mercurial changeset to be an instance of derived data:
- Custom lease logic is removed
- Custom changeset traversal logic is removed
Naming scheme of keys for leases has been changed to conform with other derived data types. This might cause temporary spike of cpu usage during rollout.
Reviewed By: farnz
Differential Revision: D23575777
fbshipit-source-id: 8eb878b2b0a57312c69f865f4c5395d98df7141c
Summary:
The shed/sql library is used mainly to communicate with a Mysql db and to provide a nice abstraction layer around mysql (which is used in production) and sqlite (integration tests). The library provided an interface that was backed on the Mysql side by raw connections and by MyRouter.
This diff introduces a new backend - the new Mysql client for Rust.
The new backend is exposed as a third variant in the current model: sqlite, mysql (raw conn and myrouter) and mysql2 (new client). The main reason for that is that the current shed/sql interface for Mysql
(1) heavily depends on the mysql_async crate, (2) introduces much more complexity than needed for the new client, and (3) it seems like this will be refactored and cleaned up later, with old things being deprecated.
So, to not overcomplicate things by trying to implement the given interface for the new Mysql client, I simplified things by adding it as a third backend option.
Reviewed By: farnz
Differential Revision: D22458189
fbshipit-source-id: 4a484b5201a38cc017023c4086e9f57544de68b8
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/56
In addition to adding a flag as in the commit title, the "test-scs*" tests are now excluded via a pattern rather than listed out. This will help in the future when more tests using scs_server are added, as the author won't have to add them to the exclusion list.
Reviewed By: farnz
Differential Revision: D23647298
fbshipit-source-id: f5c263b9f68c59f4abf9f672c7fe73b63fa74102
Summary:
lookup has special support for requests that start with _gitlookup. It remaps
an hg commit to a git commit and vice versa. Let's support that in mononoke as well.
Reviewed By: krallin
Differential Revision: D23646778
fbshipit-source-id: fcde58500b5956201e718b0a609fc3fee1bbdd28
Summary:
`hg cloud rejoin` is used in fbclone
By providing a bit more information about the workspaces available we can improve user
experience and try to eliminate the confusion multiple workspaces cause.
Reviewed By: mitrandir77
Differential Revision: D23623063
fbshipit-source-id: 7598c1b58597032c9cfcef0b44b0ec1b00510ffa
Summary: This diff refactors the tool to use changeset_ids instead of changesets, since the functionalities of the tool always use the ids over the changesets. For process recovery (next steps) it's more ideal to save the changeset_ids, since they are simpler to store. However, if we used changesets in the tool, we would need to load them with the saved ids just to only use the ids throughout the process. Therefore, this loading step becomes redundant and we should simply use the ids.
Reviewed By: krallin
Differential Revision: D23624639
fbshipit-source-id: d9b558ebb46c0670bd09556783060f12d3a279ed
Summary: Add a Scuba handler for the EdenAPI server, which will allow the server to log custom columns to Scuba via `ScubaMiddleware`. For now, the only application specific columns are `repo` and `method`.
Reviewed By: krallin
Differential Revision: D23619437
fbshipit-source-id: f08aaf9c84657b4d92f1a1cfe5f0cb5ccd408e5e
Summary: Add `OdsMiddleware` to the EdenAPI server to log various aggregate request stats to ODS. This middleware is directly based on the `OdsMiddleware` from the LFS server (which unfortunately can't be easily generalized due to the way in which the `stats` crate works).
Reviewed By: krallin
Differential Revision: D23619075
fbshipit-source-id: d361c73d18e0d1cb57347fd24c43bdb68fb7819d
Summary:
Add a new `HandlerInfo` struct that can be inserted into the request's `State` by each handler to provide information about the handler to post-request middleware. (This includes things like which handler ran, which repo was queried, etc.)
A note on design: I did consider adding these fields to `RequestContext` (similar to how it's done in the LFS server), but that proved somewhat problematic in that it would require using interior mutability to avoid forcing each handler to borrow the `RequestContext` mutably (and therefore prevent it from borrowing other state data). The current approach seems preferable to adding a `Mutex` inside of `RequestContext`.
Reviewed By: krallin
Differential Revision: D23619077
fbshipit-source-id: 911806b60126d41e2132d1ca62ba863c929d4dc9
Summary:
The corpus rev that biggrep has indexed may not be available in the
local client. Later on in the function it will pull that revision, but earlier
in the function the new logic I added a few weeks ago is just crashing.
That logic was trying to diff against the earlier revision, but that's pretty
arbitrary. Let's just diff against one of the revs at random
(deterministically) and get rid of the need for the hash to exist in the repo
early in the command.
Reviewed By: sfilipco
Differential Revision: D23635801
fbshipit-source-id: 1c284d710b8df9539a696e900183bc10d5d71869
Summary:
The AccessCount fields were recently renamed to start with fsChannel instead of
fuse to be more platform independent. This broke edenfsctl, which was still
using the old names.
Reviewed By: wez
Differential Revision: D23633574
fbshipit-source-id: 2a5fc73c47d2f0a6db407ecfeaf85992b7932c10
Summary:
Readdir tries to be smart and prefetches the metadata for each of the children.
But this uses the old path to read file metadata, which can cause EdenFS to
download the blob. When metadata prefetching is turned on in the backing store,
it is better to leave this to the backing store's prefetching.
Reviewed By: wez
Differential Revision: D23476876
fbshipit-source-id: 41cc5e6f423f19adb18581564c069c12621b6c1b
Summary:
This diff fixes a bug on Windows when the redirection target is a non-empty directory. As seen in P141872812
This doesn't make the exception go away but generates a more meaningful error message so the user can act on it.
Reviewed By: xavierd
Differential Revision: D23605233
fbshipit-source-id: 2d2bde0e9cd94323a6537ebcec29a4c15868806d
Summary: Once we start to move the bookmark for the large repo commits, small repo commits should also start to appear for the dependent systems (e.g. Phabricator) through back-syncing. This diff adds this functionality to see if the commits have been recognised by the tools.
Reviewed By: StanislavGlebik
Differential Revision: D23566994
fbshipit-source-id: 2f6f3b9099bb864fec6a488064abe1abe7f06813
Summary: Useful when looking into blobstore corruption - you can compare all the blobstore versions by fetching them manually.
Reviewed By: krallin
Differential Revision: D23604436
fbshipit-source-id: 7b56947b0188536499514bae6615c6e81b9106c3
Summary: Going to add more features, so simplify by asyncifying first
Reviewed By: krallin
Differential Revision: D23604437
fbshipit-source-id: 52b2b372e4d3fbf1d59168c6c11311d9edf4ff0f
Summary: When we're scrubbing blobstores, it's not actually a success state if a scrub fails to write. Report this back to the caller - no-one will usually be scrubbing unless they expect repair writes to succeed, and a failure is a sign that we need to investigate further
Reviewed By: mitrandir77
Differential Revision: D23601541
fbshipit-source-id: d328935af9999c944719a6b863d0c86b28c54f59
Summary:
One test was fixed earlier by switching MacOS to use a modern version of bash; the other is fixed here by installing "nmap" and using "ncat" from within it on both Linux and Mac.
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/50
Reviewed By: krallin
Differential Revision: D23599695
Pulled By: lukaspiatkowski
fbshipit-source-id: e2736cee62e82d1e9da6eaf16ef0f2c65d3d8930
Summary:
Fixes a few issues with Mononoke tests in Python 3.
1. We need to use different APIs to account for the unicode vs bytes difference
for path hash encoding.
2. We need to set the language environment for tests that create utf8 file
paths.
3. We need the redaction message and marker to be bytes. Oddly this test still
fails with jq CLI errors, but it makes it past the original error.
Reviewed By: quark-zju
Differential Revision: D23582976
fbshipit-source-id: 44959903aedc5dc9c492ec09a17b9c8e3bdf9457
Summary:
For repositories that have the old-style LFS extension enabled, the pointers
are stored in packfiles/indexedlog alongside with a flag that signify to the
upper layers that the blob is externally stored. With the new way of doing LFS,
pointers are stored separately.
When both are enabled, we are observing some interesting behavior where
different get and get_meta calls may return different blobs/metadata for the
same filenode. This may happen if a filenode is stored both in a packfile as an
LFS pointer, and in the LFS store.
deterministic in this situation is unfortunately way too costly (a get_meta
call would for instance have to fully validate the sha256 of the blob, and this
wouldn't guarantee that it wouldn't become corrupted on disk before calling
get).
The solution taken here is to simply ignore all the lfs pointers from
packfiles/indexedlog when remotefilelog.lfs is enabled. This way, there is no
risk of reading the metadata from the packfiles, and the blob from the
LFSStore. This brings however another complication for the user created blobs:
these are stored in packfiles and would thus become unreadable, the solution is
to simply perform a one-time full repack of the local store to make sure that
all the pointers are moved from the packfiles to the LFSStore.
In the code, the Python bindings are using ExtStoredPolicy::Ignore directly as
these are only used in the treemanifest code where no LFS pointers should be
present, the repack code uses ExtStoredPolicy::Use to be able to read the
pointers, it wouldn't be able to otherwise.
Reviewed By: DurhamG
Differential Revision: D22951598
fbshipit-source-id: 0e929708ba5a3bb2a02c0891fd62dae1ccf18204
Summary:
hg-http's built client should provide integration with Mercurial's stats
collection mechanisms.
Reviewed By: kulshrax
Differential Revision: D23577867
fbshipit-source-id: 93c777021bc347511322269d678d6879710eed3e
Summary:
Add `with_stats_reporting` to HttpClient. It takes a closure that will be
called with all `Stats` objects generated. We then use this function in
the hg-http crate to integrate with the metrics backend used in Mercurial.
Reviewed By: kulshrax
Differential Revision: D23577869
fbshipit-source-id: 5ac23f00183f3c3d956627a869393cd4b27610d4
Summary: Rust based metrics so that even Rust libraries can write metrics.
Reviewed By: quark-zju
Differential Revision: D23577870
fbshipit-source-id: b19904968d9372c8ce19775fb37c7af53a370ea5
Summary:
We start off simple here. Python only really has counters so we only implement
counters. There are a lot of options on how to improve this and things get
slightly complicated when we look at the whole ecosystem and fb303. Anyway,
simple start.
Reviewed By: quark-zju
Differential Revision: D23577874
fbshipit-source-id: d50f5b2ba302d900b254200308bff7446121ae1d
Summary:
Slash is probably the standard metric delimiter nowadays. Since we don't have
that many metrics I think that it makes sense to look at slash as the
standard metric delimiter going forward.
This diff updates parsing of metric names to treat both '_' and '/' as
delimiters.
Reviewed By: quark-zju
Differential Revision: D23577876
fbshipit-source-id: 03997b1285df9c52d6e2837b5af5372deb69b133
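The delimiter handling can be sketched as (illustrative; not the actual parser, and the metric names are made up):

```python
import re

def split_metric_name(name):
    """Sketch of the parsing change described above:
    treat both '_' and '/' as metric-name delimiters."""
    return [part for part in re.split(r"[_/]", name) if part]
```

Both spellings of a metric name then parse identically, e.g. `"ssl_certs_found"` and `"ssl/certs/found"`.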
Summary:
The command is easier to use than `hg cloud join --switch`.
Also highlight the workspace name in the output of `hg cloud status`
Reviewed By: mitrandir77
Differential Revision: D23601507
fbshipit-source-id: 74eb17c9366a9dbe96881c8e3e0705619fadb3d6
Summary:
typed_hash only implements Serialize. Because of this, if we want to serialize a struct that contains e.g. ChangesetId(s), we can't deserialize it later. This diff adds a Deserialize implementation for typed hashes.
Implementation is similar to HgNodeHash's: https://fburl.com/diffusion/r3df5iga
Reviewed By: krallin
Differential Revision: D23598925
fbshipit-source-id: 4d48b75eb8a01028e6e2d9bcc1ae20051a97b7fb
Summary:
Streaming clone implementation did not check that the received files were correct. This change addresses that.
Before this change, if the connection was interrupted for whatever reason, the client would treat the changeset fetch as successful and proceed with the cloning operations, but later checks would report corruption of hg's internal state. This is based on a user [report](https://fb.workplace.com/groups/scm/permalink/3177150312334567/).
Reviewed By: quark-zju, krallin
Differential Revision: D23572058
fbshipit-source-id: d740b45ca217cd6db0a65e01aabc2ba9a4835221
Summary: The Mercurial codebase uses hyphens in crate names rather than underscores. This is similar to the convention favored by the larger Rust community, though it differs from Mononoke, which uses underscores. While we'll probably need to eventually settle on a consistent convention for all of the projects in the Eden SCM repo, for now `http_client` should be made consistent with the adjacent crates.
Reviewed By: sfilipco
Differential Revision: D23585721
fbshipit-source-id: d2e690d86815be02d7b8d645198bcd28e8cbd6e0
Summary:
While it's unlikely to work correctly (it uses /bin/sh), it compiles properly on
Windows, so let's include it in the build.
Reviewed By: wez
Differential Revision: D23520368
fbshipit-source-id: 267ba04f98f5dacc81e1772f86f5ad43c846815d
Summary:
We are using an older version of tokio which spawns as many threads as we have
physical cores instead of the number of logical cores. This was fixed in
https://github.com/tokio-rs/tokio/issues/2269, but we can't use it yet because
we are waiting for another fix to be released:
https://github.com/rust-lang/futures-rs/pull/2154.
For now, let's hardcode the thread count in Mononoke.
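The workaround amounts to passing an explicit worker-thread count when building the runtime. A dependency-free sketch of computing that count follows; the tokio builder call is only shown in a comment because its exact method names vary across tokio versions:

```rust
use std::thread;

// Number of logical cores, which is what newer tokio versions default to.
fn logical_cores() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}

// With tokio this would be wired up roughly like (indicative only;
// builder method names differ between tokio versions):
//
//     let runtime = tokio::runtime::Builder::new_multi_thread()
//         .worker_threads(logical_cores())
//         .build()?;
```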
Reviewed By: krallin
Differential Revision: D23599140
fbshipit-source-id: 80685651a7a29ba8938d9aa59770f191f7c42b8b
Summary:
This change moves the logic associated with mercurial changeset derivation to the `mercurial_derived_data` crate.
NOTE: it is not converted to the derived data infrastructure at this point; it is a preparation step for actually doing that.
Reviewed By: farnz
Differential Revision: D23573610
fbshipit-source-id: 6e8cbf7d53ab5dbd39d5bf5e06c3f0fc5a8305c8
Summary:
The Mac integration test workflow already installs a modern curl that fixes https://github.com/curl/curl/issues/4801, but it does so after "hg" is built, so "hg" uses the system curl libraries, which fail when used with a certificate not present in the keychain.
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/53
Reviewed By: krallin
Differential Revision: D23597285
Pulled By: lukaspiatkowski
fbshipit-source-id: a7b8b6ae55ce338bfb9946a852cbb6b929e73203
Summary:
There are blobs that fail to scrub and terminate the process early for a variety of reasons; when this is running as a background task, it'd be nice to get the remaining keys scrubbed, so that you don't have a large number of keys to fix up later.
Instead of simply outputting to stdout, write keys to one of three files in the format accepted on stdin:
1. Success; you can use `sort` and `comm -3` to remove these keys from the input data, thus ensuring that you can continue scrubbing.
2. Missing; you can look at these keys to determine which blobs are genuinely lost from all blobstores, and fix them up.
3. Error; these will need to be run through scrub again to determine what's broken.
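Step 1 above is just a set difference over key lists: `sort` plus `comm -3` does it with standard tools, and the same operation looks like this in Rust (illustrative helper, not scrub's actual code):

```rust
use std::collections::HashSet;

// Remove already-scrubbed keys from the input so a re-run only
// touches what is left, preserving the input order.
fn remaining_keys<'a>(input: &[&'a str], succeeded: &[&'a str]) -> Vec<&'a str> {
    let done: HashSet<&str> = succeeded.iter().copied().collect();
    input.iter().copied().filter(|key| !done.contains(key)).collect()
}
```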
Reviewed By: krallin
Differential Revision: D23574855
fbshipit-source-id: a613e93a38dc7c3465550963c3b1c757b7371a3b
Summary:
With three blobstores in play, we have issues working out exactly what's wrong during a manual scrub. Make the error handling better:
1. Manual scrub adds the key as context for the failure.
2. Scrub error groups blobstores by content, so that you can see which blobstore is most likely to be wrong.
Reviewed By: ahornby, krallin
Differential Revision: D23565906
fbshipit-source-id: a199e9f08c41b8e967d418bc4bc09cb586bbb94b
Summary:
Sorting bookmark names can be expensive for the MySQL server. As we
don't rely on the ordering of bookmark names when requesting all bookmarks,
remove the sorting.
I've not modified the `Select.*After` queries as they are used for pagination,
which does rely on the order of bookmark names. Further, any queries for
bookmarks that have a limit other than `std::u64::MAX` will remain sorted.
Reviewed By: ahornby
Differential Revision: D23574741
fbshipit-source-id: 79e07b64bb8bb34229c429bdf885c5144963f140
Summary:
The test-blobimport.t test creates a few files whose names conflict on a case-insensitive file system, so make them differ by changing the number of underscores in one of the files.
test-pushrebase-block-casefolding.t directly tests a feature of a case-sensitive file system, so it cannot really be tested on macOS.
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/49
Reviewed By: farnz
Differential Revision: D23573165
Pulled By: lukaspiatkowski
fbshipit-source-id: fc16092d307005b6f0c8764c1ce80c81912c603b
Summary:
Previously we were not logging a redacted access if the previous access was
logged less than MIN_REPORT_TIME_DIFFERENCE_NS ago. That doesn't work well
with our tests.
Let's instead add a sampling tunable.
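The change can be sketched as replacing a time-window check with count-based sampling; the type and names below are illustrative, not the actual Mononoke tunables API:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Illustrative sketch: log one out of every `sample_rate` redacted
// accesses instead of suppressing entries that arrive within a fixed
// time window of the previous one.
pub struct RedactionLogSampler {
    counter: AtomicU64,
    sample_rate: u64,
}

impl RedactionLogSampler {
    pub fn new(sample_rate: u64) -> Self {
        RedactionLogSampler {
            counter: AtomicU64::new(0),
            // Clamp 0 to 1 to avoid division by zero (i.e. log everything).
            sample_rate: sample_rate.max(1),
        }
    }

    pub fn should_log(&self) -> bool {
        self.counter.fetch_add(1, Ordering::Relaxed) % self.sample_rate == 0
    }
}
```

Unlike the time-window approach, this deterministically logs the first access, which keeps tests predictable.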
Reviewed By: krallin
Differential Revision: D23595067
fbshipit-source-id: 47f6152945d9fdc2796fd1e74804e8bcf7f34940
Summary: Moving some of the functionality (which is required for mercurial changeset derivation) into a separate crate. This is required to convert mercurial changesets to derived data, avoiding the circular dependency it would otherwise create.
Reviewed By: StanislavGlebik
Differential Revision: D23566293
fbshipit-source-id: 9d30b4b3b7d8a922f72551aa5118c43104ef382c