Summary: The json conversion is no longer needed. Remove it.
Reviewed By: yancouto
Differential Revision: D31465816
fbshipit-source-id: 2f8cfde03c04f254df5e352f7b6803c15062104f
Summary:
The `log_request` feature is the last `ToJson` user. Change it to just use
pprint instead so we can drop `ToJson`. The pprint format should also work
with Python / debugshell / debugapi.
Reviewed By: yancouto
Differential Revision: D31465802
fbshipit-source-id: 4a13da136ca88872163e1680a965c1d3b93abd0a
Summary:
Added back serialize impls on some request types removed by D31057140 (0cd8e64c51).
They will be used to replace ToJson impl.
Reviewed By: yancouto
Differential Revision: D31465803
fbshipit-source-id: d4084f65af76607db9073980097ac4b129e17fff
Summary:
The cli is not used. It is mainly to format requests from JSON and send them.
But JSON has its own issues (no binary data support), and `hg dbsh` and `hg
debugapi` provide debugging too. So let's remove the unused CLI tool.
Reviewed By: yancouto
Differential Revision: D31465822
fbshipit-source-id: 71574b93a8503643cc503323d6a01f2a87bc41f3
Summary:
`hg debugapi` now provides easier ways to test edenapi endpoints.
Remove the `edenapi/tools` to reduce code bloat. This also unblocks removing
`To/FromJson` complexity so adding new endpoints becomes easier.
Reviewed By: yancouto
Differential Revision: D31465796
fbshipit-source-id: fdc0a47b4302c876e78455101068f27949d1b645
Summary:
We're going to remove read_res and make_req to remove code bloat.
Tests should consider using `debugapi` instead. See previous diffs for
examples.
Reviewed By: yancouto
Differential Revision: D31465825
fbshipit-source-id: 8a12567e5fb3040cd2dc911dfe8476d1fdf16f70
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Reviewed By: yancouto
Differential Revision: D31465806
fbshipit-source-id: 5ea30d7069b68bf6f905398d47e4c0babc5c61c4
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Reviewed By: yancouto
Differential Revision: D31465833
fbshipit-source-id: 9c7ce84a67dbc2b7253f9e0bb63026a6aa1725d7
Summary:
Accept owned bytes (`Vec<u8>`). This will make `Serde<Bytes>` work in Python
bindings.
Reviewed By: yancouto
Differential Revision: D31615717
fbshipit-source-id: ad200d359b9282fa84d1698afdde8241fc288905
Summary:
This removes duplicated implementation. The main goal is to make the hash types
use bytes instead of tuple serialization, and support hex deserialization.
Reviewed By: yancouto
Differential Revision: D31615718
fbshipit-source-id: 8e0039d4fad8a2720daeb2f276e6f5a795ea6482
Summary: Improves the user experience a bit; some help text was missing.
Reviewed By: mzr
Differential Revision: D31609163
fbshipit-source-id: 059bbe1176b7abdd42804f20aa44f5cb974ae575
Summary:
See the previous diff for context.
Aside from error type changes, this changes the serialization format of Sha256
from a tuple of 32 u8 to a byte slice. It seems okay, because the only user
of the `Sha256` serialization on client-side is revisionstore's `FileAuxData`,
which does not serialize to disk (this was checked by dropping the Serialize
trait and seeing what breaks).
Reviewed By: yancouto
Differential Revision: D31615716
fbshipit-source-id: 3d31b5d356c7e5f6b229fa9eae71ba4cad1c0e1a
Summary:
See the previous diff for context.
The error types in from_slice and from_hex are changed. Related callsites are
updated.
Reviewed By: yancouto
Differential Revision: D31615720
fbshipit-source-id: ed127621d689f527b2e2eb24b0bef03870340e05
Summary:
There are a lot of hash types: HgId, Sha256, and edenapi-types has some.
The HgId seems to have most features but it is hard to reuse it for similar
but different types. This is an attempt to do so and unify the main implementation
of those types.
Most methods are copied from HgId. `from_byte_array` is made const fn so it
can be used to construct constants.
The error type is intentionally chosen to be not `anyhow::Error`.
Using static typed errors is considered good practice for low-level crates.
The benefit is that higher level users get their choice - precise static error
type with compile-time checks, or convenient, dynamic error types by anyhow
with runtime downcasts.
Reviewed By: yancouto
Differential Revision: D31615719
fbshipit-source-id: 337356721354c43fe23b9f2d0e90d104c8864c44
Summary:
It's not actually used and is hard to integrate with hg now.
`bookmarkstore` was initially designed before metalog (which provides atomicity
for metadata changes) existed.
With metalog we track bookmarks using plain text strings; complex structures
like indexedlog cannot be used within metalog.
Reviewed By: yancouto
Differential Revision: D31615721
fbshipit-source-id: 75d6b9c9ba4475e86530e2368d30f53e58ac37d8
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Reviewed By: yancouto
Differential Revision: D31465811
fbshipit-source-id: 2c0c0bab4adc4fb89fe9b46bc3f204c79efe17ee
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Note some hashes are serialized (suboptimally) as an array of bytes, not as a
byte slice. Ideally we need `serde_bytes` annotation somewhere but I haven't
checked if that breaks compatibility.
Reviewed By: yancouto
Differential Revision: D31465828
fbshipit-source-id: 7892ececc475bad530708499cf2255852d610bc2
Summary: They will be used by `debugapi`.
Reviewed By: yancouto
Differential Revision: D31615722
fbshipit-source-id: b3e53a8ac0ab8202905602b478c1ca9e4d712f64
Summary:
These are used by Mononoke tests. Expose them so `debugapi` can be used in
tests.
Reviewed By: yancouto
Differential Revision: D31465832
fbshipit-source-id: 4291023d5d4623c43065e1b1c90848f3bc15047f
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Note the client-side edenapi only provides prefix lookup. So it does not work
for an arbitrary range.
Reviewed By: yancouto
Differential Revision: D31465812
fbshipit-source-id: 9b81148258c45e7e534faee97e1b51c4dd75102d
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Reviewed By: yancouto
Differential Revision: D31465827
fbshipit-source-id: 3e7a943efce6c670d817f57e10a1e1e191fe633f
Summary:
See previous diff for context. This provides pure Rust HgId more easily (without
needing `py`), and makes the acceptable format a bit more flexible, useful for
`debugapi`.
Unfortunately `Serde<Key>` or `Serde<Vec<(RepoPathBuf, HgId)>>` is not really
a choice because `RepoPathBuf` is not declared as `#[serde(transparent)]`,
which means paths have to be written as `(path,)`. Adding `#[serde(transparent)]`
to `RepoPathBuf` might be a breaking change that I didn't try.
Reviewed By: yancouto
Differential Revision: D31465819
fbshipit-source-id: cda210c2f5f6532256204abd428e1ad2b1de9fd9
Summary:
These APIs wrote directly to stores. Change them to return content like other
APIs so `debugapi` can print the result.
As we're here, drop legacy progress support.
Reviewed By: yancouto
Differential Revision: D31465820
fbshipit-source-id: 0c7a2e07f8fe56a89cc82a51ca6566d3ab6cc754
Summary:
D31057140 (0cd8e64c51) removed some serde derives on structs, but we need them for easier
Python bindings. Let's add them back.
Reviewed By: yancouto
Differential Revision: D31465804
fbshipit-source-id: 94e8f0dbbbde4e52cb8b3153ea219fa70c91fca4
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Note the client-side edenapi only provides prefix lookup. So it does not work
for an arbitrary range.
Reviewed By: yancouto
Differential Revision: D31465834
fbshipit-source-id: eb45922e109b7301beb9799e3ccb7905541de605
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Reviewed By: yancouto
Differential Revision: D31465823
fbshipit-source-id: c2a89cbac62ae6d4aa80bde41c177bc4acd986fa
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Reviewed By: yancouto
Differential Revision: D31465813
fbshipit-source-id: b921476f2d398271688ee96cc994474b616358f0
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Reviewed By: yancouto
Differential Revision: D31465818
fbshipit-source-id: 7885d7204d01a5ae7a0e835eeb9e51cedcc6281d
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Reviewed By: yancouto
Differential Revision: D31465815
fbshipit-source-id: 052a8a4487793a6a977e8ff27aa2bc5443175061
Summary:
This is not used in production and the removal unblocks the removal
of the `complete_trees` endpoint.
Reviewed By: yancouto
Differential Revision: D31465807
fbshipit-source-id: d2a8ff79fe4e6181adefa499bbd7125fa0c5c26b
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Reviewed By: yancouto
Differential Revision: D31465798
fbshipit-source-id: 8e184a881479e42fcda06d7258af55fab99ab1da
Summary:
`debugapi` was introduced to test edenapi endpoints as seen by `hg`. Use it to
test the real client-side edenapi logic. This will eventually allow us to drop
`read_res`, `make_req`, `FromJson`, and `ToJson` to reduce bloated code in
edenapi.
Reviewed By: yancouto
Differential Revision: D31465799
fbshipit-source-id: 5a1ae0565b9c7a23773644e775c5f1930ebc12ad
Summary: It was a temporary step that is never used in production. Delete the endpoint.
Reviewed By: yancouto
Differential Revision: D31465808
fbshipit-source-id: 90e77eab96bb75796a3b31a5907f137d743a7dfa
Summary: It was a temporary step that is never used in production. Delete the endpoint.
Reviewed By: yancouto
Differential Revision: D31465829
fbshipit-source-id: 635ee205589b0f4d15388ae2a2bb9ada51d77edd
Summary: This makes it a bit more efficient and more friendly in debugapi output.
Reviewed By: DurhamG
Differential Revision: D31465800
fbshipit-source-id: b741792eb43a7a57c90f75362dbf189739dbb844
Summary:
`debugapi` was introduced to test edenapi endpoints as seen by `hg`. Use it to
test the real client-side edenapi logic. This will eventually allow us to drop
`read_res`, `make_req`, `FromJson`, and `ToJson` to reduce bloated code in
edenapi.
Reviewed By: yancouto
Differential Revision: D31465830
fbshipit-source-id: 9da739a76ef6e5d49804b0cea2089fc1741d0b7c
Summary:
`debugapi` was introduced to test edenapi endpoints as seen by `hg`. Use it to
test the real client-side edenapi logic. This will eventually allow us to drop
`read_res`, `make_req`, `FromJson`, and `ToJson` to reduce bloated code in
edenapi.
Reviewed By: yancouto
Differential Revision: D31465814
fbshipit-source-id: 5d090e0a92c6374b15a36f80dcc578ad280e50e2
Summary: To ease review, just doing the simple / hacky version of the TreeStore diff, like the FileStore one below this. This doesn't address the tree batch failure issue, but that's not blocking release so it's fine for now.
Reviewed By: andll
Differential Revision: D31593076
fbshipit-source-id: 0d3c420e50af0d8882ba171590597aac1b6c4c77
Summary:
Currently, the scmstore backingstore backend for EdenFs performs very poorly compared to ContentStore. This is mainly because a local-only FileStore is created on every fetch, and always flushes when dropped (even when not used).
With this change, we'll only flush when the parent FileStore is dropped.
This is a slightly hacky fix, vs. creating a new non-flush-on-drop type like `TreeStoreFetch` as we did for trees in the previous diff, but this should work fine for now and is radically smaller. Later I'd like to clean up both to eliminate the verbosity in the trees approach while still using separate types rather than a bool.
Reviewed By: andll
Differential Revision: D31591276
fbshipit-source-id: 11266c19ac68d87015719f5bcbd8f857e596bfdb
Summary: With edenapi and memcache reads both being written to disk as they're fetched, we should no longer need to break large prefetch batches into chunks.
Reviewed By: DurhamG
Differential Revision: D31552341
fbshipit-source-id: 2d9e10db669754cc8124228252cf832c8102f220
Summary: Like with the previous EdenApi change, writing files fetched from memcache to disk as we read them will prevent us from accumulating large amounts of file content in memory and potentially causing an OOM.
Reviewed By: DurhamG
Differential Revision: D31552190
fbshipit-source-id: 1eceb7570918575382f067ff9ad1e08e0623f335
Summary: Introduce a new method, `evict_to_cache`, which writes a memory-backed LazyFile to disk and returns a mmap-backed LazyFile instead.
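A rough Python sketch of the idea, not the Rust implementation: write an in-memory buffer to a file, then hand back a read-only memory map of that file. The function signature, the `path` argument and the use of Python's `mmap` module are assumptions made for illustration only.

```python
import mmap
import os


def evict_to_cache(data: bytes, path: str) -> mmap.mmap:
    """Write `data` to `path` and return a read-only mmap of the file.

    Hypothetical stand-in for a memory-backed LazyFile becoming mmap-backed.
    Empty buffers are rejected because a zero-length file cannot be mapped.
    """
    if not data:
        raise ValueError("cannot mmap an empty file")
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    with open(path, "rb") as f:
        # The mapping stays valid after the file object is closed.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```

The caller can drop the original in-memory buffer once the map is returned, so the content is paged in on demand instead of being held on the heap.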
Reviewed By: DurhamG
Differential Revision: D31551177
fbshipit-source-id: 86901628c56151ac805a57885379241c42f794c0
Summary:
Use the new async `files_attrs` method to concurrently fetch files from EdenApi, write them to disk, and read them back as a mmap-ed `LazyFile`.
Also remove the scmstore prefetch chunking which the previous approach required.
Previously, we fetched all the requested files from EdenApi into memory as a batch, then wrote them to disk afterward. This caused OOMs for extremely large fetches, and required chunking scmstore prefetches, which had a negative performance impact.
Reviewed By: DurhamG
Differential Revision: D31445678
fbshipit-source-id: 9a2e1476fb8ddfcd546a5e0b501cc91cc2a97303
Summary: Add a basic async `files_attrs` method to `EdenApiFileStore` for use by scmstore for concurrent fetching / writing to disk.
Reviewed By: DurhamG
Differential Revision: D31483490
fbshipit-source-id: 20233767541d60ccdfcd97bdc98b9bbb98d8700e
Summary:
Currently, scmstore does not write LFS pointers discovered outside the LfsStore before fetching blobs from the LFS remote and attempting to read them back from disk. This causes scmstore to rely on the contentstore fallback to write LFS pointers, after which future reads will succeed in scmstore.
With this change, we track which LFS pointers were discovered from a source other than an LfsStore, and write those to the LfsStore before fetching from the LFS remote. This eliminates the debugimporterhelper fallback in the backingstore implementation, and should improve the performance of the shim in general for LFS files where the LFS pointer is not available locally.
In the future, I'll probably perform these writes inline with the fetches that discovered the pointers for added concurrency, and reorganize the LFS fetching code in general to avoid some of the weird bookkeeping we currently do around local vs. shared for LFS.
Reviewed By: andll
Differential Revision: D31280512
fbshipit-source-id: 2e2a32db3fc9fab8ac179b7852da36f77c66ef31
Summary: We run the eden endpoints from the Mononoke server, not via this binary any more. Remove it.
Reviewed By: StanislavGlebik
Differential Revision: D31501592
fbshipit-source-id: 0626fe43f0f1ce4a6c7165a734eb487225fa65b6
Summary: Derived data manager doesn't need the full BlobRepo struct. So let's create a facet container which includes only the parts needed for derivation using the manager. I'll use it later with the 2DS service.
Reviewed By: StanislavGlebik
Differential Revision: D31540177
fbshipit-source-id: 42a3e96352689f39730b8f62142cccd7a1bb7256
Summary: I'm going to use that API from the 2DS service, so the discovery of underived will not be triggered.
Reviewed By: yancouto
Differential Revision: D31540178
fbshipit-source-id: cdf91974b90bc07e09cd3f465f9a1f8b8271a8e4
Summary:
Background: I've been looking into derived data performance and found that
while overall performance is good, it depends quite a lot on the blobstore
latency i.e. the higher the latency the slower the derivation. What's worse is
that increasing blobstore latency even by 100ms might increase time of
derivation of 100 commits from 12 to 65 secs! [1]
However we have ways to mitigate it:
* **Option 1** If we use "backfill" mode then it makes derived data derivation less
sensitive to the put() latency
* **Option 2** If we use "parallel" mode then it makes derived data derivation less
sensitive to the get() latency.
We can use "backfill" mode for almost all derived data types (only exception is
filenodes), however "parallel" only enabled for a few derived data types (e.g.
fsnodes, skeleton manifests, filenodes).
In particular, we didn't have a way to do batch derived data derivation for
unodes, and so unodes derivation might get quite sensitive to the blobstore
get() latency. So this diff tries to address that.
I considered three options:
* **Option 1** The simplest option of implementing "parallel" mode for unodes is to just
do a unode warmup before we start a sequential derivation for a stack of commits. After the
warmup all necessary entries should be in cache, so derivation should be less latency sensitive.
This could work, but it has a few disadvantages, namely:
* We do additional traversal - not the end of the world, but it might get
expensive for large commits
* We might fetch large directories that don't fit in cache more often than we
need to.
That said, because of its simplicity it might be a reasonable option to keep
in mind, and I might get back to it later.
* **Option 2** Do a derivation for a stack of commits. We have a function to derive a
manifest for a single commit, but we could write a similar function to derive the whole stack at once.
That means for each changed file or directory we generate not a single change
but a stack of changes.
I was able to implement it, but the code was too complicated. There were quite
a few corner cases (particularly when a file was replaced with a directory, or
when deriving a merge commit), and dealing with all of them was a pain.
Moreover, we need to make sure it works correctly in all scenarios, and that
wouldn't be an easy thing to do.
* **Option 3** Do a derivation for a "simple" stack of commits. That's basically the
simplified version of option #2. Let's allow doing batch derivation only for
stacks that have no
a) merges
b) path changes that are ancestors of each other (which cause file/dir
conflicts).
This implementation is significantly simpler than option #2, and it should
cover most of the cases and hopefully bring perf benefits (though this is
something I have yet to measure). So this is what this diff implements.
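The eligibility rule for a "simple" stack can be sketched in Python as below. This is only an illustration of conditions a) and b), not the code in this diff; representing a commit as a `(parent_count, changed_paths)` pair is an assumption, and the check is conservative in that it also rejects a file/dir conflict inside a single commit.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical commit representation: (number of parents, changed paths).
Commit = Tuple[int, Set[str]]


def path_prefixes(path: str) -> Iterable[str]:
    """Yield every proper ancestor of `path`, e.g. a/b/c -> a, a/b."""
    parts = path.split("/")
    for i in range(1, len(parts)):
        yield "/".join(parts[:i])


def is_simple_stack(stack: List[Commit]) -> bool:
    """Return True if the stack has no merges and no ancestor path changes."""
    changed: Set[str] = set()
    for parent_count, paths in stack:
        if parent_count > 1:
            return False  # a) merges are not allowed
        changed.update(paths)
    # b) no changed path may be an ancestor of another changed path
    return not any(
        prefix in changed for path in changed for prefix in path_prefixes(path)
    )
```

A stack that fails this check would fall back to the existing one-commit-at-a-time derivation.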
Reviewed By: yancouto
Differential Revision: D30989888
fbshipit-source-id: 2c50dfa98300a94a566deac35de477f18706aca7
Summary: I'd like to reuse it in the next diff, so let's move it to a separate function
Reviewed By: yancouto
Differential Revision: D31603911
fbshipit-source-id: 69ec0553022aabe662f75a50321423c54aafd196
Summary:
We don't really need it, and InnerRepo is more expensive to create, so let's
not do that.
Reviewed By: yancouto
Differential Revision: D31602266
fbshipit-source-id: 269bf50f8b4ee99e22888d91af2ac392078d32fa
Summary:
We were `hg adding` all files that were modified, however, we only really need to add files that didn't exist before the snapshot changes.
This gets rid of the annoying "already tracked!" messages.
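As a minimal sketch of the selection logic (the function and argument names are hypothetical, not the ones used in the snapshot code), only paths that were not tracked before the snapshot changes need to be passed to `hg add`:

```python
from typing import Iterable, List


def files_to_add(changed_by_snapshot: Iterable[str], tracked_before: Iterable[str]) -> List[str]:
    """Return the changed paths that did not exist before the snapshot."""
    return sorted(set(changed_by_snapshot) - set(tracked_before))
```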
Reviewed By: markbt
Differential Revision: D31438120
fbshipit-source-id: ca3545bc5881dcf01283abfa5ec9eca6309ff607
Summary: Updated library.sh timeout handling to be clearer by checking the time change directly, and to use the health_check endpoint.
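The actual change is in shell, but the deadline-based pattern can be sketched in Python. The `check` callable stands in for a request to the health_check endpoint; its name and the polling interval are assumptions.

```python
import time
from typing import Callable


def wait_until_healthy(check: Callable[[], bool], timeout: float, interval: float = 0.1) -> bool:
    """Poll `check` until it succeeds or `timeout` seconds have elapsed.

    Elapsed time is measured directly with a monotonic clock instead of
    counting attempts, so slow checks cannot stretch the overall timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```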
Reviewed By: yancouto
Differential Revision: D31536407
fbshipit-source-id: 43cdf4260c10dfd4d3097dcd92d071c3d18b18f8
Summary:
We don't actually use bundle2 to deliver manifests anymore. Everything
is done via ondemand fetching, which won't have the problem covered in this
test. So let's delete it.
Reviewed By: quark-zju
Differential Revision: D31309313
fbshipit-source-id: 312508fa1b5e903314b92c048d23525c2194ab91
Summary:
This is part of removing filepeer. I also enabled treemanifest and
modernclient (i.e. lazy changelog) on a few tests.
Reviewed By: quark-zju
Differential Revision: D31032055
fbshipit-source-id: 6822274ad07303ed86b2ee5dd4e09979f1e215d5
Summary:
This test covers 'hg clone -r REV' which isn't really a supported clone
case since we now depend on cloning particular remote bookmarks. The test is
also heavily dependent on revision numbers. So let's just delete it.
This is needed as part of removing the tests dependency on hg server logic like
filepeer.
Reviewed By: quark-zju
Differential Revision: D31309312
fbshipit-source-id: e4620186e3eda3114686de36d06710747439ae18
Summary:
Ports test-bundle.t to use modernclient configs, meaning it no longer
uses the filepeer and it uses treemanifest and lazychangelog. I deleted the
parts of the test that depend on mounting a bundle file on top of a repo, since
that logic is unused in production.
Reviewed By: quark-zju
Differential Revision: D31309310
fbshipit-source-id: a535ed9a21253fd258f70088e7436480957afb2a
Summary:
We need to pass the file content blob with rename header to eagerepo,
not the one without the header. This resulted in a SHA1 mismatch assertion when
trying to make test-bundle.t use eagerepo.
Reviewed By: quark-zju
Differential Revision: D31309314
fbshipit-source-id: afaf3af3423b3f3006c1a95ddbf0da20056d9581
Summary: This is a step towards moving test-bundle.t to modernclient configs.
Reviewed By: quark-zju
Differential Revision: D31309318
fbshipit-source-id: 9680e85031797088d624a6c85e2d7316b108818e
Summary:
This test depends on a precomputed, legacy-formatted, checked-in bundle
file. Bundle files are on their way to being replaced with commit cloud, and we
don't use a lot of the features in this test anyway (branches, merges) so let's
delete this test.
Reviewed By: quark-zju
Differential Revision: D31309316
fbshipit-source-id: b5729bed33a8c84fa75792528630d99dbc1996be
Summary:
Add a way to test EdenAPI endpoints.
The goal is to deduplicate with `make_req` and `read_res` tools to make it easier to
write new edenapi endpoints (no need to update `make_req` or `read_res`).
Reviewed By: DurhamG, yancouto
Differential Revision: D31465801
fbshipit-source-id: 5127941d0820ce737a4958a1d124f420acbaf771
Summary: Add a binding so Python code can use the Rust pprint implementation.
Reviewed By: DurhamG
Differential Revision: D31465826
fbshipit-source-id: b8f49f0d0587f82fae5906577acda95a09953e69
Summary:
I found that I need to "print" a value that might contain bytes and SHA1
hashes. JSON cannot do this job well because it does not support bytes.
Rust debug print can be too verbose.
Originally I tried:
- Python json: Not easily extendable to support binary data.
- Python repr: Not pretty.
- Python pprint: Lots of complexity on line wrapping, not easy to extend.
I ended up with an adhoc version of pprint (in D31465801):
    # Similar to pprint, but much simpler (no textwrap) and rewrites
    # binary nodes to hex(hexnode).
    def format(o, indent, sort=sort):
        if isinstance(o, bytes) and len(o) == 20:
            yield 'bin("%s")' % hex(o)
        elif isinstance(o, (bytes, str, bool, int)) or o is None:
            yield repr(o)
        elif isinstance(o, list):
            yield "["
            if sort:
                o = sorted(o, key=lambda o: repr(o))
            yield from formatitems(o, indent + 1)
            yield "]"
        elif isinstance(o, tuple):
            yield "("
            yield from formatitems(o, indent + 1)
            yield ")"
        elif isinstance(o, dict):
            yield "{"

            def fmt(kv, indent=indent + 1):
                k, v = kv
                kfmt = "".join(format(k, indent + 1)) + ": "
                yield kfmt
                yield from format(v, indent + len(kfmt))

            items = sorted(o.items())
            yield from formatitems(items, indent + 1, fmt)
            yield "}"
        else:
            yield "?"

    def formatitems(items, indent, fmt=None):
        if fmt is None:
            fmt = lambda o: format(o, indent)
        total = len(items)
        for i, o in enumerate(items):
            if i > 0:
                yield "\n"
                yield " " * indent
            yield from fmt(o)
            if i + 1 < total:
                yield ","
Later I found I need this feature in Rust too (D31465805 and D31465802).
So I translated the above Python code to Rust.
The Python syntax means the printed content can be copy-pasted to
Python files to form some quick scripts. Python code can also use
`eval` or `ast.literal_eval` (for safety, but no `bin` support) to
deserialize pprint output.
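To illustrate the last point, here is a small sketch of reading such output back in Python. The sample strings are made up, and `hex_to_bytes` is a hypothetical stand-in for whatever `bin` resolves to in the calling script.

```python
import ast


def hex_to_bytes(hexstr: str) -> bytes:
    """Hypothetical implementation of the `bin("...")` form in the output."""
    return bytes.fromhex(hexstr)


plain = '{"name": b"main", "parents": [1, 2], "merged": None}'
with_node = '{"node": bin("' + "ab" * 20 + '")}'

# Literals only: safe to parse with ast.literal_eval.
parsed = ast.literal_eval(plain)

# `bin(...)` is a function call, which ast.literal_eval rejects.
try:
    ast.literal_eval(with_node)
    literal_eval_accepts_bin = True
except ValueError:
    literal_eval_accepts_bin = False

# eval works once `bin` is provided; only do this with trusted input.
node = eval(with_node, {"__builtins__": {}, "bin": hex_to_bytes})["node"]
```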
Reviewed By: DurhamG
Differential Revision: D31465797
fbshipit-source-id: ef4d17df84590075f74a0298ac89f4a963d8ed3c
Summary:
Serde deserializer is usually more permissive than restrictive. Make it a bit
more permissive in `deserialize_bytes` when we see a string.
Practically this means the Python bindings are a bit more permissive. However
the Rust interfaces are still strongly typed so I think it is okay.
Reviewed By: DurhamG
Differential Revision: D31465810
fbshipit-source-id: a339b662f00a16953ce849ed8c2d65a1c3365081
Summary:
Use serde to decode HgId from Python. This allows us to get pure Rust types
earlier and get flexible hash support (binary or hex) from `HgId` for free.
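The "binary or hex" acceptance rule can be pictured with the Python sketch below. It is an illustration only, not the serde implementation; the function name is hypothetical, and the 20-byte length assumes a SHA1-sized `HgId`.

```python
HGID_LEN = 20  # assumed: SHA1-sized hash


def parse_hgid(value) -> bytes:
    """Accept either 20 raw bytes or a 40-character hex string."""
    if isinstance(value, (bytes, bytearray)) and len(value) == HGID_LEN:
        return bytes(value)
    if isinstance(value, str) and len(value) == HGID_LEN * 2:
        raw = bytes.fromhex(value)  # raises ValueError on non-hex input
        if len(raw) == HGID_LEN:
            return raw
    raise ValueError("expected 20 bytes or 40 hex characters")
```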
Reviewed By: ahornby
Differential Revision: D31465821
fbshipit-source-id: 18c459d5fab6d508d0ec37f136bd45a7baf1e473
Summary:
See previous diff for context. This makes the endpoint a bit easier to use.
Also add retry, since it's now easier with the Vec type.
Reviewed By: yancouto
Differential Revision: D31416217
fbshipit-source-id: 40de6d14cf5cd088cd69156758699706bc7b8b8b
Summary:
See previous diff for context. This makes the endpoint a bit easier to use.
Also add retry, since it's now easier with the Vec type.
Reviewed By: yancouto
Differential Revision: D31416215
fbshipit-source-id: 53c169d17b2e16c285b5ea6d5c90ce0ee5d0b280
Summary:
See D31407278 (b68ef5e195) for a similar change. Basically, for lightweight metadata,
`Vec<T>` is more convenient than `Response`. The prefix lookup endpoint
is lightweight. Let's switch to `Vec` for the `EdenApi` trait.
Reviewed By: yancouto
Differential Revision: D31416216
fbshipit-source-id: 260235b57ddedcd7be8accb263a0090330445f7f
Summary:
See D31407278 (b68ef5e195) for a similar change. Basically, for lightweight metadata,
`Vec<T>` is more convenient than `Response`. The commit graph endpoint
outputs commit hashes, which are lightweight. So let's switch to `Vec`
for the `EdenApi` trait.
Reviewed By: yancouto
Differential Revision: D31410099
fbshipit-source-id: 0966b9afb47264c34ebe88355dc6df669dfb803b
Summary:
Previously the `clone.force-edenapi-clonedata` config decided whether to use
segments clone. That can be error prone - it will crash if the server
doesn't have the segments index.
Change it so we ask about the segments index and automatically use the lazy
changelog clone.
`clone.force-edenapi-clonedata` will no longer be necessary.
Differential Revision: D31358367
fbshipit-source-id: 9fa639d0349d00938c89cc091ea793f20dd714c8
Summary:
Expose the Rust capabilities endpoint to Python.
Note: I avoided some complexities here:
- Not implementing a `capabilities_py` function just for rustfmt purpose.
This avoids repeating the `capabilities` signature two more extra times.
- Not supporting legacy progress callback at all.
Reviewed By: yancouto
Differential Revision: D31358368
fbshipit-source-id: 76d0d71e627adbc57ed853922c4f826f3edfccb4
Summary:
Mononoke side implementation was done by D30831346 (8aa676ada0).
Note: I avoided some complexities here:
- Not using `paths::` constant since the constant isn't useful elsewhere. It
saves repeating the "capabilities" name a few times.
Reviewed By: yancouto
Differential Revision: D31358370
fbshipit-source-id: d75d057d1fdc44fffac9d136dbd10241d78b5cfd
Summary:
The capabilities method reports what optional features a repo has.
It was first introduced at Mononoke side. See D30831346 (8aa676ada0).
Reviewed By: yancouto
Differential Revision: D31358369
fbshipit-source-id: 673c30f660621279f84d451898dc3707974c1cae
Summary:
`EdenApi` is implemented in a few places including some test utilities.
Changing function signatures would require changing the `unimplemented!()`
test implementation too. That's a bit annoying.
Add default implementation and drop the `unimplemented!()` implementations
to make it easier to change edenapi interfaces.
Reviewed By: yancouto
Differential Revision: D31408643
fbshipit-source-id: 602ccd3ce545d1ab646bc32eb84976417f0df9f8
Summary:
When reading related code I found "No need to create new indexes" confusing.
Clarify it by stating that the index is wrong but no business logic depends on
it.
Differential Revision: D31146157
fbshipit-source-id: 4c73b4958ac6fb286bfc5b8256c8c03a26cda7b0
Summary:
When eden is not running on Windows, `eden rm` will spit out some error
messages and exit with error code 1. But it does actually successfully
remove the repo.
On Linux, removing a repo while eden is not running behaves just as if
eden were running.
Let's make the removal more graceful on Windows.
Reviewed By: xavierd
Differential Revision: D31519805
fbshipit-source-id: d393922a9474e64251142207ae38a602766f17bf
Summary: Removes some boilerplate and implements the trait more consistently for Arc<X> as well as Arc<dyn X>
Reviewed By: Croohand
Differential Revision: D31558567
fbshipit-source-id: 503e745ccf4eba6dcf03b8a3f4ace49310c1c319
Summary: This simplifies test setup and means we can stop throwing away creation SQL errors
Reviewed By: StanislavGlebik
Differential Revision: D31536408
fbshipit-source-id: 4d3df0f69c5b49719196c8a1b20d65c965d88869
Summary: sys.platform is "win32", not the path to python.
Reviewed By: fanzeyi
Differential Revision: D31554106
fbshipit-source-id: 64b388d2fb8a493f811a0cf22fe2471a25bfbf7e
Summary: Use atomic rename instead of fs::write when writing out the hgrc.dynamic config file. fs::write re-writes the file if it already exists which opens a race condition if two hg processes try to write the config at the same time.
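The write-then-rename pattern looks roughly like this in Python (the real change is in Rust; the helper name and temp-file prefix here are assumptions). The temporary file is created in the destination directory so the final rename stays on one filesystem.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write `data` to a temp file, then atomically rename it over `path`.

    Readers see either the old file or the complete new one, never a
    partially written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # atomic replacement of the destination
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If two processes race, each renames its own complete file into place and the last rename wins.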
Reviewed By: DurhamG
Differential Revision: D31548054
fbshipit-source-id: c5216bf6f9f2ff78d37a2c0e23c8c22fdf1d0897
Summary: Rebase (and other commands) print the mappings between old and new commits via the tweakdefaults extension. Previously we would sort the node mapping dictionary, which yields apparently random ordering. Now we instead maintain dictionary order, which by default gives us the dictionary insertion order (which in rebase's case, at least, is revision order).
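A tiny illustration of the difference, using made-up short hashes: sorting orders the entries by hash value, which has nothing to do with the order in which the commits were processed, while iterating the dict directly preserves insertion order.

```python
# Hypothetical old -> new node mapping, inserted in revision order.
mapping = {}
for old, new in [("c3", "f9"), ("a1", "d7"), ("b2", "e8")]:
    mapping[old] = new

insertion_order = list(mapping.items())  # order the entries were added in
sorted_order = sorted(mapping.items())   # ordered by hash, looks random to users
```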
Reviewed By: DurhamG
Differential Revision: D31551290
fbshipit-source-id: 0b5d9c4846ba42154ae387bdf4777a61e5e1e605
Summary:
I've investigated the deadlock we hit recently and attempted to fix it with D31310626 (8ca8816fdd).
Now there's a bunch of facts, an unproven theory and a potential fix. Even though the theory is not proven and hence the fix might be wrong, the theory sounds plausible and the fix should be safe to land even if the theory is wrong.
But first, the facts,
### Facts
a) There's a repro for a deadlock - running `buck-out/gen/eden/mononoke/git/gitimport/gitimport#binary/gitimport --use-mysql-client --repo-name aosp --mononoke-config-path configerator://scm/mononoke/repos/tiers/prod --panic-fate=exit --cachelib-only-blobstore /data/users/stash/gitrepos/boot_images git-range 46cdf6335e1c737f6cf22b89f3438ffabe13b8ae b4106a25a4d8a1168ebe9b7f074740237932b82e` very often deadlocks (maybe not every time, but at least every 3rd run). So even though D31310626 (8ca8816fdd) unblocked the mega_grepo_sync, it doesn't seem like it fixed or even mitigated the issue consistently. Note that I'm running it on devbig which has quite a lot of cpus - this may or may not make deadlock more likely.
b) The code deadlocks on [`find_file_changes()`](https://fburl.com/code/7i6tt7om) function, and inside this function we do two things - first we find what needs to be uploaded using [`bonsai_diff()`](https://fburl.com/code/az3v3sbk) function, and then we use [`do_upload()`](https://fburl.com/code/kgb98kg9) function to upload contents to the mononoke repo. Note that both `bonsai_diff()` and `do_upload()` use git_pool. Bonsai_diff produces a stream, and then we apply do_upload to each element of the stream and finally we apply [`buffer_unordered(100)`](https://fburl.com/code/6tuqp3jd) to upload them all in parallel.
In simplified pseudo-code it looks like
```
bonsai_diff(..., git_pool, ...)
.map(|entry| do_upload(..., git_pool, entry, ...))
.try_buffer_unordered(100)
```
c) I've added a few printlns in P462159232, and run the repro command until it, well, repros. These printlns just show when there was an attempt to grab a semaphore and when it was successfully grabbed, and also who the caller was - the `bonsai_diff()` or the `do_upload()` function. This is the example output I'm getting: P462159671. The output is a mess, but what's interesting here is that there are exactly 100 more entries of "grabbing semaphore uploading" than "grabbed semaphore uploading". And 100 is exactly the size of the buffer in buffer_unordered.
### Theory
So above are the facts, and below is the theory that fits these facts (but as I said above, the theory is not proven yet). Let me start with a few assumptions
1) Fact `c)` seems to suggest that when we run into a deadlock there was a full buffer of `do_upload()` futures that were all waiting on a semaphore.
1) If we look at [`buffer_unordered` code](https://docs.rs/futures-util/0.3.17/src/futures_util/stream/stream/buffer_unordered.rs.html#71) we can see that if buffer is full then it stops polling underlying stream until any of the buffered futures is done.
1) Once semaphore is released it [wakes up just a handful of futures](https://docs.rs/tokio/1.12.0/src/tokio/sync/batch_semaphore.rs.html#242) that wait for this semaphore.
Given these assumptions I believe the following situation is happening:
1) We get into a state where we have 100 `do_upload()` futures in buffer_unordered stream all waiting for a semaphore. At the same time all the semaphore permits are grabbed by `bonsai_diff()` stream (because of assumption #1)
2) Because of assumption #2 we don't poll underlying `bonsai_diff()` stream. However this stream has already [spawned a few tasks](https://fburl.com/code/9iq3tfad) which successfully grabbed the semaphore permit and are executing
3) Once these `bonsai_diff()` tasks are finished they [drop the semaphore permit](https://fburl.com/code/sw5fwccw), which in turn causes the semaphore to be released and one of the waiting tasks to be woken up (see assumption #3). But now two things can happen - either a `do_upload()` future will be woken up or a `bonsai_diff()` future will be woken up. If the former happens then `buffer_unordered` would poll the `do_upload()` future and continue executing. However if the latter happens (i.e. a `bonsai_diff()` future was woken up) then it causes a deadlock - `buffer_unordered()` isn't going to poll the `bonsai_diff()` stream, and so the `bonsai_diff()` stream won't be able to make any progress. At the same time `do_upload()` futures aren't going to be polled at all because they weren't woken up in the first place, and so we deadlock.
There are a few things that seem unlikely in this theory (e.g. why we never wake up any of the `do_upload()` futures), so I do think it's worth validating. We could either add a bunch of println! in buffer_unordered and semaphore code, or try to set up [tokio-console](https://github.com/tokio-rs/console).
### Potential fix
Potential fix is simple - let's just add another spawn around `GitPool::with()` function. If the theory is right, then this fix should work. Once a semaphore is released and a `bonsai_diff()` future is woken up then this future will be spawned and hence will be able to complete successfully regardless of whether `buffer_unordered()` has its buffer full or not. And once `bonsai_diff()` fully completes then finally `do_upload()` futures will be woken up and `buffer_unordered()` will progress until completion.
If the theory is wrong then we just added one useless `tokio::spawn()`. We have a lot of spawns all over the place and `tokio::spawn` is cheap. So landing this fix shouldn't make things worse, but it might actually improve them (and the testing I did suggests that it indeed should improve them - see `Test plan` section below). I don't think it should be seen as a reason to not find the root cause.
Reviewed By: mojsarn
Differential Revision: D31541432
fbshipit-source-id: 0260226a21e6e5e0b41736f86fb54a3abccd0a0b
Summary:
Incorporated Diffs:
* Fix blank lines between structured annotated fields D31372368
* Fix format for empty struct D31489644
Manual component version update
Bump Schedule: https://www.internalfb.com/intern/msdk/bump/?schedule_fbid=326075138621961
Package: https://www.internalfb.com/intern/msdk/package/608249566967196/
Oncall Team: nuclide
NOTE: This build is expected to expire at 2022/10/08 03:04PM PDT
---------
New project source changes since last bump based on D30698375 (e5f5b589b458e473ab300550737172d43167f2bf at 2021/09/01 04:03PM IST):
no related diffs found between since last bump
---------
build-break (bot commits are not reviewed by a human)
Reviewed By: zertosh
Differential Revision: D31517842
fbshipit-source-id: 9f04c0ad1f418ba8ec3c96b05366683413d93f97
Summary: This appears to have broken all the tests on Linux.
Reviewed By: zhengchaol
Differential Revision: D31505082
fbshipit-source-id: 610eb941d0f0eb536a4688ac2637a8466be0b05c
Summary:
xterm-color terminals don't support hiding the cursor the way curses likes to.
This crashes eden top for users.
eden top mostly functions fine otherwise in xterm-color, so let's make
this a non-fatal error.
Note: the highlighting behavior is a little weird (the styling after the
highlighted column does not work, no bolding etc). This looks like a weird
curses bug on these terminals, so not sure if we can really fix this. I figure
a little weird styling is still better than a crash.
Reviewed By: fanzeyi
Differential Revision: D31480121
fbshipit-source-id: 581ef7c548fd1f7986c4f93d8b797d7f7588c351
Summary: This will let us identify clients in scuba logs. This will be useful to identify whether a client has some specific feature enabled. We can use this instead of dedicated boolean columns like is_eden, or is_using_feature_X.
Reviewed By: krallin
Differential Revision: D31501373
fbshipit-source-id: 0e63b73659fd145f01098d60ced510e464730982
Summary:
The hg client provides a header which contains `ClientInfo` data. This
includes the sandcastle alias and nonce.
Update Mononoke to parse this header and then log the sandcastle alias and
nonce to scuba.
Reviewed By: krallin
Differential Revision: D31208450
fbshipit-source-id: f0971b668887de47fbab29b7ce9b0173cbdeafe4
Summary:
The `Metadata::new` function has grown quite a few optional arguments
which can make it difficult to read.
Simplify the `new` method and add new methods to record optional information
(such as encoded CATs).
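As an illustration of the shape this refactoring aims for, here is a minimal sketch. It is not the real Mononoke `Metadata` type: the fields (`session_id`, `encoded_cats`, `client_hostname`) and the `add_*` method names are assumptions made up for this example.

```rust
/// Hypothetical stand-in for the real `Metadata` type; every field and
/// method name here is an assumption made for illustration.
#[derive(Debug, Default, PartialEq)]
struct Metadata {
    session_id: String,
    encoded_cats: Option<String>,
    client_hostname: Option<String>,
}

impl Metadata {
    /// `new` takes only the required information.
    fn new(session_id: impl Into<String>) -> Self {
        Metadata {
            session_id: session_id.into(),
            ..Default::default()
        }
    }

    /// Optional information is recorded by dedicated methods instead of
    /// extra `Option` arguments on `new`.
    fn add_encoded_cats(mut self, cats: impl Into<String>) -> Self {
        self.encoded_cats = Some(cats.into());
        self
    }

    fn add_client_hostname(mut self, hostname: impl Into<String>) -> Self {
        self.client_hostname = Some(hostname.into());
        self
    }
}
```

A call site then reads as `Metadata::new("session-1").add_encoded_cats("...")`, and callers that don't have the optional data simply omit the extra call.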
Reviewed By: krallin
Differential Revision: D31500788
fbshipit-source-id: 9675c39f3061fef614e792e6ab6e365e9b423b2a
Summary:
The hg client provides extra info in a header that now includes the
sandcastle alias and nonce.
Update the LFS server to parse the header and log the sandcastle alias and
nonce to scuba.
Reviewed By: krallin
Differential Revision: D31208449
fbshipit-source-id: 773f0ec22060203c2c74a20090bd4893885506f8
Summary:
The hg client uses a version of reqwest that relies on tokio-0.2. In
the next diff I will add `clientinfo` as a dependency of Mononoke LFS server,
which also pulls in `configparser`. However, Mononoke LFS server uses
tokio-1.0.
Update the reqwest dependency to be the latest version.
Reviewed By: quark-zju
Differential Revision: D31389559
fbshipit-source-id: acf4c3b5c9df2a8bc8cfa134a2d314fbb96c3354
Summary: If regular traffic goes through the agent, debug traffic should as well.
Differential Revision: D31500581
fbshipit-source-id: 32abef4e082dbf120c9aa104206b460e12ed506f
Summary: this should enable everything using rust's http client to go over x2pagentd. The biggest user is edenapi. Should work on all platforms (including Windows!).
Reviewed By: ahornby
Differential Revision: D31430275
fbshipit-source-id: f90a633eb3cc4e82447b1b76200499016dcb6b8e
Summary: I just landed D30899990. I'm landing this to sync the config, still without using it.
Differential Revision: D31472892
fbshipit-source-id: 5e18c4c3529118ef81880c886f0d8b9428efcbf4
Summary: `eden-config.h` was not included previously, so the preprocessor macro was never defined and the code always took the default (false) path.
Reviewed By: chadaustin
Differential Revision: D31462811
fbshipit-source-id: ade236ce1f5b2b6511163515ced79890855190f2
Summary:
On Windows, edenfsctl.real is a par file that Windows can't execute directly.
Thus let's have Python run it.
Reviewed By: fanzeyi
Differential Revision: D31477721
fbshipit-source-id: d5a699ceb3d3b1b3d5778ef5720bca7c299bed80
Summary:
These tests have been failing for a long time due to the expected output on
Windows containing \ but the test using the / separator.
Let's simply use os.path.join to build a path with the right separator in these
tests.
Reviewed By: fanzeyi
Differential Revision: D31477722
fbshipit-source-id: a4ac25750647229974e23e305508e83917011fef
Summary:
The current `test_mock` creates a logger with default_drain, so its output isn't printed in unit tests. There is a method `logger_that_can_work_in_tests()` which overcomes this; let's use it for constructing the test CoreContext.
While testing I also spotted some leftovers from the migration to the new BonsaiDerivable. Let's fix those too.
Reviewed By: krallin
Differential Revision: D31430162
fbshipit-source-id: a086be521f0ceaeb3267e87f24980fb11587a6e7
Summary:
The callers are still converting from an ImmediateFuture to a folly::Future,
this will be tackled in a subsequent diff.
Reviewed By: fanzeyi
Differential Revision: D31349658
fbshipit-source-id: f3cef52d3fb4b6c1fb0af73399a57e8d163216b8
Summary:
Passing shared_ptr by copy everywhere can be expensive as it forces an atomic
operation to be performed. Since the caller of the glob code can easily
guarantee that the data will outlive the globbing code, let's just pass
references/pointers to it so that only references and pointers are copied.
Reviewed By: genevievehelsel
Differential Revision: D31344889
fbshipit-source-id: cee797202470aa123381d9ee22e11780722f5b33
Summary: The endpoint can fail sometimes. Add retries to make it more reliable.
Reviewed By: yancouto
Differential Revision: D31407279
fbshipit-source-id: 1b05feedd65477aa0b9cd07be30217b4f2e6d1b2
Summary: This will be used by the next change.
Reviewed By: yancouto
Differential Revision: D31407281
fbshipit-source-id: 6e3e26a0fb4864d2076d2aafcb0f0ba919bfe2cc
Summary:
The `Response` type in edenapi has extra complexity about async streaming.
It is more useful for large data like file contents. For metadata like the
hash<->location translation, using a plain `Vec` is much simpler. Using
`Vec` also makes it easier to implement retry logic.
Reviewed By: yancouto
Differential Revision: D31407278
fbshipit-source-id: 9d6df225d183eaffe72f99cfd53ae3cd2987a518
Summary:
We concluded that ideally `sync -> block_on(async)` only happens in Python
wrappers, not pure Rust code. With that, `EdenApiBlocking` makes it easier
to write anti-pattern pure Rust code, and itself looks a bit repetitive.
All users of `EdenApiBlocking` have been migrated. So let's drop
`EdenApiBlocking` now.
Reviewed By: yancouto
Differential Revision: D31407284
fbshipit-source-id: 797506ccd07f413ac041dcd2ea07b3a3519d912f
Summary:
There are only a few places using EdenApiBlocking. The other endpoints are just
using explicit `block_on` constructs.
Let's migrate the endpoints off the EdenApiBlocking trait.
Reviewed By: yancouto
Differential Revision: D31407287
fbshipit-source-id: edeb16a1ed4f50cc01c75546df7362673f941d01
Summary:
EdenApiBlocking is going away. Drop another dependency on it.
This also adds proper Ctrl+C support for the command.
Reviewed By: yancouto
Differential Revision: D31407285
fbshipit-source-id: 6c5f40f2933fdd54db197ae6b93c5f240eda1c25
Summary:
This makes it possible to drop EdenApiBlocking. Since the actual lazy clone
logic is in the Python code, we might want to only keep one of the Rust or
Python lazy clone logic eventually.
Reviewed By: yancouto
Differential Revision: D31407286
fbshipit-source-id: 3685edbf6e4709aaf8190ba65035d6627ef83fe9
Summary:
`EdenApiBlocking` was there to provide some convenience for non-async code to
use edenapi. Now a lot of places actually use async Rust properly, and calling
async Rust from non-async Rust is not a great practice. So let's plan for
`EdenApiBlocking` removal. To unblock it, drop `EdenApiBlocking` usage in
revisionstore.
Reviewed By: yancouto
Differential Revision: D31407282
fbshipit-source-id: 70a6e9468606176cf5cf119418547e694e229ec4
Summary:
Add a config option `edenapi.max-retry-per-request` to control the maximum
retry count. Replace the hard-coded 10 with it.
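To illustrate what a configurable retry limit means for a caller, here is a simplified sketch. Note that it is synchronous, whereas the real edenapi code retries futures; `retry_sync` and its parameters are hypothetical names, and `max_retries` stands in for the value read from `edenapi.max-retry-per-request`.

```rust
/// Calls `op` until it succeeds or `max_retries` retries have been used up.
/// `op` receives the zero-based attempt number. Hypothetical helper: the
/// real code operates on futures, this sketch is synchronous on purpose.
fn retry_sync<T, E>(
    max_retries: usize,
    mut op: impl FnMut(usize) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of retries: surface the last error to the caller.
            Err(err) if attempt >= max_retries => return Err(err),
            // Otherwise drop the error and try again.
            Err(_) => attempt += 1,
        }
    }
}
```

With `max_retries = 3` the operation runs at most four times (one initial attempt plus three retries).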
Reviewed By: yancouto
Differential Revision: D31407280
fbshipit-source-id: 654c79601fe12007e6cbf7ac6a4110da420801dd
Summary: Add a general retry method for futures. Move clone and pull paths to use it.
Reviewed By: yancouto
Differential Revision: D31407283
fbshipit-source-id: 46ec2bd5bacdcd10ae5078ced8d12f2c8b9ed1ec
Summary:
The error is most likely a misconfiguration. Prompt the user to check the
message.
Differential Revision: D31383947
fbshipit-source-id: c58c2b026048266fc0a8c90f31928c97a2381258
Summary:
Using modulo on arbitrary integers to get random numbers [isn't correct](https://www.internalfb.com/diff/D31305392 (da13975a4f)?dst_version_fbid=311037904117090&transaction_fbid=550270779610744), as the distribution between numbers isn't fair (unless the size is a power of two).
This was raised on D31305392 (da13975a4f), but we decided to land that quickly to unblock builds before doing these changes.
I'm applying the changes suggested on D31305392 (da13975a4f). This is what this diff does:
- For all cases where we generate small numbers (up to 5), replace with call to `Gen::choose`, so `u32::arbitrary(g) % 3` becomes `g.choose(&[0, 1, 2]).unwrap()`.
- For generating numbers in range 0..=1, I instead replaced with generating a boolean, which gets rid of the `unreachable!` calls.
- I removed the code to generate numbers in range 0..=0.
- For generating larger numbers, I used `u64::arbitrary` instead, which should make things "less wrong".
Some things I assumed, but am happy to change before landing, just let me know:
- Theoretically we don't *need* to change the code for `% 2` and `% 4`, as the math checks out there. I changed it for consistency there, but am happy to change it back.
- Using boolean also wasn't suggested initially, I'm happy to change back.
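To make the fairness argument concrete, here is a small std-only sketch (it does not use quickcheck). It counts how the 256 possible `u8` values are distributed by `% modulus`; `bucket_counts` is a hypothetical helper written only for this illustration.

```rust
/// Counts how many of the 256 possible `u8` values land in each bucket
/// when reduced with `% modulus`. `modulus` must be non-zero.
/// Hypothetical helper, for illustration only.
fn bucket_counts(modulus: u8) -> Vec<u32> {
    let mut counts = vec![0u32; modulus as usize];
    for value in 0..=u8::MAX {
        counts[(value % modulus) as usize] += 1;
    }
    counts
}
```

`bucket_counts(3)` gives `[86, 85, 85]`, so 0 is slightly more likely than 1 or 2, while `bucket_counts(4)` gives `[64, 64, 64, 64]`. This is why the `% 2` and `% 4` cases were already fair: 256 is a multiple of any smaller power of two.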
Reviewed By: krallin
Differential Revision: D31379381
fbshipit-source-id: a0bac26ebabd32a6c65f717512de998ef5dc37c8
Summary:
`commit_lookup_pushrebase_history` is an endpoint that tries to traverse Pushrebase and Commit Sync mappings to find the commit's pushed version for a landed commit. But currently it traverses the Commit Sync mapping blindly because it doesn't know the sync direction. This can (very rarely) lead to an inaccurate result.
Let's use the `source_repo` column I've introduced in this diff stack and don't traverse in the wrong direction if we know it is wrong.
Reviewed By: StanislavGlebik
Differential Revision: D30975759
fbshipit-source-id: 9c5ecf059dcdebf0c91f0c5545f0c6e95610c2ec
Summary:
Modernise tests by using the modern newserver function.
The function is shorter and it will also allow us to run the tests with Mononoke in the future.
Reviewed By: yancouto
Differential Revision: D30773102
fbshipit-source-id: 994cbcfb2688aef3e96446e1cb021db72bc70c67
Summary:
Another test that was broken by quickcheck update. I notice it breaking a run of `hg_windows` [here](https://www.internalfb.com/intern/sandcastle/log/?instance_id=4503600123928895&step_id=4503604846860319&step_index=9&name=Run%20cargo%20test#479) (though it's not windows specific).
The problem is quickcheck started using NaN on floats, which surfaced some invalid code. In roundtrip tests, we need our struct to implement `Eq`, which f32 does not (since NaN != NaN). Workaround was implementing something that also considers NaN == NaN.
Question: Do I also fix the hg-server version of the test?
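For readers unfamiliar with the NaN problem, here is a minimal sketch of the kind of workaround described above. It is not the code from this diff: `RoundtripF32` is a hypothetical wrapper whose equality treats NaN as equal to NaN, which makes an `Eq` implementation legitimate.

```rust
/// Hypothetical wrapper: compares floats so that NaN is equal to NaN,
/// which plain `f32` equality never does.
#[derive(Debug, Clone, Copy)]
struct RoundtripF32(f32);

impl PartialEq for RoundtripF32 {
    fn eq(&self, other: &Self) -> bool {
        // Treat any two NaNs as equal; otherwise use normal float equality.
        (self.0.is_nan() && other.0.is_nan()) || self.0 == other.0
    }
}

// Acceptable only because `eq` above is reflexive, including for NaN.
impl Eq for RoundtripF32 {}
```

With this, a roundtrip test can assert `RoundtripF32(before) == RoundtripF32(after)` even when quickcheck generates NaN.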
Reviewed By: krallin
Differential Revision: D31401277
fbshipit-source-id: b3eef1a3aef395a1194308788ec74f1bb5a33a42
Summary:
The backfiller should use the background session class so that we ensure data
is written to all blobstores in a multiplex.
Reviewed By: yancouto
Differential Revision: D31429168
fbshipit-source-id: 32c767dbef291771565f73cedf3cd01c1a3cce40
Summary:
When rederiving batches of commits, the batch derivation process must mark
these commits as derived so that rederivation will continue with the next
batch.
Reviewed By: yancouto
Differential Revision: D31429169
fbshipit-source-id: e9f6a84a0391ee8d72a0007f39e755410bfac724
Summary: Since we migrated all derived data types to use manager, we can now delete a bunch of code.
Reviewed By: krallin
Differential Revision: D31344999
fbshipit-source-id: db864bdc3ba0f95cb34be6e554d629d254f09608
Summary: The test output was previously incorrect due to wrong rounding from jq tool
Reviewed By: ahornby
Differential Revision: D31429221
fbshipit-source-id: 2979e393c6f6c1b52e41d732f155275166062bff
Summary:
Fix the values in the test. The test was broken by changes in jq:
jq used to round incorrectly.
Reviewed By: ahornby
Differential Revision: D31390426
fbshipit-source-id: ab4d7014109d23aa5b4fb95db4f485cea70e5b05
Summary: I'm going to reuse that derived type for unit tests in 2DService, and for that I need to make it a library.
Reviewed By: HarveyHunt
Differential Revision: D31340518
fbshipit-source-id: 3960c0d3ae9a72e1fa6dc9afb170c0c708b3cdf8
Summary:
Writing to the LocalStore is purely the responsibility of the
LocalStoreCachedBackingStore and not of the individual BackingStore. Thus, they
cannot assume that the root Tree is actually stored in it and should just
directly import it.
Reviewed By: chadaustin
Differential Revision: D31340206
fbshipit-source-id: 0f485ceb9fa71f7a7bdc8aaefaa850540075c88c
Summary:
Looking at Instruments, when issuing tons of glob queries to EdenFS, EdenFS
appears to be spending a very large amount of time adding tasks to the
UnboundedTaskExecutor. Since globs are expected to be fast, we can afford to
execute them inline, reducing this overhead and speeding up glob queries.
Reviewed By: chadaustin
Differential Revision: D31289485
fbshipit-source-id: 428fff9f5fea65073b2a061dc7070d63ae36d95d
Summary:
The globbing algorithm is recursive and returns its own glob results merged
with its children's glob results. The merging is done by simply copying the
children's glob result and returning it. What this means is that a single
GlobResult will be copied K times, with K being the recursion depth at which it
was created. This makes the total number of copies be O(K*N) with N the result
length.
By creating the GlobResult directly in a shared vector, we can avoid
these copies entirely, at the expense of taking a
lock.
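The difference between the two strategies can be sketched as follows. This is a Rust illustration, not the EdenFS C++ code: `Dir`, `glob_by_merge` and `glob_shared` are hypothetical names, and the `merged` counter exists only to make the extra per-level work visible.

```rust
use std::sync::Mutex;

/// Hypothetical directory tree used only for this illustration.
struct Dir {
    files: Vec<String>,
    children: Vec<Dir>,
}

/// Old shape: each level returns its own results merged with its children's,
/// so an entry found at depth K is merged K times on the way back up.
fn glob_by_merge(dir: &Dir, merged: &mut usize) -> Vec<String> {
    let mut results = dir.files.clone();
    for child in &dir.children {
        let child_results = glob_by_merge(child, merged);
        *merged += child_results.len();
        results.extend(child_results);
    }
    results
}

/// New shape: every level appends to one shared vector, taking a lock instead.
fn glob_shared(dir: &Dir, out: &Mutex<Vec<String>>) {
    out.lock().unwrap().extend(dir.files.iter().cloned());
    for child in &dir.children {
        glob_shared(child, out);
    }
}

/// Convenience wrapper that owns the shared vector for the whole traversal.
fn glob_shared_collect(dir: &Dir) -> Vec<String> {
    let out = Mutex::new(Vec::new());
    glob_shared(dir, &out);
    out.into_inner().unwrap()
}
```

Both functions produce the same list for the same tree; only the merge-based version pays for re-merging child results at every level.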
Reviewed By: chadaustin
Differential Revision: D31288036
fbshipit-source-id: ae8a98a01eab2ba7f23908d347d7a4ec199cdfab
Summary:
In the case where a child is already loaded, this allows the code to not
allocate a SemiFuture, which should benefit code that repeatedly calls it on
already loaded inodes. One typical example is the globbing entry point that
needs to find the TreeInode where the glob needs to be applied on.
Reviewed By: chadaustin
Differential Revision: D31283398
fbshipit-source-id: 76f82d74f2a45ee2b3b9bf442d47c0a2262bced9
Summary:
One layer above getChildRecursive is getInode, let's make this one use an
ImmediateFuture too.
Reviewed By: chadaustin
Differential Revision: D31283397
fbshipit-source-id: 8bc524bea857d6ec5bc045d6e3383d38133c3b38
Summary:
This diff uses the helper proc macro added on the previous diff to simplify the code for dozens of api objects.
Over 1000 lines deleted :)
Examples of structs that couldn't be migrated:
- Wire structs that didn't rename the fields to numbers. (e.g. WireHistoryRequest) (would need some extra migration)
- enums (not currently supported by the proc macro)
- Wire structs that didn't map directly to their non-wire counterparts (e.g. WireSnapshotState)
I added some comments with possible future improvements, but didn't pursue them right now as they are significantly less useful than this diff itself, which covers most of the cases.
Reviewed By: ahornby
Differential Revision: D31057140
fbshipit-source-id: 88a867ba2cdfedf6a96a8ca3718508073822b962
Summary:
This diff implements a proc macro that derives a "wire" object from a simple api obj for edenapi. It is enough to cover most of the wire objects and simplify code a lot.
Differences from initial proof of concept:
- Renamed `default_wire` to `auto_wire`. I'm happy to rename again if preferred.
- Fields must have `#[id(X)]` notation.
- Fields with default values are not serialized.
- Arbitrary implementation also is derived
This diff only uses this proc_macro on one Api Object, in order to make reviewing it easier, focusing on the implementation of the macro. The next diff uses this macro in all possible objects, actually doing the bulk of the refactoring.
Reviewed By: ahornby
Differential Revision: D31054734
fbshipit-source-id: d6136faf84492983ca69461fe243e830021b2f66
Summary:
Plus a minor refactoring to use the io::IsTty trait in edenfs_client::status instead of calling into libc directly.
This commit reintroduces 218f06a4e648 after its reversion in e9ef7c2142d0. The Windows build broke because I accidentally left a stray `#[cfg(unix)]` above an unrelated `use` line.
Differential Revision: D31248594
fbshipit-source-id: ddee62d9dc4d0b99d2e67fe9b757748ab501e030
Summary:
Setting EDENSCM_LOG can achieve the same effect. So let's remove the redundant
logic. This also avoids overhead constructing the strings when tracing is off.
Reviewed By: yancouto
Differential Revision: D31359046
fbshipit-source-id: db53cc16f1efcf6111535090a3eadec19b888329
Summary: Same as D30974102 (91c4748c5b) but for mercurial filenodes.
Reviewed By: markbt
Differential Revision: D31170597
fbshipit-source-id: fda62e251f9eb0e1b6b4aa950d93560b1ff81f67
Summary:
This fixes these tests (which are blocking the deployment):
```
FAILED eden/mononoke/mononoke_types:mononoke_types-unittest - content_metadata::test::content_metadata_blob_roundtrip
FAILED eden/mononoke/mononoke_types:mononoke_types-unittest - content_metadata::test::content_metadata_thrift_roundtrip
FAILED eden/mononoke/mononoke_types:mononoke_types-unittest - file_contents::test::file_contents_blob_roundtrip
FAILED eden/mononoke/mononoke_types:mononoke_types-unittest - file_contents::test::file_contents_thrift_roundtrip
```
They were broken since the update of quickcheck, as it now checks with big values for ints, which broke these tests as the code didn't do a good job of converting between u64/i64. This diff fixes that.
Usually, we just want to use `as` conversion everywhere. Thrift only stores i64, but it's fine to store a negative value that will correctly wrap back to the original positive u64 value. In practice though, this shouldn't really happen, as these are sizes, and we're never getting anywhere near the u64 limit.
Using `TryInto` doesn't cause these overflows, but for plain storage it makes things work in fewer cases.
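The trade-off can be shown with a few std-only lines. The helper names (`to_wire`, `from_wire`, `to_wire_checked`) are hypothetical and not taken from the mononoke_types code; they only contrast the two conversion styles.

```rust
use std::convert::TryFrom;

/// Stores a `u64` size in an `i64` field (as Thrift requires) using `as`,
/// which reinterprets the bits and never fails.
fn to_wire(size: u64) -> i64 {
    size as i64
}

/// Recovers the original `u64` from the stored `i64`.
fn from_wire(stored: i64) -> u64 {
    stored as u64
}

/// The checked alternative rejects any value above `i64::MAX`.
fn to_wire_checked(size: u64) -> Option<i64> {
    i64::try_from(size).ok()
}
```

`to_wire(u64::MAX)` is stored as `-1` and `from_wire(-1)` gives back `u64::MAX`, so the `as` roundtrip is lossless for every input. `to_wire_checked(u64::MAX)` returns `None`, which is why the checked conversion fails roundtrip tests on large generated values.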
Reviewed By: krallin
Differential Revision: D31379001
fbshipit-source-id: feeb87a62148f97b3bd467e8c2ef2156c8e3329a
Summary: Establish a http connection to x2pagentd using a unix socket.
Reviewed By: mzr
Differential Revision: D31204332
fbshipit-source-id: d1f98dc0e4eae4fb918cd77a9571e1a2da7d7ed2
Summary:
If a file is replaced by a move or copy from another file, we should ignore
this copy info when computing the renames in a diff. If we do not, then we
will fail to include the deletion of the copy source in the case of a move, and
the copy information doesn't add anything anyway as the destination file will
just show as modified.
Reviewed By: mitrandir77
Differential Revision: D31180151
fbshipit-source-id: c89a8ae26a516fd958406bb967a587b3b6c36a48
Summary: Switch derivation of Git trees to use the `DerivedDataManager`.
Reviewed By: yancouto
Differential Revision: D31303798
fbshipit-source-id: 193f5d373a56a0d1099f49db76758227d15c3762
Summary: Add overrides for changesets, filenodes and bonsai_hg_mapping in the derived data manager.
Reviewed By: yancouto
Differential Revision: D31378456
fbshipit-source-id: b1faa543ca65fa041d2d0ddc908ea5fb950d023a
Summary:
Vendoring this patch: https://github.com/curl/curl/pull/7737 to the curl-sys rust crate. On Windows the hg client is using the curl that's bundled with curl-sys. I need this patch to have unix domain sockets working in the hg client on Windows.
I had to manually vendor https://raw.githubusercontent.com/mzr/curl/57e7ec4dbe4dd2831de51f2644879387d2ea7b44/docs/INSTALL because reindeer didn't do it. IDK why.
oss-eden-{darwin,linux,windows}-getdeps fail with:
```
FAILED: eden/scm/lib/backingstore/CMakeFiles/rust_backingstore.cargo eden/scm/lib/backingstore/debug/libbackingstore.a eden/scm/lib/backingstore/release/libbackingstore.a
cd /data/sandcastle/temp/fbcode_builder_getdeps/shipit/eden/eden/scm/lib/backingstore && /data/sandcastle/temp/fbcode_builder_getdeps/installed/cmake-hQhVzQT-WzFKTeqXjLxo5lLi8IG4_MjX2-YRqptCUVs/bin/cmake -E remove -f /data/sandcastle/temp/fbcode_builder_getdeps/shipit/eden/eden/scm/lib/backingstore/Cargo.lock && /data/sandcastle/temp/fbcode_builder_getdeps/installed/cmake-hQhVzQT-WzFKTeqXjLxo5lLi8IG4_MjX2-YRqptCUVs/bin/cmake -E env CARGO_TARGET_DIR=/data/sandcastle/temp/fbcode_builder_getdeps/build/eden/eden/scm/lib/backingstore CARGO_HOME=/data/sandcastle/temp/fbcode_builder_getdeps/build/eden/_cargo_home cargo build --release -p backingstore --features fb
Blocking waiting for file lock on package cache
Blocking waiting for file lock on package cache
error: failed to calculate checksum of: /data/sandcastle/boxes/eden-trunk-hg-fbcode-fbsource/third-party/rust/vendor/curl-sys-0.4.45+curl-7.78.0/curl/docs/INSTALL
Caused by:
failed to open file `/data/sandcastle/boxes/eden-trunk-hg-fbcode-fbsource/third-party/rust/vendor/curl-sys-0.4.45+curl-7.78.0/curl/docs/INSTALL`
Caused by:
No such file or directory (os error 2)
```
No idea how to fix this. Seems related to the fact that reindeer didn't vendor docs/INSTALL.
# EDIT:
# It might have been caused by some bug in hg. Now it's fine.
# The failure in oss-eden-linux-getdeps looks unrelated (something with rocksdb)
Reviewed By: krallin
Differential Revision: D31370778
fbshipit-source-id: a1245f8cb6b58f5765e34c95dfd78325a8e6e457
Summary:
There were a few Wire objs that didn't have tests because they didn't implement Arbitrary. I added simple implementations for them.
I did it mostly for the request/response types, as they have the other types inside them.
Reviewed By: markbt
Differential Revision: D31019959
fbshipit-source-id: edf283fd79a40b794c89db79c026f1bcaf834a9f
Summary:
This diff adds snapshot tests for most eden api wire types, while at the same time making the testing code much smaller, including tests for "wire" and "serialize" roundtrips.
## Context:
The previous diff had added an easy way to add snapshot tests. This stack aims to simplify the wire protocol code needed to create/modify an endpoint. A good thing to do before that is to add snapshot tests to all wire types, so that if we change them in a refactor, we're confident they still work exactly the same. This will also be useful when a type is changed in the future.
## How this makes tests easier
- In order to create snapshot tests, we need example objects to test with. Luckily we already use a framework for generating example objects (quickcheck::Arbitrary), so the idea here is to use that to make snapshot tests as automatic as possible.
- At the same time, the "wire" and "serialize" roundtrip tests (which also used Arbitrary), can also be made more automatic.
Now, using a simple helper, `auto_wire_tests!(WireObjectName)`, it is possible to derive all three types of tests automatically. This makes the current code smaller, and safer as we now have the additional safety provided by snapshot tests.
## Observations
- Not all wire types had tests implemented for them (I assume because it was too much work doing so, and might have done that myself in the past), I only moved the ones that already had. I'll do another pass and add remaining objects on a following diff.
- There are a couple actual non-refactor changes. I'll add comments explaining those.
- quickcheck crate is using quite an old version. I tried updating but it snowballed into something much more complicated, so I kept using the old version. We'll need to get to it at some point, though.
Reviewed By: markbt
Differential Revision: D31019233
fbshipit-source-id: 30c4a90848d0a5dcaffb89b9a0cd1cebfe4ace55
Summary:
This diff adds a way to do snapshot tests in SC, by simply calling a macro. It uses that as an example on two wire objects to show it works.
Motivation:
- Using snapshot testing in wire object will make it super clear what changes are happening on the wire format, which is useful to prevent breakages. (see quip)
- I'm going with approach #3 from proposal, but both #2 and #3 would need snapshot testing.
How?
- I'm using [insta](https://docs.rs/insta/1.8.0/insta/), a rust crate for snapshot testing. Unfortunately it depends on some cargo-specific stuff that doesn't work with buck, but I was able to get it working without patching the package by bypassing the Cargo dependent stuff.
- Insta also provides some cargo extensions on top of it to compare snapshots, which we can't have, so I made it clear in the test errors (see test plan) how to update the snapshots (via flag).
Future
- In future diffs I'll add a snapshot test for all the wire objects; this diff just adds it to the first few to show it works and to set up the plumbing.
- Writing objects manually for snapshot tests for **every** wire object would be a large amount of code, and troublesome when adding new (especially big) wire objects. We already have wire objects implementing `Arbitrary`, so I think it should be possible to use this to generate simple snapshot tests on automatically generated objects. (It will need some extra work to make sure they are created consistently.)
Reviewed By: markbt
Differential Revision: D30958077
fbshipit-source-id: 3d8663e7897e5f6eb4b97c24f47b37ef2f138e5a
Summary:
This is fairly mechanical diff that finalizes split of Hash into ObjectId and Hash20.
More specifically this diff does two things:
* Replaces `Hash` with `Hash20`
* Removes alias `using Hash = Hash20`
Reviewed By: chadaustin
Differential Revision: D31324202
fbshipit-source-id: 780b6d2a422ddf6d0f3cfc91e3e70ad10ebaa8b4
Summary:
A spawner type is required for new thrift clients, specify the noop one for now.
This also requires regenerating the generated thrift libraries.
Reviewed By: yancouto
Differential Revision: D31338518
fbshipit-source-id: cbecf3ec6f9678918ca459c19f1cc160214fadfd
Summary:
We now use post-pull hook to mark commits as landed:
hooks.post-pull.marklanded=hg debugmarklanded
So there is no need to mark them inside the pull command. In fact, it seems
something is wrong (phases aren't invalidated properly?) so that the
pullcreatemarkers logic might actually hide commits during pull incorrectly:
commit 1e964a4302c03e5ae48e5b85b0fc0bf27f847b09
Author: metalog <metalog@example.com>
Date: Tue Sep 28 17:29:49 2021 +0000
pull
Parent: a4511a83cf862cc7216c15d83a3f4ff9d3b3241b
Transaction: pullcreatemarkers
RootId: 18d81a1531ecea65affe83c25804c790cac57c59
diff --git a/visibleheads b/visibleheads
index eb58137..da5f45a 100644
--- a/visibleheads
+++ b/visibleheads
@@ -1,16 +1,6 @@
v1
-e82de77adcc261cb306dafeb6cbe15f26f7de768
-91cf8c4b47e433acbe3f774e608eee42a3ad089d
8c071e5aa26d920f1b88c4b1cd10f6e946d4312a
536ff436cde4ec53d74c02ae2c5ed6f60609e01a
-e49b834a6ff9b61a95a743d22703dc6634f2918c
-c53b65542d4583ae82835144ac7f72dce7a6f335
1010312ed7ac683c5e97ad765cdbcb4927ddf62c
-a563a5f35d2df4c54ba1fe2401aa1cd929a218bf
-55c0c7483c6cd85114a9be587d11c87aaceaeff4
-74079567b677c5799b6c67855683ec346ebc3cce
33b5fd6055da2284cf5224f70e1bb9791ed87641
-ec221dff2393793f7b1e11ea9f9e9ea87b79da1f
-cc466614d43a6af24520b7a81653435f4a614fbb
2d7466be885e757d2d41630a3148fd31f5199ffa
-226c8136603dcbcc408133a2122e93fc045527fa
While we're here, document some other config options that exist in the code.
Reviewed By: DurhamG
Differential Revision: D31295709
fbshipit-source-id: e26c728215a209ab5dfaee7a84daece8197a1cc4
Summary:
With the lookup processor now returning an ImmediateFuture, we can focus on its
caller, starting with getChildRecursive.
Reviewed By: chadaustin
Differential Revision: D31283396
fbshipit-source-id: 97abc57b9efe3540c5770aa952995c257e6eda4b
Summary:
While profiling glob queries sent by Buck, I noticed that EdenFS spends almost
as much time resolving the inode to perform the glob on as EdenFS takes to
actually perform the glob. Futures-related overhead shows up as predominant,
thus let's convert these to ImmediateFuture to speed this up.
Reviewed By: chadaustin
Differential Revision: D31283395
fbshipit-source-id: 7355ddf7498f722ed8ec2989f010a28fb15c293f
Summary:
While std::variant is convenient, it is both slow to compile, and the
compiler cannot optimize it as well as a manually written tagged union. Since
ImmediateFuture is performance critical for EdenFS, let's use a tagged union
and speed them up by an additional 40%.
Reviewed By: chadaustin
Differential Revision: D31272296
fbshipit-source-id: e34be4489a596d3577b3bd900a1f20d6c7d8b693
Summary:
The max duration would cause UBSAN failures due to folly's SemiFuture code
multiplying the value which understandably cannot be represented. Splitting the
function is easy and avoids the problem entirely.
Reviewed By: genevievehelsel
Differential Revision: D31272297
fbshipit-source-id: c15ca70ad771c11b4f68bb9974422c0986d4928b
Summary:
When Buck is using the EdenFS globber, the searchRoot argument is set, thus
let's add a new argument to the benchmark to simulate a Buck workflow.
Reviewed By: chadaustin, genevievehelsel
Differential Revision: D31283399
fbshipit-source-id: 5e32b2aceb6090e26e88cf7f0d163448d56107d4
Summary:
The goal of this stack is to remove the Proxy Hash type, but to achieve that we first need to address some tech debt in the Eden codebase.
For a long time EdenFS had a single Hash type that was used for many different use cases.
One of the major uses of the Hash type is identifying internal EdenFS objects such as blobs, trees, and others.
We seem to have reached agreement that we need a different type for those identifiers, so this diff introduces a separate ObjectId type to denote the new identifier type and replaces _some_ usages of Hash with ObjectId.
We still retain the original Hash type for other use cases.
Roughly speaking, this is how this diff separates Hash and ObjectId:
**ObjectId**:
* Everything that is stored in the local store (blobs, trees, commits)
**Hash20**:
* Explicit hashes (SHA-1 of the blob)
* Hg identifiers: manifest id and blob hg id
For now, in this diff ObjectId has exactly the same content as Hash, but this will change in future diffs. Doing it this way keeps the diff size manageable, while migrating to the new ObjectId right away would produce an insanely large diff that would be hard both to make and to review.
There are a few more things that need to be done before we can get to the meat of removing proxy hashes:
1) Replace include Hash.h with ObjectId.h where needed
2) Remove the Hash type, explicitly rename the rest of the Hash usages to Hash20
3) Modify the content of ObjectId to support new use cases
4) Modify serialized metadata and possibly other places that assume ObjectId size is fixed and equal to Hash20 size
Reviewed By: chadaustin
Differential Revision: D31316477
fbshipit-source-id: 0d5e4460a461bcaac6b9fd884517e129aeaf4baf
Summary:
https://pxl.cl/1Qh3j
This is the most called edenapi endpoint by far. If we sample logging of it, we can increase the retention of the scuba table.
If we wish, it's possible to keep the retention unchanged for some "non-trivial" requests, but I haven't done that.
Reviewed By: liubov-dmitrieva
Differential Revision: D31277391
fbshipit-source-id: ee19e9daa4cd39c5d3eac1063e82aa40fc108bc7
Summary:
This is used to uniquely identify requests in gotham. It's logged to output, on errors, and on Scuba.
Problem: On Scuba, this packs very badly, as it is a large string (36 chars) that is unique for each request.
Solution: Let's use a prefix of it, which should reduce the size used on Scuba. This diff uses a prefix of length 5.
This affects both LFS and EdenApi.
Pros:
- Reduces size
- Very easy fix
Cons:
- More chance of conflict. The space of this id is 16^5, or about 10^6. There will surely be conflicts, but maybe that's not a huge deal?
Alternative: Using 8 digits, that's about 4 billion ids, which will reduce conflicts significantly in exchange for more space.
Why not use an int id (for example, u64), or use other characters in the id (not only hex)? This would reduce the size of the data significantly, but has drawbacks:
- For int, this would require a big refactoring, as everything assumes the id is a string. Especially since this goes through client-server, it might be complicated.
- Anything other than taking a prefix means more processing on each request, and means we need to recalculate it every time.
- Size reduction might not be that big, as scuba already packs stuff pretty well.
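To put rough numbers on the 5-digit versus 8-digit trade-off discussed above, here is a small sketch using the standard birthday-bound approximation. It assumes the hex digits of the prefix are uniformly random, and the request count of 10,000 is an arbitrary illustrative figure, not a measured traffic number.

```python
import math


def collision_probability(hex_digits: int, requests: int) -> float:
    """Approximate P(at least one id collision) among `requests` random ids.

    Uses the birthday-bound approximation 1 - exp(-n(n-1) / (2 * space)),
    assuming every id is drawn uniformly from 16 ** hex_digits values.
    """
    space = 16 ** hex_digits
    return 1.0 - math.exp(-requests * (requests - 1) / (2.0 * space))


# 10,000 is a hypothetical number of requests in the window being inspected.
for digits in (5, 8):
    space = 16 ** digits
    p = collision_probability(digits, 10_000)
    print(f"{digits} hex digits: {space} possible ids, P(collision) ~ {p:.4f}")
```

Under these assumptions a 5-digit prefix almost certainly collides somewhere within 10,000 requests, while an 8-digit prefix collides in roughly one such window out of a hundred.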
Reviewed By: krallin
Differential Revision: D31305547
fbshipit-source-id: 23f6b6cb7de5b7a090864db414d4d71cd68c4946
Summary: D31115820 (ae87b82eaf) updated quickcheck, but there's some stuff we need to fix forward. This diff fixes the remaining failures I could find.
Reviewed By: farnz
Differential Revision: D31305392
fbshipit-source-id: a6684d47833bc0fd933751c13cdd71392cb1833b
Summary:
Use the `hg cloud upload` command if usehttpupload is enabled.
For the few users who opted out of cloud sync, we should use the `hg cloud upload` command instead of `hg cloud backup` if usehttpupload is enabled.
Reviewed By: markbt
Differential Revision: D31205919
fbshipit-source-id: 7619e7b299e19a7626782e7b3c1a69e7cd7dbc1b
Summary:
Somehow before this fix I've seen us running out of semaphores and deadlocking
because we were not freeing semaphores immediately after finishing running the
function that requires the git repo.
Reviewed By: mojsarn
Differential Revision: D31310626
fbshipit-source-id: ba12b2d4918ecc30ca0aa6ff011176f7634badf9
Summary: Need this for cargo check/rust-analyzer to work.
Reviewed By: guswynn
Differential Revision: D31319911
fbshipit-source-id: ebd3fa72d8fc3667391a2067f95cab9e5f53301f
Summary: I'm parsing some deeply-nested JSON, and it's running into the limit. This feature enables a potential footgun, but even with the feature enabled you have to add code to reach for said footgun.
Reviewed By: jsgf
Differential Revision: D31284743
fbshipit-source-id: 00ea5d7d7db8bdeb878d48fe390831f39e007409
Summary:
When calling `changelog.idmap`, it takes a snapshot of the IdMap.
When flushing the changelog, the IdMap might change. Some Ids might get
re-assigned.
For code (phases?) that keeps the `nameset` for a while, the idmap might become
outdated after a changelog flush. That might be a cause of removing visible
heads incorrectly.
Let's change `_torev` and `_tonode` to properties so they always take the
"latest" idmap snapshot.
Reviewed By: DurhamG
Differential Revision: D31296057
fbshipit-source-id: 1b3fec5d21649eab772ab3a150a3182a18b94edf
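The `_torev` / `_tonode` change in the commit above can be illustrated with a minimal sketch. The `Changelog` and `NameSet` classes here are hypothetical stand-ins, not the real Mercurial classes; the only point is the difference between capturing an idmap snapshot once and looking it up through a property on every access.

```python
class Changelog:
    """Hypothetical stand-in: `idmap` returns a snapshot of the current mapping."""

    def __init__(self):
        self._node_to_rev = {"aaa": 0}

    @property
    def idmap(self):
        # Each access takes a fresh snapshot (a copy) of the mapping.
        return dict(self._node_to_rev)

    def flush(self):
        # Flushing may re-assign ids; simulate that with a new mapping.
        self._node_to_rev = {"aaa": 5}


class NameSet:
    """Hypothetical stand-in for a long-lived nameset."""

    def __init__(self, changelog):
        self._changelog = changelog

    @property
    def _torev(self):
        # A property, so every use goes through the latest idmap snapshot
        # instead of one captured when the nameset was created.
        return self._changelog.idmap.__getitem__


changelog = Changelog()
nameset = NameSet(changelog)
stale_torev = changelog.idmap.__getitem__  # captured once, before the flush

changelog.flush()
print(stale_torev("aaa"), nameset._torev("aaa"))
```

After the flush, the function captured up front still answers from the old snapshot, while the property-based lookup reflects the re-assigned id.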
Summary:
It was added on the client side in D30686450 (7eb11cb392) to handle octopus merges correctly,
let's add it on mononoke as well, otherwise new_streaming_clone fails to parse
a revlog.
Reviewed By: mitrandir77
Differential Revision: D31305651
fbshipit-source-id: 976d7fdb8775f859e4732fd8a68f9b28f04ce4f9
Summary: This implements unix socket support for mercurial's HTTPConnection so commitcloud can use it.
Reviewed By: ahornby
Differential Revision: D31229256
fbshipit-source-id: a610c3c34be608ac2d9b41f3a7b6b62b44227b94
Summary:
We want to reduce duplicated work. Since requests will be consistently hashed, each instance of the service will receive some set of requests from multiple clients. By storing requests together with a shared derivation future, any client can get the state of the derivation.
In addition, upon receiving requests we can clean up the map to remove already completed futures so the map will not grow indefinitely.
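A rough sketch of the idea, written in Python rather than the service's own language: requests for the same key share one future, and completed futures are pruned whenever a new request arrives. `DerivationRequests`, `submit` and the `derive` callback are hypothetical names used only for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class DerivationRequests:
    """Hypothetical map from request key to a shared in-flight future."""

    def __init__(self, executor):
        self._executor = executor
        self._lock = threading.Lock()
        self._inflight = {}

    def submit(self, key, derive):
        with self._lock:
            # Prune completed futures first so the map cannot grow forever.
            self._inflight = {
                k: f for k, f in self._inflight.items() if not f.done()
            }
            future = self._inflight.get(key)
            if future is None:
                # First request for this key: start the work exactly once.
                future = self._executor.submit(derive, key)
                self._inflight[key] = future
            # Later requests for the same key share the same future.
            return future


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        requests = DerivationRequests(pool)
        first = requests.submit("changeset-1", lambda key: key.upper())
        print(first.result())
```

Any client holding the shared future can poll `done()` or wait on `result()`, which corresponds to "getting the state of the derivation" in the summary above.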
Reviewed By: StanislavGlebik
Differential Revision: D30776322
fbshipit-source-id: 961055f8b3328378451edd677506d7e716a9afd2
Summary:
Now the runlog for an hg command invocation includes any progress bar metadata. I repurposed the existing Rust progress thread to also update the runlog progress (only if it has changed).
To avoid race conditions with the main thread writing the final "exit" runlog entry, updating the runlog progress is a no-op if the runlog's exit code has already been set.
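The guard described above can be sketched as follows. This is a Python illustration with hypothetical names (`RunlogEntry`, `update_progress`, `close`), not the actual Rust implementation; `writes` simply counts how many times the entry would have been written to storage.

```python
import threading
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RunlogEntry:
    """Hypothetical runlog entry shared by the main and progress threads."""

    progress: List[str] = field(default_factory=list)
    exit_code: Optional[int] = None
    writes: int = 0
    _lock: threading.Lock = field(
        default_factory=threading.Lock, repr=False, compare=False
    )

    def update_progress(self, progress: List[str]) -> bool:
        """Called from the progress thread. Returns True if a write happened."""
        with self._lock:
            if self.exit_code is not None:
                # The final "exit" entry is already written: do nothing.
                return False
            if progress == self.progress:
                # Progress has not changed: skip the write.
                return False
            self.progress = list(progress)
            self.writes += 1
            return True

    def close(self, exit_code: int) -> None:
        """Called from the main thread when the command exits."""
        with self._lock:
            self.exit_code = exit_code
            self.writes += 1


entry = RunlogEntry()
entry.update_progress(["fetching 1/10"])
entry.close(0)
print(entry.update_progress(["fetching 2/10"]))  # late update is ignored
```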
Reviewed By: quark-zju
Differential Revision: D31065260
fbshipit-source-id: 181661cb06ab2910d8a0e41f5aa767528eb234f5
Summary:
The runlog's purpose is to store live information for every hg invocation. Users/VSCode will access the runlog data to see details about active hg commands.
In this initial commit I've added basic start/end updates to the runlog. The only current storage option is JSON files written to ".hg/runlog/<random ID>". Cleanup of the files will be added later. In the future I may look at sqlite as an alternative.
Set runlog.enable=True to turn on the runlog.
Reviewed By: quark-zju
Differential Revision: D31065258
fbshipit-source-id: 3ff29e1b8473f7e0b6b0d02537d1f18c2c5026fb
Summary:
The old message was a little misleading, as in some cases EdenFS was running but couldn't mount the repository. Mercurial would still tell the user that EdenFS is not running, which is not accurate.
The new message is trying to cover this case to avoid confusing people.
Reviewed By: zhengchaol
Differential Revision: D31278947
fbshipit-source-id: dd3e599654390269b6cf31d8842105970cb29cc0
Summary:
This updates the following crates to their latest versions:
- `rand`: 0.7 => 0.8
- `quickcheck`: 0.9 => 1.0
Both crates introduced some breaking changes, so affected clients had to be fixed accordingly. Most changes are rather mechanical and shouldn't change the existing logic. In addition, a few buggy property tests were uncovered, presumably due to `quickcheck` becoming smarter with its choice of inputs in the newer version, and the fixes are included in this diff.
Reviewed By: yancouto
Differential Revision: D31115820
fbshipit-source-id: 60a61dfac3236fd93cd4f03b86506654d81d330f
Summary: This diff fixes some integration test errors after enabling the new edenfsctl.
Reviewed By: xavierd
Differential Revision: D30789741
fbshipit-source-id: 02d74defc41def4fb6ea0cc4694f944b4c0044e2
Summary:
Some detail polishing.
Incomplete commands are commented out. Help messages are now printed correctly. Fixed a small behavior divergence in `eden config` (`to_string_pretty` uses multi-line string instead of escaping characters).
Reviewed By: xavierd
Differential Revision: D30547011
fbshipit-source-id: 98d323744ce7a7fc989cbf79dd07ed8af3cee09d
Summary: This diff adds the Rust edenfsctl to our open source build.
Reviewed By: xavierd
Differential Revision: D30788685
fbshipit-source-id: 603caa933ecfc5af0ede7e22f6c7911462da3a65
Summary:
The lookup of content ids was not working as expected.
Reasons:
- If content id was provided, we never checked it was actually on the blobstore, and failed when building the metadata for it. This was happening since D30016963 (f64520a312)
This diff fixes that by explicitly checking it exists. I also added some comments to clarify.
Reviewed By: liubov-dmitrieva, StanislavGlebik
Differential Revision: D31268102
fbshipit-source-id: 9801a7f4ce1536e68f44ebe114087e53cf094d7a
Summary:
Functions that only take boolean arguments are fairly confusing and error
prone. Here, since we only ever pass a single true value to it, we could simply
inline setting the right counter in the caller. This makes the code easier to
read, and less error prone.
Reviewed By: genevievehelsel
Differential Revision: D31188413
fbshipit-source-id: 64c019ff52b1ff5644e5bea11a361e586044403f
Summary:
Changes to edenfs-client seem to be breaking the hgbuild windows job https://www.internalfb.com/intern/sandcastle/job/27021598254894733/
Original commit changeset: 218f06a4e648
Reviewed By: DurhamG
Differential Revision: D31244893
fbshipit-source-id: e9ef7c2142d0a6afca342f84574d553b136b5fdb
Summary: I would like to use the httpclient.HTTPConnection client because in the following diffs I am adding unix domain socket support to it, and jplopezgu will use that support for commitcloud.
Reviewed By: ahornby
Differential Revision: D31229252
fbshipit-source-id: 8999f27b68f9c7aa9f725d65c291f4d338d3b813
Summary:
One way to mitigate the skiplist inefficiencies is to just use segmented
changelog if we can.
Currently we can do it only for commits on master bookmarks for most repos but
upcoming defrag work from farnz would allow us to include release branches
there as well. That will cover most of the is_ancestor queries.
NOTE: This is not the end of diffs switching us to use segmented changelog. I'm planning to also do it for other places where we do ancestry checks and lower common ancestor operations.
Reviewed By: StanislavGlebik
Differential Revision: D31169338
fbshipit-source-id: 9d4b27d3fb22016b0239c52d71a9b2d9ae9a103b
Summary: This would allow us to benefit from segmented changelog server-side
Reviewed By: StanislavGlebik
Differential Revision: D31169337
fbshipit-source-id: 3c648ed2f144cee57de7c319692a37b04adf5705
Summary: Previously, all EdenAPI methods supported callback-based progress reporting. With the new HTTP progress bars, this old progress API is no longer used anywhere (except for a test program). Let's clean it up to get rid of the extra parameter for every method.
Reviewed By: andll
Differential Revision: D31184693
fbshipit-source-id: 996959e0d81dd7685fcfaca98f162e7267684306
Summary:
In D29734333 (3f8de3336a) this admin command started depending on innerRepo because it
needed access to ephemeral blobstore. It didn't need other parts of inner repo
so there's no need for that dependency.
Reviewed By: krallin
Differential Revision: D31210293
fbshipit-source-id: 004fb95d17e7e1d3095db0258f3c55dadaf5524c
Summary:
This mode rederives commits and checks that what was rederived is the same as
what has already been derived. It's useful for testing any changes to the derived data
logic and making sure these changes don't introduce any bugs.
Reviewed By: markbt
Differential Revision: D31143741
fbshipit-source-id: 618dbf12ab444b5686d50f83a590314adc6c5dda
Summary: Remove some more path clones by changing within_restrictions to take Option<&MPath>
Reviewed By: StanislavGlebik
Differential Revision: D31175004
fbshipit-source-id: 92f0b4b594c4b3e30258acd019e7f42d9b3bc5fb
Summary: Remove a couple of clones of path by moving up ChangesetPathContentContext::new_with_fsnode_entry
Reviewed By: StanislavGlebik
Differential Revision: D31175005
fbshipit-source-id: fa686f69087e317877c2c9a9c0cffe05a6006775
Summary:
`self.map.contains_vertex_name` only checks the local `map` without triggering
remote protocols. That means with lazy changelog clone, if `master` points to
a lazy commit, the clone will fail. Fix it by switching to `self.contains_vertex_name`,
or even better, `self.vertex_id_batch` to do proper batching.
Reviewed By: StanislavGlebik
Differential Revision: D31228524
fbshipit-source-id: 229d8a92c5517ac5a1dbfa3f440df88a4ab8e3e6
Summary:
In advance of Thrift servers defaulting the queue timeout to 100 ms,
which is quite low for EdenFS's needs, explicitly set our queue
timeout to 5 seconds.
Reviewed By: zhengchaol
Differential Revision: D31218348
fbshipit-source-id: 35a109fb6848f7c81c4b58d70e2beae90557e1c8
Summary: We can just use getBackingStores, as is done for `startRecordingBackingStoreFetch`, and only record non-empty fileAccesses. This will enable fetch logging for LocalCacheBackingStores, which use an HgQueuedBackingStore under the hood.
Reviewed By: zhengchaol
Differential Revision: D31215109
fbshipit-source-id: 443d28a57144fdcf078bd653ecf5726825f55740
Summary: Fix the dynamic casting for getting a tracebus for the trace hg entrypoint. A dynamic cast still makes sense at this point, since `trace hg` should only be called on hg-backed mounts.
Reviewed By: chadaustin
Differential Revision: D31214737
fbshipit-source-id: 65e018e6658d934d8ecd3434bdfc3d72f6873d2b
Summary: Instead of dynamic casting to find the repo name, all backing stores can return an optional repo name, and callers can check whether the optional is set.
Reviewed By: zhengchaol
Differential Revision: D31214723
fbshipit-source-id: 9d10114ff6bde13254d3a3caaf2401f87d07ffd7
Summary: Add more information to the runtime error thrown by the dynamic cast failure in `eden trace hg` and predictive fetch.
Reviewed By: zhengchaol
Differential Revision: D31212247
fbshipit-source-id: 982901dfd2eb05db9ca6e7366277a07b6b29872f
Summary:
VC++ 2019 is pickier about which standard library includes include
each other. Be explicit.
Reviewed By: zhengchaol
Differential Revision: D31186916
fbshipit-source-id: 95cfa8848d0e2e312e2024923fa166db5f68dde0
Summary:
This unused feature allowed a sub-process to inherit an hg repo wlock from the parent process. It was apparently intended for merge drivers, but nothing was using it.
I want to move some locking logic into rust, and this stuff was complicating things.
Note that this functionality was also removed upstream in https://phab.mercurial-scm.org/D9053.
Reviewed By: quark-zju
Differential Revision: D31184339
fbshipit-source-id: 92908220d48e2bc55e2f4fca90e647650ca5bef7
Summary:
While debugging the unlinked inode unloading for NFS I have re-added these
logs a couple times. These seem valuable to have in eden so that we don't have
to add them any time we are debugging eden and we can debug a bit in a
production eden rather than dev built eden.
Reviewed By: xavierd
Differential Revision: D30971151
fbshipit-source-id: 58172079dfe4f4e4ba31bae30bf982e2cbe0fd29
Summary:
We run periodic inode unloading for unlinked inodes on NFS because we get no
information from the client on when inodes are no longer needed, and we have to
clean them up at some point for memory and disk reasons. See previous commit
summaries for more details on this (D30144901 (ffa558bf84)).
Let's add some counters on this so we have a bit more visibility into the
process. This counter is meant to mimic the PeriodicUnloadCounter counter.
Reviewed By: chadaustin
Differential Revision: D30966688
fbshipit-source-id: cfc8d769b53073d9f4c0c27b6bee20e222c6c8d2
Summary:
I believe this is the reason for -
https://fb.workplace.com/groups/238845853462687/posts/845939069420026. We used
default config that doesn't do any chunking and stores large files as single
blobs.
Let's not do that.
Reviewed By: farnz
Differential Revision: D31209331
fbshipit-source-id: 43c2d2ab7caac110a1474856da09c119a5e72429
Summary:
EdenApiUploads: eliminate extra lookup if no stacks
In EdenApiUploads we filter heads first and then we filter the commits belonging to these stacks.
However, in some use cases users don't use stacks. If there isn't a single stack, the second lookup is redundant and it would be nice to avoid it completely.
We can pass a flag to the upload code saying that the extra filtering is not needed.
For example, in the configerator repo users usually don't create stacks.
Reviewed By: markbt
Differential Revision: D31203489
fbshipit-source-id: 0921a01198bfc377afc3af3f7319fd0c5fec04d7
Summary: Plus a minor refactoring to use the io::IsTty trait in edenfs_client::status instead of calling into libc directly.
Reviewed By: quark-zju
Differential Revision: D31156633
fbshipit-source-id: 218f06a4e64836be88b4afac98dcfa140373c730
Summary:
There is no need to read from the LocalStore twice, the tree is either present
in it, or not.
Reviewed By: chadaustin
Differential Revision: D31187972
fbshipit-source-id: 15bdeef9176b51e6ba3f62ed16550032b0024b94
Summary:
Some of EdenFS's backing stores require EdenFS to cache objects locally to avoid
potentially expensive network fetches, while others already have some form of
local caching. In the past, all backing stores fell in the first category, but
thanks to Mercurial's native backing store implementation the LocalStore
caching has become pure overhead for it. Previously, this was worked around by
configuring the LocalStore to not cache blobs locally, but this wasn't done for
trees. This config also conflicts with the need to cache blobs and trees
locally for backing stores in the first category (such as ReCas).
Since we know at construction time which backing stores need local caching, we
can simply wrap these in the newly introduced LocalStoreCachedBackingStore
store.
For now, since the Mercurial backing store always writes a proxy hash to the
LocalStore, bypassing the LocalStore for trees would be a regression due to the
added disk IO. Once proxy hashes are gone for Mercurial, we can remove the
LocalStoreCachedBackingStore wrapper.
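The wrapping approach described above is essentially a caching decorator around a backing store. The sketch below shows the shape of that pattern in Python; the real code is C++, and the `get_tree` method, the dict standing in for the LocalStore, and `NetworkBackingStore` are all assumptions made for illustration.

```python
class NetworkBackingStore:
    """Hypothetical backing store whose fetches are expensive."""

    def __init__(self, objects):
        self._objects = objects
        self.fetches = 0

    def get_tree(self, object_id):
        self.fetches += 1  # stands in for a network round trip
        return self._objects[object_id]


class LocalStoreCachedBackingStore:
    """Wraps a backing store and caches its results in a local store."""

    def __init__(self, backing_store, local_store):
        self._backing_store = backing_store
        self._local_store = local_store  # a dict standing in for LocalStore

    def get_tree(self, object_id):
        cached = self._local_store.get(object_id)
        if cached is not None:
            return cached
        tree = self._backing_store.get_tree(object_id)
        self._local_store[object_id] = tree
        return tree


remote = NetworkBackingStore({"tree-1": b"tree contents"})
store = LocalStoreCachedBackingStore(remote, {})
store.get_tree("tree-1")
store.get_tree("tree-1")
print(remote.fetches)  # the second call is served from the local store
```

Backing stores that already cache on their own would simply be used unwrapped, which is the decision the summary says can be made at construction time.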
Reviewed By: chadaustin
Differential Revision: D31118905
fbshipit-source-id: 4a2958eafeeb8144ee4421ec44dbd30cedceee29
Summary: Same as D30974102 (91c4748c5b) but for mercurial cs.
Reviewed By: StanislavGlebik
Differential Revision: D31145642
fbshipit-source-id: c1be7b49bf0cbe70b844f1a31de706215a51d1ae
Summary: Same as D30974102 (91c4748c5b) but for fastlog.
Reviewed By: ahornby
Differential Revision: D31142066
fbshipit-source-id: 44a79e8a9db180736324db734b018344a77c070a
Summary:
Same as D30974102 (91c4748c5b) but for deleted manifest.
Needed some changes regarding using `DerivationContext` instead of `BlobRepo`.
Reviewed By: StanislavGlebik
Differential Revision: D31121260
fbshipit-source-id: f37daac320173b0896f12c83bdd8a723d22ec876
Summary:
Same as D30974102 (91c4748c5b) but for fsnodes.
Needed some changes regarding using `DerivationContext` instead of `BlobRepo`.
Reviewed By: StanislavGlebik
Differential Revision: D31113044
fbshipit-source-id: 6e996135f59f26e76e52b0b24ea61917216d1e53
Summary:
Same as D30974102 (91c4748c5b) but for skeleton manifest.
Needed some changes regarding using `DerivationContext` instead of `DerivedDataManager`.
Reviewed By: StanislavGlebik
Differential Revision: D31111484
fbshipit-source-id: eacc1d3247dffac4537745ec2a2071ef0abcbd43
Summary:
Same as D30974102 (91c4748c5b) but for changeset info.
This turned out quite simple, as we already have the bonsai changeset, so there's no need to do any async stuff.
Reviewed By: StanislavGlebik
Differential Revision: D31110319
fbshipit-source-id: 952686ae5583b858361b7a2a67fe914bfe5239d6
Summary: Now that EdenAPI fetching is turned on everywhere, let's make it the default.
Reviewed By: quark-zju
Differential Revision: D31184213
fbshipit-source-id: 450c1167d42ee867b505a2a14b0c636bed81107d
Summary:
It can be surprising that, when a job is suddenly no longer able to run sudo or
no longer runs as root, all the tests are marked as successful despite
the fact that they no longer run. Let's recognize when we run on EdenFS to
allow tests to fail if they can no longer run EdenFS.
Reviewed By: zhengchaol
Differential Revision: D30357402
fbshipit-source-id: c3758d7a5a3c575dd68bd97062ae24abe4124874
Summary:
Now that we might have multiple kernel protocols per mount (i.e. both fuse and
nfs on macOS) let's include them in eden rage.
Reviewed By: xavierd
Differential Revision: D31154042
fbshipit-source-id: 38e7630829d70fe9dd6dbeabacc3b538ee798e0d
Summary: dyn Drop produces warning because everything is Drop
Reviewed By: quark-zju
Differential Revision: D31175376
fbshipit-source-id: 78f55a60c9bb6d51cde9433ab2815ec133b15ecc
Summary:
We might have a somewhat weird case - a file was replaced with a directory and
then in the next commit the same file was deleted again (even though this file doesn't exist
anymore). In that case we need to make sure these two commits are in two
different stacks of commits, however previously we weren't doing that. This
diff fixes it.
Reviewed By: markbt
Differential Revision: D31168174
fbshipit-source-id: 4b9986e615ec98b6452ff81b113124d14f236382
Summary:
This works more reliably and fully restores `test-commitcloud-sync-race.t` to
pre-D28595292 (c72cd2333f) state.
Reviewed By: markbt
Differential Revision: D30974286
fbshipit-source-id: 729e20f23cb5d8aacdbcef1c869fc9a73ac4d4d4
Summary:
Change the state of visibleheads from the in-memory Python `_heads` variable
to the `svfs.metalog["visibleheads"]`. This changes a few things:
- No need to manually invalidate or reload the `_heads` state on transaction
close/reload, since metalog gets reloaded on transaction boundaries.
- No need to use features from the old transaction framework, such as
`addfilegenerator`, keeping `journal.visibleheads` for transaction
rollback. No need to track `dirty`.
This probably solves issues where `hg pull` hides visible heads unexpectedly.
See P458576970 for example reported by chadaustin where pull runs right after
a cloud sync and hides 5a8c51b193 unexpectedly, but metalog parent shows that
pull got the state after the cloud sync. See also
https://fb.workplace.com/groups/scm/posts/4114332378616351/ for a similar
report from dtolnay.
In theory other states (bookmarks, remotenames) might have similar issues.
But this diff only focuses on visibleheads.
Reviewed By: markbt
Differential Revision: D30974289
fbshipit-source-id: 85d81fd2e2d85ed22ac144f2cb663eb0423955fb
Summary:
`test-cross-repo-commit-validator.t` seems to take longer to run with the next
change. Bump the timeout to make it pass.
Reviewed By: markbt
Differential Revision: D31148285
fbshipit-source-id: 2c815d988b323eb08cf06256ee666130eeebf9a6
Summary:
It tests the Python pushrebase server logic which is no longer relevant.
The next change breaks it and it seems easier deleting the test.
Reviewed By: markbt
Differential Revision: D31121918
fbshipit-source-id: ee5619b35ad4aa16f0227e563ed531e879d1c8d7
Summary:
Store unchanged `self.heads` in a local variable. This avoids some overhead
if `self.heads` is going to be a bit more expensive.
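A minimal sketch of the pattern, assuming `heads` becomes a property that does real work on each access. The `Visibility` class, `count_present` method and `loads` counter are hypothetical and exist only to make the cost visible.

```python
class Visibility:
    """Hypothetical class whose `heads` property is not free to read."""

    def __init__(self, heads):
        self._heads = list(heads)
        self.loads = 0

    @property
    def heads(self):
        self.loads += 1  # stands in for a more expensive load
        return list(self._heads)

    def count_present(self, nodes):
        # Read the property once into a local variable instead of
        # evaluating `self.heads` inside the loop.
        heads = set(self.heads)
        return sum(1 for node in nodes if node in heads)


visibility = Visibility(["a", "b"])
visibility.count_present(["a", "c", "b", "a"])
print(visibility.loads)  # the property was evaluated only once
```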
Reviewed By: markbt
Differential Revision: D30974287
fbshipit-source-id: baaffb8f41da4b57e4ac94c305e5ad490a3c3135
Summary:
This is subtle. But fbhistedit (providing `exec` support) depends on it
invalidating everything related to the repo to trigger state reloading after
executing a command (which could be `hg`).
Reviewed By: markbt
Differential Revision: D30974284
fbshipit-source-id: b033d81565dcf61104e4d30ecd7d48c33d6d79a4
Summary:
In a future change we'll require `svfs.metalog`. Let's move metalog fix to
before other stuff and attach fixed metalog to `svfs`.
Reviewed By: markbt
Differential Revision: D30974285
fbshipit-source-id: 3be89d1f1cda3d29dd5746940959ee47c1dd674d
Summary:
This allows doctor to construct changelog without requiring valid visibleheads
data. doctor cannot fix visibleheads first, because fixing visibleheads requires
changelog.
Reviewed By: markbt
Differential Revision: D30974288
fbshipit-source-id: 5bcf0f1918809fc0c7db3c89c70e0d17f961dc2c
Summary:
To make metalog replace more features supported by the transaction framework,
there is a need to expose pending metalog states to sub-processes. This diff
makes it so.
Reviewed By: markbt
Differential Revision: D30970502
fbshipit-source-id: 84192a14f4cef0765e4e361b61ab630311fd2dff
Summary: .drain() retains the drained container and its heap allocation for reuse, but as we're not reusing the container, moving the contents into_iter() makes the intent clearer
Reviewed By: StanislavGlebik
Differential Revision: D31149817
fbshipit-source-id: 07cc8b7cabc9b1d522daee8b13cfa6eeb96e2d30
Summary: .drain() retains the drained container and its heap allocation for reuse, but as we're not reusing the container, moving the contents into_iter() by for..in makes the intent clearer.
Reviewed By: StanislavGlebik
Differential Revision: D31149816
fbshipit-source-id: 63c7bba8a457e62a37944aecd8ec8c42dac8deaa
Summary:
No need to keep the Bytes live after compact_protocol::deserialize, can move them into it instead.
Makes it clearer bytes aren't reused, and should have some small effect on peak memory usage during deserialization by freeing the Bytes buffer earlier
Reviewed By: StanislavGlebik
Differential Revision: D31149815
fbshipit-source-id: 858914d2d8e3d91b5e863053dfeeb5d5ec37b9eb
Summary:
optimization for edenapi upload
Lookups for filenodes and trees can be done in parallel. Usually we have a small number of trees and a small number of filenodes to check, so it is better to send them in a single lookup request, where they can all be checked in parallel. The parallelism limit for a lookup request is a few thousand, so if we merge the requests here, they will almost always be parallelised.
Reviewed By: yancouto
Differential Revision: D31127401
fbshipit-source-id: 8014b27a2ba9d082babe2e0cd7bebf43c8b46082
Summary:
add scuba metrics for stages of EdenApi Uploads
add cloud sync reason for manual run
This is an effort to improve our Eden Api Uploads metrics and Commit Cloud metrics, so we can analyse and improve its performance.
Reviewed By: markbt
Differential Revision: D31109948
fbshipit-source-id: ee5a449e2652ea1798997ae2c52c4672f55e3eae
Summary:
I recently added this feature but it had a bug - when DontAggregate mode was
used it compared file changes of a new commit with the previous commit only
instead of all changes in the stack.
Since FileAggregation is broken let's remove it and collect both file changes
for the whole stack and for a given commit
Reviewed By: mitrandir77
Differential Revision: D31145055
fbshipit-source-id: 99dbedb919fb9edbdfaeaa658d49a08d008bd282
Summary:
The `ObjectFetchContext::Origin::FromBackingStore` is widely interpreted as
meaning that a network fetch was performed, but for some backing stores, this
isn't true. The Mercurial backing store for instance can either read data from
its on-disk cache, or from the network. Since both have very different
characteristics we shouldn't bundle them in the same enum value.
Since the backing store knows how data was obtained, let's have the backing
store return how it was obtained to enable the ObjectStore to properly record
this information. The `FromBackingStore` is also renamed to make it clearer
what its purpose is.
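One way to picture the change: the backing store returns the object together with where it came from, and the caller records that origin. The sketch below is Python with hypothetical names throughout (`Origin`, `GetBlobResult`, `HgLikeBackingStore`, and both enum values); the actual C++ enum values are not given in this summary.

```python
from enum import Enum, auto
from typing import NamedTuple


class Origin(Enum):
    # Hypothetical value names; the real renamed enum values are not shown here.
    FROM_DISK_CACHE = auto()
    FROM_NETWORK_FETCH = auto()


class GetBlobResult(NamedTuple):
    blob: bytes
    origin: Origin


class HgLikeBackingStore:
    """Hypothetical store that reads from an on-disk cache or the network."""

    def __init__(self, disk_cache, remote):
        self._disk_cache = disk_cache  # dict standing in for the on-disk cache
        self._remote = remote  # dict standing in for the server

    def get_blob(self, object_id):
        blob = self._disk_cache.get(object_id)
        if blob is not None:
            return GetBlobResult(blob, Origin.FROM_DISK_CACHE)
        blob = self._remote[object_id]
        self._disk_cache[object_id] = blob
        return GetBlobResult(blob, Origin.FROM_NETWORK_FETCH)


store = HgLikeBackingStore({}, {"blob-1": b"data"})
print(store.get_blob("blob-1").origin)  # first read goes to the network
print(store.get_blob("blob-1").origin)  # second read hits the disk cache
```

Because the store itself reports the origin, the caller no longer has to guess that "came from the backing store" means "came from the network".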
Reviewed By: zhengchaol
Differential Revision: D31118906
fbshipit-source-id: ee42a0c9d221f870742de07c0df7c732bc79d880
Summary:
We are passing some bytes into Popen and shlex.quote. shlex.quote expects a
string, not bytes. fsencode gives us bytes; fsdecode gives us a string. Let's use
fsdecode instead.
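As a small illustration of the fix, the snippet below decodes a bytes path with `os.fsdecode` before passing it to `shlex.quote`. The path and the `ls` command are made-up examples, not the values used in the actual code.

```python
import os
import shlex

# Hypothetical bytes path, for example one read from a config file or a pipe.
raw_path = b"/tmp/my repo"

# os.fsdecode turns bytes into str using the filesystem encoding;
# os.fsencode goes the other way (str to bytes).
path = os.fsdecode(raw_path)

# shlex.quote works on str, so it must receive the decoded path.
command = "ls " + shlex.quote(path)
print(command)  # prints: ls '/tmp/my repo'
```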
Reviewed By: zhengchaol
Differential Revision: D31129335
fbshipit-source-id: 7792bdcd4dd833a4946daf8ec75576cfe4fc24af
Summary:
The derived data manager no longer allows deriving a batch of commits unless all
of their ancestors have already been derived (and doing this check is a good idea).
But this started to break the benchmark when the --batch-size, --backfill and
--parallel options are set: at the very beginning of the function we mark all
commits as not derived, so when we start deriving the second batch the first
batch is assumed to not be derived, and this triggers the derived data manager
check.
Let's instead mark only commits that we are about to derive as not derived, and
clear this check once we are done.
Reviewed By: mitrandir77
Differential Revision: D31140464
fbshipit-source-id: fc74d58dc3c4a3ad70e8e2527f7d6dfc8fde8a9c
Summary:
I'd like to reuse them in the next diff, so let's refactor it a bit.
Note - in D30837581 (315a8b311d) markbt suggested a good idea for refactoring
backfill_derived_data. I liked the idea, but when I tried to approach this
refactoring it turned out to be tricky to do so (FWIW, it might be easier to
rewrite everything from scratch). So for now I did the smallest possible
refactoring that's needed to add validation in the next diff, but this small
refactoring can probably be used for a larger refactoring later.
Reviewed By: mitrandir77
Differential Revision: D31115979
fbshipit-source-id: f0b4d70454186a023cd9e12cd645768af1b716e8
Summary:
I'd like to use it in the next diffs to add a way to validate that derived data
is the same after rederivation. A lot of the code in `benchmark.rs` is useful
for doing this validation, so let's rename `benchmark.rs` so that it's ok to
use it from two different subcommands.
Reviewed By: mitrandir77
Differential Revision: D31115981
fbshipit-source-id: 86439534d8e49a4022086cb27918b7bcd7befc5c