Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Reviewed By: yancouto
Differential Revision: D31465813
fbshipit-source-id: b921476f2d398271688ee96cc994474b616358f0
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Reviewed By: yancouto
Differential Revision: D31465818
fbshipit-source-id: 7885d7204d01a5ae7a0e835eeb9e51cedcc6281d
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Reviewed By: yancouto
Differential Revision: D31465815
fbshipit-source-id: 052a8a4487793a6a977e8ff27aa2bc5443175061
Summary:
This is not used in production and the removal unblocks the removal
of the `complete_trees` endpoint.
Reviewed By: yancouto
Differential Revision: D31465807
fbshipit-source-id: d2a8ff79fe4e6181adefa499bbd7125fa0c5c26b
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.
Reviewed By: yancouto
Differential Revision: D31465798
fbshipit-source-id: 8e184a881479e42fcda06d7258af55fab99ab1da
Summary:
`debugapi` was introduced to test edenapi endpoints as seen by `hg`. Use it to
test the real client-side edenapi logic. This will eventually allow us to drop
`read_res`, `make_req`, `FromJson`, and `ToJson` to reduce bloated code in
edenapi.
Reviewed By: yancouto
Differential Revision: D31465799
fbshipit-source-id: 5a1ae0565b9c7a23773644e775c5f1930ebc12ad
Summary: It was a temporary step that was never used in production. Delete the endpoint.
Reviewed By: yancouto
Differential Revision: D31465808
fbshipit-source-id: 90e77eab96bb75796a3b31a5907f137d743a7dfa
Summary: It was a temporary step that was never used in production. Delete the endpoint.
Reviewed By: yancouto
Differential Revision: D31465829
fbshipit-source-id: 635ee205589b0f4d15388ae2a2bb9ada51d77edd
Summary: This makes it a bit more efficient and more friendly in debugapi output.
Reviewed By: DurhamG
Differential Revision: D31465800
fbshipit-source-id: b741792eb43a7a57c90f75362dbf189739dbb844
Summary:
`debugapi` was introduced to test edenapi endpoints as seen by `hg`. Use it to
test the real client-side edenapi logic. This will eventually allow us to drop
`read_res`, `make_req`, `FromJson`, and `ToJson` to reduce bloated code in
edenapi.
Reviewed By: yancouto
Differential Revision: D31465830
fbshipit-source-id: 9da739a76ef6e5d49804b0cea2089fc1741d0b7c
Summary:
`debugapi` was introduced to test edenapi endpoints as seen by `hg`. Use it to
test the real client-side edenapi logic. This will eventually allow us to drop
`read_res`, `make_req`, `FromJson`, and `ToJson` to reduce bloated code in
edenapi.
Reviewed By: yancouto
Differential Revision: D31465814
fbshipit-source-id: 5d090e0a92c6374b15a36f80dcc578ad280e50e2
Summary: To ease review, just doing the simple / hacky version of the TreeStore diff, like the FileStore one below this. This doesn't address the tree batch failure issue, but that's not blocking release so it's fine for now.
Reviewed By: andll
Differential Revision: D31593076
fbshipit-source-id: 0d3c420e50af0d8882ba171590597aac1b6c4c77
Summary:
Currently, the scmstore backingstore backend for EdenFs performs very poorly compared to ContentStore. This is mainly because a local-only FileStore is created on every fetch, and always flushes when dropped (even when not used).
With this change, we'll only flush when the parent FileStore is dropped.
This is a slightly hacky fix, vs. creating a new non-flush-on-drop type like `TreeStoreFetch` as we did for trees in the previous diff, but this should work fine for now and is radically smaller. Later I'd like to clean up both to eliminate the verbosity in the trees approach while still using separate types rather than a bool.
Reviewed By: andll
Differential Revision: D31591276
fbshipit-source-id: 11266c19ac68d87015719f5bcbd8f857e596bfdb
Summary: With edenapi and memcache reads both being written to disk as they're fetched, we should no longer need to break large prefetch batches into chunks.
Reviewed By: DurhamG
Differential Revision: D31552341
fbshipit-source-id: 2d9e10db669754cc8124228252cf832c8102f220
Summary: Like with the previous EdenApi change, writing files fetched from memcache to disk as we read them will prevent us from accumulating large amounts of file content in memory and potentially causing an OOM.
Reviewed By: DurhamG
Differential Revision: D31552190
fbshipit-source-id: 1eceb7570918575382f067ff9ad1e08e0623f335
Summary: Introduce a new method, `evict_to_cache`, which writes a memory-backed LazyFile to disk and returns a mmap-backed LazyFile instead.
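The mechanics can be sketched in Python (illustrative only; the real `evict_to_cache` is part of the Rust scmstore code, and the function shape below is an assumption, not the actual API):

```python
import mmap
import os
import tempfile

def evict_to_cache(data, cache_dir):
    # Write the in-memory blob to a file in the cache directory.
    fd, path = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    # Re-open and map the on-disk copy read-only; the heap copy can now
    # be dropped. Python's mmap duplicates the fd, so closing the file
    # object afterwards is safe.
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```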
Reviewed By: DurhamG
Differential Revision: D31551177
fbshipit-source-id: 86901628c56151ac805a57885379241c42f794c0
Summary:
Use the new async `files_attrs` method to concurrently fetch files from EdenApi, write them to disk, and read them back as a mmap-ed `LazyFile`.
Also remove the scmstore prefetch chunking that the previous approach required.
Previously, we fetched all the requested files from EdenApi into memory as a batch, then wrote them to disk afterward. This caused OOMs for extremely large fetches, and required chunking scmstore prefetches, which had a negative performance impact.
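A minimal Python sketch of the write-as-you-fetch idea (the `fetch_one` callback and the function name are hypothetical stand-ins for the EdenApi plumbing, not the actual Rust implementation):

```python
import os
import tempfile

def fetch_and_spill(keys, fetch_one, cache_dir):
    # Each file is written to disk as soon as it arrives, so at most one
    # file's content is held in memory at a time -- unlike the old
    # approach, which accumulated the whole batch before writing.
    for key in keys:
        content = fetch_one(key)
        fd, path = tempfile.mkstemp(dir=cache_dir)
        with os.fdopen(fd, "wb") as f:
            f.write(content)
        # Hand back the on-disk location; callers can mmap it lazily.
        yield key, path
```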
Reviewed By: DurhamG
Differential Revision: D31445678
fbshipit-source-id: 9a2e1476fb8ddfcd546a5e0b501cc91cc2a97303
Summary: Add a basic async `files_attrs` method to `EdenApiFileStore` for use by scmstore for concurrent fetching / writing to disk.
Reviewed By: DurhamG
Differential Revision: D31483490
fbshipit-source-id: 20233767541d60ccdfcd97bdc98b9bbb98d8700e
Summary:
Currently, scmstore does not write LFS pointers discovered outside the LfsStore before fetching blobs from the LFS remote and attempting to read them back from disk. This causes scmstore to rely on the contentstore fallback to write LFS pointers, after which future reads will succeed in scmstore.
With this change, we track which LFS pointers were discovered from a source other than an LfsStore, and write those to the LfsStore before fetching from the LFS remote. This eliminates the debugimporterhelper fallback in the backingstore implementation, and should improve the performance of the shim in general for LFS files where the LFS pointer is not available locally.
In the future, I'll probably perform these writes inline with the fetches that discovered the pointers for added concurrency, and reorganize the LFS fetching code in general to avoid some of the weird bookkeeping we currently do around local vs. shared for LFS.
Reviewed By: andll
Differential Revision: D31280512
fbshipit-source-id: 2e2a32db3fc9fab8ac179b7852da36f77c66ef31
Summary: We run the eden endpoints from the Mononoke server, not via this binary any more, so remove it.
Reviewed By: StanislavGlebik
Differential Revision: D31501592
fbshipit-source-id: 0626fe43f0f1ce4a6c7165a734eb487225fa65b6
Summary: The derived data manager doesn't need the full BlobRepo struct, so let's create a facet container which includes only the parts needed for derivation using the manager. I'll use it later with the 2DS service.
Reviewed By: StanislavGlebik
Differential Revision: D31540177
fbshipit-source-id: 42a3e96352689f39730b8f62142cccd7a1bb7256
Summary: I'm going to use that API from the 2DS service, so the discovery of underived commits will not be triggered.
Reviewed By: yancouto
Differential Revision: D31540178
fbshipit-source-id: cdf91974b90bc07e09cd3f465f9a1f8b8271a8e4
Summary:
Background: I've been looking into derived data performance and found that
while overall performance is good, it depends quite a lot on the blobstore
latency i.e. the higher the latency the slower the derivation. What's worse is
that increasing blobstore latency even by 100ms might increase time of
derivation of 100 commits from 12 to 65 secs! [1]
However we have ways to mitigate it:
* **Option 1** If we use "backfill" mode then it makes derived data derivation less
sensitive to the put() latency
* **Option 2** If we use "parallel" mode then it makes derived data derivation less
sensitive to the get() latency.
We can use "backfill" mode for almost all derived data types (the only
exception is filenodes), however "parallel" mode is only enabled for a few
derived data types (e.g. fsnodes, skeleton manifests, filenodes).
In particular, we didn't have a way to do batch derived data derivation for
unodes, and so unodes derivation might get quite sensitive to the blobstore
get() latency. So this diff tries to address that.
I considered three options:
* **Option 1** The simplest option of implementing "parallel" mode for unodes is to just
do a unode warmup before we start a sequential derivation for a stack of commits. After the
warmup all necessary entries should be in cache, so derivation should be less latency sensitive.
This could work, but it has a few disadvantages, namely:
* We do additional traversal - not the end of the world, but it might get
expensive for large commits
* We might fetch large directories that don't fit in cache more often than we
need to.
That said, because of its simplicity it might be a reasonable option to keep
in mind, and I might get back to it later.
* **Option 2** Do a derivation for a stack of commits. We have a function to derive a
manifest for a single commit, but we could write a similar function to derive the whole stack at once.
That means for each changed file or directory we generate not a single change
but a stack of changes.
I was able to implement it, but the code was too complicated. There were quite
a few corner cases (particularly when a file was replaced with a directory, or
when deriving a merge commit), and dealing with all of them was a pain.
Moreover, we need to make sure it works correctly in all scenarios, and that
wouldn't be an easy thing to do.
* **Option 3** Do a derivation for a "simple" stack of commits. That's basically a
simplified version of option #2. Let's allow batch derivation only for
stacks that have no
a) merges
b) path changes that are ancestors of each other (which cause file/dir
conflicts).
This implementation is significantly simpler than option #2, and it should
cover most of the cases and hopefully bring perf benefits (though this is
something I have yet to measure). So this is what this diff implements.
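The "simple stack" conditions can be sketched as a small check (illustrative Python; the `(parents, changed_paths)` representation is an assumption, not the Mononoke data model):

```python
def is_simple_stack(stack):
    # `stack` is a hypothetical list of (parents, changed_paths) pairs,
    # one per commit, oldest first.
    changed = set()
    for parents, paths in stack:
        if len(parents) > 1:  # rule (a): no merge commits
            return False
        changed.update(paths)
    # Rule (b): no changed path may be an ancestor directory of another
    # changed path, since that can create file/dir conflicts in a batch.
    for a in changed:
        for b in changed:
            if a != b and b.startswith(a + "/"):
                return False
    return True
```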
Reviewed By: yancouto
Differential Revision: D30989888
fbshipit-source-id: 2c50dfa98300a94a566deac35de477f18706aca7
Summary: I'd like to reuse it in the next diff, so let's move it to a separate function
Reviewed By: yancouto
Differential Revision: D31603911
fbshipit-source-id: 69ec0553022aabe662f75a50321423c54aafd196
Summary:
We don't really need it, and InnerRepo is more expensive to create, so let's
not do that.
Reviewed By: yancouto
Differential Revision: D31602266
fbshipit-source-id: 269bf50f8b4ee99e22888d91af2ac392078d32fa
Summary:
We were `hg adding` all files that were modified, however, we only really need to add files that didn't exist before the snapshot changes.
This gets rid of the annoying "already tracked!" messages.
Reviewed By: markbt
Differential Revision: D31438120
fbshipit-source-id: ca3545bc5881dcf01283abfa5ec9eca6309ff607
Summary: Updated `library.sh` timeout handling to be clearer by checking the time change directly, and to use the `health_check` endpoint.
Reviewed By: yancouto
Differential Revision: D31536407
fbshipit-source-id: 43cdf4260c10dfd4d3097dcd92d071c3d18b18f8
Summary:
We don't actually use bundle2 to deliver manifests anymore. Everything
is done via ondemand fetching, which won't have the problem covered in this
test. So let's delete it.
Reviewed By: quark-zju
Differential Revision: D31309313
fbshipit-source-id: 312508fa1b5e903314b92c048d23525c2194ab91
Summary:
This is part of removing filepeer. I also enabled treemanifest and
modernclient (i.e. lazy changelog) on a few tests.
Reviewed By: quark-zju
Differential Revision: D31032055
fbshipit-source-id: 6822274ad07303ed86b2ee5dd4e09979f1e215d5
Summary:
This test covers 'hg clone -r REV' which isn't really a supported clone
case since we now depend on cloning particular remote bookmarks. The test is
also heavily dependent on revision numbers. So let's just delete it.
This is needed as part of removing the tests dependency on hg server logic like
filepeer.
Reviewed By: quark-zju
Differential Revision: D31309312
fbshipit-source-id: e4620186e3eda3114686de36d06710747439ae18
Summary:
Ports test-bundle.t to use modernclient configs, meaning it no longer
uses the filepeer and it uses treemanifest and lazychangelog. I deleted the
parts of the test that depend on mounting a bundle file on top of a repo, since
that logic is unused in production.
Reviewed By: quark-zju
Differential Revision: D31309310
fbshipit-source-id: a535ed9a21253fd258f70088e7436480957afb2a
Summary:
We need to pass the file content blob with rename header to eagerepo,
not the one without the header. This resulted in a SHA1 mismatch assertion when
trying to make test-bundle.t use eagerepo.
Reviewed By: quark-zju
Differential Revision: D31309314
fbshipit-source-id: afaf3af3423b3f3006c1a95ddbf0da20056d9581
Summary: This is a step towards moving test-bundle.t to modernclient configs.
Reviewed By: quark-zju
Differential Revision: D31309318
fbshipit-source-id: 9680e85031797088d624a6c85e2d7316b108818e
Summary:
This test depends on a precomputed, legacy-formatted, checked-in bundle
file. Bundle files are on their way to being replaced with commit cloud, and we
don't use a lot of the features in this test anyway (branches, merges) so let's
delete this test.
Reviewed By: quark-zju
Differential Revision: D31309316
fbshipit-source-id: b5729bed33a8c84fa75792528630d99dbc1996be
Summary:
Add a way to test EdenAPI endpoints.
The goal is to deduplicate with `make_req` and `read_res` tools to make it easier to
write new edenapi endpoints (no need to update `make_req` or `read_res`).
Reviewed By: DurhamG, yancouto
Differential Revision: D31465801
fbshipit-source-id: 5127941d0820ce737a4958a1d124f420acbaf771
Summary: Add a binding so Python code can use the Rust pprint implementation.
Reviewed By: DurhamG
Differential Revision: D31465826
fbshipit-source-id: b8f49f0d0587f82fae5906577acda95a09953e69
Summary:
I found that I need to "print" a value that might contain bytes and SHA1
hashes. JSON cannot do this job well because it does not support bytes.
Rust debug print can be too verbose.
Originally I tried:
- Python json: Not easily extendable to support binary data.
- Python repr: Not pretty.
- Python pprint: Lots of complexity on line wrapping, not easy to extend.
I ended up with an ad-hoc version of pprint (in D31465801):
  # Similar to pprint, but much simpler (no textwrap) and rewrites
  # binary nodes to hex(hexnode).
  def format(o, indent, sort=sort):
      if isinstance(o, bytes) and len(o) == 20:
          yield 'bin("%s")' % hex(o)
      elif isinstance(o, (bytes, str, bool, int)) or o is None:
          yield repr(o)
      elif isinstance(o, list):
          yield "["
          if sort:
              o = sorted(o, key=lambda o: repr(o))
          yield from formatitems(o, indent + 1)
          yield "]"
      elif isinstance(o, tuple):
          yield "("
          yield from formatitems(o, indent + 1)
          yield ")"
      elif isinstance(o, dict):
          yield "{"

          def fmt(kv, indent=indent + 1):
              k, v = kv
              kfmt = "".join(format(k, indent + 1)) + ": "
              yield kfmt
              yield from format(v, indent + len(kfmt))

          items = sorted(o.items())
          yield from formatitems(items, indent + 1, fmt)
          yield "}"
      else:
          yield "?"

  def formatitems(items, indent, fmt=None):
      if fmt is None:
          fmt = lambda o: format(o, indent)
      total = len(items)
      for i, o in enumerate(items):
          if i > 0:
              yield "\n"
              yield " " * indent
          yield from fmt(o)
          if i + 1 < total:
              yield ","
Later I found I need this feature in Rust too (D31465805 and D31465802).
So I translated the above Python code to Rust.
The Python syntax means the printed content can be copy-pasted to
Python files to form some quick scripts. Python code can also use
`eval` or `ast.literal_eval` (for safety, but no `bin` support) to
deserialize pprint output.
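For example, output that contains no binary nodes (and therefore no `bin(...)` calls) is valid Python literal syntax and can be read back directly; this is a generic illustration, not taken from the diff:

```python
import ast

# A pprint-style rendering of a small value, with the line wrapping and
# indentation the formatter would produce:
printed = """{'name': 'main',
 'parents': [('abc1', 1),
             ('abc2', 2)]}"""

# ast.literal_eval safely parses literals without executing code.
value = ast.literal_eval(printed)
assert value["name"] == "main"
assert value["parents"][1] == ("abc2", 2)
```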
Reviewed By: DurhamG
Differential Revision: D31465797
fbshipit-source-id: ef4d17df84590075f74a0298ac89f4a963d8ed3c
Summary:
A serde deserializer is usually more permissive than restrictive. Make it a bit
more permissive in `deserialize_bytes` when we see a string.
Practically this means the Python bindings are a bit more permissive. However,
the Rust interfaces are still strongly typed, so I think it is okay.
Reviewed By: DurhamG
Differential Revision: D31465810
fbshipit-source-id: a339b662f00a16953ce849ed8c2d65a1c3365081
Summary:
Use serde to decode HgId from Python. This allows us to get pure Rust types
earlier and get flexible hash support (binary or hex) from `HgId` for free.
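The "binary or hex" flexibility can be illustrated with a hypothetical helper (the real logic lives in `HgId`'s serde implementation in Rust; the function below is only a sketch of the idea):

```python
import binascii

def parse_hgid(value):
    # Accept either a 20-byte binary node or its 40-character hex form
    # and normalize to the 20-byte binary representation.
    if isinstance(value, bytes) and len(value) == 20:
        return value
    if isinstance(value, str) and len(value) == 40:
        return binascii.unhexlify(value)
    raise ValueError("expected a 20-byte binary or 40-char hex node")
```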
Reviewed By: ahornby
Differential Revision: D31465821
fbshipit-source-id: 18c459d5fab6d508d0ec37f136bd45a7baf1e473
Summary:
See previous diff for context. This makes the endpoint a bit easier to use.
Also add retry, since it's now easier with the Vec type.
Reviewed By: yancouto
Differential Revision: D31416217
fbshipit-source-id: 40de6d14cf5cd088cd69156758699706bc7b8b8b
Summary:
See previous diff for context. This makes the endpoint a bit easier to use.
Also add retry, since it's now easier with the Vec type.
Reviewed By: yancouto
Differential Revision: D31416215
fbshipit-source-id: 53c169d17b2e16c285b5ea6d5c90ce0ee5d0b280
Summary:
See D31407278 (b68ef5e195) for a similar change. Basically, for lightweight metadata,
`Vec<T>` is more convenient than `Response`. The prefix lookup endpoint
is lightweight. Let's switch to `Vec` for the `EdenApi` trait.
Reviewed By: yancouto
Differential Revision: D31416216
fbshipit-source-id: 260235b57ddedcd7be8accb263a0090330445f7f
Summary:
See D31407278 (b68ef5e195) for a similar change. Basically, for lightweight metadata,
`Vec<T>` is more convenient than `Response`. The commit graph endpoint
outputs commit hashes, which are lightweight. So let's switch to `Vec`
for the `EdenApi` trait.
Reviewed By: yancouto
Differential Revision: D31410099
fbshipit-source-id: 0966b9afb47264c34ebe88355dc6df669dfb803b
Summary:
Previously the `clone.force-edenapi-clonedata` config decided whether to use
segments clone. That can be error prone - the client will crash if the server
doesn't have the segments index.
Change it so we ask about the segments index and automatically use the lazy
changelog clone.
`clone.force-edenapi-clonedata` will no longer be necessary.
Differential Revision: D31358367
fbshipit-source-id: 9fa639d0349d00938c89cc091ea793f20dd714c8
Summary:
Expose the Rust capabilities endpoint to Python.
Note: I avoided some complexities here:
- Not implementing a `capabilities_py` function just for rustfmt purposes.
This avoids repeating the `capabilities` signature two extra times.
- Not supporting legacy progress callback at all.
Reviewed By: yancouto
Differential Revision: D31358368
fbshipit-source-id: 76d0d71e627adbc57ed853922c4f826f3edfccb4