68560 Commits

Author SHA1 Message Date
Jun Wu
89f1c0cb3b test-edenapi-server-scuba-logging: switch to debugapi
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.

Reviewed By: yancouto

Differential Revision: D31465806

fbshipit-source-id: 5ea30d7069b68bf6f905398d47e4c0babc5c61c4
2021-10-13 13:38:08 -07:00
Jun Wu
8ce85acd8e test-edenapi-upload-file: switch to debugapi
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.

Reviewed By: yancouto

Differential Revision: D31465833

fbshipit-source-id: 9c7ce84a67dbc2b7253f9e0bb63026a6aa1725d7
2021-10-13 13:38:08 -07:00
Jun Wu
9f1499437b minibytes: deserialize from more bytes types
Summary:
Accept owned bytes (`Vec<u8>`). This will make `Serde<Bytes>` work in Python
bindings.

Reviewed By: yancouto

Differential Revision: D31615717

fbshipit-source-id: ad200d359b9282fa84d1698afdde8241fc288905
2021-10-13 13:38:08 -07:00
Jun Wu
f741864c3e edenapi/types: use AbstractHashType for better serialization
Summary:
This removes duplicated implementation. The main goal is to make the hash types
use bytes instead of tuple serialization, and support hex deserialization.

Reviewed By: yancouto

Differential Revision: D31615718

fbshipit-source-id: 8e0039d4fad8a2720daeb2f276e6f5a795ea6482
2021-10-13 13:38:08 -07:00
Yan Soares Couto
2d6f3877d1 Improve CLI help text for snapshots
Summary: Slightly improves the user experience; some help text was missing.

Reviewed By: mzr

Differential Revision: D31609163

fbshipit-source-id: 059bbe1176b7abdd42804f20aa44f5cb974ae575
2021-10-13 13:34:00 -07:00
Jun Wu
fa0c51c885 types: migrate Sha256 to AbstractHashType
Summary:
See the previous diff for context.

Aside from error type changes, this changes the serialization format of Sha256
from a tuple of 32 u8 to a byte slice. It seems okay, because the only user
of the `Sha256` serialization on client-side is revisionstore's `FileAuxData`,
which does not serialize to disk (this was checked by dropping the Serialize
trait and seeing what breaks).

Reviewed By: yancouto

Differential Revision: D31615716

fbshipit-source-id: 3d31b5d356c7e5f6b229fa9eae71ba4cad1c0e1a
2021-10-13 13:27:31 -07:00
Jun Wu
afdbb5cfe8 types: migrate HgId to AbstractHashType
Summary:
See the previous diff for context.

The error types in from_slice and from_hex are changed. Related callsites are
updated.

Reviewed By: yancouto

Differential Revision: D31615720

fbshipit-source-id: ed127621d689f527b2e2eb24b0bef03870340e05
2021-10-13 13:27:31 -07:00
Jun Wu
f00af51977 types: add an abstract hash type
Summary:
There are a lot of hash types: HgId, Sha256, and edenapi-types has some.
The HgId seems to have most features but it is hard to reuse it for similar
but different types. This is an attempt to do so and unify the main implementation
of those types.

Most methods are copied from HgId. `from_byte_array` is made const fn so it
can be used to construct constants.

The error type is intentionally chosen to be not `anyhow::Error`.
Using statically typed errors is considered good practice for low-level crates.
The benefit is that higher level users get their choice - precise static error
type with compile-time checks, or convenient, dynamic error types by anyhow
with runtime downcasts.
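
As an illustration only (not the code from this diff), here is a minimal sketch of a fixed-size hash type with a `const fn` constructor and a statically typed error. `FixedHash`, `HashError`, `Sha1Like`, `NULL_ID`, and `to_hex` are hypothetical names invented for this sketch; only the method names `from_byte_array`, `from_slice`, and `from_hex` come from these commit messages, and their real signatures may differ.

```rust
use std::convert::TryFrom;
use std::fmt;

/// Hypothetical fixed-size hash wrapper; `N` is the length in bytes.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct FixedHash<const N: usize>([u8; N]);

/// Statically typed error, so callers can match on the exact failure.
#[derive(Debug, PartialEq, Eq)]
pub enum HashError {
    /// Input had the wrong number of bytes or hex digits.
    InvalidLength { expected: usize, actual: usize },
    /// Input contained a byte that is not an ASCII hex digit.
    InvalidHex(u8),
}

impl fmt::Display for HashError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            HashError::InvalidLength { expected, actual } => {
                write!(f, "expected length {}, got {}", expected, actual)
            }
            HashError::InvalidHex(c) => write!(f, "invalid hex digit: 0x{:02x}", c),
        }
    }
}

impl std::error::Error for HashError {}

/// Decode one ASCII hex digit into its 4-bit value.
fn nibble(c: u8) -> Result<u8, HashError> {
    match c {
        b'0'..=b'9' => Ok(c - b'0'),
        b'a'..=b'f' => Ok(c - b'a' + 10),
        b'A'..=b'F' => Ok(c - b'A' + 10),
        _ => Err(HashError::InvalidHex(c)),
    }
}

impl<const N: usize> FixedHash<N> {
    /// `const fn`, so it can be used to construct constants.
    pub const fn from_byte_array(bytes: [u8; N]) -> Self {
        FixedHash(bytes)
    }

    /// Build from a byte slice, failing if the length is not exactly `N`.
    pub fn from_slice(bytes: &[u8]) -> Result<Self, HashError> {
        <[u8; N]>::try_from(bytes)
            .map(FixedHash)
            .map_err(|_| HashError::InvalidLength { expected: N, actual: bytes.len() })
    }

    /// Build from `2 * N` ASCII hex digits.
    pub fn from_hex(hex: &[u8]) -> Result<Self, HashError> {
        if hex.len() != N * 2 {
            return Err(HashError::InvalidLength { expected: N * 2, actual: hex.len() });
        }
        let mut out = [0u8; N];
        for (i, pair) in hex.chunks_exact(2).enumerate() {
            out[i] = (nibble(pair[0])? << 4) | nibble(pair[1])?;
        }
        Ok(FixedHash(out))
    }

    /// Lowercase hex rendering (hypothetical helper for this sketch).
    pub fn to_hex(&self) -> String {
        self.0.iter().map(|b| format!("{:02x}", b)).collect()
    }
}

/// A 20-byte instance of the generic type, plus a constant built at compile time.
pub type Sha1Like = FixedHash<20>;
pub const NULL_ID: Sha1Like = Sha1Like::from_byte_array([0; 20]);
```

A higher-level caller can either match on `HashError` directly or, because it implements `std::error::Error`, convert it into a dynamic error type of its choice.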

Reviewed By: yancouto

Differential Revision: D31615719

fbshipit-source-id: 337356721354c43fe23b9f2d0e90d104c8864c44
2021-10-13 13:27:31 -07:00
Jun Wu
9fe9e2384b bookmarkstore: remove crate
Summary:
It's not actually used and is hard to integrate with hg now.

`bookmarkstore` was initially designed before metalog (which provides atomicity
for metadata changes) existed.

With metalog we track bookmarks using plain text strings; complex structures
like indexedlog cannot be used within metalog.

Reviewed By: yancouto

Differential Revision: D31615721

fbshipit-source-id: 75d6b9c9ba4475e86530e2368d30f53e58ac37d8
2021-10-13 13:27:30 -07:00
Jun Wu
71994edcae test-edenapi-server-ephemeral-prepare: switch to debugapi
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.

Reviewed By: yancouto

Differential Revision: D31465811

fbshipit-source-id: 2c0c0bab4adc4fb89fe9b46bc3f204c79efe17ee
2021-10-13 13:27:30 -07:00
Jun Wu
0f4c2a9787 test-edenapi-files: switch to debugapi
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.

Note some hashes are serialized (suboptimally) as an array of bytes, not as a
byte slice.  Ideally we need `serde_bytes` annotation somewhere but I haven't
checked if that breaks compatibility.

Reviewed By: yancouto

Differential Revision: D31465828

fbshipit-source-id: 7892ececc475bad530708499cf2255852d610bc2
2021-10-13 13:27:30 -07:00
Jun Wu
17fef63dbc io: make with_input provide mut Read
Summary: Without `mut`, the `Read` cannot actually read anything.
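
To illustrate why (a sketch, not the code from this diff): `std::io::Read::read` takes `&mut self`, so a callback that only receives a shared reference cannot call it. The `Input` type and `read_all` helper below are hypothetical stand-ins; only the method name `with_input` comes from the commit title, and its real signature may differ.

```rust
use std::io::{self, Read};

/// Hypothetical holder of an input stream, standing in for the real `io` type.
pub struct Input {
    inner: Box<dyn Read>,
}

impl Input {
    pub fn new(reader: impl Read + 'static) -> Self {
        Input { inner: Box::new(reader) }
    }

    /// Hands the callback a `&mut dyn Read`. With a shared `&dyn Read`,
    /// the callback could not call `read`, which requires `&mut self`.
    pub fn with_input<R>(&mut self, f: impl FnOnce(&mut dyn Read) -> R) -> R {
        f(&mut *self.inner)
    }
}

/// Example caller: drain the input into a `String`.
pub fn read_all(input: &mut Input) -> io::Result<String> {
    input.with_input(|r| -> io::Result<String> {
        let mut s = String::new();
        r.read_to_string(&mut s)?;
        Ok(s)
    })
}
```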

Reviewed By: yancouto

Differential Revision: D31465817

fbshipit-source-id: b069499ff0e8a371f0e27402baedfb25414e29a1
2021-10-13 13:27:30 -07:00
Jun Wu
0e8d76b982 pyedenapi: expose process_files_upload endpoint
Summary: They will be used by `debugapi`.

Reviewed By: yancouto

Differential Revision: D31615722

fbshipit-source-id: b3e53a8ac0ab8202905602b478c1ca9e4d712f64
2021-10-13 13:27:30 -07:00
Jun Wu
9addabe563 pyedenapi: expose files_attrs and ephemeral_prepare endpoints
Summary:
These are used by Mononoke tests. Expose them so `debugapi` can be used in
tests.

Reviewed By: yancouto

Differential Revision: D31465832

fbshipit-source-id: 4291023d5d4623c43065e1b1c90848f3bc15047f
2021-10-13 13:27:30 -07:00
Jun Wu
e5db11c8ed pyedenapi: use serde for tree attributes
Summary: Replace manual PyDict parsing with serde. This simplifies the code.

Reviewed By: yancouto

Differential Revision: D31465809

fbshipit-source-id: 935f5a12a23e525e12cbc914d61b0352a765761d
2021-10-13 13:27:30 -07:00
Jun Wu
5d9eb37bac test-edenapi-server-trees: switch to debugapi
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.

Note the client-side edenapi only provides prefix lookup. So it does not work
for an arbitrary range.

Reviewed By: yancouto

Differential Revision: D31465812

fbshipit-source-id: 9b81148258c45e7e534faee97e1b51c4dd75102d
2021-10-13 13:27:29 -07:00
Jun Wu
964b1e5fd8 test-edenapi-server-history: switch to debugapi
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.

Reviewed By: yancouto

Differential Revision: D31465827

fbshipit-source-id: 3e7a943efce6c670d817f57e10a1e1e191fe633f
2021-10-13 13:27:29 -07:00
Jun Wu
da87e19e97 pyedenapi: use Serde<HgId> for keys
Summary:
See previous diff for context. This provides pure Rust HgId more easily (without
needing `py`), and makes the acceptable format a bit more flexible, useful for
`debugapi`.

Unfortunately `Serde<Key>` or `Serde<Vec<(RepoPathBuf, HgId)>>` is not really
an option because `RepoPathBuf` is not declared `#[serde(transparent)]`,
which means paths have to be written as `(path,)`. Adding `#[serde(transparent)]`
to `RepoPathBuf` might be a breaking change, and I didn't try it.

Reviewed By: yancouto

Differential Revision: D31465819

fbshipit-source-id: cda210c2f5f6532256204abd428e1ad2b1de9fd9
2021-10-13 13:27:29 -07:00
Jun Wu
cc7e210401 pyedenapi: remove store side effects for files and history
Summary:
These APIs wrote directly to stores. Change them to return content like other
APIs so `debugapi` can print the result.

As we're here, drop legacy progress support.

Reviewed By: yancouto

Differential Revision: D31465820

fbshipit-source-id: 0c7a2e07f8fe56a89cc82a51ca6566d3ab6cc754
2021-10-13 13:27:29 -07:00
Jun Wu
a57939efd0 edenapi/types: add back serde impls on some types
Summary:
D31057140 (0cd8e64c51) removed some serde derives on structs, but we need them for easier
Python bindings. Let's add them back.

Reviewed By: yancouto

Differential Revision: D31465804

fbshipit-source-id: 94e8f0dbbbde4e52cb8b3153ea219fa70c91fca4
2021-10-13 13:27:29 -07:00
Jun Wu
3dbb12bdc2 test-edenapi-server-hash-lookup: switch to debugapi
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.

Note the client-side edenapi only provides prefix lookup. So it does not work
for an arbitrary range.

Reviewed By: yancouto

Differential Revision: D31465834

fbshipit-source-id: eb45922e109b7301beb9799e3ccb7905541de605
2021-10-13 13:27:29 -07:00
Jun Wu
f4b57e4891 test-edenapi-server-segmented-changelog-setup: switch to debugapi
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.

Reviewed By: yancouto

Differential Revision: D31465823

fbshipit-source-id: c2a89cbac62ae6d4aa80bde41c177bc4acd986fa
2021-10-13 13:27:29 -07:00
Jun Wu
6a41405729 test-edenapi-server-commit-revlog-data: switch to debugapi
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.

Reviewed By: yancouto

Differential Revision: D31465813

fbshipit-source-id: b921476f2d398271688ee96cc994474b616358f0
2021-10-13 13:27:28 -07:00
Jun Wu
740c46001c test-edenapi-server-commit-location-to-hash: switch to debugapi
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.

Reviewed By: yancouto

Differential Revision: D31465818

fbshipit-source-id: 7885d7204d01a5ae7a0e835eeb9e51cedcc6281d
2021-10-13 13:27:28 -07:00
Jun Wu
6446690788 test-edenapi-server-commit-graph: switch to debugapi
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.

Reviewed By: yancouto

Differential Revision: D31465815

fbshipit-source-id: 052a8a4487793a6a977e8ff27aa2bc5443175061
2021-10-13 13:27:28 -07:00
Jun Wu
09efa80252 edenapi: drop complete_trees support
Summary: It is no longer used.

Reviewed By: yancouto

Differential Revision: D31465831

fbshipit-source-id: 338ee0da080972cfd60caacef00b13d356940d3f
2021-10-13 13:27:28 -07:00
Jun Wu
cb57cff72e edenapi_service: drop complete_trees support
Summary: It is no longer used.

Reviewed By: yancouto

Differential Revision: D31465824

fbshipit-source-id: e12ee7c8e29ddb01b58e750be777936c33fdfc0a
2021-10-13 13:27:28 -07:00
Jun Wu
a834f656cc treemanifestserver: drop complete tree prefetching
Summary:
This is not used in production and the removal unblocks the removal
of the `complete_trees` endpoint.

Reviewed By: yancouto

Differential Revision: D31465807

fbshipit-source-id: d2a8ff79fe4e6181adefa499bbd7125fa0c5c26b
2021-10-13 13:27:28 -07:00
Jun Wu
8927b38f7b test-edenapi-server-bookmarks: switch to debugapi
Summary:
See previous diffs for context. This makes the test shorter and will eventually
allow us to remove some bloated code in edenapi.

Reviewed By: yancouto

Differential Revision: D31465798

fbshipit-source-id: 8e184a881479e42fcda06d7258af55fab99ab1da
2021-10-13 13:27:28 -07:00
Jun Wu
27f146ebfb test-edenapi-server-clone: switch to debugapi
Summary:
`debugapi` was introduced to test edenapi endpoints as seen by `hg`.  Use it to
test the real client-side edenapi logic. This will eventually allow us to drop
`read_res`, `make_req`, `FromJson`, and `ToJson` to reduce bloated code in
edenapi.

Reviewed By: yancouto

Differential Revision: D31465799

fbshipit-source-id: 5a1ae0565b9c7a23773644e775c5f1930ebc12ad
2021-10-13 13:27:27 -07:00
Jun Wu
aeadfc689e edenapi: drop full_idmap clone support
Summary: It was a temporary step that is never used in production. Delete the endpoint.

Reviewed By: yancouto

Differential Revision: D31465808

fbshipit-source-id: 90e77eab96bb75796a3b31a5907f137d743a7dfa
2021-10-13 13:27:27 -07:00
Jun Wu
d8a2b7c11c segmented_changelog: drop full_idmap clone support
Summary: It was a temporary step that is never used in production. Delete the endpoint.

Reviewed By: yancouto

Differential Revision: D31465829

fbshipit-source-id: 635ee205589b0f4d15388ae2a2bb9ada51d77edd
2021-10-13 13:27:27 -07:00
Jun Wu
e5ccd802d5 pyedenapi: use PyCell for clonedata
Summary: This makes it a bit more efficient and more friendly in debugapi output.

Reviewed By: DurhamG

Differential Revision: D31465800

fbshipit-source-id: b741792eb43a7a57c90f75362dbf189739dbb844
2021-10-13 13:27:27 -07:00
Jun Wu
f55d531734 test-edenapi-mismatched-heads: switch to debugapi
Summary:
`debugapi` was introduced to test edenapi endpoints as seen by `hg`.  Use it to
test the real client-side edenapi logic. This will eventually allow us to drop
`read_res`, `make_req`, `FromJson`, and `ToJson` to reduce bloated code in
edenapi.

Reviewed By: yancouto

Differential Revision: D31465830

fbshipit-source-id: 9da739a76ef6e5d49804b0cea2089fc1741d0b7c
2021-10-13 13:27:27 -07:00
Jun Wu
d261801d65 test-edenapi-server-commit-hash-to-location: switch to debugapi
Summary:
`debugapi` was introduced to test edenapi endpoints as seen by `hg`.  Use it to
test the real client-side edenapi logic. This will eventually allow us to drop
`read_res`, `make_req`, `FromJson`, and `ToJson` to reduce bloated code in
edenapi.

Reviewed By: yancouto

Differential Revision: D31465814

fbshipit-source-id: 5d090e0a92c6374b15a36f80dcc578ad280e50e2
2021-10-13 13:27:27 -07:00
Meyer Jacobs
628113cd27 scmstore: don't always flush TreeStore on Drop
Summary: To ease review, just doing the simple / hacky version of the TreeStore diff, like the FileStore one below this. This doesn't address the tree batch failure issue, but that's not blocking release so it's fine for now.

Reviewed By: andll

Differential Revision: D31593076

fbshipit-source-id: 0d3c420e50af0d8882ba171590597aac1b6c4c77
2021-10-13 10:59:02 -07:00
Meyer Jacobs
68770c9f64 scmstore: don't always flush FileStore on Drop
Summary:
Currently, the scmstore backingstore backend for EdenFs performs very poorly compared to ContentStore. This is mainly because a local-only FileStore is created on every fetch, and always flushes when dropped (even when not used).

With this change, we'll only flush when the parent FileStore is dropped.

This is a slightly hacky fix, vs. creating a new non-flush-on-drop type like `TreeStoreFetch` as we did for trees in the previous diff, but this should work fine for now and is radically smaller. Later I'd like to clean up both to eliminate the verbosity in the trees approach while still using separate types rather than a bool.
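
A minimal sketch of the "bool rather than separate type" approach described above, under loud assumptions: `Store`, `local_view`, and the flush counter are hypothetical and only model the behavior, they are not the scmstore API. A per-fetch view shares state with its parent but skips the flush in `Drop`.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical store. It counts flushes instead of touching disk,
/// so the effect of `flush_on_drop` is easy to observe.
pub struct Store {
    flushes: Rc<Cell<usize>>,
    flush_on_drop: bool,
}

impl Store {
    /// The parent store flushes when it is dropped.
    pub fn new(flushes: Rc<Cell<usize>>) -> Self {
        Store { flushes, flush_on_drop: true }
    }

    /// A cheap per-fetch view over the same state that never flushes on drop.
    pub fn local_view(&self) -> Store {
        Store { flushes: Rc::clone(&self.flushes), flush_on_drop: false }
    }

    pub fn flush(&self) {
        self.flushes.set(self.flushes.get() + 1);
    }
}

impl Drop for Store {
    fn drop(&mut self) {
        // Only the parent pays the flush cost; views created per fetch do not.
        if self.flush_on_drop {
            self.flush();
        }
    }
}
```

Creating and dropping many views leaves the counter untouched; it only increases when the parent is dropped or `flush` is called explicitly.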

Reviewed By: andll

Differential Revision: D31591276

fbshipit-source-id: 11266c19ac68d87015719f5bcbd8f857e596bfdb
2021-10-13 10:59:02 -07:00
Meyer Jacobs
e5820fcbdb scmstore: remove prefetch batch chunking
Summary: With edenapi and memcache reads both being written to disk as they're fetched, we should no longer need to break large prefetch batches into chunks.

Reviewed By: DurhamG

Differential Revision: D31552341

fbshipit-source-id: 2d9e10db669754cc8124228252cf832c8102f220
2021-10-13 10:59:02 -07:00
Meyer Jacobs
35e03b7940 scmstore: evict memcache fetches to disk as they're returned
Summary: Like with the previous EdenApi change, writing files fetched from memcache to disk as we read them will prevent us from accumulating large amounts of file content in memory and potentially causing an OOM.

Reviewed By: DurhamG

Differential Revision: D31552190

fbshipit-source-id: 1eceb7570918575382f067ff9ad1e08e0623f335
2021-10-13 10:59:01 -07:00
Meyer Jacobs
bb762031cc scmstore: factor out method for writing memory-backed LazyFile to cache
Summary: Introduce a new method, `evict_to_cache`, which writes a memory-backed LazyFile to disk and returns a mmap-backed LazyFile instead.

Reviewed By: DurhamG

Differential Revision: D31551177

fbshipit-source-id: 86901628c56151ac805a57885379241c42f794c0
2021-10-13 10:59:01 -07:00
Meyer Jacobs
53b9b6ec57 scmstore: evict EdenApi fetches to disk concurrently
Summary:
Use the new async `files_attrs` method to concurrently fetch files from EdenApi, write them to disk, and read them back as a mmap-ed `LazyFile`.

Also remove the scmstore prefetch chunking that the previous approach required.

Previously, we fetched all the requested files from EdenApi into memory as a batch, then wrote them to disk afterward. This caused OOMs for extremely large fetches, and required chunking scmstore prefetches, which had a negative performance impact.
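
The difference between "fetch everything, then write" and "write each file as it arrives" can be sketched as follows. This is a simplified, sequential illustration with stated assumptions: the `LazyFile` enum here is a hypothetical stand-in that stores a path instead of an mmap, `fetch_and_evict` and `content` are invented helpers, and the real code is async and concurrent. The name `evict_to_cache` is borrowed from a related scmstore commit, but this signature is an assumption.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for `LazyFile`: content is either held in memory
/// or left on disk. The real type is mmap-backed; this sketch keeps only a path.
pub enum LazyFile {
    Memory(Vec<u8>),
    Disk(PathBuf),
}

impl LazyFile {
    /// Write an in-memory file under `cache_dir` and return a disk-backed handle.
    /// Files that are already on disk are returned unchanged.
    pub fn evict_to_cache(self, cache_dir: &Path, name: &str) -> io::Result<LazyFile> {
        match self {
            LazyFile::Memory(bytes) => {
                let path = cache_dir.join(name);
                fs::write(&path, &bytes)?;
                // `bytes` is dropped at the end of this arm, releasing the memory.
                Ok(LazyFile::Disk(path))
            }
            on_disk => Ok(on_disk),
        }
    }

    /// Read the content back, from memory or from disk.
    pub fn content(&self) -> io::Result<Vec<u8>> {
        match self {
            LazyFile::Memory(bytes) => Ok(bytes.clone()),
            LazyFile::Disk(path) => fs::read(path),
        }
    }
}

/// Evict each fetched file as soon as it is produced. If `fetched` is a lazy
/// iterator, only one file body needs to be in memory at a time.
pub fn fetch_and_evict<I>(fetched: I, cache_dir: &Path) -> io::Result<Vec<LazyFile>>
where
    I: IntoIterator<Item = (String, Vec<u8>)>,
{
    fetched
        .into_iter()
        .map(|(name, bytes)| LazyFile::Memory(bytes).evict_to_cache(cache_dir, &name))
        .collect()
}
```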

Reviewed By: DurhamG

Differential Revision: D31445678

fbshipit-source-id: 9a2e1476fb8ddfcd546a5e0b501cc91cc2a97303
2021-10-13 10:59:01 -07:00
Meyer Jacobs
8b995983b5 edenapi: add async files_attrs
Summary: Add a basic async `files_attrs` method to `EdenApiFileStore` for use by scmstore for concurrent fetching / writing to disk.

Reviewed By: DurhamG

Differential Revision: D31483490

fbshipit-source-id: 20233767541d60ccdfcd97bdc98b9bbb98d8700e
2021-10-13 10:59:01 -07:00
Meyer Jacobs
5524912ecf scmstore: write LFS pointers discovered outside LFS pointer store
Summary:
Currently, scmstore does not write LFS pointers discovered outside the LfsStore before fetching blobs from the LFS remote and attempting to read them back from disk. This causes scmstore to rely on the contentstore fallback to write LFS pointers, after which future reads will succeed in scmstore.

With this change, we track which LFS pointers were discovered from a source other than an LfsStore, and write those to the LfsStore before fetching from the LFS remote. This eliminates the debugimporterhelper fallback in the backingstore implementation, and should improve the performance of the shim in general for LFS files where the LFS pointer is not available locally.

In the future, I'll probably perform these writes inline with the fetches that discovered the pointers for added concurrency, and reorganize the LFS fetching code in general to avoid some of the weird bookkeeping we currently do around local vs. shared for LFS.

Reviewed By: andll

Differential Revision: D31280512

fbshipit-source-id: 2e2a32db3fc9fab8ac179b7852da36f77c66ef31
2021-10-13 10:59:01 -07:00
Alex Hornby
5bd3fe6b36 mononoke: remove edenapi_server
Summary: We run the eden endpoints from the Mononoke server, not via this binary any more. Remove it.

Reviewed By: StanislavGlebik

Differential Revision: D31501592

fbshipit-source-id: 0626fe43f0f1ce4a6c7165a734eb487225fa65b6
2021-10-13 09:45:20 -07:00
Egor Tkachenko
d6616b7f72 Add facet repo container for 2DS service.
Summary: The derived data manager doesn't need the full BlobRepo struct, so let's create a facet container that includes only the parts needed for derivation using the manager. I'll use it later with the 2DS service.

Reviewed By: StanislavGlebik

Differential Revision: D31540177

fbshipit-source-id: 42a3e96352689f39730b8f62142cccd7a1bb7256
2021-10-13 09:29:27 -07:00
Egor Tkachenko
b522fdd880 Add single commit derivation API for derived data manager.
Summary: I'm going to use that API from the 2DS service, so that discovery of underived commits will not be triggered.

Reviewed By: yancouto

Differential Revision: D31540178

fbshipit-source-id: cdf91974b90bc07e09cd3f465f9a1f8b8271a8e4
2021-10-13 09:29:26 -07:00
Stanislau Hlebik
35a7998c07 mononoke: add a method to derive a simple stack of manifests
Summary:
Background: I've been looking into derived data performance and found that
while overall performance is good, it depends quite a lot on the blobstore
latency i.e. the higher the latency the slower the derivation. What's worse is
that increasing blobstore latency even by 100ms might increase time of
derivation of 100 commits from 12 to 65 secs! [1]

However we have ways to mitigate it:
* **Option 1** If we use "backfill" mode then it makes derived data derivation less
sensitive to the put() latency
* **Option 2** If we use "parallel" mode then it makes derived data derivation less
sensitive to the get() latency.

We can use "backfill" mode for almost all derived data types (the only exception is
filenodes), however "parallel" is only enabled for a few derived data types (e.g.
fsnodes, skeleton manifests, filenodes).

In particular, we didn't have a way to do batch derived data derivation for
unodes, and so unodes derivation might get quite sensitive to the blobstore
get() latency. So this diff tries to address that.

I considered three options:
* **Option 1** The simplest option of implementing "parallel" mode for unodes is to just
do a unode warmup before we start a sequential derivation for a stack of commits. After the
warmup all necessary entries should be in cache, so derivation should be less latency sensitive.
This could work, but it has a few disadvantages, namely:
  * We do an additional traversal - not the end of the world, but it might get
    expensive for large commits
  * We might fetch large directories that don't fit in cache more often than we
    need to.

That said, because of its simplicity it might be a reasonable option to keep
in mind, and I might get back to it later.

* **Option 2** Do a derivation for a stack of commits. We have a function to derive a
manifest for a single commit, but we could write a similar function to derive the whole stack at once.
That means for each changed file or directory we generate not a single change
but a stack of changes.
I was able to implement it, but the code was too complicated. There were quite
a few corner cases (particularly when a file was replaced with a directory, or
when deriving a merge commit), and dealing with all of them was a pain.
Moreover, we need to make sure it works correctly in all scenarios, and that
wouldn't be an easy thing to do.

* **Option 3** Do a derivation for a "simple" stack of commits. That's basically the
simplified version of option #2. Let's allow doing batch derivation only for
stacks that have no
a) merges
b) path changes that are ancestors of each other (which cause file/dir
conflicts).

This implementation is significantly simpler than option #2, and it should
cover most of the cases and hopefully bring perf benefits (though this is
something I have yet to measure). So this is what this diff implements.
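
The "simple stack" condition from option 3 could be checked roughly as follows. This is a sketch of one reading of the rule, not the Mononoke implementation: the `Commit` struct and `is_simple_stack` are hypothetical, paths are assumed to be `/`-separated strings, and the conflict check is applied across all paths changed anywhere in the stack.

```rust
use std::collections::HashSet;

/// Hypothetical, minimal view of a commit for this sketch.
pub struct Commit {
    pub parent_count: usize,
    pub changed_paths: Vec<String>,
}

/// Returns true if the stack has (a) no merge commits and (b) no changed path
/// that is a directory ancestor of another changed path in the stack.
pub fn is_simple_stack(stack: &[Commit]) -> bool {
    // (a) A merge has more than one parent.
    if stack.iter().any(|c| c.parent_count > 1) {
        return false;
    }

    // Collect every path changed anywhere in the stack.
    let changed: HashSet<&str> = stack
        .iter()
        .flat_map(|c| c.changed_paths.iter().map(String::as_str))
        .collect();

    // (b) For each path, test every proper directory prefix ("a/b/c" -> "a", "a/b").
    // If such a prefix is itself a changed path, a file/dir conflict is possible.
    !changed.iter().any(|path| {
        path.match_indices('/')
            .any(|(i, _)| changed.contains(&path[..i]))
    })
}
```

Prefixes are compared at `/` boundaries only, so `a-b` and `a/b` do not conflict, while `a` and `a/b` do. Changing the same path in several commits of the stack is allowed in this sketch.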

Reviewed By: yancouto

Differential Revision: D30989888

fbshipit-source-id: 2c50dfa98300a94a566deac35de477f18706aca7
2021-10-13 08:48:27 -07:00
Stanislau Hlebik
fc4a630238 mononoke: move cmd-line parsing of derived data types to a function
Summary: I'd like to reuse it in the next diff, so let's move it to a separate function

Reviewed By: yancouto

Differential Revision: D31603911

fbshipit-source-id: 69ec0553022aabe662f75a50321423c54aafd196
2021-10-13 07:36:26 -07:00
Stanislau Hlebik
f752d303c5 mononoke: do not create InnerRepo in subcommand_single unnecessarily
Summary:
We don't really need it, and InnerRepo is more expensive to create, so let's
not do that.

Reviewed By: yancouto

Differential Revision: D31602266

fbshipit-source-id: 269bf50f8b4ee99e22888d91af2ac392078d32fa
2021-10-13 07:36:26 -07:00
Yan Soares Couto
f5e190dab9 Prevent "already tracked!" messages
Summary:
We were `hg adding` all files that were modified, however, we only really need to add files that didn't exist before the snapshot changes.

This gets rid of the annoying "already tracked!" messages.
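
The filtering described above amounts to a set difference. A minimal sketch, assuming a hypothetical `files_to_add` helper and plain string paths (the actual change lives in the snapshot command, not in this form):

```rust
use std::collections::BTreeSet;

/// Of the files touched by the snapshot, return only those that were not
/// tracked before the snapshot changes; only these need to be added.
pub fn files_to_add<'a>(modified: &[&'a str], tracked_before: &BTreeSet<&str>) -> Vec<&'a str> {
    modified
        .iter()
        .copied()
        .filter(|path| !tracked_before.contains(*path))
        .collect()
}
```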

Reviewed By: markbt

Differential Revision: D31438120

fbshipit-source-id: ca3545bc5881dcf01283abfa5ec9eca6309ff607
2021-10-13 05:15:10 -07:00