Summary: This is breaking the Windows release, reverting.
Reviewed By: fanzeyi
Differential Revision: D29339787
fbshipit-source-id: 22d8ff5db5619194e4597754dc37343cf0bc3286
Summary:
Introduce `LegacyStore` trait, which contains ContentStore methods not covered by other datastore traits.
Implement this trait for both contentstore and scmstore, and modify rust code which consumes `contentstore` directly to use `PyObject` and `LegacyStore` to abstract over both contentstore and scmstore instead.
Reviewed By: DurhamG
Differential Revision: D29043162
fbshipit-source-id: 26e10b23efc423265d47a8a13b25f223dbaef25c
Summary:
Now that EdenFS is using EdenAPI more, let's let it take advantage of
EdenAPI's better batching. We alread have a batch API for files, let's copy the
pattern for trees as well. This adds the C++ bindings. The next diff consumes
this from EdenFS
This is largely just a copy of how batch blob fetching does this. But I'm a C++
noob, so feel free to tear this apart with nits.
Reviewed By: chadaustin
Differential Revision: D28426789
fbshipit-source-id: 88d359985e849018fb3c2b4ef9e52d07c96bf31a
Summary:
Now that EdenFS is using EdenAPI more, let's let it take advantage of
EdenAPI's better batching. We alread have a batch API for files, let's copy the
pattern for trees as well. This first diff just produces the Rust code. Future
diffs will add the C++ bindings then integrate it into EdenFS.
This is largely just a copy of how batch blob fetching does it.
Reviewed By: chadaustin
Differential Revision: D28426790
fbshipit-source-id: 822ef6e7b3458df5dba7a007657e85351162b9ff
Summary:
Like it says in the title. The API between Bytes 1.x has changed a little bit,
but the concepts are basically the same, so we just need to change the
callsites that were calling `bytes()` and have them ask for `chunk()` instead.
This diff attempts to be as small as it can (and it's already quite big). I
didn't attempt to update *everything*: I only updated whatever was needed to
keep `common/rust/tools/scripts/check_all.sh` passing.
However, there are a few changes that fall out of this. I'll outline them here:
## `BufExt`
One little caveat is the `copy_to_bytes` we had on `BufExt`. This was
introduced into Bytes 1.x (under that name), but we can't use it here directly.
The reason we can't is because the instance we have is a `Cursor<Bytes>`, which
receives an implementation of `copy_from_bytes` via:
```
impl<T: AsRef<[u8]>> Buf for std::io::Cursor<T>
```
This means that implementation isn't capable of using the optimized
`Bytes::copy_from_bytes` which doesn't do a copy at all. So, instead, we need
to use a dedicated method on `Cursor<Bytes>`: `copy_or_reuse_bytes`.
## Calls to `Buf::to_bytes()`
This method is gone in Bytes 1.x, and replaced by the idiom
`x.copy_to_bytes(x.remaining())`, so I updated callsites of `to_bytes()`
accordingly.
## `fbthrift_ext`
This set of crates provides transports for Thrift calls that rely on Tokio 0.2
for I/O. Unfortunately, Tokio 0.2 uses Bytes 0.5, so that doesn't work well.
For now, I included a copy here (there was only one required, when reading from
the socket). This can be removed if we update the whole `fbthrift_ext` stack to
Bytes 1.x. fanzeyi had been wanting to update this to Tokio 1.x, but was blocked on `thrift/lib/rust` using Bytes 0.5, and confirmed that the overhead of a copy here is fine (besides, this code can now be updated to Tokio 1.x to remove the copy).
## Crates using both Bytes 0.5 & Bytes 1.x
This was mostly the case in Mononoke. That's no coincidence: this is why I'm
working on this. There, I had to make changes that consist of removing Bytes
0.5 to Bytes 1.x copies.
## Misuse of `Buf::bytes()`
Some places use `bytes()` when they probably mean to use `copy_to_bytes()`. For
now, I updated those to use `chunk()`, which keeps the behavior the same but
keeps the code buggy. I filed T91156115 to track fixing those (in all
likelihood I will file tasks for the relevant teams).
Reviewed By: dtolnay
Differential Revision: D28537964
fbshipit-source-id: ca42a614036bc3cb08b21a572166c4add72520ad
Summary:
We have a linker issue on Windows when building EdenFS with CMake:
```
backingstore.lib(winhttp.o) : error LNK2019: unresolved external symbol __imp_WinHttpSetStatusCallback referenced in function winhttp_connect
backingstore.lib(winhttp.o) : error LNK2019: unresolved external symbol __imp_WinHttpOpen referenced in function winhttp_connect
backingstore.lib(winhttp.o) : error LNK2019: unresolved external symbol __imp_WinHttpCloseHandle referenced in function winhttp_close_connection
backingstore.lib(winhttp.o) : error LNK2019: unresolved external symbol __imp_WinHttpConnect referenced in function winhttp_connect
backingstore.lib(winhttp.o) : error LNK2019: unresolved external symbol __imp_WinHttpReadData referenced in function winhttp_stream_read
backingstore.lib(winhttp.o) : error LNK2019: unresolved external symbol __imp_WinHttpWriteData referenced in function winhttp_stream_read
backingstore.lib(winhttp.o) : error LNK2019: unresolved external symbol __imp_WinHttpQueryOption referenced in function certificate_check
```
This fixes that.
Reviewed By: xavierd
Differential Revision: D28230163
fbshipit-source-id: f74e42ee30ec8f3b81c1f80b7cf63a21ea97732c
Summary:
From what I can see, this was added when EdenFS had a Mononoke store, which is
now long gone, thus we should be able to remove the Curl dependency altogether.
Reviewed By: fanzeyi
Differential Revision: D28037816
fbshipit-source-id: 834f7db64bab5dda1748ad2f033c27a2854b0ba4
Summary: This diff removes the split between manually managed and autocargo managed Cargo.toml files in `eden/scm/lib`, now all files are autogenerated.
Reviewed By: quark-zju
Differential Revision: D26830884
fbshipit-source-id: 3a5d8409a61347c7650cc7d8192fa426c03733dc
Summary:
Memcache is dogslow to initialize, taking >30s on debug build. As a
consequence, this slows down every single test by that amount time, with the
guarantee that no blobs will be found in memcache, ie: a total waste of time.
On release builds, Memcache is significantly faster to initialize, so let's
only disable initializing Memcache for debug builds only.
Reviewed By: fanzeyi
Differential Revision: D26800265
fbshipit-source-id: 8b393c603414de68268fdadb385de177e214a328
Summary:
Move these conversion related function and trait out of `hg` module so EdenFS can use it too. Changes:
* Moved `get_opt`, `get_or` and `get_or_default` directly into `ConfigSet`.
* Moved `FromConfigValue` and `ByteCount` into `configparser::convert`.
Reviewed By: quark-zju
Differential Revision: D26355403
fbshipit-source-id: 9096b7b737bc4a0cccee1a3883e89a323f864fac
Summary:
The backingstore crate depends on mio which depends on ntdll. It's not entirely
clear to me why we need to do this manually and why cargo/cmake doesn't pick it
up automatically, but let's fix it this way for now.
Reviewed By: genevievehelsel
Differential Revision: D26376606
fbshipit-source-id: 26714b3f03aabefafdf48c7fb0f442cad501a058
Summary: We now get progress bar output when fetching from memcache!
Reviewed By: kulshrax
Differential Revision: D24060663
fbshipit-source-id: ff5efa08bced2dac12f1e16c4a55fbc37fbc0837
Summary: Disable CLANGTIDY checks for several places at the code.
Reviewed By: zertosh, benoitsteiner
Differential Revision: D24018176
fbshipit-source-id: b2d294f9efd64b2e2c72b11b18d8033f9928e826
Summary: Now that the `async_runtime` crate exists, use Mercurial's global `tokio::Runtime` instead of creating one for each EdenAPI store.
Reviewed By: quark-zju
Differential Revision: D23945569
fbshipit-source-id: 7d7ef6efbb554ca80131daeeb2467e57bbda6e72
Summary:
The rust pack stores currently have logic to refresh their list of
packs if there's a key miss and if it's been a while since we last loaded the
list of packs. In some cases we want to manually trigger this refresh, like if
we're in the middle of a histedit and it invokes an external command that
produces pack files that the histedit should later consume (like an external
amend, that histedit then needs to work on top of).
Python pack stores solve this by allowing callers to mark the store for a
refresh. Let's add the same logic for rust stores. Once pack files are gone we
can delete this.
This will be useful for the upcoming migration of treemanifest to Rust
contentstore. Filelog usage of the Rust contentstore avoided this issue by
recreating the entire contentstore object in certain situations, but refresh
seems useful and less expensive.
Reviewed By: quark-zju
Differential Revision: D23657036
fbshipit-source-id: 7c6438024c3d642bd22256a8e58961a6ee4bc867
Summary:
Previously the backing store was loading configs manually. Now that
system, dynamic, user, and repo config loading are unified, let's go through
that approved path.
Reviewed By: kulshrax
Differential Revision: D22736338
fbshipit-source-id: 232023e660107a096691e9d99bf89c04c218dfbd
Summary:
This threads the calls to load_dynamic and load_repo through the Rust
layer up to the Python bindings. This diff does 2 notable things:
1. It adds a reload API for reloading configs in place, versus creating a new
one. This will be used in localrepo.__init__ to construct a new config for the
repo while still maintaining the old pinned values from the copied ui.
2. It threads a repo path and readonly config list from Python down to the Rust
code. This allows load_dynamic and load_repo to operate on the repo path, and
allows the readonly filter to applied to all configs during reloading.
Reviewed By: quark-zju
Differential Revision: D22712623
fbshipit-source-id: a0f372f4971c5feac2f20e89a0fb3fe6d4a65d6f
Summary: Similarly to the changes made for `get`, the same can be applied to prefetch.
Reviewed By: DurhamG
Differential Revision: D22565609
fbshipit-source-id: 0fbc1a0086fa44593a6aaffb746ed36b3261040c
Summary:
When using LFS, it's possible that a pointer may be present in the local
LfsStore, but the blob would only be in the shared one. Such scenario can
happen after an upload, when the blob is moved to the shared store for
instance. In this case, during a `get` call, the local LFS store won't be able
to find the blob and thus would return Ok(None), the shared LFS store woud not
be able to find the pointer itself and would thus return Ok(None) too. If the
server is not aware of the file node itself, the `ContentStore::get` would also
return Ok(None), even though all the information is present locally.
The main reason why this is happening is due to the `get` call operating
primarily on file node based keys, and for content-based stores (like LFS),
this means that the translation layer needs to be present in the same store,
which in some case may not be the case. By allowing stores to return a
`StoreKey` when progress was made in finding the key we can effectively solve
the problem described above, the local store would translate the file node key
onto a content key, and the shared store would read the blob properly.
Reviewed By: DurhamG
Differential Revision: D22565607
fbshipit-source-id: 94dd74a462526778f7a7e232a97b21211f95239f
Summary: Move the `tokio::Runtime` into `EdenApiRemoteStore` so that if initialization fails, we can propagate the error instead of panicking.
Reviewed By: xavierd
Differential Revision: D22564210
fbshipit-source-id: 9db1be99f2f77c6bb0f6e9dc445d624dc5990afe
Summary: Instead of restricting the allowed characters in a repo name, allow any UTF-8 string. The string will be percent-encoded before being used in URLs.
Reviewed By: quark-zju
Differential Revision: D22559830
fbshipit-source-id: f9caa51d263e06d424531e0947766f4fd37b035f
Summary: Some repos do not have `remotefilelog.reponame` set, so this shouldn't be a required config item.
Reviewed By: fanzeyi
Differential Revision: D22553141
fbshipit-source-id: a0fe9c289a1a32650572a4c123cda60af90e79ec
Summary: Reimplement `EdenApiHgIdRemoteStore` as `EdenApiRemoteStore<T>`, where `T` is a marker type indicating whether this store fetches files or trees. This allows working with the stores in a more strongly-typed way, and avoid having to check what kind of store this is at runtime when fetching data.
Reviewed By: quark-zju
Differential Revision: D22492160
fbshipit-source-id: e17556093fa9b81d2301f281da36d75a03e33c5e
Summary: Update the `revisionstore` and `backingstore` crates to use the new EdenAPI crate.
Reviewed By: quark-zju
Differential Revision: D22378330
fbshipit-source-id: 989f34827b744ff4b4ac0aa10d004f03dbe9058f
Summary: Move old EdenAPI crate to `scm/lib/edenapi/old` to make room for the new crate. This old code will eventually been deleted once all references to it are removed from the codebase.
Reviewed By: quark-zju
Differential Revision: D22305173
fbshipit-source-id: 45d211340900192d0488543ba13d9bf84909ce53
Summary:
NOTE: This stack works together, they are separated for easier reviewing. Check D21723465 if you want to read this stack in one place.
This diff encapsulates the newly added batch blob import API in backingstore crate, and expose it to EdenFS. It converts incoming complex data structures down to primitive data types that can go through FFI boundary.
Reviewed By: xavierd
Differential Revision: D21697578
fbshipit-source-id: 8f5505ef4cab342dfa710798a04df4e0e94055d9
Summary:
NOTE: This stack works together, they are separated for easier reviewing. Check D21723465 if you want to read this stack in one place.
This diff implements getBlobBatch in `backingstore` crate that is able to process multiple blob import request at once. This will help EdenFS to process blob import request efficiently. See following diffs in the stack for usage.
To import a list of import requests, the function first check if the file exists locally. When it is already in local store, the function will call the provided `resolve` function to send the blob back. Then it proceeds to send **ONE** request to remote store (in our case, EdenAPI) for all these missing files if this import permits remote importing.
We use the callback style `resolve` parameter because we want to awake waiting threads as soon as the blob is ready, so we don't waste time on unnecessary waiting.
Reviewed By: xavierd
Differential Revision: D21697580
fbshipit-source-id: b550accf6f6163cf6f2e9be6b628e46f44076c37
Summary:
xavierd and I realized that memcache is not actually enabled in EdenFS as `ContentStoreBuilder` only sets it when there is are remote store. We believe turning on memcache store can help us improve some performance for importing blobs (80% hit rate).
This diff adds a noop remote store that does nothing to trick `ContentStoreBuilder` to include the memcache store. We decided to go this way instead of modifying `ContentStoreBuilder` is because EdenFS is going to have a real remotestore very soon, and changes to `ContentStoreBuilder` may have some impact on how Mercurial uses it.
Reviewed By: xavierd
Differential Revision: D21621234
fbshipit-source-id: 5502e001ab498134b6a927d83be823261e70e9e1
Summary: This bug got in while iterating the original Diff. It should only be returning empty when the blob does not exist locally.
Reviewed By: xavierd
Differential Revision: D21417659
fbshipit-source-id: 676e22313ab4a024af5341d8c99797fc062bd293
Summary:
Talked with xavierd last week and we can use LocalStore's `get_missing` to determine if a blob is present locally. In this way we can prevent the backingstore crate from accidentally asking EdenAPI for a blob, so better control at EdenFS level.
With this change, we can use this function at the time where a blob import request is created with confidence that this should be short cheap call.
This diff should not change any behavior or performance.
Reviewed By: xavierd
Differential Revision: D21391959
fbshipit-source-id: fd31687da1e048262cb4eae2974cab6d8915a76d
Summary:
All of the callers are already using an Arc, so instead of forcing the remote
store to be cloneable, and thus wrap an inner self with an Arc, let's just pass
self as an Arc.
Reviewed By: DurhamG
Differential Revision: D20715580
fbshipit-source-id: 1bef23ae7da7b314d99cb3436a94d04134f1c0e4
Summary:
Instead of having the magic number 0x2000 all over the place, let's move the
logic to this method.
Reviewed By: DurhamG
Differential Revision: D20637749
fbshipit-source-id: bf666f8787e37e6d6c58ad8982a5679b7e3e717b
Summary:
This enables fetching blobs from the LFS server. For now, this is limited to
fetching them, but the protocol specify ways to also upload. That second part
will matter for commit cloud and when pushing code to the server.
One caveat to this code is that the LFS server is not mocked in tests, and thus
requests are done directly to the server. I chose very small blobs to limit the
disruption to the server, by setting a test specific user-agent, we should be
able to monitor traffic due to tests and potentially rate limit it.
Reviewed By: DurhamG
Differential Revision: D20445628
fbshipit-source-id: beb3acb3f69dd27b54f8df7ccb95b04192deca30
Summary:
This type can either be a Mercurial type key, or a content hash based key. Both
the prefetch and get_missing now can handle these properly. This is essential
for stores where data can either be fetched in both ways or when the data is
split in 2. For LFS for instance, it is possible to have the LFS pointer (via
getpackv2), but not the actual blob. In which case get_missing will simply
return the content hash version of the StoreKey, to signify what it actually
has missing.
Reviewed By: quark-zju
Differential Revision: D20445631
fbshipit-source-id: 06282f70214966cc96e805e9891f220b438c91a7
Summary: This makes it clear that these traits are dealing with Mercurial Key.
Reviewed By: quark-zju
Differential Revision: D20445626
fbshipit-source-id: d5acbf442e9407b973e95e40af69b5a61bff0a4d
Summary:
Now that the ContentStore can automatically strip the metadata header, no need
for duplicated code in the backingstore.
Reviewed By: fanzeyi
Differential Revision: D20376812
fbshipit-source-id: e863e1cc2fcdc8b9e612a464b305fa25ceb66e13
Summary:
The bytes 0.5 is a depencency of newer tokio, it's also newer, and thus better.
Staying on 0.4 means that copies between Bytes 0.4 and 0.5 need to be done,
this will be especially bad in the LFS code since 10+MB buffer will have to be
copied...
One main API change is for the configparser. The code used to take Into<Bytes>
for the keys, I switched it to AsRef<[u8]>.
For hg_memcache_client, an extra copy is performed to build a Delta, since this
code uses an old tokio, and is being replaced right now, the effort of
switching to a new tokio and new bytes was not deemed worth it, the copy will
do for now.
Reviewed By: dtolnay
Differential Revision: D20043137
fbshipit-source-id: 395bfc3749a3b1bdfea652262019ac6a086e61e0
Summary:
With the Arc embedded into the store themselves, this forces a second
allocation in order to use them as trait objects. Since in most cases, we do
not want the stores themselves to be cloneable, we can move the Arc outside and
thus reduce the number of pointer indirection.
Reviewed By: DurhamG
Differential Revision: D19867568
fbshipit-source-id: 9cd126831fe2b9ee715472ac3299b7a09df95fce
Summary:
Failing means that we fallback to the Python importer. Let's simply warn about
it.
Reviewed By: fanzeyi
Differential Revision: D19897274
fbshipit-source-id: f9c63f5aa76015c28b31f00bba98244f5c86e923
Summary:
EdenFS will now be able to fetch blobs directly from memcache. This won't have
any big benefits as no blobs are in memcache right now, but over time, this
will significantly reduce the cost of fetching blobs.
Reviewed By: fanzeyi
Differential Revision: D19861643
fbshipit-source-id: c2e9d317bd30d4656bf0b3f8897794161697761a
Summary:
As reported by JT, EdenFS may hold the file descriptor of mapped pack files too long even when it is deleted by external processes, thus taking more disk spaces.
This diff fixes this problem by making EdenFS periodically rescan the pack files.
Reviewed By: chadaustin
Differential Revision: D19395439
fbshipit-source-id: 4bfd6a7ac13dceb3099d2704d62b3825433aff4b
Summary:
In order to avoid pathological linkrevfixup after pushing a change to a
pushrebase repo, we need to be able to fetch history data that is already
present locally from the server. Since the Rust stores will always check
whether the data is present locally before fetching it, we would not be
fetching anything, causing the pathological linkrevfixup to kick in.
By allowing the stores to be built without a local component, the prefetch
function will not find the local history, and will thus be able to fetch it
properly.
Reviewed By: DurhamG
Differential Revision: D19412619
fbshipit-source-id: 421c59c63634ead7f98e6ba89da0532067f7b412
Summary: This diff makes EdenFS to be able to use EdenApi to import blobs and trees when requested objects are not present in local hgcache.
Reviewed By: chadaustin
Differential Revision: D18605547
fbshipit-source-id: 4acd2e07cfd9de6b6775ded30ea22a4478b9f1e4
Summary:
Add an option `experimental:use-edenapi` to `EdenConfig`.
See the next diff for usage.
Reviewed By: chadaustin
Differential Revision: D18605549
fbshipit-source-id: 2786c21bb38a76229078662cc5c1ddf906d1be4a