Summary:
I'm not sure what this was for, but it doesn't seem necessary, and removing it simplifies the code a lot, enabling us to make other improvements later.
This is an alternate, less ambitious version of https://www.internalfb.com/diff/D30620443.
Reviewed By: DurhamG
Differential Revision: D30674016
fbshipit-source-id: 17dee50b82c78d31e45492dc23826d8c3fe838e5
Summary:
This test relies on Mononoke, so it fails for the make local build/test,
which breaks hgbuild. Let's only enable it for Buck tests.
Reviewed By: quark-zju
Differential Revision: D30782799
fbshipit-source-id: 4b543beeb248715702b9072d84cdb8211dcd4a9b
Summary:
Currently there are two things preventing us from running add_sync_target
on an existing target:
* an already existing bookmark
* an already existing config
Both need to be deleted to create a new target. This diff removes the second
requirement to simplify the code and make it easier to recreate the target
(it's easy to forget to manually remove the config, as configs otherwise
don't need human intervention).
Reviewed By: StanislavGlebik
Differential Revision: D30767613
fbshipit-source-id: f951c0e1ef9bde69d805dc911331fcdb220123f2
Summary:
This logic scans all the ancestors of the working copy that are not
ancestors of the graft source and checks their extras. With lazy changelog
this is extremely expensive. Let's just drop this logic.
Reviewed By: quark-zju
Differential Revision: D30734017
fbshipit-source-id: ca5606cea08efe10f29847970379d6bff4eb4aee
Summary:
Update the `Filenodes` trait so that it doesn't require the repository id to be
passed in every method invocation. In practice a filenodes instance can only
be used for a single repo, so it is safer for the implementation to store the
repository id.
At the same time, update the trait to use new futures and async-trait.
Reviewed By: yancouto
Differential Revision: D30729630
fbshipit-source-id: a1f80a299d9b0a99ddb267d1f7093f27cf21f1af
Summary:
Make the derivation process for mercurial changesets and manifests not depend
on `BlobRepo`, but instead use the repo attribute (`RepoBlobstore`) directly.
This will allow us to migrate to using `DerivedDataManager` in preparation
for removing `BlobRepo` from derivation entirely.
Reviewed By: yancouto
Differential Revision: D30729629
fbshipit-source-id: cf478ffb97a919c78c7e6e574580218539eb0fdf
Summary:
Make the derivation process for blame and deleted files manifest not depend
on `BlobRepo`, but instead use the repo attribute (`RepoBlobstore`) directly.
This will allow us to migrate to using `DerivedDataManager` in preparation
for removing `BlobRepo` from derivation entirely.
A `BlobRepo` reference is still needed at the moment for derivation of the
unodes that these depend on. That will be removed when `DerivedDataManager`
takes over co-ordination of derivation.
Reviewed By: yancouto
Differential Revision: D30729628
fbshipit-source-id: 4504abbe63c9bf036d69cb4341c75b13061fae18
Summary:
Make the derivation process for fsnodes and skeleton manifests not depend on
`BlobRepo`, but instead take the `DerivedDataManager` from the `BlobRepo` and
use that instead. This is in preparation for removing `BlobRepo` from
derivation entirely.
Reviewed By: yancouto
Differential Revision: D30301855
fbshipit-source-id: a2ed1639526aad9ddbe8429988043f0499f7629c
Summary:
Make the derivation process for unodes not depend on `BlobRepo`, but instead
take the `DerivedDataManager` from the `BlobRepo` and use that instead. This
is in preparation for removing `BlobRepo` from derivation entirely.
Reviewed By: yancouto
Differential Revision: D30300408
fbshipit-source-id: c35e9e21366de74338f453aaf6be476e7305556d
Summary:
The derived data manager also has a reference to the repo blobstore. This must
also be overridden when we override the blobstore.
Reviewed By: yancouto
Differential Revision: D30738354
fbshipit-source-id: b0e16ef810c5244cd056a3c9e5b9ceaaddb5ecea
Summary:
Modernise tests by removing `disable treemanifest`.
treemanifest is the default now, so it shouldn't be disabled unless absolutely necessary.
Also, if we would like to switch some of these tests to Mononoke, it shouldn't be disabled.
Only two tests in commitcloud still use it; those two are a bit harder to fix.
```
devvm1006.cln0 {emoji:1f440} ~/fbsource/fbcode/eden/scm/tests
[139] → rg 'disable treemanifest' | grep cloud
test-commitcloud-backup-bundlestore-short-hash.t: $ disable treemanifest
test-commitcloud-backup.t: $ disable treemanifest
```
Reviewed By: kulshrax
Differential Revision: D30278754
fbshipit-source-id: cf450084669c2b6b361cd34952bf986e913de1a8
Summary:
I want to use the `ReplicaFirst` read connection type, since `ReplicaOnly` is a bit too restrictive.
We've had 2 MySQL SEVs this year when all the replicas went down crashing our services despite the primary instance working normally.
There was also a case when I deleted too many rows at once and all replicas went down due to replication lag (I know better now).
RFC
- Yay or Nay?
- Should I expand `ReadConnectionType` to mirror all options of `InstanceRequirement`?
- Perhaps it's worth moving it into the `common/rust/shed/sql` crate?
I kept the cleanup of all existing usages out of this diff to keep the changes minimal for the RFC.
Differential Revision: D30574326
fbshipit-source-id: 1462b238305d47557372afe7763096c53df55f10
Summary:
Segmented changelog seeder spends a significant chunk of time fetching
changesets. By saving them to file we can make reseeding significantly faster.
Reviewed By: farnz
Differential Revision: D30765374
fbshipit-source-id: 0e6adf12e334ad70486145173ae87c810880988a
Summary:
In backfill_derived_data we had a way to prefetch a lot of commits at once, so
that backfill_derived_data doesn't have to do it on every startup.
I'd like to use the same functionality in segmented changelog seeder, so let's
move it to a separate binary.
Reviewed By: mitrandir77, farnz
Differential Revision: D30765375
fbshipit-source-id: f6930965b270cbaae95c3ac4390b3d367eaab338
Summary: ContentDataStore is meant to be implemented local-only. Fetching remotely seems to cause the issue observed in https://fb.workplace.com/groups/scm/permalink/4192991577417097/ (though I'm not quite sure why yet)
Reviewed By: kmancini
Differential Revision: D30744817
fbshipit-source-id: 68875a4912905f9b8f88cf4be804c5d988c3905d
Summary: If the bind unmount fails in the privhelper, there's a possibility of infinite recursion in this method. This adds a flag to indicate whether we've tried the bind unmount before.
Differential Revision: D30732857
fbshipit-source-id: 6ee887d211977ee94c8e66531287f076a7e61a2c
Summary:
It sounds like macOS has a bug where an APFS subvolume may be falsely created.
Let's retry with the hope that the retry will succeed.
Differential Revision: D30657706
fbshipit-source-id: 60bc74f789a0d34b2be53073103b95474a9a18e6
Summary: This is the regenerated Rust lib using the latest compiler.
Reviewed By: krallin
Differential Revision: D30720130
fbshipit-source-id: 3d3389ec8504568fc356dda1577e1f7801cb7e96
Summary:
~~Also enable the `derive` feature so it isn't necessary to separately
depend on `strum_macros`.~~
This turns out to break a lot.
Reviewed By: dtolnay
Differential Revision: D30709976
fbshipit-source-id: a9181070b8d7a8489eebc9e94fa24f334cd383d5
Summary: Move `edenapi::Client`'s internals to an `Arc<ClientInner>`. This makes the client `Clone`-able while sharing the same underlying state. This is particularly useful for scenarios where `Future`s or `Stream`s returned by the client need to hold a reference to the client itself (e.g., in order to issue subsequent HTTP requests).
Differential Revision: D30729803
fbshipit-source-id: c97e700c9e3702f818eb86ded1a46f920a55cfd1
Summary: The `Fetch<T>` type has basically turned into the canonical type for all EdenAPI responses. Originally, this type was merely an implementation detail (essentially just a named tuple returned by the `fetch()` method, hence the name), but given its prominence in the API, the name is confusing. As we add more functionality and usage to this type, it makes sense to give it a more suitable name.
Differential Revision: D30730573
fbshipit-source-id: 7acd2a86b55bdfc186bd9110f6a99333df9d29d3
Summary:
Some of the method names used internally by `edenapi::Client` are a bit terse.
This was OK back when there were only a handful of private methods which were used by a small number of API methods that were doing more or less the same thing (sending concurrent POST requests for a set of keys).
Today, there are way more API methods, most of which set up requests in different ways. As such, it makes sense to give these older private methods more explicit and descriptive names so that their intended usage is clear.
Differential Revision: D30729802
fbshipit-source-id: 5adfd8e7ba153df8c036e4dbb312f95b9b1d7335
Summary: Allow repack to be called on treescmstore via the ContentStore shim, as is already supported for filescmstore.
Reviewed By: andll
Differential Revision: D30687145
fbshipit-source-id: 7559af08e98cfb22da6dbf45dc1746312b1e6d28
Summary:
Provide a basic implementation of the LegacyStore trait for TreeStore to allow repack calls to be forwarded to the fallback ContentStore for trees.
Repack will be removed entirely before contentstore is deleted, and the `unimplemented` methods are never called, so this should be safe.
Reviewed By: andll
Differential Revision: D30687136
fbshipit-source-id: d238d70fbf6be5c25c2e1c9610430a53d031bf3b
Summary: Looks like it was lost during the last refactoring, let's add it back.
Reviewed By: farnz
Differential Revision: D30728456
fbshipit-source-id: 20c638b3c5a8664f2367f871cd29a793fd897de3
Summary:
Some users have reported errors of the form:
```
error.HttpError: [65] Send failed since rewinding of the data stream failed (seek callback returned error 2)
```
These are caused by the fact that we're passing the HTTP request body directly to libcurl in memory rather than via a file, but we haven't implemented the `seek()` method necessary for libcurl to retransmit the data if needed. This diff implements the method.
Reviewed By: DurhamG
Differential Revision: D30654625
fbshipit-source-id: f21a067ad02ee540b86cf2e6eff2c6f08f45a3e4
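The rewind mechanism described above can be sketched in Python: a minimal in-memory request body that supports both `read()` and `seek()`, which is what libcurl needs in order to retransmit a body after a redirect or connection reset. The `RequestBody` name and shape are hypothetical, not the actual http-client types.

```python
import io

class RequestBody:
    """Sketch of an in-memory request body that supports rewinding.

    Without a working seek(), a retransmit fails with the
    "rewinding of the data stream failed" error quoted above.
    """

    def __init__(self, data: bytes):
        self._buf = io.BytesIO(data)

    def read(self, n: int) -> bytes:
        # libcurl pulls the body in chunks via the read callback.
        return self._buf.read(n)

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        # The seek callback: lets libcurl rewind to the start (or any
        # offset) before re-sending the request.
        return self._buf.seek(offset, whence)
```

A retransmit then amounts to `body.seek(0)` followed by re-reading the whole body.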
Summary:
Like it says in the title, this updates us to use Daemonize 0.5, though from
GitHub and not crates.io, because it hasn't been released to the latter yet.
The main motivation here is to pull in
https://github.com/knsd/daemonize/pull/39 to avoid leaking PID files to
children of the daemon.
This required some changes in `hphp/hack/src/facebook/hh_decl` and `xplat/rust/mobium` since the way to
run code after daemonization has changed (and became more flexible).
Reviewed By: ndmitchell
Differential Revision: D30694946
fbshipit-source-id: d99768febe449d7a079feec78ab8826d0e29f1ef
Summary:
At the moment, when segmented changelog is updated and/or reseeded, mononoke
servers can pick it up only once an hour (the current reload schedule)
or when mononoke server is restarted. However during production issues (see
attached task for an example) it would be great to have a way to force all
servers to reload segmented changelog.
This diff makes it possible to do so with a tunable. Once the tunable changes its
value, mononoke servers almost immediately (subject to jitter) reload it.
This implementation adds a special loop that polls tunables value and reloads
if it changes. Note that in theory it could avoid polling and watch for configerator
changes instead, but it would be harder to implement and I decided that it's
not worth it.
Reviewed By: farnz
Differential Revision: D30725095
fbshipit-source-id: da90ea06715c4b763d0de61e5899dfda8ffe2067
Summary:
Previously, extremely large prefetch calls could cause an OOM if the requested files were all fetched remotely from EdenApi, in which case the memory would remain in use until the entire batch had been fetched.
With this change, at most 1000 EdenApi files (or 10GB of memory) will be held in memory at once. This is a stop-gap solution; a better approach would be to avoid storing all EdenApi files in memory past a certain amount, or to allow the batch fetching implementation to understand that we're only prefetching, and thus avoid reading anything back from disk or storing EdenApi files in memory unnecessarily.
Reviewed By: andll
Differential Revision: D30686054
fbshipit-source-id: 022e353760c515961a8956f7958b43f429143971
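The bounding idea above can be sketched as simple chunked fetching: process the keys in batches of at most N, flushing each batch to the local cache before fetching the next, so memory use is bounded by one batch instead of the whole prefetch. Names (`fetch_batch`, `write_to_cache`) are hypothetical stand-ins for the remote fetch and cache-write steps.

```python
def prefetch_in_batches(keys, fetch_batch, write_to_cache, batch_size=1000):
    """Sketch of the stop-gap: hold at most `batch_size` fetched
    responses in memory at any time."""
    for start in range(0, len(keys), batch_size):
        batch = keys[start:start + batch_size]
        responses = fetch_batch(batch)   # only one batch in memory
        write_to_cache(responses)        # flush before fetching more
```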
Summary:
Manual component version update
Bump Schedule: https://www.internalfb.com/intern/msdk/bump/?schedule_fbid=342556550408072
Package: https://www.internalfb.com/intern/msdk/package/181247287328949/
Oncall Team: rust_foundation
NOTE: This build is expected to expire at 2022/09/01 09:14AM PDT
---------
New project source changes since last bump based on D30663071 (08e362a355e0a64a503f5073f57f927394696b8c at 2021/08/31 03:47AM -05):
| 2021/08/31 04:41AM -05 | generatedunixname89002005294178 | D30665384 | [MSDK] Update autocargo component on FBS:master |
| 2021/08/31 07:14PM PDT | kavoor | D30681642 | [autocargo] Make cxx-build match version of cxx |
| 2021/09/01 04:05PM BST | krallin | D30698095 | autocargo: include generated comment in OSS manifests |
---------
build-break (bot commits are not reviewed by a human)
Reviewed By: farnz
Differential Revision: D30717040
fbshipit-source-id: 2c1d09f0d51b6ff2e2636496cf22bcf781f22889
Summary: Keep two versions of fbthrift_ext: one with tokio-0.2 and the other with tokio-1.x. This diff is just renaming.
Reviewed By: dtolnay
Differential Revision: D30558441
fbshipit-source-id: bfe7e96b95529f2745f635190df5118a0cb44014
Summary:
Having the same queue for all three makes the dequeue code overly complicated
as it needs to keep track of the kind of request that needs to be dequeued.
Incidentally, the previous code had a bug where requests in "putback" would be
requeued at the end of the queue, even though they had been at the beginning
of it, if they all had the same priority.
This in theory should also improve the dequeue performance when the queue has a
mix of blob/tree requests, but I haven't measured it.
Reviewed By: genevievehelsel
Differential Revision: D30560490
fbshipit-source-id: b27e5429105c07e5f9eab482c12e5699ca3413f7
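The split described above can be sketched as one deque per request kind, with "putback" re-inserting at the front of its queue (the bug in the single-queue version sent such requests to the back). This is a hypothetical Python sketch of the idea, not the EdenFS C++ code.

```python
from collections import deque

class ImportQueues:
    """Sketch: one queue per request kind, so dequeue no longer needs
    to track which kind a request is."""

    def __init__(self):
        self._queues = {"blob": deque(), "tree": deque(), "prefetch": deque()}

    def enqueue(self, kind, request):
        self._queues[kind].append(request)

    def dequeue(self, kind):
        return self._queues[kind].popleft()

    def putback(self, kind, request):
        # Re-insert at the front so the request keeps its original
        # place, fixing the requeue-at-the-end bug described above.
        self._queues[kind].appendleft(request)
```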
Summary:
Since the background condition is checked before the actual prefetching of files,
specifying the background option would just glob files but not prefetch them,
which is equivalent to prefetching only the trees.
Reviewed By: genevievehelsel
Differential Revision: D30618753
fbshipit-source-id: 5533b1c78d614342ac3341ce033795be3850750a
Summary:
It looks like a few scmstore changes landed with warnings (probably fixed higher up in the tree unification stack).
This change fixes those warnings.
Differential Revision: D30686092
fbshipit-source-id: d80625dea64f35683f815b58c83a3e5bb7cbdfa8
Summary:
When we remove a file from a sparse profile and commit the profile, it
should delete the file on disk. There's a bug where it doesn't actually delete
the file. This fixes it by passing the correct commit parents to the refresh
function.
Reviewed By: andll
Differential Revision: D30683677
fbshipit-source-id: 7f012faa99975d8270209f2962e7f9236890daed
Summary:
Use `populate_missing_vertexes_for_add_heads` (added by D27630093 (f138b012e9)) to avoid
excessive lookups for non-master ids that remain in non-master. The function
was used in two other `flush` cases, but missed the id reassignment case. It
works basically by using the "discovery" logic to quickly rule out what's
missing and what's present (ex. if a root is missing in the server graph, then
all descendants of the root are missing).
Reviewed By: andll
Differential Revision: D30700451
fbshipit-source-id: 1f1cd88399dbffd4af75083fef1f3e363a5c60fe
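The discovery shortcut described in the parenthetical can be sketched as a simple graph walk: once a root is known to be missing on the server, every descendant can be marked missing without any per-vertex remote lookup. The `children_of` mapping is a hypothetical stand-in for the graph structure.

```python
def missing_descendants(missing_roots, children_of):
    """Sketch of the discovery rule: if a root is missing in the
    server graph, all of its descendants are missing too, so no
    further remote lookups are needed for them."""
    missing = set()
    stack = list(missing_roots)
    while stack:
        vertex = stack.pop()
        if vertex in missing:
            continue
        missing.add(vertex)
        # Descendants of a missing vertex are also missing.
        stack.extend(children_of.get(vertex, ()))
    return missing
```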
Summary: Reassigning non-master ids might trigger too many remote lookups.
Reviewed By: StanislavGlebik
Differential Revision: D30700452
fbshipit-source-id: 2483335e466c3de8a362f7b6a15fc4ba9e2693be
Summary:
Support octopus merge defined in the following format:
- The revlog flag has `1 << 12` set.
- Extra `stepparents` is set to `hexnode1,hexnode2,...` format.
This is mainly used to support revlog from stream clone.
Reviewed By: StanislavGlebik
Differential Revision: D30686450
fbshipit-source-id: d5aa2f18a02f5f0d7aa033210fb4f79b729c0d26
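The format above can be sketched as a small decoder: the two regular revlog parents, plus any extra parents carried in the `stepparents` extra when the octopus flag is set. This is a minimal illustration of the described encoding, not the actual revlog code.

```python
OCTOPUS_FLAG = 1 << 12  # revlog flag from the format described above

def octopus_parents(p1, p2, flags, extras):
    """Sketch: recover all parents of an octopus merge from the two
    regular parents plus the `stepparents` extra
    (`hexnode1,hexnode2,...`)."""
    parents = [p for p in (p1, p2) if p is not None]
    if flags & OCTOPUS_FLAG and extras.get("stepparents"):
        parents.extend(extras["stepparents"].split(","))
    return parents
```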
Summary:
Extend the struct so we can support more than two parents.
The size of the struct is now 16 bytes, up from 8 bytes. This might have some
performance overhead.
Not using `Box<[u8]>` because that will make the struct 24 bytes.
Reviewed By: StanislavGlebik
Differential Revision: D30686451
fbshipit-source-id: c0f8d0472c7e578f34d771dacecffc91585650c3
Summary:
In `vertex_id_with_max_group(name, group)`, if `group` is master and the `name`
exists in the non-master group, then there is no need to look up remotely,
because the same name (vertex) cannot be present in both the master and
non-master groups. In that case, just return that the `name` does not exist in
the master group.
Reviewed By: StanislavGlebik
Differential Revision: D30699215
fbshipit-source-id: 5170abe719aa7cc31533912e18bc0e21f133e1f4
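The shortcut above can be sketched as follows. `local_groups` (name → (group, id)) and `remote_lookup` are hypothetical stand-ins for the local assignment map and the expensive server query; the point is only the early return that skips the remote round-trip.

```python
MASTER, NON_MASTER = "master", "non-master"

def vertex_id_with_max_group(name, group, local_groups, remote_lookup):
    """Sketch of the shortcut: a vertex name can only live in one
    group, so a name already assigned locally to non-master cannot
    exist in master, and the remote lookup can be skipped."""
    assigned = local_groups.get(name)
    if assigned is not None:
        assigned_group, assigned_id = assigned
        if group == MASTER and assigned_group == NON_MASTER:
            return None  # cannot be in master: no remote round-trip
        return assigned_id
    # Unknown locally: fall back to the (expensive) remote lookup.
    return remote_lookup(name)
```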
Summary:
Added a test about excessive remote lookups when flush() reassigns vertexes
from non-master to master.
Reviewed By: StanislavGlebik
Differential Revision: D30699214
fbshipit-source-id: 0547707764855ab9a563178740612b54df4a5fc9
Summary: They are used to narrow down issues related to S242328.
Reviewed By: StanislavGlebik
Differential Revision: D30699216
fbshipit-source-id: 28f4f0bfadadb2dea5510878168c2d7b47a8641c
Summary: Split out the request ID, URL, and HTTP method from `RequestContext` into a new `RequestInfo` struct, which can be cheaply cloned and included in the response returned to the caller. This enables the caller to correlate requests and responses, which is useful when working with many concurrent requests.
Reviewed By: DurhamG
Differential Revision: D30650365
fbshipit-source-id: 68efedcf852c91387450443ebe46062809633f10
Summary:
Make it possible to call the CheckIntegrity APIs from Python, such as:
```
In [1]: cl.inner.checkuniversalids()
Out[1]: []
In [2]: cl.inner.checksegments()
Out[2]: []
In [3]: cl.inner.checkisomorphicgraph(cl.inner, cl.dageval(lambda: heads(mastergroup())))
# takes a while
Out[3]: []
```
Reviewed By: andll
Differential Revision: D30682536
fbshipit-source-id: 23f280bf261def3d20d5f7dc15a48c2fc2d79d77
Summary: This makes it easier for other crates to implement CheckIntegrity.
Reviewed By: andll
Differential Revision: D30682540
fbshipit-source-id: 4333f37fa7bafe55a8bee9f149b2f23a463c51af
Summary:
Make the revlog index provide dummy graph integrity checks so it can
be used as a generic object in the Python bindings.
Reviewed By: andll
Differential Revision: D30682542
fbshipit-source-id: 25c6e8640de46188d7bf45a927e11e0779a8ad40
Summary: Make it possible to check a graph against a reference graph.
Reviewed By: andll
Differential Revision: D30682539
fbshipit-source-id: 57db952dcda5656ff6000e9961448c9b64afbaf0
Summary: Make it possible to check segment integrity.
Reviewed By: andll
Differential Revision: D30644243
fbshipit-source-id: 24bb0c8c8c9394d688e3e9320e59268bc2a4ed3f
Summary: Make it possible to check universal ids externally.
Reviewed By: andll
Differential Revision: D30644242
fbshipit-source-id: f312ff59dbdf68e57c5249d57c5d44da0b10e398
Summary: This will be used to verify graph integrity later.
Reviewed By: andll
Differential Revision: D30644244
fbshipit-source-id: 0d22b70121da37c411adf17200a6c752fefa80ad
Summary: This breaks all use of `hg sparse`, because `.hg*` cannot be matched.
Reviewed By: mitrandir77
Differential Revision: D30666349
fbshipit-source-id: c06d1b798a57490f2e5560f178a2839ae5425146
Summary: We've got multiple manifold parameters now, two of which are Option<i64>, so let's create a struct to name them.
Reviewed By: HarveyHunt
Differential Revision: D30305462
fbshipit-source-id: 44eee00d478e4485d074a14fcccec2f0f9572ecd
Summary:
This allows us to quickly identify the program that emitted the error.
Per user feedback: https://fb.workplace.com/groups/clifoundation/posts/433922134631466
Reviewed By: StanislavGlebik
Differential Revision: D30604611
fbshipit-source-id: 712bc9f466c5a7b5c97a1b83a10fbe277341a300
Summary:
The mockall crate's `automock` attribute previously created nondeterministic output, which leads to frequent random "Found possibly newer version of crate" failures in Buck builds that involve cache.
The affected trait in Conveyor is:
https://www.internalfb.com/code/fbsource/[4753807291f7275a061d67cead04ea12e7b38ae2]/fbcode/conveyor/common/just_knobs/src/lib.rs?lines=13-23
which has a method with two lifetime parameters. Mockall's generated code shuffled them in random order due to emitting the lifetimes in HashSet order. The generated code would randomly contain one of these two types:
`Box<dyn for<'b, 'a> FnMut(&str, Option<&'a str>, Option<&'b str>) -> Result<bool> + Send>`
`Box<dyn for<'a, 'b> FnMut(&str, Option<&'a str>, Option<&'b str>) -> Result<bool> + Send>`
Reviewed By: jsgf
Differential Revision: D30656936
fbshipit-source-id: c1a251774333d7a4001a7492c1995efd84ff22e5
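The root cause above can be illustrated in a few lines: iterating a hash-based collection yields an unspecified order, so anything generated from that iteration differs between runs unless the items are sorted first. This is a hypothetical illustration of the fix's principle, not mockall's actual code.

```python
def render_lifetimes(lifetimes):
    """Sketch: emit lifetimes in sorted order so generated code is
    deterministic regardless of set iteration order."""
    return "for<" + ", ".join(sorted(lifetimes)) + ">"
```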
Summary: Adds an option to print the path to the eden log file. Similar to `eden pid`, this can be used for shell one-liners.
Reviewed By: chadaustin
Differential Revision: D30558294
fbshipit-source-id: ca70addaef2093e10f0321bae0cff3b1bfc7dc75
Summary: `eden debug log --upload` fits in better with the format of the other cli tools (rather than `eden debug log upload`)
Differential Revision: D30557691
fbshipit-source-id: 32e47e1487703560f2adb5f0f79f1002d29eea93
Summary:
In a previous diff we made sparse matchers become union matchers, since
they are a collection of each individual sparse profiles matcher. In order to
maintain the performance benefits of having sparse computations run on
non-python matchers, we need to update the matcher extractor to support union
matchers.
Reviewed By: quark-zju
Differential Revision: D30588256
fbshipit-source-id: 15014be844e1d713e19ae8f2959d947516b4e3c7
Summary:
We were copy/pasting metadata.get("version", "1") everywhere. Let's
make it a helper function.
Differential Revision: D30586162
fbshipit-source-id: ff6a9706f1970f84ffeb7de0e1362c3ba507fc00
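The helper might look something like this (the function name is hypothetical; only the `metadata.get("version", "1")` expression comes from the summary above):

```python
def get_version(metadata):
    """Sketch of the helper: centralize the default version so callers
    stop copy/pasting metadata.get("version", "1")."""
    return metadata.get("version", "1")
```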
Summary:
Sparse profiles should be roughly scoped around the files needed to
work on a certain product. If an engineer needs to work on multiple products
they should be able to enable multiple profiles.
Previously, multiple v2 profiles would be combined into an ordered list of
include/exclude rules, which meant that profiles enabled later could exclude
files included by the earlier profiles.
To fix this, let's treat each profile separately and create a matcher for each.
We then combine these into a union matcher, which means we're guaranteed to have
all the files that each profile specifies.
Differential Revision: D30586161
fbshipit-source-id: 2e04cfdba670ffce381a7c041706f315775ad7b0
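The union semantics described above can be sketched with per-profile matchers combined by `any()`: a file is included if any profile's matcher includes it, so a later profile can no longer exclude files an earlier profile includes. The matcher shape (a callable on a path) is a hypothetical simplification.

```python
class UnionMatcher:
    """Sketch: union of per-profile matchers, guaranteeing the union
    of all files that each profile specifies."""

    def __init__(self, matchers):
        self._matchers = matchers

    def __call__(self, path):
        # Included if any per-profile matcher includes the path.
        return any(m(path) for m in self._matchers)
```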
Summary:
In a future diff we'll process sparse profiles differently at the
matcher creation level. To do so we need to expose the profile object to that
layer. Let's do it by storing the profile instead of just the profile name.
Differential Revision: D30586163
fbshipit-source-id: d90343b4101c43fbd838512289362aca7c3f816a
Summary:
With lazy changelog, it is possible that `vertex_name` is unable to translate
an id to a name in a non-async context. Do not treat this as an error.
Differential Revision: D30615948
fbshipit-source-id: 4e7abd77c6eb116db00e25489685563b7cf78a9c
Summary: The segments are stored in the shared `.hg`, not in the local repo's `.hg`.
Differential Revision: D30615949
fbshipit-source-id: 9d2b7c1ce245553a2df070b066429fbcead5d827
Summary:
Similarly to the previous diff, reducing the lock scope will improve
concurrency leading to higher performance in EdenFS.
Reviewed By: andll
Differential Revision: D30595787
fbshipit-source-id: 1d52e4a8d362f7e2e3e18c2a57a3ebb7628f549e
Summary:
Similarly to the previous diff, let's not hold any read/write locks when not
needed. This will improve concurrency of the code.
Reviewed By: andll
Differential Revision: D30595786
fbshipit-source-id: 6ea6c689e4deca713051a9f3611647334c528bc7
Summary:
In the context of EdenFS, during a prefetch operation, EdenFS will call into
the revisionstore (via the backingstore) from multiple threads and use the
ContentStore::prefetch method in each of them with batches of keys.
Unfortunately, the IndexedLogDatastore lock is held for writing both when serializing
fetched data and when writing to the underlying IndexedLog. The serialization
part can be fairly CPU intensive as it will compress data with LZ4, but doesn't
require the lock to be held. If a concurrent get_missing call to the
IndexedLogDatastore is made, that will thus wait for the compression to
complete, reducing overall concurrency and fetching speed.
Reviewed By: andll
Differential Revision: D30583566
fbshipit-source-id: 06f5f4988c1bc911ae155189317232b54915a5cf
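The fix described above can be sketched as: do the CPU-heavy compression before taking the lock, and hold the lock only for the actual store write, so concurrent readers (e.g. a `get_missing`-style call) are not blocked during compression. The original uses LZ4 in Rust; `zlib` stands in here, and the `Store` shape is hypothetical.

```python
import threading
import zlib

class Store:
    """Sketch: compress outside the critical section."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def add(self, key, data):
        compressed = zlib.compress(data)  # CPU-heavy work, no lock held
        with self._lock:                  # lock only for the write itself
            self._entries[key] = compressed

    def get(self, key):
        with self._lock:
            blob = self._entries.get(key)
        return None if blob is None else zlib.decompress(blob)
```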
Summary: Add a new `BackingStore` implementation based on `scmstore` instead of `ContentStore` and allow the backend to be selected with the `scmstore.backingstore` config.
Reviewed By: andll
Differential Revision: D29672974
fbshipit-source-id: dc6b8662903bcfc941b586544aad487de6b8c956
Summary: Introduce a new `StoreValue` trait, implemented for `StoreFile`, for use in `CommonFetchState`, which will be shared between `TreeStore` and `FileStore`.
Reviewed By: kulshrax
Differential Revision: D30295987
fbshipit-source-id: 862349980befa13cb3d420ee617e4fe5464d890a
Summary: Introduce a new `StoreAttrs` trait, implemented for `FileAttributes`, for use in `CommonFetchState`, which will be used by both `TreeStore` and `FileStore`.
Reviewed By: kulshrax
Differential Revision: D30295969
fbshipit-source-id: 603242fe08bea72035b1b1d0f37c46514e1fb92b
Summary: Refactoring in preparation for updating `TreeStore` to match `FileStore`
Reviewed By: andll
Differential Revision: D30295957
fbshipit-source-id: 0f1677311eb2578d8e0dae12c07a1599edc3b500
Summary: Making `FetchResults` generic in preparation for using it with trees too.
Reviewed By: andll
Differential Revision: D30286592
fbshipit-source-id: a48cf2dbcdcc2c4b8a102eaa02ac465c367c6793
Summary: I'll be adding metrics to the `TreeStore` soon, so I'm refactoring code that will be shared between files and trees. This change moves the non-file-specific metrics types up to the top-level `scmstore` module, rather than inside `scmstore::file`.
Reviewed By: andll
Differential Revision: D30284933
fbshipit-source-id: 31f48312bca8d75d4893220cd189b9735a37a5a0
Summary: Helps figure out what happens to metalog internally.
Differential Revision: D30563249
fbshipit-source-id: 10323d36d762edda93206dd01c88d1f0d8abdf8d
Summary:
The failpoint feature supports more complex injections, such as sleep, or fail
after a few times. There is no need to keep the adhoc faultinjection feature.
Differential Revision: D30495223
fbshipit-source-id: b5613811e489a5a52e9c0dd1ebf1096c848a402b
Summary: Currently, tree imports are queued regardless of whether they are in the `hgcache`. This adds unnecessary delay, especially if the queue is busy (the importer takes a long time and causes the queue to backlog). This diff adds logic to check whether the tree is in the `hgcache` before enqueuing a tree import request.
Reviewed By: xavierd
Differential Revision: D30514871
fbshipit-source-id: eb23f64b7f059832571f957fb67d18c3821d2844
Summary:
This library is a more general version of the `panic_unpack` lib I
made in fbcode. It is MIT/Apache licensed.
Reviewed By: dtolnay
Differential Revision: D30607308
fbshipit-source-id: ee4fad3924fdae021753055cd3fd88c99cb99512
Summary:
This allows us to insert FAILPOINTS in Python so we can use sleep, return error
etc.
Differential Revision: D30495224
fbshipit-source-id: aef56d03bc32eefb69573cfa586aa63a301edffc
Summary:
I abandoned D30603353 in favor of this one because it's cleaner.
We don't need the repo name in every hook, just two of them.
This will allow us to have predefined, named hook configs that are repo-agnostic. E.g. when repositories are similar, they could share one set of hook configs in configerator.
There were two hooks that had repo_name in configerator hook config: verify_integrity and verify_reviewed_by
Reviewed By: StanislavGlebik
Differential Revision: D30605229
fbshipit-source-id: c310b16b564808d0dc0909d21cc3521a57e06fad
Summary: The result is too flaky now that we've introduced threading.
Reviewed By: andll
Differential Revision: D30607733
fbshipit-source-id: f8bfa2a57d427731fb4ac3011f4364190a83b771
Summary:
Some code in the HgDatapackStore is overly complicated due to the fact that
revHash returns an owned Hash, forcing the code to copy it into a temporary
vector. By having a method that can directly return a slice to the hash, this
issue disappears, so let's add it.
Reviewed By: chadaustin
Differential Revision: D30582458
fbshipit-source-id: dc102117bc82ab72378293c0abfe9acfd862e9e6
Summary: Cleaned up all remaining usages of this deprecated API in the CTP codebase.
Differential Revision: D30517771
fbshipit-source-id: 6b2c7fb6c569bf5a928a7eec60fdd890baad312f
Summary:
Handling of mutable renames was incorrect for two reasons:
1) We didn't add an entry to history graph, so only a single changeset before
rename was returned. That was easy to "fix" (just add a new entry to history
graph), but...
2) ...all history operations now have to use a different path (the source of
the rename path).
To fix it let's track not just the changeset id, but also the path for the
given changeset id. Since the path can potentially be large I wrapped it into
Arc to avoid expensive clones.
Differential Revision: D30576342
fbshipit-source-id: a99f6269c34b0a0c626104ec47c9392f984328fb
Summary:
This diff calls the `/:repo/snapshot` EdenApi endpoint added on D30514854 (ab17c4d181) from the `hg snapshot restore` command.
For now, it just prints the parent of the snapshot, but in next diffs it will update to it and restore the dirty changes.
Reviewed By: StanislavGlebik
Differential Revision: D30517984
fbshipit-source-id: e1381eaed561a7184ee02ab99d0282f11a1d944f
Summary: This diff adds the `fetch_snapshot` method to the EdenApi trait, implements it for talking with the EdenApi service, and also adds python bindings for ease of use from python.
Reviewed By: StanislavGlebik
Differential Revision: D30517973
fbshipit-source-id: 41c24ba25040b397b7d739c2885a47acfb9100d2
Summary:
I noticed that every time we fetch a blob from hg, we calculate its SHA-1 hash and put it into the metadata table.
Both calculating the SHA-1 of the content and writing it to RocksDB are fairly expensive, and it would be nice if we could skip doing so in some cases.
In this diff I use an inexpensive cache check to see if we already calculated metadata for a given blob, and skip the recalculation if so.
In terms of performance, this reduces blob access time in the hot case from **0.62 ms to 0.22 ms**.
[I still need to do some testing with buck, but I think this should not block the diff since it seems fairly trivial.]
This is a short-to-medium-term fix; the longer-term solution will be keeping hashes in Mercurial and fetching them via EdenAPI, but that will take some time to implement.
Reviewed By: chadaustin, xavierd
Differential Revision: D30587132
fbshipit-source-id: 3b24ec88fb02e1ea514568b4e2c8f9fd784a0f10
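The cache check above can be sketched as simple memoization keyed by blob id: consult the cheap cache first, and only compute (and store) the SHA-1 on a miss. The class and counter are hypothetical illustration, not the EdenFS code.

```python
import hashlib

class BlobMetadataCache:
    """Sketch: skip the expensive SHA-1 recalculation for blobs whose
    metadata was already computed."""

    def __init__(self):
        self._cache = {}
        self.computations = 0  # for illustration: counts cache misses

    def metadata(self, blob_id, content):
        cached = self._cache.get(blob_id)
        if cached is not None:
            return cached  # hot case: no hashing, no write-back
        self.computations += 1
        meta = (hashlib.sha1(content).hexdigest(), len(content))
        self._cache[blob_id] = meta
        return meta
```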
Summary: Similarly to the enqueue benchmark, let's have a dequeue benchmark.
Differential Revision: D30560489
fbshipit-source-id: ae18f7e283e4bab228aaa0f58bff2e6f2cfa3021
Summary:
In order to enqueue and find an element in a hash table, the key needs to be
hashed. Hashing a HgProxyHash relies on hashing a string which is significantly
more expensive than hashing a Hash directly. Note that they both represent the
same data and thus there shouldn't be more collisions.
Reviewed By: chadaustin
Differential Revision: D30520223
fbshipit-source-id: 036007c445c28686f777aa170d0344346e7348b0
Summary:
Allocations are expensive, especially when done under a lock, as this lengthens
the critical section and reduces the potential concurrency. While this yields a
1.25x speedup, it is more of a sideways improvement, as the allocation is now
done prior to enqueuing. This also means that de-duplicating requests is now
more expensive, since the allocation happens even for duplicates; but
de-duplication is the uncommon code path, so the tradeoff is worthwhile.
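The shape of the change can be sketched like this, with a hypothetical `Queue` type (the real queue holds import requests, not integers):

```rust
use std::sync::Mutex;

/// Hypothetical queue demonstrating allocation outside the lock.
struct Queue {
    inner: Mutex<Vec<Box<u32>>>,
}

impl Queue {
    fn new() -> Self {
        Queue { inner: Mutex::new(Vec::new()) }
    }

    fn enqueue(&self, request: u32) {
        // Allocate before taking the lock...
        let boxed = Box::new(request);
        // ...so the critical section only covers the push.
        self.inner.lock().unwrap().push(boxed);
    }

    fn len(&self) -> usize {
        self.inner.lock().unwrap().len()
    }
}
```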
Reviewed By: chadaustin
Differential Revision: D30520228
fbshipit-source-id: 99dea65e828f9c896fdfca6b308106554c989282
Summary: F14 hash maps are significantly faster than std::unordered_map.
Reviewed By: chadaustin
Differential Revision: D30520225
fbshipit-source-id: d986908c5eac17f66ae2c7589f134c430a3c656e
Summary:
When turning on the native prefetch, EdenFS will enqueue tons of blob requests
to the import request queue. The expectation is then that the threads will
dequeue batch of requests and run them. What is being observed is however
vastly different: the dequeued batches are barely bigger than 10, far lower
than the batch capacity, leading to fetching inefficiencies. The reason for
this is that enqueuing is too costly.
The first step in making enqueuing less costly is to reduce the number of times
the lock needs to be acquired, by moving the de-duplication inside the enqueue
function itself. Besides reducing how often the lock is held, this also halves
the number of allocations done under the lock.
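A sketch of de-duplicating under a single lock acquisition; the names here are illustrative, not EdenFS's actual types:

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Queue plus pending-set guarded by one mutex, so de-duplication and
/// insertion happen under a single lock acquisition.
struct ImportQueue {
    state: Mutex<(HashSet<u64>, Vec<u64>)>,
}

impl ImportQueue {
    fn new() -> Self {
        ImportQueue { state: Mutex::new((HashSet::new(), Vec::new())) }
    }

    /// Returns true if the request was newly enqueued, false if it was
    /// de-duplicated against an already-pending request.
    fn enqueue(&self, id: u64) -> bool {
        let mut guard = self.state.lock().unwrap();
        if !guard.0.insert(id) {
            return false; // duplicate: the uncommon path
        }
        guard.1.push(id);
        true
    }
}
```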
Reviewed By: chadaustin
Differential Revision: D30520226
fbshipit-source-id: 52f6e3c1ec45caa5c47e3fd122b3a933b0448e7c
Summary:
It turns out that we do want to use a Future to make sure that the tracebus and
watches are completed on the producer and not on the consumer of the future. We
could use a `.via(inline executor)` but the code becomes less readable, so
let's just revert the diff.
Reviewed By: chadaustin
Differential Revision: D30545721
fbshipit-source-id: 524033ab4dbd16be0c377647f7f81f7cd57c206d
Summary:
I need to be able to run 'hg sparse files $PROFILE' in an Eden checkout
for a Sandcastle job that analyzes sparse profiles. We can't run 'hg sparse' at
all in Eden repos, so instead I run it in the backing repo. But that repo is
checked out to the null commit, so let's support --rev on 'hg sparse files' so I
can inspect an arbitrary rev.
Differential Revision: D30561529
fbshipit-source-id: 93b46caa9b63637bf4d63d5438bd23cbffb3983a
Summary:
This config option was used to slowly roll out LFS for a repo.
However, it is no longer used and can therefore be removed.
Reviewed By: StanislavGlebik
Differential Revision: D30511880
fbshipit-source-id: 59fe5925cc203aa609488fdf8ea29e9ff65ee862
Summary: Apparently OSX has different ls -R output than linux.
Reviewed By: singhsrb
Differential Revision: D30566611
fbshipit-source-id: 2f232b12d1971bea18c7131c1ec82244252527c7
Summary:
This adds a new endpoint to EdenApi: `/:repo/snapshot`.
It fetches information about a snapshot (for now, parents and files).
This will be used by the `hg snapshot restore` command.
- Parent will be used to do a normal "hg update" to it
- File changes will be used to apply the working state file changes on top of it.
Reviewed By: ahornby
Differential Revision: D30514854
fbshipit-source-id: b6c6930410cf15fe874eca1fce54314e5011512a
Summary:
Very basic shell for the `hg snapshot restore` command.
For now it just reads the id and prints it to stdout. Subsequent diffs will add functionality to it.
Reviewed By: StanislavGlebik
Differential Revision: D30449567
fbshipit-source-id: 9f4e0bb284dff7e6ffb4398942f3247a09511933
Summary:
This refactors the complete_trees endpoint to use the abstraction added on D30451710.
This one needed some extra refactoring, which is why I separated it into its own diff:
- It was the only endpoint that returned HTTP 400 errors as well as 500s.
- I made the handler trait return a custom error that may be a 400 or a 500 error. The default is 500, and, like anyhow's Error, errors convert to it automatically to keep the happy path simple.
- Also added a type alias that makes the return type simpler to look at.
- Then moved this endpoint to use EdenApiHandler, while keeping the previous behaviour of returning a 400 error when the path is wrong.
Reviewed By: StanislavGlebik
Differential Revision: D30483501
fbshipit-source-id: 5cb4cd7f3fc2bbdafb9f0e6a78739086c829eee1
Summary:
This diff refactors most endpoints to use the new abstraction added on D30451710.
There are still a few endpoints that couldn't be refactored (see summary of previous diff).
Reviewed By: StanislavGlebik
Differential Revision: D30483500
fbshipit-source-id: ea2cc24afb119bf63a45a6f4bf48706a004743cc
Summary:
Uses the abstraction built on D30451710 to make more endpoints simpler. It needed to be made a little more generic by using streams instead of `Vec`.
Some endpoints can't be migrated as of now:
- hash_to_location and trees, as they use a custom CBOR stream. We might want to implement a similar abstraction that can deal with custom CBOR streams, though since it's just two endpoints, maybe we should just move them to the same format as the others, if possible.
- revlog_data, as it doesn't use the Wire format. This seems to be legacy and should probably be moved to the wire format, though the transition is tricky.
- Some endpoints have no input, and this abstraction assumes the input is a wire CBOR struct.
- Some endpoints don't output a stream, just a single CBOR object; that will need a different EdenApiHandler trait, but it's doable.
The rest seem to be possible to migrate with current abstractions.
Reviewed By: StanislavGlebik
Differential Revision: D30453597
fbshipit-source-id: 0d1068021e83f3e1143f3bc9ee68d20e4c34cb50
Summary:
While I was creating EdenAPI endpoints, I noticed that there was quite a bit of duplicated code in it, for example, the logic to read things from inside a request and write a response.
It also required several "manual" steps to create a new one, and though the types made it hard to make a mistake, it was still possible, as nothing guaranteed you were using the same types everywhere. This diff aims to fix all of that.
Creating an endpoint, before:
1. Create a handler function that receives `State` and returns `TryIntoResponse`, copying the initial/final handling from other functions: adding handler info, parsing the path and query-string extractors, creating a repo context, and parsing the wire request.
2. Write a PathExtractor struct by copying the other structs; it must take a repo parameter.
3. Create a new variant in EdenApiMethod and update related methods.
4. From the handler function, convert stuff and call your inner logic function, which is the only part that actually differs between endpoints.
5. Write a wrapper for your handler, which can be done using `define_handler!`.
6. Call `route.post(URL)`, and don't forget to use the same Path/QueryStringExtractors.
Creating an endpoint, now:
1. Write a struct that implements `EdenApiHandler`, in which you specify the HTTP method, endpoint, and inner logic.
2. Create a new variant in EdenApiMethod and update related methods.
3. Call `Handlers::setup::<YourHandler>(route)`.
This makes it easier to create a new endpoint and harder to make mistakes while doing so. It could be even easier with some procedural macro dark magic, but this is already pretty good and simple.
**On this diff, I only refactor one of the endpoints, to make it easier to review the new "refactor logic", on the next diffs I'll move more endpoints**
Reviewed By: StanislavGlebik
Differential Revision: D30451710
fbshipit-source-id: 9de4de483d4a755bfceafb6d52ec79bbf3f3b5e7
Summary:
Currently we have a problem in tunables: if a tunable was added and then
removed, its value is not reset until the binary is restarted. This is not
great, as it makes reverts harder (e.g. if a diff added "tunable => true", then
reverting the diff won't make any difference until the server is restarted; to
revert the value of the tunable we'd need to either land a diff that sets
"tunable => false" or restart the binaries, neither of which is great).
This is counter-intuitive, and this diff fixes it. From what I can tell, this
problem affected only strings/bools/ints; per-repo values weren't affected.
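One way to get this behaviour, sketched in Rust with a hypothetical `Tunables` type: replace the whole map on refresh instead of merging into it, so removed tunables fall back to their defaults.

```rust
use std::collections::HashMap;

/// Hypothetical tunables container (bools only, for brevity).
#[derive(Default)]
struct Tunables {
    bools: HashMap<String, bool>,
}

/// Replace the map wholesale on refresh instead of merging into it:
/// a tunable that disappeared from the config reverts to its default
/// (i.e. absent) without a binary restart.
fn refresh(tunables: &mut Tunables, new_config: &HashMap<String, bool>) {
    tunables.bools = new_config.clone();
}
```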
Reviewed By: Croohand
Differential Revision: D30544003
fbshipit-source-id: 353970a259a7faa0682866aae9a5699b8a783b22
Summary: It's convenient to be able to derive multiple derived data types at once
Reviewed By: mitrandir77
Differential Revision: D30545263
fbshipit-source-id: 79ba541981af1340a59145f44a2f7d5a54cc28e1
Summary:
fburl.com/botsdiffs is a random diff
fburl.com/botdiffs is the wiki
Differential Revision: D30547647
fbshipit-source-id: 337d6457cb6403f11fbbc9654f3d34f50d69b0e5
Summary:
Sparse profile change computation is quite slow, since globs mean it
has to iterate over a large portion of the tree even if the sparse profile
change is small. Let's make the BFS async and parallel.
Reviewed By: andll
Differential Revision: D30554961
fbshipit-source-id: bce461ad0b21d1dab1013cf3f501b5744b295c30
Summary:
Avoiding the GIL can speed up tree diff/iteration a lot. Let's extract
the pure Rust matcher when possible. We also convert a Python differencematcher
into a rust DifferenceMatcher when possible, since that is the core use case for
sparse profile change computation.
Reviewed By: andll
Differential Revision: D30553727
fbshipit-source-id: 41d2f13130da18a55f64f9f0f047825bd24144c6
Summary:
In a future diff we'll make tree BFS iteration async and parallel. To
do so we need the matchers to be Send/Sync. Let's update the types for now.
Reviewed By: andll
Differential Revision: D30553731
fbshipit-source-id: 82a441a0caa86b65ca7eaea7283bbbbbb79d5c88
Summary:
In a future diff we'll want to extract matchers into their pure Rust
forms when possible. Let's create a helper function for matcher extraction. In a
later diff we'll make this function smart.
Reviewed By: andll
Differential Revision: D30553730
fbshipit-source-id: 0612a3be54f0286308fc4c43d9a2e5e9aa431a16
Summary:
In a later diff we'll be making sparse profile change computation use a
pure Rust matcher. To do so, we need to be able to represent a difference
matcher in Rust.
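The core of such a matcher is small. A hedged sketch follows; the `Matcher` trait and `Prefix` helper here are illustrative stand-ins, not the crate's real API:

```rust
/// Illustrative matcher trait; the real one has more methods (e.g.
/// directory pruning hints).
trait Matcher {
    fn matches_file(&self, path: &str) -> bool;
}

/// A difference matcher: matches what `include` matches, minus what
/// `exclude` matches, mirroring Python's differencematcher.
struct DifferenceMatcher<A, B> {
    include: A,
    exclude: B,
}

impl<A: Matcher, B: Matcher> Matcher for DifferenceMatcher<A, B> {
    fn matches_file(&self, path: &str) -> bool {
        self.include.matches_file(path) && !self.exclude.matches_file(path)
    }
}

/// Trivial prefix matcher used only for the demonstration.
struct Prefix(&'static str);

impl Matcher for Prefix {
    fn matches_file(&self, path: &str) -> bool {
        path.starts_with(self.0)
    }
}
```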
Reviewed By: andll
Differential Revision: D30553729
fbshipit-source-id: df2194bbacaa7924bafbfc33e00f25e524df4613
Summary:
The diff algorithm kept track of left vs right stores, and applied the
left vs right tree lookups to the appropriate store. In reality we only ever
have one store, so we're just wasting cycles doing two sequential lookups. Let's
just get rid of the distinction.
Reviewed By: andll
Differential Revision: D30553726
fbshipit-source-id: 72de95d298dad71dc04728d5140b4d489073a33c
Summary:
When testing hg checkout with an empty cache, I noticed the tree
diff'ing phase would alternate between downloading data and doing cpu work. This
implied it was all synchronous. This diff speeds it up by doing the downloading
on a background thread, while the main thread does the actual diff.
Ideally we'd make the actual diff itself parallel as well, but the tree
structure is a tree of semi-mutable references, which would require a larger
refactoring.
Reviewed By: andll
Differential Revision: D30553728
fbshipit-source-id: 4ea953909827cf1ec4fd67ac297f63bfadd67483
Summary:
This change has the unintended effect of causing any Thrift calls to
potentially issue a recursive EdenFS call due to symlink resolution requiring
running `readlink` on the root of the repo itself.
Fixing this isn't really possible, so let's revert the change altogether; we
can force clients to issue a realpath before issuing EdenFS Thrift calls.
Reviewed By: kmancini
Differential Revision: D30550796
fbshipit-source-id: 9494c8e08c8af2392eeb344879f156cb56f93ea6
Summary:
The documentation allows enqueue to skip testing whether the queue is still
running: if stopping is done in the owner's destructor, no enqueue can
logically happen afterwards, and thus we do not need to protect against it.
Reviewed By: chadaustin
Differential Revision: D30520227
fbshipit-source-id: 9d6280ccd7fe875cd06b0746151a2897d1f98d61
Summary:
When trying to push thousands of requests to the queue, the dequeue side only
manages to pull batches of ~10 requests at most. Let's measure the cost of
enqueue so we can optimize it.
Reviewed By: chadaustin
Differential Revision: D30503110
fbshipit-source-id: d06ae6741b13b831fa3711fb2dd0e38c3e54193c
Summary:
If the hg sync job is pushing a commit, then we have no choice but to accept it,
even if it's too big. So let's not fail when too many commits are pushed.
Reviewed By: ahornby
Differential Revision: D30535025
fbshipit-source-id: eb607a8fbd691d6591ad990e0920411b1ad2f09c
Summary:
Original commit changeset: b4d12a0de8af
Back out "[hg] pymatcher: extract pure Rust matchers when possible"
This is needed to revert D29971824 (13614c3ed4) which makes lots of mononoke tests time out (T99068634). I'm not
sure where the bug is so I'll leave the fix up to DurhamG
Reviewed By: StanislavGlebik
Differential Revision: D30540881
fbshipit-source-id: 73ed2b7b63d33fbe3a76ee93d86241e37c4b2dd0
Summary:
Original commit changeset: edd0b9fb7821
Back out "[hg] pymatcher: extract pure Rust matchers when possible"
This is needed to revert D29971824 (13614c3ed4) which makes lots of mononoke tests time out (T99068634). I'm not
sure where the bug is so I'll leave the fix up to DurhamG
Reviewed By: StanislavGlebik
Differential Revision: D30540880
fbshipit-source-id: 8cac18ae17531ec72f255f35a62b0e5337f9d36e
Summary:
Original commit changeset: 2f04db619a20
Back out "[hg] pymatcher: extract pure Rust matchers when possible"
This is needed to revert D29971824 (13614c3ed4) which makes lots of mononoke tests time out (T99068634). I'm not
sure where the bug is so I'll leave the fix up to DurhamG
Reviewed By: StanislavGlebik
Differential Revision: D30540885
fbshipit-source-id: ef0d07b60a76cd4efaf29fb93424a5231d6e6ebf
Summary:
Original commit changeset: 8fdfa87bfc16
Back out "[hg] pymatcher: extract pure Rust matchers when possible"
This is needed to revert D29971824 (13614c3ed4) which makes lots of mononoke tests time out (T99068634). I'm not
sure where the bug is so I'll leave the fix up to DurhamG
Reviewed By: StanislavGlebik
Differential Revision: D30540882
fbshipit-source-id: 1db5b4f22ee5b5479fc444554f61e189760fee6b
Summary:
Original commit changeset: ff201a014ca1
Back out "[hg] pymatcher: extract pure Rust matchers when possible"
This is needed to revert D29971824 (13614c3ed4) which makes lots of mononoke tests time out (T99068634). I'm not
sure where the bug is so I'll leave the fix up to DurhamG
Reviewed By: StanislavGlebik
Differential Revision: D30540883
fbshipit-source-id: a9511690462b3137f21316e5f682f07e61cb5a48
Summary:
This is needed to revert D29971824 (13614c3ed4) which makes lots of mononoke tests time out (T99068634). I'm not
sure where the bug is so I'll leave the fix up to DurhamG
Reviewed By: StanislavGlebik
Differential Revision: D30540884
fbshipit-source-id: 4e209fccf1c94c871aa9652ee733e7ce0eaf4279
Summary:
This reverts D30168108 (058fff364d) which makes lots of mononoke tests time out (89 tests according to bisect tool, see T99069004). I'm not
sure where the bug is so I'll leave the fix up to DurhamG
Original commit changeset: 4e0de01c6ef7
Reviewed By: StanislavGlebik
Differential Revision: D30539931
fbshipit-source-id: 8b5772299d1c06b0ef2bd73465576373095c6f61
Summary:
As explained in previous diff summary - we need to make those deletions
before we can fixup history. This diff adds relevant args to megarepo tool.
Reviewed By: StanislavGlebik
Differential Revision: D30456464
fbshipit-source-id: 894e27750684f86e18184e3b98beeda8dfb5b53d
Summary:
In one of our large repos the files were synced using ad-hoc scripts between
the repos before the megarepo. This made the history and blame full of
irrelevant commit messages like "sync from repo x". The new functionality will
allow us to re-merge those files with correct histories. To do that we need to
delete:
* the files with wrong history from the master branch
* all other files from the source branch with the correct history, so only those are merged in
As an optimisation, we don't remove files that are identical on both branches
from the branch with correct history. This makes the merge bonsai smaller while
keeping the history the same (at least when unode v2s are enabled).
Reviewed By: StanislavGlebik
Differential Revision: D30456465
fbshipit-source-id: f14e8caf8e104c8d8069edc098cfe6419bbe407e
Summary:
The current logic in the megarepo merge tool command (which creates arbitrary
merges in the repo) checks whether there's an intersection between the manifest
leaf entries of both commits. This doesn't allow merges where exactly the same
files exist on both sides of the merge.
I've updated the logic to look at the fsnode manifest diff. As the old futures
code was harder to work with, I've updated the code I touched to async/await.
NOTE: I think that both new and old logic lack one important check: file vs
directory conflicts. We should address this separately.
Reviewed By: StanislavGlebik
Differential Revision: D30456466
fbshipit-source-id: 70303e8ff8ff9a42e5dcff5c1fe6e4912910d4d9
Summary: Generalise the hook slightly; instead of relying on a specified file, use a revset, which opens this out to more use cases (e.g. blocking revisions in fbsource that aren't on our `master` branch)
Reviewed By: StanislavGlebik
Differential Revision: D30511155
fbshipit-source-id: 6b73c7c3e6caf2d670632110619eacb7b6216355
Summary:
Adds a helper "hg debugsparseprofilev2 $PROFILE" command that prints
out the difference in files matched by the profile using v1 and v2. This is
useful when migrating a profile to v2 and checking you haven't broken anything.
Reviewed By: kulshrax
Differential Revision: D30303868
fbshipit-source-id: 1b1d8b197470692fa13457ecb4c9344104b6d9c0
Summary:
v1 sparse profiles had excludes always take precedence over includes.
This made them hard to use and resulted in sparse profiles that were too wide.
Let's add a v2 sparse profile that respects the order of include/exclude rules.
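The difference between v1 and v2 semantics can be sketched as "last matching rule wins". This is an illustrative model over path prefixes, not the actual sparse matcher implementation:

```rust
/// Illustrative sparse rule over path prefixes.
enum Rule {
    Include(&'static str),
    Exclude(&'static str),
}

/// v2 semantics sketch: rules apply in order and the last matching
/// rule wins, rather than excludes always beating includes as in v1.
fn profile_matches(rules: &[Rule], path: &str) -> bool {
    let mut included = false;
    for rule in rules {
        match rule {
            Rule::Include(prefix) if path.starts_with(prefix) => included = true,
            Rule::Exclude(prefix) if path.starts_with(prefix) => included = false,
            _ => {}
        }
    }
    included
}
```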
Reviewed By: andll
Differential Revision: D30284698
fbshipit-source-id: 6765b8487695ec6cc7f8027e4da98ca65957b5d0
Summary:
In a future diff we'll want to handle ordered lists of include/exclude
rules. To do so we need to use the treematcher. Note, this means we can't use
regular expressions for sparse matchers now, but there's only one occurrence of
that in production.
Reviewed By: andll
Differential Revision: D30284700
fbshipit-source-id: c379e1feb233baee8c5ba2e1f57017b72ecaa835
Summary:
Currently excludes always take precedence over includes in sparse
configs. In a later diff we want to make this more flexible. To start with
though, let's refactor config loading to use a more "rules" oriented structure.
Excludes will still take precedence over includes, but will do so by being later
in the rule list. This will allow us to start changing rule ordering in a later
diff.
Reviewed By: andll
Differential Revision: D30284699
fbshipit-source-id: 955c016d67e22fce6b26ba7b8e1ffacc931989c8
Summary:
Previously the SparseConfig type was abused to mean three things:
1. The non-recursive contents of .hg/sparse
2. The non-recursive contents of a sparse profile.
3. The recursively expanded contents of enabled sparse rules and profiles.
In a later diff we'll be changing how includes/excludes work to make it more
flexible, and this reuse of the same type makes it complicated.
So let's split it into three distinct types. Later diffs will change them so
they aren't so similar. In doing so, we also slightly change the way sparse
profiles are loaded, to load them recursively.
Reviewed By: andll
Differential Revision: D30284701
fbshipit-source-id: f908ba167a04f9863ab75e1cd6a94027fa64607e
Summary:
add_sync_target might need to derive a lot of data, and it takes a long time to
do it. We don't have any resumability, so if it fails for any reason, then we'd
need to start over.
For now let's just retry a few times so that we don't have to start over
because of flakiness
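The retry wrapper can be as simple as a bounded loop. A generic sketch, not the actual Mononoke helper:

```rust
/// Bounded retry sketch: rerun a flaky operation up to `attempts`
/// times, returning the last error if every attempt fails.
fn with_retries<T, E, F>(mut attempts: u32, mut op: F) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
{
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) if attempts <= 1 => return Err(err),
            Err(_) => attempts -= 1,
        }
    }
}
```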
Reviewed By: mitrandir77
Differential Revision: D30511785
fbshipit-source-id: 1a9c5e62db366022ad487ed108dd41b1dea4caa2
Summary:
This diff starts making the python binding layer simpler. It's better to make the python-aware Rust code as simple as possible, to make reusability easier.
This diff moves the `uploadsnapshot` functionality to a "pure-rust" file, and makes it in an easily extensible way, so that when adding more functionality we don't end up with a huge file with everything.
Reviewed By: ahornby
Differential Revision: D30429524
fbshipit-source-id: e46e71a8993917720ab38b50a8c967564ac82fb6
Summary:
This fixes a small bug in snapshots, as noted in the `TODO` comment I deleted.
If you `hg rm` a file and then replace it with something else, we want that to be uploaded to the snapshot as well.
These types of files are uploaded as "UntrackedChange", which is a bit confusing but correct: it means the file was deleted and there are now untracked changes in it (similar to how a file marked as "untracked deletion" that didn't previously exist means it was `hg add`ed and then locally deleted).
Differential Revision: D30398143
fbshipit-source-id: dbd316fbcdc0e9b1781c5c967eb12c02805f4361
Summary:
Objects (for now, files and changesets) can be in the persistent or ephemeral blobstore (in any of the bubbles). This diff makes it so `UploadToken`s can reflect that.
Nicely, for objects in persistent blobstore, this shouldn't affect the size of anything shipped on the wire, as its serialisation gets skipped.
---
An unrelated but similar problem I noticed is that `UploadToken`s don't capture the repo the object was uploaded to. So it is theoretically possible to upload something in one repo and "prove" to mononoke that it was uploaded to another repo (well, once signatures are actually used). This seems very hard to actually do in practice without bad intent (or even with it), so I'm not sure if it's something we want to fix.
Reviewed By: StanislavGlebik
Differential Revision: D30396437
fbshipit-source-id: e278cd73061d3aaf7e3353236503d239a366ed5d
Summary:
The snapshot itself is uploaded to ephemeral blobstore, but before this diff, file content was still uploaded to the persistent one.
Server side logic to upload to bubble was already added some time ago, but we needed the client part of it.
There is a slight bug: the logic to upload files still checks whether the file was already uploaded to the **persistent** blobstore, so if it's there it won't be uploaded again to the ephemeral blobstore. This is fixed in the next diff (D30396437).
Differential Revision: D30396432
fbshipit-source-id: e6e5ec299ec5323503eedba32956569e16f3eb50
Summary:
Before this diff, only file changes (tracked or untracked) were stored in the changeset.
This diff makes it so deletions and untracked deletions are also stored in the changeset.
Differential Revision: D30373960
fbshipit-source-id: 4c1c40f4028e03a7858ce7ce55d385e02b672fc4
Summary:
This diff adds the `--bubble` argument to the `admin bonsai-fetch` command, which is useful to look at changesets stored in a bubble.
It also does some BlobRepo refactoring to this command, by building a smaller `BonsaiFetchContainer` object, instead of `BlobRepo`, and uses changes from D30368261 (6ed51f5514) to make that possible.
Reviewed By: StanislavGlebik
Differential Revision: D30370983
fbshipit-source-id: fb99d2f4534fed94c040a0f9ce371d7843114f51
Summary:
This diff makes snapshots be uploaded to the ephemeral blobstore.
Given the previous diffs which implemented a "repo view" and piped bubble id to edenapi server, on this diff I do some BlobRepo refactoring which allows CreateChangeset to use repo view instead of BlobRepo to save the snapshot changeset.
Reviewed By: StanislavGlebik
Differential Revision: D30370980
fbshipit-source-id: b1a6a7d9d061dd09d6973c0a6942c22e86387f4a
Summary:
In order to store the snapshots in the ephemeral blobstore, we'll need to pass the bubble via the eden api call.
This diff adds the plumbing for that, by adding bubble_id to the EdenAPI, though for now it doesn't do anything with the value.
Reviewed By: StanislavGlebik
Differential Revision: D30370981
fbshipit-source-id: 88f114798c9f8fd14acfdeed07e8796e6c3470fb
Summary:
Sparse profile change computation is quite slow, since globs mean it
has to iterate over a large portion of the tree even if the sparse profile
change is small. Let's make the BFS async and parallel.
Reviewed By: andll
Differential Revision: D30168108
fbshipit-source-id: 4e0de01c6ef7fe53f881bed722c8b406090091ef
Summary:
Avoiding the GIL can speed up tree diff/iteration a lot. Let's extract
the pure Rust matcher when possible. We also convert a Python differencematcher
into a rust DifferenceMatcher when possible, since that is the core use case for
sparse profile change computation.
Reviewed By: andll
Differential Revision: D30168109
fbshipit-source-id: 2466abf3362e947d1caf8769cca6b6fd54409b8d
Summary:
In a future diff we'll make tree BFS iteration async and parallel. To
do so we need the matchers to be Send/Sync. Let's update the types for now.
Reviewed By: andll
Differential Revision: D30168110
fbshipit-source-id: ff201a014ca1a34f0d03ca7b5b7e038c170a8354
Summary:
In a future diff we'll want to extract matchers into their pure Rust
forms when possible. Let's create a helper function for matcher extraction. In a
later diff we'll make this function smart.
Reviewed By: andll
Differential Revision: D30168106
fbshipit-source-id: 8fdfa87bfc1617bd5bfabc3e4b0870b64e8cf8fe
Summary:
In a later diff we'll be making sparse profile change computation use a
pure Rust matcher. To do so, we need to be able to represent a difference
matcher in Rust.
Reviewed By: andll
Differential Revision: D30168107
fbshipit-source-id: 2f04db619a20226ee5e8b2688df754d907f8b8c6
Summary:
The diff algorithm kept track of left vs right stores, and applied the
left vs right tree lookups to the appropriate store. In reality we only ever
have one store, so we're just wasting cycles doing two sequential lookups. Let's
just get rid of the distinction.
Reviewed By: andll
Differential Revision: D30132762
fbshipit-source-id: edd0b9fb7821851cd93df1f17fbf75319cad5813
Summary:
When testing hg checkout with an empty cache, I noticed the tree
diff'ing phase would alternate between downloading data and doing cpu work. This
implied it was all synchronous. This diff speeds it up by doing the downloading
on a background thread, while the main thread does the actual diff.
Ideally we'd make the actual diff itself parallel as well, but the tree
structure is a tree of semi-mutable references, which would require a larger
refactoring.
Reviewed By: quark-zju
Differential Revision: D29971824
fbshipit-source-id: b4d12a0de8af8e18e8a420d43088663a795db5f8
Summary:
Now that Links have an Arc under the hood, they can be copied. So we
don't need to store references in the traversal structures. This will make
upcoming parallelization changes much easier.
Reviewed By: andll
Differential Revision: D30132763
fbshipit-source-id: ff758de248f578fa7948982d8aa392ae8f034766
Summary:
We want to make tree traversals parallelized, but to do that it needs
to be safe to hand different parts of the tree to different threads. The current
structure involves the tree being a straightforward tree of
`Link->BTreeMap<Path, Link>`, and the algorithms largely act on references to
Links. This diff converts Link to actually contain an inner Arc, so in later
diffs we can stop using references and then start sending trees across threads.
Note: The change affects the mutability pattern for trees in an unusual way. See
code comments for details.
Note: The inner-Arc pattern was used (instead of wrapping Links in Arc) because
we generally don't want Links to be clonable, and we want to avoid the
accidental clones that Arc allows.
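A minimal sketch of the inner-Arc pattern. The `thread_copy` name is hypothetical; the point is that sharing a node is an explicit operation rather than a derived `Clone`:

```rust
use std::sync::{Arc, RwLock};

/// Sketch of the inner-Arc pattern: the Arc lives inside the Link, so
/// handing a node to another thread is an explicit `thread_copy`
/// rather than an accidental `Clone`.
struct Link {
    inner: Arc<RwLock<u32>>,
}

impl Link {
    fn new(value: u32) -> Self {
        Link { inner: Arc::new(RwLock::new(value)) }
    }

    /// Explicitly create another handle to the same underlying node.
    fn thread_copy(&self) -> Link {
        Link { inner: Arc::clone(&self.inner) }
    }

    fn set(&self, value: u32) {
        *self.inner.write().unwrap() = value;
    }

    fn get(&self) -> u32 {
        *self.inner.read().unwrap()
    }
}
```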
Reviewed By: andll
Differential Revision: D30132761
fbshipit-source-id: 8a47b88fc27993420af3d486cbf4e80209cdb3ac
Summary:
In a future diff we want to parallelize parts of the treemanifest diff
(like the fetching of data). To do so, we need to release the gil so other
threads can use it.
Reviewed By: kulshrax
Differential Revision: D29971827
fbshipit-source-id: 2869c24f9096497024e19ce13c8ed1ace9b660c3
Summary:
In a later diff we'll make tree diff'ing parallel, so we need the gil
to not be held. Let's switch to a matcher that acquires the gil on demand
instead of relying on it to be held.
Reviewed By: quark-zju
Differential Revision: D30132764
fbshipit-source-id: f06a6c5b76be7b39bf15b74639427324a55b82ed
Summary:
In a future diff we want to start allowing a background thread to fetch
tree data while the diff algorithm is running. To do so, we need the gil to not
be held during the diff algorithm. But, the diff algorithm needs to run the
matcher, and the matcher may be in Python. So let's introduce a new matcher
wrapper that acquires the gil.
In the future we should get rid of the python matchers entirely, which will
eliminate this problem.
Reviewed By: kulshrax
Differential Revision: D29971826
fbshipit-source-id: 8a9ba0ea65a0b4748e39178cdf4a08c922755b02
Summary:
In a future diff we'll want to run the treemanifest diff algorithm on a
separate thread from Python, so Python can be used for fetching from a parallel
thread. To do this, we need the tree to be accessible across threads, so let's
put it behind an Arc<RwLock<>>.
Reviewed By: quark-zju
Differential Revision: D29971825
fbshipit-source-id: 6b3ef1025eb7840b905bf60785e05da96980a2d7
Summary:
Previously treemanifest kept a reference to the Python store, and any
call would have to go up to Python then come back down to Rust. Since our stores
are now 100% Rust (at least at the top layer), let's just downcast it to the
appropriate Rust store and store that instead.
This helps in a future diff where we want to access the store without taking the
gil.
Reviewed By: kulshrax
Differential Revision: D29971828
fbshipit-source-id: 77ff11897045282c9e6a6029b126dcdd20c8e9db
Summary:
The diff introduces several small changes:
* It adds logging for the blob size, which can be useful to analyze latency of the `put`/`get` operations.
* It logs the multiplex id of the multiplexed blobstore configuration used.
* I also added sampling for `get`/`is_present` with the same rate as is used in the blobstore trace table (that seems reasonable to me). `put` is not sampled, because it's not in the blobstore trace. Errors and "some failed, others none" results are not sampled either.
* Also some small refactoring to make the code look better.
Reviewed By: StanislavGlebik
Differential Revision: D30490848
fbshipit-source-id: a4fef8a1f1f7622054c75afbe09fe4a55d44ac19
Summary: Added a kill switch to enable/disable predictive prefetch profiles similar to the existing one for regular prefetch profiles (D24803728 (7dccb8a49f)). This can be set manually in a user's config or via the cli `eden prefetch-profile disable-predictive/enable-predictive` commands.
Reviewed By: genevievehelsel
Differential Revision: D30404139
fbshipit-source-id: 01900f4030ef6991124f89a67ea404ff2f07ffeb
Summary:
Added `eden prefetch-profile activate-predictive/deactivate-predictive` subcommands to activate and deactivate predictive prefetch profiles. These update the checkout config to indicate whether predictive prefetch profiles are currently active, and store the overridden num_dirs if specified on activate (--num-dirs N). If activate is called twice with different num_dirs, the value is updated (only one is stored). Unless --skip-prefetch is specified, a predictive prefetch with num_dirs globs (or the default inferred in the daemon) is run.
Also added fetch-predictive [--num-dirs N], which will:
1. if num_dirs is specified: fetch num_dirs globs predictively
2. if num_dirs is not specified, and predictive fetch is active: get the active num_dirs from the checkout config and fetch globs predictively
3. if num_dirs is not specified, and predictive fetch is not active: fetch the default num_dirs (inferred in the daemon)
Added --if-active to fetch-predictive. If set, fetch will not run if predictive prefetch profiles have not been activated (predictive-prefetch-active in checkout-config). Used for post pull hook.
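The num_dirs resolution described above is a simple precedence chain, sketched here for illustration only (the real logic lives in the Python CLI and the daemon):

```rust
/// Illustrative precedence: an explicit --num-dirs wins, then the
/// value stored when the profile was activated, otherwise None so the
/// daemon infers its own default.
fn resolve_num_dirs(cli_num_dirs: Option<u32>, activated_num_dirs: Option<u32>) -> Option<u32> {
    cli_num_dirs.or(activated_num_dirs)
}
```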
Reviewed By: genevievehelsel
Differential Revision: D30306235
fbshipit-source-id: ba02c2bc976128704c8ab0c3d567637265b7c95d
Summary:
Part of Rust-analyzer update.
Updated the affected sources (migrated from Url#into_string to Into#into).
Reviewed By: jsgf
Differential Revision: D30344564
fbshipit-source-id: fc3ccbe25d7b3d9369a01dfb6b7f8e6a200a7083
Summary:
Made changes to ensure that numResults is always a 32 bit unsigned int, and startTime and endTime are 64 bit unsigned ints. This is to ensure consistency across the smartservice and the endpoint in the daemon.
Also, updated the scuba query in the smartservice to only consider dirs with > 1 access (may update this later to accept a configurable lower bound on access count, but for now, including access=1 doesn't make sense).
Reviewed By: genevievehelsel
Differential Revision: D30396526
fbshipit-source-id: 10e7bd969928da91ab29d413280a1ff956db438c
Summary:
This is now only used in HgQueuedBackingStore::logBackingStoreFetch, and
manually inlining it allows the lock to be taken once instead of once per
path.
Differential Revision: D30494771
fbshipit-source-id: 2d59d0343e48051e4d9c4fc196e66bcb79e7ac71
Summary: While `eden trace hg` already prints queue time when it's over 1ms, this diff adds fb303 counters for import tree/block queue time so that we can have the percentiles.
Reviewed By: xavierd
Differential Revision: D30492275
fbshipit-source-id: 3601aeb9b51b2f55f189a0e0a753fd6ef29d7341
Summary: Currently, the store loops through the requests, calls HgImporter, then waits with `getTry`. This diff changes it to kick off all tree imports from HgImporter and then wait for future fulfillment with `collectAll`.
Reviewed By: xavierd
Differential Revision: D30486459
fbshipit-source-id: 918e52be818a2064cf04d24f455d23c1ca618434
Summary:
This allows us to get more insight into race-condition issues (e.g. a pull
reverting part of what cloud sync did).
Reviewed By: andll
Differential Revision: D30415135
fbshipit-source-id: c99ce77d2748e503aea523e485be5b7a57ee8b98
Summary: The "this" and "other" changes should be swapped.
Reviewed By: andll
Differential Revision: D30415134
fbshipit-source-id: 7c14294c6a5926547960e236983879f3c6b746bd
Summary:
Instead of having 2 functions with one taking a single proxy hash, and the
other taking a vector, we can simply have a single function taking a
`folly::Range` and pass a range of one for the single proxy hash case.
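The same collapse can be sketched in Python, where the old single-hash entry point becomes a call with a one-element sequence (names here are illustrative, not the real C++ API):

```python
# Illustrative analogue of the C++ change: one function over a sequence of
# proxy hashes, with the single-hash case passing a range of one.
def get_blobs(proxy_hashes):
    """Single entry point taking any sequence of proxy hashes."""
    return ["blob-for-" + h for h in proxy_hashes]


def get_blob(proxy_hash):
    # the old single-hash overload is now just a one-element range
    return get_blobs([proxy_hash])[0]
```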
Reviewed By: chadaustin
Differential Revision: D30490724
fbshipit-source-id: 5d57f5a5ffc2a5085369c61a2318edd54b24b448
Summary:
Many diffs include multiple tags as prefixes, but the current implementation only takes the prefix to be the substring up to the first ']'.
## Issues
Current behaviour for
- **non-tag right bracket:** `My fake diff title with [link]()` -> `My fake diff title with [link]`
- **multiple tags:** `[hg][extensions] fake diff title` -> `[hg]`
## Solution
Use regex to capture all prefix tags:
```
(?:\[.*?\])+
```
## Explanation
- Non capturing group ( `(?: ... )` )
- matching the open bracket ( `\[` )
- and any character as few times as necessary (lazy) ( `.*?` )
- and the closing bracket ( `\]` )
- matching the group as many times as needed (greedy) ( `+` )
Also note `re.match()` matches from the beginning of the string (unlike `re.search()`)
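The pattern above can be exercised directly with Python's `re` module against the two problem cases:

```python
import re

# The prefix-tag pattern from the summary above.
TAG_PREFIX = re.compile(r"(?:\[.*?\])+")


def title_prefix(title):
    """Return the leading tag prefix of a diff title, or '' if there is none."""
    m = TAG_PREFIX.match(title)  # match() anchors at the start of the string
    return m.group(0) if m else ""


# multiple tags are captured in full
assert title_prefix("[hg][extensions] fake diff title") == "[hg][extensions]"
# a non-tag right bracket later in the title is not a prefix
assert title_prefix("My fake diff title with [link]()") == ""
```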
Reviewed By: ronmrdechai
Differential Revision: D30415867
fbshipit-source-id: e09d4e6d2759d0106d41d1a5d4e607ec34eef3fa
Summary:
By default, atomics are using the most strict memory ordering forcing barriers
to be used. Since this atomic doesn't need any ordering, we can make it
relaxed.
Reviewed By: chadaustin
Differential Revision: D30459630
fbshipit-source-id: ff50aac919031d9bae8b870b41a6134331546a5f
Summary:
The recordFetch is an implementation detail of a BackingStore and thus we don't
need to explicitly make it virtual.
Differential Revision: D30459635
fbshipit-source-id: 34f847ca906f81924c99c26b4e8af646e91fd735
Summary:
When prefetching a large number of blobs, repeatedly checking whether we
should log accesses to files can become expensive. Since the state of the
config isn't expected to change during the batch, we can simply test it once
and bail if logging isn't enabled.
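The hoisting described can be sketched as follows; the function and config key names are hypothetical, only the shape of the check-once-then-loop change matters:

```python
# Hypothetical sketch: test the logging config once up front instead of
# once per path, since it can't change mid-batch.
def log_file_accesses(paths, config):
    if not config.get("log-file-accesses", False):
        return []  # bail early: nothing to log for the whole batch
    return ["accessed: " + p for p in paths]
```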
Reviewed By: chadaustin
Differential Revision: D30458698
fbshipit-source-id: b48b9e0ad24585a76d8ce5948f5831db27e08eab
Summary: Looks like we never use this, thus let's simply remove it.
Differential Revision: D30454812
fbshipit-source-id: 28242a2144da4bab9d24debc1a60eeebcdcbaad5
Summary:
When a prefetch request is transformed into many blob requests, we query
RocksDB sequentially for all the proxy hashes, this can be quite expensive and
is also far less efficient than querying RocksDB concurrently with all the
hashes.
As a bonus, this also futurize the code a bit.
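A rough Python sketch of the sequential-vs-batched shape, with a counting stub standing in for RocksDB (the real store exposes a MultiGet-style call; everything here is illustrative):

```python
class FakeStore:
    """Counting stub standing in for RocksDB."""

    def __init__(self, data):
        self.data = data
        self.round_trips = 0

    def get(self, key):
        # one storage round trip per key
        self.round_trips += 1
        return self.data.get(key)

    def multi_get(self, keys):
        # all keys answered in a single round trip, MultiGet-style
        self.round_trips += 1
        return [self.data.get(k) for k in keys]


store = FakeStore({"h1": "p1", "h2": "p2", "h3": "p3"})
sequential = [store.get(k) for k in ["h1", "h2", "h3"]]  # 3 round trips
batched = store.multi_get(["h1", "h2", "h3"])            # 1 more round trip
```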
Reviewed By: chadaustin
Differential Revision: D30454068
fbshipit-source-id: 5fd238b752a662919e739451c0c1e92f66919ebf
Summary:
Since these are always used as SemiFuture, let's simply make them SemiFuture
from the get go.
Differential Revision: D30452901
fbshipit-source-id: b0863f363ce0cdb921a73d02c43fc82c1614a3dc
Summary:
Looking at strobelight when performing an `eden prefetch` shows that a lot of
time is spent copying data around. The list of hashes to prefetch is, for
instance, copied 4 times; let's reduce this to a single copy when converting
Hash to a ByteRange.
Reviewed By: chadaustin
Differential Revision: D30433285
fbshipit-source-id: 922e6e5c095bd700ee133e9bb219904baf2ae1ac
Summary:
Once the request has been dequeued, we no longer need to hold the lock, thus
let's release it to allow other threads to enqueue/dequeue requests.
Differential Revision: D30409797
fbshipit-source-id: a527c67a6bd9f47da5a3930364fd8fae0d1bc427
Summary:
In the vast majority of cases I expect file/directory history to be linear,
i.e. no merges. In that case there's no need to fetch generation numbers.
Since fetching generation numbers can trigger reads from the db, I'd rather
avoid doing so when it's not necessary, and that is what this diff does: it
doesn't start fetching generation numbers while the history is linear.
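The invariant can be sketched as a hypothetical helper: on a linear chain (every commit has at most one parent) we never need to consult generation numbers, and only a merge forces the fetch:

```python
def needs_generation_numbers(node, parents):
    """Walk a DAG given as `node -> list of parents`; return True as soon
    as a merge commit is found. A fully linear chain returns False without
    ever touching generation numbers (illustrative helper only)."""
    while True:
        ps = parents.get(node, [])
        if len(ps) > 1:
            return True   # merge: ordering the traversal needs generation numbers
        if not ps:
            return False  # reached a root; the whole chain was linear
        node = ps[0]
```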
Reviewed By: mitrandir77
Differential Revision: D30483093
fbshipit-source-id: 526fd33619c70cc4e0bb033a0048250b650fb2be
Summary:
To be precise, it's ordering by generation number and parent order i.e. we'd
like to show first parents ahead of second parents.
Note that parent ordering is very basic at the moment, and won't always order
commits correctly when both parents have the same generation numbers. We can
improve it in the future, but I believe it shouldn't be a big issue now.
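The described order (higher generation numbers first, first parents ahead of second parents) can be expressed as a sort key; a hypothetical Python sketch:

```python
def traversal_key(entry):
    """entry = (generation_number, parent_index) for a queued ancestor.

    Higher generation numbers come out first; at equal generation, first
    parents (parent_index 0) come before second parents (parent_index 1).
    """
    generation, parent_index = entry
    return (-generation, parent_index)


queue = [(5, 1), (7, 0), (5, 0)]
ordered = sorted(queue, key=traversal_key)  # [(7, 0), (5, 0), (5, 1)]
```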
Reviewed By: mitrandir77
Differential Revision: D30483089
fbshipit-source-id: 67e13b5757831d652b57d6ad42b6135005a0b621
Summary:
There were two variables that controlled the output of `list_file_history`:
one was the `history` variable, the other was `bfs`. The `history` variable got
all immediate ancestors, and `bfs` was used to fetch new ancestors.
This split makes it hard to add generation number ordering, so in this diff I
suggest removing the `history` variable altogether and just using `bfs` to
control the order of the history.
Reviewed By: mitrandir77
Differential Revision: D30483094
fbshipit-source-id: 0a4cac771383e17e61f58354a30d4e6db7e6547f
Summary:
They will need to become async in the next diffs, let's make them async now to
make next diffs smaller
Reviewed By: mitrandir77
Differential Revision: D30483091
fbshipit-source-id: 8174a2d4618a7dd2721d00d7acd7d700bd57afd1
Summary:
In the next diffs I'd like to make it possible to change the ordering of
commits - currently we only support "bfs order", while I'd like to add
"generation number order".
Reviewed By: mitrandir77
Differential Revision: D30483090
fbshipit-source-id: 82d5a14b26495f5583ca38793023ce3521682237
Summary:
Simple binary that can be used for syncing changesets to hg servers. It's very
simple: it just moves a bookmark and then waits until the hg sync job syncs it
(which obviously means that if the sync job is not running, nothing is going
to be synced).
It also supports syncing only a straight line of commits with no merges;
merges add complexity here, so I decided not to deal with that complexity for
now.
Reviewed By: mitrandir77
Differential Revision: D30447234
fbshipit-source-id: e4624586e4fc53212c1b13a2cd622aa9474a20b8
Summary:
This diff is a step towards uploading snapshots to the ephemeral blobstore.
It adds:
- EphemeralChangesets implementation. This is the trait used for storing changesets, and also their generation numbers. Here we are using a SQL table to store mappings between snapshots and bubbles, as well as their generation numbers. It fetches information from the blobstore as well, which also differs from "non-snapshot" changesets, but this can later be optimised to use another table if necessary.
- EphemeralRepoView, a container that has a changesets object and a repo_blobstore, both of which first check the ephemeral blobstore, and then the persistent blobstore, and are useful for dealing with snapshots.
Reviewed By: StanislavGlebik
Differential Revision: D30370979
fbshipit-source-id: bf8e1d3c111d307c1ffbad56e1255a77a4871591
Summary:
It's `O(tr.get('nodes'))`, which does not scale. With BFS prefetch, it's not
that slow if the tree isn't prefetched. It has already been disabled for major
repos in D23912965, and causes slowness with lazy changelog (see below).
Reviewed By: StanislavGlebik
Differential Revision: D30454407
fbshipit-source-id: 8027b5e5f1ee09a5f1ffe98a638585345464dd3d
Summary: This will let `SetPathRootId` support files
Reviewed By: chadaustin
Differential Revision: D29978308
fbshipit-source-id: df22af8bce4a707a7db51ef543c0e3e78cdcef06
Summary: This diff renames `SetPathRootId` to `SetPathObjectId` as we want to support blobs
Reviewed By: chadaustin
Differential Revision: D30404536
fbshipit-source-id: f34446ec20aeaf87f5f61e29e421a9bceb0b2a4a
Summary: This will add the same getTreeEntryForRootId to ObjectStore
Reviewed By: chadaustin
Differential Revision: D29920475
fbshipit-source-id: 15bfc6a2ba70cce2095dfcf1f434fd7087605e04
Summary: Add a new method to backingstore so we can get TreeEntry by rootID
Reviewed By: chadaustin
Differential Revision: D29889482
fbshipit-source-id: 93e63624e75c7d559c4de6f68821a8efa0e0c184
Summary: The nth ancestor revset was crashing with multiple base revisions, e.g. "(. + .^)~". Fix variable shadowing issue in revset.ancestorspec.
Differential Revision: D30375233
fbshipit-source-id: 37a78bf1000a40872600e587733a84029f68343b
Summary: This is an unintended pop and would throw if there are no stale APFS volumes (and would remove one fewer volume if there are stale volumes).
Reviewed By: xavierd
Differential Revision: D30432642
fbshipit-source-id: 193d9c15f393a66bc8b43b5f31579c1fe972a7f1
Summary:
Restructure the interface of `http-client::AsyncResponse` to make it easier to avoid misuse.
Specifically, both async and non-async responses now consist of two parts: a head (represented by the new `Head` type) and a body. This solves the problem of being able to access the response headers while consuming the response body: there is now an `into_parts` method on `AsyncResponse` that returns `(Head, AsyncBody)`, decoupling ownership of the parts. This approach was inspired by `hyper::Response`.
Previously, this was accomplished by allowing the body to be moved out of the response and replaced with an empty body. This meant that subsequent calls could incorrectly receive an empty body.
Additionally, `AsyncBody` is now an actual type (instead of an alias) which exposes `raw` and `decoded` methods for accessing the body stream. This makes it very explicit what's happening under the hood, and also minimizes the chance of the user forgetting to decode the response.
The new interface looks like:
```
(head, body) = res.into_parts();
// Choose one of the following:
let decoded_content = body.decoded(); // Automatically decompressed content.
let cbor_content = body.cbor(); // Content as deserialized CBOR entries.
let raw_content = body.raw(); // Raw on-wire content.
// Can still access response headers and status.
let status = head.status();
```
One-line usage is still possible with this interface:
```
let content = res.into_body().decoded().try_concat().await?;
```
Reviewed By: yancouto
Differential Revision: D30436322
fbshipit-source-id: 59911afc34b356a9e3295828ac63da5e295f77a6
Summary: Now that Mercurial's `http-client` crate has built-in support for decompressing responses, use that instead of manually doing it in the LFS code.
Reviewed By: andll
Differential Revision: D30269969
fbshipit-source-id: 9189aa1193e947625c1c98735303e0e038b88901
Summary:
In order to support compressed EdenAPI responses, Mercurial's `http-client` needs to be able to understand the `Content-Encoding` response header.
Since we're using libcurl under the hood, ordinarily we'd just need to set `CURLOPT_ACCEPT_ENCODING`, which sets the `Accept-Encoding` header in the request, and causes libcurl to automatically decompress the response.
Unfortunately, it seems that the Rust bindings build libcurl without support for modern compression algorithms like `zstd` and `brotli`. (When I tested it, it seemed to only support `gzip` and `deflate`.) Since we explicitly want to support `zstd` compression, we have no choice but to decompress the received data ourselves.
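A Python sketch of the dispatch this implies, using only stdlib codecs; the real code is Rust and also handles `zstd` and `brotli` through external bindings, which the stdlib lacks:

```python
import gzip
import zlib


def decompress_body(body, content_encoding):
    """Decode a response body according to its Content-Encoding header."""
    encoding = (content_encoding or "identity").lower()
    if encoding == "identity":
        return body
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        # HTTP "deflate" is zlib-wrapped deflate data
        return zlib.decompress(body)
    # zstd/brotli need third-party bindings, as in the Rust code
    raise ValueError("unsupported Content-Encoding: " + encoding)
```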
Reviewed By: andll
Differential Revision: D30267341
fbshipit-source-id: 8627471ec38669fd9836622cd127423c67f2458e
Summary: Add getters for accessing the fields of `Response` and `AsyncResponse`, and make the fields private. This will make it easier to add support for automatic content decompression.
Reviewed By: yancouto
Differential Revision: D30270216
fbshipit-source-id: 8717f127775286ae799df6bcbe0c47b3aa46aa8d
Summary:
The RelativePath is always built from a valid one, thus re-validating it is
not necessary.
Reviewed By: chadaustin
Differential Revision: D30410686
fbshipit-source-id: 3e46359f68b1693a0a2af310466fc73d105cf2c0
Summary:
This adds allowlisted configs to FS trace event sample, which would facilitate A/B testing and parameter tuning. For example, if we want to verify if a larger `hg:import-batch-size` would speed up read operations, we can:
1. split users into two groups, one having size of 16 and another having 32.
2. make sure `hg:import-batch-size` is included in `telemetry:request-sampling-config-allowlist` config.
3. wait for events to populate and compare the durations.
Reviewed By: xavierd
Differential Revision: D30322855
fbshipit-source-id: b3cbdcb64f78d35b8708948db495b2d956cab327
Summary:
The current fb303 counters only report aggregated latency while we want to track Eden performance under different version, os, channel, and configs. So I am setting up a new logging mechanism for this purpose.
This diff introduces the class `FsEventLogger` for sampling and logging. There are 3 configs introduced by this diff. The configs are reloaded every 30 minutes.
1. `telemetry:request-sampling-config-allowlist`
A list of config keys that we want to attach to scuba events.
2. `telemetry:request-samples-per-minute`
Max number of events logged to scuba per minute per mount.
3. `telemetry:request-sampling-group-denominators`
* Each type of operation has a "sampling group" (defaulted to 0, which drops all events).
* We use this sampling group as index to look up its denominator in this config.
* The denominator is then used for sampling, e.g. `1/x` of the events are sent to scuba, if we haven't reached the cap specified by #2.
Example workflow:
1. receive tracing event
2. look up denominator of the sampling group of the operation type
3. sample based on the denominator
4. check that we have not exceeded the logging cap per min
5. create sample and send to scribe
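Steps 2-4 of the workflow amount to a small gate. A hypothetical Python sketch, with a denominator of 0 standing for "drop everything", as in the config default:

```python
import random


def should_log(sampling_group, denominators, events_this_minute,
               cap_per_minute, rng=random.random):
    """Decide whether one traced FS event is sent to scuba.

    `denominators[i] == 0` drops everything in sampling group i; otherwise
    roughly 1/denominator of events pass, subject to the per-minute cap
    (telemetry:request-samples-per-minute).
    """
    if sampling_group >= len(denominators):
        return False
    denominator = denominators[sampling_group]
    if denominator == 0:
        return False  # group disabled (the default)
    if events_this_minute >= cap_per_minute:
        return False  # per-mount per-minute cap reached
    return rng() < 1.0 / denominator
```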
Reviewed By: xavierd
Differential Revision: D30288054
fbshipit-source-id: 8f2b95c11c718550a8162f4d1259a25628f499ff
Summary:
For large megarepos that are result of merging multiple smaller repos
navigating to pre-merge history is usually not what user wants. The checkout
will be slow (in non-edenfs repos), the tooling that expects some repo
structure won't work etc.
Reviewed By: StanislavGlebik
Differential Revision: D30394205
fbshipit-source-id: 23fc4fc31bf01d4cc14f6e3baa1e1165a26a1896
Summary:
This adds the `eden redirect cleanup-apfs` command for deleting stale APFS volumes.
* An APFS volume is considered stale if it's not currently mounted and is not under any of the checkouts managed by the eden instance.
* The command prints the list of such volumes and uses the APFS util to delete them if the user confirms.
Note: as the command is local to an eden instance, it will list unmounted APFS volumes of a checkout managed by another eden instance as stale. This should rarely happen, as in production we expect there to be a single eden instance. The prompt would also let the user abort if something is wrong.
Reviewed By: chadaustin
Differential Revision: D29940980
fbshipit-source-id: e784cb54d20198bb1f74cd5f15cee0e7546b227c
Summary: `edenfsctl rm` often breaks down due to long paths on Windows. This diff fixes that issue.
Reviewed By: xavierd
Differential Revision: D30380003
fbshipit-source-id: e10faa357df932bdb49d7c62d04d9504c7885768
Summary:
By default, a prefetch will always go through the Mercurial importer, but
when store:use-eden-native-prefetch is set, prefetch simply pushes tons of
blob requests to the HgQueuedBackingStore, and the internal batching takes care
of fetching efficiently.
The one drawback is that prior to pushing a blob request to the queue, getBlob
will query Mercurial to see if a blob is present locally. In the context of a
FUSE access, this is totally expected as this allows for low latency blob
access. But for a prefetch call, throughput matters significantly more and the
local check can negatively affect this.
Reviewed By: genevievehelsel
Differential Revision: D30404965
fbshipit-source-id: 113883993fa641caf7095a5bc8b7dd802f33348d
Summary:
As I was looking through the code, it took me a bit of time to fully grasp
what the commented code was for, so let's document it for future readers.
Reviewed By: chadaustin
Differential Revision: D30399723
fbshipit-source-id: bdf448b725192d7541b1d7de7e043ff97700dbce
Summary: This is never called, no need to keep it around.
Reviewed By: chadaustin
Differential Revision: D30399722
fbshipit-source-id: bbc169141b58976031fcae224f24ea23897c6f21
Summary:
If a wrong type is passed in, an exception would be thrown at runtime, while
the static_assert fires at compile time, reducing the time it takes the
developer to find the error.
Reviewed By: chadaustin
Differential Revision: D30398255
fbshipit-source-id: fd021f96063565f83c55a9bf3f175bf879afa6ed
Summary: This diff teaches eden doctor to generate a warning when the user has a SQLite overlay repo on disk, asking them to migrate.
Reviewed By: chadaustin
Differential Revision: D30345721
fbshipit-source-id: 95796ca77979f034904b87e3a38f149baddd720a
Summary:
This is a re-submit of D29915585 (6c5c7055ce), I've merely fixed the bug that it introduced,
thus the credit goes to markbt. Below is the original commit message:
If Mercurial asks EdenFS to update to a commit that it has just created, this
can cause a long delay while EdenFS tries to import the commit.
EdenFS needs to get the trees out of the Hg data store. But these also won't
know about the new trees until the data store is refreshed or synced.
To fix this, call the refresh method on the store if we fail to find the tree,
and try again. To make this work, we must first only look locally. To keep
things simple, we only do this for the root tree.
However, currently indexedlogdatastore doesn't actually do anything when you
ask it to refresh.
To fix these, we call flush(), which actually does a sync operation and loads
the latest data from disk, too.
Reviewed By: chadaustin
Differential Revision: D30387805
fbshipit-source-id: 3fdbd27b306f03df53b68a0bcc5ee5dc140326bb
Summary:
These functions were used heavily, and used old futures.
That meant a lot of uselessly copied stuff, and harder interop.
I also took the opportunity to pass `CoreContext` around as a reference, and did some BlobRepo refactoring to use attribute traits where possible.
Reviewed By: StanislavGlebik
Differential Revision: D30368261
fbshipit-source-id: 2e63677601fafa3c2e3d9d3340df0a5f31a19a11
Summary:
The SmartPlatform service that queries for a user's most used directories allows optional parameters of: os, startTime, endTime, and sandcastleAlias instead of user. This diff extends the current predictive prefetch option which queries based on the current user, mount repository, and a default numResults, to allow specification of all parameters including the optional ones.
If a user and/or repo is not specified these are determined from the server state and mount, respectively. If numResults is not specified, a default value is used (predictivePrefetchProfileSize, currently 10,000).
For sandcastle aliases, we check if the SANDCASTLE_ALIAS environment variable is set, and if so, use the value as a parameter. If a sandcastle alias is specified, the smartservice will ignore the user and query based on the alias, otherwise a user is assumed.
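The parameter resolution described can be sketched as follows; the field and helper names are illustrative, not the real smartservice API:

```python
import os


def resolve_predictive_params(user, repo, num_results,
                              default_num_results=10000, env=None):
    """Fill in defaults for a predictive-prefetch query (illustrative).

    If SANDCASTLE_ALIAS is set in the environment, the service ignores
    `user` and queries by alias instead.
    """
    if env is None:
        env = os.environ
    params = {
        "repo": repo,
        "numResults": num_results if num_results is not None
        else default_num_results,
    }
    alias = env.get("SANDCASTLE_ALIAS")
    if alias:
        params["sandcastleAlias"] = alias
    else:
        params["user"] = user
    return params
```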
Differential Revision: D30160507
fbshipit-source-id: 174797f0a6f840bb33f669c8d1bb61d76ff7a309
Summary:
The rate metric can be unreliable, and in some environment is never updated by
the time it is read by the test. Since this test cares about the number of
events, using count is a better metric, and more reliable.
Differential Revision: D30377128
fbshipit-source-id: f10c656567e3a3c07b66ecc6fc563a53199e3088
Summary:
The diff is giant, but it's just a one-line change to add the
nested-values feature to slog, we just have a whole bunch of projects dependent
on slog.
Reviewed By: dtolnay
Differential Revision: D30351289
fbshipit-source-id: b6c1c896b06cbdf23b1f92c0aac9a97aa116085d
Summary: `//common/rust/shed/cached_config` is the center of a dependency graph that is all stuck on the old configerator client only because cached_config uses it. This diff switches all of these over to the new client.
Reviewed By: farnz
Differential Revision: D30357631
fbshipit-source-id: 9a9df74096aa38a06371c6bc787245af71175e48
Summary:
The `DerivedDataManager` will manage the ordering of derivation for derived
data, taking into account dependencies between types as well as the topological
ordering of the repository. It will replace the free functions in
`derived_data` as well as much of the `utils` crate.
This is the first step: it introduces the manager, although currently it only takes
over management of the derived data lease.
Reviewed By: mitrandir77
Differential Revision: D30281634
fbshipit-source-id: 04c3a34d97ea02cc8c26d34096cca341e800da9b
Summary:
In preparation for the derived data manager, ensure that derived data
mappings do not require a `BlobRepo` reference.
The main use for this was to log to scuba. This functionality is extracted out
to the new `BonsaiDerivedMappingContainer`, which now contains just enough
information to be able to log to scuba.
Reviewed By: mitrandir77
Differential Revision: D30135447
fbshipit-source-id: 1daa468a87f297adc531cb214dda3fa7fe9b15da
Summary:
We have a mover only for files, and it doesn't quite work for directories; at
the very least a directory can be None (i.e. the root of the repo).
In the next diffs we'll start recording file and directory renames during
megarepo operations, so let's add DirectoryMultiMover as a preparation for that.
Reviewed By: mitrandir77
Differential Revision: D30338444
fbshipit-source-id: 4fed5f50397a7d3d8b77f23552921d515a684604
Summary:
AOSP megarepo wants to create release branches from existing branches, and then update configs to follow only release-ready code.
Provide the primitive they need to do this, which takes an existing commit and config, and creates a new config that tracks the same sources. The `change_target_config` method can then be used to shift from the mainline to the release branch.
Reviewed By: StanislavGlebik
Differential Revision: D30280537
fbshipit-source-id: 43dac24451cf66daa1cd825ada8f685957cc33c1
Summary:
Let's make it possible to query mutable renames from fastlog. For now this is a
very basic support i.e. we don't support middle-of-history renames, blame is
not supported etc.
Mutable rename logic is gated by a tunable, so we can roll it back quickly in
case of problems.
Reviewed By: ahornby
Differential Revision: D30279932
fbshipit-source-id: 0e8e329e8ab4d4980ab401bd103e6c97419d0f67