Summary:
In a repository with files that have large histories, we run into a lot of SqlTimeout
errors while fetching file history to serve getpack calls. However, fetching the
whole file history is not really necessary: the client knows how to work with
partial history, i.e. if the client is missing some portion of the history, it
just fetches it on demand.
This diff adds a way to limit how many entries are fetched; if more entries than the limit are fetched, we return FilenodeRangeResult::TooBig. The downside of this diff is that we'd have to do more sequential database
queries.
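A minimal Python sketch of the limit check described above. The names `Success`, `TooBig` and `fetch_history` are hypothetical stand-ins, not the actual Rust code, and asking for `limit + 1` rows is an assumption about how "more entries than the limit" could be detected:

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Union


@dataclass
class Success:
    # The history entries that fit within the limit.
    entries: List[str]


@dataclass
class TooBig:
    # Hypothetical stand-in for FilenodeRangeResult::TooBig.
    pass


def fetch_history(rows: Sequence[str], limit: Optional[int] = None) -> Union[Success, TooBig]:
    """Return the history, or TooBig if it has more than `limit` entries."""
    if limit is None:
        return Success(list(rows))
    # Assumption: fetch one row more than the limit, so that "exactly at the
    # limit" can be told apart from "over the limit".
    fetched = list(rows[: limit + 1])
    if len(fetched) > limit:
        return TooBig()
    return Success(fetched)
```

A caller that receives `TooBig` would then fetch the history in smaller pieces, which is where the extra sequential queries come from.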
Reviewed By: krallin
Differential Revision: D23025249
fbshipit-source-id: ebed9d6df6f8f40e658bc4b83123c75f78e70d93
Summary:
See D23053788 for motivation. Let's add a new warmer that checks
mutable_counters to understand which commit has been imported already.
Reviewed By: krallin
Differential Revision: D23053991
fbshipit-source-id: 3651aed8836a791675dd8d7bcc145fd32e56a13f
Summary:
With Mercurial now supporting CMD_CAT_TREE for efficiently fetching and reading
trees, we can plumb this into EdenFS. At startup time, we detect whether
Mercurial supports CMD_CAT_TREE and use that method; otherwise, we fall back to
the old CMD_FETCH_TREE.
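The detect-then-fall-back logic can be sketched in Python as follows. The function name and the idea that the supported commands arrive as a set of strings are assumptions for illustration; the real detection lives in EdenFS's C++ code:

```python
def choose_tree_command(supported_commands):
    """Pick the tree-fetching command once, at startup."""
    # Prefer the newer command when the Mercurial helper advertises it.
    if "CMD_CAT_TREE" in supported_commands:
        return "CMD_CAT_TREE"
    # Otherwise fall back to the old command.
    return "CMD_FETCH_TREE"
```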
Reviewed By: wez
Differential Revision: D23044953
fbshipit-source-id: 9aea5c5b82e97039a75ef18976a155dcb6e150bc
Summary:
Use sorted_vector_map when parsing hg manifest blobs. Blobs are usually stored sorted, and inserting their entries into a BTree is costly when traversing large repos.
Also use the size_hint() from the parsing Split to save reallocations during insert.
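To illustrate why already-sorted input suits a sorted vector map, here is a small Python sketch. `SortedVecMap` is a hypothetical toy, not the sorted_vector_map crate, and it does not model the size_hint() preallocation:

```python
import bisect


class SortedVecMap:
    """Toy map backed by two parallel lists kept sorted by key."""

    def __init__(self):
        self._keys = []
        self._values = []

    def insert(self, key, value):
        # Fast path: input that arrives in sorted order is a plain append.
        if not self._keys or key > self._keys[-1]:
            self._keys.append(key)
            self._values.append(value)
            return
        # Slow path: binary search, then overwrite or shift elements.
        i = bisect.bisect_left(self._keys, key)
        if self._keys[i] == key:
            self._values[i] = value
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def get(self, key, default=None):
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return default
```

When entries arrive in sorted order, every insert takes the append path, so building the map is a single linear pass.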
Reviewed By: markbt
Differential Revision: D22975883
fbshipit-source-id: 1faff754f03d7b2c20ebb741fec4f97b310852f9
Summary:
When running `edenfsctl prefetch **/BUCK` with an empty hgcache, EdenFS ends up
asking mercurial for every manifest one by one. Unfortunately, every manifest
fetched also causes the packfile to be flushed to disk, which then leads EdenFS
to rescan the filesystem for the new packfile. Once too many packfiles are
present on disk, Mercurial triggers a repack. Effectively, that means we have a
quadratic complexity both on Mercurial, and on EdenFS's side.
While this has been a long-standing issue, we've so far avoided falling into
this complexity for a number of reasons. The main one is that the hgcache is
very rarely empty, and thus the quadratic complexity usually applies to a low number
of files. Users also rarely run a prefetch of all the files for the entire
repo. However, on repositories with long-standing branches, the hgcache is
effectively cold and thus any prefetch would trigger the pathological behavior.
To solve this, we take the same approach taken for files: sending the raw
manifest to EdenFS, which will then take care of deserializing it properly.
Reviewed By: DurhamG
Differential Revision: D23035335
fbshipit-source-id: 855e6fb4fabf81c427fad6c9f17d05f95c47e9ae
Summary: There are no users waiting on manual scrub, so set it to use the background session mode.
Reviewed By: krallin
Differential Revision: D23054581
fbshipit-source-id: 985bcadbaf17d2a8c92fdec811ecb239cbca7b37
Summary:
On macOS, it appears that ssh has a ~1% chance of never being able to connect
to the server and just hanging. This caused mactest to be completely unhealthy
for a couple of days and a similar hotfix was applied to mitigate the issue.
Since it proved to be working, let's now backport this hotfix in the actual
code.
Reviewed By: DurhamG
Differential Revision: D22953230
fbshipit-source-id: ead7662ea6d0a33efaa5c4044c9391b2835ee421
Summary: Client portion for the commit/revlog_data endpoint that was added to the server.
Reviewed By: kulshrax
Differential Revision: D23065989
fbshipit-source-id: 3115ad2b426daca22472e2106fcd293f3ccd70f3
Summary:
Pyre now has improved support for decorators and descriptors, which makes it
possible for us to add type annotations to `dirstate.py` without needing lots
of `pyre-ignore` comments everywhere. (Previously Pyre could not handle the
`propertycache` decorator, causing it to be confused about the type of
various dirstate members, like `_map`).
Reviewed By: mrkmndz
Differential Revision: D22969757
fbshipit-source-id: 1b54f1edfb56c20c237a34f14a47404d10605240
Summary: Begin adding some initial type annotations for the Rust Python bindings.
Reviewed By: quark-zju
Differential Revision: D22993222
fbshipit-source-id: 2073db93b22f6bb04e30b767594d435c36ddb17f
Summary:
Using os.kill on EdenFS would always fail and raise an exception. Use the
proc_utils code to detect if the process is running. Also using BUCKVERSION
always raises an error on Windows, so let's ignore that for now.
Reviewed By: fanzeyi
Differential Revision: D22915350
fbshipit-source-id: 806bfab12ae0e8fc97e83d5720481f2a47516129
Summary:
Let's split logic from WarmBookmarksCache into a separate builder. This builder
will configure which warmers we'd like to use.
This will make it easier to introduce a new warmer later in the stack.
Reviewed By: krallin
Differential Revision: D23053785
fbshipit-source-id: 32acc9da98d32624ca0dc00277910443f3d86f66
Summary:
Previously we were unconditionally adding hg changesets, but that's a bit
strange and there's no reason to do it. Let's do the same check we do for other
derived data types. Note that there should be no change in behaviour - all our
repos have "hgchangesets" derived data type enabled.
Reviewed By: krallin
Differential Revision: D23053786
fbshipit-source-id: 0b3ea99f649bc89ea9b216f368fee11fa25e153f
Summary: I want to add a new warmer in the next diffs which won't do any deriving.
Reviewed By: krallin
Differential Revision: D23053787
fbshipit-source-id: 4c7febb60ab7e835302db746c670d656bd9d1989
Summary:
EdenFS may spawn several Mercurial processes concurrently, and they would all try
to take the wlock at startup time. More often than not, one of these processes
would die early due to the tmplock not being present on disk, because another
Mercurial process had removed it. Let's have a 10s grace period during which
temporary locks aren't removed to avoid this race.
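A minimal Python sketch of such a grace period. The function name and the use of the file's mtime as the lock's age are assumptions; only the 10s value comes from the description above:

```python
import os
import time

GRACE_PERIOD_SECONDS = 10


def should_remove_temp_lock(path, now=None):
    """Return True only if the temporary lock is older than the grace period."""
    now = time.time() if now is None else now
    try:
        # Assumption: the file's mtime approximates when the lock was created.
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        # Nothing to remove; another process already cleaned it up.
        return False
    return now - mtime > GRACE_PERIOD_SECONDS
```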
Reviewed By: DurhamG
Differential Revision: D22954997
fbshipit-source-id: ce191265c03a7042d9c6e45db0dc44a688fa204c
Summary:
When doing large clones or checkouts the amount of data we add to an
indexedlog can be many GB. On a laptop we don't have much memory, so let's set a
max memory threshold for the file data/history indexedlogs.
Reviewed By: xavierd
Differential Revision: D23046489
fbshipit-source-id: 43b7686b11fe05e4c074bcb02c475ebf8cf14ab1
Summary: Dump the text into the file as is.
Reviewed By: markbt
Differential Revision: D23039839
fbshipit-source-id: 966d6c5e90f020efbb8123704f5c2749596fbab5
Summary:
There are two different kinds of magic background syncing that can be enabled. The first
is triggered by a commit or any other local change. The second is triggered by
SCM Daemon on any remote change in this workspace.
I would like to explain this a bit better in the `hg cloud status` command.
This will also offer some reassurance to clients.
For example, assume they run the `hg cloud disable` command, which should disable all background Commit Cloud traffic for some time. They can then run `hg cloud status` and verify that neither local changes nor remote changes trigger any Commit Cloud traffic on this machine.
I also provide the full path to the SCM Daemon logs if it is enabled.
Reviewed By: markbt
Differential Revision: D23038954
fbshipit-source-id: c3a5b8f58df729ee3f1c7f15da44ad6e6e0b98f6
Summary:
Once we have revealed the commits to the user (D22864223 (578207d0dc), D22762800 (f1ef619284)), we need to merge the imported branch into the destination branch (specified by dest-bookmark). To do this, we extract the latest commit of the destination branch, then compare the two commits to check for merge conflicts. If there are merge conflicts, we inform the user so they can resolve them. Otherwise, we create a new bonsai with the two commits as parents.
Next step: pushrebase the merge commit
Minor refactor: moved app setup to a separate file for better readability.
Reviewed By: StanislavGlebik
Differential Revision: D23028163
fbshipit-source-id: 7f3e2a67dc089e6bbacbe71b5e4ef5f6eed2a9e1
Summary: Add context to show the affected key if there are problems peeking a key.
Reviewed By: farnz
Differential Revision: D23003001
fbshipit-source-id: b46b7626257f49d6f11e80a561820e4b37a5d3b0
Summary:
Now that the previous diff has pre-computed the hash value using EagerHashMemo, it's less expensive to try a get(), which only takes a read lock, before committing to an insert(), which acquires a write lock.
The combination of this and the previous diff moved WalkState::visit from dominating the CPU profile to not dominating it (path interning dominates now).
Reviewed By: krallin
Differential Revision: D22975881
fbshipit-source-id: 90b2be83282ee2095c517c0d4f13536ddadf6267
Summary:
DashMap takes the hash of its keys multiple times: once outside the lock, and then once or twice inside the lock depending on whether the key is present in the shard.
Pre-computing the hash value using EagerHashMemo means it's done only once and, more importantly, outside the lock.
To use EagerHashMemo one needs to supply the BuildHasher, so it's added as a struct member and the record method is made a member function.
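The idea of memoizing the hash can be illustrated with a small Python analogue. `HashMemo` and `CountingKey` are hypothetical names for this sketch, and a plain dict stands in for DashMap:

```python
class CountingKey:
    """Key that counts how often its (pretend-expensive) hash is computed."""

    def __init__(self, value):
        self.value = value
        self.hash_calls = 0

    def __hash__(self):
        self.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.value == other.value


class HashMemo:
    """Wrapper that hashes its key once, eagerly, and reuses the result."""

    __slots__ = ("key", "_hash")

    def __init__(self, key):
        self.key = key
        self._hash = hash(key)  # computed once, before any map lock is taken

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        return isinstance(other, HashMemo) and self.key == other.key


key = CountingKey("path/to/node")
memo = HashMemo(key)
visited = {}
visited[memo] = True   # insert reuses the memoized hash
_ = memo in visited    # lookup reuses it as well
```

After the insert and the lookup, `key.hash_calls` is still 1, because the dict only ever asks the wrapper for the stored value.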
Reviewed By: farnz
Differential Revision: D22975878
fbshipit-source-id: c2ca362fdfe31e5dca329e6200029207427cd9a1
Summary:
Matches the `getcommitdata` SSH endpoint.
This is going to be used to remove the requirement that client repositories
need to have all commits locally.
Reviewed By: krallin
Differential Revision: D22979458
fbshipit-source-id: 75d7265daf4e51d3b32d76aeac12207f553f8f61
Summary:
Instead of modifying the existing APIs and marking all callsites as deprecated in one diff, I am going to take the "add and remove" approach, where I will add the deprecated version of methods first, then mark all callsites, and finally remove the existing ones. This provides some backward compatibility without breaking things.
This diff is to add deprecated getSelection APIs to ServiceSelectorCache.
Differential Revision: D22981269
fbshipit-source-id: 6e3025e7f7df6ee7f9e1cba9dc036ca84adbe49a
Summary:
Previously we fetched metadata by commit hash and path. We knew this would be a
little extra expensive, but it turns out this is a lot more expensive.
Wait, why is it expensive?
In short: lots of extra lookups that are not satisfied by cache :(
In long:
1. Each piece of the path would require a read to fetch the fsnode for that tree.
So this means asking for the metadata of a/b/c/d/e means 5 reads.
2. Normally these reads could be cached, but often we would make these requests
with a commit hash for a draft commit. On the server side this info is not
cached for a draft commit, which means a lot of database reads and recalculation.
(Most of the real uses of metadata prefetching are when an engineer is working
on a local commit. We just use the commit hash of the commit the user was on
when fetching metadata for a tree, even if that tree hasn't changed since a public
commit, so this means lots of requests with draft commit hashes.)
By fetching by manifest id, we are able to bypass this sequential path lookup.
(and even if we are on a draft commit, if the tree has not locally changed
since a public commit, the manifest id will be the same as the public commit
avoiding this whole draft commit issue).
This allows us to query scs with a manifest id for a tree.
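The difference in read counts can be sketched with a toy store in Python. All names here (`CountingStore`, `tree_by_path`, `tree_by_id`, the `id_*` values) are hypothetical, and the store is a simplification of the real fsnode lookups:

```python
class CountingStore:
    """Toy tree store that counts how many reads it serves."""

    def __init__(self, trees):
        self.trees = trees  # manifest id -> {child name: child manifest id}
        self.reads = 0

    def read(self, manifest_id):
        self.reads += 1
        return self.trees[manifest_id]


def tree_by_path(store, root_tree, path):
    # One read per path component: a/b/c/d/e costs 5 reads.
    tree = root_tree
    for part in path.split("/"):
        tree = store.read(tree[part])
    return tree


def tree_by_id(store, manifest_id):
    # A single read, regardless of how deep the tree sits in the repo.
    return store.read(manifest_id)


TREES = {
    "id_a": {"b": "id_b"},
    "id_b": {"c": "id_c"},
    "id_c": {"d": "id_d"},
    "id_d": {"e": "id_e"},
    "id_e": {"file.txt": "id_f"},
}
ROOT = {"a": "id_a"}
```

Both functions return the same tree for `e`; only the number of reads differs.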
Reviewed By: wez
Differential Revision: D22990687
fbshipit-source-id: aa81d67de1f1d04a14d174774ee216f5ac6be5ba
Summary:
The query we use to select blobs to heal is naturally expensive, due to the use of a subquery. This means that finding the perfect queue limit is hard, and we depend on task restarts to handle brief overload of MySQL.
Make the batch size fall fast (halve on each failure) and climb back slowly (10% on each success), and add a random delay after each failure before retrying.
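A minimal Python sketch of this back-off scheme. The function names, the minimum and maximum bounds, and the maximum delay are assumptions for illustration; only the halving and the 10% climb come from the description above:

```python
import random


def next_batch_size(current, succeeded, minimum=1, maximum=1000):
    """Halve the batch size on failure; grow it by 10% on success."""
    if succeeded:
        # Slow climb: add 10%, but always at least 1 so small sizes can grow.
        return min(maximum, current + max(1, current // 10))
    # Fast fall: halve, but never drop below the minimum.
    return max(minimum, current // 2)


def retry_delay(max_seconds=5.0):
    """Random delay before retrying after a failure."""
    return random.uniform(0.0, max_seconds)
```

The random delay keeps several retrying tasks from hitting MySQL again at the same moment.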
Reviewed By: StanislavGlebik
Differential Revision: D23028518
fbshipit-source-id: f2909fe792280f81d604be99fabb8b714c1e6999
Summary:
All of these are already valid utf-8 characters, no need to dance to
decode/encode them again.
Reviewed By: DurhamG
Differential Revision: D22978828
fbshipit-source-id: c5f6e25e71cdcaa1c0558d4a1181b667ffe379fb
Summary:
The StringConv.h header contains many functions to convert from Windows paths
to Eden's paths (and vice versa) to work around the fact that Eden's paths don't
support the wide strings that Windows uses. Let's simply add support for these wide
strings in PathFuncs so we can greatly simplify all the call sites. Instead of
calling "edenToWinName(winstr)", "PathComponent(winstr)" is both more
descriptive and more idiomatic.
To be fair, I'm not entirely a fan of the approach taken in this diff, as this
adds Windows specific code to PathFuncs.h, but I feel that the benefit is too
big to not do that.
Reviewed By: chadaustin
Differential Revision: D23004523
fbshipit-source-id: 3a1507e398a66909773251907db01e06603b91dd
Summary:
`is_tree` wasn't part of the cache key, which means we could have returned
incorrect history if we had a file and a directory with the same name.
This diff fixes it.
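A small Python sketch of the fix, assuming a string cache key. The function name and key layout are hypothetical, not the actual key format:

```python
def history_cache_key(repo_id, path, is_tree):
    """Build a cache key that distinguishes a file from a directory."""
    # Including the kind keeps a file and a directory with the same path
    # from sharing (and overwriting) one cache entry.
    kind = "tree" if is_tree else "file"
    return f"{repo_id}:{kind}:{path}"
```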
Reviewed By: krallin
Differential Revision: D23028527
fbshipit-source-id: 98a3b2028fa62231dfb570a76fb836374ce1eed0
Summary:
I noticed that fastreplay doesn't init tunables, and that means that it doesn't
get the updates, and more importantly it doesn't use default values of
tunables.
That doesn't look expected (but lmk if I'm wrong!)
Reviewed By: krallin
Differential Revision: D23027311
fbshipit-source-id: ee43d02457d2240ebeb1530c672cb3847bc3afd4
Summary: This has my into_key() PR https://github.com/xacrimon/dashmap/pull/91 merged so the patch pointing to my fork is also removed.
Reviewed By: farnz
Differential Revision: D22896911
fbshipit-source-id: 188d438ce2aa20cfb3c466a62227d1cd27625f74
Summary:
Nobody is using it, and it is very likely very out of date, no need to keep
this around.
Reviewed By: chadaustin
Differential Revision: D23008636
fbshipit-source-id: 2f29dae5986ce14b5b77523ff6a888c6824e97c5
Summary:
This makes it similar to the Unix one, which reduces the ifdefs a tiny bit.
Ideally I'd want to move the pipe handling into its own class so callers won't
have to care about Windows/Linux specifics.
Reviewed By: fanzeyi
Differential Revision: D22954056
fbshipit-source-id: c92a25b6abe084a7c7496c0d6e07795779e0abad
Summary:
When computing which heads to remove because of bookmark removal,
ignore any heads that are not in the smartlog dag.
Heads might be missing from the smartlog dag if they're not
available on the server.
Reviewed By: quark-zju
Differential Revision: D22980810
fbshipit-source-id: f002eece8567aaf57780f592aaf29a790b8314ce
Summary:
Vendor ahash 0.4.4. In tests I haven't found this update significant for Mononoke walker performance, but we might as well be current now that I've tried it.
I have found that wrapping ahash in a memoizing hasher helps, but that is for another diff.
Reviewed By: farnz
Differential Revision: D22864635
fbshipit-source-id: 5019259273ae3bd2df95cdd18adceed895baf8f2
Summary:
A commit doesn't show up after `hg pull -r` command if it's known locally.
This is a bug that the test demonstrates.
Reviewed By: quark-zju
Differential Revision: D22977182
fbshipit-source-id: 428094568140892fc8a13004f3395371d8b55ebf
Summary: Add a non-thrift header to packblob so we can vary thrift protocol in future.
Reviewed By: farnz
Differential Revision: D22953758
fbshipit-source-id: a114a350105e75cbe57f6c824295d863c723f32f
Summary:
Introduce taggederror-util, which provides a new trait `AnyhowEdenExt` with a method `eden_metadata` for anyhow errors and results. This method works much like `AnyhowExt::common_metadata`, but additionally supports extracting default error metadata from known `Tagged` types which are listed explicitly in the method implementation.
Extend `FilteredAnyhow` to support a configuration "metadata function", which allows swapping out `eden_metadata` for the standard `common_metadata`.
Modify Rust dispatch and Python bindings to use `AnyhowEdenExt` for metadata extraction and printing.
Modify `intentional_error` to rely on `AnyhowEdenExt` for tagging (removes `.tagged` call, no tags will be visible if `AnyhowEdenExt` is not used).
Reviewed By: DurhamG
Differential Revision: D22927203
fbshipit-source-id: 04b36fdfaa24af591118acb9e418d1ed7ae33f91
Summary:
Add a `buffered()` method to `AsyncResponse` allowing the user to specify the desired chunk size for the body stream.
(This was already used internally by `CborStream`; this just exposes it in the public interface.)
Reviewed By: quark-zju
Differential Revision: D22935891
fbshipit-source-id: e110e85bf9cb4c7923a8977ea4631ca1cc4cf4cb
Summary: Rename the `cbor` module to `stream` to better indicate that it contains various stream combinators (not all of which are related to CBOR).
Reviewed By: quark-zju
Differential Revision: D22935892
fbshipit-source-id: 3f73aa707ab59c31717c1cf35995ad79946a15c9
Summary:
Previously `eden prefetch` had two subcommands, `record-profile` and `finish-profile`, but when we only wanted to use `eden prefetch PATTERN`, an error showed up saying that the COMMAND argument was missing.
Since `eden prefetch` has many of its own arguments, and we don't want to use it with `record-profile` and `finish-profile` all the time, we remove those subcommands and create a new `eden debug prefetch_profile` command to get prefetch profiles.
Reviewed By: fanzeyi
Differential Revision: D22959981
fbshipit-source-id: 21b278555fcb56580a62f66a7384b1cff54ba398
Summary: I think that the broken config (wrong relative base for search_path) was what prevented the upgrade from going automatically.
Reviewed By: grievejia
Differential Revision: D22966243
fbshipit-source-id: 4ef42a8e2e6f2c79483301c6876509a3009a83d1