Summary:
Use sorted_vector_map when parsing hg manifest blobs. Because blobs are usually stored sorted, BTree insertion can be costly when traversing large repos.
Also use the size_hint() from the parsing Split to save reallocations during insert.
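A rough Python illustration of why sorted input suits a sorted vector: when keys arrive in order, every insert is a plain append. The `parse_manifest` helper and the simplified `path\0node` line format are assumptions made for this sketch. The actual change uses the Rust `sorted_vector_map` crate, and Python lists have no direct equivalent of reserving capacity from `size_hint()`.

```python
import bisect


def parse_manifest(blob: bytes) -> list:
    # Hypothetical helper: parse "path NUL node" lines into a list of
    # (path, node) tuples kept sorted by path.
    entries = []
    for line in blob.split(b"\n"):
        if not line:
            continue
        path, node = line.split(b"\0", 1)
        # Fast path: input is already sorted, so this is a cheap append.
        if not entries or entries[-1][0] < path:
            entries.append((path, node))
            continue
        # Slow path: out-of-order or duplicate key, fall back to binary search.
        idx = bisect.bisect_left(entries, (path,))
        if idx < len(entries) and entries[idx][0] == path:
            entries[idx] = (path, node)
        else:
            entries.insert(idx, (path, node))
    return entries
```

With sorted input the slow path is never taken, so building the map is linear in the number of entries.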
Reviewed By: markbt
Differential Revision: D22975883
fbshipit-source-id: 1faff754f03d7b2c20ebb741fec4f97b310852f9
Summary:
When running `edenfsctl prefetch **/BUCK` with an empty hgcache, EdenFS ends up
asking mercurial for every manifest one by one. Unfortunately, every manifest
fetched also causes the packfile to be flushed to disk, which then leads EdenFS
to rescan the filesystem for the new packfile. Once too many packfiles are
present on disk, Mercurial triggers a repack. Effectively, that means we have a
quadratic complexity on both Mercurial's and EdenFS's side.
While this has been a long-standing issue, we've so far avoided falling into
this complexity for a number of reasons. The main one is that the hgcache is
very rarely empty, and thus the quadratic complexity usually applies to a low number
of files. Users also rarely run a prefetch of all the files for the entire
repo. However, on repositories with long-standing branches, the hgcache is
effectively cold and thus any prefetch would trigger the pathological behavior.
To solve this, we take the same approach taken for files: sending the raw
manifest to EdenFS, which will then take care of deserializing it properly.
Reviewed By: DurhamG
Differential Revision: D23035335
fbshipit-source-id: 855e6fb4fabf81c427fad6c9f17d05f95c47e9ae
Summary: There are no users waiting on manual scrub, so set it to use the background session mode.
Reviewed By: krallin
Differential Revision: D23054581
fbshipit-source-id: 985bcadbaf17d2a8c92fdec811ecb239cbca7b37
Summary:
On macOS, it appears that ssh has a ~1% chance of never being able to connect
to the server and just hanging. This caused mactest to be completely unhealthy
for a couple of days, and a similar hotfix was applied to mitigate the issue.
Since it proved to be working, let's now backport this hotfix into the actual
code.
Reviewed By: DurhamG
Differential Revision: D22953230
fbshipit-source-id: ead7662ea6d0a33efaa5c4044c9391b2835ee421
Summary: Client portion for the commit/revlog_data endpoint that was added to the server.
Reviewed By: kulshrax
Differential Revision: D23065989
fbshipit-source-id: 3115ad2b426daca22472e2106fcd293f3ccd70f3
Summary:
Pyre now has improved support for decorators and descriptors, which makes it
possible for us to add type annotations to `dirstate.py` without needing lots
of `pyre-ignore` comments everywhere. (Previously Pyre could not handle the
`propertycache` decorator, causing it to be confused about the type of
various dirstate members, like `_map`).
Reviewed By: mrkmndz
Differential Revision: D22969757
fbshipit-source-id: 1b54f1edfb56c20c237a34f14a47404d10605240
Summary: Begin adding some initial type annotations for the Rust Python bindings.
Reviewed By: quark-zju
Differential Revision: D22993222
fbshipit-source-id: 2073db93b22f6bb04e30b767594d435c36ddb17f
Summary:
Using os.kill on EdenFS would always fail and raise an exception. Use the
proc_utils code to detect if the process is running. Also, using BUCKVERSION
always raises an error on Windows, so let's ignore that for now.
Reviewed By: fanzeyi
Differential Revision: D22915350
fbshipit-source-id: 806bfab12ae0e8fc97e83d5720481f2a47516129
Summary:
Let's split logic from WarmBookmarksCache into a separate builder. This builder
will configure which warmers we'd like to use.
This will make it easier to introduce a new warmer later in the stack.
Reviewed By: krallin
Differential Revision: D23053785
fbshipit-source-id: 32acc9da98d32624ca0dc00277910443f3d86f66
Summary:
Previously we were unconditionally adding hg changesets, but that's a bit
strange and there's no reason to do it. Let's do the same check we do for other
derived data types. Note that there should be no change in behaviour - all our
repos have "hgchangesets" derived data type enabled.
Reviewed By: krallin
Differential Revision: D23053786
fbshipit-source-id: 0b3ea99f649bc89ea9b216f368fee11fa25e153f
Summary: I want to add a new warmer in the next diffs which won't do any deriving.
Reviewed By: krallin
Differential Revision: D23053787
fbshipit-source-id: 4c7febb60ab7e835302db746c670d656bd9d1989
Summary:
EdenFS may spawn several Mercurial processes concurrently, and they all try
to take the wlock at startup time. More often than not, one of these processes
dies early because the tmplock is no longer present on disk: another
Mercurial process has removed it. Let's have a 10s grace period during which
temporary locks aren't removed to avoid this race.
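A minimal sketch of the grace-period idea, assuming the age of a temporary lock is judged from the file's mtime. The names `GRACE_SECONDS` and `maybe_remove_tmplock` are hypothetical and the real Mercurial code may decide differently.

```python
import os
import time

GRACE_SECONDS = 10.0  # grace period mentioned in the summary


def maybe_remove_tmplock(path, now=None):
    # Hypothetical helper: remove a temporary lock file only once it is
    # older than the grace period. Returns True if the file was removed.
    now = time.time() if now is None else now
    try:
        age = now - os.stat(path).st_mtime
    except FileNotFoundError:
        return False  # already gone, nothing to do
    if age < GRACE_SECONDS:
        return False  # too young: another process may still be using it
    try:
        os.unlink(path)
    except FileNotFoundError:
        return False  # lost a race with another remover
    return True
```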
Reviewed By: DurhamG
Differential Revision: D22954997
fbshipit-source-id: ce191265c03a7042d9c6e45db0dc44a688fa204c
Summary:
When doing large clones or checkouts the amount of data we add to an
indexedlog can be many GB. On a laptop we don't have much memory, so let's set a
max memory threshold for the file data/history indexedlogs.
Reviewed By: xavierd
Differential Revision: D23046489
fbshipit-source-id: 43b7686b11fe05e4c074bcb02c475ebf8cf14ab1
Summary: Dump the text into the file as is.
Reviewed By: markbt
Differential Revision: D23039839
fbshipit-source-id: 966d6c5e90f020efbb8123704f5c2749596fbab5
Summary:
There are two different kinds of magic background syncing that can be enabled. The first
is triggered by a commit or any other local change. The second is triggered by the
SCM Daemon on any remote change in this workspace.
I would like to explain this a bit better in the `hg cloud status` command.
This will also offer some reassurance to clients.
For example, assume they run the `hg cloud disable` command, which should disable all background Commit Cloud traffic for some time. They can then run `hg cloud status` and verify that neither local nor remote changes trigger any Commit Cloud traffic on this machine.
I also provide the full path to the SCM Daemon logs if it is enabled.
Reviewed By: markbt
Differential Revision: D23038954
fbshipit-source-id: c3a5b8f58df729ee3f1c7f15da44ad6e6e0b98f6
Summary:
Once we have revealed the commits to the user (D22864223 (578207d0dc), D22762800 (f1ef619284)), we need to merge the imported branch into the destination branch (specified by dest-bookmark). To do this, we extract the latest commit of the destination branch, then compare the two commits to check whether we have merge conflicts. If we do, we inform the user so they can resolve them. Otherwise, we create a new bonsai with the two commits as parents.
Next step: pushrebase the merge commit
Minor refactor: moved app setup to a separate file for better readability.
Reviewed By: StanislavGlebik
Differential Revision: D23028163
fbshipit-source-id: 7f3e2a67dc089e6bbacbe71b5e4ef5f6eed2a9e1
Summary: Add context to show the affected key if there are problems peeking a key.
Reviewed By: farnz
Differential Revision: D23003001
fbshipit-source-id: b46b7626257f49d6f11e80a561820e4b37a5d3b0
Summary:
Now that the previous diff has pre-computed the hash value using EagerHashMemo, it's less expensive to try a read-lock-only get() first before committing to a write-lock-acquiring insert().
The combination of this and the previous diff moved WalkState::visit from dominating the CPU profile to no longer doing so (path interning dominates now).
Reviewed By: krallin
Differential Revision: D22975881
fbshipit-source-id: 90b2be83282ee2095c517c0d4f13536ddadf6267
Summary:
DashMap takes the hash of its keys multiple times: once outside the lock, and then once or twice inside the lock depending on whether the key is present in the shard.
Pre-computing the hash value using EagerHashMemo means it's done only once and, more importantly, outside the lock.
To use EagerHashMemo one needs to supply the BuildHasher, so it's added as a struct member and the record method is made a member function.
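The same idea can be sketched in Python, where a dict calls `__hash__` on every lookup and insert. The `HashMemo` wrapper below is a hypothetical analog for illustration only; EagerHashMemo and DashMap are Rust and work differently in detail.

```python
class HashMemo:
    # Hypothetical wrapper: hash the wrapped key exactly once, at
    # construction time, before any lock is taken.
    __slots__ = ("key", "_hash")

    def __init__(self, key):
        self.key = key
        self._hash = hash(key)  # the only place the real hash is computed

    def __hash__(self):
        return self._hash  # later map operations reuse the stored value

    def __eq__(self, other):
        return isinstance(other, HashMemo) and self.key == other.key
```

A caller would build the `HashMemo` outside the critical section and only then take the lock guarding the shared map, so the time spent holding the lock no longer includes hashing.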
Reviewed By: farnz
Differential Revision: D22975878
fbshipit-source-id: c2ca362fdfe31e5dca329e6200029207427cd9a1
Summary:
Matches the `getcommitdata` SSH endpoint.
This is going to be used to remove the requirement that client repositories
have all commits locally.
Reviewed By: krallin
Differential Revision: D22979458
fbshipit-source-id: 75d7265daf4e51d3b32d76aeac12207f553f8f61
Summary:
Instead of modifying the existing APIs and marking all callsites as deprecated in one diff, I am going to take the "add and remove" approach, where I will add the deprecated version of methods first, then mark all callsites, and finally remove the existing ones. This provides some backward compatibility without breaking things.
This diff is to add deprecated getSelection APIs to ServiceSelectorCache.
Differential Revision: D22981269
fbshipit-source-id: 6e3025e7f7df6ee7f9e1cba9dc036ca84adbe49a
Summary:
Previously we fetched metadata by commit hash and path. We knew this would be a
little more expensive, but it turns out to be a lot more expensive.
Wait why is it expensive?
In short: lots of extra lookups that are not satisfied by cache :(
In long:
1. Each piece of the path would require a read to fetch the fsnode for that tree.
So this means asking for the metadata of a/b/c/d/e means 5 reads.
2. Normally these reads could be cached, but often we would make these requests
with a commit hash for a draft commit. On the server side this info is not
cached for a draft commit, this means a lot of database reads and recalculating.
(Most of the real uses of metadata prefetching are when an engineer is working
on a local commit. We just use the commit hash of the commit the user was on
when fetching metadata for a tree, even if that tree hasn't changed since a public
commit, so this means lots of requests with draft commit hashes.)
By fetching by manifest id, we are able to bypass this sequential path lookup.
(And even if we are on a draft commit, if the tree has not changed locally
since a public commit, the manifest id will be the same as in the public commit,
avoiding this whole draft commit issue.)
This allows us to query scs with a manifest id for a tree.
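A toy model of the difference in read counts, under the assumption that each tree is stored under its manifest id and maps child names to child manifest ids. `CountingStore` and `tree_id_by_path` are hypothetical names and do not reflect the real server API.

```python
class CountingStore:
    # Hypothetical store: manifest id -> {child name: child manifest id}.
    def __init__(self, trees):
        self.trees = trees
        self.reads = 0

    def read(self, manifest_id):
        self.reads += 1  # count every lookup to make the cost visible
        return self.trees[manifest_id]


def tree_id_by_path(store, root_id, path):
    # Resolve a path from the root: one read per path component.
    current = root_id
    for name in path.split("/"):
        current = store.read(current)[name]
    return current


store = CountingStore({
    "root": {"a": "A"}, "A": {"b": "B"}, "B": {"c": "C"},
    "C": {"d": "D"}, "D": {"e": "E"}, "E": {},
})
tree_id_by_path(store, "root", "a/b/c/d/e")
reads_by_path = store.reads       # one read per component

store.reads = 0
store.read("E")                   # manifest id already known
reads_by_id = store.reads         # a single read
```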
Reviewed By: wez
Differential Revision: D22990687
fbshipit-source-id: aa81d67de1f1d04a14d174774ee216f5ac6be5ba
Summary:
The query we use to select blobs to heal is naturally expensive, due to the use of a subquery. This means that finding the perfect queue limit is hard, and we depend on task restarts to handle brief overload of MySQL.
Give us a fast fall in batch size (halve on each failure), a slow climb back (10% on each success), and a random delay after each failure before retrying.
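A minimal sketch of this policy. The class name, the minimum/maximum bounds and the upper bound on the delay are assumptions; only the halving, the 10% climb and the random delay come from the summary.

```python
import random


class AdaptiveBatchSize:
    # Hypothetical helper tracking the current batch size.
    def __init__(self, initial, minimum=1, maximum=None):
        self.size = initial
        self.minimum = minimum
        self.maximum = initial if maximum is None else maximum

    def on_failure(self):
        # Fast fall: halve the batch size, never going below the minimum.
        self.size = max(self.minimum, self.size // 2)
        return self.size

    def on_success(self):
        # Slow climb: grow by 10% (at least 1), capped at the maximum.
        grown = max(self.size + 1, int(self.size * 1.1))
        self.size = min(self.maximum, grown)
        return self.size


def retry_delay(max_seconds=5.0, rng=random):
    # Random delay before retrying after a failure (bound is an assumption).
    return rng.uniform(0.0, max_seconds)
```

Halving reacts quickly to a brief MySQL overload, while the slow climb avoids immediately jumping back to a batch size that just failed.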
Reviewed By: StanislavGlebik
Differential Revision: D23028518
fbshipit-source-id: f2909fe792280f81d604be99fabb8b714c1e6999
Summary:
All of these are already valid utf-8 characters, no need to dance to
decode/encode them again.
Reviewed By: DurhamG
Differential Revision: D22978828
fbshipit-source-id: c5f6e25e71cdcaa1c0558d4a1181b667ffe379fb
Summary:
The StringConv.h header contains many functions to convert from Windows paths
to Eden's paths (and vice versa) to work around the fact that Eden's paths don't
support the wide strings that Windows uses. Let's simply add support for these wide
strings in PathFuncs so we can greatly simplify all the call sites. Instead of
calling "edenToWinName(winstr)", "PathComponent(winstr)" is both more
descriptive and more idiomatic.
To be fair, I'm not entirely a fan of the approach taken in this diff, as this
adds Windows specific code to PathFuncs.h, but I feel that the benefit is too
big to not do that.
Reviewed By: chadaustin
Differential Revision: D23004523
fbshipit-source-id: 3a1507e398a66909773251907db01e06603b91dd
Summary:
`is_tree` wasn't part of the cache key, which means we could have returned
incorrect history if we had a file and a directory with the same name.
This diff fixes it.
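To illustrate the collision, here is a small sketch with a hypothetical `history_cache_key` helper and an in-memory dict standing in for the cache. The real key layout is not shown in this summary.

```python
def history_cache_key(path, is_tree):
    # Hypothetical key: including is_tree keeps a file and a directory
    # with the same name in separate cache entries.
    return (path, is_tree)


cache = {}
cache[history_cache_key("foo", False)] = ["history of file foo"]
cache[history_cache_key("foo", True)] = ["history of directory foo"]
```

If the key were the path alone, the second assignment would overwrite the first and later lookups for the file would return the directory's history.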
Reviewed By: krallin
Differential Revision: D23028527
fbshipit-source-id: 98a3b2028fa62231dfb570a76fb836374ce1eed0
Summary:
I noticed that fastreplay doesn't init tunables, and that means that it doesn't
get the updates, and more importantly it doesn't use default values of
tunables.
That doesn't look expected (but lmk if I'm wrong!)
Reviewed By: krallin
Differential Revision: D23027311
fbshipit-source-id: ee43d02457d2240ebeb1530c672cb3847bc3afd4
Summary: This has my into_key() PR https://github.com/xacrimon/dashmap/pull/91 merged so the patch pointing to my fork is also removed.
Reviewed By: farnz
Differential Revision: D22896911
fbshipit-source-id: 188d438ce2aa20cfb3c466a62227d1cd27625f74