Summary: The first request was way too slow as the cache was not yet warm. This speeds up the initial request substantially.
Reviewed By: StanislavGlebik
Differential Revision: D13817064
fbshipit-source-id: 0f2a01395743ef848e6bc4a5e71c0562b268c0cf
Summary:
Previously, pushrebasing an empty commit failed because we assumed that the root
manifest of a commit is always sent in a bundle. This diff removes this
assumption.
Reviewed By: lukaspiatkowski
Differential Revision: D13818556
fbshipit-source-id: 44e96374ae343074f48e42a90c691b21e3c41386
Summary:
This is required to cover corner cases when the client has some stacks and part of them became public.
The calculation of public roots happens for draft heads only, so it doesn't change the performance of hg pull.
Reviewed By: StanislavGlebik
Differential Revision: D13742685
fbshipit-source-id: d8c8bc357628b9b513bbfad4a82a7220d143f364
Summary:
There is not much point in keeping it since we have skiplists, which should solve
the same problems in a better way.
The only case where CachingChangesets may be useful is when many users fetch a
lot of commits simultaneously. That may happen when we merge a new big repository.
However, the current implementation of CachingChangesets won't help with that since
we do not update its indexes.
Reviewed By: lukaspiatkowski
Differential Revision: D13695201
fbshipit-source-id: 2a4600eccf8224453ca13047e5a2ef3a0af650e3
Summary:
File content blobs are thrift-encoded in Mononoke. This is done so
that we can change the encoding of content blobs easily. For example, we can
add compression or split the blobs into chunks.
However, there is a problem. At the moment the file content blob key is a hash of
the actual data that's written to the blobstore, i.e. of the thrift-encoded data. That
means that if we add compression or change the thrift encoding in any way, the
file content blob key changes, and that changes the commit hashes.
This is wrong. To fix it, let's use the hash of the actual file content as the key.
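A minimal sketch of the idea, with hypothetical names and assuming the `blake2` and `hex` crates: the key is derived from the raw content, while the stored bytes remain thrift-encoded.
```
use blake2::{Blake2b512, Digest};

// Hypothetical key derivation: hash the *raw* file content, not the encoded
// bytes, so changing compression or thrift encoding never changes the key.
fn content_key(raw_content: &[u8]) -> String {
    format!("content.blake2.{}", hex::encode(Blake2b512::digest(raw_content)))
}

fn store(
    blobstore: &mut std::collections::HashMap<String, Vec<u8>>,
    raw_content: &[u8],
    thrift_encoded: Vec<u8>,
) {
    // The stored value may be re-encoded at any time; the key stays stable.
    blobstore.insert(content_key(raw_content), thrift_encoded);
}
```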
Reviewed By: farnz
Differential Revision: D12884898
fbshipit-source-id: e60a7b326c39dad86e2b26c6f637defcb0acc8e8
Summary:
The bulk API makes fewer queries to mysql and is therefore more efficient.
This is especially important for `hg pull` requests, where the list of heads is very large.
Reviewed By: lukaspiatkowski
Differential Revision: D13677298
fbshipit-source-id: 3dec1b3462c520c11481325e82523ef7a6ae6516
Summary:
This version is still missing:
- proper production-ready logging
- smarter handling of the case where queue entries related to each other do not fit within the batch `limit` or the `older_than` window, so the healer will heal many more entries without realizing it shouldn't do so.
Reviewed By: aslpavel
Differential Revision: D13528686
fbshipit-source-id: 0245becea7e4f0ac69383a7885ff3746d81c4add
Summary: MyRouter and cache support will come in the next diffs
Reviewed By: StanislavGlebik
Differential Revision: D13465146
fbshipit-source-id: 0ede0e875d4a49794ff71173bd0d1563abb3ff08
Summary: Format files affected by the next commit in the stack
Reviewed By: StanislavGlebik
Differential Revision: D13650639
fbshipit-source-id: d4e37acd2bcd29b291968a529543c202f6944e1a
Summary:
It breaks the pushrebase test.
Original commit: 4e084bee13ff4941d1a42d1f75fe501575858a63
Original diff: D13573105
Reviewed By: StanislavGlebik
Differential Revision: D13651039
fbshipit-source-id: b67c32e0fc4acc953265a089e746ede3d4426b6f
Summary:
After some discussion with Pavel Aslanov, Lukas Piatkowski and Stanislau Hlebik, it was evident that a shared future is the best approach for the bookmarks cache.
The cache in this implementation maintains a shared future for each repo, fetching the full list of bookmarks. When a list of bookmarks with a given prefix is required, a filter is applied to the full-list future.
Two locks are used in this implementation: one for adding new repos to the hashtable and one for updating the cache. In both cases the optimistic strategy "let's first grab a read lock and check whether it is good enough" is applied.
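A minimal sketch of the optimistic locking pattern with hypothetical types (the real cache stores shared futures per repo rather than plain values):
```
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

type Bookmarks = Arc<Vec<String>>;

struct BookmarksCache {
    per_repo: RwLock<HashMap<String, Bookmarks>>,
}

impl BookmarksCache {
    fn get(&self, repo: &str) -> Bookmarks {
        // Optimistic step: grab the read lock first and check whether the
        // existing entry is good enough.
        if let Some(bookmarks) = self.per_repo.read().unwrap().get(repo) {
            return Arc::clone(bookmarks);
        }
        // Slow path: take the write lock; `entry` re-checks under the lock
        // in case another thread inserted the repo in the meantime.
        let mut per_repo = self.per_repo.write().unwrap();
        Arc::clone(
            per_repo
                .entry(repo.to_string())
                .or_insert_with(|| Arc::new(fetch_all_bookmarks(repo))),
        )
    }
}

// Placeholder for the future that fetches the full bookmark list.
fn fetch_all_bookmarks(_repo: &str) -> Vec<String> {
    Vec::new()
}
```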
Reviewed By: StanislavGlebik
Differential Revision: D13573105
fbshipit-source-id: 4e084bee13ff4941d1a42d1f75fe501575858a63
Summary:
When receiving an infinitepush bundle, don't store the filenodes for the
commit in the filenodes table.
When a client pulls these commits, we will reconstruct the filenode info from
the blobstore data. However, the client will receive null linknodes and will need
to use adjustlinknode to compute the correct linknode.
Reviewed By: StanislavGlebik
Differential Revision: D13467110
fbshipit-source-id: 739b06f30a530a159352ffbf612d136c9c831aeb
Summary:
For draft commits we will stop storing the filenodeinfo for filenodes
introduced by those commits in the database. This means the filenodeinfo lookup
may fail. For these cases, reconstruct the filenodeinfo from the blob in the
blobstore, setting the missing linknode to the null changeset id.
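A hypothetical sketch of this fallback (names invented for illustration):
```
// Null changeset id used as the placeholder linknode; the client fixes it
// up later via adjustlinknode.
const NULL_LINKNODE: [u8; 20] = [0; 20];

struct FilenodeInfo {
    linknode: [u8; 20],
    // ... other fields would be reconstructed from the blobstore envelope
}

// If the database lookup failed, rebuild the info with a null linknode.
fn lookup_filenode(db_result: Option<FilenodeInfo>) -> FilenodeInfo {
    db_result.unwrap_or(FilenodeInfo {
        linknode: NULL_LINKNODE,
    })
}
```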
Reviewed By: StanislavGlebik
Differential Revision: D13467112
fbshipit-source-id: 27ad406723a6affd80e7c3b2dc538b03004451ec
Summary: There's nothing Mercurial-specific about identifying a repo. This also outright removes some dependencies on mercurial-types.
Reviewed By: StanislavGlebik
Differential Revision: D13512616
fbshipit-source-id: 4496a93a8d4e56cd6ca319dfd8effc71e694ff3e
Summary: Similar to get_bookmarks_maybe_stale, but reads from the master db
Reviewed By: markbt
Differential Revision: D13417055
fbshipit-source-id: 805cbe3953a6c0a2380c0168eb403c6e9e0551c9
Summary: Restructure the configs so that we can specify more than one blobstore
Reviewed By: lukaspiatkowski
Differential Revision: D13234286
fbshipit-source-id: a98ede17921ed6148add570288ac23636b086398
Summary:
Previously metaconfig depended on BlobRepo, and so ManifoldArgs had to be
defined in BlobRepo. That was a weird dependency, but a necessary one
because of the Mononoke config repo. In the previous diffs we got rid of the
Mononoke config repo, so now we can reverse the dependencies.
Reviewed By: lukaspiatkowski
Differential Revision: D13180160
fbshipit-source-id: efe713ce3b160c98d56fc13559c57a920146841f
Summary:
Now that Rust macros can be `use`d like normal symbols, `stats` can
simply import the `lazy_static!` macro without requiring its users to do it.
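A sketch of the pattern (hypothetical paths): with Rust 2018 macro imports, `stats` can re-export the macro so its users pick it up like any other symbol.
```
// In the stats crate: re-export the macro so users of `stats` don't need
// their own `#[macro_use] extern crate lazy_static;`.
pub use lazy_static::lazy_static;
```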
Reviewed By: Imxset21
Differential Revision: D13281897
fbshipit-source-id: a6780fbace07dd784308e642d4a384322a17c367
Summary:
Let's add a command that builds and reads skiplist indexes. These indexes will
be used by the getbundle wireproto request to decrease latency and cpu usage.
Note that we are saving only the longest "jump" from the skiplist. This is done
in order to save space.
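A hypothetical sketch of what keeping only the longest jump means: each node stores a single edge instead of a full tower, and traversal falls back to parent steps when the jump would overshoot.
```
use std::collections::HashMap;

// One edge per changeset: its single longest known jump toward the root.
struct LongestJump {
    ancestor: u64, // changeset the edge jumps to
    length: u64,   // how many generations the jump covers
}

type SkiplistIndex = HashMap<u64, LongestJump>;

// Walk toward an ancestor at or below `target_gen`, taking the long jump
// when it does not overshoot and stepping to the parent otherwise.
fn ancestor_at(
    index: &SkiplistIndex,
    parent: impl Fn(u64) -> u64,
    gen: impl Fn(u64) -> u64,
    mut node: u64,
    target_gen: u64,
) -> u64 {
    while gen(node) > target_gen {
        match index.get(&node) {
            Some(jump) if gen(node) - jump.length >= target_gen => node = jump.ancestor,
            _ => node = parent(node),
        }
    }
    node
}
```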
Reviewed By: jsgf
Differential Revision: D13169018
fbshipit-source-id: 4d654284b0c0d8a579444816781419ba6ad86baa
Summary:
Let's make it use the same ChangesetFetcher that getbundle already does. It will
be used in the next diffs.
Reviewed By: lukaspiatkowski
Differential Revision: D13122344
fbshipit-source-id: 37eba612935a209098a245f4be0af3bc18c5787e
Summary:
Most of our revsets are already migrated; let's migrate skiplists as well, since
we want to use them in getbundle requests.
Reviewed By: lukaspiatkowski
Differential Revision: D13083910
fbshipit-source-id: 4c3bc40ccff95c3231c76b9e920af5db31b80d01
Summary:
We've recently found that the `known()` wireproto request gets much slower when we
send more traffic to Mononoke jobs. Other wireproto methods looked fine, and cpu
and memory usage were fine as well.
Background: a `known()` request takes a list of hg commit hashes and returns
which of them Mononoke knows about.
One thing we've noticed is that the `known()` handler sends db requests sequentially.
Experiments with sending `known()` requests with commit hashes that Mononoke
didn't know about confirmed that its latency got higher the more parallel
requests we sent. We suspect this is because Mononoke has to send requests to
the db master, and we limit the number of master connections.
A thing that should help is batching the requests, i.e. instead of sending many
requests asking whether a single hg commit exists, send one request for many
commits at once.
That change also required changes to the bonsai-mapping caching layer to do
batch cache requests.
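A sketch of the batching idea with hypothetical table and column names: a single `IN (...)` query replaces N sequential single-hash queries.
```
// Before: one query per hash, issued sequentially.
//   SELECT hg_cs_id FROM bonsai_hg_mapping WHERE hg_cs_id = ?
// After: a single batched query covering every hash in the request.
fn known_query(num_hashes: usize) -> String {
    let placeholders = vec!["?"; num_hashes].join(", ");
    format!(
        "SELECT hg_cs_id FROM bonsai_hg_mapping WHERE hg_cs_id IN ({})",
        placeholders
    )
}
```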
Reviewed By: lukaspiatkowski
Differential Revision: D13194775
fbshipit-source-id: 47c035959c7ee12ab92e89e8e85b723cb72738ae
Summary:
Currently we read all bookmarks from the primary replica a few times during `hg
pull`: first when we do listkeys, and again when we get heads.
That might create a high load on the primary replica.
However, the delay between primary and secondary replicas is fairly small, so it
*should* be fine to read bookmarks from the secondary local replica as long as there is
only one replica per region (because if we have a few replicas per region, then the
heads and listkeys responses might be inconsistent).
Reviewed By: lukaspiatkowski
Differential Revision: D13039779
fbshipit-source-id: e1b8050f63a3a05dc6cf837e17a448c3b346b723
Summary:
According to the [Git-LFS Plan](https://www.mercurial-scm.org/wiki/LfsPlan), `getfiles` should return the file in the [following format](https://www.mercurial-scm.org/wiki/LfsPlan#Metadata_format) instead of the file content:
```
oid: sha256.SHA256HASH
size: size_int
```
The hg client requests files using the sha1 hgfilenode hash. To calculate the sha256 of the content, Mononoke fetches the file from the blobstore into memory and calculates the sha256.
This gives no benefit in time or memory consumption compared to a non-LFS transfer in Mononoke.
*Solution:*
Put a `key-value` entry into the blobstore after the first request of the file. That is, after the hg client has requested the sha256 of the file for the first time, calculate it and put it into the blobstore.
Subsequent requests for the sha256 of the file content then avoid recalculating it in Mononoke; they return the sha256 saved in the blob.
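A minimal sketch of the caching step with invented key names:
```
use std::collections::HashMap;
use sha2::{Digest, Sha256};

// Return the sha256 for a file, computing and storing it only on the
// first request; later requests read the cached blob instead.
fn sha256_for_file(
    blobstore: &mut HashMap<String, Vec<u8>>,
    filenode_key: &str,
    content: &[u8],
) -> Vec<u8> {
    let alias_key = format!("alias.sha256_of.{}", filenode_key);
    if let Some(cached) = blobstore.get(&alias_key) {
        return cached.clone(); // no recalculation on repeated requests
    }
    let sha256 = Sha256::digest(content).to_vec();
    blobstore.insert(alias_key, sha256.clone());
    sha256
}
```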
Reviewed By: StanislavGlebik
Differential Revision: D13021826
fbshipit-source-id: 692e01e212e7d716bd822fa968e87abed5103aa7
Summary:
Mononoke requires several references to the same blob in the blobstore.
Sha256 aliases are good example. [post](https://fb.facebook.com/groups/scm.mononoke/permalink/739273266435251/)
Short description of alias mechanism:
- we have `key: value` blob in blobstore.
- put a `key1: key` blob in blobstore to have 2-step access from `key1` to `value`.
All the keys in Mononoke are of the form `type_prefix.hash_name.hash`
I expanded MononokeId interface to have an access to the prefix `type_prefix.hash_name` for verification `key` content (see alias mechanism description).
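For illustration, a two-step lookup through an alias, assuming a hypothetical key layout that follows the `type_prefix.hash_name.hash` form:
```
use std::collections::HashMap;

// `alias.sha256.<hash>` stores the content key, e.g. `content.blake2.<hash>`,
// which in turn stores the value itself: key1 -> key -> value.
fn resolve_alias(blobstore: &HashMap<String, Vec<u8>>, alias_key: &str) -> Option<Vec<u8>> {
    let content_key = String::from_utf8(blobstore.get(alias_key)?.clone()).ok()?;
    blobstore.get(&content_key).cloned()
}
```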
Reviewed By: farnz
Differential Revision: D13084145
fbshipit-source-id: 5b8a4e80869481414a7356ccd7c9aab6e24a5138
Summary:
Purpose:
- A sha256 alias link to file_content is required for LFS getfiles to work correctly.
The LFS protocol uses SHA-256 to refer to file content, while Mononoke uses Blake2.
To support LFS in Mononoke we need to set up a link from the SHA-256 hash of the content to the Blake2 hash of the content.
These links are called aliases.
- Aliases are uploaded together with file content blobs,
but only for new push operations.
- If a repo was blobimported from somewhere, we need to make sure that all the links are in the blobstore.
If a repo was blobimported before aliases were added, it may be missing aliases for some blobs.
- This tool can be used to
  - find out whether any aliases are missing
  - fill in missing aliases.
Implementation:
- Run in a repo.
- Iterate through all changesets.
- Go through all the file_content blobs in the changesets.
- Verify/generate alias256 links to file_content blobs.
Modes supported:
- verify: count the number of errors and print them to the console
- generate: if an alias blob is missing, add it to the blobstore
Reviewed By: StanislavGlebik
Differential Revision: D10461827
fbshipit-source-id: c2673c139e2f2991081c4024db7b85953d2c5e35
Summary:
Added a `get_stats() -> HashMap<String, Box<Any>>` method for all ChangesetFetchers.
The CachingChangesetFetcher now returns statistics for `cache.misses: usize`, `cache.hits: usize`, `fetches.from.blobstore: usize` and `max.latency: Duration`.
Reviewed By: StanislavGlebik
Differential Revision: D10852637
fbshipit-source-id: 34114fd94c47aa26ea525fcc4ff76ad60827bc71
Summary:
Sharding filenodes by path should stop us from knocking over databases -
make it configurable.
Reviewed By: StanislavGlebik
Differential Revision: D12894523
fbshipit-source-id: e27452f9b436842e1cb5e9e0968c1822f422b4c9
Summary:
Bookmarks point to Bonsai changesets. So previously we were fetching the bonsai
changeset for a bookmark, then converting it to an hg changeset in the `get_bookmark`
method, then converting it back to bonsai in `pushrebase.rs`.
This diff adds a `get_bonsai_bookmark()` method that removes these useless
conversions.
Reviewed By: farnz
Differential Revision: D10427433
fbshipit-source-id: 1b15911fc5d77483b5a135a8d4484fccff23c774
Summary:
getfiles implementation for lfs.
The implementation is the following:
- get the file size from the file envelope (retrieved from manifold by HgNodeId)
- if the file size > threshold from the lfs config:
  - fetch the file into memory and get the sha256 of the file; this will be fixed later, as this approach consumes a lot of memory, but we don't have any mapping from sha256 to blake2 [T35239107](https://our.intern.facebook.com/intern/tasks/?t=35239107)
  - generate the lfs metadata file according to the [LfsPlan](https://www.mercurial-scm.org/wiki/LfsPlan), as sketched below
  - set the metakeyflag (REVID_STORED_EXT) in the file header
- if the file size < threshold, process the usual way
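For illustration, rendering the metadata file in the LfsPlan format shown earlier might look like this minimal sketch:
```
// Build the metadata returned instead of the raw content for large files.
fn lfs_metadata(sha256_hex: &str, size: u64) -> String {
    format!("oid: sha256.{}\nsize: {}\n", sha256_hex, size)
}

fn main() {
    assert_eq!(
        lfs_metadata("deadbeef", 1048576),
        "oid: sha256.deadbeef\nsize: 1048576\n"
    );
}
```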
Reviewed By: StanislavGlebik
Differential Revision: D10335988
fbshipit-source-id: 6a1ba671bae46159bcc16613f99a0e21cf3b5e3a