Summary:
Previously we had a timeout per session, i.e. multiple wireproto commands would
share the same timeout. This had a few disadvantages:
1) The main disadvantage was that if a connection timed out we didn't log stats
such as the number of files, response size, etc., and we didn't log parameters
to scribe. The latter is an even bigger problem, because we usually want to
replay the requests that were slow and timed out, not the requests that
finished quickly.
2) The less important disadvantage is that we have clients that make a small
request to the server and then keep the connection open for a long time.
Eventually we kill the connection and log it as an error. With this change
the connection will stay open until the client closes it. That might
potentially be a problem, and if that's the case we can reintroduce a
per-connection timeout.
Initially I was planning to use tokio::timer::Timeout to implement all the
timeouts, but it behaves differently for streams: it only allows setting a
per-item timeout, while we want a timeout for the whole stream
(https://docs.rs/tokio/0.1/tokio/timer/struct.Timeout.html#futures-and-streams).
To overcome this I implemented a simple combinator, StreamWithTimeout, which
does exactly what I want.
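A minimal sketch of such a combinator in futures 0.1 / tokio 0.1 style (the names and the simplified error handling here are illustrative, not the exact implementation):
```
use std::time::{Duration, Instant};

use futures::{Async, Future, Poll, Stream};
use tokio::timer::Delay;

pub enum StreamTimeoutError<E> {
    Timeout,
    Inner(E),
}

pub struct StreamWithTimeout<S> {
    stream: S,
    // One deadline for the whole stream, set at construction and never
    // reset per item (unlike tokio::timer::Timeout applied to a stream).
    deadline: Delay,
}

impl<S> StreamWithTimeout<S> {
    pub fn new(stream: S, timeout: Duration) -> Self {
        Self {
            stream,
            deadline: Delay::new(Instant::now() + timeout),
        }
    }
}

impl<S: Stream> Stream for StreamWithTimeout<S> {
    type Item = S::Item;
    type Error = StreamTimeoutError<S::Error>;

    fn poll(&mut self) -> Poll<Option<S::Item>, Self::Error> {
        // If the whole-stream deadline has fired, fail the stream.
        if let Ok(Async::Ready(())) = self.deadline.poll() {
            return Err(StreamTimeoutError::Timeout);
        }
        self.stream.poll().map_err(StreamTimeoutError::Inner)
    }
}
```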
Reviewed By: HarveyHunt
Differential Revision: D13731966
fbshipit-source-id: 211240267c7568cedd18af08155d94bf9246ecc3
Summary:
The extension was not enabled for repo pull (the second repo).
`hg pull -r` was still working, but other things like `hg up <commit cloud hash>` were not,
which caused a bit of confusion.
It is cleaner to enable the extension for both sides.
Reviewed By: StanislavGlebik
Differential Revision: D13710518
fbshipit-source-id: 231aec1a71a5c13d707c2b361ce77158573b93f0
Summary:
There is not much point in keeping it, since we have skiplists, which should
solve the same problems in a better way.
The only case where CachingChangesets may be useful is when many users fetch a
lot of commits simultaneously, which may happen when we merge a new big
repository. However, the current implementation of CachingChangesets won't help
with that, since we do not update its indexes.
Reviewed By: lukaspiatkowski
Differential Revision: D13695201
fbshipit-source-id: 2a4600eccf8224453ca13047e5a2ef3a0af650e3
Summary:
Previously, to get the copy/move source we had to join the `paths` and
`fixedcopyinfo` tables. That worked fine when we had just one shard. However,
now we have many shards, and the join no longer works: the move source path may
live in a different shard than the move destination path, in which case the
join returns no data.
Consider this situation: shardA contains all the data for pathA, and shardB
contains all the data for pathB. That means the sharded `paths` table will
have pathA in shardA and pathB in shardB. Then, if file pathA was copied from
pathB, the `fixedcopyinfo` table in shardA contains the path_hash of pathB.
However, joining shardA's `fixedcopyinfo` with shardA's `paths` to convert
path_hash to path fails, because pathB is in shardB.
The only possible fix is to split the operation in two steps: fetching the
path_hash from `fixedcopyinfo`, then converting the path_hash to a path, as in
the sketch below.
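A toy in-memory model of the sharded tables that illustrates the two-step fetch (all names here are hypothetical, not Mononoke's actual code):
```
use std::collections::HashMap;

type PathHash = u64;

struct Shard {
    // `paths` table: path_hash -> path, only for paths this shard owns.
    paths: HashMap<PathHash, String>,
    // `fixedcopyinfo` table: destination path_hash -> source path_hash.
    fixedcopyinfo: HashMap<PathHash, PathHash>,
}

fn shard_for(hash: PathHash, num_shards: usize) -> usize {
    hash as usize % num_shards
}

// Step 1: read the source path_hash from the *destination's* shard.
// Step 2: resolve that hash on the shard that owns the *source* path,
// which may be a different shard - exactly why a single-shard join fails.
fn copy_source(shards: &[Shard], dst_hash: PathHash) -> Option<&str> {
    let dst_shard = &shards[shard_for(dst_hash, shards.len())];
    let src_hash = *dst_shard.fixedcopyinfo.get(&dst_hash)?;
    let src_shard = &shards[shard_for(src_hash, shards.len())];
    src_shard.paths.get(&src_hash).map(String::as_str)
}
```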
I don't think we'll be able to keep the join-based logic we have at the
moment: it would require storing all paths on all shards, which is infeasible
because it would make writes much slower.
Reviewed By: aslpavel
Differential Revision: D13690141
fbshipit-source-id: 16b5cae6f23c162bb502b65c208f3ca9e443fb04
Summary:
Going to change these files in the next diff. To make the next diff smaller,
the formatting changes are split out into this diff.
Reviewed By: aslpavel
Differential Revision: D13690143
fbshipit-source-id: 124232b832d8c67ee7fe931ef174230cb09ff564
Summary:
File content blobs are thrift encoded in Mononoke. This is done so
that we can change the encoding of content blobs easily. For example, we can
add compression or split the blobs into chunks.
However, there is a problem. At the moment the file content blob key is a hash
of the actual data that's written to the blobstore, i.e. of the thrift-encoded
data. That means that if we add compression or change the thrift encoding in
any way, the file content blob key changes, and that changes the commit hashes.
This is wrong. To fix it, let's use the hash of the actual file content as the
key.
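A sketch of the keying change; `digest` stands in for the real content hash, and the key format is hypothetical:
```
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn digest(bytes: &[u8]) -> u64 {
    // Stand-in for the real cryptographic content hash.
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

// Old: key derived from the thrift-encoded bytes, so any change to the
// encoding (compression, chunking) changes the keys and therefore the
// commit hashes that embed them.
fn blob_key_old(thrift_encoded_blob: &[u8]) -> String {
    format!("content.{:x}", digest(thrift_encoded_blob))
}

// New: key derived from the raw file content, stable regardless of how
// the blob is encoded at rest.
fn blob_key_new(raw_file_content: &[u8]) -> String {
    format!("content.{:x}", digest(raw_file_content))
}
```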
Reviewed By: farnz
Differential Revision: D12884898
fbshipit-source-id: e60a7b326c39dad86e2b26c6f637defcb0acc8e8
Summary:
Mercurial has a hack to determine if a file was renamed: copy metadata is only
checked if p1 is None. Note that this hack exists purely to make finding
renames faster, and we don't need it in Mononoke. So let's just read the copy
metadata.
This diff also removes the `maybe_copied()` method and unused code like
`Symlink`.
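An illustrative sketch of the difference, with hypothetical types (the real filenode structures in Mercurial and Mononoke differ):
```
struct FileNodeInfo {
    p1: Option<u64>,                  // first-parent filenode, if any
    copy_from: Option<(String, u64)>, // copy metadata: (source path, filenode)
}

// Mercurial's hack: only consult copy metadata when p1 is None, purely
// to make finding renames faster.
fn copied_with_hack(node: &FileNodeInfo) -> Option<&(String, u64)> {
    if node.p1.is_none() {
        node.copy_from.as_ref()
    } else {
        None
    }
}

// Mononoke doesn't need the shortcut: just read the copy metadata.
fn copied(node: &FileNodeInfo) -> Option<&(String, u64)> {
    node.copy_from.as_ref()
}
```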
Reviewed By: farnz
Differential Revision: D12826409
fbshipit-source-id: 53792218cb61fcba96144765790278d17eecdbb1
Summary:
As you can verify, a query like this:
```
select * from demo WHERE `name` IN ()
```
is fine for sqlite but **invalid syntax** in MySQL (an empty list of values).
The error will be similar to this:
```
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ') LIMIT 10000' at line 1; 'select * from phases WHERE repo_id IN () LIMIT 10000'
```
Since sqlite accepts it, such errors usually only show up in production.
It is better to have an emptiness check right before calling queries that take
lists, as in the sketch below.
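A minimal sketch of the guard; `run_select_phases` is a hypothetical stand-in for the real `SELECT ... WHERE repo_id IN (...)` query:
```
// Stand-in for the real query; here it just fabricates results.
fn run_select_phases(repo_ids: &[i32]) -> Vec<String> {
    repo_ids.iter().map(|id| format!("phase-for-{}", id)).collect()
}

fn fetch_phases(repo_ids: &[i32]) -> Vec<String> {
    // MySQL rejects `WHERE repo_id IN ()` as a syntax error even though
    // sqlite accepts it, so return early instead of issuing the query.
    if repo_ids.is_empty() {
        return Vec::new();
    }
    run_select_phases(repo_ids)
}
```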
Reviewed By: lukaspiatkowski
Differential Revision: D13704726
fbshipit-source-id: a9fb3a2e21e88b3af14f57917c2004454eb42531
Summary:
Reformat several Rust files with the current rustfmt, to make the linter
happy.
Reviewed By: yfeldblum
Differential Revision: D13683205
fbshipit-source-id: f7a02dae0fbe095b6acde4de380aca2acfedf39d
Summary: This diff is created to separate the lint formatting work from the rest of the code changes in D13632296
Reviewed By: lukaspiatkowski
Differential Revision: D13691680
fbshipit-source-id: 8e12016534d2e6066d803b51b5f12cbf6e89a822
Summary:
It showed up in our profiles because it does unnecessary copies, and for big
requests like getfiles that matters a bit. So let's fix it.
Reviewed By: aslpavel
Differential Revision: D13634952
fbshipit-source-id: 98be8bf7236eb12a4009b4b174ffac258f46e0f4
Summary:
Add data to the extra_context scuba field that includes the number of commits in the bundle
as well as certain stats from the changesetfetcher (such as cache misses).
Reviewed By: aslpavel
Differential Revision: D13528646
fbshipit-source-id: 4603d7e95182f4e36b5ef325651ec80997742ea0
Summary:
Update the wireproto command gettreepack to log the total size of the returned
treepacks, as well as the number that are returned.
Reviewed By: StanislavGlebik
Differential Revision: D13278254
fbshipit-source-id: aab9b6f42b11240a7b84bfda07bf99f15508043d
Summary:
Update the wireproto logging to log a summary of the getfiles requests, rather
than logging every individual request. This should reduce our logging to scuba.
This diff includes logging of (aggregated roughly as in the sketch after this list):
- Number of returned files
- Maximum file size
- Total size of files
- Maximum file request latency
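A rough sketch of how such a summary could be folded up per session (all names here are hypothetical):
```
#[derive(Default)]
struct GetfilesStats {
    files_returned: u64,
    max_file_size: u64,
    total_file_size: u64,
    max_latency_ms: u64,
}

impl GetfilesStats {
    // Fold one served file into the summary instead of logging it
    // individually; the totals are logged once at the end.
    fn record(&mut self, file_size: u64, latency_ms: u64) {
        self.files_returned += 1;
        self.max_file_size = self.max_file_size.max(file_size);
        self.total_file_size += file_size;
        self.max_latency_ms = self.max_latency_ms.max(latency_ms);
    }
}
```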
Reviewed By: aslpavel
Differential Revision: D13278256
fbshipit-source-id: 069318a718fe915995c7bbe25aa8ccb02c2372f8
Summary:
Some wireproto commands use WireProtoLogger to record information to
both scuba and scribe (for replay). Modify this struct to also allow
a PerfCounter struct to be logged to scuba but _not_ scribe.
This allows for logging of command-specific information to scuba, such as the
number of files requested.
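A minimal sketch of the asymmetry, with hypothetical names (the real WireProtoLogger and the scuba/scribe clients differ):
```
struct WireProtoLogger {
    scuba_sample: Vec<(String, String)>, // stand-in for a scuba sample
    scribe_line: String,                 // replay line destined for scribe
}

impl WireProtoLogger {
    // Returns the (scuba, scribe) payloads: perf counters are folded into
    // the scuba sample only, never into the scribe replay line.
    fn finish(mut self, perf: &[(&str, i64)]) -> (Vec<(String, String)>, String) {
        for (name, value) in perf {
            self.scuba_sample.push((name.to_string(), value.to_string()));
        }
        (self.scuba_sample, self.scribe_line)
    }
}
```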
Reviewed By: StanislavGlebik
Differential Revision: D13278255
fbshipit-source-id: 0ed364c8264ba3ae439746387126a7778712b860
Summary:
PerfCounters is a small wrapper around a concurrent hashmap
that can be used to store performance metrics. It is included in CoreContext
so that it can be used throughout the codebase.
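A minimal sketch of the idea, using a mutex-guarded map (the real PerfCounters wraps a concurrent hashmap, and the method names here are assumptions):
```
use std::collections::HashMap;
use std::sync::Mutex;

#[derive(Default)]
pub struct PerfCounters {
    counters: Mutex<HashMap<&'static str, i64>>,
}

impl PerfCounters {
    // Bump a named counter by `delta`, creating it on first use.
    pub fn add(&self, name: &'static str, delta: i64) {
        *self.counters.lock().unwrap().entry(name).or_insert(0) += delta;
    }

    // Read a counter, defaulting to zero if it was never touched.
    pub fn get(&self, name: &'static str) -> i64 {
        self.counters.lock().unwrap().get(name).copied().unwrap_or(0)
    }
}
```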
Reviewed By: aslpavel
Differential Revision: D13528647
fbshipit-source-id: 7c3f26ab8c0c7ba5ee619e85a069af7e7721037f
Summary:
The bulk API makes fewer queries to MySQL and is therefore more efficient.
This is especially important for `hg pull` requests, where the list of heads is very large.
Reviewed By: lukaspiatkowski
Differential Revision: D13677298
fbshipit-source-id: 3dec1b3462c520c11481325e82523ef7a6ae6516
Summary:
This version is still missing:
- proper production-ready logging
- smarter handling of the case where queue entries related to each other do not fit within the limit or the `older_than` limit, so the healer will heal many more entries without realizing it shouldn't do so.
Reviewed By: aslpavel
Differential Revision: D13528686
fbshipit-source-id: 0245becea7e4f0ac69383a7885ff3746d81c4add
Summary:
The remaining part of implementing bulk phases fetch and update.
This is required to optimize the number of MySQL queries we use to look up phases and bookmarks in the db.
The single get API has been removed and reimplemented by calling the bulk ones.
Reviewed By: aslpavel
Differential Revision: D13664900
fbshipit-source-id: 29342e86c057b92e331fadcebe51f452d9569e09
Summary: As opposed to other blobstores, this one uses memcache directly, because it even stores chunks in the cache.
Reviewed By: aslpavel
Differential Revision: D13487613
fbshipit-source-id: bf9eeaef4d795e4f2322f128fb8501ace619d8f1
Summary:
This is required to optimize the number of MySQL queries we use to look up phases and bookmarks in the db.
The next step is to add the same with memcache to caching.rs.
The implementation for the single get is replaced with just calling the implementation for the multiple get, as in the sketch below.
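A sketch of implementing the single get on top of the bulk get (names and types here are hypothetical, not the real phases API):
```
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
enum Phase {
    Draft,
    Public,
}

// Stand-in for the real bulk lookup, which issues one MySQL query with
// an `IN (...)` list; here it just fabricates results.
fn get_phases_bulk(cs_ids: &[u64]) -> HashMap<u64, Phase> {
    cs_ids.iter().map(|&id| (id, Phase::Public)).collect()
}

// The single get is now just the bulk get with a one-element list.
fn get_phase(cs_id: u64) -> Option<Phase> {
    get_phases_bulk(&[cs_id]).remove(&cs_id)
}
```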
Reviewed By: aslpavel
Differential Revision: D13658610
fbshipit-source-id: e3876044e2cbbefb156175c51ab7051db3885eb8
Summary: `from_value_opt` already knows how to properly parse an integer from a SQL response; no need to reinvent the wheel and introduce bugs.
Reviewed By: aslpavel
Differential Revision: D13651328
fbshipit-source-id: 55af810c99b93bd2f9c67c721a9d0be6034ee466
Summary: MyRouter and cache support will come in next diffs
Reviewed By: StanislavGlebik
Differential Revision: D13465146
fbshipit-source-id: 0ede0e875d4a49794ff71173bd0d1563abb3ff08
Summary: Instead of dumping the debug output we print the most important information: Changeset id, author, message and file changes.
Reviewed By: StanislavGlebik
Differential Revision: D13621492
fbshipit-source-id: ea0f93f58516cc759d0dc9aac14545b1827ea136
Summary: Format files affected by the next commit in the stack.
Reviewed By: StanislavGlebik
Differential Revision: D13650639
fbshipit-source-id: d4e37acd2bcd29b291968a529543c202f6944e1a
Summary:
`reachability_query` uses recursion, so we can run out of stack. However,
getting rid of the recursion is easy, because it was already done for the
`lca_hint` method; a generic sketch of the idea follows below.
I also used this diff as an opportunity to rename `src_hash` and `dst_hash` to
`maybe_descendant_hash` and `maybe_ancestor_hash` respectively, to make the
code clearer.
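A generic sketch of replacing the recursion with an explicit worklist for an ancestor check; the real code walks skiplist edges, and all names here are illustrative:
```
use std::collections::{HashMap, HashSet, VecDeque};

fn is_ancestor(
    parents: &HashMap<u64, Vec<u64>>,
    maybe_ancestor_hash: u64,
    maybe_descendant_hash: u64,
) -> bool {
    // The worklist lives on the heap, so traversal depth no longer
    // consumes call stack the way recursion would.
    let mut queue = VecDeque::new();
    let mut seen = HashSet::new();
    queue.push_back(maybe_descendant_hash);
    while let Some(node) = queue.pop_front() {
        if node == maybe_ancestor_hash {
            return true;
        }
        for &parent in parents.get(&node).into_iter().flatten() {
            if seen.insert(parent) {
                queue.push_back(parent);
            }
        }
    }
    false
}
```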
Reviewed By: lukaspiatkowski
Differential Revision: D13650422
fbshipit-source-id: 0e52ae592992208a03691b1a5c24021a4fb94313
Summary:
We need to add Python exports to our TARGETS file to enable scmquery
to create a ServiceRouter client for mononoke-api.
Reviewed By: StanislavGlebik
Differential Revision: D13651275
fbshipit-source-id: ba5a4eb3665dae1ea127f4ceb96c1ce62e1e4563
Summary:
It breaks the pushrebase test.
Original commit: 4e084bee13ff4941d1a42d1f75fe501575858a63
Original diff: D13573105
Reviewed By: StanislavGlebik
Differential Revision: D13651039
fbshipit-source-id: b67c32e0fc4acc953265a089e746ede3d4426b6f
Summary:
Use the correct skip index.
Sorry for some rustfmt noise.
Reviewed By: StanislavGlebik
Differential Revision: D13636059
fbshipit-source-id: 2815d82b63b86bda053f5a3a9a1b8a3b72abbf82
Summary: Format skiplist.rs file before we make changes to it.
Reviewed By: aslpavel
Differential Revision: D13650420
fbshipit-source-id: 394e07d94b57814fc9b7b345ffc81e06f95446d7
Summary:
These tests constantly fail because people add new logging from C++. Overriding
XLOG is not that easy, so let's just grep for lines we are interested in.
Reviewed By: HarveyHunt
Differential Revision: D13650651
fbshipit-source-id: 419ad55b1087212debb7aaba652b49ba24763fc4
Summary:
After some discussion with Pavel Aslanov, Lukas Piatkowski and Stanislau Hlebik, it was evident that a shared future is the best approach for the bookmarks cache.
The cache in this implementation maintains a shared future for each repo that fetches the full list of bookmarks. When a list of bookmarks with a given prefix is required, a filter is applied to the full-list future.
Two locks are used in this implementation: one for adding new repos to the hashtable and one for updating the cache. In both cases the optimistic strategy is applied: first grab a read lock and check whether it is good enough, as in the sketch below.
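A sketch of the optimistic "read lock first" pattern for the per-repo map (in the real cache the value would be a shared future of the full bookmark list, and all names here are assumptions):
```
use std::collections::HashMap;
use std::sync::RwLock;

struct PerRepoCache<V> {
    repos: RwLock<HashMap<u64, V>>,
}

impl<V: Clone> PerRepoCache<V> {
    fn get_or_insert_with(&self, repo_id: u64, make: impl FnOnce() -> V) -> V {
        // Optimistic path: a read lock is usually enough.
        if let Some(v) = self.repos.read().unwrap().get(&repo_id) {
            return v.clone();
        }
        // Slow path: take the write lock, re-checking in case another
        // thread inserted the entry in the meantime.
        let mut repos = self.repos.write().unwrap();
        repos.entry(repo_id).or_insert_with(make).clone()
    }
}
```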
Reviewed By: StanislavGlebik
Differential Revision: D13573105
fbshipit-source-id: 4e084bee13ff4941d1a42d1f75fe501575858a63