Commit Graph

27 Commits

Author SHA1 Message Date
Stanislau Hlebik
1d0c535123 mononoke: support reading streaming changelog chunks with tag
Reviewed By: ahornby

Differential Revision: D30015772

fbshipit-source-id: ca19f41b95ce0db43895b3c53009538d5712e239
2021-08-10 05:13:54 -07:00
Jan Mazur
b9820ec1b7 mononoke/server: include individual wireproto commands as qps before command execution
Summary:
A wireproto session multiplexes individual wireproto commands. Counting those commands individually is most likely a better QPS metric, even though we wouldn't be able to offload them to a different server/region one by one.

It makes the cost of a query more even across wireproto and edenapi.
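For illustration, a minimal sketch of the counting described above, with hypothetical names (the real server reports to Facebook's stats infrastructure, not to an atomic counter):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical per-command counter standing in for the real stats sink.
static WIREPROTO_QPS: AtomicU64 = AtomicU64::new(0);

fn dispatch_command(name: &str) {
    // Count the command as soon as it is dispatched, before execution,
    // so every command in a multiplexed session contributes to QPS.
    WIREPROTO_QPS.fetch_add(1, Ordering::Relaxed);
    println!("executing wireproto command: {}", name);
}

fn main() {
    // One session, three multiplexed commands: three QPS data points, not one.
    for cmd in ["heads", "known", "getbundle"] {
        dispatch_command(cmd);
    }
    assert_eq!(WIREPROTO_QPS.load(Ordering::Relaxed), 3);
}
```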

Reviewed By: krallin

Differential Revision: D28058054

fbshipit-source-id: 5d606841e07816ec8808a3b9aba4b7c0614b9cb6
2021-05-04 03:41:05 -07:00
Thomas Orozco
5186e6b92f mononoke: fix the build
Summary:
This is breaking with a warning because there's a method called `intersperse`
that might be introduced in the std lib:

```
stderr: error: a method with this name may be added to the standard library in the future
  --> eden/mononoke/hgproto/src/sshproto/response.rs:48:53
   |
48 |             let separated_results = escaped_results.intersperse(separator);
   |                                                     ^^^^^^^^^^^
   |
note: the lint level is defined here
  --> eden/mononoke/hgproto/src/lib.rs:14:9
```

This should fix it.
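For illustration, one conventional way to silence this lint is to call the method through the trait path instead of method syntax; the actual diff may have fixed it differently:

```rust
use itertools::Itertools;

fn main() {
    let escaped_results = vec!["a".to_string(), "b".to_string()];
    let separator = ";".to_string();

    // A fully-qualified call avoids the unstable_name_collisions warning that
    // plain `.intersperse(...)` method syntax triggers.
    let separated: Vec<String> =
        Itertools::intersperse(escaped_results.into_iter(), separator).collect();
    assert_eq!(separated.concat(), "a;b");
}
```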

Reviewed By: ikostia

Differential Revision: D27705212

fbshipit-source-id: 5f2f641ea6561c838288c8b158c6d9e134ec0724
2021-04-12 05:07:48 -07:00
Stanislau Hlebik
a58b18a974 mononoke: verify bonsai changesets during replay
Differential Revision: D27362903

fbshipit-source-id: 136207fbab3f729e8575d8d06596ce790c3f4783
2021-03-29 12:39:19 -07:00
Mark Juggurnauth-Thomas
3f20c956a2 fix warnings
Summary:
Fix some warnings in the Mononoke build:

- URLs in doc comments should be delimited with `<` and `>` (see the sketch after this list).

- Permission checker `try_from_ssh_encoded` parameter is unused.
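For illustration, a minimal sketch of the doc-comment fix (assuming the warning is rustdoc's bare-URL lint; the function name is hypothetical):

```rust
/// Angle brackets turn a bare URL into a proper rustdoc link.
///
/// Good: see <https://www.mercurial-scm.org/> for background.
//  Bad (would warn): see https://www.mercurial-scm.org/ for background.
pub fn documented() {}

fn main() {
    documented();
}
```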

Reviewed By: krallin

Differential Revision: D26224590

fbshipit-source-id: 49ce62655189a7045b78538642dbf638519f71de
2021-02-04 01:09:15 -08:00
Egor Tkachenko
11dd72d6c5 Add unbundlereplay command
Summary:
The unbundlereplay command was not implemented in Mononoke, but it is used by the sync job, so let's add this command here,
together with an additional integration test for syncing between two Mononoke repos. In addition, I'm adding non-fast-forward bookmark movements by specifying a key to the sync job.

Reviewed By: StanislavGlebik

Differential Revision: D25803375

fbshipit-source-id: 6be9e8bfed8976d47045bc425c8c796fb0dff064
2021-01-07 20:36:26 -08:00
Mateusz Kwapich
51b21dd9a7 unit test for dechunker raw bundle2 saving
Summary: Tests the behaviour of collecting the raw bundles.

Reviewed By: krallin

Differential Revision: D25025255

fbshipit-source-id: 114da273a28d131f5dd24047ed28ea23d076f235
2020-11-19 06:41:06 -08:00
Mateusz Kwapich
1cb9ad2aaf fix the construction of full_bundle2_content in dechunker
Summary:
Dechunker has a feature of saving the entire dechunked bundle contents in memory;
this is used to save the raw bundles to Manifold. Until now, this feature worked
properly when accessed via `Read` trait methods. When accessed via the `BufRead`
trait, the logic that was collecting the read contents was skipped.

This manifested in the saved infinitepush bundles always being trimmed to 4kb.
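For illustration, a minimal sketch of the fix's shape (a simplified stand-in, not the actual Dechunker): bytes handed to the caller must be recorded on both the `Read` path and the `BufRead` path:

```rust
use std::io::{BufRead, Read, Result};

// Simplified stand-in for the Dechunker's "keep the full bundle" mode.
struct Recorder<R> {
    inner: R,
    full_bundle2_content: Vec<u8>,
}

impl<R: Read> Read for Recorder<R> {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize> {
        let n = self.inner.read(buf)?;
        // The pre-fix code only collected here...
        self.full_bundle2_content.extend_from_slice(&buf[..n]);
        Ok(n)
    }
}

impl<R: BufRead> BufRead for Recorder<R> {
    fn fill_buf(&mut self) -> Result<&[u8]> {
        self.inner.fill_buf()
    }

    fn consume(&mut self, amt: usize) {
        // ...so anything consumed via BufRead was lost. Recording it here as
        // well keeps the saved contents complete.
        let buffered = self.inner.fill_buf().unwrap_or(&[]);
        let take = amt.min(buffered.len());
        self.full_bundle2_content.extend_from_slice(&buffered[..take]);
        self.inner.consume(amt);
    }
}

fn main() {
    let mut r = Recorder {
        inner: std::io::Cursor::new(b"bundle-bytes".to_vec()),
        full_bundle2_content: Vec::new(),
    };
    let mut line = Vec::new();
    // read_until goes through fill_buf/consume, the path the bug affected.
    r.read_until(b'-', &mut line).unwrap();
    assert_eq!(r.full_bundle2_content, b"bundle-");
}
```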

Reviewed By: markbt

Differential Revision: D25020371

fbshipit-source-id: c606c9fb116a1cd00ae7f4558a7249364faa9c13
2020-11-17 04:56:39 -08:00
Lukas Piatkowski
2a779e82d8 mononoke/mercurial_bundles: use futures 0.3 in Bundle2Item
Summary: This is a step towards modernizing the unbundle crate to use futures 0.3.

Reviewed By: farnz

Differential Revision: D24682963

fbshipit-source-id: 55c17fd699846a24647a23ea1c22888407643dfd
2020-11-03 00:12:21 -08:00
Kostia Balytskyi
2ea25308ab commit_rewriting: use is_empty() where possible
Summary: `clippy` often complains about the use of `.len() != 0`, `.len() > 0`, or `.len() == 0` and proposes to use `.is_empty()` instead. This diff does that across Mononoke.
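For illustration, the shape of the change (a trivial sketch):

```rust
fn main() {
    let bookmarks: Vec<&str> = Vec::new();

    // What clippy (the len_zero lint) complains about:
    if bookmarks.len() == 0 {
        println!("no bookmarks (len check)");
    }
    // The replacement this diff applies across Mononoke:
    if bookmarks.is_empty() {
        println!("no bookmarks (is_empty)");
    }
}
```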

Reviewed By: aslpavel

Differential Revision: D24099427

fbshipit-source-id: 1bba2f958485b7efb3f41bf3eae820879c92b0e5
2020-10-04 10:03:42 -07:00
David Tolnay
0cb8a052f5 Update formatter to rustfmt 2.0
Reviewed By: zertosh

Differential Revision: D23591021

fbshipit-source-id: e664aa2fdd3aaa457796a59080be6b94f604a112
2020-09-09 07:52:33 -07:00
David Tolnay
be0786f14b Prepare for rustfmt 2.0
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).

This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.

 ---

*Why now?* **:** The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land to the 2.0 branch.

 ---

Reviewed By: StanislavGlebik

Differential Revision: D23568780

fbshipit-source-id: b4b4a0aa683d236e2fdeb5b96d723ac2d84b9faf
2020-09-08 07:33:16 -07:00
Thomas Orozco
dd1aaf90fe mononoke/{hgproto,mercurial_bundles}: eliminate O(N^2) behavior in decoding
Summary:
This updates the AsyncRead implementations we use in hgproto and
mercurial_bundles to use a LimitedAsyncRead. The upshot of this change is that
we eliminate O(N^2) behavior when parsing the data we receive from clients.

See the earlier diff on this stack for more detail on where this happens, but
the bottom line is that Framed presents a full-size buffer that we zero out
every time we try to read data. With this change, the buffer we zero out is
comparable to the amount of data we are reading.

This matters in commit cloud because bundles might be really big, and a single
big bundle is enough to take an entire core for a spin for 20 minutes (and it
achieves nothing but a timeout in the end). That being said, it's also useful for
non-commit cloud bundles: we do occasionally receive big bundles (especially
for WWW codemods), and those will benefit from the exact same speedup.

One final thing I should mention: this is all in a busy CPU poll loop, and as I noted
in my earlier diff, the effect persists across our bundle receiving code. This means
it will sometimes result in not polling other futures we might have going.
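For illustration, a minimal sketch of the LimitedAsyncRead idea, transposed onto std's blocking `Read` for brevity (the real code wraps an `AsyncRead`; the names here are hypothetical):

```rust
use std::io::{Read, Result};

// Cap how much of the destination buffer is exposed per call, so the cost of
// preparing (zeroing) the buffer stays proportional to the data actually read
// instead of scaling with a full-size frame buffer.
struct LimitedRead<R> {
    inner: R,
    limit: usize,
}

impl<R: Read> Read for LimitedRead<R> {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize> {
        let cap = self.limit.min(buf.len());
        self.inner.read(&mut buf[..cap])
    }
}

fn main() {
    let mut r = LimitedRead {
        inner: std::io::Cursor::new(vec![1u8; 64]),
        limit: 8,
    };
    let mut frame = [0u8; 4096]; // the "full-size buffer" mentioned above
    let n = r.read(&mut frame).unwrap();
    assert_eq!(n, 8); // only an 8-byte slice was touched, not all 4096 bytes
}
```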

Reviewed By: farnz

Differential Revision: D22432350

fbshipit-source-id: 33f1a035afb8cdae94c2ecb8e03204c394c67a55
2020-07-08 08:07:13 -07:00
Arun Kulshreshtha
977c3c73e3 edenapi_server: rename the subtree endpoint to complete_trees
Summary:
Rename the `subtree` endpoint on the EdenAPI server to `complete_trees` to better express what it does (namely, fetching complete trees, in contrast to the lighter weight `/trees` endpoint that serves individual tree nodes). This endpoint is not used by anything yet, so there isn't much risk in renaming it at this stage.

In addition to renaming the endpoint, the relevant request struct has been renamed to `CompleteTreeRequest` to better evoke its purpose, and the relevant client and test code has been updated accordingly. Notably, now that the API server is gone, we can remove the usage of this type from Mononoke's `hgproto` crate, thereby cleaning up our dependency graph a bit.

Reviewed By: krallin

Differential Revision: D22033356

fbshipit-source-id: 87bf6afbeb5e0054896a39577bf701f67a3edfec
2020-06-15 13:40:44 -07:00
Arun Kulshreshtha
cde0436ca9 edenapi_types: move EdenAPI types into separate crate
Summary:
Several of the structs used by EdenAPI were previously defined in Mercurial's
`types` crate. This is not ideal since these structs are used for data interchange
between the client and server, and changes to them can break Mononoke, Mercurial,
or both. As such, they are more fragile than other types and their use should be
limited to the EdenAPI server and client to avoid the need for extensive breaking
changes or costly refactors down the line.

I'm about to make a series of breaking changes to these types as part of the
migration to the new EdenAPI server, so this seemed like an ideal time to split
these types into their own crate.

Reviewed By: krallin

Differential Revision: D21857425

fbshipit-source-id: 82dedc4a2d597532e58072267d8a3368c3d5c9e7
2020-06-10 19:29:44 -07:00
Stanislau Hlebik
37437ebe60 mononoke: remove getfiles wireproto parsing
Reviewed By: farnz

Differential Revision: D21623155

fbshipit-source-id: b1f763b653c47c42bc1d765cfa8985a767a63652
2020-05-19 04:43:00 -07:00
Lukas Piatkowski
7033889eac mononoke/repo_client: make repo_client buildable in OSS
Summary:
There are a few related changes included in this diff:
- backsyncer is made public
- stubs for SessionContext::is_quicksand and scuba_ext::ScribeClientImplementation
- mononoke/hgproto is made buildable

Reviewed By: krallin

Differential Revision: D21330608

fbshipit-source-id: bf0a3c6f930cbbab28508e680a8ed7a0f10031e5
2020-05-06 06:11:02 -07:00
Kostia Balytskyi
842cc18863 remove old comment and attribute
Reviewed By: farnz

Differential Revision: D21245424

fbshipit-source-id: a58d1f451341f374734b4518a2ed465a60809f0b
2020-04-25 13:55:43 -07:00
Thomas Orozco
0e7cbcf453 mononoke/repo_client: use wireproto encoding for directories
Summary:
We use the logged arguments for wireproto replay, feeding them directly into
traffic replay, but just joining a list with `,` doesn't actually work for
directories:

- We need trailing commas
- We need wireproto encoding

This does that. It also clarifies that this encoding is for debug purposes by
updating function names, and relaxes a bunch of types (since hgproto uses
bytes_old).

Reviewed By: StanislavGlebik

Differential Revision: D20868630

fbshipit-source-id: 3b805c83505aefecd639d4d2375e0aa9e3c73ab9
2020-04-07 04:36:06 -07:00
Thomas Orozco
01c05f5925 mononoke/hgproto: zero copy-validation (120x faster on 70MiB Gettreepack)
Summary:
The way decoders work in Tokio is that they get repeatedly presented whatever
is on the wire right now, and they have to report whether the data being
presented is valid and they'd like to consume it (and otherwise expect Tokio to
provide more data).

It follows that decoders have to be pretty fast, because they will be presented
a bunch of data a bunch of times. Unfortunately, it turns out our SSH Protocol
decoder is anything but.

This hadn't really been a problem until now, because we had ad-hoc decoding for
things like Getpack that might have a large number of parameters, but for now
the designated nodes implementation is decoded in one go through the existing
Gettreepack decoder, so it is important to make the parsing fast (not to
mention, right now, we buffer the entire request for Getpack as well ... so
maybe we could actually update it to this too!).

Unfortunately, as I mentioned, right now the parsing wasn't fast. The reason is
because it copies parameters to a `Vec<u8>` while it decodes them. So, if
you start decoding and copying, say, 50MB of arguments, before you find out
you're missing a few more bytes, then you just copied 50MB that you need to
throw away.

Unfortunately, the buffer size is 8KiB, so if we say "I need more data", we get
8KiB. That means that if we want to decode a 70MiB request, we're going to make
8960 ( = 70 * 1024 / 8) copies of the data (the first 8KiB, then the first 16,
and so on), which effectively means we are going to copy and throw away ~306GiB
of data (8960 * 70 MiB / 2). That's a lot of work, and indeed it is slow.

Fortunately, our implementation is really close to doing the right thing. Since
everything is length delimited, we can parse pretty quick if we don't make
copies: all we need to do is read the first length, skip ahead, read the second
length, and so on.

This is what this patch does: it extracts the parsing into something that
operates over slices. Then, **assuming the parsing is successful** (and that is
the operative part here), it does the conversion to an owned `Vec<u8>`.

In O(X) terms .. this means the old parsing is O(N^2) and the new one is O(N).

I actually think we could take this one step further and do the conversion even
later (once we want to start decoding), but for now this is more than fast
enough.

After this patch, it takes < 1 second to parse a 70MiB Gettreepack request.
Before this patch, it took over 2 minutes (which is 3 times longer than it
takes to actually service it).

PS: While in there, I also moved the `gettreepack_directories` function to a
place that makes more sense, which I had introduced earlier in the wrong place
(`parse_star`, `parse_kv` and `params` are a group of things that go together,
it's a bit clowny to have `gettreepack_directories` in the middle of them!).
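For illustration, a minimal sketch of the slice-first approach (hypothetical record format and helper names, not the actual hgproto parser): validate over borrowed data first, and only copy into owned buffers once the request is known to be complete:

```rust
// Records are `<decimal len> LF <len payload bytes>`, a stand-in for the
// length-delimited wireproto parameters described above.
fn parse_records(input: &[u8]) -> Option<(Vec<Vec<u8>>, usize)> {
    let mut pos = 0;
    let mut spans = Vec::new();

    // Pass 1: walk the lengths over the borrowed slice. An incomplete buffer
    // bails out having copied nothing, so re-polling stays cheap.
    while pos < input.len() {
        let lf = pos + input[pos..].iter().position(|&b| b == b'\n')?;
        let len: usize = std::str::from_utf8(&input[pos..lf]).ok()?.parse().ok()?;
        let end = lf + 1 + len;
        if end > input.len() {
            return None; // need more data from the transport
        }
        spans.push((lf + 1, end));
        pos = end;
    }

    // Pass 2: the request is complete, so each payload is copied exactly once,
    // making the whole parse O(N) instead of O(N^2).
    let owned = spans.iter().map(|&(s, e)| input[s..e].to_vec()).collect();
    Some((owned, pos))
}

fn main() {
    let wire = b"5\nhello3\nfoo";
    let (records, consumed) = parse_records(wire).unwrap();
    assert_eq!(records, vec![b"hello".to_vec(), b"foo".to_vec()]);
    assert_eq!(consumed, wire.len());

    // A truncated buffer parses to None without any payload copies.
    assert!(parse_records(&wire[..4]).is_none());
}
```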

Reviewed By: kulshrax

Differential Revision: D20517072

fbshipit-source-id: 85b10e82768bf14530a1ddadff8f61a28fdcbcbe
2020-03-19 04:31:23 -07:00
Arun Kulshreshtha
bc5b530959 hgproto: use Option<MPath> instead of Bytes for path in GettreepackArgs
Summary: title

Reviewed By: StanislavGlebik

Differential Revision: D20459401

fbshipit-source-id: 3473c2eef39b0d51e6133a5f575d2fd7bef3bd97
2020-03-17 15:07:54 -07:00
Stefan Filip
450073c203 commands: add getcommitdata command
Summary:
The diff only contains HgCommand signatures. No implementation yet.

The purpose of the getcommitdata command is to return the serialized contents
of a commit. Given a Mercurial Changelog Id, the endpoint returns the same
contents that the Revlog would return on a Mercurial server.

At this point I am looking for suggestions regarding the protocol and the
implementation. My assumption is that both request and response can be fully
kept in memory. I think that we may decide that the response is going to be
streamed to the client, so the initial protocol allows for that.

Requirements:
Input: HgChangelogId
Output: Changelog entry contents

Protocol Summary:
```
Request: "getcommitdata" LF "nodes " Length LF Length*(HEXDIG|" ")
Response: *(40HEXDIG Length LF Length*(%x00-79) LF)
```

A bit of a silly protocol. Let me know what recommendations you have.
The Request is modelled after the "known" command. This allows for efficient
batching compared to a batch command model. It's a bit awkward that we don't
pass in the number of HgChangelogId entries that we have in the request, but
that is the existing protocol.

For every HgChangelogId in the request the response will first have a line
with the HgChangelogId that was requested and the length of the contents.
The next line will contain the contents followed by line feed.
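For illustration, a hedged sketch of that response framing, assuming a single space separates the 40-hex-digit id from the length on the header line (the ABNF above leaves the separator implicit):

```rust
fn encode_getcommitdata_response(entries: &[(&str, &[u8])]) -> Vec<u8> {
    let mut out = Vec::new();
    for (hg_id, contents) in entries {
        debug_assert_eq!(hg_id.len(), 40); // HgChangelogId as 40 hex digits
        // Header line: the requested id and the content length, then LF.
        out.extend_from_slice(format!("{} {}\n", hg_id, contents.len()).as_bytes());
        // The contents themselves, followed by LF.
        out.extend_from_slice(contents);
        out.push(b'\n');
    }
    out
}

fn main() {
    let id = "1d0c535123456789abcdef0123456789abcdef01";
    let resp = encode_getcommitdata_response(&[(id, &b"commit text"[..])]);
    assert!(resp.starts_with(id.as_bytes()));
    assert!(resp.ends_with(b"commit text\n"));
}
```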

Reviewed By: krallin

Differential Revision: D20345367

fbshipit-source-id: 50dffff4f6c60396f564f2f1f519744ce730bf96
2020-03-12 14:36:12 -07:00
Thomas Orozco
06d11938cf mononoke/hgproto: expect comma-terminated list of directories
Summary:
We need to differentiate the empty directory from no directories. Adding a
trailing comma after each directory instead of separating them achieves that.
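For illustration, a tiny sketch of why a terminator (rather than a separator) disambiguates the two cases; escaping of commas inside names is omitted here:

```rust
fn encode_directories(dirs: &[&str]) -> String {
    // A trailing comma after *every* entry, not a comma between entries.
    dirs.iter().map(|d| format!("{},", d)).collect()
}

fn main() {
    assert_eq!(encode_directories(&[]), "");    // no directories at all
    assert_eq!(encode_directories(&[""]), ","); // one empty directory
    assert_eq!(encode_directories(&["foo", ""]), "foo,,");
}
```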

Reviewed By: StanislavGlebik

Differential Revision: D20309700

fbshipit-source-id: 387ec477560968392de0a9631d67ccb591bd3cab
2020-03-12 06:35:34 -07:00
Thomas Orozco
87491d14a7 mononoke/repo_client: add designated manifest fetching in gettreepack
Summary:
This will allow the hg client to do tree fetching like we do in the API Server,
but through the SSH protocol — i.e. by passing a series of manifest ids and
their paths, without recursion on the server side through gettreepack.
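For illustration, a hypothetical, simplified shape of such a request (not the actual `GettreepackArgs` definition):

```rust
// Each designated node pairs a manifest id with the path it lives at, so the
// server can serve exactly those trees with no server-side recursion.
struct TreeRequest {
    /// (manifest id as 40 hex digits, path of the directory it describes)
    designated_nodes: Vec<(String, String)>,
    /// Some(1): just the named manifests, no walking into children.
    depth: Option<usize>,
}

fn main() {
    let req = TreeRequest {
        designated_nodes: vec![
            ("a".repeat(40), String::new()),        // the root manifest
            ("b".repeat(40), "fbcode".to_string()), // one subdirectory
        ],
        depth: Some(1),
    };
    assert_eq!(req.designated_nodes.len(), 2);
    assert_eq!(req.depth, Some(1));
}
```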

Reviewed By: StanislavGlebik

Differential Revision: D20307442

fbshipit-source-id: a6dca03622becdebf41b264381fdd5837a7d4292
2020-03-12 06:35:34 -07:00
David Tolnay
91cb486949 rust: Begin upgrading to bytes 0.5
Summary:
This upgrade is complicated because Tokio's codecs are coupled to a specific version of bytes.

- 0.1 codecs use bytes 0.4
    - https://docs.rs/tokio-codec/0.1/tokio_codec/trait.Encoder.html
    - https://docs.rs/tokio-codec/0.1/tokio_codec/trait.Decoder.html

- 0.2 codecs use bytes 0.5
    - https://docs.rs/tokio-util/0.2/tokio_util/codec/trait.Encoder.html
    - https://docs.rs/tokio-util/0.2/tokio_util/codec/trait.Decoder.html

Since we can't possibly do a coordinated atomic upgrade of tokio, we'll be straddling bytes versions during the migration period. This relies on the adapters added in D19919402.

Reviewed By: jsgf

Differential Revision: D19919403

fbshipit-source-id: 18c5f66efa587bc53ab13c9aab95c7098bfbce4e
2020-02-18 21:20:09 -08:00
Lukasz Piatkowski
542d1f93d3 Manual synchronization of fbcode/eden and facebookexperimental/eden
Summary:
This commit manually synchronizes the internal move of
fbcode/scm/mononoke under fbcode/eden/mononoke which couldn't be
performed by ShipIt automatically.

Reviewed By: StanislavGlebik

Differential Revision: D19722832

fbshipit-source-id: 52fbc8bc42a8940b39872dfb8b00ce9c0f6b0800
2020-02-11 11:42:43 +01:00
Lukasz Piatkowski
e8d62b64d5 mononoke: move the codebase under eden/ directory
fbshipit-source-id: 43a0252cb3ec42aa365f20d1b6faa4d24d74c9b8
2020-02-06 13:46:04 +01:00