Summary: Tests the behaviour of collecting the raw bundles.
Reviewed By: krallin
Differential Revision: D25025255
fbshipit-source-id: 114da273a28d131f5dd24047ed28ea23d076f235
Summary:
Dechunker has a feature of saving entire dechunked bundle contents in memory
this is used to save the raw bundles to manifold. Unitl now this feature worked
properly when accesed via `Read` trait methods. When accessed via `BufRead`
trait the logic that was collecting the read contents was skipped.
This manifested in the saved infinitepush bundles being always trimmed to 4kb.
Reviewed By: markbt
Differential Revision: D25020371
fbshipit-source-id: c606c9fb116a1cd00ae7f4558a7249364faa9c13
Summary: As part of the effort to deprecate futures 0.1 in favor of 0.3 I want to create a new futures_ext crate that will contain some of the extensions that are applicable from the futures_01_ext. But first I need to reclame this crate name by renaming the old futures_ext crate. This will also make it easier to track which parts of codebase still use the old futures.
Reviewed By: farnz
Differential Revision: D24725776
fbshipit-source-id: 3574d2a0790f8212f6fad4106655cd41836ff74d
Summary: This is a step towards modernizing unbundle crate to use futures 0.3.
Reviewed By: farnz
Differential Revision: D24682963
fbshipit-source-id: 55c17fd699846a24647a23ea1c22888407643dfd
Summary: `clippy` often complains about the use of `.len() != 0`, `.len() > 0` or `.len() == 0`and proposes to use `.is_empty()` instead. This diff does that across Mononoke.
Reviewed By: aslpavel
Differential Revision: D24099427
fbshipit-source-id: 1bba2f958485b7efb3f41bf3eae820879c92b0e5
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).
This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.
---
*Why now?* **:** The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land to the 2.0 branch.
---
Reviewed By: StanislavGlebik
Differential Revision: D23568780
fbshipit-source-id: b4b4a0aa683d236e2fdeb5b96d723ac2d84b9faf
Summary:
This updates the AsyncRead implementations we use in hgproto and
mercurial_bundles to use a LimitedAsyncRead. The upshot of this change is that
we eliminate O(N^2) behavior when parsing the data we receive from clients.
See the earlier diff on this stack for more detail on where this happens, but
the bottom line is that Framed presents a full-size buffer that we zero out
every time we try to read data. With this change, the buffer we zero out is
comparable to the amount of data we are reading.
This matters in commit cloud because bundles might be really big, and a single
big bundle is enough to take an entire core for a spin or 20 minutes (and they
achieve nothing but time out in the end). That being said, it's also useful for
non-commit cloud bundles: we do occasionally receive big bundles (especially
for WWW codemods), and those will benefit from the exact same speedup.
One final thing I should mention: this is all in a busy CPU poll loop, and as I noted
in my earlier diff, the effect persists across our bundle receiving code. This means
it will sometimes result in not polling other futures we might have going.
Reviewed By: farnz
Differential Revision: D22432350
fbshipit-source-id: 33f1a035afb8cdae94c2ecb8e03204c394c67a55
Summary:
Rename the `subtree` endpoint on the EdenAPI server to `complete_trees` to better express what it does (namely, fetching complete trees, in contrast to the lighter weight `/trees` endpoint that serves individual tree nodes). This endpoint is not used by anything yet, so there isn't much risk in renaming it at this stage.
In addition to renaming the endpoint, the relevant request struct has been renamed to `CompleteTreeRequest` to better evoke its purpose, and the relevant client and test code has been updated accordingly. Notably, now that the API server is gone, we can remove the usage of this type from Mononoke's `hgproto` crate, thereby cleaning up our dependency graph a bit.
Reviewed By: krallin
Differential Revision: D22033356
fbshipit-source-id: 87bf6afbeb5e0054896a39577bf701f67a3edfec
Summary:
Several of the structs used by EdenAPI were previously defined in Mercurial's
`types` crate. This is not ideal since these structs are used for data interchange
between the client and server, and changes to them can break Mononoke, Mercurial,
or both. As such, they are more fragile than other types and their use should be
limited to the EdenAPI server and client to avoid the need for extensive breaking
changes or costly refactors down the line.
I'm about to make a series of breaking changes to these types as part of the
migration to the new EdenAPI server, so this seemed like an ideal time to split
these types into their own crate.
Reviewed By: krallin
Differential Revision: D21857425
fbshipit-source-id: 82dedc4a2d597532e58072267d8a3368c3d5c9e7
Summary:
There are few related changes included in this diff:
- backsyncer is made public
- stubs for SessionContext::is_quicksand and scuba_ext::ScribeClientImplementation
- mononoke/hgproto is made buildable
Reviewed By: krallin
Differential Revision: D21330608
fbshipit-source-id: bf0a3c6f930cbbab28508e680a8ed7a0f10031e5
Summary:
We use the logged arguments directly for wireproto replay, and then we replay
this directly in traffic replay, but just joining a list with `,` doesn't
actually work for directories:
- We need trailing commas
- We need wireproto encoding
This does that. It also clarifies that this encoding is for debug purposes by
updating function names, and relaxes a bunch of types (since hgproto uses
bytes_old).
Reviewed By: StanislavGlebik
Differential Revision: D20868630
fbshipit-source-id: 3b805c83505aefecd639d4d2375e0aa9e3c73ab9
Summary:
The way decoders work in Tokio is that they get repeatedly presented whatever
is on the wire right now, and they have to report whether the data being
presented is valid and they'd like to consume it (and otherwise expect Tokio to
provide more data).
It follows that decoders have to be pretty fast, because they will be presented
a bunch of data a bunch of times. Unfortunately, it turns out our SSH Protocol
decoder is everything but.
This hadn't really been a problem until now, because we had ad-hoc decoding for
things like Getpack that might have a large number of parameters, but for now
the designated nodes implementation is decoded in one go through the existing
Gettreepack decoder, so it is important ot make the parsing fast (not to
mention, right now, we buffer the entire request for Getpack as well ... so
maybe we could actually update it to this too!).
Unfortunately, as I mentioned, right now the parsing wasn't fast. The reason is
because it copies parameters to a `Vec<u8>` while it decodes them. So, if
you start decoding and copying, say, 50MB of arguments, before you find out
you're missing a few more bytes, then you just copied 50MB that you need to
throw away.
Unfortunately, the buffer size is 8KiB, so if we say "I need more data", we get
8KiB. That means that if we want to decode a 70MiB request, we're going to make
8960 ( = 70 * 1024 / 8) copies of the data (the first 8KiB, then the first 16,
and so on), which effectively means we are going to copy and throw away ~612GiB
of data (8960 * 70 / 2). That's a lot of work, and indeed it is slow.
Fortunately, our implementation is really close to doing the right thing. Since
everything is length delimited, we can parse pretty quick if we don't make
copies: all we need to do is read the first length, skip ahead, read the second
length, and so on.
This is what this patch does: it extracts the parsing into something that
operates over slices. Then, **assuming the parsing is successful** (and that is
the operative part here), it does the conversion to an owned Vec<u8>.
In O(X) terms .. this means the old parsing is O(N^2) and the new one is O(N).
I actually think we could take this one step further and do the conversion even
later (once we want to start decoding), but for now this is more than fast
enough.
After this patch, it takes < 1 second to parse a 70MiB Gettreepack request.
Before this patch, it took over 2 minutes (which is 3 times longer than it
takes to actually service it).
PS: While in there, I also moved the `gettreepack_directories` function to a
place that makes more sense, which I had introduced earlier in the wrong place
(`parse_star`, `parse_kv` and `params` are a group of things that go together,
it's a bit clowny to have `gettreepack_directories` in the middle of them!).
Reviewed By: kulshrax
Differential Revision: D20517072
fbshipit-source-id: 85b10e82768bf14530a1ddadff8f61a28fdcbcbe
Summary:
The diff only contains HgCommand signatures. No implementation yet.
The purpose of the getcommitdata command is to return the serialized contents
of a commit. Given a Mercurial Changelog Id, the endpoint returns the same
contents that the Revlog would return on a Mercurial server.
At this point I am looking for suggestions regarding the protocol and the
implementation. My assumption is that both request and response can be fully
kept in memory. I think that we may decide that the request is going to be
streamed to the client so the initial protocol allows for that.
Requirements:
Input: HgChangelogId
Output: Changelog entry contents
Protocol Summary:
```
Request: "getcommitdata" LF "nodes " Length LF Length*(HEXDIG|" ")
Response: *(40HEXDIG Length LF Length*(%x00-79) LF)
```
A bit of a silly protocol. Let me know what recommendations you have.
The Request is modelled after the "known" command. This allows for efficient
batching compared to a batch command model. It's a bit awkward that we don't
pass in the number of HgChangelogId entries that we have in the request but
that is the existing protocol.
For every HgChangelogId in the request the response will first have a line
with the HgChangelogId that was requested and the length of the contents.
The next line will contain the contents followed by line feed.
Reviewed By: krallin
Differential Revision: D20345367
fbshipit-source-id: 50dffff4f6c60396f564f2f1f519744ce730bf96
Summary:
We need to differentiate the empty directory from no directories. Adding a
trailing comma after each directory instead of separating them achieves that.
Reviewed By: StanislavGlebik
Differential Revision: D20309700
fbshipit-source-id: 387ec477560968392de0a9631d67ccb591bd3cab
Summary:
This will allow the hg client to do tree fetching like we do in the API Server,
but through the SSH protocol — i.e. by passing a series a manifest ids and
their paths, without recursion on the server side through gettreepack.
Reviewed By: StanislavGlebik
Differential Revision: D20307442
fbshipit-source-id: a6dca03622becdebf41b264381fdd5837a7d4292
Summary:
This commit manually synchronizes the internal move of
fbcode/scm/mononoke under fbcode/eden/mononoke which couldn't be
performed by ShipIt automatically.
Reviewed By: StanislavGlebik
Differential Revision: D19722832
fbshipit-source-id: 52fbc8bc42a8940b39872dfb8b00ce9c0f6b0800