Commit Graph

691 Commits

Author SHA1 Message Date
Arun Kulshreshtha
96f67f862b http_client: move Buffered into submodule
Summary: In preparation for adding a streaming handler, create a handler directory and move `Buffered` to a submodule therein.

Reviewed By: quark-zju

Differential Revision: D22201915

fbshipit-source-id: f90bb6a24dd2137900df825bd23a12201107e9cc
2020-06-29 18:43:49 -07:00
Arun Kulshreshtha
3fc3eb749b http_client: make client callback return a Result
Summary: Instead of returning a bool, make the client callback return a `Result`. A new error type called `Abort` has been added which the client can return to signal that the driver should abort all active transfers and return early. This results in more idiomatic code, and gives the callback the ability to specify the reason for the abort.

Reviewed By: quark-zju

Differential Revision: D22201912

fbshipit-source-id: 46d72e28f754132a4ef30defa40c4a5d09fe8e07
2020-06-29 18:43:49 -07:00
Arun Kulshreshtha
95a083a0fe http_client: move stats collection into driver
Summary: Move stats collection into `MultiDriver`. This makes sense from a design standpoint since the driver is the thing performing the requests. It will also reduce code duplication when streaming responses are added later in the stack.

Reviewed By: quark-zju

Differential Revision: D22201910

fbshipit-source-id: fd3174bb6f5a452901b405341b2c001dca1a9832
2020-06-29 18:43:49 -07:00
Arun Kulshreshtha
8560f22ec5 http_client: make HttpClient implement Send
Summary: Implement `Send` for `HttpClient`. I've split this out into its own diff since unsafe code warrants additional scrutiny. See the comment in the code for details on correctness.

Reviewed By: quark-zju

Differential Revision: D22157713

fbshipit-source-id: 1a92ebb51142a98d3996686197b77ad7500c19db
2020-06-29 18:43:49 -07:00
Arun Kulshreshtha
61783baed4 http_client: report transfer stats
Summary: Report transfer stats such as bytes downloaded, uploaded, elapsed time, and number of requests upon completion of a batch of requests. This code was adapted from the EdenAPI client.

Reviewed By: quark-zju

Differential Revision: D22157711

fbshipit-source-id: 57b814e7a923f85467a79ee48bddd48b2f2b253c
2020-06-29 18:43:49 -07:00
Arun Kulshreshtha
a99ba6aa42 http_client: add HttpClient
Summary:
Add an `HttpClient` struct that is intended as the primary API for this crate. This struct is a wrapper around libcurl's "multi" interface, which allows sending several requests concurrently. Notably, if HTTP/2 is used, this allows for potentially multiplexing
many requests to the same server over the same connection.

This code is essentially a generalized version of the `multi_request` function from the EdenAPI crate.

Reviewed By: quark-zju

Differential Revision: D22157710

fbshipit-source-id: cbd7f82555444c7d1e1932944257a6949c60f34e
2020-06-29 18:43:49 -07:00
Arun Kulshreshtha
44efb6ee0d http_client: add progress monitoring
Summary: Add the ability for the client to monitor the collective progress of a set of transfers. This will be used in the next diff to allow monitoring several concurrent requests. Most of this code was adapted from the EdenAPI client's progress module, with some modifications.

Reviewed By: quark-zju

Differential Revision: D22157709

fbshipit-source-id: d474cd46db29bebf64049629dce69d975e220e3a
2020-06-29 18:43:48 -07:00
Lukas Piatkowski
b7b5c9e7cc shed/fbthrift_ext: move some useful fbthrift extensions to the shed
Reviewed By: farnz

Differential Revision: D22257111

fbshipit-source-id: b8f59cb0aa0d0e1100b5eb9c2a1f295ad6442650
2020-06-29 09:16:00 -07:00
Arun Kulshreshtha
19c55df1fd http_client: add CLI for testing
Summary: Add a simple CLI to allow the HTTP client to be tested manually.

Reviewed By: quark-zju

Differential Revision: D22228930

fbshipit-source-id: 12fea3131ec6d8c3df4457fb74a09ea52f42c066
2020-06-26 13:24:21 -07:00
Arun Kulshreshtha
e054376937 http_client: allow setting CA cert path
Summary: Allow setting the CA certificate bundle, similar to the curl commands `--cacert` flag. This is required for integration tests since those servers use dummy certificates signed by a fake CA.

Reviewed By: quark-zju

Differential Revision: D22203833

fbshipit-source-id: 261e6c2904504c3a98f95b4a5b5b6ed24cb7402d
2020-06-26 13:24:21 -07:00
Arun Kulshreshtha
6e73978580 http_client: add new libcurl-based HTTP client
Summary:
Add a simple HTTP client library, based on libcurl. This crate is essentially an attempt to factor out the HTTP code from the Eden API client, since over time it had accumulated all of the pieces needed for a general-purpose HTTP client library. Factoring it out will help clean up the EdenAPI code base and allow code reuse by other crates that also need to work with HTTP.

This initial diff introduces the `Request` and `Response` types which can be used to build, send, and see the results of individual HTTP requests. In this diff, requests can only be made serially, but later in the stack it will be possible to run many requests concurrently (potentially multiplexed over the same connection in the case of HTTP/2).

The HTTP functionality in the EdenAPI client had very little unit test coverage (it relied primarily on Mononoke's integration tests). With this crate, I've added many unit tests (often involving mocked HTTP servers) to help ensure correctness.

Reviewed By: quark-zju

Differential Revision: D22157712

fbshipit-source-id: 3b0823ece26b19979980841727f1eefcf0519ad5
2020-06-26 13:24:21 -07:00
Nick Brekhus
f2435d8d4b metalog: implement compaction api
Summary: Compacts metalog by copying current root into new metalog and creating (or updating) a metalog-internal pointer file

Reviewed By: quark-zju

Differential Revision: D22100213

fbshipit-source-id: 7cea17dde46ac4fa2c84da873df68c536dca4119
2020-06-26 09:58:58 -07:00
Jun Wu
b1978f2ea5 revlogindex: use util::path::atomic_write_symlink to write nodemap
Summary:
The new `atomic_write_symlink` API handles platform weirdness especially on
Windows with mmap. Use it to avoid issues.

Reviewed By: DurhamG

Differential Revision: D22225317

fbshipit-source-id: c04a3948c30834e1025a541fc66b371654ed77e4
2020-06-25 13:56:12 -07:00
Jun Wu
5c66d921c6 util: implement a symlink-based atomic_write
Summary:
This diff aims to solve `atomic_write` issues on Windows. Namely:
- `tempfile` left overs if temp files are not deleted on Drop.
- `tempfile` does unnecessary `chmod`.
- For mmap-ed files, it has to be deleted before `atomic_write`, causing
  reader to have a chance to see inconsistent data.

This diff solves the above issues by:
- Use extra GC to clean up older files. Do not realy on successful `Drop`.
- Do not use `tempfile` and do not set permissions.
- Use a symlink so the symlink can still be atomic-replaced while the real
  content is being mmaped.

Reviewed By: DurhamG

Differential Revision: D22225039

fbshipit-source-id: d45bb198a53f8beeef71798cdb9ae57f9b4b8cd3
2020-06-25 13:56:12 -07:00
Jun Wu
570dbef1db util: add PathLock
Summary: RAII lock on a filesystem path.

Reviewed By: DurhamG

Differential Revision: D22226017

fbshipit-source-id: 4e26358bf0b6d4b2440fb77058032dbde8b6d02c
2020-06-25 13:56:12 -07:00
Xavier Deguillard
31d831bfaf revisionstore: manually set the Content-Length for empty lfs requests
Summary:
We've had a case where a client sent a bundle that contained an LFS pointer to
Sandcastle. On that original repo, the LFS blob and pointer were in the
hgcache, which signified that the server has it. When that bundle gets applied
on Sandcastle, the pointer then makes its way onto the local lfs pointer store
and the blob will be fetched from the LFS server itself properly.

Where it gets tricky is that Sandcastle then tried to re-bundle the change and
thus tried to re-upload it. The expectation from the client perspective is that
since the blob was fetched from the server in the first place, the server will
just not ask for the blob to be re-uploaded. The asumption proved to be
semi-inacurate in the case where the LFS server mirrors all the writes to a
secondary LFS server. When that secondary LFS server is standalone and not
Mononoke, it may not have the blob in question, and thus the LFS server may
request the blob to be re-uploaded so it can write it to the secondary LFS
server.

Unfortunately, things start breaking down at this point. Since the blob isn't
present in the local store the client can't rely on being able to read it and
fetching it from the server to then upload it is silly and a bit too complex.
Instead, what we can do is teach the server of this specific scenario by
manually setting the Content-Length to 0 in the case of an upload where the
blob wasn't found in the local store. The server recognizes this and will
manually copy the blob to the secondary LFS server.

Reviewed By: krallin

Differential Revision: D22192718

fbshipit-source-id: 67c5ba1a751cc07d5d5d51e07703282d8175b010
2020-06-24 09:05:29 -07:00
Jun Wu
d59189b0d9 revlogindex: update outdated nodemap more aggressively
Summary:
If the nodemap file is seriously out of sync, it's harder to recover on Windows,
because nodemap mmap in other processes prevents `atomic_write`.

Use `util::path::remove_file` to remove the old mmap files and improve the
success rate of nodemap rewrites.

Reviewed By: xavierd

Differential Revision: D22168134

fbshipit-source-id: 577df978b3175eac3257794f5e5ff7522cf21871
2020-06-23 22:53:36 -07:00
Jun Wu
bc09039a7d util: make remove_file work with mmap on Windows
Summary:
Make it possible to delete files that are mmap-ed. This avoids the extra
complexity of cleaning leftover files.

Reviewed By: xavierd

Differential Revision: D22168135

fbshipit-source-id: edb33a75638c668945feae49608c3fff25e742a4
2020-06-23 22:53:36 -07:00
Durham Goode
60dfefda40 configs: remove assert for duplicate dynamic config entries
Summary:
The dynamicconfig logic had an assert checking that a config was not
added by the dynamicconfig twice. This isn't actually a valid assert since the
caller could load the config as many times as it wanted.

This happened in hg doctor, where it loaded the ui once manually (without a repo
object) then it loaded it again during the creation of the repo object. I went
ahead and cleaned up hg doctor to not do this, but let's get rid of the assert
anyway.

Reviewed By: quark-zju

Differential Revision: D22194273

fbshipit-source-id: 9491d35fe14523ad3d9cb69b4ca0d615149dd0f0
2020-06-23 16:05:16 -07:00
Jun Wu
1020f76e7d stackdesc: remove the crate
Summary: The tracing APIs and error context APIs can achieve similar effects.

Reviewed By: xavierd

Differential Revision: D22129585

fbshipit-source-id: 0626e3f4c1a552c69c046ff06ac36f5e98a6c3d8
2020-06-23 14:06:54 -07:00
Xavier Deguillard
275de2a00e revisionstore: add tracing spans to lfs code
Summary: This will help in debugging potential performance issues.

Reviewed By: DurhamG

Differential Revision: D22140035

fbshipit-source-id: d7403897fbb843a4eca874c1da3788190d181bc0
2020-06-19 19:54:21 -07:00
Xavier Deguillard
b8744c42cc revisionstore: add a timeout when sending lfs requests
Summary:
We've seen a handful of hangs in the LFS code, adding a timeout on the
connection request would help in surfacing issues when the client for some
reason doesn't get a reply from the server.

We unfortunately can't add a general timeout, as a user's connection speed
would need to be taken into account.

Reviewed By: krallin

Differential Revision: D22139560

fbshipit-source-id: 0118fb8a38af488e2f40bed0a09677bc68d666a8
2020-06-19 19:54:21 -07:00
Xavier Deguillard
c87a18ebae revisionstore: enable all tokio's features
Summary:
At least one mac user had a confirmed hang in the new LFS code that caused us
to rollback LFS for fbsource. One of the thing that was weird is that both
Sandcastle and devservers have several orders of magnitude higher request to
the LFS server and thus we would expect this issue to be widespread there. We
haven't seen this.

One of the difference between the package on the devservers/sandcastle and on
macs is how they are compiled. The former is with buck, while the later is with
cargo, and the package built with buck has all the tokio features enabled,
while cargo has no features enabled. What interest us here is the kind of
scheduler that is in use: on mac the basic scheduler, on devserver/sandcastle,
the threaded scheduler.

While I'm not entirely convinced that the difference in the scheduler is what
causes the hang, it's a good idea to make sure that both of our packages are
compiled the same way.

Reviewed By: krallin

Differential Revision: D22138593

fbshipit-source-id: ce9e64a6a930d2b0cccb634a1af9c3b5b8816a21
2020-06-19 19:54:21 -07:00
Arun Kulshreshtha
da38165c2b edenapi_types: update terminology in comment
Summary: Replace the term 'blacklisted' with 'redacted'.

Reviewed By: quark-zju

Differential Revision: D22142842

fbshipit-source-id: 461faa09a5d3958a4575c48f03e2286eec917681
2020-06-19 13:29:03 -07:00
Arun Kulshreshtha
6926c80fe8 edenapi_types: add doc comment
Summary: Add a crate-level doc comment clarifying the purpose of this crate. In particular, the motivation behind moving some of EdenAPI's types into this crate was to make it easier to share types between the server and client. As such, types that are not shared (which may pull in more complex dependencies and therefore cause build issues for either the server or client) should not be put in this crate.

Reviewed By: quark-zju

Differential Revision: D22142180

fbshipit-source-id: 7258ef33b73a87acf72d4f6bcbe8b27cbc361735
2020-06-19 12:55:49 -07:00
Mark Thomas
a06ea2f1a6 thrift-types: update thrift types
Reviewed By: farnz

Differential Revision: D22089415

fbshipit-source-id: 44596865ef65579d7910f1677735c1820ad69f58
2020-06-18 12:15:00 -07:00
Jeremy Fitzhardinge
1b4edb5567 eden: remove unused Rust dependencies
Summary:
Remove unused dependencies for Rust targets.

This failed to remove the dependencies in eden/scm/edenscmnative/bindings
because of the extra macro layer.

Manual edits (named_deps) and misc output in P133451794

Reviewed By: dtolnay

Differential Revision: D22083498

fbshipit-source-id: 170bbaf3c6d767e52e86152d0f34bf6daa198283
2020-06-17 17:55:03 -07:00
Jeremy Fitzhardinge
c530d32056 rust: clean up some warnings
Summary: Prep for 1.44 but also general cleanups.

Reviewed By: dtolnay

Differential Revision: D22024428

fbshipit-source-id: 8e1d39a1e78289129b38554674d3dbf80681f4c3
2020-06-15 16:50:40 -07:00
Carolyn Busch
89f238b8dd Replace whitelist/blacklist term
Summary: Replaces usages of whitelist/blacklist with include/exclude/allow. These terms are more descriptive and less likely to contribute to racial stereotyping. More context: https://fb.workplace.com/groups/sourcecontrolteam/permalink/2926049127516414/

Reviewed By: farnz

Differential Revision: D22039299

fbshipit-source-id: d5b4cdeca681e1f4c992f7ef0b9f772b2ae1abd3
2020-06-15 15:01:19 -07:00
Arun Kulshreshtha
977c3c73e3 edenapi_server: rename the subtree endpoint to complete_trees
Summary:
Rename the `subtree` endpoint on the EdenAPI server to `complete_trees` to better express what it does (namely, fetching complete trees, in contrast to the lighter weight `/trees` endpoint that serves individual tree nodes). This endpoint is not used by anything yet, so there isn't much risk in renaming it at this stage.

In addition to renaming the endpoint, the relevant request struct has been renamed to `CompleteTreeRequest` to better evoke its purpose, and the relevant client and test code has been updated accordingly. Notably, now that the API server is gone, we can remove the usage of this type from Mononoke's `hgproto` crate, thereby cleaning up our dependency graph a bit.

Reviewed By: krallin

Differential Revision: D22033356

fbshipit-source-id: 87bf6afbeb5e0054896a39577bf701f67a3edfec
2020-06-15 13:40:44 -07:00
Carolyn Busch
2c7f30d0c4 configparser: replace whitelist/blacklist term
Summary: Replace usages of whitelist/blacklist with include/exclude/filter/allow. These terms are more descriptive and less likely to contribute to racial stereotyping. More context: https://fb.workplace.com/groups/sourcecontrolteam/permalink/2926049127516414/

Reviewed By: kulshrax

Differential Revision: D22039298

fbshipit-source-id: 255c7389ee5ce5e54bbccdfb05ffa4cafc6958e5
2020-06-15 12:47:08 -07:00
Arun Kulshreshtha
7d72408ade edenapi_server: return streaming responses
Summary: Make most of the handlers on the EdenAPI server return streaming responses.

Reviewed By: krallin

Differential Revision: D22014652

fbshipit-source-id: 177e540e1372e7dfcba73c594614f0225da3a10f
2020-06-15 09:11:09 -07:00
Durham Goode
5a83869eb1 thrift: add serde to generated thrift types
Summary:
This will let us serialize and deserialize thrift structures from
non-thrift sources. In my particular case I'll be using it to deserialize json
into hg client config thrift structures that are shared with the backend.

Reviewed By: quark-zju

Differential Revision: D21875371

fbshipit-source-id: 6a85f28e69f2eeb375f6639ed1dcd3c1cd957d99
2020-06-12 16:51:52 -07:00
Arun Kulshreshtha
14846656f5 edenapi_types: restructure crate
Summary:
Note: Although this diff looks big, it's all just cutting and pasting code.

Restructure the crate so that the types relevant to each EdenAPI endpoint are grouped in their own module. This means that authors of new endpoints can just add a new module and put their types there, rather than having to make edits throughout the crate.

Reviewed By: quark-zju

Differential Revision: D21983470

fbshipit-source-id: 4901662dc6b5f0a1feca1ab8d914f394114ef233
2020-06-12 12:16:51 -07:00
Arun Kulshreshtha
b7aafa1df9 edenapi_types: add HistoryResponseChunk
Summary: Change `HistoryResponse` so that instead of repeatedly sending the file path alongside every history entry, we group the history entries by file and send the path once per group. This will prevent the server from transmitting redundant information.

Reviewed By: quark-zju

Differential Revision: D21982558

fbshipit-source-id: f1c5d2573c97940c7bf1645ed7fef6e1887c0d42
2020-06-12 12:16:51 -07:00
Arun Kulshreshtha
7ef8f7bc9d edenapi_types: remove superfluous tests
Summary:
These particular unit tests essentially just verify that these types (which are trivial wrapper structs around `Vec`s) implement `IntoIterator`. As such, they don't really provide value (the `IntoIterator` implementations are trivial*) but require updates whenever the format of the items produced by the iterators changes. This adds a maintenance burden with little value to show for it, so let's just get rid of these.

*The only thing non-trivial is that for `HistoryResponse`, the iterator does convert the entries from `WireHistory` to `HistoryEntry`, but the actual logic of this conversion is tested separately in `api.rs`, so testing it here is redundant.

Reviewed By: quark-zju

Differential Revision: D21983471

fbshipit-source-id: beebad53abb9ce5801b1738af92f250405c50990
2020-06-10 19:29:52 -07:00
Xavier Deguillard
e05de75a3f revlogindex: revert "do not use mmap to read .i on Windows"
Summary:
Reading the 00changelog.i entirely can be very expensive both in time and
memory. Since the 00changelog.i is now copied, then truncated and copied back
in place, we no longer have the limitation that prevents Windows from using
mmap for it, so let's do it.

Reviewed By: quark-zju

Differential Revision: D21968805

fbshipit-source-id: c0d19593e1f565c93b9243a63c87d03cfbc821d8
2020-06-10 19:29:50 -07:00
Arun Kulshreshtha
9c689337cd edenapi_types: rename depth to length in history request
Summary: D21880220 renamed the `depth` parameter in Mononoke's history fetching functions to be `length`. This diff makes the same change for EdenAPI's `HistoryRequest` struct.

Reviewed By: StanislavGlebik

Differential Revision: D21948599

fbshipit-source-id: fe8649a5789f07d8262ad3d5e2f477a8b50f2c6f
2020-06-10 19:29:45 -07:00
Arun Kulshreshtha
cde0436ca9 edenapi_types: move EdenAPI types into separate crate
Summary:
Several of the structs used by EdenAPI were previously defined in Mercurial's
`types` crate. This is not ideal since these structs are used for data interchange
between the client and server, and changes to them can break Mononoke, Mercurial,
or both. As such, they are more fragile than other types and their use should be
limited to the EdenAPI server and client to avoid the need for extensive breaking
changes or costly refactors down the line.

I'm about to make a series of breaking changes to these types as part of the
migration to the new EdenAPI server, so this seemed like an ideal time to split
these types into their own crate.

Reviewed By: krallin

Differential Revision: D21857425

fbshipit-source-id: 82dedc4a2d597532e58072267d8a3368c3d5c9e7
2020-06-10 19:29:44 -07:00
Mark Thomas
860594a0e6 streampager: fix progress rendering
Summary:
With the internal streampager, progress bars must be sent on a separate stream so that
streampager can render them correctly.

Reviewed By: quark-zju

Differential Revision: D21906173

fbshipit-source-id: eb41b0bf22807d9cae518b3f676996ab1c642c6e
2020-06-10 19:29:28 -07:00
Mark Thomas
ce3d15ef1d upgrade streampager to 0.8
Summary: Upgrade the internal streampager to the latest version, which includes line wrapping.

Reviewed By: quark-zju

Differential Revision: D21906172

fbshipit-source-id: 1bc63723f55ac115090e7ae0a2541158863056b9
2020-06-10 19:29:27 -07:00
Zeyi (Rice) Fan
854637e854 implement batch blob import for HgNativeBackingStore
Summary:
NOTE: This stack works together, they are separated for easier reviewing. Check D21723465 if you want to read this stack in one place.

This diff encapsulates the newly added batch blob import API in backingstore crate, and expose it to EdenFS. It converts incoming complex data structures down to primitive data types that can go through FFI boundary.

Reviewed By: xavierd

Differential Revision: D21697578

fbshipit-source-id: 8f5505ef4cab342dfa710798a04df4e0e94055d9
2020-06-10 19:29:25 -07:00
Zeyi (Rice) Fan
cfa5945749 implement batch blob fetching API in backingstore crate
Summary:
NOTE: This stack works together, they are separated for easier reviewing. Check D21723465 if you want to read this stack in one place.

This diff implements getBlobBatch in `backingstore` crate that is able to process multiple blob import request at once. This will help EdenFS to process blob import request efficiently. See following diffs in the stack for usage.

To import a list of import requests, the function first check if the file exists locally. When it is already in local store, the function will call the provided `resolve` function to send the blob back. Then it proceeds to send **ONE** request to remote store (in our case, EdenAPI) for all these missing files if this import permits remote importing.

We use the callback style `resolve` parameter because we want to awake waiting threads as soon as the blob is ready, so we don't waste time on unnecessary waiting.

Reviewed By: xavierd

Differential Revision: D21697580

fbshipit-source-id: b550accf6f6163cf6f2e9be6b628e46f44076c37
2020-06-10 19:29:24 -07:00
Jun Wu
d3791fe721 revlogindex: do not use mmap to read .i on Windows
Summary: The `.i` file might get truncated. `mmap` prevents truncation on Windows.

Reviewed By: DurhamG

Differential Revision: D21869321

fbshipit-source-id: 18d6186954c6fb4c757d571b97fe620eed0731ea
2020-06-03 15:23:57 -07:00
Jun Wu
954d1ce8e4 renderdag: make render API accept trait objects
Summary: This makes the API more flexible.

Reviewed By: sfilipco

Differential Revision: D21626196

fbshipit-source-id: d0d28ba075ee3321a1a757f848fb72592827d75d
2020-06-03 13:26:27 -07:00
Jun Wu
fc4c73a7cf revlogindex: implement DagAlgorithm trait
Summary: This allows DAG algorithms to be executed on revlog.

Reviewed By: sfilipco

Differential Revision: D21626217

fbshipit-source-id: e9fd61ad62f95be7b055a0ef8879c59cbeeb60b9
2020-06-03 13:26:26 -07:00
Jun Wu
de8b085e6e revlogindex: port gca and range algorithms from bindag
Summary:
Mostly copy-paste from code added in D19503373 and D19511574. Adjusted to match
the revlog index interface.

Reviewed By: sfilipco

Differential Revision: D21626201

fbshipit-source-id: 05d160e4c03d7e2482b6a4f2d68c3688ad78f568
2020-06-03 13:26:26 -07:00
Jun Wu
234147239a dag: add ToIdSet trait
Summary: The trait converts NameSet to IdSet. It'll be used by the revlog index.

Reviewed By: sfilipco

Differential Revision: D21795869

fbshipit-source-id: 55f7a238158442db9d8bdfe84e64438be504f618
2020-06-03 13:26:25 -07:00
Jun Wu
45d6b00593 dag: add InverseDag
Summary: Add a way to inverse the DAG (swap parent / children relations).

Reviewed By: sfilipco

Differential Revision: D21795870

fbshipit-source-id: 2d076f4ae491141aa758faa5f5f303c97f7e56dc
2020-06-03 13:26:25 -07:00
Jun Wu
a3b663735e dag: add IdLazySet
Summary:
Similar to LazySet, but the iterator is using Ids. This will be useful for
lazy calculations that are cheaper with Ids.

Reviewed By: sfilipco

Differential Revision: D21626208

fbshipit-source-id: 9a34fbf18f0039caeb4f6e698294c4d335354093
2020-06-03 13:26:24 -07:00
Jun Wu
223faebe5f dag: rename DagSet to IdStaticSet
Summary:
The NameSet is not really about Dag. It is about using Id and is static.
Rename it to clarify. In an upcoming change we'll have IdLazySet.

Reviewed By: sfilipco

Differential Revision: D21626204

fbshipit-source-id: 84f25008f7032f6e26a26fc656ccbcd2a5880ecf
2020-06-03 13:26:24 -07:00
Jun Wu
bf90003c24 dag: implement NameIter automatically
Summary:
This makes it possible to use NameIter without manually specifying out iterator
types, which might be quite long.

Reviewed By: sfilipco

Differential Revision: D21626202

fbshipit-source-id: 67b338765c09629645794cf73a9b496271524f9d
2020-06-03 13:26:24 -07:00
Jun Wu
6292253ef8 dag: add fast paths using hints
Summary: Take advantage of Hints and add fast paths.

Reviewed By: sfilipco

Differential Revision: D21626216

fbshipit-source-id: 6d43666bd6cdec7ff4b93032c1064cafd8de85cf
2020-06-03 13:26:23 -07:00
Jun Wu
d3878732f8 dag: set hints with existing hints
Summary: Update hints if they are easy to obtain or calculate.

Reviewed By: sfilipco

Differential Revision: D21626206

fbshipit-source-id: 453b7db2444406ce51d574c688fe536316fb9b0f
2020-06-03 13:26:23 -07:00
Zeyi (Rice) Fan
39583bb2e3 be graceful when trying to fix dir permission
Summary:
In some rare cases, we would have hgcache that contains broader permission than we are expecting. We shouldn't be fixing it if that's the case.

We also might be in situations where hgcache directory isn't entirely created by Mercurial, and the owner of the directory will be different than the process. This will cause the `chmod` call to fail with permission error. In that case, this will cause EdenFS to panic. This is undesirable. We should be handling this case more gracefully and let the original error populate.

Reviewed By: xavierd

Differential Revision: D21854542

fbshipit-source-id: e9d11399aeb40b375725b49f4bcd54050afdcbad
2020-06-03 10:25:03 -07:00
Jun Wu
fb56b1962d dag: move optimization hints to a dedicate structure
Summary:
Previously, the NameSet has properties like "is_all", "is_topo_sorted", etc.
To make lazy sets efficient, it's important to have hints about min / max Ids
and maybe some other information.

Add a dedicated Hints structure for that.

Reviewed By: sfilipco

Differential Revision: D21626219

fbshipit-source-id: 845e88d3333f0f48f60f2739adae3dccc4a2dfc4
2020-06-02 14:00:36 -07:00
Jun Wu
13503a1490 dag: add some default impls for DagAlgorithm
Summary:
Implement a small subset of DagAlgorithm by default. This makes
other implementations of DagAlgorithm slightly easier.

Reviewed By: sfilipco

Differential Revision: D21626199

fbshipit-source-id: ac6dfb5c22bf1da44f521fc9e76d59bfb95063c7
2020-06-02 14:00:36 -07:00
Jun Wu
c920549e09 dag: fix DagSet::contains
Summary:
D21479023 broke it. It should convert to Id, and check Id against the SpanSet,
instead of just checking the IdMap ignoring the SpanSet.

Reviewed By: sfilipco

Differential Revision: D21626193

fbshipit-source-id: 6daf86f292a7acfd3688893a55e2a794cfe068fe
2020-06-02 14:00:36 -07:00
Jun Wu
62719f10eb dag: make to_span_set take reference
Summary: This makes the next change easier to implement.

Reviewed By: sfilipco

Differential Revision: D21626198

fbshipit-source-id: 57ab69cba7f43350767e5d0d52ebfe66764895ca
2020-06-02 14:00:35 -07:00
Jun Wu
48c003fb11 revlogindex: impl IdConvert and PrefixLookup for RevlogIndex
Summary:
Implements part of the dag IdMap related traits.

It does not get used yet, but eventually I'd like `pydag` to be able to work
with an abstracted dag including RevlogIndex.

Reviewed By: sfilipco

Differential Revision: D21626210

fbshipit-source-id: 53f19622f03fd71b76073dccf8dcc9b4778b40ca
2020-06-02 14:00:35 -07:00
Jun Wu
38d6c6a819 revlogindex: include NodeRevMap in RevlogIndex
Summary:
This will allow RevLogIndex to answer node -> rev and hex lookup queries.

Also change RevlogIndex::new to take file names so it can write back the
nodemap index when the index is lagging. That part of logic currently exists in
pyindexes + clindex.pyx, which are going to be replaced by revlogindex.

Practically, this will generate a `00changelog.nodemap` file in svfs, which is
temporarily unused, but will be used once clindex.pyx gets replaced.

Reviewed By: sfilipco

Differential Revision: D21626209

fbshipit-source-id: 297d9eff26a73c26558708f7a2290d4d8ba1e777
2020-06-02 14:00:34 -07:00
Arun Kulshreshtha
3223d12f99 edenapi: add data subcommand to read_res
Summary: Previously, `read_res` was called `data_util` and only dealt with EdenAPI data responses. Support for history responses was added later as a `history` subcommand. For consistency, let's move the top-level commands for data responses underneath a new `data` subcommand. When support for addition response types is added in the future, those can also go under their own subcommands.

Reviewed By: quark-zju

Differential Revision: D21825197

fbshipit-source-id: f5cb759a68324e7d0f98e3448bd5d1cba6417bad
2020-06-02 12:49:18 -07:00
Arun Kulshreshtha
735b112d97 edenapi: rename data_util to read_res
Summary: Give this tool a more descriptive name. (It reads EdenAPI responses, so `read_res` seemed fitting.)

Reviewed By: quark-zju

Differential Revision: D21796964

fbshipit-source-id: 8a4ee365aa3bcf115fc7a3452406ed96b4a25edc
2020-06-02 12:49:18 -07:00
Arun Kulshreshtha
c5143fa4d8 edenapi: rename utils dir to tools
Summary: In line with other crates that contain utility binaries alongside the crate, rename the `edenapi/utils` directory to `edenapi/tools`.

Reviewed By: quark-zju

Differential Revision: D21796899

fbshipit-source-id: 058319e2756b1d596f06d6e57d17a6c07a7f1c9c
2020-06-02 12:49:18 -07:00
Xavier Deguillard
9d18e2eae6 revisionstore: fix permissions on shared directories
Summary:
Now that the shared directories are created with the right permissions, we
could still have some in the wild with the incorrect permissions. Let's make
sure that we fix these up.

Reviewed By: wez

Differential Revision: D21832436

fbshipit-source-id: d8ee40f61b16857d29e1360ec6df50b2a95ea7aa
2020-06-02 08:46:57 -07:00
Xavier Deguillard
1c0668e512 revisionstore: set sgid bit when creating directories
Summary:
This is required for the shared cache to avoid permissions issues when multiple
users are trying to use it.

Reviewed By: fanzeyi

Differential Revision: D21830490

fbshipit-source-id: 3db0dbd674b6d2e10b9361ff3c3a668d046f5d78
2020-06-02 08:46:56 -07:00
Durham Goode
713fbeec24 datastore: support indexedlog as the hgcache write store
Summary:
Adds a remotefilelog.write-hgcache-to-indexedlog config which directs
all hgcache writes to the indexedlog.

This change also removes remotefilelog.indexedlogdatastore as it's on by default
now.

Reviewed By: xavierd

Differential Revision: D21772132

fbshipit-source-id: a71e188df2958fb4f8c4776da23ba87deccb3b79
2020-06-01 11:18:44 -07:00
Jun Wu
6b595410ce revlogindex: make changelog data type consistent
Summary:
Change `NodeRevMap`'s changelog type from `[u8]` to `[RevlogEntry]`.
This makes it consistent with `RevlogIndex`.

Reviewed By: sfilipco

Differential Revision: D21626203

fbshipit-source-id: 7457f48ccd7b3489264684a5db21d21e9eb7a937
2020-06-01 10:56:55 -07:00
Jun Wu
445e9f9fa7 pyindexes: move NodeRevMap to revlogindex
Summary:
NodeRevMap helps converting from a commit hash to a rev number. It's similar to
IdMap in the dag crate, but was designed for the revlog.

Move NodeRevMap to revlogindex so it becomes easier to implement the IdConvert
trait required by the dag crate.

Reviewed By: sfilipco

Differential Revision: D21626211

fbshipit-source-id: 14996f1234231b507efb5186ec30f84df5aaad10
2020-06-01 10:56:55 -07:00
Jun Wu
40fbbff9af pyrevlogindex: move non-Python logic to a pure Rust crate
Summary:
The idea is that the pure Rust revlogindex crate can implement the DagAlgorithm
interface so we will have a consistent interface in the code base that works
for both the existing storage (revlog) and the new segmented changelog.

The other way to do this is to implement the `bindings.dag.namedag` interface
in pure Python for the revlog-based DAG, or supporting quite different
interfaces (ex. revlog DAG and the Rust segmented changelog DAG) in the code
base. At present, I think implementing the Rust DAG traits for revlog is the
most appealing, partially because we already have some key algorithms
implemented (ex. prefix lookup, common ancestors, etc).

Reviewed By: sfilipco

Differential Revision: D21626197

fbshipit-source-id: 733b1af1bcd5fc0784764fc7103412988894d43b
2020-06-01 10:56:54 -07:00
Thomas Orozco
006864a6ce lfs v2: don't send empty batch requests
Summary:
We can skip the round trip to the server if we have nothing to ask or tell the
server.

Reviewed By: xavierd

Differential Revision: D21763025

fbshipit-source-id: 7f199c12ccaa0f66ce765dd976723e4002c5dd4e
2020-06-01 03:45:30 -07:00
Arun Kulshreshtha
0b4dc0f103 edenapi: add usage examples to make_req
Summary: Added comments showing the expected JSON input format for each kind of EdenAPI request.

Reviewed By: xavierd

Differential Revision: D21782765

fbshipit-source-id: bf08dd4b36a33aca506eb0fa0341e40d0150d7cb
2020-05-29 15:45:10 -07:00
Arun Kulshreshtha
10fa44290e edenapi: use array to specify keys in history requests
Summary: Update the JSON format for history requests to use an array rather than an object to represent keys, for the same reason as D21412989. (Namely, that it's possible for two keys to share the same path, making the path unsuitable for use as a field name in a JSON object.)

Reviewed By: xavierd

Differential Revision: D21782763

fbshipit-source-id: eb04013795d1279ecbf00a8a0be106318695bd05
2020-05-29 15:45:10 -07:00
Jake Bailey (Hacklang)
9090d7bdf1 Use types from the oxidized_by_ref crate instead of the oxidized crate for to_oxidized
Summary: This diff replaces the use of the oxidized typechecker types in `to_oxidized`, replacing them with oxidized_by_ref definitions. This unblocks deleting typechecker types and decls from the oxidized crate, since they are used only in this debugging-only path. It also reduces the amount of conversion necessary for `to_oxidized` (since we need not convert the oxidized_by_ref structures the typechecker already uses, like `Ty`).

Differential Revision: D21587518

fbshipit-source-id: 5e7ec2237d0059070653b45d15eb7c1259588ccb
2020-05-28 22:19:13 -07:00
Durham Goode
0c5a19eaab configs: move Repo enum to fb/ module
Summary:
Let's hide the repo list in the fb/ module since we don't really want
to expose the whole repo list in our public repo.

Reviewed By: sfilipco

Differential Revision: D21731419

fbshipit-source-id: 22ca20a3a80637c852e313f1390849aac1fecbf4
2020-05-28 12:14:08 -07:00
Jun Wu
14b3c2e0f0 dag: move from_ascii to traits
Summary:
This adds flexibility. Now every type that implements DagAddHeads, including
NameDag, can import ASCII graphs.

Reviewed By: sfilipco

Differential Revision: D21626213

fbshipit-source-id: e258d88f97cbcc9aaf98d353a929803325185df7
2020-05-27 12:16:48 -07:00
Jun Wu
bd6c6fe18b dag: implement IdConvert on Dag structs
Reviewed By: sfilipco

Differential Revision: D21626214

fbshipit-source-id: 90d5a587e42340ac2b0f0b3f35f3bc084e969d40
2020-05-27 12:16:48 -07:00
Jun Wu
be5e3a20b4 dag: IdMapLike -> IdConvert
Summary: The trait was about converting between Id and VertexName. Rename to clarify.

Reviewed By: sfilipco

Differential Revision: D21626195

fbshipit-source-id: 874ca4ca3a1467084a08c6d9aa321201974e1978
2020-05-27 12:16:47 -07:00
Jun Wu
64dc05ab9d dag: move add_heads, flush, add_heads_and_flush to traits
Summary: This allows other kinds of DAG to implement the operations.

Reviewed By: sfilipco

Differential Revision: D21626220

fbshipit-source-id: 896c5ccebb1672324d346dfca6bcac9b4d3b4929
2020-05-27 12:16:47 -07:00
Jun Wu
4934987796 dag: implement PrefixLookup for Dag, MemDag and MemIdMap
Summary: This makes things a bit more flexible.

Reviewed By: sfilipco

Differential Revision: D21626194

fbshipit-source-id: f3ad486bcd5a6478d9e00f674d48f99504cded8c
2020-05-27 12:16:46 -07:00
Jun Wu
26217dcdb5 dag: move hex prefix lookup to a trait
Summary: This makes it possible for other types to implement the hex prefix lookup.

Reviewed By: sfilipco

Differential Revision: D21626218

fbshipit-source-id: 96e8b8c37e5aae2bd60658a238333b61902936d1
2020-05-27 12:16:46 -07:00
Jun Wu
577c9442bb dag: add VertexName::from_hex
Summary: It will be used in the next change.

Reviewed By: sfilipco

Differential Revision: D21626207

fbshipit-source-id: bbef70ef9d4f9aaa2039a6bc15d296e88db7f8dc
2020-05-27 12:16:46 -07:00
Jun Wu
38cc83e1bf dag: add short aliases for main public types
Summary:
Types like IdDag are not really used. The use of the word "name" is sometimes
confusing in other context. Therefore export shorter names like Dag, MemDag,
Vertex, avoid "name" in NameDag, MemNameDag and NameSet. This makes external
code shorter and less ambiguous.

Reviewed By: sfilipco

Differential Revision: D21626212

fbshipit-source-id: 5bcf3cecfd38277149b41bf3ba9e6d4ef2a07b2b
2020-05-27 12:16:45 -07:00
Jun Wu
e0d11803f2 dag: move DagAlgorithm to an independent trait
Summary:
This decouples DagAlgorithm from the IdMap + IdDag backend, making it possible
to support other kinds of backends of DagAlgorithm (ex. a revlog backend).

Reviewed By: sfilipco

Differential Revision: D21626200

fbshipit-source-id: f53cc271a200062e9c02f739b6453e1d7de84e6d
2020-05-27 12:16:45 -07:00
Zeyi (Rice) Fan
8b49918ce1 add a fake remotestore to enable memcache
Summary:
xavierd and I realized that memcache is not actually enabled in EdenFS as `ContentStoreBuilder` only sets it when there is are remote store. We believe turning on memcache store can help us improve some performance for importing blobs (80% hit rate).

This diff adds a noop remote store that does nothing to trick `ContentStoreBuilder` to include the memcache store. We decided to go this way instead of modifying `ContentStoreBuilder` is because EdenFS is going to have a real remotestore very soon, and changes to `ContentStoreBuilder` may have some impact on how Mercurial uses it.

Reviewed By: xavierd

Differential Revision: D21621234

fbshipit-source-id: 5502e001ab498134b6a927d83be823261e70e9e1
2020-05-21 16:32:42 -07:00
Durham Goode
1d770d8e17 config: fix dynamic config overriding non-subset final value
Summary:
There was a bug where if the dynamicconfig chose a value that matched
one of the ported rc files, but it did not match the final value of that config
(from a not-yet-ported rc file), it would chose the dynamicconfig value instead
of the original final value. Let's drop the dynamicconfig value in this case as
well.

Reviewed By: quark-zju

Differential Revision: D21674469

fbshipit-source-id: efcd36e9602e16210999ec8c88e86b1d7ee355e4
2020-05-21 10:41:11 -07:00
Xavier Deguillard
ee81495b3d revisionstore: handle empty proxy
Summary:
If either the config, or the environment variable is empty, the URL parsing
would raise an error, while the intent is to not have proxy. Let's handle it
properly.

Reviewed By: DurhamG

Differential Revision: D21669922

fbshipit-source-id: 84c8d09242ac8d5571f7592d874f4954692110f5
2020-05-20 14:03:50 -07:00
Durham Goode
98d428929b config: fix config validation at clone time
Summary:
At clone time we apply dynamic configs in memory, instead of loading
them from disk. The validation logic operated on the config value's location
field, which isn't set for data that comes from in memory. So we need to update
the validation logic to also consider the value source if location is not set.

Reviewed By: quark-zju

Differential Revision: D21664205

fbshipit-source-id: 8460c58c6d654780048de51ada8178c70ff0a9e6
2020-05-20 13:35:28 -07:00
Durham Goode
75c96be5ff configs: don't write dynamic config if it hasn't changed
Summary:
We kick off a background process every 15 minutes or so to compute the
new dynamicconfig. If the config hasn't changed, we should just touch the file
instead of rewriting the whole thing.

Reviewed By: quark-zju

Differential Revision: D21653098

fbshipit-source-id: 9d53d2a636cff082cd048994255cc809ce1b0221
2020-05-20 13:35:28 -07:00
Durham Goode
9f6f200a08 configs: version dynamic configs
Summary:
If we release a new version of Mercurial, we want to ensure that it's
builtin configs are used immediately. To do so, let's write a version number
into the generated config file, and if the version number doesn't match, we
force a synchronous regeneration of the config file.

For now, if regeneration fails, we just log it. In the future we'll probably
throw an exception and block the user since we want to ensure people are running
with modern configuration.

Reviewed By: quark-zju

Differential Revision: D21651317

fbshipit-source-id: 3edbaf6777f4ca2363d8617fad03c21204b468a2
2020-05-20 13:35:28 -07:00
Stefan Filip
60966c93e7 autocargo: regenerate
Summary: maintenance

Reviewed By: StanislavGlebik

Differential Revision: D21640322

fbshipit-source-id: d0b2ce604735c05d540d06835c8e4c8a940fbf5c
2020-05-19 16:08:40 -07:00
Durham Goode
861f813f25 configs: convert facebook_overrides.rc
Summary: Converts facebook_overrides.rs to our dynamic config generator

Reviewed By: quark-zju

Differential Revision: D21625721

fbshipit-source-id: 2a374939d90f1fb7f9173268e2a7fa636d672393
2020-05-19 13:23:19 -07:00
Jun Wu
88a9982eb6 hgcommands: make version a native command
Reviewed By: DurhamG

Differential Revision: D19803761

fbshipit-source-id: 32f77cc667375e537e1ff70316329251359ae6ed
2020-05-18 18:50:42 -07:00
Xavier Deguillard
97a90e7de9 revisionstore: add doc comment
Summary:
We didn't have a high level overview of the types and traits and their
purposes, add that.

Reviewed By: DurhamG

Differential Revision: D21599759

fbshipit-source-id: 18e8d23ed00134f9d662778eddaee4a9451f7c2c
2020-05-18 12:11:52 -07:00
Jun Wu
2aa85dc4d7 version: a Rust crate providing version information
Summary: Otherwise the version information is only available in Python.

Reviewed By: DurhamG

Differential Revision: D19803762

fbshipit-source-id: 044c5da86efc8c657d0c422a2b1947086444895e
2020-05-18 09:00:40 -07:00
Jun Wu
aeac1551d2 dag: implement beautify
Summary:
This function reorders commits so the graph looks better.
It will be used to optimize graph rendering for cloud smartlog (and perhaps
smartlog in the future).

Reviewed By: markbt

Differential Revision: D21554675

fbshipit-source-id: d3f0f27c7935c49581cfa6e87d7c32eb5a075f75
2020-05-14 12:03:43 -07:00
Jun Wu
cde3140e8f dag: implement BitAnd, BitOr, Sub for NameSet
Summary: This makes it easier to do `a & b`, `a | b`, `a - b`.

Reviewed By: markbt

Differential Revision: D21554677

fbshipit-source-id: e1e2571a3dc83f80a1ec7a056f2c8f71ab292d9e
2020-05-14 12:03:43 -07:00
Durham Goode
b922952589 configs: convert fbsource_overrides.rc
Summary: Converts fbsource_overrides.rs to our dynamic config generator

Reviewed By: quark-zju

Differential Revision: D21536497

fbshipit-source-id: 26d4cc656114bc7bd85c2bf09b149d78cc8eb08a
2020-05-14 10:27:43 -07:00
Durham Goode
f181eef520 configs: convert ovrsource_overrides.rc
Summary: Converts ovrsource_overrides.rs to our dynamic config generator

Reviewed By: quark-zju

Differential Revision: D21536498

fbshipit-source-id: 3f0021b7be90a82e5a517fa81fb3dad04b2837ed
2020-05-14 10:27:42 -07:00
Xavier Deguillard
99f74f0155 lfs: move uploaded blobs to the shared store
Summary:
As a developpers is working on large blobs and iterating on them, the local LFS
store will be growing significantly over time, and that growth is unfortunately
unbounded and will never be cleaned up. Thankfully, one the guarantee that the
server is making is that an uploaded LFS blob will never be removed[0]. By using
this property, we can simply move blobs from the local store to the shared
store after uploading the blob is complete.

[0]: As long as it is not censored.

Reviewed By: DurhamG

Differential Revision: D21134191

fbshipit-source-id: ca43ddeb2322a953aca023b49589baa0237bbbc5
2020-05-13 12:50:20 -07:00
Xavier Deguillard
dc589c38e5 revisionstore: don't import unused format_err
Summary: The compiler is warning about it.

Reviewed By: singhsrb

Differential Revision: D21550266

fbshipit-source-id: 4e66b0dda0e443ed63aeccd888d38a8fcb5e4066
2020-05-13 10:44:07 -07:00
Jun Wu
4f39a8e5a6 mutationstore: add a method that returns a dag
Summary:
Part of the mutation graph (excluding split and fold) can fit in the DAG
abstraction. Add a method to do that. This allows cross-dag calculations
like:

  changelogdag = ... # suppose available by segmented changelog

  # mutdag and changelogdag are independent (might have different nodes),
  # with full DAG operations on either of them.
  mutdag = mutation.getdag(...)
  mutdag.heads(mutdag.descendants([node])) & changelogdag.descendants([node2]) # now possible

Comparing to the current situation, this has some advantages:
- No need to couple the "visibility", "filtered node" logic to the mutation
  layer. The unknown nodes can be filtered out naturally by a set "&"
  operation.
- DAG operations like heads, roots can be performed on mutdag when it's
  previously impossible. We also get operations like visualization for free.

There are some limitations, though:
- The DAG cannot represent non 1:1 modifications (fold, split) losslessly.
  Those relationships are simply ignored for now.
- The MemNameDag is not lazy. Reading a long chain of amends might be slow.
  For most normal use-cases it is probably okay. If it becomes an issue we
  can seek for other solutions, for example, store part of mutationstore
  directly in a DAG format on disk, or have fast paths to bypass long
  predecessor chain calculation.

Reviewed By: DurhamG

Differential Revision: D21486521

fbshipit-source-id: 03624c8e9803eb1852b3034b8f245555ec582e85
2020-05-13 09:45:24 -07:00
Arun Kulshreshtha
647a91647b edenapi: add history support to data_util
Summary: Add the ability to parse EdenAPI history responses to `data_util`.

Reviewed By: quark-zju

Differential Revision: D21489228

fbshipit-source-id: 42dda64273673431a6f3e4d7bd430689c76c387f
2020-05-12 16:26:22 -07:00
Arun Kulshreshtha
40928f027c make_req: take array instead of object as input for data requests
Summary: Change `make_req` to take a JSON array as input when constructing `DataRequest`s instead of a JSON object. This is more correct because DataRequests can include multiple `Key`s with the same path; this cannot be represented as an object since an object is effectively a hash map wherein we would have duplicate keys.

Reviewed By: quark-zju

Differential Revision: D21412989

fbshipit-source-id: 07a092a15372d86f3198bea2aa07b973b1a8449d
2020-05-12 16:26:20 -07:00
Ellis Hoag
1d0d626a36 Pass config object down to repack
Summary:
Pass `configparser::config::ConfigSet` to `repack` in
`revisionstore/src/repack.rs` so that we can use various config values in `filter_incrementalpacks`.

* `repack.maxdatapacksize`, `repack.maxhistpacksize`
  * The overall max pack size
* `repack.sizelimit`
  * The size limit for any individual pack
* `repack.maxpacks`
  * The maximum number of packs we want to have after repack (overrides sizelimit)

Reviewed By: xavierd

Differential Revision: D21484836

fbshipit-source-id: 0407d50dfd69f23694fb736e729819b7285f480f
2020-05-11 16:41:30 -07:00
Xavier Deguillard
1cd0bba3fa revisionstore: enable use of proxies for LFS
Summary:
If http_proxy.no is set, we should respect it to avoid sending traffic to it
whenever required.

Reviewed By: wez

Differential Revision: D21383138

fbshipit-source-id: 4c8286aaaf51cbe19402bcf8e4ed03e0d167228b
2020-05-11 10:36:11 -07:00
Xavier Deguillard
2001c3fd69 revisionstore: add translate_lfs_missing to remote store get
Summary:
When Qing implemented all the get method, the translate_lfs_missing function
didn't exist, and I forgot to add them in the right places when landing the
diff that added it. Fix this.

Reviewed By: sfilipco

Differential Revision: D21418043

fbshipit-source-id: baf67b0fe60ed20aeb2c1acd50a209d04dc91c5e
2020-05-11 10:34:01 -07:00
Jun Wu
85a60dd9e4 renderdag: provide a method to render MemNameDag directly to a string
Summary: This would be handy to visualize a MemNameDag.

Reviewed By: sfilipco

Differential Revision: D21486522

fbshipit-source-id: c8d7147dc53a1a7c1b8b09ce055493c69cceba2f
2020-05-11 09:50:00 -07:00
Jun Wu
4352be72d3 renderdag: use MemNameDag to simplify tests
Summary:
Use MemNameDag::from_ascii to simplify the tests. This removes the need of:
- using tempdir
- converting between Id and VertexName manually via an IdMap
- depending on drawdag directly

Reviewed By: sfilipco

Differential Revision: D21486519

fbshipit-source-id: f04061d8892f043de40e7e321273acc51e15308a
2020-05-11 09:50:00 -07:00
Jun Wu
60684eb2c5 dag: make ASCII -> MemNameDag a public API
Summary:
It seems handy to construct a Dag just from ASCII. Therefore move it to a
public interface.

Reviewed By: sfilipco

Differential Revision: D21486525

fbshipit-source-id: de7f4b8dfcbcc486798928d4334c655431373276
2020-05-11 09:49:59 -07:00
Jun Wu
a6b7e965f3 dag: remove a TODO comment
Summary: It was done as NameSet.

Reviewed By: sfilipco

Differential Revision: D21479022

fbshipit-source-id: 1c32cabb27d72a6438409ede226104a9ebac6a1d
2020-05-11 09:49:59 -07:00
Jun Wu
4eb9251172 dag: move sort and parent_names to NameDagAlgorithm
Summary:
They are part of the read-only algorithms that are not specific to a certain
type of NameDag.

Reviewed By: sfilipco

Differential Revision: D21479017

fbshipit-source-id: 3fa58071ac43246d3cd45d84384ee93c7385f414
2020-05-11 09:49:59 -07:00
Jun Wu
282e034d30 dag: add MemNameDag
Summary:
Adds an in-memory NameDag so we can construct the DAG and use its algorithms by
just providing parents function and heads.

Reviewed By: sfilipco

Differential Revision: D21479021

fbshipit-source-id: e12d53a97afec77b2307d5efbb280bd506dee0ba
2020-05-11 09:49:58 -07:00
Jun Wu
5cbb99f4eb dag: add MemIdMap
Summary: Adds an in-memory IdMap to be used in an in-memory NameDag.

Reviewed By: sfilipco

Differential Revision: D21479018

fbshipit-source-id: bc702762b059e8659c6ab322f3c39f032e95d5b6
2020-05-11 09:49:58 -07:00
Jun Wu
682e8e96a7 dag: use IdMap traits in NameDag and NameSet
Summary:
This allows them to switch to a different IdMap implementation relatively
easily.

Reviewed By: sfilipco

Differential Revision: D21479023

fbshipit-source-id: 8ecb99cafe2093ec7d14b848ffa08581c5300414
2020-05-11 09:49:57 -07:00
Jun Wu
759f8b35c5 dag: move some IdMap operations to traits
Summary: This will allow different IdMap implementations.

Reviewed By: sfilipco

Differential Revision: D21479016

fbshipit-source-id: 852501896fddcb82624338acd9dceee41150e302
2020-05-11 09:49:57 -07:00
Jun Wu
30163eeb58 dag: update snapshot_map on change
Summary:
`NameDag::add_heads` API changes the internal `dag` state without updating
`snapshot_map`. That will cause queries relying on `snapshot_map` to fail.
Update it so that `snapshot_map` gets updated by `add_heads`.

Reviewed By: sfilipco

Differential Revision: D21479019

fbshipit-source-id: 70528aa4a488cef3dc71bf21dd89e45cfe763794
2020-05-11 09:49:57 -07:00
Jun Wu
f014f86b7a dag: move NameDag algorithms to a trait
Summary:
This makes it easier to add an "in-memory-only" NameDag with all the algorithms
implemented.

Reviewed By: sfilipco

Differential Revision: D21479020

fbshipit-source-id: c1a73e95f3291c273c800650f70db2a7eb0966d7
2020-05-11 09:49:56 -07:00
Xavier Deguillard
4e7303efd9 lfs: only upload when LFS blobs are present
Summary: If no LFS blobs needs uploading, then don't try to connect to the LFS server in the first place.

Reviewed By: DurhamG

Differential Revision: D21478243

fbshipit-source-id: 81fa960d899b14f47aadf2fc90485747889041e1
2020-05-08 13:24:21 -07:00
Meyer Jacobs
d49ac73f4c datastore: remove HgIdDataStore ::get_delta and ::get_delta_chain
Summary:
Remove HgIdDataStore::get_delta and all implementations. Remove HgIdDataStore::get_delta_chain from trait, remove all unnecessary implentations, remove all implementations from public Rust API. Leave Python API and introduce "delta-wrapping".

MutableDataPack::get_delta_chain must remain in some form, as it necessary to implement get using a sequence of Deltas. It has been moved to a private inherent impl.

DataPack::get_delta_chain must remain in some form for the same reasons, and in fact both implenetations can probably be merged, but it is also used in repack.rs for the free function repack_datapack. There are a few ways to address this without making DataPack::get_delta_chain part of the public API. I've currently chosen to make the method pub(crate), ie visible only within the revisionstore crate. Alternatively, we could move the repack_datapack function to a method on DataPack, or use a trait in a private module, or some other technique to restrict visibility to only where necessary.

UnionDataStore::get has been modified to call get on it's sub-stores and return the first which matches the given key.

MultiplexDeltaStore has been modified to implement get similarly to UnionDataStore.

Reviewed By: xavierd

Differential Revision: D21356420

fbshipit-source-id: d04e18a0781374a138395d1c21c3687897223d15
2020-05-07 11:04:01 -07:00
Mark Thomas
052e7c3877 check-code: convert to Python 3
Summary:
Update `contrib/check-code.py` to Python 3.

Mostly it was already compatible, however stricter regular expression parsing
revealed a case where one of our tests wasn't working, and as a result lots of
instances of `open(file).read()` existed that this test should have caught.

I have fixed up most of the instances in the code, although there are many
in the test suite that I have ignored for now.

Reviewed By: quark-zju

Differential Revision: D21427212

fbshipit-source-id: 7461a7c391e0ade947f779a2b476ca937fd24a8d
2020-05-07 09:07:50 -07:00
Durham Goode
939ff6c956 configs: move repo names to a enum
Summary:
A number of repo names are used quite frequently. Let's use an enum to
prevent typos and make things cleaner.

Reviewed By: quark-zju

Differential Revision: D21365036

fbshipit-source-id: 1d3d681443df181e9076f5ee87029ae61124a486
2020-05-06 09:03:17 -07:00
Mohan Zhang
2ef3e20e4d Run auto cargo locally
Summary: Since thirdpart depedency change on D21341319

Reviewed By: jsgf, wqfish

Differential Revision: D21417890

fbshipit-source-id: 3cc6bafa23512c7ae489513216bcafa46e7a744f
2020-05-05 20:59:02 -07:00
Zeyi (Rice) Fan
952069397c fix get blob local
Summary: This bug got in while iterating the original Diff. It should only be returning empty when the blob does not exist locally.

Reviewed By: xavierd

Differential Revision: D21417659

fbshipit-source-id: 676e22313ab4a024af5341d8c99797fc062bd293
2020-05-05 20:21:21 -07:00
Durham Goode
97d84e3b5d configs: move hgrc.dynamic to always be in the shared repo
Summary:
Instead of trying to maintain two hgrc.dynamic's for shared repositories,
let's just always use the one in the shared repo. In the long term we may be
able to get rid of the working-copy-specific hgrc entirely.

This does remove the ability to dynamically configure individual working copies.
That could be useful in cases where we have both eden and non-eden pointed at
the same repository, but I don't think we rely on this at the moment.

Reviewed By: quark-zju

Differential Revision: D21333564

fbshipit-source-id: c1fb86af183ec6dc5d973cf45d71419bda5514fb
2020-05-05 18:19:10 -07:00
Durham Goode
6e7f85b949 config: load .hg/hgrc.dynamic
Summary:
Adds .hg/hgrc.dynamic to the default load path, before .hg/hgrc though,
so it can be override.

Reviewed By: quark-zju

Differential Revision: D21310921

fbshipit-source-id: 288a2a2ba671943a9f8532489c29e819f9d891e1
2020-05-05 18:19:08 -07:00
Durham Goode
dc90e2ca04 configs: add domain, platform, and improve dynamicconfigs
Summary:
Adds dynamic config conditionals for domain, platform, and adds a few
useful helper functions.

Reviewed By: quark-zju

Differential Revision: D21310920

fbshipit-source-id: 58f35e52d6d7a4edae2b3aefff533ef2c021aa57
2020-05-05 18:19:08 -07:00
Durham Goode
8744b81151 git: update git dependency to 0.13.5 to match internal version
Summary:
Our internal git dependency got upgraded, so we need to upgrade our
Cargo.toml version.  Unfortunately this doesn't seem to have any test coverage?

Reviewed By: singhsrb

Differential Revision: D21410241

fbshipit-source-id: 64fe7f39a9c93aa5d97ce095ee1641c1cc6ed365
2020-05-05 15:35:12 -07:00
Zeyi (Rice) Fan
3baa8cc9b4 check if the blob fetching is present locally
Summary:
Talked with xavierd last week and we can use LocalStore's `get_missing` to determine if a blob is present locally. In this way we can prevent the backingstore crate from accidentally asking EdenAPI for a blob, so better control at EdenFS level.

With this change, we can use this function at the time where a blob import request is created with confidence that this should be short cheap call.

This diff should not change any behavior or performance.

Reviewed By: xavierd

Differential Revision: D21391959

fbshipit-source-id: fd31687da1e048262cb4eae2974cab6d8915a76d
2020-05-05 11:14:40 -07:00
Zeyi (Rice) Fan
da28f5a5b1 revisionstore: create directory with group share permission in correct places
Summary: When we create directory at certain places, we want these directories to be shared between different users on the same machine. This Diff uses the previously added `create_shared_dir` function to create these directories.

Reviewed By: xavierd

Differential Revision: D21322776

fbshipit-source-id: 5af01d0fc79c8d2bc5f946c105d74935ff92daf2
2020-05-04 19:21:33 -07:00
Jun Wu
73ff6559e6 zstore: add simple caching
Summary: Add simple caching so zstore can avoid some zstd calculation.

Reviewed By: DurhamG

Differential Revision: D21213076

fbshipit-source-id: 5e3152949cf4e6d6193c3ef3401f24e2efac5620
2020-05-01 14:24:52 -07:00
Durham Goode
3ac2be361a configs: move fbrules to a Facebook only part of the crate
Summary:
We'll be adding a bunch of Facebook specific configuration and values
here. Let's move it to someplace not open source.

Reviewed By: quark-zju

Differential Revision: D21241038

fbshipit-source-id: 2ac9cdce40b1b46f15f171d9d1f6b6692dcd29bf
2020-05-01 13:17:21 -07:00
Durham Goode
b1a2785a19 configparser: add ensure_location_supersets function
Summary:
Implements an ensure_location_supersets function who's goal is to
verify that a given config location specifies the exact same configs as a given
set of other locations. Any inconsistencies are removed from the config and
reported to the caller.

This will be used to ensure our dynamic configs match our existing rc file
configs exactly, before we delete the file configs.

Reviewed By: quark-zju

Differential Revision: D21240837

fbshipit-source-id: e2c8ec054a3696d2cf02e65c212ad886c5117253
2020-05-01 13:17:21 -07:00
Jason White
d5b2fb798e Fix outdated Cargo.toml files
Summary: `cargo autocargo` should normally produce no changes on `master`. The features of the `log` crate was updated in D21303891 without re-running autocargo. This fixes it.

Reviewed By: dtolnay

Differential Revision: D21349799

fbshipit-source-id: ce487bc5989e179673297350249593103b4d34dd
2020-05-01 10:29:33 -07:00
Zeyi (Rice) Fan
8830ed55df util: not try to create the directory when it already exists
Summary: Fix permission issues we are seeing with the latest Mercurial release.

Reviewed By: xavierd

Differential Revision: D21294499

fbshipit-source-id: bcfb13dd005258b2e3b74fa281dbd8df36133ef6
2020-04-28 20:33:59 -07:00
Carolyn Busch
4eeab3b81b Update cpython to 0.5
Summary:
D21270958 updated the cpython, python27-sys, and python3-sys crates to 0.5. Update
the Mercurial cargo dependencies to match.

Reviewed By: xavierd

Differential Revision: D21281875

fbshipit-source-id: ccad68749a25d11240351b5faeef27cb9c693456
2020-04-28 11:47:41 -07:00
Jun Wu
d479053954 metalog: support exporting to a git repo
Summary:
I wanted to figure out "who added this visible head", "what is the difference
between this metalog root and that root". Those are actually source control
operations (blame, diff). Add a git export feature so we can export metalog
to git to run those queries.

Choosing git here as we don't have native Rust utilities to create a more
efficient hg repo yet.

Ideally we can also make hg operate on a metalog directory as a "metalogrepo"
directly. However that seems to be quite difficult right now due to poor
abstractions.

Reviewed By: DurhamG

Differential Revision: D21213073

fbshipit-source-id: 4cc0331fbad6e1586907c0a66c18bcc25608ea49
2020-04-27 20:25:25 -07:00
Jun Wu
a0207c4542 metalog: expose root id API
Summary: This allows the Python world to obtain the root ID for logging purpose.

Reviewed By: DurhamG

Differential Revision: D21179513

fbshipit-source-id: 3f289c06d3d470ff492de39fa985203b3facbf00
2020-04-27 19:50:58 -07:00
Jun Wu
d8bdae0449 indexedlog: remove chown feature
Summary:
We removed the feature in D20704618 and it does not cause complaints.
Let's remove the code supporting the chown feature.

Reviewed By: DurhamG

Differential Revision: D21170307

fbshipit-source-id: c845016219e8c681930bb1780b94e6d31ca99730
2020-04-27 15:47:59 -07:00
Xavier Deguillard
86965b2f80 revisionstore: query store before fetching
Summary:
While the change looks fairly mechanical and simple, the why is a bit tricky.
If we follow the calls of `ContentStore::get`, we can see that it first goes
through every on-disk stores, and then switches to the remote ones, thanks to
that, when we reach the remote stores there is no reason to believe that the
local store attached to them contains the data we're fetching. Thus the
code used to always prefetch the data, before reading from the store what was
just written.

While this is true for regular stores (packstore, indexedlog, etc), it starts
to break down for the LFS store. The reason being that the LFS store is
internally represented as 2 halves: a pointer store, and a blob store.  It is
entirely possible that the LFS store contains a pointer, but not the actual
blob. In that case, the `get` executed on the LFS store will simply return
`Ok(None)` as the blob just isn't present, which will cause us to fallback to
the remote stores. Since we do have the pointer locally, we shouldn't try to
refetch it from the remote store, and thus why a `get_missing` needs to be run
before fetching from the remote store.

As I was writing this, I realized that all of this subtle behavior is basically
the same between all the stores, but unfortunately, doing a:
  impl<T: RemoteDataStore + ?Sized> HgIdDataStore for T
Conflicts with the one for `Deref<Target=HgIdDataStore>`. Macros could be used
to avoid code duplication, but for now let's not stray into them.

Reviewed By: DurhamG

Differential Revision: D21132667

fbshipit-source-id: 67a2544c36c2979dbac70dac5c1d055845509746
2020-04-27 12:53:11 -07:00
Qing Dong
b4edb7ff7f revisionstore: implement the get() functions on the various LocalDataStore interface
Summary: implement the get() functions on the various LocalDataStore interface implementations

Reviewed By: quark-zju

Differential Revision: D21220723

fbshipit-source-id: d69e805c40fb47db6970934e53a7cc8ac057b62b
2020-04-27 12:35:24 -07:00
Xavier Deguillard
39d49b694a revisionstore: remove memcache dependency on @mode/mac
Summary:
Memcache isn't available for Mac, but we can build the revisionstore with Buck
on macOS when building EdenFS. Let's only use Memcache for fbcode builds on
Linux for now.

Reviewed By: chadaustin

Differential Revision: D21235247

fbshipit-source-id: 5943ad84f6442e4dabbd2a44ae105457f5bb9d21
2020-04-24 17:29:36 -07:00
Zeyi (Rice) Fan
3374b99f28 util: ensure correct permission is set when creating directories
Summary:
When creates directories sometime we want to make sure other users within the same group have the write access to it to enable data sharing. Previously we rely on setting umask for the entire process to make sure the newly created directories have the correct permission bit. This is kind fragile and error-prone when running in a multi-thread environment.

This diff introduces an internal function `create_dir_with_mode` to create directory with specified permission mode. It first creates a temporary directory within the parent of the directory being created, setting up the correct permission bit, then attempts to rename the temporary directory to the desired name. This ensures that we never leave a directory without the correct permission in the place we need and without changing umask for the process.

Reviewed By: xavierd

Differential Revision: D21188903

fbshipit-source-id: 381bff7d3aaca097b9d50150e86cbbf70a90a0a5
2020-04-24 17:17:05 -07:00
Durham Goode
092d350800 filesystem: add treestate walking logic
Summary:
The second phase of pending changes is to iterate over the treestate
and figure out what files were not seen in the filesystem walk. This diff
implements that.

Reviewed By: xavierd

Differential Revision: D20546899

fbshipit-source-id: 3523fbc7e31ef0ed09c4937c72264b64e2a3db5b
2020-04-24 13:58:53 -07:00
Durham Goode
73a45b695b filesystem: add filesystem walking to PendingChanges
Summary:
The first phase of pending changes is inspecting the filesystem for
changes. This diff adds that logic.

Reviewed By: xavierd

Differential Revision: D20546909

fbshipit-source-id: 1c2c0fa7f700dbff4acfce4d5271b4472a13571f
2020-04-24 13:58:53 -07:00
Xavier Deguillard
19bfd35298 revisionstore: multiplex stores should return a path on flush
Summary:
On repack, when the Rust stores are in use, the repack code relies on
ContentStore::commit_pending to return the path of a newly created packfile, so
it won't delete it when going over the repacked ones. When LFS is enabled, both
the shared and the local stores are behind the LfsMultiplexer store that
unfortunately would always return `Ok(None)`. In this situation, the repack
code would delete all the repacked packfiles, which usually is the expected
behvior, unless only one packfile is being repacked, in which case the repack
code effectively re-creates the same packfile, and is then subsequently
deleted.

The solution is for the multiplex stores to properly return a path if one was
returned from the underlying stores.

Reviewed By: DurhamG

Differential Revision: D21211981

fbshipit-source-id: 74e4b9e5d2f5d9409ce732935552a02bdde85b93
2020-04-23 15:14:28 -07:00
Arun Kulshreshtha
6d3cacf9fd edenapi: add utility programs
Summary:
Add two utility programs for ad-hoc debugging of EdenAPI. EdenAPI requests and responses are encoded as CBOR, which is not easy to work with manually on the command line. In order to allow debugging the HTTP API using tools like `curl`, we need tools that can generate raw request payloads and interpret CBOR responses.

The utility programs included in this diff are:

- `make_req` - Can construct EdenAPI request payloads from a human-editable JSON representation.
- `data_util` - Can list, validate, and extract the contents of an EdenAPI data response.

These tools can be used by themselves or as part of a pipeline. See test plan for examples.

Reviewed By: xavierd

Differential Revision: D21136575

fbshipit-source-id: d1ac8d92964614005078a6ac76dd0835c29a80a5
2020-04-23 11:43:51 -07:00
Mark Thomas
c05efd8a5c mutationstore: move MutationEntry type to types crate
Summary: Move the MutationEntry type to the Mercurial types crate.  This will allow us to use it from Mononoke.

Reviewed By: quark-zju

Differential Revision: D20871338

fbshipit-source-id: 8de3bb8a2673673bc4c8a6cc7578a0a76358c14a
2020-04-23 08:58:10 -07:00
Durham Goode
dbff6c6b9a filesystem: add initial PendingChanges stubs
Summary:
The part of status that lists what files have changed is called
PendingChanges. This diff introduces the initial stub for PendingChanges. The
pending changes algorithm involves three parts:

1. Looking at files on the filesystem for changes.
2. Looking at files in the dirstate map for changes.
3. Looking at the content for any files that we were unsure of during steps 1
and 2.

This diff puts the basic state machine in place, and accepts the basic
information about the working copy (the root and what type of filesystem it is).
In the future we might have it detect what type of filesystem it is, but for now
this makes it easy.

Reviewed By: xavierd

Differential Revision: D20546898

fbshipit-source-id: a3030b7c846b3cb2fcba805b7fe4744df7c5764e
2020-04-22 19:55:50 -07:00
Durham Goode
701273d08f treestate: trim separators off get_filtered_keys inputs and outputs
Summary:
treestate.get_filtered_keys passes directory paths to the filter
function and returns directory matches with a trailing '/' on the end. This
makes it difficult to act as a path normalization function when the caller
doesn't know if the path is a file or directory.

It seems like we can just strip the trailing '/' before exposing the strings to
the caller (both as filter inputs and as get_filtered_keys outputs).

This is useful in the following diff that adds a case normalization crate.

Reviewed By: xavierd

Differential Revision: D20880881

fbshipit-source-id: 6e9f419178b4e278844244bd6aff2fc10e09d2cd
2020-04-22 19:55:50 -07:00