Commit Graph

1314 Commits

Author SHA1 Message Date
Jun Wu
0695b17375 dag: abstract some implementations
Reviewed By: sfilipco

Differential Revision: D24399504

fbshipit-source-id: 388b6788fbe7bbb30a34dcb91b4ca488d49ac8af
2020-10-20 15:19:30 -07:00
Jun Wu
558ed0466f dag: Persist::lock takes &mut self
Summary: It'll satisfy a future change.

Reviewed By: sfilipco

Differential Revision: D24399505

fbshipit-source-id: 6ee2cd0d2b4bd20003082e2733423647cb99619b
2020-10-20 15:19:30 -07:00
Jun Wu
f8e3d67631 dag: add TryClone trait
Summary: Will be used as bounds for abstraction.

Reviewed By: sfilipco

Differential Revision: D24399497

fbshipit-source-id: 343be12237d4850fbde9ebbe4034469527bd77fc
2020-10-20 15:19:30 -07:00
Jun Wu
f04e0fa60e namedag: define an abstract NameDag struct
Summary: It's not yet abstract. But one step closer.

Reviewed By: sfilipco

Differential Revision: D24399510

fbshipit-source-id: 32969733babd41b221170ee440f5d7ced1f7490a
2020-10-20 15:19:30 -07:00
Jun Wu
c49caf64b4 dag: drop snapshot_map from NameDag
Summary: The `snapshot` field can be used instead.

Reviewed By: sfilipco

Differential Revision: D24399507

fbshipit-source-id: 67de20d897b8b763f724f3ccbd46618dec7911b9
2020-10-20 15:19:30 -07:00
Jun Wu
465ac6c5df dag: drop IdMapEq trait
Summary:
The trait requires an `IdMap` snapshot to be locally ready. That's not easy for
all possible implementations. Drop it to simplify things.

Reviewed By: sfilipco

Differential Revision: D24399501

fbshipit-source-id: 4d85f77c99208cda30b2a543a0bb5b295f49a65c
2020-10-20 15:19:30 -07:00
Jun Wu
5f00c7984b dag: unify prepare_filesystem_sync
Summary: There were 2 prepare_filesystem_sync. Unify them into one implementation.

Reviewed By: sfilipco

Differential Revision: D24399513

fbshipit-source-id: 80d009c33b7f23dc2c4225da6fd0fb09589ba061
2020-10-20 15:19:30 -07:00
Jun Wu
4a7ffc4bd7 dag: use Locked for SyncableIdMap
Summary: Simplifies some code.

Reviewed By: sfilipco

Differential Revision: D24399500

fbshipit-source-id: a1317149da066617c4060b7efdae5234e5bd7262
2020-10-20 15:19:30 -07:00
Jun Wu
282c6800cb dag: use Locked for SyncableIdDag
Reviewed By: sfilipco

Differential Revision: D24399506

fbshipit-source-id: 91cfa176b8cfeca3f96dfeb211bf9d46a3d95bd5
2020-10-20 15:19:30 -07:00
Jun Wu
5c7b169e0e dag: add Locked type
Summary: A more general-purpose type for Syncable{IdDag,IdMap}.

Reviewed By: sfilipco

Differential Revision: D24399502

fbshipit-source-id: 0599db6dd07fe3d430458f86a33a9144d850fca1
2020-10-20 15:19:29 -07:00
Jun Wu
77790e9f49 dag: move non-master write APIs to IdMapWrite trait
Summary: This makes it more generic.

Reviewed By: sfilipco

Differential Revision: D24399493

fbshipit-source-id: 8a1d0a13dd29989b17fe3ef1497b10b6fa0629d6
2020-10-20 15:19:29 -07:00
Jun Wu
b249950984 dag: impl Persist on IdMaps
Summary: IdMap fits the Persist trait.

Reviewed By: sfilipco

Differential Revision: D24399494

fbshipit-source-id: 97b84d155f4b9bb3006bfad116defa4fca6330d6
2020-10-20 15:19:29 -07:00
Jun Wu
625b8ab4d5 dag: move IdMap impls to separate files
Summary: Similar to IdDag change, move impls to separate files.

Reviewed By: sfilipco

Differential Revision: D24399508

fbshipit-source-id: 575b6e7194677b67b6755b0a30ae7d014d498b10
2020-10-20 15:19:29 -07:00
Jun Wu
34aa41f24f dag: iddagstore::GetLock -> ops::Persist
Summary:
The lock, reload, mutate, persist pattern is general. It can be used for IdMap
too.
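
The lock, reload, mutate, persist sequence can be sketched as follows. This is a minimal Python illustration, not the Rust `ops::Persist` trait itself: `JsonStore`, `update`, and the in-process `threading.Lock` (standing in for an on-disk lock) are assumptions made for the example.

```python
import json
import os
import threading


class JsonStore:
    """Hypothetical store following lock -> reload -> mutate -> persist."""

    # Shared by all instances; stands in for a real cross-process file lock.
    _lock = threading.Lock()

    def __init__(self, path):
        self.path = path
        self.data = {}

    def lock(self):
        # The returned object is used as a context manager by the caller.
        return self._lock

    def reload(self):
        # Pick up changes persisted through other handles since the last read.
        if os.path.exists(self.path):
            with open(self.path) as f:
                self.data = json.load(f)

    def persist(self):
        # Write to a temporary file, then rename it over the target.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.data, f)
        os.replace(tmp, self.path)


def update(store, mutate):
    """Apply `mutate` to the freshest on-disk state while holding the lock."""
    with store.lock():
        store.reload()
        mutate(store.data)
        store.persist()
```

Reloading under the lock is what keeps two handles from overwriting each other's changes: each writer starts from the state the previous writer persisted.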

Reviewed By: sfilipco

Differential Revision: D24399512

fbshipit-source-id: d25e51ba735061ca101101d75aff95deb88b1d36
2020-10-20 15:19:29 -07:00
Jun Wu
f01dca1995 dag: drop build_segments_persistent APIs
Summary:
Now `build_segments_persistent` and `build_segments_volatile` are the same.
Just keep one of them.

Reviewed By: sfilipco

Differential Revision: D24399511

fbshipit-source-id: a9f1ac920cdf5b448bd99bf9b6d4ca4160ba0304
2020-10-20 15:19:29 -07:00
Jun Wu
5fad63b010 dag: drop last high level segment by default
Summary:
Previously, we kept the last high level segment per level in memory, and
dropped it on disk. When we crossed the memory / disk boundary, we had to
maintain such properties carefully. That was needed because some DAG
algorithms rely on complete high level segments.

Now that no DAG algorithms depend on such properties, let's just drop
the logic adding the last segment back to simplify the code.

This removes the need to build segments after open() and sync().

Reviewed By: sfilipco

Differential Revision: D24399515

fbshipit-source-id: 4c640d9aa03c050fcd97f70ee386e32d3a8ee26d
2020-10-20 15:19:29 -07:00
Jun Wu
496724e45e dag: make children work with missing high-level segments
Summary:
This makes the algorithm a bit more robust. Now none of the DAG algorithms
depend on high-level segments being complete and covering all low-level segments.

This also removes constraints. For example, SyncableIdDag can now just
deref() to the normal IdDag for queries without worrying about correctness.

Reviewed By: sfilipco

Differential Revision: D24399503

fbshipit-source-id: e6a91010cff82264cf423e2f24dee1d372822ef6
2020-10-20 15:19:29 -07:00
Jun Wu
f290afe421 dag: remove {range,descendants}_old algorithms
Summary:
They depend on high-level segments covering low-level segments, which
adds extra complexities. Remove them to simplify logic.

Reviewed By: sfilipco

Differential Revision: D24399509

fbshipit-source-id: 56a8e06c263107d1da4d6754b884ce51e18e30bf
2020-10-20 15:19:29 -07:00
Jun Wu
9bf6b674a6 config: use Rust graph render as default
Summary: Switch from the legacy Python graph renderer to the Rust renderer.

Reviewed By: DurhamG

Differential Revision: D24317802

fbshipit-source-id: 4c3dc3a6dd02b7ebe79596a8e77f4b6b139d2e20
2020-10-19 17:07:30 -07:00
Jun Wu
6dac514fae continue: pass --noninteractive to continued command
Summary: This preserves the `--noninteractive` flag used by some tools.

Reviewed By: DurhamG

Differential Revision: D24040789

fbshipit-source-id: 8d50f3f3ce6b2015f0ef6c3bd1b4fbb874d0ea7d
2020-10-16 18:40:51 -07:00
Jun Wu
b36584e704 configparser: set ui.merge:interactive from ui.merge in user config
Summary:
This restores the compatibility of setting up merge tools using the `ui.merge`
config while still limiting the default `editmerge` tool to interactive
sessions.

Reviewed By: sfilipco

Differential Revision: D24377259

fbshipit-source-id: 3d2befba412b824fc985ddffa131e339644178c2
2020-10-16 18:33:06 -07:00
Jun Wu
122108e46e configparser: move load_user to a testable method
Summary: Make it testable by allowing specifying paths to load as user hgrc.

Reviewed By: sfilipco

Differential Revision: D24377258

fbshipit-source-id: 969028df64d55ad1f1304e35675d84595ed6a2bf
2020-10-16 18:33:06 -07:00
Arun Kulshreshtha
d16a62ce06 edenapi: send user agent string
Summary:
Include a `User-Agent` header in EdenAPI requests from Mercurial. This will allow us to see the version in Scuba, and in the future, will allow us to distinguish between requests sent by Mercurial and those sent directly by EdenFS.

Keeping with the current output of `hg version`, the application is specified as "EdenSCM" rather than "Mercurial".

Reviewed By: singhsrb

Differential Revision: D24347021

fbshipit-source-id: e323cfc945c9d95d8b2a0490e22c2b2505a620dc
2020-10-16 11:05:24 -07:00
Jun Wu
e39f3bc233 revisionstore: add mutex for tests related to env vars
Summary:
Rust tests run in multiple threads. Setting environment variables affects other
tests running in other threads and causes random test failures.
Protect env vars using a lock.
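
The same idea, sketched in Python rather than Rust: every test that touches the process environment takes one shared lock, so concurrent tests cannot observe each other's values. The helper name `env_var` and the module-level `_ENV_LOCK` are assumptions for illustration, not names from the change.

```python
import os
import threading
from contextlib import contextmanager

# One lock for the whole process, because the environment is process-global.
_ENV_LOCK = threading.Lock()


@contextmanager
def env_var(name, value):
    """Set `name` to `value` for the duration of the block, under the lock."""
    with _ENV_LOCK:
        old = os.environ.get(name)
        os.environ[name] = value
        try:
            yield
        finally:
            # Restore the previous state before releasing the lock.
            if old is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = old
```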

Reviewed By: DurhamG

Differential Revision: D24296639

fbshipit-source-id: db0bee85625a7b63e07b95ea76d96029487881d4
2020-10-15 22:48:41 -07:00
Jun Wu
263d1c5a7f hg: reduce flakiness of cargo tests
Summary:
The shell-script cargo tests seem very flaky. Use a dedicated Python script to
run the tests, with a more concise output that only includes failures, and run
tests in parallel.

Reviewed By: DurhamG

Differential Revision: D24296433

fbshipit-source-id: 1d63146c6c84f1035dded24fcd3d79f116c2e740
2020-10-15 22:48:41 -07:00
Thomas Orozco
35ed8fe3dc revisionstore/lfs: retry 429
Summary:
When the server returns a 429, the intention is that the client will wait for a
little bit then try again later (there is no harm in that, as we haven't really
used many server resources for this). However, it turned out that right now we
just abort. Let's fix it!

Note that this changes the behavior a bit for the error cases, in the sense
that we no longer return `Ok(None)` but instead return an `Err`. Xavier noted
this should make sense here.

I've also had the client send its retry attempt via a header, because who
knows, that might be useful.
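
A rough sketch of the retry behavior described above, written in Python for brevity. The `send` callable, the `X-Attempt` header name, the `FetchError` type, and the backoff schedule are all hypothetical; the actual Rust change may differ in each of these.

```python
import time


class FetchError(Exception):
    """Raised instead of returning None when the request cannot succeed."""


def fetch_with_retry(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call `send(headers)` until it stops answering 429 or attempts run out.

    `send` is a hypothetical callable returning `(status, body)`.
    """
    for attempt in range(max_attempts):
        # Hypothetical header name; tells the server which attempt this is.
        status, body = send({"X-Attempt": str(attempt)})
        if status == 200:
            return body
        if status != 429:
            raise FetchError(f"unexpected status {status}")
        # 429: wait a little longer each time, unless this was the last try.
        if attempt + 1 < max_attempts:
            sleep(base_delay * 2 ** attempt)
    raise FetchError(f"still throttled after {max_attempts} attempts")
```

Raising on failure, rather than returning an empty value, mirrors the switch from `Ok(None)` to `Err` mentioned in the summary.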

Reviewed By: kulshrax

Differential Revision: D24308127

fbshipit-source-id: 35639956f36342dfb0056b0d348dc4ad56bd576c
2020-10-15 01:11:32 -07:00
Meyer Jacobs
120fbd3280 trees: port SCS aux data request method to edenapi
Summary: Introduces fetching of child entry IDs, and child file metadata for a specified tree manifest ID. The aux data lookup will only be performed if `with_file_metadata` is set, which is actually kind of wrong. Instead `with_children` from the wire type should be exposed in the API request type, and `with_*_metadata` should be hidden or used for data other than the child entry `Key`s.

Reviewed By: kulshrax

Differential Revision: D23886678

fbshipit-source-id: 0cba72cea7be47ae3348a406d407a19b60976c0c
2020-10-14 11:12:59 -07:00
Arun Kulshreshtha
32c109d955 edenapi: send client correlator to server
Summary:
Include the client correlator string from the `clienttelemetry` extension in each EdenAPI HTTP request via the  `X-Client-Correlator` header.

The `ClientIdentityMiddleware` in `gotham_ext` already understands this header (as it is already used by the LFS server), and `gotham_ext`'s `ScubaMiddleware` will automatically include the provided correlator in the server's Scuba samples.

Reviewed By: farnz

Differential Revision: D24282244

fbshipit-source-id: 13d04e706eda38893cff6e740bd1d7bf104e43dd
2020-10-13 13:25:52 -07:00
Meyer Jacobs
f9958ca35a taggederror: introduce category and transience metadata and precedence
Summary:
This change introduces two new metadata types, Category and Transience, and a mechanism for Category to provide a default Fault and Transience, which can be overridden by the user.

Also introduces a mechanism for attempting to log exceptions which occur during exception logging, falling back to the previous behavior of just swallowing the exception on failure.

Reviewed By: DurhamG

Differential Revision: D22677565

fbshipit-source-id: 1cf75ca1e2a65964a0ede1f072439378a46bd391
2020-10-12 17:17:34 -07:00
Jun Wu
bd5cfe49b2 commitstore: remove it
Summary:
It only has benchmark code that led to the use of mincode. Now hgcommits is the
main crate for commit storage. `commitstore` without `hg` in its name was
initially planned to support other kinds of commits including git and bonsai.
However, we don't have an immediate goal for that at present. So let's just remove
the commitstore directory.

Reviewed By: singhsrb

Differential Revision: D24263618

fbshipit-source-id: 84b4861ae490817377e69d8c2006c63331e3db1f
2020-10-12 16:42:58 -07:00
Meyer Jacobs
87cc599161 edenapi: add aux data to FileMetadata and DirectoryMetadata, and recursive children field to TreeEntry
Summary: Need to add new quickcheck tests, verify that removing `Serialize` from `TreeEntry` is okay.

Reviewed By: kulshrax

Differential Revision: D23457777

fbshipit-source-id: aa94ed7aa81b41924eba4a8bd1bdc2c737365b77
2020-10-12 14:05:23 -07:00
Arun Kulshreshtha
67aa5455aa edenapi: remove commented out code
Summary: Delete commented out code added in D23455274 (bdff69b747).

Reviewed By: sfilipco

Differential Revision: D24213060

fbshipit-source-id: a017b35241521510c26886505d1de6c7f6538895
2020-10-09 09:35:58 -07:00
Arun Kulshreshtha
66ea4f6677 edenapi: print operation in debug output
Summary:
Print a message for each EdenAPI method call to stderr if the user has `edenapi.debug` set.

These messages are already logged to `tracing`, but also printing them out when `edenapi.debug` is set makes the debug output more useful, since it provides context for the download stats. This is especially useful when reading through EdenFS logs.

Reviewed By: quark-zju

Differential Revision: D24204381

fbshipit-source-id: 37b47eed8b89438cdf510443e917a5c8660eb43b
2020-10-08 16:12:50 -07:00
Arun Kulshreshtha
e924af7ba5 edenapi: store headers in a HashMap
Summary: Use a `HashMap` to store user-specified additional HTTP headers. This allows headers to be set in multiple places (whereas previously, setting new headers would replace all previously set headers).
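
The difference between replacing and merging headers can be illustrated with a small Python sketch. `ClientConfig` and `add_headers` are made-up names, and the header values are placeholders; only the header names `User-Agent` and `X-Client-Correlator` come from other commits in this log.

```python
class ClientConfig:
    """Hypothetical config object that accumulates extra HTTP headers."""

    def __init__(self):
        self.headers = {}

    def add_headers(self, extra):
        """Merge `extra` into the existing headers.

        A header set earlier survives unless `extra` names it again.
        """
        self.headers.update(extra)
        return self


config = ClientConfig()
config.add_headers({"User-Agent": "EdenSCM/example"})     # set in one place
config.add_headers({"X-Client-Correlator": "abc123"})     # set somewhere else
```

With a map keyed by header name, the second call adds to the first instead of discarding it.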

Reviewed By: quark-zju

Differential Revision: D24200833

fbshipit-source-id: 93147cf334a849c4d2fc4f29849018a4c7565143
2020-10-08 16:12:50 -07:00
Jun Wu
9ed54f1b94 dag: replace 2 panics with non-panic errors
Summary: The panics can happen when the input sets are out of range.

Reviewed By: kulshrax

Differential Revision: D24191789

fbshipit-source-id: efbcbd7f6f69bd262aa979afa4f44acf9681d11e
2020-10-08 13:22:10 -07:00
Stefan Filip
6e2ec8b1ca dag: add serde derives to IdDag and InProcessStore
Summary:
Some sort of serialization for the Dag is useful for saving the IdDag produced
by offline jobs and loading it when a Mononoke server starts.

Reviewed By: quark-zju

Differential Revision: D24096964

fbshipit-source-id: 5fac40f9c10a5815fbf5dc5e2d9855cd7ec88973
2020-10-08 09:43:46 -07:00
Meyer Jacobs
6421dca639 read_res: add --debug flag to cat command for printing entire message
Summary: Add `--debug` flag to `read_res cat` command for debug printing entire entry rather than just the data blob.

Reviewed By: kulshrax

Differential Revision: D23999804

fbshipit-source-id: 6955854edab2643cffbe5fae484a398716b48055
2020-10-06 19:22:14 -07:00
Jun Wu
d103af79df hgcommits: add hybrid backend
Summary:
The hybrid backend is similar to the doublewrite backend, except that it does
not use revlog to read commit data, but uses EdenAPI instead.

Note:
- The non-stream API will not fetch commit data from EdenAPI.
- The commit hashes are not lazy yet.

Reviewed By: sfilipco

Differential Revision: D23924147

fbshipit-source-id: eb2cf8d3a7e1704b4efb13ad3ad86f8b6a1b31d0
2020-10-06 19:13:02 -07:00
Jun Wu
f54efdd04a hgcommits: serde serialize on ParentlessHgCommit
Summary:
This makes it convertible to `PyObject` via `cpython_ext::convert::Serde`
without additional code or dependencies.

Reviewed By: sfilipco

Differential Revision: D23966993

fbshipit-source-id: 74d83524a7c0701cde7aa6d61bb930ff4a1c90f5
2020-10-06 19:13:02 -07:00
Jun Wu
80056bef23 hgcommits: add a streaming data fetching API
Summary:
This API allows us to stream the data. If callsites only use this API, we'll
be more confident that there are no 1-by-1 fetches.

Reviewed By: sfilipco

Differential Revision: D23911865

fbshipit-source-id: 4c7dd8c2b5be33be5a55822845d55345797bacdf
2020-10-06 19:13:02 -07:00
Jun Wu
6defe87dcb streams: add abstraction about downloading missing data from remote
Summary:
The API is basically to resolve `input_stream` to `output_stream`, with a
stateful "resolver" that can resolve locally and remotely.

Reviewed By: sfilipco

Differential Revision: D23915775

fbshipit-source-id: 14a3a37fc897c8229514acac5c91c7e46b270896
2020-10-06 19:13:02 -07:00
Meyer Jacobs
bdff69b747 edenapi: Add file, directory metadata to TreeEntry
Summary:
Introduce `FileMetadata` and `DirectoryMetadata` to `TreeEntry`, along with corresponding request API.

Move `metadata.flags` to `file_metadata.revisionstore_flags`, as it is never populated for trees. Do not use `metadata.size` on the wire, as it is never currently populated.

Leaving `DirectoryMetadata` commented out temporarily because serde round trips fail for unit struct. Re-introduced with fields in the next change in this stack.

Reviewed By: DurhamG

Differential Revision: D23455274

fbshipit-source-id: 57f440d5167f0b09eef2ea925484c84f739781e2
2020-10-06 18:36:28 -07:00
Arun Kulshreshtha
7576d60c9c edenapi: skip hash check for LFS files
Summary:
EdenAPI always checks the integrity of filenode hashes before returning file data to the application. In the case of LFS files, this resulted in errors because the filenode hash is computed using the full file content, but the blob from the server only contains an LFS pointer.

Fix the bug by exempting LFS blobs from filenode integrity checks. (If integrity checks for LFS blobs are desired, the LFS code should be able to do this on its own since LFS blobs are content-addressed.)

Reviewed By: quark-zju

Differential Revision: D24145027

fbshipit-source-id: d7d86e2b912f267eba4120d1f5186908c3f4e9e3
2020-10-06 16:18:28 -07:00
Jun Wu
47d5813a17 cpython-ext: add a general From/ToPyObject for serde types
Summary:
This can be used to automate Python/Rust conversions for complex structures
like `CommitRevlogData`.

Reviewed By: kulshrax

Differential Revision: D23966988

fbshipit-source-id: 17a19d38270e6ef0952c13a1cd778487e84a94ff
2020-10-06 16:01:23 -07:00
Jun Wu
b5a22da53c cpython-ext: add a serde deserializer that converts Python objects to Rust values
Summary:
The goal is to implement `FromPyObject` and `ToPyObject` more easily.
Today crates have to depend on `cpython` to implement `From/ToPyObject`,
which is somewhat unwanted for pure Rust crates.

The `ser` module used to ignore the `variant` field for non-unit enum variants.
This has been fixed so the serialized value can be deserialized correctly.
For example, `enum E { A, B(T) }` will be serialized to `"A"` for `E::A`, and
`{"B": T}` for `E::B`.

Reviewed By: kulshrax

Differential Revision: D23966994

fbshipit-source-id: c50d57bf313caeec65a604ed9b05a5729f3b3635
2020-10-06 16:01:22 -07:00
Jun Wu
ab88771161 types: support multi-format deserialization for HgId
Summary:
Switch from the default tuple deserialization which only understands the tuple
format, to "bytes" deserialization, which understands not only the existing
"tuple" format (therefore compatible with old data), but also "bytes" and "hex"
formats (for CBOR).

This will unblock us from switching to bytes serialization in the future.

Note: This is a breaking change for mincode serialization. Mincode + HgId users
(zstore, metalog) have switched to explicit tuple serialization so they don't use
the default deserialization and remain unaffected.
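
As an illustration of what accepting several input formats means, here is a small Python sketch (not the Rust serde code). The function name `parse_hgid` is an assumption; the 20-byte length matches the 20-item tuple described in D23966992.

```python
HGID_LEN = 20  # an HgId is 20 bytes long


def parse_hgid(value):
    """Accept an id in "bytes", "hex", or legacy "tuple" form; return bytes."""
    if isinstance(value, (bytes, bytearray)):
        raw = bytes(value)              # "bytes" format
    elif isinstance(value, str):
        raw = bytes.fromhex(value)      # "hex" format
    elif isinstance(value, (tuple, list)):
        raw = bytes(value)              # "tuple" format: 20 small integers
    else:
        raise TypeError(f"unsupported HgId representation: {type(value)!r}")
    if len(raw) != HGID_LEN:
        raise ValueError(f"expected {HGID_LEN} bytes, got {len(raw)}")
    return raw
```

Because the old tuple form is still understood, data written before the change remains readable.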

Reviewed By: kulshrax

Differential Revision: D23966995

fbshipit-source-id: 83dd53f57bd4e6098de054f46a1d47f8b48133d0
2020-10-06 15:44:42 -07:00
Jun Wu
9c5d20904d revisionstore: explicitly mark how to serialize HgId, Sha256, Key, NodeInfo
Summary: This will unblock us from switching HgId to bytes serialization by default.

Reviewed By: kulshrax

Differential Revision: D24009039

fbshipit-source-id: a277869ec24652af428cda581faffa62c25d32c4
2020-10-06 15:44:42 -07:00
Jun Wu
aa8bc2afda types: add serde(with) support for Key, NodeInfo, and derived types
Summary: Similar to D23966992 (2a2971a4c7), add support to serialize Key differently.

Reviewed By: DurhamG

Differential Revision: D24009041

fbshipit-source-id: 2ecf1610b989a04083196d180bc62307b5162c2f
2020-10-06 15:44:42 -07:00
Jun Wu
bb07129c2d types: add serde(with) support for Sha256
Summary: Similar to D23966992 (2a2971a4c7), add support to serialize Sha256 differently.

Reviewed By: DurhamG

Differential Revision: D24009040

fbshipit-source-id: b77f6732802f95507e1540f0bbde4d5a92d13cac
2020-10-06 15:44:42 -07:00
Arun Kulshreshtha
720bad11ac progress: fix typo in comment
Reviewed By: singhsrb

Differential Revision: D24130363

fbshipit-source-id: 6505f51f892dffb90c89b3c18e981e55721b6106
2020-10-05 22:26:16 -07:00
Arun Kulshreshtha
0ad6c1229b edenapi: return no-op Fetch on empty request
Summary: Instead of returning an error upon receiving an empty request, just return a `Fetch` object that does nothing. This prevents Mercurial from crashing in situations where an empty request somehow makes it to the EdenAPI remote store.

Reviewed By: quark-zju

Differential Revision: D24119632

fbshipit-source-id: cf4ec707b4097656c76d7084a55b2d0b3150b679
2020-10-05 15:16:33 -07:00
Arun Kulshreshtha
1556c10e28 edenapi: add edenapi.debug option
Summary:
Previously, EdenAPI was using `remotefilelog.debug` to determine whether to print things like download stats. Let's give EdenAPI its own `debug` option that can be configured independently of remotefilelog.

One notable benefit of this change is that download stats will always be printed immediately after the HTTP request completes. This can help rule out network or server issues in situations where Mercurial appears to be hanging during data fetching. (For example, if hg had downloaded all of the data but was taking a while to process it, the debug output would show this.)

Reviewed By: DurhamG

Differential Revision: D24097942

fbshipit-source-id: bf9b065e7b97fc7ffe50ab74b1b13e2fe364755c
2020-10-05 15:16:33 -07:00
Durham Goode
10248e54b3 phases: make public phase calculation more efficient
Summary:
Previously phase calculation was done via a simple ancestor check. This
was very slow in cases that required going far back into the graph. Going a year
back could take a number of seconds.

To fix it, let's take the Rust phaseset logic and rework it to make only_both
produce an incremental public nodes set. In a later diff we can switch the
phaseset function to use this as well, but right now phaseset returns IdSet, and
that would need to be changed to Set, which may have consequences. So I'll do it
later.

Reviewed By: quark-zju

Differential Revision: D24096539

fbshipit-source-id: 5730ddd45b08cc985ecd9128c25021b6e7d7bc89
2020-10-05 14:40:53 -07:00
Lukas Piatkowski
e7d9e6f6da eden/scm: fix build by regenerating thrift files after D24070707 was landed
Summary: D24070707: `[Thrift] Provide sorted fields to read_field_begin` made a change to the generated rust thrift files, so the eden/scm thrift files have to be regenerated to fix the build.

Reviewed By: farnz

Differential Revision: D24109655

fbshipit-source-id: e8575a76642673a11514fdce8e30f13ca28151f0
2020-10-05 04:44:07 -07:00
Arun Kulshreshtha
b16d724844 http-client: fix typo in comment
Reviewed By: singhsrb

Differential Revision: D24097983

fbshipit-source-id: 4f218a2bc9d3dc1413b18f9741e630ac6261ad7c
2020-10-02 22:03:14 -07:00
Jun Wu
ecbc2abb70 cpython-ext: add a general From/ToPyObject for bytes-like Rust structs
Summary: This can be used by dag::Vertex and minibytes::Bytes.

Reviewed By: kulshrax

Differential Revision: D23966985

fbshipit-source-id: 3b4b29648e038ef49f26ce2b500119e148544d9e
2020-10-02 21:51:49 -07:00
Jun Wu
833ac3fb4c cpython-async: drop py_stream_class macro
Summary:
The py_stream_class causes the code to be more verbose. It basically enforces
the bindings crate to define new types wrapping pure Rust types, and then
define py_stream_class.

In a future diff, I'm adding FromPyObject/ToPyObject support for types that
implement serde Deserialize/Serialize. py_stream_class gets in the way,
because the blanket type from cpython-ext cannot be used in the py_stream_class
macro. cpython-ext is not the proper place to define business-related stream
types.

Therefore, define a type-erased Python class, and implement
FromPyObject/ToPyObject automatically for TStream<anyhow::Result<T>> where
T implements FromPyObject or ToPyObject.

The FromPyObject now converts a Python iterator back to a stream. It's
no longer zero-cost. However, I'd imagine such usecases can be short-cut
using pure Rust code.

Background: Initially, I added some FromPyObject/ToPyObject impls to pure
Rust crates gated by a "pytypes" feature. While that works fine with cargo
build, buck does not support dynamic features and the fact that we support
both py2 and py3 makes it extremely hard to support cleanly in buck build.
For example, if minibytes::Bytes defines ToPyObject for Bytes, then any
crate using minibytes would have 2 different versions: a py2 version, a
py3 version, and they both depend on python. That seems to be a bad approach.

Reviewed By: sfilipco

Differential Revision: D23966984

fbshipit-source-id: eafb31ad458dcbdd8e970d8e419a10fbbe30595f
2020-10-02 21:51:49 -07:00
Arun Kulshreshtha
b101b19d45 http-client: shorten download stats
Summary:
Per the feedback on D23920367 (318f5683a5), let's make the human-readable download stats shorter. Example:

```
Downloaded 10.59 MiB in 12.35s over 5 requests (7.19 Mb/s, latency: 123ms)
```
The amount downloaded is now reported in binary-prefixed bytes (so that it can be directly compared to file sizes) whereas the transfer rate is reported in decimal-prefixed bits per second (so that it can be directly compared to a user's measured network speed).

Additionally, we now use the default formatting available from `std::time::Duration`, which will automatically choose the appropriate display units.
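
The unit choices can be checked with a short Python sketch. The function name and signature are assumptions, and the duration is always printed in seconds here, whereas the commit relies on `std::time::Duration` to pick the display unit.

```python
def format_download_stats(num_bytes, seconds, requests, latency_ms):
    """Render download stats in the shortened format shown above."""
    # Amount downloaded: binary prefix (1 MiB = 1024 * 1024 bytes).
    mib = num_bytes / (1024 * 1024)
    # Transfer rate: decimal prefix, in bits (1 Mb = 1,000,000 bits).
    megabits_per_second = num_bytes * 8 / seconds / 1_000_000
    return (
        f"Downloaded {mib:.2f} MiB in {seconds:.2f}s over {requests} requests "
        f"({megabits_per_second:.2f} Mb/s, latency: {latency_ms}ms)"
    )
```

Calling `format_download_stats(11_104_420, 12.35, 5, 123)` reproduces the example line: 11,104,420 bytes is about 10.59 MiB, and the same amount over 12.35 seconds is about 7.19 Mb/s.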

Reviewed By: quark-zju

Differential Revision: D24096525

fbshipit-source-id: 39c49f1b08135bbae7a7544b1ffe2bdbfe1533a1
2020-10-02 19:54:11 -07:00
Durham Goode
2a9263cfe2 memcache: add progress bar to Rust memcachestore
Summary: We now get progress bar output when fetching from memcache!

Reviewed By: kulshrax

Differential Revision: D24060663

fbshipit-source-id: ff5efa08bced2dac12f1e16c4a55fbc37fbc0837
2020-10-02 15:03:17 -07:00
Xavier Deguillard
24f28191f3 mman-win32: remove
Summary: This is unused, remove it.

Reviewed By: DurhamG

Differential Revision: D24062631

fbshipit-source-id: 2c0b758866881986d3454ddb2941cd04d44861f3
2020-10-02 10:47:24 -07:00
Xavier Deguillard
0967eafcba build: remove portability headers
Summary: These aren't included anywhere, we can remove them.

Reviewed By: DurhamG

Differential Revision: D24062627

fbshipit-source-id: 9ff101eb44965ac3502ada3265ffcc8acc09d2e5
2020-10-02 10:47:24 -07:00
Xavier Deguillard
8cc2738aec clib: remove buffer.[ch]
Summary: These are unused, no need to keep the code around.

Reviewed By: DurhamG

Differential Revision: D24055085

fbshipit-source-id: 6246d746983a575c051ddcb51ae02582a764a814
2020-10-02 10:47:23 -07:00
Xavier Deguillard
a4e27d898d lib: remove portability/dirent.h
Summary: This is unused.

Reviewed By: DurhamG

Differential Revision: D24055083

fbshipit-source-id: bc6bbcf765ccb8c277e4a06e9fc3f033cd01733a
2020-10-02 10:47:23 -07:00
Xavier Deguillard
dc97cebe7c lib: remove cdatapack
Summary: This is unused, no need to keep it around.

Reviewed By: DurhamG

Differential Revision: D24054164

fbshipit-source-id: 161b294eb952c6b4584aa0d49d8ff46cd63ee30f
2020-10-02 10:47:23 -07:00
Ivan Murashko
2374b860a2 lint-ignore processing for clang-tidy (fbcode)
Summary: Disable CLANGTIDY checks in several places in the code.

Reviewed By: zertosh, benoitsteiner

Differential Revision: D24018176

fbshipit-source-id: b2d294f9efd64b2e2c72b11b18d8033f9928e826
2020-10-01 03:27:25 -07:00
Jun Wu
ea60810732 async-runtime: support multi-thread block_on_future
Summary:
This would have been easier if we can upgrade tokio (D24011447).
For now, let's just solve it by using a channel so the mutex is not held for long.

The implementation has some side effects, though:
- panic message is not preserved.
- 'static lifetime is required on Future.

The `'static` lifetime is incompatible with some existing code. The old function
is preserved as `block_on_exclusive` and is used in places where a future does
not have `'static` lifetime.

Reviewed By: sfilipco

Differential Revision: D24033134

fbshipit-source-id: 7b35d1ff636d2a289db9b04e60419c31bdea9453
2020-09-30 20:31:34 -07:00
Jun Wu
3c5c6bf5af async-runtime: add iter_to_stream
Summary: `iter_to_stream` converts a blocking iterator to a stream.

Reviewed By: sfilipco

Differential Revision: D24033135

fbshipit-source-id: da5b1f8e6768124ef7c915e1bb17216fde00a55a
2020-09-30 20:16:40 -07:00
Arun Kulshreshtha
f1bdf9aadf edenapi: improve debug messages
Summary: Minor tweaks to debug messages.

Reviewed By: quark-zju

Differential Revision: D24039535

fbshipit-source-id: 950c984f72ff7652f79c346f88273ee7e6c9f926
2020-09-30 19:53:21 -07:00
Arun Kulshreshtha
dfbe53cf11 revisionstore: add progress bars to EdenAPI stores
Summary: Make EdenAPI data stores optionally show progress bars.

Reviewed By: markbt

Differential Revision: D23982320

fbshipit-source-id: b3affd3b630258f15c3cdc64c213df8aa28af589
2020-09-30 13:01:15 -07:00
Arun Kulshreshtha
b5a36de8cd progress: add null progress bar
Summary:
Add a null progress bar implementation that just keeps track of state, similar to the `progress.nullbar` in hg's Python code.

A benefit of this is that code that optionally shows progress can unconditionally update the progress bar rather than wrapping it in an `Option` and checking for presence each time.
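
The benefit can be seen in a small Python sketch. The class and function names are illustrative only and are not taken from the `progress` crate or from hg's `progress.nullbar`.

```python
class NullProgressBar:
    """Keeps track of progress state but never draws anything."""

    def __init__(self, total=0):
        self.total = total
        self.position = 0

    def increment(self, delta=1):
        self.position += delta


class TextProgressBar(NullProgressBar):
    """Same interface, but prints the state on every update."""

    def increment(self, delta=1):
        super().increment(delta)
        print(f"{self.position}/{self.total}")


def process(items, bar):
    """Update `bar` unconditionally; no `if bar is not None` in the loop."""
    for _ in items:
        bar.increment()
    return bar.position
```

A caller that does not want output passes `NullProgressBar(len(items))`; one that does passes `TextProgressBar(len(items))`. The loop body is identical in both cases.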

Reviewed By: markbt

Differential Revision: D23982318

fbshipit-source-id: ffd762b59cc0c9bd2ad0c67c3ca785350db4850f
2020-09-30 13:01:15 -07:00
Arun Kulshreshtha
6dea84a3c9 progress: add Rust progress bar interface
Summary:
This diff introduces a new `progress` crate that provides an abstract interface for progress bars in Rust code:

- The `ProgressFactory` trait can be used to create new progress bars.
- The `ProgressBar` trait allows Rust code to interact with the progress bar.
- The `ProgressSpinner` trait is similar, but for spinner-type progress indicators.

These traits are intended to be used as trait objects, allowing pure Rust code to accept an opaque `ProgressFactory` and use it to report progress. This kind of abstraction, while not common in idiomatic Rust code, allows the progress implementation to be completely decoupled from the pure Rust code, which is important given that Mercurial's progress bars are currently implemented in Python.

Part of the goal of this crate is to allow a smooth transition to pure Rust progress bars (once we eventually implement them). As long as the Rust progress bars implement the above traits, they can be used as drop-in replacements for Python progress bars everywhere.

Reviewed By: markbt

Differential Revision: D23982319

fbshipit-source-id: 9ccf167f18d9518bb0ed66e1606a5b8188d98428
2020-09-30 11:20:31 -07:00
Xavier Deguillard
46ce143dcf build: various fixes to get eden to compile with @mode/win
Summary:
As EdenFS depends on a few bits of Mercurial code, these need to be able to
compile with Buck.

Reviewed By: chadaustin

Differential Revision: D24000881

fbshipit-source-id: 078a2a958039a63db1b716785f872b4bbde3bab6
2020-09-29 16:10:27 -07:00
Meyer Jacobs
7f89121cab edenapi: non-key Entry attributes optional
Summary: Make `parents`, `data`, and `metadata` optional, and introduce `WireTreeAttributesRequest` for selecting which attributes to request on the wire.

Reviewed By: kulshrax

Differential Revision: D23406763

fbshipit-source-id: 5edd674d9ba5d37c23b12ab4d7b54bbf6c9ff990
2020-09-29 12:35:19 -07:00
Meyer Jacobs
cecbca5bb7 edenapi: make tree query method extensible
Summary:
Adds a `WireTreeQuery` enum for query method, with a single `ByKeys(WireTreeKeyQuery)` available currently, to request a specific set of keys.

Leave the API struct alone for now.

Reviewed By: kulshrax

Differential Revision: D23402366

fbshipit-source-id: 19cd8066afd9f14c7e5f718f7583d1e2b9ffac02
2020-09-29 12:08:05 -07:00
Jun Wu
d39c632679 zstore: do not test zstd compression size
Summary: The size can change with zstd upgrades. Do not test them.

Reviewed By: sfilipco

Differential Revision: D23976933

fbshipit-source-id: d560061b6e4fefc3bb89513bdb12c770ea0bd881
2020-09-29 10:13:18 -07:00
Jun Wu
f833f03ba2 metalog: explicitly use tuple serialization for HgId
Summary:
metalog uses mincode serialization and requires certain bytes layout of the
HgId. Explicitly opt in to tuple serialization so HgId default serialization
change won't affect metalog.

Reviewed By: kulshrax

Differential Revision: D23966991

fbshipit-source-id: 23c217f1e8cb0c8a6cc12f50bb333cdc7bba36ca
2020-09-28 21:32:21 -07:00
Jun Wu
0bb45fcbc4 zstore: explicitly use tuple serialization for HgId
Summary:
zstore uses mincode serialization and requires a certain byte layout of the
HgId. Explicitly opt in to tuple serialization so a change to the default HgId
serialization won't affect zstore.

Reviewed By: kulshrax

Differential Revision: D23966986

fbshipit-source-id: 69a60e26ec4e64c20a0b080288f622e765438ee4
2020-09-28 21:32:21 -07:00
Jun Wu
305b95895a edenapi/types: mark some commit related types to use hgid::bytes serialization
Summary:
This makes it so commit hashes are serialized to bytes instead of tuples in Python:

  In [1]: s,f=api.commitdata(repo.name, list(repo.nodes('master')))
  In [2]: list(s)
  Out[2]: [{'hgid': '...', ...}]

Some `Vec<HgId>`s cannot be changed this way. It'd be nice if we could change
the default `HgId` serialization to bytes.

Reviewed By: kulshrax

Differential Revision: D23966989

fbshipit-source-id: 4d013525419741d3c5c23621be16e70441bab3c4
2020-09-28 21:32:21 -07:00
Jun Wu
2a2971a4c7 types: add serde(with) functions for HgId
Summary:
`HgId` currently serializes into a tuple of 20 items. This is suboptimal in
CBOR, because the items are untyped. A byte might be serialized into one or two
bytes:

  In [2]: cbor.dumps([1,1,1,1])
  Out[2]: b'\x84\x01\x01\x01\x01'

  In [3]: cbor.dumps([255,255,255,255])
  Out[3]: b'\x84\x18\xff\x18\xff\x18\xff\x18\xff'

CBOR supports "bytes" type to efficiently encode a `[u8]`:

  In [5]: cbor.dumps(b"\x01\x01\x01\x01")
  Out[5]: b'D\x01\x01\x01\x01'

  In [6]: cbor.dumps(b"\xff\xff\xff\xff")
  Out[6]: b'D\xff\xff\xff\xff'

Add `serde_with` with 3 flavors: `bytes`, `tuple`, `hex` to satisfy different
needs. Check the added docstring for details.
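
A minimal sketch of the size difference, hand-encoding just the two CBOR shapes shown above in plain Rust (no CBOR crate involved). The helper names `cbor_array_of_u8` and `cbor_bytes` are made up for illustration, and both only handle lengths below 24, which is enough for a 20-byte `HgId`:

```rust
/// Encode a slice of u8 values as a CBOR array of unsigned integers
/// (major type 4). Sketch only: lengths of 24 or more are not handled.
fn cbor_array_of_u8(items: &[u8]) -> Vec<u8> {
    assert!(items.len() < 24, "sketch only handles short arrays");
    let mut out = vec![0x80 | items.len() as u8];
    for &b in items {
        if b < 24 {
            // Small integers fit directly in the initial byte.
            out.push(b);
        } else {
            // 0x18 means "a one-byte unsigned integer follows".
            out.push(0x18);
            out.push(b);
        }
    }
    out
}

/// Encode a slice as a CBOR byte string (major type 2).
/// Sketch only: lengths of 24 or more are not handled.
fn cbor_bytes(data: &[u8]) -> Vec<u8> {
    assert!(data.len() < 24, "sketch only handles short byte strings");
    let mut out = vec![0x40 | data.len() as u8];
    out.extend_from_slice(data);
    out
}

fn main() {
    // A worst-case 20-byte hash where every byte needs the two-byte form.
    let hash = [0xffu8; 20];
    println!("tuple form: {} bytes", cbor_array_of_u8(&hash).len());
    println!("bytes form: {} bytes", cbor_bytes(&hash).len());
}
```

For a 20-byte hash the tuple form costs between 21 and 41 bytes depending on the byte values, while the bytes form is always 21 bytes.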

Reviewed By: kulshrax

Differential Revision: D23966992

fbshipit-source-id: 704132648f9e50b952ffde0e96ee2106f2f2fbcf
2020-09-28 21:32:21 -07:00
Durham Goode
37f47b452a dynamicconfig: fix reponame detection during synchronous config generation
Summary:
Dynamicconfig can generate configs two ways, 1) via `hg
debugdynamicconfig` and 2) synchronously in-process in an hg command when it
detects that the dynamicconfig is completely missing or has the wrong version
number.

In the first case, dynamicconfig gets the repo name from the standard config
object loaded by the hg dispatch.  In the second case, the standard config
object isn't even loaded yet, so dynamicconfig does a mini-load of the user and
repo hgrcs so it can get the repo name and user name (needed for dynamic
conditions).

Unfortunately the second code path computed the wrong path (it had two .hg/'s)
which meant the reponame and user name were always none. This meant that the
dynamicconfig on disk could randomly be either computed with or without a
reponame.

Let's fix the path computation, and add a test.  We may want to make
dynamicconfig fail if no repo name is passed, but I'm not sure if we'll want to
support no-repo configuration at some point.

This didn't cause a problem for most people, since it would only happen during a
hg version number change, and 15 minutes later the background 'hg
debugdynamicconfig' process would fix it up. It did affect sandcastle though,
since it often creates new repositories and acts on them immediately.

Reviewed By: quark-zju

Differential Revision: D23955628

fbshipit-source-id: c922f4b523d19df9223aa28c97700b7011fc03eb
2020-09-28 09:14:05 -07:00
Durham Goode
040ee1b744 revisionstore: fix default pending data pack limit
Summary:
The old code tried to express 4GB by using ^ to do an exponent. That
operator is actually the bitwise xor, so this was producing a limit closer to 4
bytes. It doesn't seem to have mattered much since a later diff overrode the
default via dynamicconfig. But let's fix this anyway.
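
To make the pitfall concrete, here is a small sketch. The old expression is not reproduced here; `2 ^ 32` is only an illustrative stand-in, and the function names are hypothetical:

```rust
/// What `2 ^ 32` really evaluates to in Rust: bitwise XOR, not a power.
fn two_xor_thirty_two() -> u64 {
    2 ^ 32 // 0b000010 XOR 0b100000 = 0b100010
}

/// 4 GiB expressed with a shift: 4 * 2^30.
fn four_gib_shift() -> u64 {
    4u64 << 30
}

/// 4 GiB expressed with an explicit power: 4 * 1024^3.
fn four_gib_pow() -> u64 {
    4 * 1024u64.pow(3)
}

fn main() {
    println!("2 ^ 32        = {}", two_xor_thirty_two());
    println!("4 << 30       = {}", four_gib_shift());
    println!("4 * 1024.pow(3) = {}", four_gib_pow());
}
```

Either the shift or `pow` spells out the intended size; which one the actual fix uses is not stated in this message.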

Reviewed By: krallin

Differential Revision: D23955629

fbshipit-source-id: 6abebcb7e84b7a47f70ac501fa11b0dc60dfda7b
2020-09-28 09:14:04 -07:00
Arun Kulshreshtha
d3b39542f0 revisionstore: use async_runtime in EdenAPI stores
Summary: Now that the `async_runtime` crate exists, use Mercurial's global `tokio::Runtime` instead of creating one for each EdenAPI store.

Reviewed By: quark-zju

Differential Revision: D23945569

fbshipit-source-id: 7d7ef6efbb554ca80131daeeb2467e57bbda6e72
2020-09-26 16:50:06 -07:00
Arun Kulshreshtha
825ebe35a8 edenapi: print response headers in debug output
Summary: Print relevant HTTP response headers in the debug output for debugging client-server issues.

Reviewed By: quark-zju

Differential Revision: D23923168

fbshipit-source-id: c9fda57c53fb25b15c450f0afd14e539de43cfcb
2020-09-24 21:05:21 -07:00
Arun Kulshreshtha
29b855b256 edenapi: add server load to ResponseMeta
Summary: Now that the EdenAPI server is using the `LoadMiddleware` from `gotham_ext`, each response will contain an `X-Load` header that contains the number of active requests that the server is currently handling.

Reviewed By: quark-zju

Differential Revision: D23922809

fbshipit-source-id: 973143de5ddccf074d28aa3ef38d73f9fc1501b6
2020-09-24 21:05:21 -07:00
Arun Kulshreshtha
318f5683a5 http-client: report download speed as Mbit/s and MiB/s
Summary:
Network speeds are usually reported in megabits per second (Mb/s), whereas file sizes are usually reported in [mebibytes](https://en.wikipedia.org/wiki/Binary_prefix) per second (MiB/s). Previously, the HTTP client reported neither of those and instead reported megabytes per second (MB/s).

This diff replaces MB/s with both Mbit/s and MiB/s so that the numbers are more immediately useful. As a bonus, the speeds are now directly comparable to those reported by `hg debugnetwork`.
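
As a rough sketch of the two units (the function names are hypothetical and this is not the client's actual formatting code), the conversions from a byte count and an elapsed time look like this:

```rust
/// Megabits per second: 1 Mbit = 1_000_000 bits (decimal prefix).
fn mbit_per_sec(bytes: u64, secs: f64) -> f64 {
    (bytes as f64 * 8.0) / 1_000_000.0 / secs
}

/// Mebibytes per second: 1 MiB = 1024 * 1024 bytes (binary prefix).
fn mib_per_sec(bytes: u64, secs: f64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0) / secs
}

fn main() {
    // Example: 1 MiB downloaded in one second.
    let (bytes, secs) = (1_048_576u64, 1.0);
    println!(
        "{:.2} Mbit/s ({:.2} MiB/s)",
        mbit_per_sec(bytes, secs),
        mib_per_sec(bytes, secs)
    );
}
```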

Reviewed By: quark-zju

Differential Revision: D23920367

fbshipit-source-id: 46500a42681ab83fc7c4ead82980e8ed620a4d5a
2020-09-24 21:05:20 -07:00
Arun Kulshreshtha
f1aeffd67a hg-http: remove stats logging
Summary: Now that stats are logged to `tracing` by the `HttpClient` directly, we no longer need to log them here. This commit backs out D23858077 (613fbc858f) which added the logging.

Reviewed By: quark-zju

Differential Revision: D23919308

fbshipit-source-id: 23d3a12c5307bc4b84dd9ffd25bd376718e3cc91
2020-09-24 21:05:20 -07:00
Arun Kulshreshtha
45e8c3377c http-client: improve log messages
Summary:
Improve the log output of the HTTP client to avoid spewing redundant debug messages.

As part of this change, logging now uses the `tracing` crate instead of the `log` crate for better integration with the rest of Mercurial's logging infrastructure. Right now, `tracing` is just being used as a drop-in replacement for `log`, but now that it's in use we can start using its full capabilities (such as defining tracing spans) in later diffs.

Reviewed By: quark-zju

Differential Revision: D23919310

fbshipit-source-id: 95555ad083ead805ceece39c6e30aaf879bdf2bc
2020-09-24 21:05:20 -07:00
Arun Kulshreshtha
3a5b6d2958 http-client: fix curl_multi_wait timeout
Summary:
We were using the timeout parameter on `Multi::wait` (equivalent to `curl_multi_wait` in C) incorrectly. Previously, we were passing in the timeout provided by `curl_multi_timeout`.

This is incorrect usage because the value returned by `curl_multi_timeout` is the current value of libcurl's internal timeout (based on the state of the transfers), which will always be respected. The actual intention of the timeout parameter is to allow the caller to specify a hard cap on curl's internal timeout, so we should just pass some reasonable default value here. ([See explanation here.](https://github.com/curl/curl/issues/2996))

The purpose of `curl_multi_timeout` is to allow libcurl to tell the application what its desired timeout is in situations where the application itself is waiting for socket activity (using something like `epoll`), which is not the case when using `curl_multi_wait`.

Reviewed By: DurhamG

Differential Revision: D23914093

fbshipit-source-id: 76a25d7c59a4b08437c8d7be3d24708fb37b9172
2020-09-24 16:46:39 -07:00
Arun Kulshreshtha
5bac6466e2 edenapi: add timeout option
Summary: Use the functionality from D23910534 (721f5af278) to set a timeout for EdenAPI requests, configured via the `edenapi.timeout` option.

Reviewed By: DurhamG

Differential Revision: D23911552

fbshipit-source-id: 4a6e3de1094d0faa1daaf6fe4b9b7aafb37a25a8
2020-09-24 16:46:39 -07:00
Arun Kulshreshtha
721f5af278 http-client: add ability to set timeout
Summary: Add the ability to set a timeout on HTTP requests. Equivalent to [`CURLOPT_TIMEOUT_MS`](https://curl.haxx.se/libcurl/c/CURLOPT_TIMEOUT_MS.html).

Reviewed By: DurhamG

Differential Revision: D23910534

fbshipit-source-id: a7aec792ec3c122a01aa44fcfe2e2df6e3a111fc
2020-09-24 12:59:42 -07:00
Arun Kulshreshtha
183fff1b9f http-client: fail loudly on non-fatal errors
Summary:
There are several places in the HTTP client where we log and discard errors. (Typically, these are "this should never happen" type situations.)

Previously, these were logged at the `trace` log level, meaning that in practice no one would ever know if we did hit these errors.

Let's upgrade them to `error` so that they'll be printed out. (In theory, users should never see these error messages unless something has gone horribly wrong.)

Reviewed By: DurhamG

Differential Revision: D23888268

fbshipit-source-id: 9007205f946ebb0127238c76812cf62524878047
2020-09-24 11:19:36 -07:00
Durham Goode
46d0991cd0 revisionstore: expose shared mutable stores to Python
Summary:
Treemanifest needs to be able to write to the shared stores from paths
other than just prefetch (like when it receives certain trees via a standard
pull). To make this possible we need to expose the Rust shared mutable stores.
This will also make just general integration with Python cleaner.

In the future we can get rid of the non-prefetch download paths and remove this.

Reviewed By: quark-zju

Differential Revision: D23772385

fbshipit-source-id: c1e67e3d21b354b85895dba8d82a7a9f0ffc5d73
2020-09-24 09:46:59 -07:00
Meyer Jacobs
75105421ce edenapi: Hide edenapi wire types from externally visible API
Summary:
Introduce separate wire types to allow protocol evolution and client API changes to happen independently.

* Duplicate `*Request`, `*Entry`, `Key`, `Parents`, `RepoPathBuf`, `HgId`, and `revisionstore_types::Metadata` types into the `wire` module. The versions in the `wire` module are required to have proper `serde` annotations, `Serialize` / `Deserialize` implementations, etc. These have been removed from the original structs.
* Introduce infallible conversions from "API types" to "wire types" with the `ToWire` trait and fallible conversions from "wire types" to "API types" with the `ToApi` trait. API -> wire conversions should never fail in a binary that builds successfully, but wire -> API conversions can fail in the case that the server and client are using different versions of the library. This will cause, for instance, a newly-introduced enum variant used by the client to be deserialized into the catch-all `Unknown` variant on the server, which won't generally have a corresponding representation in the API type.
* Cleanup: remove `*Response` types, which are no longer used anywhere.
* Introduce a `map` method on `Fetch` struct which allows a fallible conversion function to be used to convert a `Fetch<T>` to a `Fetch<U>`. This function is used in the edenapi client implementation to convert from wire types to API types.
* Modify `edenapi_server` to convert from API types to wire types.
* Modify `edenapi_cli` to convert back to wire types before serializing responses to disk.
* Modify `make_req` to use `ToWire` for converting API structs from the `json` module to wire structs.
* Modify `read_res` to use `ToApi` to convert deserialized wire types to API types with the necessary methods for investigating the contents (`.data()`, primarily). It will print an error message to stderr if it encounters a wire type which cannot be converted into the corresponding API type.
* Add some documentation about protocol conventions to the root of the `wire` module.
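
The split between infallible and fallible conversions can be sketched as follows. Only the trait names `ToWire` and `ToApi` and the catch-all `Unknown` variant come from the description above; the method names, associated types, and the `FileType` / `WireFileType` / `UnknownVariant` types are assumptions made up for illustration:

```rust
/// Infallible API -> wire conversion (shape assumed for this sketch).
trait ToWire {
    type Wire;
    fn to_wire(self) -> Self::Wire;
}

/// Fallible wire -> API conversion (shape assumed for this sketch).
trait ToApi {
    type Api;
    type Error;
    fn to_api(self) -> Result<Self::Api, Self::Error>;
}

/// Hypothetical API-side enum with no "unknown" state.
#[derive(Debug, PartialEq)]
enum FileType {
    Regular,
    Executable,
}

/// Hypothetical wire-side enum with a catch-all for unrecognized variants.
#[derive(Debug, PartialEq)]
enum WireFileType {
    Regular,
    Executable,
    Unknown,
}

/// Error returned when a wire value has no API representation.
#[derive(Debug, PartialEq)]
struct UnknownVariant;

impl ToWire for FileType {
    type Wire = WireFileType;
    fn to_wire(self) -> WireFileType {
        match self {
            FileType::Regular => WireFileType::Regular,
            FileType::Executable => WireFileType::Executable,
        }
    }
}

impl ToApi for WireFileType {
    type Api = FileType;
    type Error = UnknownVariant;
    fn to_api(self) -> Result<FileType, UnknownVariant> {
        match self {
            WireFileType::Regular => Ok(FileType::Regular),
            WireFileType::Executable => Ok(FileType::Executable),
            // A newer peer sent something this build does not know about.
            WireFileType::Unknown => Err(UnknownVariant),
        }
    }
}

fn main() {
    println!("{:?}", FileType::Regular.to_wire());
    println!("{:?}", WireFileType::Unknown.to_api());
}
```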

Reviewed By: kulshrax

Differential Revision: D23224705

fbshipit-source-id: 88f8addc403f3a8da3cde2aeee765899a826446d
2020-09-23 17:27:08 -07:00
Arun Kulshreshtha
a745a145b1 edenapi: optionally print log messages
Summary: Add log messages for debugging using the `tracing` crate, which allows them to be enabled via `env_logger`.

Reviewed By: quark-zju

Differential Revision: D23858076

fbshipit-source-id: a8ef1afac6c9ecbfb5d6d78232aa0d03a2fe2054
2020-09-23 17:19:28 -07:00
Arun Kulshreshtha
613fbc858f hg-http: optionally print stats
Summary: Log HTTP stats to stderr to assist with ad-hoc debugging. Will not be printed unless `RUST_LOG` is set appropriately.

Reviewed By: quark-zju

Differential Revision: D23858077

fbshipit-source-id: 39acf3de3fd0ca4403a986eb5373a6a79f1d004a
2020-09-23 17:19:28 -07:00
Arun Kulshreshtha
31ceb7f0d1 hg-http: use autocargo
Summary: Onboard the crate onto autocargo.

Reviewed By: quark-zju

Differential Revision: D23858075

fbshipit-source-id: 7179ae0f9ca8a1d4e664d7eb5cb614940e2b2c30
2020-09-23 16:40:49 -07:00
Jun Wu
7f1c05dd74 cpython-async: expose Rust Future to Python
Summary:
Add a `PyFuture<F>` type that can be used as the return type in binding functions.
It converts Rust Future to a Python object with an `await` method so Python
can access the value stored in the future.

Unlike `TStream`, it's currently only designed to support Rust->Python one
way conversion so it looks simpler.

Reviewed By: kulshrax

Differential Revision: D23799644

fbshipit-source-id: da4a322527ad9bb4c2dbaa1c302147b784d1ee41
2020-09-21 13:28:07 -07:00
Jun Wu
41b200c8d8 cpython-async: expose Rust Stream to Python
Summary:
The exposed type can be used as a Python iterator:

  for value in stream:
      ...

The Python type can be used as input and output parameters in binding functions:

  # Rust
  type S = TStream<anyhow::Result<X>>;
  def f1() -> PyResult<S> { ... }
  def f2(x: S) -> PyResult<S> { Ok(x.stream().map_ok(...).into()) }

  # Python
  stream1 = f1()
  stream2 = f2(stream1)

This crate is similar to `cpython-ext`: it does not define actual business
logic exposed by `bindings` module. So it's put in `lib`, not
`bindings/modules`.

Reviewed By: markbt

Differential Revision: D23799641

fbshipit-source-id: c13b0c788a6465679b562976728f0002fd872bee
2020-09-21 13:28:07 -07:00
Liubov Dmitrieva
01615ae4de improve scm daemon checks and check workspace name as well
Summary:
Move a bunch of code (the SCM daemon related options) into a separate file,
out of cloud sync.

Also introduce an additional check that the `hg cloud sync` command the SCM
daemon runs is intended for the currently connected workspace.

In theory, when we switch a subscription the SCM daemon gets notified, but races are possible, so it is better to have this additional check to make sure the SCM daemon triggers cloud sync only where it is supposed to.

Reviewed By: markbt

Differential Revision: D23783616

fbshipit-source-id: b91a8b79189b7810538c15f8e61080b41abde386
2020-09-18 14:01:11 -07:00
Durham Goode
f68177a983 treemanifest: flush shared stores when flushing local stores
Summary:
The Rust contentstore has no way to flush the shared stores, except
when the object is destructed. In treemanifest, the lifetime of the shared store
seems to be different from that of files, and we're not seeing them flushed
appropriately during certain commands. Let's make the flush api also flush the
shared stores.

Reviewed By: quark-zju

Differential Revision: D23662976

fbshipit-source-id: a542c3e45d5b489fcb5faf2726854cb49df16f4c
2020-09-17 14:27:50 -07:00
Durham Goode
556ae539fa repack: prevent Rust repack from repacking an entry twice
Summary: The old logic would just double pack some bits. Let's prevent that.

Reviewed By: xavierd

Differential Revision: D23661933

fbshipit-source-id: 155291fa08ec2c060619329bd1cb6040769feb63
2020-09-17 10:16:03 -07:00
Durham Goode
6ae1cf9619 revisionstore: add refresh function
Summary:
The rust pack stores currently have logic to refresh their list of
packs if there's a key miss and if it's been a while since we last loaded the
list of packs. In some cases we want to manually trigger this refresh, like if
we're in the middle of a histedit and it invokes an external command that
produces pack files that the histedit should later consume (like an external
amend, that histedit then needs to work on top of).

Python pack stores solve this by allowing callers to mark the store for a
refresh. Let's add the same logic for rust stores. Once pack files are gone we
can delete this.

This will be useful for the upcoming migration of treemanifest to Rust
contentstore. Filelog usage of the Rust contentstore avoided this issue by
recreating the entire contentstore object in certain situations, but refresh
seems useful and less expensive.

Reviewed By: quark-zju

Differential Revision: D23657036

fbshipit-source-id: 7c6438024c3d642bd22256a8e58961a6ee4bc867
2020-09-17 10:16:03 -07:00
Durham Goode
055fc0d20b packstore: avoid subtracting from an Instant
Summary:
Instants do not represent actual time and can only be compared against
each other. When we subtracted arbitrary Durations from them, we ran the risk of
overflowing the underlying storage, since the Instant may be represented by a
low number (such as the age of the process).

This caused crashes in test_refresh (in the next diff) on Windows.

Let's instead represent the "must rescan" state as a None last_scanned time, and avoid any arbitrary subtraction. It's generally much cleaner too.
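
A minimal sketch of that idea, assuming a hypothetical `PackScanner` type and `scan_frequency` field (only the `last_scanned` name and the use of `None` for "must rescan" come from the description above):

```rust
use std::time::{Duration, Instant};

/// Hypothetical holder of the rescan state.
struct PackScanner {
    /// `None` means "must rescan"; no Instant arithmetic is needed to force it.
    last_scanned: Option<Instant>,
    scan_frequency: Duration,
}

impl PackScanner {
    fn new(scan_frequency: Duration) -> Self {
        PackScanner {
            last_scanned: None,
            scan_frequency,
        }
    }

    /// Instants are only ever compared against each other, never shifted back.
    fn needs_rescan(&self, now: Instant) -> bool {
        match self.last_scanned {
            None => true,
            Some(last) => now.saturating_duration_since(last) >= self.scan_frequency,
        }
    }

    fn mark_scanned(&mut self, now: Instant) {
        self.last_scanned = Some(now);
    }

    /// Replaces the old "subtract a Duration from the Instant" trick.
    fn force_rescan(&mut self) {
        self.last_scanned = None;
    }
}

fn main() {
    let mut scanner = PackScanner::new(Duration::from_secs(10));
    let now = Instant::now();
    println!("fresh store needs rescan: {}", scanner.needs_rescan(now));
    scanner.mark_scanned(now);
    println!("right after a scan: {}", scanner.needs_rescan(now));
    scanner.force_rescan();
    println!("after force_rescan: {}", scanner.needs_rescan(now));
}
```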

Reviewed By: quark-zju

Differential Revision: D23752511

fbshipit-source-id: db89b14a701f238e1c549e497a5d751447115fb2
2020-09-17 10:16:03 -07:00
Durham Goode
dd387dd0d1 mutablepacks: only create mutable history packs when needed
Summary:
Previously the MetadataStore would always construct a mutable pack, even
if the operation was readonly. This meant all read commands required write
access. It also means that random .tmp files get scattered all over the place
when the rust structures are not properly destructed (like if python doesn't
bother doing the final gc to call destructors for the Rust types).

Let's just only create mutable packs when we actually need them.

Reviewed By: quark-zju

Differential Revision: D23219961

fbshipit-source-id: a47f3d94f70adac1f2ee763f3170ed582ef01a14
2020-09-16 21:39:25 -07:00
Durham Goode
1f5835e70a mutablepacks: only create mutable data packs when needed
Summary:
Previously the ContentStore would always construct a mutable pack, even
if the operation was readonly. This meant all read commands required write
access. It also means that random .tmp files get scattered all over the place
when the rust structures are not properly destructed (like if python doesn't
bother doing the final gc to call destructors for the Rust types).

Let's just only create mutable packs when we actually need them.

Reviewed By: quark-zju

Differential Revision: D23219962

fbshipit-source-id: 573844f81966d36ad324df03eecec3711c14eafe
2020-09-16 21:39:25 -07:00
Jun Wu
2abf0ada42 version: print EdenSCM instead of Mercurial
Summary: Per team discussion.

Reviewed By: singhsrb

Differential Revision: D23719401

fbshipit-source-id: a1e9a1e9a10369c307413354054a65e6520d13e5
2020-09-15 21:03:59 -07:00
Thomas Orozco
d7081f6aba lfs: add client support for received compressed responses
Summary:
As it says in the title, this adds support for receiving compressed responses
in the revisionstore LFS client. This is controlled by a flag, which I'll
roll out through dynamicconfig.

The hope is that this should greatly improve our throughput to corp, where
our bandwidth is fairly scarce.

Reviewed By: StanislavGlebik

Differential Revision: D23652306

fbshipit-source-id: 53bf86d194657564bc3bd532e1a62208d39666df
2020-09-15 07:59:53 -07:00
Thomas Orozco
21290702e1 third-party/rust: import async-compression + update zstd
Summary:
This imports the async-compression crate. We have an equivalent-ish in
common/rust, but it targets Tokio 0.1, whereas this community-supported crate
targets Tokio 0.2 (it offers a richer API, notably in the sense that we
can use it for Streams, whereas the async-compression crate we have is only for
AsyncWrite).

In the immediate term, I'd like to use this for transfer compression in
Mononoke's LFS Server. In the future, we might also use it in Mononoke where we
currently use our own async compression crate when all that stuff moves to
Tokio 0.2.

Finally, this also updates zstd: the version we link to from tp2 is actually
zstd 1.4.5, so it's a good idea to just get the same version of the zstd crate.

The zstd crate doesn't keep a great changelog, so it's hard to tell what has changed.
At a glance, it looks like the answer is not much, but I'm going to look to Sandcastle
to root out potential issues here.

Reviewed By: StanislavGlebik

Differential Revision: D23652335

fbshipit-source-id: e250cef7a52d640bbbcccd72448fd2d4f548a48a
2020-09-15 07:59:53 -07:00
Durham Goode
a674b25157 hgcache: add config driven cache nuking
Summary:
We've often had cases where we need to nuke people's caches for various
reasons. It's a huge pain since we haven't had a way to communicate with all hg
clients. Now that we have configerator dynamicconfigs, we can use that to reach
all clients.

This diff adds support for configs like:
```
[hgcache-purge]
foo=2020-08-20
```
The key, 'foo' in this case, is an identifier used to only run this purge once.
The value is a date after which this purge will no longer run. This is useful
for bounding the damage from forgetting about a purge and having it delete caches
over and over in the future for new repos or repos where the run once marker
file is deleted for some reason.
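
The decision rule can be sketched like this. It is an illustration only: the function name `should_purge` is made up, an in-memory set stands in for the run-once marker files, and dates are compared as `YYYY-MM-DD` strings, which sort chronologically:

```rust
use std::collections::HashSet;

/// Decide whether the purge identified by `key` should run today.
/// `done` stands in for the on-disk "already ran" markers (assumption).
fn should_purge(key: &str, expiry: &str, today: &str, done: &HashSet<String>) -> bool {
    // Run at most once per key, and never after the expiry date.
    !done.contains(key) && today <= expiry
}

fn main() {
    let mut done = HashSet::new();
    println!("{}", should_purge("foo", "2020-08-20", "2020-08-19", &done));
    // Record that the purge ran so it is skipped next time.
    done.insert("foo".to_string());
    println!("{}", should_purge("foo", "2020-08-20", "2020-08-19", &done));
}
```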

Reviewed By: quark-zju

Differential Revision: D23044205

fbshipit-source-id: 8394fcf9ba6df09f391b5317bad134f369e9b416
2020-09-14 11:01:02 -07:00
Xavier Deguillard
ed4021b8e3 revisionstore: disallow reading LFS pointers from packfiles
Summary:
For repositories that have the old-style LFS extension enabled, the pointers
are stored in packfiles/indexedlog alongside a flag that signifies to the
upper layers that the blob is externally stored. With the new way of doing LFS,
pointers are stored separately.

When both are enabled, we are observing some interesting behavior where
different get and get_meta calls may return different blobs/metadata for the
same filenode. This may happen if a filenode is stored in both a packfile as an
LFS pointer, and in the LFS store. Guaranteeing that the revisionstore code is
deterministic in this situation is unfortunately way too costly (a get_meta
call would for instance have to fully validate the sha256 of the blob, and this
wouldn't guarantee that it wouldn't become corrupted on disk before calling
get).

The solution taken here is to simply ignore all the LFS pointers from
packfiles/indexedlog when remotefilelog.lfs is enabled. This way, there is no
risk of reading the metadata from the packfiles, and the blob from the
LFSStore. This however brings another complication for user-created blobs:
these are stored in packfiles and would thus become unreadable. The solution is
to simply perform a one-time full repack of the local store to make sure that
all the pointers are moved from the packfiles to the LFSStore.

In the code, the Python bindings use ExtStoredPolicy::Ignore directly, as
these are only used in the treemanifest code where no LFS pointers should be
present. The repack code uses ExtStoredPolicy::Use, since it wouldn't be able
to read the pointers otherwise.

Reviewed By: DurhamG

Differential Revision: D22951598

fbshipit-source-id: 0e929708ba5a3bb2a02c0891fd62dae1ccf18204
2020-09-09 18:27:42 -07:00
Stefan Filip
1c172c9008 lfs: use hg-http built client for network requests
Summary: This client provides automatic metrics collection.

Reviewed By: kulshrax

Differential Revision: D23577871

fbshipit-source-id: 137299222a20bc8e4d52c3321febbb91d861b236
2020-09-09 17:35:49 -07:00
Stefan Filip
046db98222 edenapi: use hg-http built client for network requests
Summary:
hg-http's built client should provide integration with Mercurial's stats
collection mechanisms.

Reviewed By: kulshrax

Differential Revision: D23577867

fbshipit-source-id: 93c777021bc347511322269d678d6879710eed3e
2020-09-09 17:35:48 -07:00
Stefan Filip
c1ab6a4e92 http-client: add stats reporting hook
Summary:
Add `with_stats_reporting` to HttpClient. It takes a closure that will be
called with all `Stats` objects generated. We then use this function in
the hg-http crate to integrate with the metrics backend used in Mercurial.

Reviewed By: kulshrax

Differential Revision: D23577869

fbshipit-source-id: 5ac23f00183f3c3d956627a869393cd4b27610d4
2020-09-09 17:35:48 -07:00
Stefan Filip
7f72a04c0e metrics: crate for collecting metrics
Summary:
We start off simple here. Python only really has counters so we only implement
counters. There are a lot of options on how to improve this and things get
slightly complicated when we look at the whole ecosystem and fb303. Anyway,
simple start.

Reviewed By: quark-zju

Differential Revision: D23577874

fbshipit-source-id: d50f5b2ba302d900b254200308bff7446121ae1d
2020-09-09 17:35:48 -07:00
Stefan Filip
4ad9091598 thrift: update thrift types
Summary: autogenerated by `make local`

Reviewed By: quark-zju

Differential Revision: D23577872

fbshipit-source-id: 6ca98fd865c3b3bc3a00d8126ce20b59110f8118
2020-09-09 17:35:48 -07:00
Saurabh Singh
384c4f61fa fix the Windows build
Reviewed By: sfilipco

Differential Revision: D23601358

fbshipit-source-id: c5a33286b7468882bbedb3e8fe85f66a8f9db0e2
2020-09-09 10:39:35 -07:00
Arun Kulshreshtha
de7f7ab4fe http-client: rename crate
Summary: The Mercurial codebase uses hyphens in crate names rather than underscores. This is similar to the convention favored by the larger Rust community, though it is different from Mononoke, which uses underscores. While we'll probably need to eventually settle on a consistent convention for all of the projects in the Eden SCM repo, for now, `http_client` should be made consistent with the adjacent crates.

Reviewed By: sfilipco

Differential Revision: D23585721

fbshipit-source-id: d2e690d86815be02d7b8d645198bcd28e8cbd6e0
2020-09-09 10:12:50 -07:00
David Tolnay
e83e05ff25 Update formatter to rustfmt 2.0
Reviewed By: zertosh

Differential Revision: D23591028

fbshipit-source-id: f458503fc2b9c25023fa1643eca5e166882a4811
2020-09-09 07:52:34 -07:00
Lukasz Piatkowski
379065faab eden/scm: remove leftover of tokio-core after tokio 0.2 migration (#52)
Summary: Pull Request resolved: https://github.com/facebookexperimental/eden/pull/52

Reviewed By: krallin

Differential Revision: D23594074

Pulled By: lukaspiatkowski

fbshipit-source-id: 776c02418f4951321887f566bac8b76c9da8bcc1
2020-09-09 02:32:49 -07:00
Zeyi (Rice) Fan
5e02a93e91 eden-client: move to use tokio 0.2 socket transport
Summary: No more tokio-core! More `async/await`.

Reviewed By: kulshrax

Differential Revision: D23586509

fbshipit-source-id: b2e766ddb7575bc96963432f0c8582b4370b19aa
2020-09-08 20:24:26 -07:00
Zeyi (Rice) Fan
a6a73ec6b6 switch to tokio 0.2 transport
Summary:
This diff adds a `SocketTransport` implementation that no longer uses legacy `tokio-core` based futures but `tokio-tower` and `tower-service` for processing Thrift requests.

The old implementation is renamed to `SocketTransportLegacy` for better transitioning.

Reviewed By: dtolnay

Differential Revision: D20019196

fbshipit-source-id: 3bee684e9254bf1a81669ef0d2c2262a55e75daa
2020-09-08 17:53:57 -07:00
Durham Goode
2919268555 revisionstore: auto-delete when we have too much pack data
Summary:
In order to keep the hgcache size bounded we need to keep track of pack
file size even during normal operations and delete excess packs.

This has the negative side effect of deleting necessary data if the operation is
legitimately huge, but we'd rather have extra downloading time than fill up the
entire disk.

Reviewed By: quark-zju

Differential Revision: D23486922

fbshipit-source-id: d21be095a8671d2bfc794c85918f796358dc4834
2020-09-08 11:33:50 -07:00
Durham Goode
717d10958f revisionstore: refactor pack iteration code
Summary:
In a future diff we'll add logic to delete old pack files. We'll want
to use this pack iteration code, so let's move it to a function.

Reviewed By: quark-zju

Differential Revision: D23486920

fbshipit-source-id: 5f872e946ffe816289c925dd2e03c292e29da5af
2020-09-08 11:33:50 -07:00
Durham Goode
651a0690be revisionstore: auto-commit datapacks when they get large
Summary:
As the repository grows the opportunity for large downloads increases.
Today all writes to data packs get sent straight to disk, but we have no way to
prevent this from eating all the disk.

Let's automatically flush datapacks when they reach a certain size (default
4GB). In a future diff this will let us automatically garbage collect data packs
to bound the maximum size of packs.

Rotatelog already has this behavior.
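
A toy sketch of the size-triggered flush, assuming a hypothetical `MutablePack` type that only tracks byte counts (the real pack writer and its field names are not shown in this message):

```rust
/// Hypothetical stand-in for a mutable data pack that buffers writes.
struct MutablePack {
    max_pending_bytes: u64,
    pending_bytes: u64,
    flush_count: usize,
}

impl MutablePack {
    fn new(max_pending_bytes: u64) -> Self {
        MutablePack {
            max_pending_bytes,
            pending_bytes: 0,
            flush_count: 0,
        }
    }

    /// Record a write and flush automatically once the limit is reached.
    fn add(&mut self, data: &[u8]) {
        self.pending_bytes += data.len() as u64;
        if self.pending_bytes >= self.max_pending_bytes {
            self.flush();
        }
    }

    /// Pretend to commit the pending data to disk and start a new pack.
    fn flush(&mut self) {
        if self.pending_bytes > 0 {
            self.flush_count += 1;
            self.pending_bytes = 0;
        }
    }
}

fn main() {
    // A tiny 10-byte limit instead of the 4GB default, to keep the demo small.
    let mut pack = MutablePack::new(10);
    pack.add(&[0u8; 4]);
    pack.add(&[0u8; 6]);
    println!("flushes so far: {}", pack.flush_count);
}
```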

Reviewed By: quark-zju

Differential Revision: D23478780

fbshipit-source-id: 14f9f707e8bffc59260c2d04c18b1e4f6bdb2f90
2020-09-08 11:33:50 -07:00
Thomas Orozco
2948993c38 remotefilelog: add killswitch for client certs
Summary:
See D23538897 for context. This adds a killswitch so we can rollout client
certs gradually through dynamicconfig.

Reviewed By: StanislavGlebik

Differential Revision: D23563905

fbshipit-source-id: 52141365d89c3892ad749800db36af08b79c3d0c
2020-09-08 10:39:07 -07:00
Thomas Orozco
d1c4772da3 remotefilelog: use client certs when connecting to LFS
Summary:
Like it says in the title, this updates remotefilelog to present client
certificates when connecting to LFS (this was historically the case in the
previous LFS extension). This has a few upsides:

- It lets us understand who is connecting, which makes debugging easier.
- It lets us enforce ACLs.
- It lets us apply different rate limits to different use cases.

Config-wise, those certs were historically set up for Ovrsource, and the auth
mechanism will ignore them if not found, so this should be safe. That said, I'd
like to add a killswitch for this nonetheless. I'll reach out to Durham to see if I
can use dynamic config for that.

Also, while I was in there, I cleaned up a few functions that were taking
ownership of things but didn't need it.

Reviewed By: DurhamG

Differential Revision: D23538897

fbshipit-source-id: 5658e7ae9f74d385fb134b88d40add0531b6fd10
2020-09-08 10:39:07 -07:00
David Tolnay
e62b176170 Prepare for rustfmt 2.0
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).

This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.

 ---

*Why now?* The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land on the 2.0 branch.

 ---

Reviewed By: zertosh

Differential Revision: D23568779

fbshipit-source-id: 477200f35b280a4f6471d8e574e37e5f57917baf
2020-09-07 20:47:59 -07:00
Durham Goode
8b91cccc8b remotefilelog: log undesired filename fetches
Summary:
Now that the Rust revisionstore records undesired filename fetches,
let's log those results to Scuba in Python.

Reviewed By: StanislavGlebik

Differential Revision: D23462572

fbshipit-source-id: b55f2290e30e3a5c3b67d9f612b24bc3aad403a8
2020-09-04 14:55:15 -07:00
Durham Goode
9772ab1718 revisionstore: record remote fetches that match a pattern
Summary:
We want to be able to record when fetches to certain paths happen.
Let's add recording infrastructure to the new ReportingRemoteDataStore.

A future diff will make the seen accessible from Python for scuba logging.

Reviewed By: xavierd

Differential Revision: D23462574

fbshipit-source-id: 5d749f2429e26e8e7fe4fb5adc29140b4309eac9
2020-09-04 14:55:15 -07:00
Durham Goode
84cbc26b1e revisionstore: add reporting wrapper for remote data store
Summary:
We want to monitor what paths are fetched from our remote servers.
Since all of our remote stores are hidden behind the RemoteDataStore interface,
let's create a wrapper around that. A future diff will insert the actual
monitoring and reporting.

Reviewed By: quark-zju

Differential Revision: D23462571

fbshipit-source-id: e6031f19db23f7d1b09767efb9613d7528fb457d
2020-09-04 14:55:14 -07:00
Jun Wu
e74133f0fa dag: limit max segment level to 4
Summary:
This is based on fbsource data: building level 5 proves not to be useful.

This would save 300ms in the write path.

Reviewed By: sfilipco

Differential Revision: D23494505

fbshipit-source-id: ca795b4900af40dbfdaa463d36f3169413bf6a62
2020-09-04 12:20:54 -07:00
Jun Wu
b4adf0602f dag: remove non-master "Name -> Id" index on request
Summary:
Previously the IdMap's "Name -> Id" index simply ignores the "reassign
non-master" request. It turns out stale entries in that index can cause
issues as demonstrated by the previous diff.

Update IdMap to actually remove both indexes of non-master group on
remove_non_master so it cannot have stale entries.

To optimize the index, the format of IdMap is changed from:

  [ 8 bytes Id (Big Endian) ] [ Name ]

to:

  [ 8 bytes Id (Big Endian) ] [ 1 byte Group ] [ Name ]

So the index can use reference to the slice, instead of embedding the bytes, to
reduce index size.

The filesystem directory name for IdMap used by NameDag is bumped to `idmap2`
so it won't read the incompatible old `idmap` data.
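
A minimal sketch of the new entry layout described above, assuming a plain byte buffer. The helper names `encode_entry` and `decode_entry` are hypothetical and are not the dag crate's API; the point is only that the name can be borrowed from the entry slice instead of being copied into the index.

```rust
// Hypothetical helpers illustrating the entry layout; not the dag crate's API.

/// Encode an entry as `[8-byte big-endian id][1-byte group][name bytes]`.
pub fn encode_entry(id: u64, group: u8, name: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(8 + 1 + name.len());
    buf.extend_from_slice(&id.to_be_bytes());
    buf.push(group);
    buf.extend_from_slice(name);
    buf
}

/// Decode an entry. The returned name borrows from `entry`, so an index
/// can refer to the slice rather than embedding a copy of the bytes.
pub fn decode_entry(entry: &[u8]) -> Option<(u64, u8, &[u8])> {
    if entry.len() < 9 {
        // Too short to hold the fixed-size id and group header.
        return None;
    }
    let mut id_bytes = [0u8; 8];
    id_bytes.copy_from_slice(&entry[..8]);
    Some((u64::from_be_bytes(id_bytes), entry[8], &entry[9..]))
}
```

Because the id is big-endian, entries compare in id order when compared as raw bytes.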

Reviewed By: sfilipco

Differential Revision: D23494508

fbshipit-source-id: 3cb7782577750ba5bd13515b370f787519ed3894
2020-09-04 12:20:53 -07:00
Jun Wu
c5d6c9d0f2 dag: add a test showing non-master rebuild issues
Summary: Some vertexes can disappear from the graph!

Reviewed By: sfilipco

Differential Revision: D23494506

fbshipit-source-id: ecbf2a4169e5fc82596e89a4bfe4c442a82e9cd2
2020-09-04 12:20:53 -07:00
Jun Wu
4aea3657e1 dag: move some test utilities to a TestDag struct
Summary: The TestDag struct will be used to do some more complicated tests.

Reviewed By: sfilipco

Differential Revision: D23494507

fbshipit-source-id: 11350f9e448725ae49f50a7b6f19efc57ad84448
2020-09-04 12:20:53 -07:00
Stefan Filip
c09f80882c edenapi: use async-runtime to schedule futures
Summary:
Replacing places where the tokio runtime is instantiated inside the edenapi
client crate.

Reviewed By: quark-zju

Differential Revision: D23468596

fbshipit-source-id: ef68718c7d5b89b6477a2946daaa51618b53d06a
2020-09-03 15:45:34 -07:00
Jun Wu
cea2bf8728 dag: limit segment level at open time
Summary:
At open time, it's pointless to attempt to create new levels. So let's just
read the existing max_level and do not try to build max_level + 1.

This turns out to save 300ms in profiling result.

Reviewed By: sfilipco

Differential Revision: D23494509

fbshipit-source-id: 4ea326a3cc21792790ea0b87e5bf608a94ae382b
2020-09-03 13:48:43 -07:00
Jun Wu
f238529a97 multilog: use per-log meta to pick up updated indexes
Summary:
With MultiLog, per-log meta was previously entirely ignored. However, they can
be useful for updated indexes. For example, application defines a new index,
and opens a Log via MultiLog. The application would expect the new index is
built only once. Without MultiLog, per-log meta is updated at open time in
place. With MultiLog, the updated index meta is not written back to the
multimeta so the new index would be rebuilt multiple times undesirably.

Update MultiLog to reuse the per-log meta if it's compatible so it can pick up
new indexes.

Reviewed By: sfilipco

Differential Revision: D23488212

fbshipit-source-id: c8b3e6b5589dbda2e76a143d15085862a93dae22
2020-09-03 13:48:43 -07:00
Jun Wu
f79e7657af multilog: stop writing poisoned per-log meta
Summary:
The poisoned meta makes investigation harder. ex. `debugdumpindexlog` won't
work on those logs.

Reviewed By: sfilipco

Differential Revision: D23488213

fbshipit-source-id: b33894d8c605694b6adf5afdaed45707fbd7357e
2020-09-03 13:48:43 -07:00
Jun Wu
99511f8743 dag: benchmark dag_ops on different IdDagStores
Summary:
Change dag_ops benchmarks to use different IdDagStores. An example run shows:

  benchmarking dag::iddagstore::indexedlog_store::IndexedLogStore
  building segments (old)                           856.803 ms
  building segments (new)                           127.831 ms
  ancestors                                          54.288 ms
  children (spans)                                  619.966 ms
  children (1 id)                                    12.596 ms
  common_ancestors (spans)                            3.050 s
  descendants (small subset)                         35.652 ms
  gca_one (2 ids)                                   164.296 ms
  gca_one (spans)                                     3.132 s
  gca_all (2 ids)                                   270.542 ms
  gca_all (spans)                                     2.817 s
  heads                                             247.504 ms
  heads_ancestors                                    40.106 ms
  is_ancestor                                       108.719 ms
  parents                                           243.317 ms
  parent_ids                                         10.752 ms
  range (2 ids)                                       7.370 ms
  range (spans)                                      23.933 ms
  roots                                             620.150 ms

  benchmarking dag::iddagstore::in_process_store::InProcessStore
  building segments (old)                           790.429 ms
  building segments (new)                            55.007 ms
  ancestors                                           8.618 ms
  children (spans)                                  196.562 ms
  children (1 id)                                     2.488 ms
  common_ancestors (spans)                          545.344 ms
  descendants (small subset)                          8.093 ms
  gca_one (2 ids)                                    24.569 ms
  gca_one (spans)                                   529.080 ms
  gca_all (2 ids)                                    38.462 ms
  gca_all (spans)                                   540.486 ms
  heads                                             103.930 ms
  heads_ancestors                                     6.763 ms
  is_ancestor                                        16.208 ms
  parents                                           103.889 ms
  parent_ids                                          0.822 ms
  range (2 ids)                                       1.748 ms
  range (spans)                                       6.157 ms
  roots                                             197.924 ms

  benchmarking dag::iddagstore::bytes_store::BytesStore
  building segments (old)                           724.467 ms
  building segments (new)                            90.207 ms
  ancestors                                          23.812 ms
  children (spans)                                  348.237 ms
  children (1 id)                                     4.609 ms
  common_ancestors (spans)                            1.315 s
  descendants (small subset)                         20.819 ms
  gca_one (2 ids)                                    72.423 ms
  gca_one (spans)                                     1.346 s
  gca_all (2 ids)                                   116.025 ms
  gca_all (spans)                                     1.470 s
  heads                                             155.667 ms
  heads_ancestors                                    19.486 ms
  is_ancestor                                        51.529 ms
  parents                                           157.285 ms
  parent_ids                                          5.427 ms
  range (2 ids)                                       4.448 ms
  range (spans)                                      13.874 ms
  roots                                             365.568 ms

Overall, InProcessStore > BytesStore > IndexedLogStore. The InProcessStore
uses `Vec<BTreeMap<Id, StoreId>>` for the level-head index, which is more
efficient on the "Level" lookup (Vec), and more cache efficient (BTree).
BytesStore outperforms IndexedLogStore because it does not need to verify
checksum on every read access - the checksum was verified at store creation
(IdDag::from_bytes).

Note: The `BytesStore` is something optimized for serialization, and hasn't been sent.
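
For illustration, a level-head index shaped like `Vec<BTreeMap<Id, StoreId>>` could look roughly like the sketch below. The type aliases and method names are assumptions made for this example, not the actual `InProcessStore` code.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-ins for the crate's `Id` and `StoreId` types.
pub type Id = u64;
pub type StoreId = usize;

/// One `BTreeMap` per level: the level is a direct `Vec` index, and the
/// head lookup within a level is an ordered-map lookup.
#[derive(Default)]
pub struct LevelHeadIndex {
    levels: Vec<BTreeMap<Id, StoreId>>,
}

impl LevelHeadIndex {
    /// Record that the segment stored at `store_id` has `head` at `level`.
    pub fn insert(&mut self, level: usize, head: Id, store_id: StoreId) {
        if self.levels.len() <= level {
            // Grow the outer Vec so `level` becomes a valid index.
            self.levels.resize_with(level + 1, BTreeMap::new);
        }
        self.levels[level].insert(head, store_id);
    }

    /// Look up a segment by level and head.
    pub fn get(&self, level: usize, head: Id) -> Option<StoreId> {
        self.levels.get(level)?.get(&head).copied()
    }

    /// Highest level that has been created, if any.
    pub fn max_level(&self) -> Option<usize> {
        self.levels.len().checked_sub(1)
    }
}
```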

Reviewed By: sfilipco

Differential Revision: D23438174

fbshipit-source-id: 6e5f15188e3b935659ccde25fac573e9b963b78f
2020-09-02 18:54:12 -07:00
Jun Wu
84ad7a5351 dag: implement GetLock for all IdDagStores
Summary: This allows them to use the SyncableIdDag APIs.

Reviewed By: sfilipco

Differential Revision: D23438170

fbshipit-source-id: 7ec7288cfb8186b88f85f0212a913cb0dffe7345
2020-09-02 18:54:12 -07:00
Jun Wu
cfff0e9144 dag: make IdDag::prepare_filesystem_sync generic
Summary: Other IdDagStores can also use the API. This will be used in benchmarks.

Reviewed By: sfilipco

Differential Revision: D23438180

fbshipit-source-id: 565552b66372dcfbb268c397883f627491d6e154
2020-09-02 18:54:12 -07:00
Jun Wu
8874e07f9b dag: IdDagStore::reload -> GetLock::reload
Summary:
Similar to `IdDagStore::sync` -> `GetLock::persist`, `reload` is more related
to filesystem/internal state exchange, and should be protected by a lock.  So
let's move the API there and require a lock.

Reviewed By: sfilipco

Differential Revision: D23438169

fbshipit-source-id: 4228106b7739a1a758677adfddd213ad54aa4b6a
2020-09-02 18:54:12 -07:00
Jun Wu
d633576880 dag: remove NameDag::reload
Summary:
`NameDag::reload` is used in `flush` to get a "fresh" NameDag.
In a future diff the `IdDag::reload` API gets changed, so let's
remove NameDag's use of it.

Instead, let's just re-`open` the path again to get a fresh NameDag.
It's a bit more expensive but probably okay, and easier to understand.
`get_new_segment_size()` was added as an internal API to preserve tests.

This also solves an issue where `NameDag` could not recover properly if its
`flush` failed: the old `NameDag` state is no longer lost.

After removing `NameDag::reload`, `IdMap::reload` is no longer used publicly
and was made private.

Reviewed By: sfilipco

Differential Revision: D23438179

fbshipit-source-id: 0a32556a2cd786919c233d7efcae1cb9cbc5fb09
2020-09-02 18:54:11 -07:00
Jun Wu
8e16e4260f dag: IdDagStore::sync -> GetLock::persist
Summary:
The word "sync" is bi-directional: flush + reload. It was indexedlog::Log's
behavior. However, in the IdDag context "sync" is confusing - it is actually
only used to write data out, with protection from lock. Rename to `persist`
to clarify it's memory -> disk. It also now requires a reference to a lock
object as lightweight proof that some lock is held.
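
The "lock reference as proof" idea can be sketched as follows. `Lock` and `Store` here are hypothetical types for illustration only; a real implementation would take a filesystem lock and write to disk.

```rust
/// Hypothetical lock token. The only way to get one is `Store::lock`,
/// so holding a `&Lock` shows that the lock was acquired.
pub struct Lock {
    _private: (),
}

/// Hypothetical store with in-memory pending data and "persisted" data.
#[derive(Default)]
pub struct Store {
    pending: Vec<u64>,
    persisted: Vec<u64>,
}

impl Store {
    /// Acquire the lock. A real store would take a filesystem lock here.
    pub fn lock(&self) -> Lock {
        Lock { _private: () }
    }

    /// Buffer a value in memory.
    pub fn insert(&mut self, value: u64) {
        self.pending.push(value);
    }

    /// Write pending data out (memory -> disk in a real store). The unused
    /// `_lock` parameter makes it impossible to call this without a lock.
    pub fn persist(&mut self, _lock: &Lock) {
        self.persisted.append(&mut self.pending);
    }

    /// Data that has been persisted so far.
    pub fn persisted(&self) -> &[u64] {
        &self.persisted
    }
}
```

The check is purely at the type level: `persist` does not inspect the lock, it only refuses to compile without one.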

Reviewed By: sfilipco

Differential Revision: D23438175

fbshipit-source-id: 3d9ccd7431691d1c4e2ee74f3c80d95f5e7243b5
2020-09-02 18:54:11 -07:00
Jun Wu
3ad58ff945 dag: make SyncableIdMap use &mut IdMap instead of IdMap
Summary:
This removes the need of cloning `IdMap`.

SyncableIdMap is a bit tricky. I added some comments to clarify things.

Reviewed By: sfilipco

Differential Revision: D23438176

fbshipit-source-id: fe66071da07067ed6c53a6437790af1d81b28586
2020-09-02 18:54:11 -07:00
Jun Wu
23f9bec22b dag: move IdDagStore impls to separate files
Summary: This makes `iddagstore.rs` cleaner.

Reviewed By: sfilipco

Differential Revision: D23438177

fbshipit-source-id: 465cec2231a084a36b20da8e413cb9272f64a00a
2020-09-02 18:54:10 -07:00
Jun Wu
4e9200db44 dag: test IndexedLogIdDagStore
Summary:
Make the test cover IndexedLogIdDagStore. The only change is the parent index
returns children in a different order.

Reviewed By: sfilipco

Differential Revision: D23438173

fbshipit-source-id: bcfabcd329e45bbc5e7e773103fa42307c23c35d
2020-09-02 18:54:10 -07:00
Stefan Filip
1ddf5aaa0e tools: add location-to-hash command to read_res
Summary:
There aren't too many things that we can do with the responses that we get back
from the server. Things are somewhat application specific for this endpoint.
One option that is not available right now and might make sense to add is
limiting the number of entries that are printed for a given location.

Reviewed By: kulshrax

Differential Revision: D23456220

fbshipit-source-id: eb24602c3dea39b568859b82fc27b7f6acc77600
2020-09-02 17:20:43 -07:00
Stefan Filip
932450fb15 handlers: update location-to-hash endpoint with count parameter
Summary:
To reduce the size over the wire in cases where we would be traversing the
changelog on the client, we want to allow the endpoint to return a whole parent
chain with their hashes.

Reviewed By: kulshrax

Differential Revision: D23456216

fbshipit-source-id: d048462fa8415d0466dd8e814144347df7a3452a
2020-09-02 17:20:42 -07:00
Stefan Filip
7122cdded7 types: rename Location to CommitLocation
Summary:
Renaming all the LocationToHash related structures to CommitLocationToHash.
This is done for consistency. I realized the issue when the command for reading
the request from cbor was not what I was expecting it to be. The reason was that
the commit prefix was used inconsistently for LocationToHash.

Reviewed By: kulshrax

Differential Revision: D23456221

fbshipit-source-id: 0181dcaf81368b978902d8ca79c5405838e4b184
2020-09-02 17:20:42 -07:00
Stefan Filip
c2079c3464 revisionstore: use async-runtime crate for lfs
Summary:
Replacing uses of the custom Runtime in lfs with the global runtime in the
`async-runtime` crate.

Reviewed By: xavierd

Differential Revision: D23468347

fbshipit-source-id: 61d2858634a37eb2d7d807104702d24889ec047a
2020-09-02 10:01:08 -07:00
Jun Wu
a0223bc7e7 dag: make iddagstore test generic
Summary: Make it possible to test other IdDagStores.

Reviewed By: sfilipco

Differential Revision: D23438178

fbshipit-source-id: e5fc1b20833c71dd7569c77c31c76a26a6e357fe
2020-09-01 23:58:04 -07:00
Jun Wu
211739f00c dag: remove SpanSetAsc
Summary:
Now SpanSet can easily support `push_front`, we can just use SpanSet
efficiently without SpanSetAsc.

Reviewed By: sfilipco

Differential Revision: D23385246

fbshipit-source-id: b2e0086f014977fa990d5142e6eee844293e7ca5
2020-09-01 21:02:08 -07:00
Jun Wu
64bdf70811 dag: add SpanSet::intersection_span_min
Summary: To remove SpanSetAsc, its API needs to be implemented on SpanSet.

Reviewed By: sfilipco

Differential Revision: D23385250

fbshipit-source-id: ebd9d537287b5c1cde6e2c52ffb6da57dbd71852
2020-09-01 21:02:08 -07:00
Jun Wu
16eaceafe9 dag: use VecDeque for SpanSet
Summary: This will make it possible to `push_front` and remove SpanSetAsc special case.

Reviewed By: sfilipco

Differential Revision: D23385249

fbshipit-source-id: 63ac67e9bce7cb281236399b3fb86eba23bbf8a0
2020-09-01 20:53:32 -07:00
Jun Wu
71f101054a dag: implement binary_search_by for VecDeque
Summary:
This makes it easier to replace Vec<Span> with VecDeque<Span> in SpanSet for
efficient push_front and deprecates SpanSetAsc (which uses Id in a bit hacky
way - they are not real Ids).
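
A minimal sketch of a binary search over a sorted `VecDeque` using index access. This is an illustration written for this note, not the code from the diff; it follows the same contract as `slice::binary_search_by`.

```rust
use std::cmp::Ordering;
use std::collections::VecDeque;

/// Binary search a sorted `VecDeque`. Returns `Ok(index)` of a matching
/// element, or `Err(index)` where an element could be inserted to keep
/// the deque sorted.
pub fn binary_search_by<T, F>(deque: &VecDeque<T>, mut f: F) -> Result<usize, usize>
where
    F: FnMut(&T) -> Ordering,
{
    let (mut lo, mut hi) = (0usize, deque.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        // Indexing hides the ring buffer's wrap-around from the search.
        match f(&deque[mid]) {
            Ordering::Less => lo = mid + 1,
            Ordering::Greater => hi = mid,
            Ordering::Equal => return Ok(mid),
        }
    }
    Err(lo)
}
```

Newer Rust versions provide `VecDeque::binary_search_by` in the standard library, so a helper like this is only needed on older toolchains.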

Reviewed By: sfilipco

Differential Revision: D23385245

fbshipit-source-id: b612cd816223a301e2705084057bd24865beccf0
2020-09-01 20:38:29 -07:00
Jun Wu
2d02d3b0f7 dag: validate SpanSet order and no mergable adjacent spans
Summary:
Previously the `is_valid()` function only checks about ordering.
Make it also check "no mergeable adjacent spans" and `span.low<=span.high`.
To provide better debug messages, the function does assertions
directly without returning a bool.
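
The three checks can be sketched as below. This assumes spans are `(low, high)` pairs stored in ascending order, which is a simplification for the example and may not match the order `SpanSet` uses internally; `assert_valid` is a hypothetical name.

```rust
/// Panic with a descriptive message if `spans` is not a valid span list.
/// Assumption for this sketch: inclusive `(low, high)` pairs, ascending.
pub fn assert_valid(spans: &[(u64, u64)]) {
    for (i, &(low, high)) in spans.iter().enumerate() {
        assert!(low <= high, "span {} has low {} > high {}", i, low, high);
        if i > 0 {
            let prev_high = spans[i - 1].1;
            // Ordering: each span must start after the previous one ends.
            assert!(
                prev_high < low,
                "spans {} and {} are out of order or overlap",
                i - 1,
                i
            );
            // No mergeable neighbors: there must be a gap of at least one id.
            assert!(
                prev_high + 1 < low,
                "spans {} and {} are adjacent and should be merged",
                i - 1,
                i
            );
        }
    }
}
```

Asserting directly, instead of returning a `bool`, means a failure reports which span broke which rule.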

Reviewed By: sfilipco

Differential Revision: D23385247

fbshipit-source-id: 84829e9242e47e68dc2a4b2a6775b13331eba959
2020-09-01 20:27:03 -07:00
Jun Wu
4bf5817dad dag: always merge adjacent spans in SpanSet
Summary:
Previously, `SpanSet::from_sorted_spans` allows having adjacent spans like
`[1..=2, 3..=4]`, while `SpanSet::from_spans` would merge them into `[1..=4]`.
Change it so `SpanSet::from_sorted_spans` merges them too.  This simplifies
the `contains` logic and could make some Sets more efficient.
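
A rough sketch of merging adjacent spans while building from sorted input. The `Span` type and `merge_sorted_spans` are simplified stand-ins for this example, and ascending input order is an assumption.

```rust
/// Inclusive span of ids; a simplified stand-in for the crate's span type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Span {
    pub low: u64,
    pub high: u64,
}

/// Build a span list from spans sorted by `low` (ascending), merging
/// spans that overlap or touch, e.g. `[1..=2, 3..=4]` becomes `[1..=4]`.
pub fn merge_sorted_spans(spans: &[Span]) -> Vec<Span> {
    let mut out: Vec<Span> = Vec::with_capacity(spans.len());
    for &span in spans {
        if let Some(last) = out.last_mut() {
            // Adjacent (`last.high + 1 == span.low`) or overlapping spans
            // are folded into the previous span.
            if span.low <= last.high.saturating_add(1) {
                last.high = last.high.max(span.high);
                continue;
            }
        }
        out.push(span);
    }
    out
}
```

With merged spans, any id is covered by at most one span and there is always a gap between neighbors, which is what lets a `contains` check look at a single candidate span.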

Reviewed By: sfilipco

Differential Revision: D23385248

fbshipit-source-id: 85b5ba9533f15034779e93255085a4fa09c6328a
2020-09-01 20:04:12 -07:00
Jun Wu
b7f2ee577a spawn-ext: extend Command::spawn to avoid inheriting fds
Summary:
The Rust upstream took the "set F_CLOEXEC on every opened file" approach and
provided no support for closing fds at spawn time to make spawn lightweight [1].

However, that does not play well in our case:
- On Windows:
  - stdin/stdout/stderr are not created by Rust, and inheritable by
    default (other process like `cargo`, or `dotslash` might leak them too).
  - a few other handles like "Null", "Afd" are inheritable. It's
    unclear how they get created, though.
  - Fortunately, files opened by Python or C in edenscm (ex. packfiles) seem to
    be not inheritable and do not require special handling.
- On Linux:
  - Files opened by Python or C likely lack F_CLOEXEC and need special
    handling.

Implement logic to close file handles (or set F_CLOEXEC) explicitly.

[1]: https://github.com/rust-lang/rust/issues/12148

Reviewed By: DurhamG

Differential Revision: D23124167

fbshipit-source-id: 32f3a1b9e3ae3a9475609df282151c9d6c4badd4
2020-08-31 17:34:48 -07:00
David Tolnay
75c2118e01 Remove crate_root from Rust dependency info
Reviewed By: danobi

Differential Revision: D23430948

fbshipit-source-id: c4b374021325fc247121ceecd0e82a0291aa75d6
2020-08-31 14:43:24 -07:00
Jun Wu
01c551bb30 hgcommits: add flush_commit_data API
Summary: This would be used to avoid excessive memory usage during pull.

Reviewed By: DurhamG

Differential Revision: D23408833

fbshipit-source-id: 8edd95ab8201697074f65cc118d14755a230567d
2020-08-31 11:57:53 -07:00
Durham Goode
08c938e859 dirstate: block addition of paths containing "." and ".."
Summary:
Mergedrivers can call dirstate.add directly and are adding paths with
"." and "..". Let's block those paths.

Reviewed By: quark-zju

Differential Revision: D23375469

fbshipit-source-id: 64e9f20169cfd50325ecd8ebcc1dd3be7a5cb202
2020-08-28 09:42:25 -07:00
Jun Wu
f271d882e6 hgcommands: make commands! macro define modules
Summary: Similar to D18528858 so module names do not need to be spelled twice.

Reviewed By: markbt

Differential Revision: D23091380

fbshipit-source-id: a2a261abc9c78c8805cea62b38498ba65398796d
2020-08-27 19:02:27 -07:00
Arun Kulshreshtha
cb3f95d06e configparser: make code compile without "fb" feature
Summary: This crate would fail to build without the "fb" feature because `serde_json` was listed as an optional dependency (but is used in a way that isn't conditional on the `fb` feature). This diff makes the dependency non-optional, and also silences several dead code warnings that are emitted when building without the "fb" feature.

Reviewed By: quark-zju

Differential Revision: D23386786

fbshipit-source-id: b00a8b0b8b0b978c1cfab2838629fcb388a076e9
2020-08-27 18:28:46 -07:00
Jun Wu
d586a40ada hgcommands: add debugfsync
Summary:
The `debugfsync` command calls fsync on newly modified files in svfs.
Right now it only includes locations that we know have constant number
of files.

The fsync logic is put in a separate crate to avoid slow compiles.

Reviewed By: DurhamG

Differential Revision: D23124169

fbshipit-source-id: 438296002eed14db599d6ec225183bf824096940
2020-08-27 18:26:03 -07:00
Jun Wu
d8e775f423 tracing-collector: limit maximum count of spans
Summary:
Some functions might be called very frequently. For example,
`phases.phasecache.loadphaserevs` might be called 100k+ times.
That makes the tracing data harder to process.

Limit the count of spans to 1k by default so the data is cheaper to process,
and some highly repetitive cases can now be reasoned about. Note the limit
is only put on static Span Ids. If a span uses dynamic metadata or asks for
different Span Ids each time, they will not be limited.

In debugshell,

  td = %trace repo.revs('smartlog()')
  len(td.serialize())

dropped from 6MB to 0.87MB.

It's also possible to reason about:

  td = %trace len(repo.revs('ancestors(.)'))

in debugshell (taking 30s, 98KB serialized, vs 21s without tracing), while
previously the result would be too large to show (`%trace` just hangs).
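
The per-Id limit can be pictured as a counter keyed by the static Span Id. The sketch below is an illustration under that assumption; `SpanLimiter` and `should_record` are hypothetical names, not the tracing-collector API.

```rust
use std::collections::HashMap;

/// Hypothetical limiter: records at most `max_per_id` spans per static id.
pub struct SpanLimiter {
    max_per_id: usize,
    counts: HashMap<u64, usize>,
}

impl SpanLimiter {
    pub fn new(max_per_id: usize) -> Self {
        SpanLimiter {
            max_per_id,
            counts: HashMap::new(),
        }
    }

    /// Returns true if a span with this static id should still be recorded.
    /// Once the limit is reached, further spans with the same id are dropped.
    pub fn should_record(&mut self, span_id: u64) -> bool {
        let count = self.counts.entry(span_id).or_insert(0);
        if *count >= self.max_per_id {
            return false;
        }
        *count += 1;
        true
    }
}
```

A span that asks for a fresh Id on every call would get a fresh counter each time, which matches the note above that such spans are not limited.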

Reviewed By: DurhamG

Differential Revision: D23307793

fbshipit-source-id: 3c1e9885ce7a275c2abd8935a4e4539a4f14ce83
2020-08-27 18:14:29 -07:00
Jun Wu
9f4dac104f dag: truncate output in <SpanSet as Debug>::fmt
Summary: Set a default limit so the output won't be too long.

Reviewed By: DurhamG

Differential Revision: D23307792

fbshipit-source-id: 7e2ed99e96bbde06436a034e78f899fc2e3e03f8
2020-08-27 18:14:29 -07:00
Jun Wu
ed78542610 dispatch: add --trace flag
Summary:
The `--trace` flag enables tracing Python modules.
For compatibility reasons, it also enables `--traceback`.

It can be used with debugshell to make `%trace` more useful.

Reviewed By: sfilipco

Differential Revision: D23278600

fbshipit-source-id: d6d0b34bd5c48111f8cd33d7df115f349b0e95b6
2020-08-27 18:14:28 -07:00
Arun Kulshreshtha
0b9ca4e83b hgcommands: remove unused imports in dynamicconfig module
Summary: Remove unused imports.

Reviewed By: quark-zju

Differential Revision: D23356940

fbshipit-source-id: 31b81eac11946aa8b24ec23c98ddb14716fbea3a
2020-08-27 14:06:52 -07:00
Durham Goode
4d4e425624 configs: add fbitwhoami tiers to dynamicconfig inputs
Summary:
Corp has a different concept of tier than prod. Let's load the corp
tier into our tier set as well.

Reviewed By: quark-zju

Differential Revision: D23354056

fbshipit-source-id: c9543b8253f042c7b1224578e0687b4bdf21738e
2020-08-27 09:24:28 -07:00
Jun Wu
12d23ba64d revisionstore: fix GitHub build (#46)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/46

See https://github.com/facebookexperimental/eden/runs/1034006668:

   error: unused import: `env::set_var`
      --> src/lfs.rs:1539:15
       |
  1539 |     use std::{env::set_var, str::FromStr};
       |               ^^^^^^^^^^^^
       |
  note: the lint level is defined here
      --> src/lib.rs:125:9
       |
  125  | #![deny(warnings)]
       |         ^^^^^^^^
       = note: `#[deny(unused_imports)]` implied by `#[deny(warnings)]`

  error: unnecessary braces around method argument
      --> src/lfs.rs:2439:36
       |
  2439 |         remote.batch_upload(&objs, { move |sha256| local_lfs.blobs.get(&sha256) })?;
       |                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: remove these braces
       |
  note: the lint level is defined here
      --> src/lib.rs:125:9
       |
  125  | #![deny(warnings)]
       |         ^^^^^^^^
       = note: `#[deny(unused_braces)]` implied by `#[deny(warnings)]`

  error: aborting due to 2 previous errors

  error: could not compile `revisionstore`.

I dropped `#![deny(warnings)]` as I don't think warnings like the above ones
should break the build. (denying specific warnings that we care about explicitly
might be a better approach)

Reviewed By: singhsrb

Differential Revision: D23362178

fbshipit-source-id: 02258f57727edfac9818cd29dda5e451c7ca80a7
2020-08-26 20:40:25 -07:00
Arun Kulshreshtha
30e2cf4413 cargo_from_buck: reenable autocargo for edenapi
Summary: Now that it is possible to control which features are enabled on manually-managed dependencies, we can reenable autocargo for `edenapi`. See D23216925, D23327844, and D23329351 (840e6dd6f6) for context.

Reviewed By: dtolnay

Differential Revision: D23335122

fbshipit-source-id: 8ce250c3a106d2a02f457f7ed531623dd866232f
2020-08-26 19:16:48 -07:00
Jun Wu
039419d281 configparser: fix non-fb dependencies (#45)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/45

Fix referring to 'version' without proper codegen by making 'version' compile
without codegen. This fixes configparser test when version/src/lib.rs was not
generated.

Make unneeded deps without 'fb' feature optional.

This would hopefully fix the "EdenSCM Rust Libraries" GitHub workflow.

Reviewed By: DurhamG

Differential Revision: D23269864

fbshipit-source-id: f9e691fe0a75159c4530177b8a96dad47d2494a9
2020-08-26 16:31:00 -07:00
Jun Wu
55116e223f hgcommits: use dag::delegate to simplify code
Summary: This makes the code simpler.

Reviewed By: sfilipco

Differential Revision: D23269866

fbshipit-source-id: 30c9e9d218378c0d6df8b822b2a81df2b38f5b01
2020-08-26 15:32:26 -07:00
Jun Wu
85b3cea8ee dag: define delegate macro for other main traits
Summary: Will be used to simplify code.

Reviewed By: sfilipco

Differential Revision: D23269859

fbshipit-source-id: bed0c4dca075ff60900025642af1d84bdd03452d
2020-08-26 15:32:26 -07:00
Jun Wu
6b3096c7a4 dag: avoid other 'impl<T> Trait for T' usecases
Summary:
`impl<T> Trait for T` in the current Rust makes it impossible to have
`impl<Q> Trait for Q`. Avoid using it for IdConvert and PrefixLookup.
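
To illustrate the constraint: two blanket impls such as `impl<T: A> Trait for T` and `impl<Q: B> Trait for Q` are rejected as conflicting, because one type could satisfy both bounds. One way around it, sketched below with hypothetical names, is to implement the trait per concrete type with a small delegation macro.

```rust
/// Hypothetical trait standing in for one of the dag traits.
pub trait DagAlgorithm {
    fn heads(&self) -> Vec<u64>;
}

/// Hypothetical inner type that really implements the trait.
pub struct InnerDag {
    pub heads: Vec<u64>,
}

impl DagAlgorithm for InnerDag {
    fn heads(&self) -> Vec<u64> {
        self.heads.clone()
    }
}

/// Implement `DagAlgorithm` for one concrete wrapper type by forwarding
/// every method to a field. Unlike a blanket `impl<T> ... for T`, this
/// leaves room for other impls of the same trait.
macro_rules! delegate_dag_algorithm {
    ($wrapper:ty => $field:ident) => {
        impl DagAlgorithm for $wrapper {
            fn heads(&self) -> Vec<u64> {
                self.$field.heads()
            }
        }
    };
}

/// Hypothetical wrapper that owns an `InnerDag`.
pub struct Commits {
    pub dag: InnerDag,
}

delegate_dag_algorithm!(Commits => dag);
```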

Reviewed By: sfilipco

Differential Revision: D23269861

fbshipit-source-id: a837f3984ff4e1bd5a3983dd1642b9f064f51a36
2020-08-26 15:32:25 -07:00
Jun Wu
4a2ee4c522 dag: avoid impl<T> DagAlgorithm for T
Summary:
`impl<T> Trait for T` in the current Rust makes it impossible to have
`impl<Q> Trait for Q`. Avoid using it for DagAlgorithm.

Reviewed By: sfilipco

Differential Revision: D23269860

fbshipit-source-id: 031e75e9bf1f1eec2b9e8f36220ef8b817a143a5
2020-08-26 15:32:25 -07:00
Jun Wu
846768fb53 dag: drop LowLevelAccess
Summary: LowLevelAccess is a subset of NameDagStorage. Use the latter instead.

Reviewed By: sfilipco

Differential Revision: D23269865

fbshipit-source-id: 81ebb1e986d8b02c968a9a237ad9a97d4afd54bf
2020-08-26 15:32:25 -07:00
Jun Wu
f4021486ab dag: move beautify to default_impl
Summary: This makes `ops.rs` look simpler.

Reviewed By: sfilipco

Differential Revision: D23269863

fbshipit-source-id: ddb55ab8eb3b2d3e7c4b2ccbc2252395d62317a1
2020-08-26 15:32:25 -07:00
Jun Wu
bb461d2240 dag: improve range calculation in repos with many heads
Summary:
If there are too many heads, the current `descendants` algorithm would visit
all "old" heads. For example, with this graph:

      head9999  (N9999)
     /
    Z (master)
    :
    : (many heads)
    :/
    : head2 (N2)
    :/
    C head1 (N1)
    |/
    B head0 (N0)
    |/
    A

`A::head9999` or `Z::head9999` will visit N0, N1, ..., N9999, because
`descendands_up_to` is provided with `max_id = N9999` and Z as a vertex in the
master group, is before N0 in non-master.  The current algorithm also means
`descendands_up_to` gets linearly slower as the user uses the repo more, which
is quite undesirable.

This diff changes `descendands_up_to` to take an `ancestors` set, which is
`::head9999` in this case, and iterate non-master flat segments in it. So it
will skip N0 to N9998 directly by finding the N9999 flat segment and only use
it. The number of heads will have a smaller impact on performance.

Another slowness is `draft::draft_heads`, if there are too many `draft_heads`,
the internal calculation of `::draft_heads` can be slow. Optimize it by
limiting `draft_heads` to `draft:`. Practically this affects `y::` revset as
`y::` is translated to `y::visible_heads` and `visible_heads` can be large.

`cargo bench --bench dag_ops -- '::-master'` shows significant difference:

Before:

  range (master::draft)                              18.112 s
  range (recent_draft::drafts)                        2.594 s

After:

  range (master::draft)                              72.542 ms
  range (recent_draft::drafts)                       14.932 ms

In my fbsource checkout there were 20k+ heads. The improvement of
`master::recent_draft` (`x::y`) is pretty visible, and `y::` is also improved:

    % lhg debugbenchmarkrevsets -m -x 'p1(min(7e8c86ae % master))' -Y 'draft() & 7e8c86ae' -e 'x::y' -e 'y::' --no-default
    # x:  168f5228e570fb6b2ff7f851bd82413102748d84  (p1(min(7e8c86ae % master)))
    # y:  7e8c86aec68ebc6e0b8254afcb381315991fd21c  (draft() & 7e8c86ae)

    # before
    | revset \ backend | segments | revlog | revlog-cpy |
    |------------------|----------|--------|------------|
    | x::y             |     17ms |  0.1ms |      0.5ms |
    | y::              |    3.3ms |  0.7ms |      1.3ms |

    # after
    | revset \ backend | segments | revlog | revlog-cpy |
    |------------------|----------|--------|------------|
    | x::y             |    0.2ms |  0.1ms |      0.6ms |
    | y::              |    1.0ms |  0.7ms |      1.3ms |

Reviewed By: sfilipco

Differential Revision: D23214387

fbshipit-source-id: 4d11db84cd28f4e04e8b991cbc650c9d5781fd27
2020-08-26 15:32:25 -07:00
Jun Wu
a3cbda76bb dag: add a benchmark for x::y with lots non-master heads
Summary:
Lots of non-master heads is not an exercised graph in the benchmarks.
Add it as it practically happens.  This will be used by the next change.

Reviewed By: sfilipco

Differential Revision: D23259879

fbshipit-source-id: 7fe290d14403e42e6d135bde56e2d5c8519ae530
2020-08-26 15:32:24 -07:00
Jun Wu
89570e223a dag: use non-master group in fuzz test
Summary:
Currently the fuzz test only uses the master group. Let it exercise non-master
group too.

Reviewed By: DurhamG

Differential Revision: D23214388

fbshipit-source-id: 7108a1055fbdda2b012f93c5948fb83ef3b9a96f
2020-08-26 15:32:24 -07:00
Jun Wu
ded7c2e380 hgcommits: add explain_internals to print human-readable segments
Summary: Provide a way to see segments.

Reviewed By: sfilipco

Differential Revision: D23196408

fbshipit-source-id: b1418f945a5a3364ac73b0f97466d973dd4b6300
2020-08-26 15:32:24 -07:00
Jun Wu
9666dab916 dag: implement Debug for NameDag
Summary:
Provide a way to print out all segments with resolved names. This will be used
in a debug command.

Reviewed By: sfilipco

Differential Revision: D23196410

fbshipit-source-id: 1712bfda0271aa548699fe4a6b8603c5ec07af7f
2020-08-26 15:32:23 -07:00
Jun Wu
5829fc4e20 dag: children(small set) has a fast path
Summary:
Use the parent-child index to answer children query quickly.

`cargo bench --bench dag_ops -- children`:

Before:

  children (spans)                                  606.076 ms
  children (1 id)                                   124.105 ms

After:

  children (spans)                                  602.999 ms
  children (1 id)                                    10.777 ms

Reviewed By: sfilipco

Differential Revision: D23196411

fbshipit-source-id: 37195d5ccaa582d35314e0000352ef477287d38c
2020-08-26 15:32:23 -07:00
Jun Wu
a5a396027d dag: expose API to lookup children by parent
Summary: This will be used to optimize "children(single vertex)" query.

Reviewed By: sfilipco

Differential Revision: D23196409

fbshipit-source-id: 050c0859faf83b909e3174bb7c7bd6e7725165c0
2020-08-26 15:32:23 -07:00
Jun Wu
bad2ae41ef dag: maintain non-master parent-child indexes
Summary:
Update the parent index to store non-master group too. To make
"remove_non_master" work, the index contains a "child group" prefix that
allows efficient range invalidation.

This will allow answering "children(single vertex)" query more efficiently.

This diff does not expose an API to query the index yet.
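
A simplified picture of why a "child group" prefix makes invalidation cheap, assuming an ordered in-memory set keyed by `(child group, parent, child)`. The real index lives in indexedlog; `ParentIndex` and its methods are hypothetical names for this sketch.

```rust
use std::collections::BTreeSet;

pub const MASTER: u8 = 0;
pub const NON_MASTER: u8 = 1;

/// Hypothetical parent -> child index. Keys are ordered by child group
/// first, so all non-master entries form one contiguous key range.
#[derive(Default)]
pub struct ParentIndex {
    entries: BTreeSet<(u8, u64, u64)>,
}

impl ParentIndex {
    pub fn insert(&mut self, child_group: u8, parent: u64, child: u64) {
        self.entries.insert((child_group, parent, child));
    }

    /// Children of `parent`, found with one range scan per group.
    pub fn children(&self, parent: u64) -> Vec<u64> {
        let mut result = Vec::new();
        for group in [MASTER, NON_MASTER] {
            let range = (group, parent, u64::MIN)..=(group, parent, u64::MAX);
            result.extend(self.entries.range(range).map(|&(_, _, child)| child));
        }
        result
    }

    /// Drop every entry whose child is in the non-master group. Thanks to
    /// the group prefix this is a single split, not a full scan.
    pub fn remove_non_master(&mut self) {
        let _removed = self.entries.split_off(&(NON_MASTER, u64::MIN, u64::MIN));
    }
}
```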

Reviewed By: sfilipco

Differential Revision: D23196406

fbshipit-source-id: 9137da5ffa8306bdafbcabc06b6f0d23f38dcf57
2020-08-26 15:32:23 -07:00
Jun Wu
6c468b7ac0 dag: add benchmark about children(1 id)
Summary:
Practically, the input of `children` is often one vertex instead of a large set.
Add a benchmark for it.

It looks like:

  children (spans)                                  606.076 ms
  children (1 id)                                   124.105 ms

Reviewed By: sfilipco

Differential Revision: D23196407

fbshipit-source-id: 0645b59ac846836fd061386384f6386a57661741
2020-08-26 15:32:23 -07:00
Jun Wu
6f3616a2b8 nameset: make dag and idmap immutable in hints
Summary: They can be figured out at Hints initialization time. So they don't need to be mutable.

Reviewed By: sfilipco

Differential Revision: D23182518

fbshipit-source-id: 133375fdf27a2546a50b63fb130534acdadc5938
2020-08-26 15:32:22 -07:00
Jun Wu
682365f14d nameset: make Id{Static,Lazy}Set require Dag on construction
Summary:
Both IdStaticSet and IdLazySet now require both Dag and IdMap to construct.
This is step 1 towards making Dag and IdMap immutable in hints.

A misspelling of "lhs" vs "hints" in the union set was discovered by the change
and is fixed.

Reviewed By: sfilipco

Differential Revision: D23182520

fbshipit-source-id: 3d052de4b8681d3672ebc45d953d1e784f64b2a4
2020-08-26 15:32:22 -07:00
Jun Wu
3ba655abf3 dag: add DummyDag for testing
Summary:
It will be used in places (ex. tests) where a Dag is required but constructing
a real Dag is troublesome.

Reviewed By: sfilipco

Differential Revision: D23182517

fbshipit-source-id: 736911365778e5071c1e0b9615090a4e960392a0
2020-08-26 15:32:22 -07:00
Jun Wu
bd7769b34a dag: rename snapshot_dag to dag_snapshot
Summary: This is more consistent with `id_map_snapshot`.

Reviewed By: sfilipco

Differential Revision: D23182519

fbshipit-source-id: 62b7fc8bfdc9d6b3a4639a6518ea084c7f3807dd
2020-08-26 15:32:22 -07:00
Jun Wu
4d798c39d9 dag: add new range algorithm
Summary:
Similar to descendants, the new range algorithm avoids potentially expensive
checks about whether high-level segments can be used or not. Practically this
is overall an improvement.

`cargo bench --bench dag_ops -- range`:

Before:

  range (2 ids)                                     115.380 ms
  range (spans)                                     243.666 ms

After:

  range (2 ids)                                     123.274 ms
  range (spans)                                      23.101 ms

It is 100x faster with the range x::y benchmark added later on `git.git`.

Reviewed By: sfilipco

Differential Revision: D23106175

fbshipit-source-id: 691e0418ba2b7ad9f52ac15b5cd6088ec28d5f48
2020-08-26 15:32:22 -07:00
Jun Wu
c2e03b9129 dag: add new descendants algorithm
Summary:
The old algorithm tries to make use of high-level segments.
However, the code to test whether a high-level segment can be used is
often too expensive for the benefit. Often, high-level segments cannot
be used most of the time and it's similar to O(flat segments).

This diff adds a simpler algorithm that just iterates through the flat
segments. It's faster in most practical cases.

`cargo bench --bench dag_ops -- descendants` shows improvements too:

Before:

  descendants (small subset)                        436.515 ms

After:

  descendants (small subset)                         33.460 ms
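
The flat-segment iteration described above can be sketched as follows. This is a simplified model, not the crate's code: `FlatSegment` and `descendants` are hypothetical names, ids are assumed to be topologically sorted (parents have smaller ids), and only the first id of a segment is assumed to have parents outside the segment.

```rust
use std::collections::BTreeSet;

/// Hypothetical flat segment: ids `low..=high` form a linear chain, and
/// `parents` are the parents of `low` (all in earlier segments).
struct FlatSegment {
    low: u64,
    high: u64,
    parents: Vec<u64>,
}

/// Compute `roots` plus all of their descendants with a single ascending
/// pass over the flat segments.
fn descendants(segments: &[FlatSegment], roots: &BTreeSet<u64>) -> BTreeSet<u64> {
    let mut result = BTreeSet::new();
    for seg in segments {
        // If any parent is already selected, the whole segment is reachable.
        let start = if seg.parents.iter().any(|p| result.contains(p)) {
            Some(seg.low)
        } else {
            // Otherwise start from the lowest root inside this segment, if any.
            roots.range(seg.low..=seg.high).next().copied()
        };
        if let Some(start) = start {
            result.extend(start..=seg.high);
        }
    }
    result
}
```

The cost is proportional to the number of flat segments, with no per-segment check about whether a higher-level segment could be used instead.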

Reviewed By: sfilipco

Differential Revision: D23106174

fbshipit-source-id: e6101483d8539b2b1c881be2ccfd0071f122352f
2020-08-26 15:32:22 -07:00
Jun Wu
e22b816a12 dag: add iddag.iter_segments_ascending API
Summary: This will be used by upcoming changes.

Reviewed By: sfilipco

Differential Revision: D23106177

fbshipit-source-id: 9bf183f7464c06b801be64fd938db0babd544756
2020-08-26 15:32:21 -07:00
Jun Wu
0dcf08e509 dag: add SpanSetAsc struct
Summary: This internal struct will be used by upcoming changes.

Reviewed By: sfilipco

Differential Revision: D23106172

fbshipit-source-id: 6d5b9bc1c810984814d0912100acca38a2565a63
2020-08-26 15:32:21 -07:00
Durham Goode
201f63be32 build: rename third-party rust fbthrift crate
Summary:
Our internal build infra creates a workspace and workspaces don't like
it when two crates have the same name. Eden scm had third-party rust crates that
were simple redirects to the internal location, but had the same name. This
caused breakages once these crates became part of the edenfs open source build.
Let's rename them to avoid this issue.

Reviewed By: kulshrax

Differential Revision: D23252539

fbshipit-source-id: 9ff2fa160a19c6bc54e015c71f9da7044ce659a7
2020-08-26 12:26:21 -07:00
Stefan Filip
72db1cbedb async-runtime: crate for async job scheduling from blocking threads
Summary:
We have a thread blocking application. We have async libraries. This crate
provides common utilities for communicating between the blocking world and the
async world. It is intended to be a guide so that not all developers have to
get in depth understanding of Tokio in order to use async functions.

Reviewed By: quark-zju, xavierd

Differential Revision: D23222876

fbshipit-source-id: b9a61795bc917bfc664c9d6da95c9e5e2d506c79
2020-08-26 00:57:32 -07:00
Arun Kulshreshtha
a99837c818 revisionstore: fix unused variable warnings on EdenApiStoreKind
Summary: The default method implementations on this trait were causing unused variable warnings. Prefix them with underscores to silence the warning and add `#![deny(warnings)]` to `revisionstore` to prevent future warnings from slipping through the cracks.

Reviewed By: singhsrb

Differential Revision: D23334309

fbshipit-source-id: d17b27ca0dd462e1613eac918fb595faa8637741
2020-08-25 21:53:22 -07:00
David Tolnay
840e6dd6f6 edenapi: Unmanage Cargo.toml
Reviewed By: quark-zju

Differential Revision: D23329351

fbshipit-source-id: e07440a2ac5f93efda19e7a3c7d5e7ae18598c7d
2020-08-25 15:37:55 -07:00
Durham Goode
18b20f4b24 configs: move dynamicconfig to be before system configs
Summary:
We want to eventually get rid of system and repo configs, but for now
they should take precedence over the dynamicconfigs. Previously we relied on
validation to remove any entries from dynamicconfig that interferes with a
system rc config, but in some code paths we didn't run that validation (like if
we loaded configs purely from Rust).

Let's just make dynamicconfig load before system configs. If validation doesn't
run, we might miss the case where dynamicconfig sets a value and the system rc
doesn't. But that's probably fine.

Reviewed By: quark-zju

Differential Revision: D23305711

fbshipit-source-id: 77b5f49d348cfa116694a641ed17e6d1184a81ab
2020-08-25 07:33:28 -07:00
Durham Goode
d643f48c8c configs: remove loaddynamicconfig option
Summary:
Dynamicconfigs are now part of our critical path. Let's remove the
option to not load them. This also lets us get rid of a circular dependency
where loading dynamicconfigs required having already loaded some configs. This
will let us move dynamicconfig loading to be before system rc loading in a
later diff.

Reviewed By: sfilipco

Differential Revision: D23309090

fbshipit-source-id: 5138059a8ed944c3616007e7c1289b6a57be0e65
2020-08-25 07:33:28 -07:00
Durham Goode
c42a494668 dynamicconfig: don't block read operations on dynamicconfig write permission errors
Summary:
Dynamicconfig was throwing errors if hgrc.dynamic wasn't writable.
Let's eat those errors for normal read operations. We still treat it as an error
for straight hg debugdynamicconfig invocations.

Reviewed By: quark-zju

Differential Revision: D23301100

fbshipit-source-id: ed0bd1282d2c7ee747f0909c238a5fa07b7bc9bc
2020-08-24 21:40:00 -07:00
Jun Wu
7872c44fdf configparser: stabilize tests
Summary:
Add locking for tests reading / mutating global env vars.
Restore HG_TEST_REMOTE_CONFIG after testing.

Reviewed By: DurhamG

Differential Revision: D23269862

fbshipit-source-id: d61141b25c923a059de07c3dc8479f3bee06dce7
2020-08-24 12:36:09 -07:00
Jun Wu
e7f3167810 hgcommands: show milliseconds on RUST_LOG output
Summary: This makes it a bit easier to track down perf issues printed by RUST_LOGs.

Reviewed By: sfilipco

Differential Revision: D23095463

fbshipit-source-id: 78221a1992389f512fac6e6e633be6d19123e04a
2020-08-21 13:00:45 -07:00
Jun Wu
d7cbb641ff dag: fix fuzz tests
Summary:
The fuzz tests need `TestContext::id_dag()`, which was removed by D20471712 (1fb5acf242).
Restore it so fuzz tests can run. This is mainly to check the new `range`
function.

The `range` fuzz test does find an issue caused by `>` written as `>=`
relatively quickly.

Reviewed By: sfilipco

Differential Revision: D23106176

fbshipit-source-id: e9540cc932503a9d54246d24c70bac829fcb13df
2020-08-21 13:00:45 -07:00
Jun Wu
749602e534 hgcommits: add gitsegments backend
Summary:
The backend translates git commit graph to segments. It's useful for
benchmarking on git commit graphs.

Reviewed By: DurhamG

Differential Revision: D23095470

fbshipit-source-id: 21a28869e91ef8f38bbf9925443eb4ac26f05e3d
2020-08-21 13:00:45 -07:00
Jun Wu
d352133d6d hgcommits: use concrete error types
Summary: Migrate to concrete types so it can be typechecked.

Reviewed By: DurhamG

Differential Revision: D23095469

fbshipit-source-id: 27c6da30ca8a1329df544cd2ded7d9734593e48a
2020-08-21 13:00:45 -07:00
Jun Wu
e5527715b7 gitdag: crate to build segmented dag from git history
Summary:
Read git commit graph and migrate them to `dag::Dag`.

This allows using Rust dag abstractions on the git
commit graph.

Reviewed By: DurhamG

Differential Revision: D23095471

fbshipit-source-id: 2163701350ce82ce6e97074e56ad5877f3c9c158
2020-08-21 13:00:45 -07:00
Jun Wu
45db3bbf96 mutationstore: add a native path to calculate 'obsolete()'
Summary:
The new path does not calculate the complicated `successorssets`, and is
known to make wez's repo operations significantly faster (which, I suspect, is
slowed by a very long chain).

The new code is about 3x faster on my repo too:

  # before
  In [1]: list(repo.nodes('draft()'))
  In [2]: %time len(m.mutation.obsoletenodes(repo))
  CPU times: user 246 ms, sys: 42.2 ms, total: 288 ms
  Wall time: 316 ms
  Out[2]: 1127

  # after
  In [1]: list(repo.nodes('draft()'))
  In [2]: %time len(m.mutation.obsoletenodes(repo))
  CPU times: user 74.3 ms, sys: 7.92 ms, total: 82.3 ms
  Wall time: 82.3 ms
  Out[2]: 1127

Reviewed By: markbt

Differential Revision: D23036063

fbshipit-source-id: afd6ac122bb5d8d513b5cdc033e04d2c377286eb
2020-08-21 13:00:45 -07:00
Jun Wu
78477ad9c5 mutationstore: optimize get_dag
Summary:
Optimize get_dag:
- Avoid parsing mutation entries once they are parsed, by keeping an in-memory
  `parent_map`.
- Pass `heads` to `add_heads` so the segments are less fragmented and the cycle
  break helper is more efficient.

The `heads` optimization is effective. Practically this makes `get_dag` about 2x faster.

This has a subtle change on cycle handling - full cycle without any non-cycle heads will
be ignored. Practically cycles are rare so it might be okay.

Together with improvements on the `dag` side, `get_dag` is about 4x faster.

Reviewed By: markbt

Differential Revision: D23036062

fbshipit-source-id: 3dc407b562f7ebf2543a87c5cd651ad6a2339d67
2020-08-21 13:00:45 -07:00
Jun Wu
be2d28fb95 dag: fix non-master high-level segments building
Summary:
If there are no new master segments, it's still possible to have new non-master
segments. Fix the loop condition so we don't skip building non-master segments.

Reviewed By: sfilipco

Differential Revision: D23095465

fbshipit-source-id: 46eb9d5b5f2b04241981558646e0bc090652abce
2020-08-21 13:00:45 -07:00
Jun Wu
e11f36e96b dag: test high-level segments building for non-master
Summary:
I noticed that high-level segments are somehow not built for non-master vertexes.
Add a test to demonstrate the issue.

Reviewed By: DurhamG, sfilipco

Differential Revision: D23095466

fbshipit-source-id: c5a6da14bdfabcf7c432f6c6dfe096c71cc10ee9
2020-08-21 13:00:45 -07:00
Jun Wu
23074edd9b dag: add some tracing spans
Summary: This is useful to investigate internals of dag calculations.

Reviewed By: sfilipco

Differential Revision: D23095473

fbshipit-source-id: 4750c1b4ffad32b1317051d17db9659aaaed59c4
2020-08-21 13:00:45 -07:00
Jun Wu
cd9aa9cb6c dag: improve segment building perf by using precalculated flat segments
Summary:
Follow up of the previous change by actually using the flat segments to build
segments. This significantly improved the perf. `cargo bench --bench dag_ops`
shows:

  building segments (old)                           774.109 ms
  building segments (new)                           143.879 ms

Besides, a `O(N^2)` update to `head_ids` is changed. It improves performance
when the graph has many heads (ex. the mutation graph).

Reviewed By: sfilipco

Differential Revision: D23036080

fbshipit-source-id: 033565700f253c6f20e30a00adb6b579921d6679
2020-08-21 13:00:45 -07:00
Jun Wu
9c9ecbc82b dag: make IdMap::assign_head calculate flat segments
Summary:
While testing the `obsolete()` set, I found an in-memory segmented DAG takes
10x longer to build than a HashMap DAG.

Part of the inefficiency is to use a translated "parent_func" that round-trips
through Id and Vertex, used by segment building logic. This diff makes
`IdMap::assign_head` return flat segments, so we don't need a translated
"parent_func" to build flat segments.

This diff only adds checks to make sure the parent_func (Id version) matches
the segments. The next diff switches the segment building to not use the
translated parent_func.

Reviewed By: sfilipco

Differential Revision: D23036060

fbshipit-source-id: 99137f4b5be455cdf43218ba23eb3954b6d9e05a
2020-08-21 13:00:45 -07:00
Jun Wu
0742dc6293 dag: make to_set API bind the dag
Summary:
This affects the `tonodes` API in the Python world. Practically this will bind
the main commit graph to sets like draft, public.

The `ToSet` requirement on `DagAlgorithm` has to be removed to avoid stack
overflow of rustc resolving constraints.

Reviewed By: sfilipco

Differential Revision: D23036077

fbshipit-source-id: 912b924e29611680ab6b2ee4dbcd7ab39824409a
2020-08-21 13:00:45 -07:00
Jun Wu
adf027742e nameset: add flatten API
Summary: This will be useful for the `obsolete()` set.

Reviewed By: sfilipco

Differential Revision: D23036072

fbshipit-source-id: 2f944ef31cf19f902622d90545fa02b7dda89221
2020-08-21 13:00:45 -07:00
Jun Wu
f23b1112f0 nameset: a & b should not use id-based fast path if id map is incompatible
Summary:
If two sets have different IdMap, their Ids cannot be compared directly
for correctness.
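
A small sketch of why the id-based fast path is only safe when both sets share an IdMap. The `IdSet` struct and `intersect` function are hypothetical stand-ins, and map identity is approximated here with `Arc::ptr_eq`; the real compatibility check may differ.

```rust
use std::collections::{BTreeSet, HashMap};
use std::sync::Arc;

/// Hypothetical id-backed set: ids are only meaningful relative to `map`.
struct IdSet {
    ids: BTreeSet<u64>,
    map: Arc<HashMap<u64, String>>, // id -> vertex name
}

/// Intersect two sets and return vertex names.
/// Assumes every id in a set is present in that set's map.
fn intersect(a: &IdSet, b: &IdSet) -> BTreeSet<String> {
    if Arc::ptr_eq(&a.map, &b.map) {
        // Fast path: same map, so equal ids mean equal vertexes.
        a.ids
            .intersection(&b.ids)
            .map(|id| a.map[id].clone())
            .collect()
    } else {
        // Slow path: different maps, so compare by name instead of by id.
        let b_names: BTreeSet<&String> = b.ids.iter().map(|id| &b.map[id]).collect();
        a.ids
            .iter()
            .map(|id| &a.map[id])
            .filter(|name| b_names.contains(*name))
            .cloned()
            .collect()
    }
}
```

If the fast path were taken for sets with different maps, id 0 could name one vertex in the first set and a different vertex in the second, and the intersection would be wrong.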

Reviewed By: sfilipco

Differential Revision: D23036068

fbshipit-source-id: e800e8273b95c1f8174236e0f30445db7fd44556
2020-08-21 13:00:45 -07:00
Jun Wu
c1e596dbd6 nameset: use real id map snapshot instead of a pointer in hints
Summary: This is similar to the previous change. This allows "binding" IdMaps to sets.

Reviewed By: sfilipco

Differential Revision: D23036058

fbshipit-source-id: ec1b1ec73e949ad4865aecf17bfcc5c1ca723e0d
2020-08-21 13:00:45 -07:00
Jun Wu
0ac5f05097 nameset: use real dag snapshot instead of a pointer in hints
Summary:
This trades a bit of performance (calculating the snapshot) for correctness (no
pointer reuse issues) and convenience (set captures dag information with them
and enables use-cases like converting NameSet from another dag to the
current dag without requiring extra `dag` objects).

Reviewed By: sfilipco

Differential Revision: D23036067

fbshipit-source-id: 2e691f09ad401ba79dbc635e908d79e54dadca5e
2020-08-21 13:00:45 -07:00
Jun Wu
759ceb6212 nameset: do not swap x & y if they come from different graphs
Summary:
If `x` and `y` come from the same graph, `x & y` is more efficient than
`y & x` if `y` is larger. However, if `x` and `y` are from different
graphs, the `FULL` hint can no longer accurately predict which one
is larger. Therefore the swap should be avoided.

Reviewed By: sfilipco

Differential Revision: D23036081

fbshipit-source-id: fe3970fc38c853b36689bfd0ee1dec20643ace78
2020-08-21 13:00:45 -07:00
Jun Wu
762603455a nameset: new metaset for separate iter+contains lazy/fast paths
Summary:
For sets like `obsolete()`, `merge()`, they could have a fast "contains" path:
Just check the given commit without calculating a full set. It's also possible
to have a relatively efficient code path to return StaticSet (for obsolete()),
or IdStaticSet (for merge(), by checking flat segments). This diff adds a
`MetaSet` that allows defining two fast paths separately.

This will be used for the `obsolete()` set in upcoming changes.
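
The idea of defining the two paths separately can be sketched like this. The struct below only borrows the `MetaSet` name for illustration; its fields, closure signatures, and the use of plain `u64` ids are assumptions, not the crate's API.

```rust
/// Hypothetical set with two independently defined code paths.
struct MetaSet {
    /// Slow path: materialize the whole set.
    evaluate: Box<dyn Fn() -> Vec<u64>>,
    /// Optional fast path: test one id without materializing the set.
    contains_fast: Option<Box<dyn Fn(u64) -> bool>>,
}

impl MetaSet {
    fn contains(&self, id: u64) -> bool {
        match &self.contains_fast {
            // Use the dedicated membership check when one was provided.
            Some(check) => check(id),
            // Otherwise fall back to evaluating the full set.
            None => (self.evaluate)().contains(&id),
        }
    }

    fn iter(&self) -> impl Iterator<Item = u64> {
        (self.evaluate)().into_iter()
    }
}
```

With a fast `contains`, a membership test never triggers the full evaluation; iteration still does.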

Reviewed By: sfilipco

Differential Revision: D23036059

fbshipit-source-id: 06e6f90e7e9511626a12cfa729c306ff539256d2
2020-08-21 13:00:45 -07:00
Jun Wu
7d8f4ef92f dag: fix re-assigning master flush
Summary:
Before this change, `flush` with empty changes but `master` moves will cause an
error, because the `parents_func` only contains "pending changes", aka. new
vertexes. The `parents_func` does not know `master` and `master` is needed to
re-assign them from the non-master to the master group.

With the snapshot API, things become easier. We just take a snapshot before
reloading, and use the snapshot to answer parent_names.

Reviewed By: sfilipco

Differential Revision: D22970569

fbshipit-source-id: 99a25857ba98792edff69985c16df118a560ffb0
2020-08-21 13:00:45 -07:00
Jun Wu
f666cb1cf0 dag: add DagAlgorithm::snapshot_dag
Summary:
This API allows the underlying Dag to provide a snapshot. The snapshot can then
be used in places that do not want a lifetime (ex. NameSet).

Reviewed By: sfilipco

Differential Revision: D22970579

fbshipit-source-id: ededff82009fd5b4583f871eef084ec907b45d33
2020-08-21 13:00:45 -07:00
Jun Wu
b8e7828edd dag: add NameDag::snapshot_dag
Summary:
Make it possible to snapshot a Dag. This is useful for cases where another
struct wants access to the Dag without lifetimes. Namely, the LazySet
might want to keep a snapshot of Dag.

Reviewed By: sfilipco

Differential Revision: D22970568

fbshipit-source-id: 508c38d3ffac2ffcd2e682578c3c5e5787ea3bcf
2020-08-21 13:00:45 -07:00
Jun Wu
741d050f10 dag: drop inverse DAG
Summary:
The only intended use of the inverse DAG is to implement the Python dag
interface in `dagutil.py`. D22519589 (2d4d44cf3d) stack changed it so the Python dag
interface becomes optional. Therefore there is no need to keep the inverse DAG
interface, which is a bit tricky on sorting.

Reviewed By: sfilipco

Differential Revision: D22970581

fbshipit-source-id: 58a126b41d992e75beaf76ece25cb578ee84760b
2020-08-21 13:00:45 -07:00
Jun Wu
fa25f42fea pydag: add an API to migrate from one DAG to segmented DAG
Summary:
This will be used for migrating revlog DAG to segmented changelog. It does not
migrate commit text data (which can take 10+ minutes).

Reviewed By: DurhamG, sfilipco

Differential Revision: D22970582

fbshipit-source-id: 125a8726d48e15ceb06edb139d6d5b2fc132a32c
2020-08-21 13:00:45 -07:00
Durham Goode
b2ece412fd configs: handle timestamp anomalies in dynamicconfigs
Summary:
Dynamicconfigs compares the timestamp of config files with the current
timestamp to determine when to regenerate. If the timestamp of the config file
is newer than the current timestamp, Rust throws an exception. Let's handle that
case and treat it as if the file was just created instead of crashing.
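
A minimal sketch of this kind of handling, assuming the age is computed with `SystemTime::duration_since`, which returns `Err` when the "earlier" time is actually later. The function names `config_age` and `needs_regen` are hypothetical.

```rust
use std::time::{Duration, SystemTime};

/// Age of a config file. A modification time in the future is treated
/// as "just created" (age zero) instead of being reported as an error.
fn config_age(mtime: SystemTime, now: SystemTime) -> Duration {
    now.duration_since(mtime)
        .unwrap_or_else(|_| Duration::from_secs(0))
}

/// Hypothetical regeneration check built on top of `config_age`.
fn needs_regen(mtime: SystemTime, now: SystemTime, max_age: Duration) -> bool {
    config_age(mtime, now) > max_age
}
```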

Reviewed By: quark-zju

Differential Revision: D23230216

fbshipit-source-id: ca185de7dfca46953e04ec08c84668eda6d749bd
2020-08-21 13:00:45 -07:00
Mark Thomas
4d18561ab8 bgprogress: Stdio is only used on Unix
Summary: This fixes the Windows build.

Reviewed By: farnz

Differential Revision: D23212195

fbshipit-source-id: 159f3ddebf6a97f52f9b6c80ef19315c8f4b0c85
2020-08-21 13:00:45 -07:00
Jun Wu
6b64f9a2bf dag: add import_and_flush API
Summary:
This allows importing from other DAGs. It will be used to import revlog DAG to
the new segmented format.

Reviewed By: sfilipco

Differential Revision: D22970572

fbshipit-source-id: 0a183e7b64831574cc9c60d4639124d02d19cf43
2020-08-21 13:00:45 -07:00
Jun Wu
c448e0f575 renderdag: move to dag
Summary:
This allows dag to use renderdag in tests to verify graph result. Previously
it was hard because dag <-> renderdag would form circular dependency.

It also makes it possible to implement more efficient and integrated fast paths
for graph rendering.

Reviewed By: sfilipco

Differential Revision: D22970570

fbshipit-source-id: 526497339bd7aa8898d1af4aa9cf6d2a6797aae0
2020-08-21 13:00:45 -07:00
Jun Wu
d047f07b70 commits: add a trait to describe storage backend and use-cases
Summary: This will be used to describe what the commit graph backend is.

Reviewed By: sfilipco

Differential Revision: D22970577

fbshipit-source-id: 753efdbdd4466730ece758d9f4789fbd21e2801b
2020-08-21 13:00:45 -07:00
Jun Wu
b77355ca0c commits: add double write commits backend
Summary:
This allows us to try segmented changelog while maintaining revlog
compatibility.

Reviewed By: sfilipco

Differential Revision: D22970583

fbshipit-source-id: 7c43cdadd76300e76e89f38aac5ed3ecc0cff728
2020-08-21 13:00:45 -07:00
Durham Goode
d7b036c29a Enable fb features for cargo test diff runs
Summary:
We missed a Windows http client breakage because our LFS server integration
wasn't run on Windows. Let's enable the fb feature for all our cargo test runs.

Reviewed By: singhsrb

Differential Revision: D23140315

fbshipit-source-id: 46cc533c1e543ffc32d472b49a8f6daeee3b5009
2020-08-18 14:01:01 -07:00
Meyer Jacobs
656e3c90d6 edenapi: Introduce serde annotations for wire protocol compatibility and compact wire representation
Summary:
Aux data wire protocol part 1: field annotations & basic compatibility model.

Annotates fields in `file`, `tree`, and `complete_tree` wire structs with `#[serde(rename = "N", default, skip_serializing_if = "is_default")]`. I've avoided using `#[serde(default)]` on the container structs themselves because this can cause some confusion / incorrect behavior if not used carefully. Consider a wire struct `FooRequest` with a field of type `Option<Bar>`. `Option<Bar>` defaults to `None`. If `FooRequest`'s `Default` implementation sets the field's default to `Some(bar)`, a `FooRequest` explicitly constructed with `None` for the field will be serialized with the field omitted (because it passes `is_default`) and will be deserialized on the server as `Some(bar)`, causing incorrect behavior. To address this, we'd need to change the `is_default` function used with `skip_serializing_if` to check against the field's default value as set by the container, which isn't trivially possible without some sort of reflection (please correct me if you know a good way to achieve this). This is unfortunate, as it'd be very desirable for the container to be able to set defaults different from the individual field type defaults, for cases where one boolean, for instance, should default to true. As-is, we'd need to address this with wrapper types instead, where we can fully control the `Default` implementation.

We can, of course, address this by providing an alternate `skip_serializing_if` function to fields with default that doesn't match that set by the container. This will need to be done carefully, though, to avoid the issue I described above.
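
The round-trip problem described above can be reproduced without serde. The sketch below hand-rolls a tiny "wire format" to mimic the two behaviors involved: skipping a field when it equals its type-level default, and filling missing fields from a container-level `Default`. The `serialize`/`deserialize` functions, the field number 0, and the use of `u32` in place of `Bar` are all assumptions made for illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
struct FooRequest {
    bar: Option<u32>,
}

/// Container-level default that differs from the field type's default (`None`).
impl Default for FooRequest {
    fn default() -> Self {
        FooRequest { bar: Some(7) }
    }
}

/// Mimics an `is_default` helper: compares against the field type's default.
fn is_default<T: Default + PartialEq>(value: &T) -> bool {
    *value == T::default()
}

/// Simulated serializer: emits (field number, value) pairs and skips
/// fields that pass `is_default`.
fn serialize(req: &FooRequest) -> Vec<(u8, u32)> {
    let mut wire = Vec::new();
    if !is_default(&req.bar) {
        // Not the type default, so `bar` is `Some(v)` here.
        wire.extend(req.bar.map(|v| (0u8, v)));
    }
    wire
}

/// Simulated deserializer: missing fields come from the container default.
fn deserialize(wire: &[(u8, u32)]) -> FooRequest {
    let mut req = FooRequest::default();
    for &(field, value) in wire {
        if field == 0 {
            req.bar = Some(value);
        }
    }
    req
}
```

Serializing `FooRequest { bar: None }` produces an empty wire message, and deserializing that message yields `bar: Some(7)`, so the explicit `None` is lost.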

Currently the JSON module manually serializes and deserializes all the top-level request objects, so the rename annotation doesn't impact it. We can add `#[serde(alias = "rustfieldname")]` if we'd like the server and client to be able to accept manually-crafted requests and responses with explicit field names. This could also be useful to replace the manual parsing in the JSON module, but can't replace the manual serialization in a clean way. We'd need to introduce a second copy of the wire types, without the serde `rename` attribute, to allow serializing with the actual rust field name.

I've only modified the `tree`, `file`, and `complete_tree` modules. I intend to eventually update the rest of the edenapi protocol later on, when the implementation of `file` and `tree` are complete / stable. This will give us a chance to fix any mistakes before copying the design to more places.

Note: I do not intend to keep to proper wire protocol compatibility at this stage in the implementation. Expect field numbers to be re-used by non-compatible changes.

Reviewed By: kulshrax

Differential Revision: D23172756

fbshipit-source-id: 39976ed4bede892bd6981f9c3f23557a91f9028b
2020-08-18 13:44:35 -07:00
Xavier Deguillard
88c3bf4826 revisionstore: remove translate_lfs_missing
Summary:
As noted in the documentation for it, this can be removed once get and prefetch
return a continuation. This is now done, and thus we can remove it entirely.

Mis-use of it caused data to be fetched twice: once by memcache, and the second
one by getpackv2.

Reviewed By: singhsrb

Differential Revision: D23123344

fbshipit-source-id: 9ac0594faaba94ead04a8bb9035e14809a706641
2020-08-17 17:05:58 -07:00
Durham Goode
fe6cb9dc13 configs: fix handling shared path with trailing new lines
Summary: The Python code stripped newlines but the Rust code did not.
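
A minimal sketch of the kind of normalization involved, assuming the shared path is read as a string from a file. The helper name `parse_shared_path` is hypothetical, and stripping `\r` as well as `\n` is an assumption rather than something stated in the commit.

```rust
/// Strip trailing line terminators from the contents of a shared path file.
/// Other whitespace is kept, since it could be part of the path.
fn parse_shared_path(contents: &str) -> &str {
    contents.trim_end_matches(|c: char| c == '\n' || c == '\r')
}
```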

Reviewed By: singhsrb

Differential Revision: D23167515

fbshipit-source-id: add33ec6e4cfd9169e6fef8208490e0aeede38bd
2020-08-17 15:53:08 -07:00
Durham Goode
33a634167e dynamicconfig: support a disallowlist config
Summary:
This new disallowlist will let us specify config section.key's which
should not be accepted from old rc files. This will let us incrementally disable
loading of those configs from the old files, which will then let us delete them
from the old rc's and eventually delete the old rc's entirely.

This diff also removes hgrc.local and hgrc.od from the list of configs we
verify, since those are not on the list of configs that need to be removed in
this initiative.

Reviewed By: quark-zju

Differential Revision: D23065595

fbshipit-source-id: 5cd742d099efd651174cab5e87bb7cdc4bae8054
2020-08-16 16:56:00 -07:00
Durham Goode
0cf7ebeffe configs: make backingstore load hg configs through the approved path
Summary:
Previously the backing store was loading configs manually. Now that
system, dynamic, user, and repo config loading are unified, let's go through
that approved path.

Reviewed By: kulshrax

Differential Revision: D22736338

fbshipit-source-id: 232023e660107a096691e9d99bf89c04c218dfbd
2020-08-16 16:56:00 -07:00
Durham Goode
2da121cb60 configs: add rust support for loading dynamic and repo configs
Summary:
This threads the calls to load_dynamic and load_repo through the Rust
layer up to the Python bindings. This diff does 2 notable things:

1. It adds a reload API for reloading configs in place, versus creating a new
one. This will be used in localrepo.__init__ to construct a new config for the
repo while still maintaining the old pinned values from the copied ui.
2. It threads a repo path and readonly config list from Python down to the Rust
code. This allows load_dynamic and load_repo to operate on the repo path, and
allows the readonly filter to be applied to all configs during reloading.

Reviewed By: quark-zju

Differential Revision: D22712623

fbshipit-source-id: a0f372f4971c5feac2f20e89a0fb3fe6d4a65d6f
2020-08-16 16:56:00 -07:00
Durham Goode
6b0014490c configs: implement dynamic and repo config loading in Rust
Summary:
In a future diff we'll enable dynamic and repo config loading purely
from Rust. To do so we need load functions for both cases.  A future diff will
call these.

The dynamicconfig loading is based off the Python equivalent in uiconfig.py

Reviewed By: quark-zju

Differential Revision: D22712624

fbshipit-source-id: ff46f6315fb80d4cd9e31d875ac60264563b12f2
2020-08-16 16:56:00 -07:00
Durham Goode
194e815245 configs: move HGRCPATH loading to load_system
Summary:
Previously load_system would skip loading if HGRCPATH was present and
then load_user would actually load the HGRCPATH. In an upcoming diff I add
load_dynamic, which happens after system but before user. The tests for
dynamicconfig depend on HGRCPATH being loaded when load_dynamic runs, so let's
move HGRCPATH loading up to load_system.

Reviewed By: quark-zju

Differential Revision: D22712627

fbshipit-source-id: 91175d9d7f85b9392ffea4af815a4facebbfe7c1
2020-08-16 16:56:00 -07:00
Durham Goode
ef9ba19dc5 configs: make Options clonable
Summary:
In a future diff we'll allow an outside caller to pass an Options down
to configparsers::hg::load() so that filters can be applied during loading. Inside
hg::load() we need to use the options multiple times with different values, so
let's make Options clonable.

Reviewed By: quark-zju

Differential Revision: D22712626

fbshipit-source-id: 975145f38d35afe7d4a6c8e87071b0fb0ae74797
2020-08-16 16:55:59 -07:00
Durham Goode
0cea385252 configs: remove config from repo.rs API
Summary:
A future diff will move all dynamic and repo config loading to be in
configparser. As part of this, let's simplify the repo.rs API to not pass
configs around everywhere.

Reviewed By: quark-zju

Differential Revision: D22712628

fbshipit-source-id: 79f23991aa826ce8b4f7430b45d7702efdc6b982
2020-08-16 16:55:59 -07:00
Durham Goode
26564596a1 utils: add background process utility
Summary:
Similar to the Python runbgcommand (extutil.py), this is a Rust utility that runs a
detached background process in a cross platform way.

This will be used in a later diff to run dynamicconfig generation in the
background.

Reviewed By: quark-zju

Differential Revision: D22712629

fbshipit-source-id: a317465bf03c96d977a203678e2bef13ce57cc12
2020-08-16 16:55:59 -07:00
Durham Goode
0b123ba41d configs: move Rust dynamicconfig generation into configparser::hg
Summary:
As part of moving all hg config loading and generation logic into Rust,
let's move the config generation logic from hgcommands and pyconfigparser to
configparser, unifying them at the same time.

Future diffs will move config loading in as well.

Reviewed By: quark-zju

Differential Revision: D22590208

fbshipit-source-id: d1760c404a6a5c57347df30713c20de55cfdb9a4
2020-08-16 16:55:59 -07:00
Durham Goode
7ff28d3e1c configs: move dynamicconfig into configparser
Summary:
A future diff will unify all config loading into configparser::hg, but
to do so we need dynamicconfig to live in configparser, so it can load
dynamicconfigs. Let's move everything in.

Reviewed By: quark-zju

Differential Revision: D22587237

fbshipit-source-id: 5613094175b6e1597aa113ee3e6d92ce7ec79f6d
2020-08-16 16:55:59 -07:00
Durham Goode
a40331be8d configs: unify system+user config loading into pure rust layer
Summary:
We had two spots that loaded system and user configs, one in the
pyconfigparser layer, and one in the pure rust config layer. In an upcoming diff
I'd like to move dynamicconfig loading down into the pure rust layer, so let's
unify these.

Reviewed By: quark-zju

Differential Revision: D22585554

fbshipit-source-id: 0cea7801ae1d5a3a3c12b80ee23b37f9e690e2bc
2020-08-16 16:55:59 -07:00
Durham Goode
3129f032a4 contentstore: make history rotatelog size configurable
Summary:
In a future diff we'll increase the size of the rotatelog temporarily
during clones. To do so we need it to be configurable.

Reviewed By: quark-zju

Differential Revision: D23089539

fbshipit-source-id: ebfc3beaf3c0fe5b01b87d97c19455b0a24afa72
2020-08-16 16:44:16 -07:00
Durham Goode
b821ab3766 contentstore: make data rotatelog size configurable
Summary:
In a future diff we'll increase the size of the rotatelog temporarily
during clones. To do so we need it to be configurable.

Reviewed By: quark-zju

Differential Revision: D23089541

fbshipit-source-id: 5010e417a83a2611283322f1dbb7023f4286f503
2020-08-16 16:44:16 -07:00
Durham Goode
76d3d46837 revisionstore: remove from_path from LocalStore
Summary:
from_path is an awkward constructor because it doesn't pass any other
information, like a config object. It also requires that the constructor be very
generic across all the stores. Right now it's only needed for pack files, so
let's move it to its own trait that is limited to pack files.

This will allow us to make the indexedlog store constructors more versatile in a
later diff. Once we get rid of pack files we can delete the StoreFromPath trait
entirely.

Reviewed By: xavierd

Differential Revision: D23089542

fbshipit-source-id: ea6c50853e5d5390a029002ef5d15c74fe41fe69
2020-08-16 16:44:16 -07:00
Jun Wu
2db783bed8 revlogindex: make parent_revs fallible
Summary: If parent_revs gets an out-of-bound rev, it should fail.

Reviewed By: sfilipco

Differential Revision: D23036071

fbshipit-source-id: 7fae0fd5adf07ac3c933a29d7d06289d8d740c60
2020-08-14 22:00:26 -07:00
Jun Wu
0f838d7abf revlogindex: fix \0 header handling
Summary:
If the text starts with `\0`, the `\0` should be considered as part of the
uncompressed text instead of a separated header.
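
The rule can be sketched as a small classifier over the first byte of a stored chunk. This follows the revlog convention as commonly described (`\0` means stored as-is, `u` is a one-byte "uncompressed" marker, `x` starts a zlib stream); the `Chunk` enum and `classify_chunk` function are hypothetical names, and decompression itself is left out.

```rust
/// Hypothetical result of inspecting a stored revlog chunk.
#[derive(Debug, PartialEq)]
enum Chunk<'a> {
    /// Uncompressed text, ready to use.
    Plain(&'a [u8]),
    /// zlib-compressed bytes; the leading `x` is part of the zlib stream.
    Zlib(&'a [u8]),
    /// Unrecognized header byte.
    Unknown(u8),
}

fn classify_chunk(data: &[u8]) -> Chunk<'_> {
    match data.first().copied() {
        // Empty chunk: empty text.
        None => Chunk::Plain(data),
        // Leading `\0`: the byte belongs to the text and is not stripped.
        Some(b'\0') => Chunk::Plain(data),
        // Leading `u`: a separate one-byte header, stripped from the text.
        Some(b'u') => Chunk::Plain(&data[1..]),
        Some(b'x') => Chunk::Zlib(data),
        Some(other) => Chunk::Unknown(other),
    }
}
```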

Reviewed By: sfilipco

Differential Revision: D22970575

fbshipit-source-id: 49e8a1a1ea42a3c4cf153b70f59fd0558dcfcede
2020-08-14 22:00:26 -07:00
Jun Wu
54a1a620d0 revlogindex: fix parent handling
Summary:
The parent handling is unsound when there are revs that are skipped. Fix it by
reasoning about commit hashes for parents.

Reviewed By: sfilipco

Differential Revision: D23036078

fbshipit-source-id: 8f710171471025cd48b3bd8f6ea57c68330eb8b8
2020-08-14 22:00:26 -07:00
Xavier Deguillard
e0ec7d8896 fsinfo: fix cargo test
Summary:
Somehow `make local` doesn't complain about missing features, but `cargo test`
does.

Reviewed By: singhsrb

Differential Revision: D23132496

fbshipit-source-id: cdbfe1faf194d61d86493a760e45fd38087d2956
2020-08-14 13:38:41 -07:00
Durham Goode
661d02d6d5 http_client: disable ssl revocation checking on Windows
Summary:
Windows defaults to checking a revocation server for ssl certs. Inside
our datacenter it can't reach the server and fails. We don't have this on for
any other platforms, so let's disable it.

Reviewed By: sfilipco

Differential Revision: D23121739

fbshipit-source-id: 4d44d2a065bf340a8f74332553deb09a9c61be9b
2020-08-13 23:17:28 -07:00
Meyer Jacobs
b9ce375f36 edenapi: Split DataEntry into FileEntry and TreeEntry
Summary:
The primary change is in `eden/scm/lib/edenapi/types`:
* Split `DataEntry` into `FileEntry` and `TreeEntry`.
* Split `DataError` into `FileError` and `TreeError`. Remove `Redacted` error variant from `TreeError` and `MaybeHybridManifest` error variant from `FileError`.
* Split `DataRequest`, `DataResponse` into appropriate File and Tree types.
* Refactor `data.rs` into `file.rs` and `tree.rs`.
* Lift `InvalidHgId` error, used by both File and Tree, into `lib.rs`.
* Bugfix: change `MaybeHybridManifest` to be returned only for hash mismatches with empty paths, to match documented behavior.

Most of the remaining changes are straightforward fallout of this split. Notable changes include:
* `eden/scm/lib/edenapi/tools/read_res`: I've split the "data" commands into "file" and "tree", but I've left the identical arguments sharing the same argument structs. These can be refactored later if / when they diverge.
* `eden/scm/lib/types/src/hgid.rs`: Moved `compute_hgid` from `eden/scm/lib/edenapi/types/src/data.rs` to a new `from_content` constructor on the `HgId` struct.
* `eden/scm/lib/revisionstore/src/datastore.rs`: Split `add_entry` method on `HgIdMutableDeltaStore` trait into `add_file` and `add_tree` methods.
* `eden/scm/lib/revisionstore/src/edenapi`
    * `mod.rs`: Split `prefetch` method on `EdenApiStoreKind` into `prefetch_files` and `prefetch_trees`, which are given a default implementation that fails with `unimplemented!`.
    * `data.rs`: Replace blanket trait implementations for `EdenApiDataStore<T>` with specific implementations for `EdenApiDataStore<File>` and `EdenApiDataStore<Tree>` which call the appropriate fetch and add functions.
    * `data.rs` `test_get_*`: Replace dummy hashes with real hashes. These tests were only passing due to the hash mismatches (incorrectly) being considered `MaybeHybridManifest` errors, and allowed to pass.

Reviewed By: kulshrax

Differential Revision: D22958373

fbshipit-source-id: 788baaad4d9be20686d527f819a7342678740bc3
2020-08-13 10:01:40 -07:00
Jeremy Sze Wei Teo
43425f1116 Revert D22992103: hgcommands: add debugfsync
Differential Revision:
D22992103 (f6d086d13b)

Original commit changeset: b5503e498d52

fbshipit-source-id: ad8f0d9c0bba1d07edb0aebca052da10c0f8e59c
2020-08-12 19:25:24 -07:00
Jun Wu
f6d086d13b hgcommands: add debugfsync
Summary:
The `debugfsync` command calls fsync on newly modified files in svfs.
Right now it only includes locations that we know have constant number
of files.

The fsync logic is put in a separate crate to avoid slow compiles.

Reviewed By: DurhamG

Differential Revision: D22992103

fbshipit-source-id: b5503e498d5216d4ba19701ecd5582387e4f45f5
2020-08-12 18:33:52 -07:00
Jun Wu
3ee967c003 clidispatch: add repo.store_path API
Summary: This allows callsites to get access to the storage.

Reviewed By: DurhamG

Differential Revision: D22992104

fbshipit-source-id: c72fa313be1468170c9728d3856f822bb6385dc8
2020-08-12 18:33:52 -07:00
Jun Wu
8ca7ab1c5a hgcommands: move debug commands to individual files
Summary:
This makes the main command table cleaner.

I dropped the `indexedlogrepair` command as it cannot rebuild indexes. `hg
doctor` is a better replacement. Some debug commands are renamed so they
no longer have `-` in the command name.

Reviewed By: DurhamG

Differential Revision: D22992107

fbshipit-source-id: f65d74e36fb971e592ad0cc8be9a94e245c39662
2020-08-12 18:33:52 -07:00
Jun Wu
bcfa8e5676 hgcommands: move version to a module
Summary: Move some native commands to independent modules.

Reviewed By: DurhamG

Differential Revision: D22992106

fbshipit-source-id: cf7751418d19d54d9dd89d9d0f79851ac11879c3
2020-08-12 18:33:52 -07:00
Jun Wu
8c51e81c97 hgcommands: move root to a module
Summary: Move some native commands to independent modules.

Reviewed By: DurhamG

Differential Revision: D22992105

fbshipit-source-id: e4fd8db3f0d6f9d2ab5be862f6d9469da7d15a93
2020-08-12 18:33:52 -07:00
Jun Wu
896671cefb hgcommands: define a macro to register command from modules
Summary:
If every command lives in its own module, then we can define the "module" interface:

- run(...): run the command
- doc(): the help text
- name(): command name, with aliases

Then the macro would make command registration look simpler.

This diff changes `status` to use the pattern as an example.
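
A rough sketch of the pattern, assuming a simplified `run` signature. The `Command` struct, the `commands!` macro name, and the module bodies are hypothetical and only illustrate the idea; the real hgcommands macro may look quite different.

```rust
/// One registered command: its name (with aliases), help text, and entry point.
pub struct Command {
    pub name: &'static str,
    pub doc: &'static str,
    pub run: fn(&[String]) -> i32,
}

/// Hypothetical macro: build the command table from modules that each
/// expose `name()`, `doc()` and `run(...)`.
macro_rules! commands {
    ($($module:ident),* $(,)?) => {
        vec![$(Command {
            name: $module::name(),
            doc: $module::doc(),
            run: $module::run,
        }),*]
    };
}

mod status {
    pub fn name() -> &'static str { "status|st" }
    pub fn doc() -> &'static str { "show changed files in the working directory" }
    pub fn run(_args: &[String]) -> i32 { 0 }
}

mod root {
    pub fn name() -> &'static str { "root" }
    pub fn doc() -> &'static str { "print the root of the current repository" }
    pub fn run(_args: &[String]) -> i32 { 0 }
}

/// Registration is now one short line per module.
pub fn table() -> Vec<Command> {
    commands![status, root]
}
```

With this shape, adding a command means adding a module and one identifier in the macro call.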

Reviewed By: DurhamG

Differential Revision: D22992109

fbshipit-source-id: eaf589863092ec2eb1f8c24c1c7e425492fe1e3a
2020-08-12 18:33:52 -07:00
Jun Wu
757daa5eaf hgcommands: move commands to a directory
Summary:
As the number of commands grows, it starts making sense to move them to
individual files. Let's create a directory for them.

Reviewed By: DurhamG

Differential Revision: D22992108

fbshipit-source-id: a0556be602b832579a8e027342d5b86d9d84d257
2020-08-12 18:33:51 -07:00
Xavier Deguillard
2d3370dca4 fsinfo: recognize EdenFS mounts on Windows
Summary:
EdenFS on Windows is a bit weird as ProjectedFS is implemented as a filter
driver that adds reparse points to all the files/directories to get notified of
filesystem operations on them. It then hides these reparse points from the
outside which means that the dwAttributes of a file in EdenFS will not claim
that a reparse point is attached to it. On top of this, newly created
files/directories won't have any reparse points attached to them, until they
start being tracked by EdenFS.

While the first issue can be solved by always querying the reparse tags, I'm
not entirely sure how to solve the second one. That second issue causes
Mercurial to always try to create hardlinks in the .hg directory, while it shouldn't.

Reviewed By: DurhamG

Differential Revision: D22937788

fbshipit-source-id: 5d90cd37d40858ed60103ff2d17c2cef16472b38
2020-08-12 15:47:49 -07:00
Stefan Filip
e06d9979f5 client: add commit revlog data endpoint
Summary: Client portion for the commit/revlog_data endpoint that was added to the server.

Reviewed By: kulshrax

Differential Revision: D23065989

fbshipit-source-id: 3115ad2b426daca22472e2106fcd293f3ccd70f3
2020-08-11 22:15:10 -07:00
Durham Goode
2e8915a653 revisionstore: set max memory footprint for data/history indexedlog
Summary:
When doing large clones or checkouts the amount of data we add to an
indexedlog can be many GB. On a laptop we don't have much memory, so let's set a
max memory threshold for the file data/history indexedlogs.

Reviewed By: xavierd

Differential Revision: D23046489

fbshipit-source-id: 43b7686b11fe05e4c074bcb02c475ebf8cf14ab1
2020-08-11 09:51:26 -07:00
Stefan Filip
2825193931 edenapi: add /commit/revlog_data endpoint
Summary:
Matches the `getcommitdata` SSH endpoint.
This is going to be used to remove the requirement that client repositories
need to have all commits locally.

Reviewed By: krallin

Differential Revision: D22979458

fbshipit-source-id: 75d7265daf4e51d3b32d76aeac12207f553f8f61
2020-08-11 01:54:14 -07:00
Meyer Jacobs
b9f3c9c692 taggederror: Introduce taggederror-util for more ergonomic error tagging for eden error types.
Summary:
Introduce taggederror-util, which provides a new trait `AnyhowEdenExt`, which provides a method `eden_metadata` for anyhow errors and results. This method works much like `AnyhowExt::common_metadata`, but additionally supports extracting default error metadata from known `Tagged` types which are listed explicitly in the method implementation.

Extend `FilteredAnyhow` to support a configuration "metadata function", which allows swapping out `eden_metadata` for the standard `common_metadata`.

Modify Rust dispatch and Python bindings to use `AnyhowEdenExt` for metadata extraction and printing.

Modify `intentional_error` to rely on `AnyhowEdenExt` for tagging (removes `.tagged` call, no tags will be visible if `AnyhowEdenExt` is not used).

Reviewed By: DurhamG

Differential Revision: D22927203

fbshipit-source-id: 04b36fdfaa24af591118acb9e418d1ed7ae33f91
2020-08-06 19:37:25 -07:00
Arun Kulshreshtha
f293577672 http_client: allow setting chunk size for async responses
Summary:
Add a `buffered()` method to `AsyncResponse`  allowing the user to specify the desired chunk size for the body stream.

(This was already used internally by `CborStream`; this just exposes it in the public interface.)

Reviewed By: quark-zju

Differential Revision: D22935891

fbshipit-source-id: e110e85bf9cb4c7923a8977ea4631ca1cc4cf4cb
2020-08-06 15:56:56 -07:00
Arun Kulshreshtha
6707c2fc3c http_client: rename cbor module
Summary: Rename the `cbor` module to `stream` to better indicate that it contains various stream combinators (not all of which are related to CBOR).

Reviewed By: quark-zju

Differential Revision: D22935892

fbshipit-source-id: 3f73aa707ab59c31717c1cf35995ad79946a15c9
2020-08-06 15:56:56 -07:00
Jun Wu
4b5833968a revlogindex: be Ctrl+C/SIGKILL safe
Summary:
This provides Ctrl+C/SIGKILL safety. It's needed because we no longer use the
Python transaction framework. If the write is incomplete, the revlog index
logical length will ensure new processes won't see incomplete data.

The length of revlog data is not tracked, as some unused data in it does not
really matter. Reading the revlog should be still fine.

Reviewed By: sfilipco

Differential Revision: D22914423

fbshipit-source-id: f2f446cde79c7270cbd1ef165f8707368a0a2990
2020-08-06 12:31:57 -07:00
Jun Wu
6fd7a2e582 dag: use concrete error types
Summary:
This is more complex than previous libraries, mainly because `dag` defines APIs
(traits) used by other code, which might raise error types that `dag` itself
does not care about. `BackendError::Other(anyhow::Error)` is currently used to
capture types that do not fit in `dag`'s predefined error types.
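
As an illustration of the catch-all variant, here is a std-only sketch. `Box<dyn Error + Send + Sync>` stands in for `anyhow::Error`, and the `Io` variant and `describe` helper are hypothetical; the real `dag` error types may differ.

```rust
use std::error::Error;
use std::fmt;
use std::io;

#[derive(Debug)]
pub enum BackendError {
    /// A predefined, concrete error kind (hypothetical example).
    Io(io::Error),
    /// Catch-all for errors raised by code behind the traits.
    /// The boxed error is a stand-in for `anyhow::Error`.
    Other(Box<dyn Error + Send + Sync>),
}

impl fmt::Display for BackendError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BackendError::Io(e) => write!(f, "io error: {}", e),
            BackendError::Other(e) => write!(f, "{}", e),
        }
    }
}

impl Error for BackendError {}

impl From<io::Error> for BackendError {
    fn from(e: io::Error) -> Self {
        BackendError::Io(e)
    }
}

/// Callers can match on the concrete variants at compile time.
pub fn describe(err: &BackendError) -> &'static str {
    match err {
        BackendError::Io(_) => "io",
        BackendError::Other(_) => "other",
    }
}
```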

Reviewed By: sfilipco

Differential Revision: D22883865

fbshipit-source-id: 3699e14775f335620eec28faa9a05c3cc750e1d1
2020-08-06 12:31:57 -07:00
Jun Wu
8d0f48c4da dag: rename some anyhow::Result to dag::Result
Summary:
Prefix some `Result` with `dag::Result`. Since `dag::Result` is just
`anyhow::Result` for now, this does not change anything but makes
it more compatible with upcoming changes.

Reviewed By: sfilipco

Differential Revision: D22883864

fbshipit-source-id: 95a26897ed026f1bb8000b7caddeb461dcaad0e7
2020-08-06 12:31:57 -07:00
Jun Wu
ff9c979b07 revlogindex: use concrete error types
Summary:
All dependencies of revlogindex have migrated to concrete error types.
Let's migrate revlogindex itself. This allows compile-time type checks
and makes the error returned by revlogindex APIs more predictable.

Reviewed By: sfilipco

Differential Revision: D22857554

fbshipit-source-id: 7d32599508ad682c6e9c827d4599e6ed0769899c
2020-08-06 12:31:57 -07:00
Jun Wu
78c05bb5e6 cpython-ext: extract io::Error translation to a function
Summary: This will be used later.

Reviewed By: sfilipco

Differential Revision: D22883863

fbshipit-source-id: 4f7ed4eb51d403f96e9d1aa1792062b4c55e3398
2020-08-06 12:31:57 -07:00
Jun Wu
af375c51b0 radixbuf: use concrete error types
Summary: The `radixbuf` crate already has its own concrete error type. Use it.

Reviewed By: sfilipco

Differential Revision: D22855450

fbshipit-source-id: 307a46ddd79b28a18ee779867ee1e604b531828a
2020-08-06 12:31:57 -07:00
Jun Wu
ddad0cf115 util: use concrete error types
Summary:
`util` as a low-level library should use concrete error types so callsites can
type check their type conversions.

Reviewed By: sfilipco

Differential Revision: D22855448

fbshipit-source-id: 37b3fce36f1ae82a9604ef8ac0dc22c02280ceb2
2020-08-06 12:31:56 -07:00
Jun Wu
f8385f3b83 lz4-pyframe: use concrete error types
Summary:
Change the `lz4-pyframe` library to use a concrete error type instead of trait
object. This would allow callsites to type check the error type.

Reviewed By: sfilipco

Differential Revision: D22855449

fbshipit-source-id: 3497b3e0bfb814302fee2f7297b35de8b8a916ed
2020-08-06 12:31:56 -07:00
Xavier Deguillard
d467764ae4 cmake: do not compile configparser
Summary: This is unused, no need to add to the build time.

Reviewed By: fanzeyi

Differential Revision: D22967814

fbshipit-source-id: 91a5ed9f03128947af9cb69bca62ed75b75e7e66
2020-08-06 09:00:20 -07:00
Durham Goode
18aa75381c lfs: replace reqwest with our curl based http_client
Summary:
We were experiencing hangs during lfs http fetches. We've seen similar
issues before when using Hyper, which Reqwest is based off of. Let's switch to
the new Curl-based http_client instead.

Note, this does get rid of the ability to obey hg config http proxy settings.
That said, they shouldn't be used much, and the http proxy environment variables
are respected by libcurl.

Reviewed By: xavierd

Differential Revision: D22935348

fbshipit-source-id: 1c61c04bbb4043e3bde592251f12bf846ab3afd4
2020-08-05 01:40:37 -07:00
Durham Goode
1066929001 http_client: prevent 100 http error codes for lfs fetches
Summary:
A future diff does LFS fetches via http_client. Curl has some default
behavior of adding the "Expect: 100-continue" header which causes the server to
send a 100 status code response after the headers have been received but before
the payload has been received. Since the http_client model only expects a
single response, this breaks the model and we're unable to read the second
response. Let's disable this behavior by manually setting the header to empty
string, which appears to be the official way to handle this.

Add it early so callers can overwrite it.

Reviewed By: quark-zju

Differential Revision: D22935349

fbshipit-source-id: 3009a5eb72f40584b846510f34f121e0e821a2bc
2020-08-05 01:40:36 -07:00
Xavier Deguillard
4f5455cfa5 silence some Rust warnings
Summary: This is causing noise when compiling, let's silence them.

Reviewed By: kulshrax

Differential Revision: D22925881

fbshipit-source-id: 10b48f1f05ff8931e23d07a9d7e9504339fceca0
2020-08-05 00:16:33 -07:00
Arun Kulshreshtha
0c5cecb42b http_client: implement Display for Method
Summary: title

Reviewed By: DurhamG

Differential Revision: D22936401

fbshipit-source-id: ed3a0f405d0fab288cd9d937ad390ef5395b72fb
2020-08-04 17:14:25 -07:00
Durham Goode
58f03baa85 repack: don't error if a pack is already deleted before repack
Summary:
One repack code path would return an error if the pack was already
deleted before the repack started. This is a fine situation, so let's just eat
the error and repack what can be repacked.

Reviewed By: xavierd

Differential Revision: D22873219

fbshipit-source-id: c716a5f0cd6106fd3464702753fb79df0bc7d13f
2020-08-04 17:11:54 -07:00
Stefan Filip
7392392a33 server: add commit/location_to_hash path
Summary:
Eden api endpoint for segmented changelog. It translates a path in the
graph to the hash corresponding to that commit that the path lands on.
It is expected that paths point to unique commits.

This change looks to go through the plumbing of getting the request from
the edenapi side through mononoke internals and to the segmented changelog
crate. The request used is an example. Follow up changes will look more at
what shape the request and response should have.

Reviewed By: kulshrax

Differential Revision: D22702016

fbshipit-source-id: 9615a0571f31a8819acd2b4dc548f49e36f44ab2
2020-08-04 11:22:39 -07:00
Xavier Deguillard
b0603e43cf revisionstore: only fetch LFS blob once
Summary:
During large prefetches (say, a clone), it is possible that 2 different
filenodes actually refer to the same file content, and thus share the same LFS
blob. The code would wrongly prefetch this blob twice which would then fail due
to the `obj_set` only containing one instance of this object.

Instead of using a Vec for the objects to prefetch, we can simply use a
`HashSet` which will take care of de-duplicating the objects.
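
A small sketch of the de-duplication, with hypothetical names: `ContentHash` is a toy stand-in for an LFS content hash, and `blobs_to_fetch` is not the actual revisionstore function.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for an LFS content hash.
type ContentHash = [u8; 4];

/// Collect the blobs to fetch for a batch of (filenode name, content hash)
/// pairs. Collecting into a `HashSet` keeps each blob only once, even when
/// several filenodes point at the same content.
pub fn blobs_to_fetch(filenodes: &[(&str, ContentHash)]) -> HashSet<ContentHash> {
    filenodes.iter().map(|(_, hash)| *hash).collect()
}
```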

Reviewed By: DurhamG

Differential Revision: D22903606

fbshipit-source-id: 4983555d2b16639051acbbb591ebb752d55acc2d
2020-08-03 20:49:13 -07:00
Xavier Deguillard
1873fc3dbe revisionstore: properly prefetch all LFS blobs
Summary:
There was a small but easy to miss mistake when prefetch was changed to return
the keys that couldn't be prefetched. For LFS pointers, the code would wrongly
return that the blob was fetched, which is misleading as the LFS blob isn't
actually downloaded. For LFS pointers, we need to translate them to their LFS
blob content hashes.

Reviewed By: DurhamG

Differential Revision: D22903607

fbshipit-source-id: e86592cd986498d9f4a574585eb92da695de2e27
2020-08-03 20:49:12 -07:00
Durham Goode
b71124ad8c indexedlog: allow defaulting to writing history to indexedlog
Summary:
An earlier diff, D21772132 (713fbeec24), added an option to default hgcache data store
writes to indexedlog but it only did it for data, not history. Let's also do it
for history.

Reviewed By: quark-zju

Differential Revision: D22870952

fbshipit-source-id: 649361b2d946359b9fbdd038867e1058077bd101
2020-07-31 19:49:46 -07:00
Jun Wu
cc80592783 dynamicconfig: make in_timeshard accept a range
Summary: This makes it a little bit easier to use.

Reviewed By: sfilipco

Differential Revision: D22853717

fbshipit-source-id: aa3c1ed2a9a2d1020a48a4493a644093d8b07e67
2020-07-31 13:49:47 -07:00
Durham Goode
3e0133e902 lfs: add timeout to lfs fetching
Summary:
We're seeing users report lfs fetching hanging for 24+ hours. Stack
traces seem to show it hanging on the lfs fetch. Let's read bytes off the wire
in smaller chunks and add a timeout to each read (default timeout is 10s).

Reviewed By: xavierd

Differential Revision: D22853074

fbshipit-source-id: 3cd9152c472acb1f643ba8c65473268e67d59505
2020-07-31 09:30:26 -07:00
generatedunixname89002005287564
070b9abf48 Daily arc lint --take RUSTFMT
Reviewed By: zertosh

Differential Revision: D22862880

fbshipit-source-id: cc2a30bb5345ffae1a117bb6220d6c2f4d9f73ba
2020-07-31 04:28:59 -07:00
Jun Wu
235a9306e1 revlogindex: support delta-ed content
Summary:
Although new changelog revlogs have not used deltas for years, early
revisions in our production changelog still use mpatch delta format
because they are stream-cloned.

Teach revlogindex to support them.

Reviewed By: sfilipco

Differential Revision: D22657204

fbshipit-source-id: 7aa3b76a9a6b184294432962d36e6a862c4fe371
2020-07-30 20:32:38 -07:00
Jun Wu
64d4f5743f dag: delegate reachable_root to inner implementations
Summary: Otherwise the default implementation will be used.

Reviewed By: sfilipco

Differential Revision: D22657206

fbshipit-source-id: dea31149efe41cb3d9e30b33c138e437dce8011e
2020-07-30 20:32:37 -07:00
Jun Wu
a36f77673e revlogindex: implement reachable_roots fast path
Summary:
The default reachable_roots implementation is good enough for segmented
changelog, but not efficient for revlogindex use-case.

Reviewed By: sfilipco

Differential Revision: D22657193

fbshipit-source-id: a81bc255d42d46c50e61fe954f027f1160dacb6c
2020-07-30 20:32:37 -07:00
Jun Wu
5f3f7e49d6 dag: add reachable_roots API
Summary:
I thought it was just `roots & (::heads)`. It is actually more complex than
that.

Reviewed By: sfilipco

Differential Revision: D22657201

fbshipit-source-id: bd0b49fc4cdd2c516384cf70c1c5f79af4da1342
2020-07-30 20:32:37 -07:00
Jun Wu
fcc78319a0 revlogindex: use dedicated error type for missing commits
Summary:
This replaces RustError that might happen during `addcommits`, and allow us to
handle it without having a stacktrace.

Reviewed By: DurhamG

Differential Revision: D22539564

fbshipit-source-id: 356814b9baf0b31528dfc92d62b0dcf352bc1e24
2020-07-30 20:32:33 -07:00
Jun Wu
c68d389d95 revlogindex: update DAG hints
Summary:
Follow up of D22638454.

This makes revlogindex mark its compatible DAG so "all()" fast paths can be used properly.

Reviewed By: sfilipco

Differential Revision: D22638459

fbshipit-source-id: 074e95b9fccbc486b69a947fec5172662e7dd3b7
2020-07-30 20:32:32 -07:00
Jun Wu
a2b44103bd dag: add fast path for IdLazySet::contains
Summary:
No need to exhaust the entire IdLazySet if there are hints.
This is important to make `small & lazy` fast.

Reviewed By: sfilipco

Differential Revision: D22638462

fbshipit-source-id: 63a71986e6e254769c42eb6250c042ea6aa5808b
2020-07-30 20:32:32 -07:00
Jun Wu
e3059699ee dag: cross-DAG set operations should use FULL and ANCESTORS hint carefully
Summary:
When multiple DAGs (ex. a local DAG and a commit-cloud DAG) are involved,
certain fast paths become unsound. Namely, the fast paths of the FULL hint
should check DAG compatibility. For example:

  localrepodag.all() & remotedag.all()

should not simply return `localrepodag.all()` or `remotedag.all()`.

Fix it by checking DAG pointers.

A StaticSet might be created without using a DAG, add an optimization
to change `all & static` to `static & all`. So StaticSet without DAG
wouldn't require full DAG scans when intersecting with other sets.

Reviewed By: sfilipco

Differential Revision: D22638454

fbshipit-source-id: 72396417e9c1238d5411829da8f16f2c6d4c2f3a
2020-07-30 20:32:32 -07:00
Jun Wu
34de6956f6 dag: improve fmt::Debug on sets
Summary:
Improve `fmt::Debug` so it fits better in the Rust and Python eco-system:
- Support Rust formatter flags. For example `{:#5.3?}`. `5` defines limit of a
  large set to show, `3` defines hex commit hash length. `#` specifies the
  alternate form.
- Show commit hashes together with integer Ids for IdStaticSet.
- Use HG rev range syntax (`a:b`) to represent ranges for IdStaticSet.
- Limit spans to show for IdStaticSet, similar to StaticSet.
- Show only 8 chars of a long hex commit hash by default.
- Minor renames like `dag` -> `spans`, `difference` -> `diff`.

Python bindings use `fmt::Debug` as `__repr__` and will be affected.

Reviewed By: sfilipco

Differential Revision: D22638455

fbshipit-source-id: 957784fec9c99c8fc5600b040d964ce5918e1bb4
2020-07-30 20:32:31 -07:00
Jun Wu
7c2dffb955 revlogindex: optimize set intersection with hints
Summary:
This makes intersection set stop early. It's useful to stop iteration on some
lazy sets. For example, the below `ancestors(tip) & span` or
`descendants(1) & span` sets can take seconds to calculate without this
optimization.

```
In [1]: cl.dag.ancestors([cl.tip()]) & cl.tonodes(bindings.dag.spans.unsaferange(len(cl)-10,len(cl)))
Out[1]: <and <lazy-id> <dag [...]>>

In [3]: %time len(cl.dag.ancestors([cl.tip()]) & cl.tonodes(bindings.dag.spans.unsaferange(len(cl)-10,len(cl))))
CPU times: user 364 µs, sys: 0 ns, total: 364 µs
Wall time: 362 µs

In [7]: %time len(cl.dag.descendants([repo[1].node()]) & cl.tonodes(bindings.dag.spans.unsaferange(0,100)))
CPU times: user 0 ns, sys: 574 µs, total: 574 µs
Wall time: 583 µs
```

Reviewed By: sfilipco

Differential Revision: D22638458

fbshipit-source-id: b9064ce2ff1aecc2d7d00025928dfcb3c0d78e0c
2020-07-30 20:32:31 -07:00
Jun Wu
d5d429a5c7 revlogindex: optimize with ANCESTORS hint
Summary:
Similar to the segmented changelog version using `ANCESTORS`. This makes
`heads(all())` calculate `heads_ancestors(all())` automatically and get
the speed-up.

Reviewed By: sfilipco

Differential Revision: D22638464

fbshipit-source-id: 014412f1c226925e50387f18c1282b3cb96d434b
2020-07-30 20:32:31 -07:00
Jun Wu
f5fb9fb09d revlogindex: optimize heads_ancestors
Summary:
Optimize it to not convert revs to `Vec<u32>`, and have a fast path to
initialize `states` with `Unspecified`. This makes it about 2x faster and matches
the C revlog `headrevs` performance when calculating `headsancestors(all())`:

```
In [2]: %timeit cl.index.clearcaches(); len(cl.index.headrevs())
10 loops, best of 3: 66.9 ms per loop

In [3]: %timeit len(cl.dageval(lambda: headsancestors(all())))
10 loops, best of 3: 64.9 ms per loop
```

Reviewed By: sfilipco

Differential Revision: D22638461

fbshipit-source-id: 965eb16e3a78ae02a65a8a44559f3a64c16f6884
2020-07-30 20:32:30 -07:00
Jun Wu
2d4bb1d7e3 revlogindex: fast path for parents
Summary:
Change `parents` from using the default implementation that returns `StaticSet`
of commit hashes, to a customized implementation that returns `IdStaticSet`.
This avoids unnecessary commit hash lookups, and makes `heads(all())` 30x
faster, matching `headsancestors(all())` (but is still 2x slower than the C
revlog index `headrevs` implementation).

Reviewed By: sfilipco

Differential Revision: D22638453

fbshipit-source-id: 4fef78080b990046b91fee110c48e36301d83b4f
2020-07-30 20:32:30 -07:00
Jun Wu
a02c93864f dag: add ANCESTORS hint
Summary:
The hint indicates a set `X` is equivalent to `ancestors(X)`.

This allows us to make `heads` use `heads_ancestors` (which is faster in
segmented changelog) automatically without affecting correctness. It also
makes special queries like `ancestors(all())` super cheap because it'll just
return `all()` as-is.
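
The idea can be sketched with a toy set type. Everything here (`ToySet`, `ToyDag`, the `ancestors_closed` flag) is hypothetical and much simpler than the real `dag` hints; it only shows how a hint lets `ancestors` skip the graph walk.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone, PartialEq)]
pub struct ToySet {
    pub ids: BTreeSet<u32>,
    /// Hint: this set is already equal to `ancestors(self)`.
    pub ancestors_closed: bool,
}

/// Toy DAG (an assumption): `parents[i]` lists the parents of id `i`.
pub struct ToyDag {
    pub parents: Vec<Vec<u32>>,
}

impl ToyDag {
    /// Every id in the graph; trivially closed under `ancestors`.
    pub fn all(&self) -> ToySet {
        ToySet {
            ids: (0..self.parents.len() as u32).collect(),
            ancestors_closed: true,
        }
    }

    pub fn ancestors(&self, set: &ToySet) -> ToySet {
        if set.ancestors_closed {
            // Fast path: the hint says there is nothing to add.
            return set.clone();
        }
        // Slow path: walk parent edges from every id in the set.
        let mut seen = BTreeSet::new();
        let mut stack: Vec<u32> = set.ids.iter().copied().collect();
        while let Some(id) = stack.pop() {
            if seen.insert(id) {
                stack.extend(self.parents[id as usize].iter().copied());
            }
        }
        ToySet { ids: seen, ancestors_closed: true }
    }
}
```

The result of the slow path carries the hint too, so a second `ancestors` call on it is free.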

Reviewed By: sfilipco

Differential Revision: D22638463

fbshipit-source-id: 44d9bbcbb0d7e2975a0c8322181c88daa1ba4e37
2020-07-30 20:32:30 -07:00
Jun Wu
3d9f195721 changelog: use Rust RevlogIndex for 'ancestor' revset function
Summary: This avoids depending on the C index if the Rust DAG is available.

Reviewed By: DurhamG

Differential Revision: D22519587

fbshipit-source-id: a89d91184feaeef6641d2b04353601297bf5d4d5
2020-07-30 20:00:41 -07:00
Liubov Dmitrieva
2c38313e9a remove dead code about secret tool
Summary:
Better Engineering: remove dead code about secret tool.

Secret tool is a FB specific tool (keychain like) and has been used to transfer OAuth token between
different devservers without user's involvement. We have migrated to certs on devservers, so it is not needed anymore.

Also, it is FB specific and doesn't make sense for open source either.

Reviewed By: mitrandir77

Differential Revision: D22827264

fbshipit-source-id: cd89168ad75ca041d2a0f18d63474dd1eaad483d
2020-07-30 06:10:18 -07:00
Xavier Deguillard
21cd242dcf revisionstore: add a fallback remote lfs store
Summary:
If the LFS server is down, we are going to retry fetching filenodes from the
Mercurial server directly, which is expected to not return a pointer.

Previously, this was achieved by adding a hack into `get_missing`, but since
the function is no longer called during prefetch calls, we cannot rely on it
anymore. Instead, we can wrap the regular remote store and translate all the
StoreKey::Content onto their corresponding hgid keys.

Reviewed By: DurhamG

Differential Revision: D22565604

fbshipit-source-id: 2532a1fc3dfd9ba5600957ed5cf905255cb5b3fd
2020-07-28 10:51:38 -07:00
Xavier Deguillard
7be69d5e65 revisionstore: write the lfs blob alongside the pointer
Summary:
The ContentStore code has a strong assumption that all the data fetched ends up
in the hgcache. Unfortunately, this assumption breaks somewhat if an LFS
pointer is in the local store but the blob isn't alongside it. This can happen
for instance when unbundling a bundle that contains a pointer, the pointer will
be written to the local store, but the blob would be fetched in the shared
store.

We can break this assumption a bit in the LFS store code by writing the fetched
blob alongside the pointer, this allows the `get` operation to find both the
pointer and the blob in the same store.

Reviewed By: DurhamG

Differential Revision: D22714708

fbshipit-source-id: 01aedf04d692c787b7cddb0f7a76828ea37dcf29
2020-07-28 10:51:38 -07:00
Xavier Deguillard
f22575657c revisionstore: search in local lfs store on prefetch
Summary:
The pointer for a blob might very well be in the local store, so let's search
in it.

Reviewed By: DurhamG

Differential Revision: D22565608

fbshipit-source-id: 925dd5718fc19e11a1ccaa0887bf5c477e85b2e5
2020-07-28 10:51:38 -07:00
Xavier Deguillard
e9b3f79b70 revisionstore: return missing keys from prefetch
Summary: Similarly to the changes made for `get`, the same can be applied to prefetch.

Reviewed By: DurhamG

Differential Revision: D22565609

fbshipit-source-id: 0fbc1a0086fa44593a6aaffb746ed36b3261040c
2020-07-28 10:51:38 -07:00
Xavier Deguillard
5af4e00fb4 revisionstore: remove c_api
Summary: It was once used by EdenFS, but is now dead code, no need to keep it around.

Reviewed By: singhsrb

Differential Revision: D22784582

fbshipit-source-id: d01cf5a99a010530166cabb0de55a5ea3c51c9c7
2020-07-27 23:24:03 -07:00
Xavier Deguillard
3a97764d70 revisionstore: add a new StoreResult type
Summary:
When using LFS, it's possible that a pointer may be present in the local
LfsStore, but the blob would only be in the shared one. Such scenario can
happen after an upload, when the blob is moved to the shared store for
instance. In this case, during a `get` call, the local LFS store won't be able
to find the blob and thus would return Ok(None), the shared LFS store would not
be able to find the pointer itself and would thus return Ok(None) too. If the
server is not aware of the file node itself, the `ContentStore::get` would also
return Ok(None), even though all the information is present locally.

The main reason why this is happening is due to the `get` call operating
primarily on file node based keys, and for content-based stores (like LFS),
this means that the translation layer needs to be present in the same store,
which may not always be the case. By allowing stores to return a
`StoreKey` when progress was made in finding the key we can effectively solve
the problem described above, the local store would translate the file node key
onto a content key, and the shared store would read the blob properly.
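
A simplified sketch of that flow. The type and variant names below, the string keys, and the `ToyLfsStore` layout are assumptions made for illustration, not the actual revisionstore definitions.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum StoreKey {
    /// File node based key.
    HgId(String),
    /// Content based key (for example an LFS blob hash).
    Content(String),
}

#[derive(Debug, PartialEq)]
pub enum StoreResult<T> {
    Found(T),
    /// Not found; carries the (possibly translated) key to try next.
    NotFound(StoreKey),
}

/// Toy store: pointers map file nodes to content hashes, blobs hold data.
pub struct ToyLfsStore {
    pointers: HashMap<String, String>,
    blobs: HashMap<String, Vec<u8>>,
}

impl ToyLfsStore {
    pub fn new(pointers: &[(&str, &str)], blobs: &[(&str, &str)]) -> Self {
        ToyLfsStore {
            pointers: pointers.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect(),
            blobs: blobs.iter().map(|(k, v)| (k.to_string(), v.as_bytes().to_vec())).collect(),
        }
    }

    pub fn get(&self, key: StoreKey) -> StoreResult<Vec<u8>> {
        let content = match key {
            StoreKey::HgId(id) => match self.pointers.get(&id) {
                Some(hash) => hash.clone(),
                None => return StoreResult::NotFound(StoreKey::HgId(id)),
            },
            StoreKey::Content(hash) => hash,
        };
        match self.blobs.get(&content) {
            Some(blob) => StoreResult::Found(blob.clone()),
            // Progress was made: hand back the translated content key.
            None => StoreResult::NotFound(StoreKey::Content(content)),
        }
    }
}

/// Try each store in order, carrying the translated key forward.
pub fn get_from_all(stores: &[ToyLfsStore], mut key: StoreKey) -> StoreResult<Vec<u8>> {
    for store in stores {
        match store.get(key) {
            StoreResult::Found(blob) => return StoreResult::Found(blob),
            StoreResult::NotFound(next) => key = next,
        }
    }
    StoreResult::NotFound(key)
}
```

Here a local store holding only the pointer and a shared store holding only the blob together resolve a file node key, which neither could do alone.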

Reviewed By: DurhamG

Differential Revision: D22565607

fbshipit-source-id: 94dd74a462526778f7a7e232a97b21211f95239f
2020-07-24 10:45:40 -07:00
Xavier Deguillard
51aca36721 asyncrevisionstore: remove it
Summary:
I thought I had removed that code a while back, it turns out I didn't, so let's
do it.

Reviewed By: singhsrb

Differential Revision: D22583556

fbshipit-source-id: b92644195994e0a83bdbcd8019253ea217474486
2020-07-23 22:49:21 -07:00
Meyer Jacobs
586ada8de6 taggederror: introduce bail macro replacement which allows tagging
Summary: This change introduces a bail macro that allows tagging errors using the syntax `bail!(fault=Fault::Request, "my normal {}", bail_args)` or `bail!(Fault::Request, "my normal {}", bail_args)`.
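
A simplified sketch of how such a macro could be written with `macro_rules!`. It covers only the `fault=` form and the untagged form; `TaggedError`, the `Fault` variants, and `check_limit` are hypothetical and do not reflect the real taggederror types.

```rust
/// Hypothetical fault categories.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Fault {
    Request,
    Internal,
    Dependency,
}

/// Hypothetical error type carrying an optional tag.
#[derive(Debug, PartialEq)]
pub struct TaggedError {
    pub fault: Option<Fault>,
    pub message: String,
}

macro_rules! bail {
    // Tagged form: bail!(fault = Fault::Request, "format {}", args)
    (fault = $fault:expr, $($arg:tt)+) => {
        return Err(TaggedError { fault: Some($fault), message: format!($($arg)+) })
    };
    // Untagged form: bail!("format {}", args)
    ($($arg:tt)+) => {
        return Err(TaggedError { fault: None, message: format!($($arg)+) })
    };
}

pub fn check_limit(limit: i64) -> Result<i64, TaggedError> {
    if limit < 0 {
        bail!(fault = Fault::Request, "invalid limit {}", limit);
    }
    if limit > 1000 {
        bail!("limit {} is too large", limit);
    }
    Ok(limit)
}
```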

Reviewed By: DurhamG

Differential Revision: D22646428

fbshipit-source-id: a6ec2940001b26db8ddc3a6d3620a1e17406c867
2020-07-22 15:37:14 -07:00
Jun Wu
7b7ae0bd09 hgcommits: implement strip_commits for testing
Summary:
About 64 tests depend on the revlog `strip` behavior. `strip` is not used in
production client-repos.  I tried to migrate them off `strip` but that seems
too much work for now. Instead let's just implement `strip` in the HgCommits
layer to be compatible to run the tests.

Reviewed By: DurhamG

Differential Revision: D22402195

fbshipit-source-id: f68d005e04690d8765d5268c698b6c96b981eb0a
2020-07-17 22:23:05 -07:00
Jun Wu
eb4c007145 changelog: use Rust RevlogIndex for partialmatch
Summary:
I dropped the special case of wdir handling. With the hope that we will handle
the virtual commits differently eventually (ex. drop special cases, insert real
commits to Rust DAG but do not flush them to disk, support multiple wdir
virtual commits, null is no longer an ancestor of every commit).

`test-listkeyspatterns.t` is changed because `0` no longer resolves to `null`.

Reviewed By: DurhamG

Differential Revision: D22368836

fbshipit-source-id: 14b9914506ef59bb69363b602d646ec89ce0d89a
2020-07-17 22:23:04 -07:00
Arun Kulshreshtha
6a00dd8e8a http_client: update client description
Reviewed By: quark-zju

Differential Revision: D22606295

fbshipit-source-id: e8fca2fd6b074d0991cde4a3cacb95bf0fe07877
2020-07-17 18:49:01 -07:00
Arun Kulshreshtha
cc6dd8ef03 edenapi: allow sending extra HTTP headers with each request
Summary: Allow the user to specify extra HTTP headers that should be sent with each EdenAPI request in the `edenapi.headers` config option. The field is expected to be a JSON object whose key-value pairs are used as header keys and values.

Reviewed By: quark-zju

Differential Revision: D22591870

fbshipit-source-id: ac1bb669270d667895554dcc5f7176d18736375c
2020-07-17 17:33:26 -07:00
Arun Kulshreshtha
6f2adbf2cc edenapi: include field names in malformed config errors
Summary: Include the name of bad config fields in the error message so the user can more easily fix the problem.

Reviewed By: quark-zju

Differential Revision: D22591871

fbshipit-source-id: e23e2c71e49e0458e7ea5c13e7feac3a990ead0c
2020-07-16 21:26:37 -07:00
Qinfan Wu
50a8016efc Update libra to latest revision
Summary: Update libra to latest revision.

Reviewed By: jsgf

Differential Revision: D22574392

fbshipit-source-id: e8b937d6957a159c3dc6f1809a042e74c6aa3729
2020-07-16 21:10:44 -07:00
Arun Kulshreshtha
0b7a612c0e edenapi: use CA cert bundle specified in hg config
Summary: Update `edenapi::Builder` to use the CA certificate bundle specified in the `[auth]` section of the user's config.

Reviewed By: quark-zju

Differential Revision: D22591034

fbshipit-source-id: 3a417adbf50ef7d2c538f4a032e54a038cbd282e
2020-07-16 19:48:36 -07:00
Arun Kulshreshtha
3613e4c840 auth: allow specifying a CA certificate bundle
Summary: Allow specifying a CA certificate bundle in the `auth` section of an `hgrc`. This is useful for testing with locally-built servers using self-signed certificates.

Reviewed By: quark-zju

Differential Revision: D22591045

fbshipit-source-id: 023fe006267b0b781a1af16a7505e188c008a8c0
2020-07-16 19:48:36 -07:00
Meyer Jacobs
e3b86cf77d debug: introduce binding layer for propagating error metadata to Python
Summary:
Implements a basic Rust-Python binding layer for error metadata propagation.

We introduce a new type, `TaggedExceptionData`, which carries CommonMetadata and the original (without metadata) error message for a Rust Anyhow error. This class is passed to RustError and can be accessed in Python (somewhat awkwardly) via indexing:
```
except error.RustError as e:
    fault = e.args[0].fault()
    typename = e.args[0].typename()
    message = e.args[0].message()
```
As far as I can tell, due to limitations in cpython-rs, this can't be made more ergonomic without introducing a Python shim around the Rust binding layer, which could adapt the cpython-rs classes to use whatever API we'd like.

Currently, anyhow errors that are not otherwise special-cased will be converted into RustError, with both the original error message and any attached metadata printed as shown below
```
  abort: intentional error for debugging with message 'intentional_error'
  error has type name taggederror::IntentionalError and fault None
```
We can of course re-raise the error if desired to maintain the previous behavior for handling a RustError.

If we'd like other, specialized Rust Python Exception types to carry metadata (such as `IndexedLogError`), we'll need to modify them to accept a `TaggedExceptionData` like `RustError`.

Renamed the "cause an error in pure rust command" function to `debugcauserusterror`, and instead used the name `debugthrowrustexception` for a command which causes an error in rust which is converted to a Python exception across the binding layer.

Introduced a simple integration test which exercises `debugthrowrustexception`.

Added a basic handler for RustError to scmutil.py

Reviewed By: DurhamG

Differential Revision: D22517796

fbshipit-source-id: 0409489243fe739a26958aad48f608890eb93aa0
2020-07-16 19:30:00 -07:00
Arun Kulshreshtha
bffb24216d revisionstore: move tokio runtime into EdenApiRemoteStore
Summary: Move the `tokio::Runtime` into `EdenApiRemoteStore` so that if initialization fails, we can propagate the error instead of panicking.

Reviewed By: xavierd

Differential Revision: D22564210

fbshipit-source-id: 9db1be99f2f77c6bb0f6e9dc445d624dc5990afe
2020-07-16 13:32:19 -07:00
Arun Kulshreshtha
6849666105 edenapi_types: add metadata field to DataEntry
Summary:
Add a metadata field to `DataEntry` containing a `revisionstore::Metadata` struct (which contains the object size and flags). The main purpose of this is to support LFS, which is indicated via a metadata flag.

Although this change affects the `DataEntry` struct which is serialized over the wire, version skew between the client and server should not break things since the field will automatically be populated with a default value if it is missing in the serialized response, and ignored if the client was built with an earlier version of the code without this field.

In practice, version skew isn't really a concern since this isn't used in production yet.

Reviewed By: quark-zju

Differential Revision: D22544195

fbshipit-source-id: 0af5c0565c17bdd61be5d346df008c92c5854e08
2020-07-16 13:32:19 -07:00
Arun Kulshreshtha
3327e15201 edenapi: percent-encode repo names
Summary: Instead of restricting the allowed characters in a repo name, allow any UTF-8 string. The string will be percent-encoded before being used in URLs.
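
To make the encoding step concrete, the following is a small std-only sketch of percent-encoding a repo name before it is placed in a URL path. It encodes every byte outside the RFC 3986 "unreserved" set; the exact character set and the URL layout used by EdenAPI are not stated here, so both `percent_encode` and `repo_url` are assumptions made for illustration.

```rust
/// Percent-encode every byte outside the RFC 3986 "unreserved" set
/// (ASCII letters, digits, '-', '.', '_', '~').
/// The set actually used by the client is an assumption here.
pub fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for &byte in input.as_bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            // Multi-byte UTF-8 characters are encoded one byte at a time.
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Hypothetical URL layout: <base>/<encoded repo>/<endpoint>.
pub fn repo_url(base: &str, repo: &str, endpoint: &str) -> String {
    format!(
        "{}/{}/{}",
        base.trim_end_matches('/'),
        percent_encode(repo),
        endpoint
    )
}
```

Because the slash is encoded as well, a repo name containing `/` cannot be mistaken for extra path segments.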

Reviewed By: quark-zju

Differential Revision: D22559830

fbshipit-source-id: f9caa51d263e06d424531e0947766f4fd37b035f
2020-07-16 13:32:19 -07:00
Durham Goode
905ee42654 configs: add support for hostname-based dynamicconfigs
Summary: Makes the hostname available for dynamicconfig conditions.

Reviewed By: quark-zju

Differential Revision: D22537946

fbshipit-source-id: 630ee833bb3ec00253d718b3d03bbb8b3d49afca
2020-07-16 09:07:54 -07:00
Durham Goode
76df783c93 configs: implement user sharding
Summary: Adds support for sharding based on user name.

Reviewed By: quark-zju

Differential Revision: D22537540

fbshipit-source-id: 962f9582c8947335dc9d9d29c500d8c09df69878
2020-07-16 09:07:53 -07:00
Arun Kulshreshtha
c7bffff0ff edenapi: allow all ASCII characters in repo names
Summary: We have several repos whose names contain various non-alphanumeric/underscore/hyphen characters, so we need to be more permissive about accepting repo names.

Reviewed By: quark-zju

Differential Revision: D22554846

fbshipit-source-id: e7bb030e0b8fb6aa275c119ba0aa540405b29186
2020-07-15 15:12:49 -07:00
Arun Kulshreshtha
8cc3939f35 revisionstore: do not swallow errors from EdenAPI stores
Summary:
Previously, the EdenAPI stores would not report errors returned from the remote store. The intention behind this pattern in other stores is to prevent `KeyError`s from aborting the operation since the local store might still have the key.

However, in the case of the EdenAPI store, EdenAPI will simply omit missing keys in its response rather than returning an error. Instead, any error returned by the EdenAPI store indicates a more fundamental problem (e.g., unable to reach the server, connection reset, etc) which should cause an abort and return the error.
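
The distinction between "key missing" and "request failed" can be sketched as follows. This is a simplified, std-only model, not the revisionstore code: the `RemoteStore` trait, `FetchError`, and the two test stores are hypothetical names introduced only for this example.

```rust
use std::collections::HashMap;

// Hypothetical error type for a failed remote request.
#[derive(Debug, PartialEq)]
pub enum FetchError {
    Unreachable(String),
}

// Hypothetical remote store: missing keys are simply absent from the
// returned map, while Err signals a more fundamental failure.
pub trait RemoteStore {
    fn fetch(&self, keys: &[&str]) -> Result<HashMap<String, Vec<u8>>, FetchError>;
}

pub struct InMemoryRemote {
    pub served: HashMap<String, Vec<u8>>,
}

impl RemoteStore for InMemoryRemote {
    fn fetch(&self, keys: &[&str]) -> Result<HashMap<String, Vec<u8>>, FetchError> {
        Ok(keys
            .iter()
            .filter_map(|k| self.served.get(*k).map(|v| ((*k).to_string(), v.clone())))
            .collect())
    }
}

pub struct OfflineRemote;

impl RemoteStore for OfflineRemote {
    fn fetch(&self, _keys: &[&str]) -> Result<HashMap<String, Vec<u8>>, FetchError> {
        Err(FetchError::Unreachable("connection reset".to_string()))
    }
}

/// Look up `key` locally first, then remotely.
/// Ok(None) means "not found anywhere"; Err means the remote call failed.
pub fn get(
    local: &HashMap<String, Vec<u8>>,
    remote: &dyn RemoteStore,
    key: &str,
) -> Result<Option<Vec<u8>>, FetchError> {
    if let Some(data) = local.get(key) {
        return Ok(Some(data.clone()));
    }
    // Propagate the error with `?` instead of mapping it to "not found".
    let mut fetched = remote.fetch(&[key])?;
    Ok(fetched.remove(key))
}
```

With this shape, a caller sees `Ok(None)` for a key the server omitted and an `Err` when the server could not be reached, so the two cases are no longer conflated.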

Reviewed By: quark-zju

Differential Revision: D22544031

fbshipit-source-id: e01e8d88b75e46dcebd2eef5203e3a0edde69fc7
2020-07-15 15:07:51 -07:00
Arun Kulshreshtha
e9a291438b edenapi: add limit option to read_res
Summary: When working with large CBOR responses, it is sometimes useful to limit processing to the first N entries to prevent the operation from taking a long time. This diff adds an option to the `read_res` tool to only look at the first N entries in a data or history response.

Reviewed By: quark-zju

Differential Revision: D22544451

fbshipit-source-id: 5e8e2c7212aa3b315a25bd4cf9273009a5e43f72
2020-07-15 13:19:16 -07:00
Arun Kulshreshtha
165f387df5 backingstore: do not require remotefilelog.reponame to be set
Summary: Some repos do not have `remotefilelog.reponame` set, so this shouldn't be a required config item.

Reviewed By: fanzeyi

Differential Revision: D22553141

fbshipit-source-id: a0fe9c289a1a32650572a4c123cda60af90e79ec
2020-07-15 12:14:00 -07:00
Meyer Jacobs
4ccbd119d7 debug: introduce error classification and metadata propagation
Summary:
Introduce a new Rust library, taggederror, which contains utilities for attaching metadata to errors. The library provides two main ways to attach metadata to an error: the TaggedError wrapper type and the AnyhowExt trait methods. It also provides a struct, CommonMetadata, which contains all the metadata types introduced by taggederror (fault, transience, category, and typename); these can also be attached individually (and the same pattern can be used to attach other metadata).

Introduce a new native rust command, debugthrowrustexception, which causes the command to return an error, with some attached metadata.

Modify hg rust native command dispatch error handling to use debug formatter to print anyhow::Error errors. This will print out the source chain, contexts, and backtrace if available, which will cause the metadata we attach as a wrapper error or context to be printed.

Reviewed By: DurhamG

Differential Revision: D22420941

fbshipit-source-id: d38c5a10b686d86b69a2c0a19f5bcbf4ca24dff6
2020-07-15 10:03:10 -07:00
Durham Goode
28ddd1d1cc configs: add hg debugdynamicconfig --canary devvmXXX.prnY support
Summary:
Previously you could only canary locally on a devserver by setting an
environment variable. Let's add a --canary flag to debugdynamicconfig that
accepts a host.  Hg will ssh to that host and run the configerator cli to grab
the canaried config from that host.

Reviewed By: quark-zju

Differential Revision: D22535509

fbshipit-source-id: af1c21d8402c4e729769e50388d913bf52b66b89
2020-07-15 01:14:30 -07:00
Durham Goode
789d2c24fb cliparser: add support for Option<String> types
Summary:
Previously we had no way of specifying an optional string flag. This
adds support.

I considered making the implementation more generic, so it'd support
Option<i64> and potentially Option<bool> but it introduced some complexity and
didn't seem worth the effort for now.

Reviewed By: quark-zju

Differential Revision: D22535511

fbshipit-source-id: 04d7b5419ca7ae44a9aeff1a5cea2c3043d80042
2020-07-15 01:14:30 -07:00
Viet Hung Nguyen
78e3864869 xdiff: renamed third-party xdiff functions
Summary:
Follow up on this diff: D22432330 (b7817ffbd8)

Renamed xdiff functions to avoid linking issues when using both libgit2-sys and xdiff.

Reviewed By: farnz

Differential Revision: D22511368

fbshipit-source-id: e4be20e3112a8e8829298d5748657e9bdbde8588
2020-07-14 03:46:04 -07:00
Arun Kulshreshtha
9a536eb3b0 revisionstore: Add EdenApiHistoryStore
Summary: Add an EdenAPI-backed history store. Notably, thanks to the strongly-typed remote store design from the previous diff, it is not possible to construct an `EdenApiHistoryStore` for trees, even when the underlying remote store is behind a trait object. (This is because EdenAPI does not support fetching history for trees.)

Reviewed By: quark-zju

Differential Revision: D22492162

fbshipit-source-id: 23f1393919c4e8ac0918d2009a16f482d90df15c
2020-07-13 17:35:31 -07:00
Arun Kulshreshtha
670ed17ba6 revisionstore: add EdenApiFileStore and EdenApiTreeStore
Summary: Reimplement `EdenApiHgIdRemoteStore` as `EdenApiRemoteStore<T>`, where `T` is a marker type indicating whether this store fetches files or trees. This allows working with the stores in a more strongly-typed way, and avoids having to check what kind of store this is at runtime when fetching data.
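
The marker-type pattern can be sketched roughly as below. Everything here is illustrative: `TypedRemoteStore`, the `StoreKind` trait, the `File`/`Tree` markers, and the endpoint strings are hypothetical and not the actual revisionstore definitions.

```rust
use std::marker::PhantomData;

// Hypothetical zero-sized marker types.
pub struct File;
pub struct Tree;

// Hypothetical trait that ties each marker to its endpoint at compile time.
pub trait StoreKind {
    const ENDPOINT: &'static str;
}

impl StoreKind for File {
    const ENDPOINT: &'static str = "files";
}

impl StoreKind for Tree {
    const ENDPOINT: &'static str = "trees";
}

pub struct TypedRemoteStore<T> {
    repo: String,
    // No runtime data; only records which kind of store this is.
    _kind: PhantomData<T>,
}

impl<T: StoreKind> TypedRemoteStore<T> {
    pub fn new(repo: &str) -> Self {
        Self { repo: repo.to_string(), _kind: PhantomData }
    }

    /// Chosen by the type parameter, with no runtime check.
    pub fn fetch_path(&self) -> String {
        format!("{}/{}", self.repo, T::ENDPOINT)
    }
}

// Only the file-flavored store gets history support, so calling
// `history_path` on a `TypedRemoteStore<Tree>` fails to compile.
impl TypedRemoteStore<File> {
    pub fn history_path(&self) -> String {
        format!("{}/history", self.repo)
    }
}
```

Restricting an `impl` block to one marker is also one way to get the kind of compile-time guarantee that rules out a history store for trees.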

Reviewed By: quark-zju

Differential Revision: D22492160

fbshipit-source-id: e17556093fa9b81d2301f281da36d75a03e33c5e
2020-07-13 17:35:31 -07:00
Arun Kulshreshtha
73765b649b revisionstore: move edenapi module into directory
Summary: Move `src/edenapi.rs` to `src/edenapi/mod.rs` in anticipation of adding more files to this module.

Reviewed By: quark-zju

Differential Revision: D22492161

fbshipit-source-id: f6252ea9a9e32d94029b8e6e76be5d9d1754f63d
2020-07-13 17:35:31 -07:00
Durham Goode
11972bf57e configs: switch to auditing the specific list of known problematic configs
Summary:
Previously we would audit all configs and report them if the
dynamicconfig did not match the rc-file config. Now that dynamicconfigs are
widely deployed, let's switch this around to auditing only configs we know have
had issues. This will let us start adding new configs via dynamicconfigs instead
of via the legacy staticfiles and chef, before we've finished migrating all the
legacy configs over.

Reviewed By: quark-zju

Differential Revision: D22401865

fbshipit-source-id: 5c41c674d39c8113b2a40da61e020e8a33c39312
2020-07-13 08:53:18 -07:00
Durham Goode
6774dfe154 packs: flush history packs every 10 million adds
Summary:
We're seeing cases where cloning can take tens of GB of memory because
we pend all the history information in memory. Let's flush the history info
every 10 million adds to bound the memory usage.

10 million was chosen somewhat arbitrarily, but it results in pack files that
are 800MB, which corresponds roughly with 8GB of memory usage.

This requires updating repack to be aware that a single flush could produce
multiple packs. Note, since repack writes via this same path, it may result in
repack producing multiple pack files. In the degenerate case, repack could
produce as many pack files as were input (or more). If we set the
threshold high enough I think we'll be fine though. 800MB is probably
sufficient.
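
A minimal sketch of the bounded-buffer idea follows, assuming an in-memory stand-in for the pack writer. `BoundedPackWriter` and its methods are hypothetical; the real code writes pack files to disk and uses a threshold of 10 million rather than the small number used in testing.

```rust
/// Hypothetical in-memory stand-in for a mutable history pack.
pub struct BoundedPackWriter {
    max_pending: usize,
    pending: Vec<String>,
    finished: Vec<Vec<String>>,
}

impl BoundedPackWriter {
    pub fn new(max_pending: usize) -> Self {
        assert!(max_pending > 0, "threshold must be positive");
        Self { max_pending, pending: Vec::new(), finished: Vec::new() }
    }

    /// Buffer one entry; seal a pack as soon as the threshold is reached
    /// so that the pending buffer never exceeds `max_pending` entries.
    pub fn add(&mut self, entry: &str) {
        self.pending.push(entry.to_string());
        if self.pending.len() >= self.max_pending {
            self.seal_pending();
        }
    }

    fn seal_pending(&mut self) {
        if !self.pending.is_empty() {
            let pack = std::mem::take(&mut self.pending);
            self.finished.push(pack);
        }
    }

    /// Returns every pack produced since the last flush.
    /// Callers (such as repack) must handle more than one pack.
    pub fn flush(&mut self) -> Vec<Vec<String>> {
        self.seal_pending();
        std::mem::take(&mut self.finished)
    }
}
```

The key API consequence is the return type of `flush`: it yields a list of packs rather than a single one, which is why callers need to be updated.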

Reviewed By: xavierd

Differential Revision: D22438569

fbshipit-source-id: 425d5d3b7999b81e44d1dbe1f2a4ea453ab6ca4f
2020-07-13 08:10:14 -07:00
Arun Kulshreshtha
4408e43a22 edenapi_types: fix comments in json.rs
Reviewed By: quark-zju

Differential Revision: D22486837

fbshipit-source-id: 67d026df631b027d7b94e526fc4386c5e064b85e
2020-07-10 21:38:02 -07:00
Arun Kulshreshtha
6b67d820bd auth: remove use of unwrap
Reviewed By: quark-zju

Differential Revision: D22467292

fbshipit-source-id: d645d437a3dc80b1a7f29841067aa05b0e48df17
2020-07-09 19:05:55 -07:00
Arun Kulshreshtha
14a7fe636f cpython-ext: Add ExtractInnerRef trait
Summary: Per comments on D22429347, add a new `ExtractInnerRef` trait that is similar to `ExtractInner`, but returns a reference to the underlying value. A default implementation is provided for types whose inner value is `Clone + 'static`, so in practice most types will only need to implement `ExtractInnerRef`, whereas the callsite may choose whether it needs a reference or an owned value.
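
One way to picture the relationship between the two traits is the sketch below. It omits the Python token that the real cpython-ext methods presumably take, and `PyBox` is a hypothetical wrapper type, so treat the signatures as assumptions rather than the actual API.

```rust
// Simplified trait shapes; the real signatures are assumed to differ.
pub trait ExtractInnerRef {
    type Inner;
    fn extract_inner_ref(&self) -> &Self::Inner;
}

pub trait ExtractInner {
    type Inner;
    fn extract_inner(&self) -> Self::Inner;
}

// Blanket implementation: any type that can hand out a reference to a
// cloneable inner value also gets an owned extraction for free.
impl<T> ExtractInner for T
where
    T: ExtractInnerRef,
    <T as ExtractInnerRef>::Inner: Clone + 'static,
{
    type Inner = <T as ExtractInnerRef>::Inner;

    fn extract_inner(&self) -> <T as ExtractInnerRef>::Inner {
        self.extract_inner_ref().clone()
    }
}

// Hypothetical wrapper type that implements only the reference trait.
pub struct PyBox {
    pub value: String,
}

impl ExtractInnerRef for PyBox {
    type Inner = String;

    fn extract_inner_ref(&self) -> &String {
        &self.value
    }
}
```

Here `PyBox` implements only `ExtractInnerRef`, yet a call site can use either `extract_inner_ref()` to borrow or `extract_inner()` to get an owned clone.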

Reviewed By: quark-zju

Differential Revision: D22464158

fbshipit-source-id: 7b97329aedcddb0e51fd242b519e79eba2eed350
2020-07-09 19:05:55 -07:00
Arun Kulshreshtha
5cb7bdd3c0 edenapi: use EdenApiError as error type for StatsFuture
Summary: Ensure that all of the components of an EdenAPI response use the same error type.

Reviewed By: quark-zju

Differential Revision: D22443029

fbshipit-source-id: 3e00a8b83677beb5ef2d90630fe9b85760874186
2020-07-09 19:05:55 -07:00
Arun Kulshreshtha
cb16831e6d revisionstore: add add_entry method to HgIdMutableDeltaStore
Summary: Add an `add_entry` convenience method to `HgIdMutableDeltaStore`, similar to the one present in `HgIdMutableHistoryStore`.

Reviewed By: quark-zju

Differential Revision: D22443031

fbshipit-source-id: 84fdaae9fbd51e6f2df466b0441ec5f7ce6715f7
2020-07-09 19:05:55 -07:00