Commit Graph

740 Commits

Author SHA1 Message Date
Arun Kulshreshtha
4c1f11a751 types: fix mock node values
Summary: The mock values for the `Node` type are intended to have hashes that consist of a repeated digit (e.g., `1111111111111111111111111111111111111111`). However, since the bytes were specified using a single hex digit instead of two, the hashes were actually like `0101010101010101010101010101010101010101`. This diff fixes the values so they look as expected.

Reviewed By: quark-zju

Differential Revision: D14557546

fbshipit-source-id: 23651d70b9715d2fb77db162f689b87d9d43e5a2
2019-03-21 14:29:20 -07:00
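The single-digit versus two-digit mix-up fixed in 4c1f11a751 can be reproduced in a few lines. This is a minimal sketch, assuming a node hash is a 20-byte array; `to_hex`, `BEFORE` and `AFTER` are hypothetical names for illustration, not items from the `types` crate.

```rust
/// Hypothetical helper: render a 20-byte hash as 40 lowercase hex digits.
pub fn to_hex(bytes: &[u8; 20]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// A byte written with a single hex digit: `0x1` is the byte `0x01`.
pub const BEFORE: [u8; 20] = [0x1; 20];

/// A byte written with two hex digits: `0x11` repeats the digit as intended.
pub const AFTER: [u8; 20] = [0x11; 20];
```

Formatting `BEFORE` gives the alternating `0101...` pattern described in the summary, while `AFTER` gives the repeated digit.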
Stefan Filip
e3b9873beb types: correctly import lazy_static in cargo
Summary: Fixes cargo test.

Reviewed By: quark-zju

Differential Revision: D14546435

fbshipit-source-id: daae0035871202fa3d221e11b0ea66199ded39d2
2019-03-20 19:56:14 -07:00
Stefan Filip
06704f2db2 radixbuf: add ignore marker to documentation blocks
Summary:
https://doc.rust-lang.org/rustdoc/documentation-tests.html#syntax-reference

Rust will treat an indentation of 4 or more spaces as a code block and
attempt to run it as a documentation test.

Reviewed By: singhsrb

Differential Revision: D14543987

fbshipit-source-id: 92f78e9e052befba0bd3eea80ac171f651f2fced
2019-03-20 19:56:14 -07:00
Stefan Filip
9c90eb2e9c radixbuf: cargo fmt
Summary: Formatting

Reviewed By: singhsrb

Differential Revision: D14543986

fbshipit-source-id: 5aae3a6166c315872102ab90d87d46d782682bc8
2019-03-20 19:56:14 -07:00
Stefan Filip
5e147ca3b7 commitcloudsubscriber: updates tests.rs
Summary:
The main issue is that cargo test fails, which prevents adding the sandcastle
configuration that would run these tests on CI.

Reviewed By: singhsrb

Differential Revision: D14543988

fbshipit-source-id: c299148cce01316fad872b9cf8e15dea6633da48
2019-03-20 19:56:14 -07:00
Stefan Filip
ca6052e70c revisionstore: update rand package
Summary: This fixes the build in test mode.

Differential Revision: D14533840

fbshipit-source-id: baa40261f17cdc8881d99a52a7f5cbd1ff66307a
2019-03-20 19:56:14 -07:00
Xavier Deguillard
bee64d1535 asyncpacks: make the Metadata mandatory when adding to an asyncdatapack
Summary:
Similarly to the previous change let's make the asyncmutabledatapack force the
Metadata to be present.

Reviewed By: sfilipco

Differential Revision: D14443510

fbshipit-source-id: 26f851e8d38297dcc37410f0df6a69083531d516
2019-03-19 16:24:50 -07:00
Xavier Deguillard
1c1b1fadc7 revisionstore: make the Metadata mandatory when adding data to a datapack
Summary:
Now that mutablepacks can only create v1 packfiles, we can force the Metadata to
not be optional. The main reason for doing this is to avoid issues where LFS
data is stored without its corresponding LFS flag. This can cause issues down
the line as LFS data will be interpreted as is, instead of being interpreted as
a pointer to the LFS blob.

Reviewed By: sfilipco

Differential Revision: D14443509

fbshipit-source-id: 9e7812017fc1356072278496406648f935024f92
2019-03-19 16:24:50 -07:00
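The effect of 1c1b1fadc7 on the call site can be sketched as follows. Everything here is a hypothetical stand-in: the `Metadata` fields, the `LFS_FLAG` value and the `MutablePack` type are assumptions made for illustration and are not taken from `revisionstore`.

```rust
/// Hypothetical stand-in for pack metadata; the field names are assumptions.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct Metadata {
    pub size: Option<u64>,
    pub flags: Option<u64>,
}

/// Assumed flag bit marking the payload as an LFS pointer.
pub const LFS_FLAG: u64 = 0x2000;

/// Hypothetical in-memory pack used only for this sketch.
#[derive(Default)]
pub struct MutablePack {
    entries: Vec<(Vec<u8>, Metadata)>,
}

impl MutablePack {
    /// `metadata` is a plain reference rather than an `Option`, so a caller
    /// cannot add data without stating whether it is an LFS pointer.
    pub fn add(&mut self, data: &[u8], metadata: &Metadata) {
        self.entries.push((data.to_vec(), *metadata));
    }

    /// Report whether the entry at `index` carries the assumed LFS flag.
    pub fn is_lfs(&self, index: usize) -> bool {
        self.entries[index]
            .1
            .flags
            .map_or(false, |flags| flags & LFS_FLAG != 0)
    }
}
```

With a mandatory argument, forgetting the flag becomes a compile-time error at the call site instead of silently storing a pointer as plain data.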
Xavier Deguillard
10373e38e2 revisionstore: Force mutabledatapack to be created with v1
Summary:
The v0 format doesn't support flags, such as the one marking the data as an LFS
pointer. Let's simply forbid creating v0 packs.

Reviewed By: quark-zju

Differential Revision: D14443512

fbshipit-source-id: 6ffa2e8fda2b2baba0aae53e749bc9248594a134
2019-03-19 16:24:50 -07:00
Xavier Deguillard
a3cee67af5 revisionstore: ignore more errors in repack_packs
Summary:
These last 2 errors are still considered fatal, but shouldn't be and are most
likely transient. Failing to open a packfile that was successfully opened
before can for instance happen when the file is removed by another process, or
if it somehow becomes corrupted. Failing the removal of the pack-file should no
longer be an issue, but if it fails, we can also ignore it with the reasoning
that the next repack will take care of it.

Reviewed By: sfilipco

Differential Revision: D14441288

fbshipit-source-id: 6c2758c2a88fd5d2d83b55defe3d263ee9f974a1
2019-03-19 16:19:14 -07:00
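The "skip and let the next repack handle it" policy from a3cee67af5 might look roughly like the sketch below. The `repack` function and its `open` callback are hypothetical; the callback stands in for opening a packfile so the example does not touch the filesystem.

```rust
use std::io;

/// Combine every pack that can be opened and report the ones that were skipped.
/// `open` is a hypothetical stand-in for opening a packfile by name.
pub fn repack<F>(names: &[&str], open: F) -> (Vec<u8>, Vec<String>)
where
    F: Fn(&str) -> io::Result<Vec<u8>>,
{
    let mut combined = Vec::new();
    let mut skipped = Vec::new();
    for &name in names {
        match open(name) {
            Ok(bytes) => combined.extend_from_slice(&bytes),
            // Treated as transient: the pack may have been removed or corrupted
            // in the meantime, and a later repack can pick it up again.
            Err(_) => skipped.push(name.to_string()),
        }
    }
    (combined, skipped)
}
```

Returning the skipped names is a choice made for this sketch so that callers can log them; the commit itself only states that the errors are ignored.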
Arun Kulshreshtha
e697e7d994 types: add types for batch responses
Summary: In order to send batch responses from the API server for data fetching operations, we need to define the types sent over the wire from within `/scm/hg/lib` so that we can deserialize them from within Mercurial. For ease of use, these types implement `IntoIterator` to allow easily iterating over the content (performing type conversions where needed).

Reviewed By: quark-zju

Differential Revision: D14517259

fbshipit-source-id: 5ee867d8386e6b99cb5b4ed96338aeb7eb6a3e44
2019-03-19 14:28:48 -07:00
Arun Kulshreshtha
2418ee548c types: add mock values for Node and Key
Summary: When writing tests, it is often desirable to be able to quickly get a dummy value for a `Node` hash or `Key`. Trying to construct one on the spot can be overly verbose, so let's define some mock values that can be used by tests. This is similar to what Mononoke does (e.g., https://fburl.com/p9u55uye).

Reviewed By: quark-zju

Differential Revision: D14517258

fbshipit-source-id: e3d4cdd60010f44ca681d7a87e6124fe79f8a4c6
2019-03-19 14:28:48 -07:00
Arun Kulshreshtha
ef3f3dea44 types: rename LooseHistoryEntry and PackHistoryEntry
Summary: `LooseHistoryEntry` and `PackHistoryEntry` aren't the best names for these types, since the latter is what most users should use, whereas the former should typically only be used for data transmission. As such, we should rename these to clarify the intent.

Differential Revision: D14512749

fbshipit-source-id: 5293df89766825077b2ba07224297b958bf46002
2019-03-18 19:50:19 -07:00
Xavier Deguillard
41d275ad36 revisionstore: ignore transient errors during repack
Summary:
Corrupted packfiles, or their removal in the background, could cause repack to
fail. Let's simply ignore these transient errors and continue repacking.

Reviewed By: DurhamG

Differential Revision: D14373901

fbshipit-source-id: afe88e89a3bd0d010459975abecb2fef7f8dff6f
2019-03-11 18:15:45 -07:00
Stefan Filip
2eb3c24956 configparser: upgrade crate to rust edition 2018
Summary: As requested in D14380687.

Differential Revision: D14393014

fbshipit-source-id: 365c713b6f5a106cef0b945e63f224b7651d0e8f
2019-03-11 15:32:55 -07:00
Stefan Filip
3f33e9f3e9 configparser: fix XDG config loading
Summary:
The specs for both XDG and Mercurial say that when the XDG_CONFIG_HOME variable
is not set, we should default to $HOME/.config.

Windows and macOS also have a designated config folder outside of the home directory.
The `dirs` crate provides consistent access to this folder. I see no harm in looking at config
folders across all operating systems.

Reviewed By: quark-zju

Differential Revision: D14380686

fbshipit-source-id: 5e5a9cd4694aaa49fbc526f4917dc4afdaeb9842
2019-03-11 15:32:55 -07:00
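The fallback rule described in 3f33e9f3e9 can be written as a small pure function. This is a sketch under stated assumptions: `xdg_config_dir` is a hypothetical name, and it takes the two variable values as arguments instead of reading the environment or using the `dirs` crate, purely so the rule is easy to test. Treating an empty `XDG_CONFIG_HOME` like an unset one follows the XDG Base Directory specification.

```rust
use std::path::PathBuf;

/// Resolve the XDG config directory from the values of `XDG_CONFIG_HOME`
/// and `HOME`. Returns `None` when neither yields a usable directory.
pub fn xdg_config_dir(xdg_config_home: Option<&str>, home: Option<&str>) -> Option<PathBuf> {
    match xdg_config_home {
        // An explicit, non-empty XDG_CONFIG_HOME wins.
        Some(dir) if !dir.is_empty() => Some(PathBuf::from(dir)),
        // Otherwise fall back to $HOME/.config.
        _ => home.map(|h| PathBuf::from(h).join(".config")),
    }
}
```

A caller would typically pass `std::env::var("XDG_CONFIG_HOME").ok().as_deref()` and the equivalent for `HOME`.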
Stefan Filip
403e1c7ad2 configparser: rustfmt on hg.rs
Summary: Automatic formatting using rustfmt

Differential Revision: D14380687

fbshipit-source-id: 5f7832419b0941c00e2399c902454862580988a4
2019-03-11 15:32:55 -07:00
Stefan Filip
41e75fce3f manifest: fix infinite loop when cursor encounters error
Summary:
quark-zju noticed in code review that `Cursor` could get into an infinite loop
when its results were collected into a Vec<_>. That is why I updated `Cursor`
to transition to `State::Done` when it encounters an error. Previously I felt
that users of `Cursor` would be empowered by having the ability to retry after
a failure.

Reviewed By: quark-zju

Differential Revision: D14393590

fbshipit-source-id: b3e0974ac15d62f3f17790229121c0dec3a6149e
2019-03-11 15:27:52 -07:00
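The failure mode behind 41e75fce3f can be shown with a much simplified iterator. The `Cursor` below is a hypothetical stand-in that walks a list of pre-computed results; only the `State::Done` transition on error mirrors what the commit describes.

```rust
#[derive(Debug, PartialEq)]
enum State {
    Running,
    Done,
}

/// Hypothetical, simplified cursor over steps that may fail.
pub struct Cursor {
    items: Vec<Result<u32, String>>,
    position: usize,
    state: State,
}

impl Cursor {
    pub fn new(items: Vec<Result<u32, String>>) -> Self {
        Cursor { items, position: 0, state: State::Running }
    }
}

impl Iterator for Cursor {
    type Item = Result<u32, String>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.state == State::Done {
            return None;
        }
        match self.items.get(self.position).cloned() {
            None => {
                self.state = State::Done;
                None
            }
            Some(Ok(value)) => {
                self.position += 1;
                Some(Ok(value))
            }
            Some(Err(error)) => {
                // Without this transition the position never advances, every
                // call yields the same error, and collecting into a Vec would
                // never terminate.
                self.state = State::Done;
                Some(Err(error))
            }
        }
    }
}
```

The trade-off is the one the summary mentions: callers lose the ability to retry through the same cursor, in exchange for an iterator that always terminates.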
Jun Wu
6de9bec782 config: stop %include from scanning directories
Summary:
`listdir` makes it more expensive to detect config changes. We no longer need
it. Therefore drop the feature.

Reviewed By: markbt

Differential Revision: D13875655

fbshipit-source-id: 147adce45021c7b028aada5c40f498c2fd58c7f5
2019-03-08 16:57:06 -08:00
Stefan Filip
b7dee64bd2 manifest: fix tree entry serialization
Summary:
Follow up from D14178264.

Two changes:
 * tree manifest entries must end with a line feed
 * `t` is the byte that flags a directory

Reviewed By: DurhamG

Differential Revision: D14368316

fbshipit-source-id: b0b46c876649b8f25bf0ecdb1266527dbeb33796
2019-03-07 17:51:39 -08:00
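The two rules in b7dee64bd2 (a trailing line feed and `t` as the directory flag) can be illustrated with a small serializer. The overall entry layout used here, name, NUL byte, hex node, optional flag, line feed, is an assumption made for this sketch and is not confirmed by the commit; `serialize_entry` is a hypothetical name.

```rust
/// Serialize one tree manifest entry.
/// Assumed layout: `<name>\0<hex node>[t]\n`.
pub fn serialize_entry(name: &str, hex_node: &str, is_directory: bool) -> Vec<u8> {
    let mut out = Vec::with_capacity(name.len() + hex_node.len() + 3);
    out.extend_from_slice(name.as_bytes());
    out.push(b'\0');
    out.extend_from_slice(hex_node.as_bytes());
    if is_directory {
        // `t` is the byte that flags a directory.
        out.push(b't');
    }
    // Every entry must end with a line feed.
    out.push(b'\n');
    out
}
```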
Stefan Filip
660992a50a manifest: add tree::diff(Tree, Tree)
Summary:
`manifest::tree::diff()` returns an iterator over the differences between two
tree manifests.

I chose a function that takes two parameters over a method on Tree because it
felt more clear to write `left` and `right`. Also because I am not sure how
iterators would be abstracted on a trait.

Differential Revision: D14347656

fbshipit-source-id: 537574070cd18b08c77b3cd1cf4cff38d77fbf81
2019-03-07 17:46:44 -08:00
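The shape of a free function `diff(left, right)` that returns an iterator, as described in 660992a50a, can be sketched over flat maps. This is a simplification: the real function walks tree manifests, whereas the sketch uses `BTreeMap<String, u64>` as a hypothetical stand-in, and `DiffEntry` is an invented type.

```rust
use std::collections::BTreeMap;

/// Hypothetical diff result for this sketch.
#[derive(Debug, PartialEq)]
pub enum DiffEntry {
    LeftOnly(String),
    RightOnly(String),
    Changed(String),
}

/// Iterate over the differences between two path-to-value maps.
pub fn diff<'a>(
    left: &'a BTreeMap<String, u64>,
    right: &'a BTreeMap<String, u64>,
) -> impl Iterator<Item = DiffEntry> + 'a {
    // Paths present on the left: either missing on the right or changed.
    let left_side = left.iter().filter_map(move |(path, l)| match right.get(path) {
        None => Some(DiffEntry::LeftOnly(path.clone())),
        Some(r) if r != l => Some(DiffEntry::Changed(path.clone())),
        Some(_) => None,
    });
    // Paths that exist only on the right.
    let right_side = right
        .keys()
        .filter(move |path| !left.contains_key(*path))
        .map(|path| DiffEntry::RightOnly(path.clone()));
    left_side.chain(right_side)
}
```

Returning `impl Iterator` from a free function sidesteps the question raised in the summary about how to express an iterator return type on a trait.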
Stefan Filip
2deb0e6e42 manifest: add tree::Cursor and Tree::files()
Summary:
Cursor is a utility for iterating over a manifest tree. In this diff it is used
to implement Files. In the future it will be used to do a diff between two tree
manifests.

I am not sure how to describe an iterator return value in the Manifest trait so
I kept the function on the tree only for now. Looking forward to hearing your
suggestions.

Differential Revision: D14347655

fbshipit-source-id: ffd856443d8abe3ebd0557a096bf7a5ec46312d3
2019-03-07 17:46:44 -08:00
Xavier Deguillard
f868d77cd1 revisionstore: use remove_file from vfs.rs
Summary: The historypack wasn't using remove_file from vfs which was causing repack to fail.

Reviewed By: sfilipco

Differential Revision: D14373649

fbshipit-source-id: 2d87f24bda541bc011ed38533db1ac7bdddc81e3
2019-03-07 15:24:10 -08:00
Stefan Filip
c305e13566 manifest: mark FileMetadata as Copy
Summary:
`Node` is marked as `Copy`. `FileMetadata` is not much more than `Node` so it
seems pretty clear that it should be marked `Copy`.

Reviewed By: DurhamG

Differential Revision: D14347657

fbshipit-source-id: 939abf88087bc8c6f942047a08d6a4a0d61e053f
2019-03-07 11:20:07 -08:00
Stefan Filip
5b370ffb72 manifest: move tree link to a separate file
Summary:
Cleaning up the `mod.rs` file so that it provides more signal.
`Link` is an internal implementation detail that other internal components may depend on so it is a great candidate to be moved to a dedicated file.

Differential Revision: D14347654

fbshipit-source-id: e5b5a42faf1e9f9c4a0591e5bd94182391ed511f
2019-03-07 11:20:07 -08:00
Stefan Filip
6d9dc154ca manifest: add flush function to manifests
Summary:
Save, finalize, flush, they mean about the same thing.

The first thing to note is that this implementation is not complete because
the parents are not correctly passed into the hashing function.

The second thing is that store failures make the code a little more complex
than it would have been otherwise.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292713

fbshipit-source-id: 807d7a385a62cb5f4948f1781d3146eaa6502ca9
2019-03-05 16:12:48 -08:00
Stefan Filip
25edcc014b manifest: inline store_entry_to_links
Summary:
This function is a bit on its own with the removal of the pair conversion.
Since it is used in only one place it makes sense to inline it.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292712

fbshipit-source-id: abbf1dc70d61c0ad039f5bc5ed5277d0770e3899
2019-03-05 16:12:48 -08:00
Stefan Filip
c5cc253234 manifest: refactor tests to use store::Entry::from_elements
Summary:
Working on the save mechanism I realized that links_to_store_entry is not that
useful because we can avoid the failure states where we would try to serialize
an ephemeral node. I am removing that function and converting the code that was
using that function to using the Entry constructor directly.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292714

fbshipit-source-id: 54ef46670319c27d90fc78511a1eb6abf47d3acf
2019-03-05 16:12:48 -08:00
Stefan Filip
43fb573c23 types: add explicit conversions from owned paths types to unsized ref
Summary:
There are scenarios where an &PathComponentBuf or a &RepoPathBuf will show up.
An example is when using `get` on a HashMap. These are not the references that
we are looking for. We want &PathComponent and &RepoPath respectively. Adding
explicit conversions.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292711

fbshipit-source-id: 29f4de25c2ffebf7f009e4f2515e0ba8f0371ae0
2019-03-05 16:12:47 -08:00
Stefan Filip
765659d505 manifest: update tree manifest to take ownership of Store
Summary:
This is what Rust is telling us to do. The situation that triggers this update is
writing to the store. Particularly when the store is an in memory hashmap we
need to have a mutable borrow to the hashmap to insert into it. From a general
point of view this means that any sharing of the store between different
instances of a manifest will have to be handled by the struct that implements
the `Store` trait.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292716

fbshipit-source-id: 6e789527dbdf3cd3ffe967f4900251bf31f7d6b2
2019-03-05 16:12:47 -08:00
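The ownership argument in 765659d505 can be made concrete with a tiny in-memory store. The `Store` trait methods, `TestStore` layout and `Tree` API below are hypothetical simplifications for illustration; only the idea that the tree owns its store and therefore can borrow it mutably comes from the commit.

```rust
use std::collections::HashMap;

/// Hypothetical storage interface; the method names are assumptions.
pub trait Store {
    fn get(&self, key: &str) -> Option<&[u8]>;
    fn insert(&mut self, key: String, value: Vec<u8>);
}

/// In-memory store backed by a hash map, as one might use in tests.
#[derive(Default)]
pub struct TestStore {
    entries: HashMap<String, Vec<u8>>,
}

impl Store for TestStore {
    fn get(&self, key: &str) -> Option<&[u8]> {
        self.entries.get(key).map(|value| value.as_slice())
    }

    fn insert(&mut self, key: String, value: Vec<u8>) {
        self.entries.insert(key, value);
    }
}

/// The tree owns its store instead of borrowing it.
pub struct Tree<S: Store> {
    store: S,
}

impl<S: Store> Tree<S> {
    pub fn new(store: S) -> Self {
        Tree { store }
    }

    /// Writing needs `&mut self.store`, which the tree can hand out because
    /// it owns the store outright.
    pub fn write(&mut self, key: &str, bytes: &[u8]) {
        self.store.insert(key.to_string(), bytes.to_vec());
    }

    pub fn read(&self, key: &str) -> Option<&[u8]> {
        self.store.get(key)
    }
}
```

If several trees need the same backing data, the sharing would live inside the type that implements `Store`, which matches the conclusion in the summary.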
Stefan Filip
fcc560357a types: add RepoPathBuf::pop()
Summary:
The practical aspect of this method comes when iterating over a tree and having
to maintain the current path. When going deep we will be pushing path
components and when coming back we will be popping path components.

I am not sure if it makes sense to return the path component or not. However I
believe that we should return some sort of error when RepoPath is empty.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292715

fbshipit-source-id: 4ef1e10de7a60775340063b5baa317d3d626bc64
2019-03-05 16:12:47 -08:00
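The push and pop pattern mentioned in fcc560357a can be sketched with a string-backed path buffer. `SketchRepoPathBuf` is a hypothetical stand-in, and returning `Option<String>` from `pop` is a choice made for this sketch; the summary leaves open both what to return and how to signal an empty path.

```rust
/// Hypothetical slash-separated path buffer for illustration only.
#[derive(Debug, Default, PartialEq)]
pub struct SketchRepoPathBuf(String);

impl SketchRepoPathBuf {
    pub fn new() -> Self {
        SketchRepoPathBuf(String::new())
    }

    /// Append one component, inserting a separator when needed.
    pub fn push(&mut self, component: &str) {
        if !self.0.is_empty() {
            self.0.push('/');
        }
        self.0.push_str(component);
    }

    /// Remove and return the last component; `None` when the path is empty.
    pub fn pop(&mut self) -> Option<String> {
        if self.0.is_empty() {
            return None;
        }
        match self.0.rfind('/') {
            Some(index) => {
                // Split off the component, then drop the trailing separator.
                let component = self.0.split_off(index + 1);
                self.0.truncate(index);
                Some(component)
            }
            // A single component: the path becomes empty.
            None => Some(std::mem::take(&mut self.0)),
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```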
Stefan Filip
14f26aa355 manifest: add remove implementation for tree
Summary:
Removes a file from the manifest. Nothing special for it.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14276645

fbshipit-source-id: 85e8ffd6cffee426c73eb627484dfa5a866a364b
2019-03-05 16:12:47 -08:00
Stefan Filip
1e9b7fafe9 manifest: refactor get_link out of manifest::get to allow code reuse
Summary:
It is going to be useful in tests to check how certain internal nodes change
so adding an api that allows fetching an internal node.

(Note: this ignores all push blocking failures!)

Differential Revision: D14276642

fbshipit-source-id: 9a3e488be6031f7b4727a8643f64970dcec8c400
2019-03-05 16:12:47 -08:00
Stefan Filip
e2b06628cd manifest: update Tree::get() to use RepoPath::parents()
Summary:
This removes the need for the local buffer for the parent.

(Note: this ignores all push blocking failures!)

Differential Revision: D14276648

fbshipit-source-id: a9378ea592d502ddf2dcdc35fe6ffa9ba213bc14
2019-03-05 16:12:47 -08:00
Stefan Filip
3c2c09f431 manifest: update tree::Manifest::insert
Summary:
Using the recently added path utilities so that we don't keep a secondary
parent buffer around.
Updating the file insert logic so that it is readable and intuitive.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14276649

fbshipit-source-id: 8e7e835814f0039645601abbf1b701e8c1ed3697
2019-03-05 16:12:47 -08:00
Stefan Filip
dd2bfc6c04 types: add test for to_owned() on Path types
Summary:
I had an issue where I incorrectly ended up with a &&RepoPath. While debugging
I added these tests to validate my sanity. I think that keeping these tests is
useful for the future.

(Note: this ignores all push blocking failures!)

Differential Revision: D14276640

fbshipit-source-id: d7e1cedc80b3a0ecb97e5a0c80fc4eea110e943f
2019-03-05 16:12:47 -08:00
Stefan Filip
f71814f8e2 types: add dedicated iterator type for RepoPath::components()
Summary:
The current implementation has some gotchas that are related to how the `split`
method is implemented for `&str`. The new implementation is clearer about how
we construct path components.

(Note: this ignores all push blocking failures!)

Differential Revision: D14276639

fbshipit-source-id: 1a22c177ba570915b7952eee78ed9191f7b72976
2019-03-05 16:12:47 -08:00
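One well-known `split` gotcha, which may or may not be the one the author of f71814f8e2 had in mind, is that splitting an empty string still yields one (empty) item. The `components` function below is a hypothetical sketch of handling that case explicitly; it assumes the path is already normalized.

```rust
/// Components of a slash-separated repository path.
/// The empty path has no components.
pub fn components(path: &str) -> Vec<&str> {
    if path.is_empty() {
        // `"".split('/')` would yield a single empty string here.
        Vec::new()
    } else {
        path.split('/').collect()
    }
}
```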
Stefan Filip
27c0d1c991 types: add RepoPath::last_component()
Summary:
Return the last component of the path. The empty path, `RepoPath::empty()`
does not have any components so `None` is returned in that case.

(Note: this ignores all push blocking failures!)

Differential Revision: D14276643

fbshipit-source-id: 9c4caf455891e77d03f22d81b39e3a34ae61ffcc
2019-03-05 16:12:47 -08:00
Stefan Filip
ed6e6245ad types: add RepoPath::parent()
Summary:
`RepoPath::parent` returns the parent of the path. The empty path,
`RepoPath::empty()` does not have a parent so `None` is returned in that case.

(Note: this ignores all push blocking failures!)

Differential Revision: D14276641

fbshipit-source-id: fc60da9750dc76e4f598dd483a63adc180135cb4
2019-03-05 16:12:47 -08:00
Stefan Filip
736cfe89bd types: add RepoPath::split_last_component()
Summary:
Tries to split the current `RepoPath` into a parent path and a component. If the
current path is empty then None is returned. If the current path contains only
one component then the pair that is returned is the empty repo path and a path
component that will match the contents of `self`.

(Note: this ignores all push blocking failures!)

Differential Revision: D14276644

fbshipit-source-id: e85a22f65cc267e48f12af6bf6b40c7673b7eaaa
2019-03-05 16:12:47 -08:00
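The behaviour described for `split_last_component()`, `parent()` and `last_component()` can be sketched on plain string slices. These free functions are hypothetical stand-ins operating on `&str`; the real methods live on `RepoPath` and return its own path types.

```rust
/// Split a path into its parent and last component.
/// Returns `None` for the empty path; a single component has the empty parent.
pub fn split_last_component(path: &str) -> Option<(&str, &str)> {
    if path.is_empty() {
        return None;
    }
    match path.rfind('/') {
        Some(index) => Some((&path[..index], &path[index + 1..])),
        None => Some(("", path)),
    }
}

/// The parent of a path, or `None` for the empty path.
pub fn parent(path: &str) -> Option<&str> {
    split_last_component(path).map(|(parent, _)| parent)
}

/// The last component of a path, or `None` for the empty path.
pub fn last_component(path: &str) -> Option<&str> {
    split_last_component(path).map(|(_, component)| component)
}
```

Building the two accessors on top of the split keeps the empty-path rule in one place.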
Stefan Filip
68b6897a78 types: Add RepoPath::parents()
Summary:
An iterator for the parent directories of a path.

(Note: this ignores all push blocking failures!)

Differential Revision: D14276646

fbshipit-source-id: 2b580ee8d762db5113110ec9b09ec3a093a1063a
2019-03-05 16:12:47 -08:00
Stefan Filip
ffffdd2a44 types: Add RepoPath::empty()
Summary:
This diff defines the empty `RepoPath`. This path is equivalent to the root
of the repository.
Also adding a method to check whether a `RepoPath` matches this special path.

(Note: this ignores all push blocking failures!)

Differential Revision: D14276647

fbshipit-source-id: 6e9ad5957ad39a711a1680bd084f448bb9d73f87
2019-03-05 16:12:47 -08:00
Arun Kulshreshtha
e6909b217b edenapi: pretty print keys in debug output
Summary: Now that `Key` has a `Display` implementation that produces useful output (namely the filenode and path as strings rather than bytes), we can use it to get better debug output.

Differential Revision: D14310544

fbshipit-source-id: ace5b76f07aa1216b5e9aae22dc7b6bd561e9560
2019-03-04 19:22:19 -08:00
Xavier Deguillard
5ede544f83 asyncpacks: remove unnecessary Arc<Mutex> from the mutable packs
Summary:
Since all the methods of an async mutable pack are taking the ownership of self,
there can only be one accessor to self, and therefore the Mutex is not
required. For the same reasons, the Arc can be removed.

This saves a bunch of atomic operations when operating on the mutable packs.

Reviewed By: kulshrax

Differential Revision: D14287252

fbshipit-source-id: 792bc41aa0dc372e2114dfca4895cca7083f3a56
2019-03-04 13:48:11 -08:00
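The ownership reasoning in 5ede544f83 can be seen in a synchronous sketch. The real types return futures; `SketchMutablePack` below is a hypothetical, non-async simplification that only shows why by-value `self` rules out concurrent access.

```rust
/// Hypothetical pack whose methods consume and return `self`.
#[derive(Default)]
pub struct SketchMutablePack {
    entries: Vec<Vec<u8>>,
}

impl SketchMutablePack {
    pub fn new() -> Self {
        SketchMutablePack::default()
    }

    /// Consumes the pack and hands it back. Because the caller gives up the
    /// value for the duration of the call, there is never a second accessor,
    /// so neither a Mutex nor an Arc is needed.
    pub fn add(mut self, data: &[u8]) -> Self {
        self.entries.push(data.to_vec());
        self
    }

    /// Finish the pack and report how many entries were written.
    pub fn close(self) -> usize {
        self.entries.len()
    }
}
```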
Xavier Deguillard
073c33a933 asyncpacks: AsyncWrapper only accept Sync types
Summary:
Since most of the datastore/historystore types are already Sync, we don't
really need to wrap them in a Mutex. For the ones that aren't Sync, we can
easily wrap them in a Mutex and pass that to AsyncWrapper.

Reviewed By: kulshrax

Differential Revision: D14287253

fbshipit-source-id: 0f5c04c651592561403caa6e4627017ab1731d0a
2019-03-04 13:48:11 -08:00
Arun Kulshreshtha
2ab9df3b9d types: implement Display for Key
Summary: Implement the `Display` trait for `Key`. This is important for human-readable output for debugging since both filenode and path inside the `Key` are stored as byte arrays, which are just printed out as numbers by the derived `Debug` implementation.

Differential Revision: D14294970

fbshipit-source-id: a601dc9a0c23ac40894ec27135171928f5635507
2019-03-04 12:11:23 -08:00
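A `Display` implementation along the lines of 2ab9df3b9d might look like the sketch below. The `SketchKey` struct, its field names and the exact output format (hex node, space, path) are assumptions; the commit only states that the filenode and path are printed as strings rather than bytes.

```rust
use std::fmt;

/// Hypothetical key holding a path and a 20-byte node, both as raw bytes.
pub struct SketchKey {
    pub path: Vec<u8>,
    pub node: [u8; 20],
}

impl fmt::Display for SketchKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Print the node as 40 hex digits instead of a list of numbers.
        for byte in &self.node {
            write!(f, "{:02x}", byte)?;
        }
        // Print the path as text, replacing any invalid UTF-8.
        write!(f, " {}", String::from_utf8_lossy(&self.path))
    }
}
```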
Stefan Filip
82bbd0326f manifest: improve store imports in tree/mod.rs
Summary:
We currently alias many of the imports from the store submodule. It is nicer
if we just import the submodule name and prefix with the submodule.

Differential Revision: D14276638

fbshipit-source-id: 6661df7f9cb5d976b11153003f653a1f66301c9a
2019-03-03 12:35:50 -08:00
Arun Kulshreshtha
6181da2178 edenapi: add method to get file history
Summary: Add a `get_history()` method to the `EdenApi` trait that queries the API server's `getfilehistory` endpoint and writes the resulting history entries to a historypack in the user's cache.

Differential Revision: D14223269

fbshipit-source-id: bf69c767f5a89177c36e755250330dbbbc219e4f
2019-02-28 15:42:17 -08:00
Arun Kulshreshtha
1f6435f6c1 asyncpacks: add add_entry() function
Summary: Add an `add_entry()` method, similar to D14162781 which did the same for non-async packs.

Differential Revision: D14246980

fbshipit-source-id: 8c8f507f2e0133d80d826d04345df7b41d6013a3
2019-02-28 12:47:13 -08:00
Arun Kulshreshtha
a430c04f81 revisionstore: use AsRef<Path> in constructors
Summary: `AsRef<Path>` is more ergonomic than `&Path` since the former can accept `PathBuf`, `String`, etc.

Differential Revision: D14223167

fbshipit-source-id: 12d26adaa63855c339e04734c19d6697624f9c9e
2019-02-27 12:43:43 -08:00
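The ergonomic difference described in a430c04f81 shows up at the call site. `SketchPack` is a hypothetical type used only to demonstrate a constructor that is generic over `AsRef<Path>`.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical pack handle that remembers where it lives on disk.
pub struct SketchPack {
    path: PathBuf,
}

impl SketchPack {
    /// Accepts `&str`, `String`, `&Path`, `PathBuf` and anything else that
    /// implements `AsRef<Path>`.
    pub fn new<P: AsRef<Path>>(path: P) -> Self {
        SketchPack { path: path.as_ref().to_path_buf() }
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}
```

With a `&Path` parameter, callers holding a `String` or `PathBuf` would have to convert or borrow explicitly before each call.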
Stefan Filip
d543e91b45 types: order imports in path.rs
Summary:
The order of imports we're trying to follow is: std, remote crates
(ie: crates.io), local crates (ie: fbcode crates), and crate local. A new line
in between each group can be used to prevent rustfmt from re-ordering them.

Reviewed By: singhsrb

Differential Revision: D14243163

fbshipit-source-id: fbbb07693af14b13ae6b3f4e788972d99193fd64
2019-02-27 10:00:40 -08:00
Stefan Filip
6825832b19 manifest: order imports
Summary:
The order of imports we're trying to follow is: std, remote crates
(ie: crates.io), local crates (ie: fbcode crates), and crate local. A new line
in between each group can be used to prevent rustfmt from re-ordering them.

Reviewed By: singhsrb

Differential Revision: D14243162

fbshipit-source-id: 6fc2cceb3d6834b602be20b8b8f74e0f61b227e1
2019-02-27 10:00:40 -08:00
Stefan Filip
b20f22f0a7 manifest: implement tree entry serialization/deserialization
Summary:
This diff focuses on adding deserialization. Because the most effective way
of testing deserialization is doing round-trip conversions we also implement
serialization.

`manifest::tree::store::Entry` is the structure that is in charge of performing
serialization and deserialization. We update the Store trait to interface with
this new object.

Differential Revision: D14178264

fbshipit-source-id: bb12262c181a518ba4111d40c079d6836ec44301
2019-02-27 10:00:40 -08:00
Stefan Filip
5809294733 manifest: replace crate lazy-init with once_cell
Summary:
The `once_cell` crate is more flexible than `lazy-init`. It has more types,
a richer api, recent updates and more features.

Differential Revision: D14232727

fbshipit-source-id: 14aeb34a96e094069bb8dc3fb5efcf5b5707ce8c
2019-02-27 10:00:40 -08:00
Stefan Filip
29e6198309 manifest: make FileMetadata members public
Summary:
The purpose of `FileMetadata` is to be a data container. Using the
`FileMetadata` implies accessing the underlying values. That means that the
underlying values need to be `public`.

Differential Revision: D14178260

fbshipit-source-id: 8b8e5ead23d47962498f991152c2a5fd94ac3c74
2019-02-22 15:42:45 -08:00
Stefan Filip
59ce91e857 manifest: add Arbitrary implementation for FileType and FileMetadata
Summary: These implementations are going to be used in quickcheck tests.

Differential Revision: D14178259

fbshipit-source-id: 0bded67deab3422b4aad53666c14cf195ea1b0d4
2019-02-22 15:42:45 -08:00
Stefan Filip
e496e01d8f types: Add as_byte_slice() to RepoPath and PathComponent
Summary:
`as_bytes` is useful for interacting with code that is written in reference
to a `slice` or when attempting to serialize them. It is fine to serialize
these paths using their underlying representation because we request that they
be normalized.

Reviewed By: quark-zju

Differential Revision: D14178263

fbshipit-source-id: 36529e2ae47580ae4014ae279df98b35eac5484a
2019-02-22 15:42:45 -08:00
Stefan Filip
5c04aead8f types: add Arbitrary implementations for RepoPathBuf & PathComponentBuf
Summary:
Implementing these traits allows easy use of `RepoPath` and friends in
quickcheck tests.
It is tricky to implement `PathComponentBuf`. I believe that most strings will
end up being a valid `PathComponent` so I think that it is reasonable to loop
until a valid string is found. Note that generating arbitrary Unicode `char`s
is implemented using a loop where random bytes are validated against the `char`
constructor.

Reviewed By: quark-zju

Differential Revision: D14178262

fbshipit-source-id: 759486a2053b851f8259d7cc03eee1cd69893f9f
2019-02-22 15:42:44 -08:00
Stefan Filip
49e7e99131 types: add public constant functions Node::{len, hex_len}
Summary:
These constants can be useful for parsing byte streams that we expect to contain
`Node`s. These constants should replace all potential hardcoding of values that
intend to represent the byte lengths for `Node`.

Differential Revision: D14178261

fbshipit-source-id: 34471151b5d253504e32a9e8b039608c1d4943fe
2019-02-22 15:42:44 -08:00
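The use of constant functions for parsing, as motivated in 49e7e99131, can be sketched as follows. `SketchNode` and `split_nodes` are hypothetical; only the idea of `len()` and `hex_len()` as `const fn`s replacing hardcoded lengths comes from the commit.

```rust
/// Hypothetical 20-byte node hash.
#[derive(Debug, PartialEq)]
pub struct SketchNode(pub [u8; 20]);

impl SketchNode {
    /// Length of a node in bytes.
    pub const fn len() -> usize {
        20
    }

    /// Length of a node when written as hex digits.
    pub const fn hex_len() -> usize {
        Self::len() * 2
    }
}

/// Split a byte stream into nodes; `None` if the length is not a multiple
/// of the node size.
pub fn split_nodes(stream: &[u8]) -> Option<Vec<SketchNode>> {
    if stream.len() % SketchNode::len() != 0 {
        return None;
    }
    let nodes = stream
        .chunks(SketchNode::len())
        .map(|chunk| {
            // A `const fn` can be used where a constant is required,
            // such as an array length.
            let mut bytes = [0u8; SketchNode::len()];
            bytes.copy_from_slice(chunk);
            SketchNode(bytes)
        })
        .collect();
    Some(nodes)
}
```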
Xavier Deguillard
91b1a56c3a radixbuf: make it compile on windows
Summary:
Instead of using the rand crate imported into quickcheck, we can use the
rand crate directly.

Reviewed By: quark-zju

Differential Revision: D14174653

fbshipit-source-id: c848f139765b9e458d374790227399f0ad836af6
2019-02-22 10:22:30 -08:00
Xavier Deguillard
9f05c94729 watchman_client: fix windows build
Summary:
The watchman_client crate relied on unix-only crates for both
unix_socket_transport and command_line_transport. Let's not compile
these on windows.

Reviewed By: quark-zju

Differential Revision: D14174654

fbshipit-source-id: 67d26d1799e71a1bf20af1a57a687249f5dce227
2019-02-22 10:22:30 -08:00
Xavier Deguillard
ee8c0812fd revisionstore: fix test on windows
Summary: Changing the permission on the packfile failed due to the file being opened.

Reviewed By: quark-zju

Differential Revision: D14174652

fbshipit-source-id: 356ac4748fd69e660a6cb9e63367a87489755e5e
2019-02-22 10:22:30 -08:00
Xavier Deguillard
7c34139c06 revisionstore: fix compilation warnings
Summary:
Rust tells us that Rng::choose and Rng::shuffle should be replaced by
SliceRandom::choose and SliceRandom::shuffle, so let's do it.

Reviewed By: singhsrb

Differential Revision: D14178565

fbshipit-source-id: 586eb2891f1c2cab0a3435c1b4ae8f870e7a3c25
2019-02-21 18:39:21 -08:00
Arun Kulshreshtha
09d3ade23a types: use untagged serialization for Parents
Summary:
Tell serde to use an [untagged representation](https://serde.rs/enum-representations.html#untagged) of this enum. This means that `Parents::None` will map to a null value, `Parents::One` will map to an array representing a `Node`, and a `Parents::Two` will map to an array of 2 arrays representing `Node`.

Using CBOR serialization, this means that these variants are 1 byte, 21 bytes, and 43 bytes respectively.

Differential Revision: D14174309

fbshipit-source-id: 0217a23c4ee5409ab293525d7b6e5ae969b5504d
2019-02-21 16:34:25 -08:00
Arun Kulshreshtha
eb3b97d91a types: implement FromIterator and IntoIterator for Parents
Summary: Add implementations for `FromIterator` and `IntoIterator` for the `Parents` type to make it more ergonomic to use.

Reviewed By: quark-zju

Differential Revision: D14172511

fbshipit-source-id: 5ba848c1dfbb8cc23ed19c9dc816616f5ed7af5f
2019-02-21 14:39:04 -08:00
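The two trait implementations added in eb3b97d91a can be sketched on a stand-in enum. `SketchParents` and the `Node` alias are hypothetical, and ignoring any items past the second one in `from_iter` is an assumption made for this sketch rather than documented behaviour.

```rust
/// Hypothetical 20-byte node hash.
pub type Node = [u8; 20];

/// Hypothetical stand-in for the `Parents` enum.
#[derive(Debug, PartialEq)]
pub enum SketchParents {
    None,
    One(Node),
    Two(Node, Node),
}

impl IntoIterator for SketchParents {
    type Item = Node;
    type IntoIter = std::vec::IntoIter<Node>;

    /// Yield zero, one, or two nodes depending on the variant.
    fn into_iter(self) -> Self::IntoIter {
        let nodes = match self {
            SketchParents::None => vec![],
            SketchParents::One(p1) => vec![p1],
            SketchParents::Two(p1, p2) => vec![p1, p2],
        };
        nodes.into_iter()
    }
}

impl std::iter::FromIterator<Node> for SketchParents {
    /// Assumption for this sketch: anything past the second node is ignored.
    fn from_iter<I: IntoIterator<Item = Node>>(iter: I) -> Self {
        let mut iter = iter.into_iter();
        match (iter.next(), iter.next()) {
            (Some(p1), Some(p2)) => SketchParents::Two(p1, p2),
            (Some(p1), None) => SketchParents::One(p1),
            _ => SketchParents::None,
        }
    }
}
```

Together these allow `for parent in parents` loops and `.collect::<SketchParents>()` on any iterator of nodes.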
Arun Kulshreshtha
942d9d984a revisionstore: allow adding a PackHistoryEntry to a MutableHistoryPack
Summary: Add a convenience function to `MutableHistoryPack` to add an entry from a `PackHistoryEntry` struct.

Differential Revision: D14162781

fbshipit-source-id: a0e07f34b9231011a339ce63adcef8ab55a0555e
2019-02-21 14:39:04 -08:00
Arun Kulshreshtha
3344341ad9 types: split HistoryEntry into two types
Summary:
This changes the way we represent history entries for the Eden API by splitting them into two types and putting them into a new `historyentry` module.

- `PackHistoryEntry` is the same as the old `HistoryEntry`, containing the fields required to add this entry to a `MutableHistoryPack` (namely a `Key` and a `NodeInfo`).
- `LooseHistoryEntry` is a history entry containing the information that would normally be present in a line of the history text in the remotefilelog loose file format.

There are several reasons why it makes sense to have both of these types:

- The existing remotefilelog code in Mononoke uses a type very similar to `LooseHistoryEntry` internally, and as such having a similar type for API calls simplifies code on the server side.

- `PackHistoryEntry` contains redundant information (in particular, the file path may be duplicated up to 3 times). While its structure is ideal for `revisionstore`'s in-memory data structures, for transmitting data, this redundancy is undesirable, especially since the client already has the file path (it is required to make the request in the first place).

- Conversions between these two representations include some subtle details that are tricky to get right. By putting the conversion in one canonical place, we can avoid having to duplicate this conversion logic in multiple places.

Differential Revision: D14162783

fbshipit-source-id: 63e0a060709916f21613442b75370f4d34a04f04
2019-02-21 14:39:04 -08:00
Arun Kulshreshtha
45a299b82d types: add Parents type representing a node's parents
Summary:
Add a type representing a node's parents. Mercurial nodes can have zero, one, or two parents; this is normally represented as an array of two node hashes, with any absent parents denoted with a null hash. This representation is not particularly Rustic because it allows for invalid combinations (such as a null p1 and non-null p2) to be represented. By using an enum, invalid values are unrepresentable, making code using this type simpler (due to not requiring validation logic).

This type will be used later in the stack to represent parents in history entries.

Reviewed By: quark-zju

Differential Revision: D14162782

fbshipit-source-id: dfff3ecc76dea114e0044839216d080b7f34a506
2019-02-21 14:39:04 -08:00
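The point made in 45a299b82d about invalid combinations can be illustrated by converting from the two-slot array form into an enum. All names here (`ParentsSketch`, `NULL_ID`, `from_pair`) are hypothetical, and rejecting a null p1 with a non-null p2 is a choice made for this sketch; the commit does not say how the real crate treats that input.

```rust
/// Hypothetical 20-byte node hash.
pub type NodeHash = [u8; 20];

/// The null hash that marks an absent parent in the array form.
pub const NULL_ID: NodeHash = [0; 20];

/// Hypothetical enum form: only valid parent combinations can be expressed.
#[derive(Debug, PartialEq)]
pub enum ParentsSketch {
    None,
    One(NodeHash),
    Two(NodeHash, NodeHash),
}

impl ParentsSketch {
    /// Convert from the `(p1, p2)` form, validating once at the boundary.
    pub fn from_pair(p1: NodeHash, p2: NodeHash) -> Option<ParentsSketch> {
        match (p1 == NULL_ID, p2 == NULL_ID) {
            (true, true) => Some(ParentsSketch::None),
            (false, true) => Some(ParentsSketch::One(p1)),
            (false, false) => Some(ParentsSketch::Two(p1, p2)),
            // Null p1 with a non-null p2 has no enum representation.
            (true, false) => None,
        }
    }
}
```

Code that receives a `ParentsSketch` no longer needs to re-check for the invalid combination, which is the simplification the summary refers to.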
Stefan Filip
65ea22225b manifest: add Durable variant for Links
Summary:
`Durable` nodes are inner nodes that come from storage. Their contents can be shared between multiple instances of Tree. They are lazily evaluated. Their children list will be read from storage only when it is accessed.

The inner structure of a durable link poses an interesting question related to handling failures: what do we do when we have a failure when reading from storage?
We can cache the failure or we don't cache it. Caching it is mostly fine if we had an error reading from local storage or when deserializing. It is not the best option if our storage is remote and we hit a network blip. On the other hand we would not want to always retry when there is a failure on remote storage; we'd want to have at least an exponential backoff on retries. Long story short, caching the failure is a reasonable place to start from.

The lazy-init crate is useful for modeling the lazy initialization model that we have for durable node links. See docs at https://docs.rs/lazy-init/0.3.0/lazy_init/

Reviewed By: quark-zju

Differential Revision: D14142928

fbshipit-source-id: 077f708b38e2ace772f30b3392445326ce17f47c
2019-02-21 12:01:28 -08:00
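The "lazily loaded, failure cached" behaviour described in 65ea22225b can be sketched with the standard library's `OnceLock`. This swaps in `std::sync::OnceLock` for the `lazy-init` crate the commit actually used, and `DurableSketch`, its fields and the `load` callback are hypothetical.

```rust
use std::sync::OnceLock;

/// Hypothetical durable link: knows its node id and loads children lazily.
pub struct DurableSketch {
    node: u64,
    children: OnceLock<Result<Vec<String>, String>>,
}

impl DurableSketch {
    pub fn new(node: u64) -> Self {
        DurableSketch { node, children: OnceLock::new() }
    }

    /// Load the children on first access. Later calls reuse the stored
    /// result, including a stored failure, and never call `load` again.
    pub fn children<F>(&self, load: F) -> &Result<Vec<String>, String>
    where
        F: FnOnce(u64) -> Result<Vec<String>, String>,
    {
        self.children.get_or_init(|| load(self.node))
    }
}
```

Because the error is stored alongside successes, a transient failure sticks for the lifetime of the link, which is the trade-off the summary accepts as a starting point.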
Stefan Filip
d1317c251c manifest: add store trait
Summary:
The store abstracts the data that we need from the store for TreeManifest. This is not a long term abstraction, mostly something that we can code against until we write the real parsing logic from the real data store.

Adding a TestStore implementation that we can use in tests and debug.

Reviewed By: quark-zju

Differential Revision: D14142930

fbshipit-source-id: 7f2d4f05a7b7e63758db9247cdbcd51541c88ec0
2019-02-21 12:01:28 -08:00
Stefan Filip
0f567c8a11 manifest: move tree implementation to module
Summary: Preparing to add the store abstraction for the tree manifest. The store implementations are going to be tied to the tree implementation and should be kept private from the rest of the crate. To do this we move the tree to a module.

Reviewed By: quark-zju

Differential Revision: D14142929

fbshipit-source-id: 588d597e1248fc2e632c9efe03f08ba3d491e8cd
2019-02-21 12:01:28 -08:00
Stefan Filip
bab2c60c1d manifest: move Link::methods into Tree functions
Summary: manifest::tree::Link has several convenience functions. In the grand scheme of things this is something that we will want because they are common operations. The problem is that the node structure is currently changing rapidly and the additional layer of abstraction is hindering iteration. I don't know the exact shape that the nodes will have and trying different things out is easier when the Tree functions use the nodes directly.

Reviewed By: quark-zju

Differential Revision: D14142926

fbshipit-source-id: 5e4e8f4d3e1b19fd14dcc290a3dea4271b502d97
2019-02-21 12:01:28 -08:00
Arun Kulshreshtha
4dfba9862d Add conversions from Mononoke to Mercurial history entries
Summary: Add conversions between the `HgFileHistoryEntry` type produced by Mononoke and the `edenapi_types::HistoryEntry` type shared between Mercurial and Mononoke. The latter can be serialized and sent from the API server to the Mercurial client, and will be used for HTTP history fetching.

Reviewed By: quark-zju

Differential Revision: D14079338

fbshipit-source-id: 2123c88310f633f87d8c92405382d5046874dfee
2019-02-20 15:29:22 -08:00
Stefan Filip
04d098e28c Use RepoPath for path types in Manifest
Summary: `RepoPath` and `PathComponent` were recently added as path abstractions for source control internals. This diff makes use of these abstractions in the manifest crate.

Reviewed By: quark-zju

Differential Revision: D14142925

fbshipit-source-id: 1e0033f4f2aeb5c1d63899adbce198302086bfda
2019-02-20 10:33:19 -08:00
Stefan Filip
a83ec90f88 Fix creation for empty RepoPath
Summary:
While working on TreeManifest I found that the code does not allow creating an empty `RepoPath`. I did not realize that splitting an empty string always yields a single empty string. This is an edge case and we need to handle it appropriately.

Currently adding a hack where the components iterator function will skip empty values. We are going to add more iterators in a future revision and we should revisit this function then.
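
The edge case is easy to reproduce with `str::split`, which yields one empty piece for an empty input. The `components` helper below is a hypothetical stand-in for the iterator function mentioned above, not the actual code:

```rust
/// Hypothetical helper: split a path on `/`, skipping empty pieces.
/// Without the filter, `"".split('/')` yields a single `""` item.
pub fn components(path: &str) -> impl Iterator<Item = &str> {
    path.split('/').filter(|piece| !piece.is_empty())
}
```

One side effect of this approach is that it also hides empty pieces from inputs such as `foo//bar`, which is presumably part of why the function is flagged for a revisit.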

Reviewed By: quark-zju

Differential Revision: D14142927

fbshipit-source-id: 1be43d61913138e3fa50e02976e81f9ca38a74a7
2019-02-20 10:33:19 -08:00
Xavier Deguillard
1d5283d1da revisionstore: refactor repack_datapacks/repack_historypacks
Summary:
With the Store trait, we can de-duplicate code between the datapack repack, and
the historypack repack.

Reviewed By: quark-zju

Differential Revision: D14091894

fbshipit-source-id: 5bf335414df2420b42ec45cce7097f3a97a49796
2019-02-20 09:40:56 -08:00
Stefan Filip
e7975b4e5e Add documentation for RepoPath and PathComponent
Summary: I am making this a new diff because I am documenting failures, which only make sense once validation has been added.

Differential Revision: D14097490

fbshipit-source-id: 7eb2c22ebbbc4365f8203d68bc29134176f326f6
2019-02-19 13:52:47 -08:00
Stefan Filip
fc1901e4a4 Add validation for paths
Summary:
When introducing `Component` we said that it will not have any slashes (`/`). This diff adds enforcement for that along with enforcing that `Component` is not empty.
`Path` and by extension `Component` also have a set of invalid bytes: `\0`, `\1` and `\n`. I got these checks from the mononoke codebase (mononoke-types/path.rs)
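
The rules above can be sketched as a single check. The function name and the error enum are assumptions for illustration; only the rules themselves (non-empty, no `/`, none of the bytes `\0`, `\1`, `\n`) come from the summary:

```rust
/// Bytes rejected in any path: `\0`, `\1` and `\n`.
const INVALID_BYTES: [u8; 3] = [0, 1, b'\n'];

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ValidationError {
    Empty,
    ContainsSeparator,
    InvalidByte(u8),
}

/// Hypothetical check for a single component.
pub fn validate_component(s: &str) -> Result<(), ValidationError> {
    if s.is_empty() {
        return Err(ValidationError::Empty);
    }
    for b in s.bytes() {
        if b == b'/' {
            return Err(ValidationError::ContainsSeparator);
        }
        if INVALID_BYTES.contains(&b) {
            return Err(ValidationError::InvalidByte(b));
        }
    }
    Ok(())
}
```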

Reviewed By: DurhamG

Differential Revision: D14001106

fbshipit-source-id: 5acbdf66214e54f1034fd89d90e88c3e904d7f9b
2019-02-19 13:23:31 -08:00
Stefan Filip
65604c2a4e Add Component and ComponentBuf
Summary:
`Component` and `ComponentBuf` are specializations of `RelativePath` and `RelativePathBuf` that do not have any separators: `/`. The main operation that is done on paths is iterating over its components. The components are generally directory names and occasionally file names.

For the path: `foo/bar/baz.txt` we have 3 components: `foo`, `bar` and `baz.txt`.

A lot of algorithms used in source control management operate on directories, so having an abstraction for individual components is going to increase readability in the long run. A clear example of where we may want to use `ComponentBuf` is in the treemanifest logic, where all indexing is done using components. The index in those cases must be able to own the component. Writing it in terms of `RelativePathBuf` would probably be less readable than writing it in terms of `String`.

This diff also adds the `RelativePath::components` which acts as an iterator over `RelativePath` producing `Component`.

The name `Component` is inspired from `std::path::Component`.

Reviewed By: DurhamG

Differential Revision: D14001108

fbshipit-source-id: 30916d2b78fa89537fc5b30420b3b7c12d1f82c7
2019-02-19 13:23:30 -08:00
Stefan Filip
bba4a5c197 Add Path structures specialized for source control
Summary:
`RelativePath` and `RelativePathBuf` are types for working with paths specialized for source control internals. They are akin to `str` and `String` in high level behavior. `RelativePath` is an unsized type wrapping a `str` so it can't be instantiated directly. `RelativePathBuf` represents the owned version of a `RelativePath` and wraps a `String`.

RelativePaths are going to be used in all contexts for SCM, so flexible and efficient abstractions for dealing with common needs are important.

The inspiration for `RelativePath` and `RelativePathBuf` comes from the `std::path` module; however, we know that the internal representation of a path is consistently an array of bytes where directories are delimited by `/`, so our types can have a simpler representation. For the same reason, we can't use the abstractions in `std::path` for internal uses where we need to apply the same algorithm across all systems to blobs we get from the server.

This diff adds a new crate called `vfs`. The recommended use of the types in this crate is `use ...::vfs` followed by `vfs::RelativePath` and `vfs::RelativePathBuf`.

Alternatives for representing the path:
* We could use `String` and `&str` directly however these types are inexpressive and have few guarantees. Rust has a strong type system so we can leverage it to provide more safety.
* Mononoke uses `PathElement(Vec<u8>)` and `MPath(Vec<PathElement>)`. One issue is that this type requires more allocations. Having read a buffer, it is more difficult to provide views into it as &MRelativePath.

`RelativePath` is a Dynamically Sized Type (DST). That means that `mem::transmute` is the practical way of constructing it in Rust right now:
* https://doc.rust-lang.org/nomicon/exotic-sizes.html
* https://github.com/rust-lang/rust/pull/39594#issuecomment-279471476

Basing these types on `String` and `&str` means that the paths that we deal with need to have UTF-8 encodings. In the grand scheme of things a lot of things are simplified when we can make this kind of assumption.
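
To make the `str`/`String` analogy concrete, here is a minimal sketch of an unsized wrapper and its owned counterpart. It uses a pointer cast on a `#[repr(transparent)]` struct (the pattern `std::path::Path` uses) in place of `mem::transmute`, omits all validation, and its method names are assumptions for this example:

```rust
use std::ops::Deref;

/// Unsized wrapper around `str`; it can only exist behind a reference.
#[repr(transparent)]
pub struct RelativePath(str);

/// Owned counterpart, wrapping a `String`.
pub struct RelativePathBuf(String);

impl RelativePath {
    /// Reinterpret a `&str` as a `&RelativePath` without copying.
    pub fn new(s: &str) -> &RelativePath {
        // SAFETY: `RelativePath` is `#[repr(transparent)]` over `str`,
        // so both reference types share the same layout.
        unsafe { &*(s as *const str as *const RelativePath) }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

impl RelativePathBuf {
    pub fn new(s: &str) -> RelativePathBuf {
        RelativePathBuf(s.to_string())
    }
}

/// Lets `&RelativePathBuf` be used wherever `&RelativePath` is expected,
/// mirroring how `&String` derefs to `&str`.
impl Deref for RelativePathBuf {
    type Target = RelativePath;

    fn deref(&self) -> &RelativePath {
        RelativePath::new(&self.0)
    }
}
```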

The code right now doesn't have much documentation. I'll add that if this direction makes sense to other people.

Reviewed By: DurhamG

Differential Revision: D14001109

fbshipit-source-id: 1ca399827b1284f32cd83219e7e090d8b2487cee
2019-02-19 13:23:30 -08:00
Xavier Deguillard
126de4655e revisionstore: add a Store trait
Summary:
A lot of code is duplicated between data stores and history stores, and one
reason for it is the absence of a common trait between the two. Adding a new
Store trait will make it easier to write generic code that works across
data and history stores.

Reviewed By: quark-zju

Differential Revision: D14091899

fbshipit-source-id: deef1d43a7d300cb3607c67554ad54f20c870e23
2019-02-19 12:18:27 -08:00
Xavier Deguillard
8e14fcb123 revisionstore: implement DataStore/HistoryStore for Deref types
Summary:
Instead of manually implementing DataStore/HistoryStore for Box, Rc, Arc, and
future smart pointers, we can simply implement the trait for all the types that
can be deref into a DataStore/HistoryStore.
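
A minimal sketch of the blanket-implementation technique, using a simplified stand-in trait. The single `get` method and the `EchoStore` type are assumptions for this example; the real `DataStore` trait has a different interface:

```rust
use std::ops::Deref;

/// Simplified stand-in for the real trait.
pub trait DataStore {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}

/// Any type that dereferences to a `DataStore` is itself a `DataStore`.
/// This covers `Box`, `Rc`, `Arc` and future smart pointers at once.
impl<T: DataStore + ?Sized, U: Deref<Target = T>> DataStore for U {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        // `self` is `&U`; deref coercion turns it into `&T`.
        T::get(self, key)
    }
}

/// Toy store that returns the key bytes as the value.
pub struct EchoStore;

impl DataStore for EchoStore {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        Some(key.as_bytes().to_vec())
    }
}
```

With the blanket impl in place, a function bounded by `S: DataStore` accepts `EchoStore`, `Box<EchoStore>`, `Rc<EchoStore>` and `Box<dyn DataStore>` alike.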

Reviewed By: quark-zju

Differential Revision: D14078072

fbshipit-source-id: 47a80ab0179b84aa08836b6e7c5c3c5f9c1a08ff
2019-02-19 12:18:27 -08:00
Liubov Dmitrieva
0712d3f39e scm_daemon: remove direct fetching
Summary:
As we are moving to smooth switching to Mononoke and backwards, we can start to deprecate
Mercurial specific optimizations to simplify the code.

Reviewed By: markbt

Differential Revision: D14131594

fbshipit-source-id: fa927011890ecdf0874a3a74b4910412b3c84b70
2019-02-19 08:48:07 -08:00
Liubov Dmitrieva
bc43b991a2 fix compilation
Summary:
make local / release are broken on trunk, fix it

in other modules, this code is only for tests. In toml file it is enabled for tests.

Reviewed By: mitrandir77, ikostia

Differential Revision: D14123392

fbshipit-source-id: eceaf17478b8b7e75ee2ba56df69b28ca8374c64
2019-02-18 04:53:57 -08:00
Arun Kulshreshtha
8864e6a0da types: move edenapi-types into types crate
Summary: Move the contents of `edenapi-types` into the `types` crate so all of Mercurial's Rust types are in one place.

Reviewed By: quark-zju

Differential Revision: D14114547

fbshipit-source-id: feb8f9c35f102d30bf00b230df81a86a3893a49b
2019-02-15 22:51:04 -08:00
Arun Kulshreshtha
9c6b914a22 types: move Key and NodeInfo out of revisionstore
Summary:
In order to move the types in `edenapi-types` (containing types shared between Mercurial and Mononoke) to the `types` crate, we need to move a few types from the `revisionstore` crate into this crate first, because `revisionstore` depends on `types`, which would create a circular dependency since `edenapi-types` uses types from `revisionstore`.

In particular, this diff moves the `Key` and `NodeInfo` types into their own modules in the `types` crate.

Reviewed By: quark-zju

Differential Revision: D14114166

fbshipit-source-id: 8f9e78d610425faec9dc89ecc9e450651d24177a
2019-02-15 22:51:04 -08:00
Arun Kulshreshtha
61f9f25a66 edenapi-types: add crate for types shared between Mercurial and Mononoke
Summary:
For HTTP data fetching, it will be necessary to have the same Rust types in Mononoke and Mercurial, so that Mononoke can send down the serialized types and Mercurial can deserialize them. These types must live in the Mercurial codebase since Mercurial can't link to code outside of fbcode/scm/hg. As such, this diff adds a new crate to Mercurial that Mononoke can link to, containing these shared types.

Right now the only shared type is a `HistoryEntry`, designed to match the interface of `MutableDatapack::add`. This type will be used as part of the HTTP history fetching API.

In the longer term, it would probably make sense to use something like Thrift for defining the on-the-wire formats used between Mercurial and Mononoke (and eventually for RPC as well). However, given that using Thrift from Mercurial is currently nontrivial (since Mercurial is typically built with Cargo and needs to be compatible with open source tooling), defining the schema in this crate and using `serde` for serialization and HTTP/2 for transport should be sufficient for now.

Reviewed By: quark-zju

Differential Revision: D14079337

fbshipit-source-id: c7880919aeb3fd7e1cf70067a89a17341c1d973f
2019-02-15 15:17:12 -08:00
Kostia Balytskyi
75e854296a fix-code: make the test happy
Reviewed By: singhsrb

Differential Revision: D14094651

fbshipit-source-id: 722022ec09e36f8c17734e6da95ee867d742c196
2019-02-14 16:17:53 -08:00
Xavier Deguillard
b099f733a4 asyncpacks: make asyncunion*store more generic
Summary:
Those types were internally using DataPack/HistoryPack, limiting their use. We
can make them more generic by using the DataStore/HistoryStore traits. The only
drawback is having to implement the new method for each store type.

Ideally, we could have a trait StoreFromPath (or use the experimental TryFrom)
that all the datastore/historystore types would implement.

As a bonus change, I got rid of the *Builder types. These were required because the
new method was already implemented in the AsyncHistoryStore/AsyncDataStore. We
can simply rename the latter and use a new method elsewhere.

Reviewed By: DurhamG

Differential Revision: D14060159

fbshipit-source-id: 31fa278f650ba979eecd3df4175cbac30ebb8180
2019-02-14 13:43:51 -08:00
Stefan Filip
1868861442 Add in memory representation for manifest tree
Summary:
Starting the implementation of tree manifest with the in memory nodes and implementing `get` and `insert`. The in memory nodes are called `Ephemeral` and the stored immutable nodes are going to be called `Durable`.

Using a `BTreeMap` for storing the children because we want to efficiently insert, fetch and remove path components. We also want iteration to be done in ordered fashion so BTreeMap is our collection in this case.

Removing elements from the tree is going to be implemented in a future update.
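
A minimal sketch of an in-memory tree with `BTreeMap` children, supporting `insert` and `get`. The `Link` enum shape, the `u64` leaf value and the slice-of-components path are assumptions for this example, not the crate's actual types:

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory node: a file leaf, or a directory whose children
/// are kept ordered by component name.
pub enum Link {
    Leaf(u64),
    Ephemeral(BTreeMap<String, Link>),
}

impl Link {
    /// Insert `value` at `path`, creating intermediate directories as needed.
    pub fn insert(&mut self, path: &[&str], value: u64) -> Result<(), String> {
        let (first, rest) = match path.split_first() {
            Some(split) => split,
            None => return Err("empty path".to_string()),
        };
        let children = match self {
            Link::Ephemeral(children) => children,
            Link::Leaf(_) => return Err(format!("{} is under a file", first)),
        };
        if rest.is_empty() {
            children.insert(first.to_string(), Link::Leaf(value));
            Ok(())
        } else {
            children
                .entry(first.to_string())
                .or_insert_with(|| Link::Ephemeral(BTreeMap::new()))
                .insert(rest, value)
        }
    }

    /// Look up the value stored at `path`, if any.
    pub fn get(&self, path: &[&str]) -> Option<u64> {
        match (self, path.split_first()) {
            (Link::Leaf(value), None) => Some(*value),
            (Link::Ephemeral(children), Some((first, rest))) => {
                children.get(*first)?.get(rest)
            }
            _ => None,
        }
    }
}
```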

Reviewed By: DurhamG

Differential Revision: D14016273

fbshipit-source-id: d3bc22e5ddb21b689d07a7d74bd639b8c2b138ce
2019-02-14 13:32:05 -08:00
Stefan Filip
c1b8cd68d8 Add manifest crate
Summary:
The seed for the rust implementation of manifests.

We start with the most primitive API for manifests, which maps paths to a `Node`. At the basic level we need the same operations that a map implements, so we start with `insert`, `get` and `remove`. We know that retrieving data for Manifests can fail, so we encode that in our interface using `Fallible`.

I am leaving iterators and manifest flags for future iterations.

Reviewed By: DurhamG

Differential Revision: D14016274

fbshipit-source-id: 8f1f83610933b9e9a96f8c5ba2c6e50567c76e06
2019-02-14 13:32:05 -08:00
Stefan Filip
63c87a7500 Add test constructor for Node
Summary:
`Node` is not friendly with plain old unit tests because constructing one is a bit involved. This diff adds a constructor from u8 purely for test purposes.

I picked a u8 for input because it is the most convenient type. When we move past rust 1.31 it might make sense to use a u32 and use https://doc.rust-lang.org/std/primitive.u32.html#method.to_le_bytes

Note that property testing is best used in addition to plain old unit testing.
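
One plausible shape for such a constructor is sketched below. The `Node` stand-in, the `from_u8` name, the choice to repeat the byte across the hash and the `to_hex` helper are all assumptions for this example:

```rust
/// Stand-in for a 20-byte node hash (assumed layout).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Node([u8; 20]);

impl Node {
    /// Hypothetical test-only constructor: every byte of the hash is `x`.
    pub fn from_u8(x: u8) -> Node {
        Node([x; 20])
    }

    /// Render the hash as 40 lowercase hex digits.
    pub fn to_hex(&self) -> String {
        self.0.iter().map(|b| format!("{:02x}", b)).collect()
    }
}
```

Under these assumptions, passing `0x11` gives a hash of forty `1` digits, while passing `0x1` gives `0101...01`, since each byte renders as two hex digits.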

Reviewed By: DurhamG

Differential Revision: D14016272

fbshipit-source-id: 5b831ab0011ef2575f7e94d158ab4ddf30d1ac06
2019-02-14 13:32:05 -08:00
Xavier Deguillard
76316fbf9d revisionstore: verify repacked keys before deleting pack files
Summary:
During repack, the repacked files are deleted without any verification. Since
Adam saw some data loss, it's possible that somehow repack didn't fully repack
a packfile but it was deleted. Let's verify that the entire packfile was
repacked before deleting it.

Since repack is mostly a background operation, we don't have a way to notify
the user, but we can log the error to a scuba table to analyse further.
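
The check itself boils down to a set difference. In this sketch the function name and the use of plain `String` keys are assumptions; the real code works with revisionstore keys and pack files:

```rust
use std::collections::HashSet;

/// Hypothetical check: return every key of the old packfile that is absent
/// from the repacked output. The old file is safe to delete only when the
/// returned list is empty.
pub fn missing_after_repack(old_keys: &[String], new_keys: &HashSet<String>) -> Vec<String> {
    old_keys
        .iter()
        .filter(|key| !new_keys.contains(*key))
        .cloned()
        .collect()
}
```

A non-empty result is the case that would be logged instead of deleting the packfile.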

Reviewed By: DurhamG

Differential Revision: D14069766

fbshipit-source-id: 4358a87deeb9732eec1afdfb742e8d81db41cd87
2019-02-14 13:03:09 -08:00
Xavier Deguillard
e5a7da32da revisionstore: rename the packfile before removal on windows
Summary:
Removing files on Windows is hard. It can fail for many reasons, many of which
involve another process having the file opened in some way. One way to work around
this problem is to rename the file first, since renaming isn't as restrictive as removing.

Since hg repack will attempt removing any temporary files it will also try to
remove the packfiles that we failed to remove earlier.

Reviewed By: DurhamG

Differential Revision: D14030445

fbshipit-source-id: 1f3799e021c2e0451943a1d5bd4cd25ed608ffb6
2019-02-14 10:34:52 -08:00
Xavier Deguillard
8c40ed3a71 revisionstore: ignore AlreadyExists errors when persisting a mutable pack
Summary:
Packfiles are named based on their content, so having an on-disk file with the
same name means that they have the same content. If that happens, let's simply
continue without failing.
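
A minimal sketch of this error handling, using only `std::io`. The helper name is an assumption, and the real code applies the check to the result of persisting the pack file:

```rust
use std::io;

/// Treat `AlreadyExists` as success and propagate every other error.
/// This is sound here because a file with the same content-derived name
/// holds the same content.
pub fn ignore_already_exists(result: io::Result<()>) -> io::Result<()> {
    match result {
        Err(ref e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(()),
        other => other,
    }
}
```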

Reviewed By: DurhamG

Differential Revision: D14030446

fbshipit-source-id: f04c15507c89b2fca19c95a7b41d8e65c88da019
2019-02-14 10:34:52 -08:00
Mateusz Kwapich
96d0d0889e config: still load configs in legacy configurations (hgrc.d on linux)
Summary: This broke hg configs on Tupperware containers.

Reviewed By: DurhamG

Differential Revision: D14083110

fbshipit-source-id: e49f77235317046931c0e75c98c3e67a617dfd49
2019-02-14 09:31:57 -08:00
Arun Kulshreshtha
127ca1a990 edenapi: use rustls instead of openssl
Summary:
Switch from using OpenSSL (via `native-tls`) to [Rustls](https://github.com/ctz/rustls), a pure-Rust TLS implementation based on the `ring` crypto crate.

Unlike `native-tls`, Rustls supports ALPN, which means it can be used along with Hyper to perform HTTP/2 requests over TLS. (OpenSSL also supports ALPN, but older versions of Windows' `schannel` library do not, and as such `native-tls` doesn't support ALPN either regardless of platform.)

Rustls also builds on Windows without any special configuration, sidestepping the issues we've been having with OpenSSL in the Windows build.

Reviewed By: quark-zju

Differential Revision: D14070084

fbshipit-source-id: 25268c58a88177f4708370696d326b4c0bdc89a0
2019-02-13 16:07:00 -08:00
Xavier Deguillard
73aed5c3d2 revisionstore: do not attempt repacking one packfile
Summary:
Repacking one packfile will yield the same packfile, so we can save some IO by
not trying to repack.

Differential Revision: D14013789

fbshipit-source-id: 8069840cc7cb1837eb94cea97e50b3bbaa548873
2019-02-12 11:21:34 -08:00
Jun Wu
d515e4826f config: still load configs in legacy locations
Summary:
D13875656 made a config path change that breaks tests without using HGRCPATH,
or local build runs.

Reviewed By: DurhamG

Differential Revision: D14034919

fbshipit-source-id: 80de214f1769a8f40e79dc0ab1dbba4d55f506a7
2019-02-11 17:08:45 -08:00
Arun Kulshreshtha
cd9197c25d revisionstore: fix import ordering
Summary:
We've settled on the following grouping for imports:

- standard library
- 3rd party crates
- internal crates
- modules within same crate

This diff updates revisionstore accordingly.

Reviewed By: singhsrb

Differential Revision: D14030243

fbshipit-source-id: 74a7897342e39eb1d80202c8aae8c149bf08fc41
2019-02-11 15:47:36 -08:00
Arun Kulshreshtha
f7bbff1ceb edenapi: do not enforce HTTP/2 only
Summary: It turns out that `hyper-tls` does not support ALPN for negotiating HTTP/2 connections, and only supports HTTP/2 prior knowledge. (This is a limitation of the underlying TLS library, `native-tls`.) Unfortunately, while the Mononoke API server itself is fine with HTTP/2 prior knowledge for non-TLS connections, the Mononoke VIPs require TLS, and thus per the HTTP/2 spec require ALPN negotiation from an HTTP/1.1 initial connection. As a result, we need to revert back to using HTTP/1.1 for now in order to use TLS.

Reviewed By: singhsrb

Differential Revision: D14015335

fbshipit-source-id: b78197d4cfecf184479162c5b14ba54cbef66ee7
2019-02-11 09:54:22 -08:00
Jun Wu
9cdc2640d6 config: change system config entry point
Summary:
Change system config entry point to only `/etc/mercurial/system.rc` (unix) and
`\ProgramData\Facebook\Mercurial\system.rc` (Windows) so they won't overlap
with a vanilla Mercurial installation.

Another goal of this change is to make it easier to drop the directory
`%include` feature. So detecting config changes (for example, edenfs wants to
make sure ignore rules are up-to-date) can be made cheaper by just stating
files without `listdir`.

Reviewed By: markbt

Differential Revision: D13875656

fbshipit-source-id: 314c0bf87ff086dec5b88e232edca0133356484e
2019-02-08 19:31:11 -08:00
Xavier Deguillard
31622c9806 asyncpacks: add AsyncUnionHistoryStore
Summary: This will be used by scmmemcache to send history data to memcache

Reviewed By: DurhamG

Differential Revision: D13975346

fbshipit-source-id: f41eaf9a4968072dd07efbcd9d539e6293c3fa4f
2019-02-08 12:56:06 -08:00
Xavier Deguillard
a4311ec1df memcache: implement get_hist
Summary:
We can now fetch history data stored in memcache and write it to a history
pack.

Reviewed By: DurhamG

Differential Revision: D13975308

fbshipit-source-id: 2196328ad60a55d1e2b39d88d939f434e496837a
2019-02-08 12:56:06 -08:00
Xavier Deguillard
d9153f1565 memcache: proper serde serialization
Summary:
The initial get_data/set_data only sent the full-text to memcache, which is
just enough for non-LFS data. Let's use Serde to serialize/deserialize the data
that we send to memcache. This will make it simple to add checksumming, or more
metadata to it.

Reviewed By: DurhamG

Differential Revision: D13974714

fbshipit-source-id: 41a235e1d1e8128b14f00b668745f4f9a070a360
2019-02-08 12:56:06 -08:00
Xavier Deguillard
76418dd79c memcache: add set_data
Summary:
Similarly to the get_data, we can now read a datapack and send the proper
deltas to memcache. This change is lacking in the same way the get_data is.

Reviewed By: DurhamG

Differential Revision: D13886026

fbshipit-source-id: a00475e89b7e75dbbe9afa9f9d293a686f969a3f
2019-02-08 12:56:06 -08:00
Xavier Deguillard
9439d09d10 revisionstore: implement IterableStore for UnionStore
Summary:
The IterableStore trait allows iterating over all the keys of a DataStore.
Since this is applicable to a UnionStore, let's implement it there. We can now
use it in their async variants.

Regarding the async variants, the code effectively builds a Vec of Key, which
may use a lot of memory, a better alternative would be to use a Stream of Key.
This will be tackled later.

Reviewed By: DurhamG

Differential Revision: D13951905

fbshipit-source-id: 15944b18d7ffea08d191e5dc7e1b8e2b783f69d1
2019-02-08 12:56:06 -08:00
Xavier Deguillard
374495767e asyncpacks: add AsyncUnionDataStore
Summary: Simple async wrapper around a UnionDataStore.

Reviewed By: DurhamG

Differential Revision: D13951906

fbshipit-source-id: 086739d834297fcc1dabd246cfde4631b4767640
2019-02-08 12:56:06 -08:00
Stefan Filip
2d6e900d12 Update itertools version in lib/treestate
Summary: TP2 version for itertools was updated to 0.8.

Reviewed By: singhsrb

Differential Revision: D14008855

fbshipit-source-id: 081a43c5b02cd39c6a0a6b491bfa0767ddf0b7ed
2019-02-08 11:49:54 -08:00
Stefan Filip
162f93f205 Remove argparse from the lib cargo workspace
Summary: `lib/argparse` fails to build with cargo. Removing the crate from the workspace to unblock building with cargo.

Reviewed By: quark-zju

Differential Revision: D13969332

fbshipit-source-id: 0299f74e6aa81632ce64005d91fa2c30a32f5b96
2019-02-06 16:42:23 -08:00
Arun Kulshreshtha
bdecba1c92 edenapi: enforce HTTP/2 prior knowledge
Summary: Ensure that Hyper uses HTTP/2, since we'd like to support connection reuse and multiplexing.

Reviewed By: DurhamG

Differential Revision: D13925320

fbshipit-source-id: 0f39e66fe35a0dc95966d16772d1ab8988067c11
2019-02-05 21:22:48 -08:00
Arun Kulshreshtha
5f6998719e edenapi: add static builder() method to get a client Builder
Summary: In Rust it is typically more idiomatic to have a static method on a struct to produce a builder, since this means the builder doesn't need to be explicitly imported to construct a new instance of the struct.
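
The pattern can be sketched as follows. The `Client` and `Builder` types and their fields are hypothetical and do not reflect the edenapi client's actual configuration options:

```rust
/// Hypothetical client with two settings.
#[derive(Debug, PartialEq)]
pub struct Client {
    base_url: String,
    retries: u32,
}

/// Hypothetical builder for `Client`.
#[derive(Default)]
pub struct Builder {
    base_url: Option<String>,
    retries: u32,
}

impl Client {
    /// Static entry point: callers only need `Client` in scope.
    pub fn builder() -> Builder {
        Builder::default()
    }
}

impl Builder {
    pub fn base_url(mut self, url: &str) -> Self {
        self.base_url = Some(url.to_string());
        self
    }

    pub fn retries(mut self, n: u32) -> Self {
        self.retries = n;
        self
    }

    /// Fails if a required setting is missing.
    pub fn build(self) -> Result<Client, String> {
        let base_url = self.base_url.ok_or_else(|| "base_url is required".to_string())?;
        Ok(Client { base_url, retries: self.retries })
    }
}
```

A call site then reads `Client::builder().base_url("https://example.com").retries(3).build()` without importing `Builder`.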

Reviewed By: DurhamG

Differential Revision: D13925323

fbshipit-source-id: c06d5d42ba941dbbb2c619f9470e79fa23f35f68
2019-02-05 21:22:48 -08:00
Arun Kulshreshtha
70aff50986 edenapi: rename mononokeapi to edenapi
Summary: Rename Mononoke API to Eden API, per war room discussion.

Reviewed By: quark-zju

Differential Revision: D13908195

fbshipit-source-id: 94a2fe93f8a89d0c5e9b6a24939cc4760cfaade0
2019-02-05 21:22:48 -08:00
Xavier Deguillard
747cc15fbf revisionstore: remove Rc from UnionDataStore and UnionHistoryStore
Summary:
The Rc is required by the c_api, but there is no longer a reason for
UnionDataStore and UnionHistoryStore to use an Rc, so let's move the Rc into
c_api.

Reviewed By: DurhamG

Differential Revision: D13928332

fbshipit-source-id: a93b54e022d539dc4df9144a8c59e9ffbe3453e0
2019-02-04 09:30:23 -08:00
Xavier Deguillard
f11c7fbf26 revisionstore: remove Clone requirement from UnionStore
Summary:
By specifying the IntoIterator differently, we can avoid the clone requirement.
Since Clone isn't implemented on either DataPack or HistoryPack, this will
simplify the callers a bit

Reviewed By: DurhamG

Differential Revision: D13928274

fbshipit-source-id: f0261c50d73868689ebb3ae226f84d41c4c40925
2019-02-04 09:30:23 -08:00
Xavier Deguillard
82af74b019 revisionstore: add blanket HistoryStore implementation Rc, Arc and Box
Summary: This way, HistoryStore type constraint will work with these types.

Reviewed By: DurhamG

Differential Revision: D13928128

fbshipit-source-id: aaa9f2633166c137dca5fc2b1f44caab92b57a80
2019-02-04 09:30:23 -08:00
Xavier Deguillard
fb2b0f48d3 revisionstore: add blanket DataStore implementation for Rc, Arc and Box
Summary: This way, DataStore type constraint will work with these types.

Reviewed By: DurhamG

Differential Revision: D13928090

fbshipit-source-id: 1567556e3ffea2901acbc754b3bd67491e23056b
2019-02-04 09:30:23 -08:00
Xavier Deguillard
4c4e2a6909 revisionstore: remove RefCell from UnionStore
Summary: The UnionStore doesn't need internal mutability, so let's simplify it.

Reviewed By: DurhamG

Differential Revision: D13928058

fbshipit-source-id: f0ba085ff8401dcc99fc69c3eb6f5e20c071d650
2019-02-04 09:30:23 -08:00
Xavier Deguillard
b1203c00a5 asyncpacks: add AsyncMutableHistoryPack
Summary: This allows writing historypacks from an async context.

Reviewed By: DurhamG

Differential Revision: D13891932

fbshipit-source-id: b90ada657ee33d4736060eeaaf70a9d766b3aa31
2019-02-01 17:12:52 -08:00
Xavier Deguillard
edabec3c30 asyncpacks: add AsyncHistoryPack
Summary: This just reuses the AsyncHistoryStore methods.

Reviewed By: DurhamG

Differential Revision: D13891142

fbshipit-source-id: 9553e9824eebc5eacf6a82f9d0f212a62ec8955f
2019-02-01 17:12:52 -08:00
Xavier Deguillard
aead487e94 asyncpacks: add AsyncHistoryStore
Summary:
Similarly to AsyncDataStore, this is just a blocking wrapper around a
HistoryStore.

Reviewed By: DurhamG

Differential Revision: D13891140

fbshipit-source-id: 76acadfc1849770b47e2400ce8c70f7e32bba4df
2019-02-01 17:12:52 -08:00
Xavier Deguillard
259c19c598 asyncpacks: move the asynchronous wrapper to util.rs
Summary: This will be used to wrap a HistoryStore into an AsyncHistoryStore.

Reviewed By: DurhamG

Differential Revision: D13891139

fbshipit-source-id: 41a0ec740f05268259a654e769ff0909617102ff
2019-02-01 15:30:54 -08:00
Arun Kulshreshtha
d9439db691 mononokeapi: add metadata to datapack
Summary: Add metadata to each delta entry written to the datapack. Since the HTTP API never serves LFS files, and the only flag currently used simply indicates whether a file should use LFS, the flag field is intentionally set to `None`, leaving only the size in the metadata (which, since we're storing full file content, is the same as the content length).

Differential Revision: D13894292

fbshipit-source-id: 36db25adb0c46cd1c7fde841a69d3e6d48d08d06
2019-02-01 01:41:31 -08:00
Arun Kulshreshtha
f02ebcffb7 mononokeapi: support fetching multiple files concurrently
Summary: Give MononokeClient the ability to fetch multiple files concurrently. Right now this functionality is not exposed via the Python bindings, so as far as the Mercurial Python code is concerned, nothing has changed. The multi-get functionality will be used later in the stack.

Differential Revision: D13893575

fbshipit-source-id: c9e514fbeb41bbb37f52f6df3920eb01a66df293
2019-02-01 01:41:31 -08:00
Arun Kulshreshtha
a035c8783a mononokeapi: split out MononokeClientBuilder into separate module
Summary: As `MononokeClient` grows, we're going to add more inherent methods on the struct. To avoid cluttering the `client` module, split out all the builder-related things into a separate module.

Reviewed By: singhsrb

Differential Revision: D13892198

fbshipit-source-id: 42918d8a775d8328cfad8a6ac0365cb336893d8f
2019-02-01 01:41:31 -08:00
Arun Kulshreshtha
1b74af2ace mononokeapi: add ability to fetch a file and write it to a datapack
Summary: Add a new `get_file()` method to `MononokeClient` that fetches Mercurial file content from the API server and writes it to a datapack in the cache. This functionality is exposed via the new `hg debuggetfile` debug command, which takes a filenode and file path and fetches the corresponding file.

Differential Revision: D13889829

fbshipit-source-id: 2b68bf114ee72d641de7a1043cca1975e34cf4e6
2019-02-01 01:41:31 -08:00
Arun Kulshreshtha
5ae0d91378 url-ext: add url-ext crate
Summary:
Crate adding easy conversions between `http::Uri` and `url::Url`.

Rust has two main types for working with URLs: `http::Uri` and `url::Url`.  `http::Uri` comes from the `http` crate, which is supposed to be a set of common types to be used throughout the Rust HTTP ecosystem, to ensure mutual compatibility between different HTTP crates and web frameworks. This is the type that HTTP clients like Hyper expect when specifying URLs.

Unfortunately, `http::Uri` is a very simple type that does not expose any means of mutating or otherwise manipulating the URL. It can only parse URLs from strings, forcing the users to construct URLs via error-prone string concatenation.

In contrast, the `url::Url` comes from the `rust-url` crate from the Servo project. This type does support easily constructing and manipulating URLs, making it very useful for assembling a URL from components.

The only way to convert between the two types is to first convert back to a string, and then re-parse as the desired type. Several issues [have](https://github.com/hyperium/hyper/issues/1219) [been](https://github.com/hyperium/hyper/issues/1102) [raised](https://github.com/hyperium/hyper/issues/1219) about this upstream, but there has been no consensus or action as of yet. To get around the problem for now, this crate adds convenience methods to perform the conversions.

Reviewed By: DurhamG

Differential Revision: D13887403

fbshipit-source-id: ecfaf3ea9d884621493b0fe44a6b5658d10108b4
2019-01-30 18:30:49 -08:00
Jun Wu
9dc21f8d0b codemod: import from the edenscm package
Summary:
D13853115 adds `edenscm/` to `sys.path` and code still uses `import mercurial`.
That has nasty problems if both `import mercurial` and
`import edenscm.mercurial` are used, because Python would think `mercurial.foo`
and `edenscm.mercurial.foo` are different modules so code like
`try: ... except mercurial.error.Foo: ...`, or `isinstance(x, mercurial.foo.Bar)`
would fail to handle the `edenscm.mercurial` version. There are also some
module-level states (ex. `extensions._extensions`) that would cause trouble if
they have multiple versions in a single process.

Change imports to use the `edenscm` package so that, ideally, `mercurial` is no longer
imported at all. Add checks in extensions.py to catch unexpected extensions
importing modules from the old (wrong) locations when running tests.

Reviewed By: phillco

Differential Revision: D13868981

fbshipit-source-id: f4e2513766957fd81d85407994f7521a08e4de48
2019-01-29 17:25:32 -08:00
Xavier Deguillard
fcd2fb9642 asyncpacks: fix compilation warnings
Summary: Some of the revisionstore imports were unused.

Reviewed By: kulshrax

Differential Revision: D13865074

fbshipit-source-id: 79c7c2ba869f2e1d72fa06aac70a4b027367c831
2019-01-29 14:10:31 -08:00
Arun Kulshreshtha
e80ea448d2 revisionstore: reexport Key at top level
Summary: title

Differential Revision: D13858151

fbshipit-source-id: 9f188c2a21382de65eb7febc45a46e10763771b3
2019-01-29 11:45:23 -08:00
Arun Kulshreshtha
872ecdaf30 revisionstore: derive Serialize and Deserialize for Key
Summary: Similar to previous diff in this stack, make this type serializable so we can send it as part of an HTTP request.

Reviewed By: singhsrb

Differential Revision: D13858440

fbshipit-source-id: 9173a3e76bcfa6a6600d30ada39d65475f95bc5e
2019-01-29 04:44:16 -08:00
Arun Kulshreshtha
a7a1abae63 types: derive Serialize and Deserialize for Node
Summary: Make this type serializable so it can be sent as part of an HTTP request. By using Serde, we can easily support a variety of serialization formats without code changes.

Reviewed By: singhsrb

Differential Revision: D13858443

fbshipit-source-id: b6c83f38eaadbb2a28be6d66faf6a3610ede970f
2019-01-29 04:44:15 -08:00
Arun Kulshreshtha
2540614d3b types: convert to Rust 2018
Summary: title

Reviewed By: singhsrb

Differential Revision: D13858439

fbshipit-source-id: d5b0a1f0870abab9948fccba19c51a72d8a09bfc
2019-01-29 04:44:15 -08:00
Durham Goode
725eb4da33 windows: fix the build
Summary:
The conditional if statement did not prevent the logic inside the
condition from being compiled, which in this case fails on windows. Instead of
using an if, let's just define two functions and conditionally compile the
functions.
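
The difference can be sketched as follows. The helper name `file_mode` is hypothetical and not taken from the diff; the point is only that `if cfg!(unix) { ... }` still type-checks both branches on every platform, while `#[cfg(...)]` on an item removes it from compilation entirely.

```rust
// Hypothetical helper, not the code from this commit.
// A Unix-only API inside an `if cfg!(unix)` branch would still be compiled on
// Windows and fail there. Defining the function twice behind `#[cfg(...)]`
// attributes means only one definition is ever compiled.

#[cfg(unix)]
fn file_mode(meta: &std::fs::Metadata) -> Option<u32> {
    use std::os::unix::fs::PermissionsExt; // only exists on Unix targets
    Some(meta.permissions().mode())
}

#[cfg(not(unix))]
fn file_mode(_meta: &std::fs::Metadata) -> Option<u32> {
    None // no Unix mode bits on this platform
}
```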

Reviewed By: ikostia

Differential Revision: D13855560

fbshipit-source-id: ac417e6bd8fb272106fe8f3b9a8b7db57214ad88
2019-01-29 02:41:38 -08:00
Jun Wu
c12e300bb8 codemod: move Python packages to edenscm
Summary:
Move top-level Python packages `mercurial`, `hgext` and `hgdemandimport` to
a new top-level package `edenscm`. This allows the Python packages provided by
the upstream Mercurial to be installed side-by-side.

To maintain compatibility, `edenscm/` gets added to `sys.path` in
`mercurial/__init__.py`.

Reviewed By: phillco, ikostia

Differential Revision: D13853115

fbshipit-source-id: b296b0673dc54c61ef6a591ebc687057ff53b22e
2019-01-28 18:35:41 -08:00
Xavier Deguillard
79bdddbe91 asyncpacks: introduce AsyncDataPack
Summary:
As a last step towards getting rid of loosefiles, memcache will soon be changed
to produce packfiles. One of the missing pieces needed to achieve this is the
ability to read and write packfiles asynchronously, as memcache is purely async.

As a first step, we can wrap the packfile into a blocking context.

Reviewed By: DurhamG

Differential Revision: D13806738

fbshipit-source-id: 2211c2a984a453edbb1647830f7f5fb399a03023
2019-01-28 10:33:23 -08:00
Xavier Deguillard
1090da2436 asyncpacks: introduce AsyncMutableDataPack
Summary:
As a last step towards getting rid of loosefiles, memcache will soon be changed
to produce packfiles. One of the missing pieces needed to achieve this is the
ability to read and write packfiles asynchronously, as memcache is purely async.

As a first step, we can wrap the packfile into a blocking context.

Reviewed By: DurhamG

Differential Revision: D13804184

fbshipit-source-id: 01fcb57af1558feca662b1070969f553c479871a
2019-01-28 10:33:23 -08:00
Xavier Deguillard
5485ecc185 revisionstore: proper permissions for pack files
Summary:
The tempfile Rust crate opens the file with RW permissions for the user only,
but once written out to disk, the file needs to be readable by everyone.
Unfortunately, Rust doesn't have a portable way of doing this, so we have to
resort to using `if cfg!(unix)` conditions for doing this.
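
A minimal sketch of a platform-dependent permission fix-up, assuming a hypothetical helper `make_world_readable` and an assumed mode of `0o444`. It uses `#[cfg]` attributes rather than the `if cfg!(unix)` form mentioned above, and it is not the crate's actual code.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical helper; the 0o444 mode is an assumption for illustration.
#[cfg(unix)]
fn make_world_readable(path: &Path) -> io::Result<()> {
    use std::os::unix::fs::PermissionsExt;
    // Read-only for owner, group and others.
    fs::set_permissions(path, fs::Permissions::from_mode(0o444))
}

// Other platforms have no mode bits, so fall back to the portable
// read-only flag.
#[cfg(not(unix))]
fn make_world_readable(path: &Path) -> io::Result<()> {
    let mut perms = fs::metadata(path)?.permissions();
    perms.set_readonly(true);
    fs::set_permissions(path, perms)
}
```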

Reviewed By: DurhamG

Differential Revision: D13703406

fbshipit-source-id: 688bc679b5c1a7943ceab723c1f649d555b61a7a
2019-01-25 09:42:39 -08:00
Xavier Deguillard
da0999c2f8 revisionstore: move mutable packs close logic to a MutablePack trait
Summary:
This allows de-duplicating the logic for setting proper permissions on the
files. Most of the changes are code movement and rustfmt formatting.

Reviewed By: DurhamG

Differential Revision: D13703392

fbshipit-source-id: 28be85ef2d4b440202cf4885e50e62ac3c41f774
2019-01-25 09:42:39 -08:00
Arun Kulshreshtha
245c21b8ee mononokeapi: allow cert and key to come from separate files
Summary: Allow the credentials for TLS mutual authentication (namely, the client certificate and private key) to come from separate PEM files. At Facebook, these are usually stored in the same file, but Mercurial's standard TLS configuration options allow these to be configured separately. As such, in order to support the standard options (which will happen in a later diff), provide the ability to handle separate files, but for now just pass the same path for both from Python to Rust.

Reviewed By: markbt

Differential Revision: D13791525

fbshipit-source-id: 556d99d77a4273b9b0bd91cac8940da136088e45
2019-01-24 12:32:38 -08:00
Andrey Malevich
91ea434837 Revert D13575719: [tp2] Update zstd to 1.3.8 as 1.3.x
Differential Revision:
D13575719

Original commit changeset: eb7961078ad1

fbshipit-source-id: 844414e83f8a05df89a21dc1c2a6b9e60bad5dcc
2019-01-23 18:19:13 -08:00
Nick Terrell
64cde69334 Update zstd to 1.3.8 as 1.3.x
Summary: Update zstd in TP2 to zstd-1.3.8.

Reviewed By: pixelb

Differential Revision: D13575719

fbshipit-source-id: eb7961078ad161eb633b08b7e80e87f1c63ccca5
2019-01-23 11:16:22 -08:00
Arun Kulshreshtha
3ecdc75a90 mononokeapi: add builder for MononokeClient
Summary: Use a builder struct rather than a constructor function to configure and initialize new `MononokeClient` instances. Doing it this way is helpful because later in this stack, we'll need to pass a lot of additional configuration to `MononokeClient`; adding all of these items as parameters to the constructor quickly becomes unwieldy. Using a builder keeps the number of parameters in check.
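
The builder pattern described here might look roughly like the sketch below. `ClientBuilder`, `Client` and their fields are hypothetical stand-ins; the real `MononokeClient` builder is not shown in this log.

```rust
// Hypothetical names and fields, for illustration only.
#[derive(Debug, Default)]
struct ClientBuilder {
    base_url: Option<String>,
    creds_path: Option<String>,
}

#[derive(Debug)]
struct Client {
    base_url: String,
    creds_path: Option<String>,
}

impl ClientBuilder {
    fn new() -> Self {
        Self::default()
    }

    // Each setter consumes and returns the builder so calls can be chained.
    fn base_url(mut self, url: &str) -> Self {
        self.base_url = Some(url.to_string());
        self
    }

    fn creds_path(mut self, path: &str) -> Self {
        self.creds_path = Some(path.to_string());
        self
    }

    // Validation happens once, when the client is built.
    fn build(self) -> Result<Client, String> {
        let base_url = self
            .base_url
            .ok_or_else(|| "base URL is required".to_string())?;
        Ok(Client {
            base_url,
            creds_path: self.creds_path,
        })
    }
}
```

New options become new setter methods, so existing call sites do not change when configuration grows.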

Differential Revision: D13780408

fbshipit-source-id: bfc43ecbe474d5285ae87d4df9cce244a7ff391d
2019-01-23 10:37:17 -08:00
Arun Kulshreshtha
6e7d80393f mononokeapi: add MononokeApi trait
Summary:
Split up the functionality in `MononokeClient` by moving all of the Mononoke API methods to their own separate trait. This maintains a distinction between functionality that is part of the API vs methods for setting up and configuring the client.

Originally, I had tried to avoid using a trait here because of limitations on trait methods (for example, we can't use `impl Trait` for return types). In practice, I don't think this limitation will be an issue since the API exposed by the client needs to be synchronous (since it will be called by FFI bindings to Python), and as such, there shouldn't be any complex Future return types in the API. (The client will still use async code internally, but the external API will be synchronous.)

Differential Revision: D13780089

fbshipit-source-id: 17e80f549d6ac7c41c60b2b8389eb1760531883e
2019-01-23 10:37:17 -08:00
Arun Kulshreshtha
c7b9d822a4 revisionstore: use Vec<u8> instead of boxed slice for key names
Summary: Boxed slices are difficult to use in practice, so use `Vec<u8>` instead. (No need for `Bytes` here since there is no reference counting required.)

Reviewed By: DurhamG

Differential Revision: D13770055

fbshipit-source-id: 78f48ac32a4da9c105bf05eb44889c1f492721a8
2019-01-22 16:02:13 -08:00
Arun Kulshreshtha
a642954e27 revisionstore: use Bytes instead of Rc<Box<[u8]>> in loosefiles module
Summary: Use `Bytes` instead of `Rc<Box<[u8]>>` since the former is a nicer type to represent a reference counted heap allocated byte buffer. (Note that `Rc<Box<[u8]>>` should have originally been `Rc<[u8]>` -- the former introduces an unnecessary allocation and layer of indirection.)
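
The note about `Rc<Box<[u8]>>` versus `Rc<[u8]>` can be illustrated with the standard library alone. The function names below are hypothetical, and `Bytes` itself (an external crate) is not used here.

```rust
use std::rc::Rc;

// Two heap allocations: one for the boxed slice, one for the Rc that holds
// the (fat) pointer to it. Readers go through both.
fn nested(data: Vec<u8>) -> Rc<Box<[u8]>> {
    Rc::new(data.into_boxed_slice())
}

// One allocation: the bytes live next to the reference counts.
fn shared_bytes(data: Vec<u8>) -> Rc<[u8]> {
    Rc::from(data)
}
```

Cloning either value only bumps a reference count; the difference is the extra pointer hop and allocation in the nested form.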

Differential Revision: D13769306

fbshipit-source-id: 5f3e788426e28c7e9ccc478f993c717b23663f56
2019-01-22 14:03:17 -08:00
Arun Kulshreshtha
d3839ffb07 revisionstore: use Bytes instead of Box<[u8]> in Delta and DataEntry
Summary: Boxed byte slices (e.g., `Box<[u8]>`, `Rc<[u8]>`) are not very ergonomic to use and are somewhat unusual in Rust code. Use the more common and easier to use `Bytes` type instead. Since this type supports shallow, reference-counted copies, there shouldn't be any new O(n) copying behavior compared to `Rc<[u8]>`.

Reviewed By: markbt

Differential Revision: D13754730

fbshipit-source-id: d5fbc8e39c84c56d30174f4bb194ee21a14bf944
2019-01-22 14:03:17 -08:00
Arun Kulshreshtha
6a00abcfb0 lz4-pyframe: use failure::Fallible
Summary: Use `failure::Fallible<T>` in place of `Result<T, failure::Error>`.

Reviewed By: singhsrb

Differential Revision: D13754688

fbshipit-source-id: cfbe418f5213884816d4837d1077cd90a17359b6
2019-01-21 18:00:57 -08:00
Arun Kulshreshtha
7c93df4d3b l4-pyframe: migrate to rust 2018
Summary: Migrate crate to Rust 2018.

Reviewed By: singhsrb

Differential Revision: D13754665

fbshipit-source-id: d2ce3994874afa1149229d481084ea66b5e312f8
2019-01-21 18:00:57 -08:00
Arun Kulshreshtha
96fee34104 revisionstore: migrate to rust 2018
Summary: Migrate crate to Rust 2018 by running `cargo fix --edition --edition-idioms`, removing `extern crate` declarations, and fixing all new warnings.

Reviewed By: singhsrb

Differential Revision: D13754392

fbshipit-source-id: 3343a07e7d8b332e15475084a8a8ddff06f6d13b
2019-01-21 18:00:57 -08:00
Arun Kulshreshtha
aefe1ba8f8 revisionstore: regroup imports
Summary:
Previously, `use` statements were inconsistently and arbitrarily grouped. This diff groups them in the following order:

- 3rd party crates from crates.io
- local crates
- std library imports (collapsed into a single multiline `use` statement)
- modules within current crate

This new ordering ensures that upon migration to Rust 2018, all imports from within the current crate will be grouped together with the `crate::` prefix.

Reviewed By: singhsrb

Differential Revision: D13754393

fbshipit-source-id: e774c09e0547066afa5f797c1a9c2e5ec4190834
2019-01-21 18:00:57 -08:00
Arun Kulshreshtha
37a74966a2 revisionstore: rustfmt
Summary: Run the latest version of rustfmt over the code to ensure consistent style.

Reviewed By: singhsrb

Differential Revision: D13754394

fbshipit-source-id: 6cf5937bcb642530bdf41aaf83399366a9ba3c9a
2019-01-21 18:00:57 -08:00
Arun Kulshreshtha
bfe737d1fb revisionstore: fix dead code warnings
Summary: There were some warnings about unused private fields in various structs in this crate. Add `#[allow(dead_code)]` as needed to suppress these warnings.

Reviewed By: singhsrb

Differential Revision: D13754234

fbshipit-source-id: ca95a2afbfc67ddb66e7c7436c81cde0fa59f06c
2019-01-21 18:00:57 -08:00
Mark Thomas
a1a2eafd95 revisionstore: use Fallible
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.

Differential Revision: D13732298

fbshipit-source-id: 2577bc4c34da5b7a88ae2703f9b898bc2a83b816
2019-01-21 03:37:19 -08:00
Arun Kulshreshtha
eb86dabbc1 mononokeapi: migrate to Rust 2018
Summary: Migrate this crate to Rust 2018 edition.

Reviewed By: phillco

Differential Revision: D13742720

fbshipit-source-id: 0a2f6a713cff43cf2814cf41df4ac910b9901e5c
2019-01-18 19:29:41 -08:00
Arun Kulshreshtha
c067536fae mononokeapi: use url::Url instead of http::Uri
Summary: The canonical URL type in Rust, `http::Uri`, does not support manipulating URLs easily (e.g., concatenating path components). As such, switch to using the `Url` type from the `url` crate, which does support URL manipulation, and convert to `http::Uri` before passing the resulting URL to Hyper.

Reviewed By: phillco

Differential Revision: D13738139

fbshipit-source-id: c7de67f1596ebc1bdde89d3fe87086f49c32b5db
2019-01-18 15:47:17 -08:00
Xavier Deguillard
33688947c6 revisionstore: sort pack files in list_packs
Summary:
Directory listing order differs between OSes, and due to the current repack
implementation, this directly affects the order in which the packfiles are added
to the new one. Since the resulting packfile name depends on the hash of its
content, the name was influenced by the directory order.

By sorting the files in list_packs, the packfile name will be independent of
the directory listing and thus be the same for all the OSes.
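
A rough sketch of the idea, using a hypothetical `list_packs_sorted` function with an assumed extension filter; the real `list_packs` signature is not shown in this log.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Hypothetical stand-in for `list_packs`; the extension filter is an
// assumption made for this example.
fn list_packs_sorted(dir: &Path, extension: &str) -> io::Result<Vec<PathBuf>> {
    let mut packs = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.extension().and_then(|e| e.to_str()) == Some(extension) {
            packs.push(path);
        }
    }
    // `read_dir` makes no ordering guarantee, so sort to get the same
    // sequence on every OS and filesystem.
    packs.sort();
    Ok(packs)
}
```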

Reviewed By: singhsrb

Differential Revision: D13700935

fbshipit-source-id: 01e055a0c1bcf7fb2dc4faf614dfb20cd4499017
2019-01-16 15:18:24 -08:00
Xavier Deguillard
87cf0f533b revisionstore: Add a basic rust incremental repack.
Summary: For now, combine all files smaller than 100MB that accumulate to less than 4GB.
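
One way to read this selection rule is sketched below. The function name, the greedy in-order walk, the binary (2^20 / 2^30) interpretation of MB and GB, and stopping at the first pack that would reach the total limit are all assumptions; the commit message does not specify them.

```rust
// Assumed limits: "100MB" and "4GB" interpreted as binary units.
const MAX_PACK_SIZE: u64 = 100 * 1024 * 1024;
const MAX_TOTAL_SIZE: u64 = 4 * 1024 * 1024 * 1024;

// Hypothetical selection: takes (name, size in bytes) pairs and returns the
// names of the packs that would be combined.
fn select_for_repack(packs: &[(&str, u64)]) -> Vec<String> {
    let mut total = 0u64;
    let mut selected = Vec::new();
    for (name, size) in packs {
        if *size >= MAX_PACK_SIZE {
            continue; // large packs are left alone
        }
        if total + size >= MAX_TOTAL_SIZE {
            break; // the combined pack must stay under the total limit
        }
        total += size;
        selected.push(name.to_string());
    }
    selected
}
```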

Reviewed By: DurhamG

Differential Revision: D13603760

fbshipit-source-id: 3fa74f1ced3d3ccd463af8f187ef5e0254e1820b
2019-01-16 09:47:09 -08:00
Xavier Deguillard
2525a6e9ee revisionstore: Use PackWriter to write to {data,history}packs.
Summary: Use the newly introduced PackWriter to write the {data,history}packs.

Reviewed By: markbt

Differential Revision: D13603759

fbshipit-source-id: 528a6af7c4ac3321aeec0559805de12114224cfd
2019-01-16 09:47:09 -08:00
Xavier Deguillard
e6a60b68f3 revisionstore: Add an efficient pack writer.
Summary:
The packfiles are currently being written via an unbuffered file. This is
inefficient as every write to the file results in a write(2) syscall.
By buffering these writes we can reduce the number of syscalls and thus
increase the throughput of pack writing operations.
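
The general technique can be sketched with `std::io::BufWriter`. The helper `write_entries` is hypothetical and is not the `PackWriter` introduced by this commit.

```rust
use std::io::{self, BufWriter, Write};

// Hypothetical helper: many small writes go into an in-memory buffer and
// reach the underlying sink in larger chunks.
fn write_entries<W: Write>(sink: &mut W, entries: &[&[u8]]) -> io::Result<()> {
    let mut writer = BufWriter::new(sink);
    for entry in entries {
        writer.write_all(entry)?;
    }
    // Flush explicitly so that write errors are reported here instead of
    // being silently dropped when the BufWriter goes out of scope.
    writer.flush()
}
```

With a `File` as the sink, each `write_all` on the raw file would typically be its own syscall, whereas the buffered version issues one only when the buffer fills or is flushed.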

Reviewed By: markbt

Differential Revision: D13603758

fbshipit-source-id: 649186a852d427a1473695b1d32cc9cd87a74a75
2019-01-16 09:47:09 -08:00
Mark Thomas
c6c99b4777 configparser: update pest to 2.1.0
Summary:
Update pest to 2.1.0.

This version has a new behaviour for parser error messages: the line feed at
the end of the line is shown in the error output.

Reviewed By: wez

Differential Revision: D13671099

fbshipit-source-id: b8d1142a44a56a0b21b3b72cf027f3f8a30f421e
2019-01-16 03:52:09 -08:00
Arun Kulshreshtha
28e20c5997 Reexport public types from public submodules
Summary:
The revisionstore crate currently consists of several public submodules,
each exposing several public types. The APIs exposed by each of the modules
require using types from the other modules. As such, users of this crate are
forced to have complex nested imports to use any of its functionality.

This diff helps ease this problem by reexporting the public types exposed from
each of the public submodules at the top level, thereby allowing crate users to
`use` all of the required types without needing nested imports.

Reviewed By: singhsrb

Differential Revision: D13686913

fbshipit-source-id: 9fb3cce8783787aa5f3f974c7168afada5952712
2019-01-15 21:20:03 -08:00
Xavier Deguillard
e6135fa88e revisionstore: Use get_missing instead of get_delta in repack.
Summary:
The latter tries to read from the disk, while the former is purely in memory and
thus more efficient.

Reviewed By: DurhamG, markbt

Differential Revision: D13603757

fbshipit-source-id: 5fd120ba4065d6a65cb2982db9ab81db3ea26524
2019-01-15 17:02:38 -08:00
Mark Thomas
3b9eb801e1 types: use Fallible
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.

Differential Revision: D13657313

fbshipit-source-id: ae249bc15037cc2be019ce7ce8a440c153aa31cc
2019-01-15 03:50:47 -08:00
Mark Thomas
3570402d79 watchman_client: use Fallible
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.

Differential Revision: D13657312

fbshipit-source-id: 55134ee93f1f3aaaeefe5644a4a1f2285603bc1c
2019-01-15 03:50:47 -08:00
Mark Thomas
7f1258f091 commitcloudsubscriber: use Fallible
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.

Differential Revision: D13657314

fbshipit-source-id: f1a379089972f7f0066c49ddedf606d36b7ac260
2019-01-15 03:50:47 -08:00
Mark Thomas
d3709fde5b mononokeapi: use Fallible
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.

Differential Revision: D13657310

fbshipit-source-id: cae73fc239a6ad30bb6ef56a664d1ef5a2a19b5f
2019-01-15 03:50:47 -08:00
Xavier Deguillard
f170cceea2 revisionstore: Repackable::delete now takes the ownership of self.
Summary:
On some platforms, removing a file can fail if it's still mapped or opened. In
mercurial, this can happen during repack as the datapacks are removed while
still being mapped.
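
Why taking `self` by value helps can be sketched as below. The `Pack` struct is a hypothetical stand-in that holds a plain file handle rather than a memory map; the same reasoning applies to a mapping.

```rust
use std::fs::{self, File};
use std::io;
use std::path::{Path, PathBuf};

// Hypothetical stand-in for a pack type.
struct Pack {
    path: PathBuf,
    file: File,
}

impl Pack {
    fn open(path: &Path) -> io::Result<Self> {
        Ok(Pack {
            path: path.to_path_buf(),
            file: File::open(path)?,
        })
    }

    // Consuming `self` lets the method release the handle before removing
    // the file, and the compiler guarantees no one can use the pack after.
    fn delete(self) -> io::Result<()> {
        let Pack { path, file } = self;
        drop(file); // close the handle first
        fs::remove_file(path)
    }
}
```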

Reviewed By: DurhamG

Differential Revision: D13615938

fbshipit-source-id: fdc1ff9370e2767e52ee1828552f4598105f784f
2019-01-14 21:14:13 -08:00
Xavier Deguillard
da3dd2319f revisionstore: remove repacked pack files
Summary:
After repacking the data/history packs, we need to clean up the
repacked files. This was an omission from D13363853.

Reviewed By: markbt

Differential Revision: D13577592

fbshipit-source-id: 36e7d5b8e86affe47cdd10d33a769969f02b8a62
2019-01-11 16:54:15 -08:00
Xavier Deguillard
ce16778656 remotefilelog: set proper file permissions on closed mutable packs.
Summary:
The Python version of the mutable packs sets the permissions to read-only after
writing them, while the Rust version keeps them writable. Let's make the Rust
one consistent with it.

Reviewed By: markbt

Differential Revision: D13573572

fbshipit-source-id: 61256994562aa09058a88a7935c16dfd7ddf9d18
2019-01-11 16:54:15 -08:00
Mark Thomas
98417b1ffb configparser: fix warning about unused Result
Summary:
Use of `write!` requires checking for errors, however in this case, there is no
need to use `write!`, as we just want the error as a string.
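
The pattern can be sketched as follows, with a hypothetical function name. `write!` into a `String` returns a `fmt::Result` that the compiler wants checked, whereas `to_string()` (available for every `Display` type) simply returns the text.

```rust
use std::fmt::Display;

// Hypothetical helper. The form that triggers the warning would be roughly:
//     let mut s = String::new();
//     write!(s, "{}", err);   // unused `Result` that must be used
// Converting directly avoids the `Result` altogether.
fn error_to_string<E: Display>(err: &E) -> String {
    err.to_string()
}
```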

Reviewed By: ikostia

Differential Revision: D13596497

fbshipit-source-id: 5892025344936936188cf3a8ca227e71eff57d55
2019-01-08 06:19:55 -08:00
Jun Wu
f6158659f8 configparser: use hardcoded system config path on Windows
Summary:
When I was debugging an eden importer issue with Puneet, we saw errors caused
by important extensions (ex. remotefilelog, lz4revlog) not being loaded.  It
turned out that configparser was checking the "exe dir" to decide where to
load "system configs". For example, if we run:

  C:\open\fbsource\fbcode\scm\hg\build\pythonMSVC2015\python.exe eden_import_helper.py

The "exe dir" is "C:\open\fbsource\fbcode\scm\hg\build", and system config is
not there.

Instead of copying "mercurial.ini" to every possible "exe dir", this diff just
switches to a hard-coded system config path. It's now consistent with what we
do on POSIX systems.

The logic to copy "mercurial.ini" to "C:\open\fbsource\fbcode\scm\hg" or
"C:\tools\hg" becomes unnecessary and is removed.

Reviewed By: singhsrb

Differential Revision: D13542939

fbshipit-source-id: 5fb50d8e42d36ec6da28af29de89966628fe5549
2018-12-22 01:53:03 -08:00
Saurabh Singh
b193e23dd2 test-check-fix-code: unbreak test by fixing copyrights
Summary:
`test-check-fix-code.t` was failing due to copyright header missing
from certain files. This commit fixes the files by running

```
contrib/fix-code.py FILE
```

as suggested in the failure message.

Reviewed By: DurhamG

Differential Revision: D13538506

fbshipit-source-id: d8063c9a0e665377a9976abeccb68fbef6781950
2018-12-21 10:03:26 -08:00
Jun Wu
22e9000fc9 lz4-pyframe: add compresshc
Summary:
Unfortunately required symbols are not exposed by lz4-sys. So we just declare
them ourselves.

Make sure it compresses better:

  In [1]: c=open('/bin/bash').read();
  In [2]: from mercurial.rust import lz4
  In [3]: len(lz4.compress(c))
  Out[3]: 762906
  In [4]: len(lz4.compresshc(c))
  Out[4]: 626970

While it's much slower for larger data (and compresshc is slower than pylz4):

  Benchmarking (easy to compress data, 20MB)...
            pylz4.compress: 10328.03 MB/s
       rustlz4.compress_py:  9373.84 MB/s
          pylz4.compressHC:  1666.80 MB/s
     rustlz4.compresshc_py:  8298.57 MB/s
          pylz4.decompress:  3953.03 MB/s
     rustlz4.decompress_py:  3935.57 MB/s
  Benchmarking (hard to compress data, 0.2MB)...
            pylz4.compress:  4357.88 MB/s
       rustlz4.compress_py:  4193.34 MB/s
          pylz4.compressHC:  3740.40 MB/s
     rustlz4.compresshc_py:  2730.71 MB/s
          pylz4.decompress:  5600.94 MB/s
     rustlz4.decompress_py:  5362.96 MB/s
  Benchmarking (hard to compress data, 20MB)...
            pylz4.compress:  5156.72 MB/s
       rustlz4.compress_py:  5447.00 MB/s
          pylz4.compressHC:    33.70 MB/s
     rustlz4.compresshc_py:    22.25 MB/s
          pylz4.decompress:  2375.42 MB/s
     rustlz4.decompress_py:  5755.46 MB/s

Note python-lz4 was using an ancient version of lz4. So there could be differences.

Reviewed By: DurhamG

Differential Revision: D13528200

fbshipit-source-id: 6be1c1dd71f57d40dcffcc8d212d40a853583254
2018-12-20 17:54:22 -08:00
Jun Wu
4f24bffdde cpython-ext: move pybuf to cpython-ext
Summary:
The `pybuf` provides a way to read `bytes`, `bytearray`, some `buffer` types in
a zero-copy way. The main benefit is to use same code to support different
input types. It's copied to a couple of places. Let's move it to `cpython-ext`.

Reviewed By: DurhamG

Differential Revision: D13516206

fbshipit-source-id: f58881c4bfe651a6fdb84cf317a74c3c8d7a4961
2018-12-20 17:54:22 -08:00
Jun Wu
f23c6bc7e3 cpython-ext: add a way to pre-allocate PyBytes
Summary: Make it possible to write content directly into a PyBytes buffer.

Reviewed By: DurhamG

Differential Revision: D13528202

fbshipit-source-id: 8c0a4ed030439a8dc40cdfbd72b1f6734a8b2036
2018-12-20 17:54:22 -08:00
Jun Wu
6e88ac4794 lz4-pyframe: provide decompress_into API
Summary:
This allows decompressing into a pre-allocated buffer. After some experiments,
it seems `bytearray` will just break too many things, ex:

- bytearray is not hashable
- bytearray[index] returns an int
- a = bytearray('x'); b = a; b += '3' # will mutate 'a'
- ''.join([bytearray('')]) will raise TypeError

Therefore we have to use zero-copy `bytes` instead, which is less elegant. But
this API change is a step forward.

Reviewed By: DurhamG

Differential Revision: D13528201

fbshipit-source-id: 1cfaf5d55efdc0d6c0df85df9960fe9682028b08
2018-12-20 17:54:22 -08:00
Jun Wu
7831e2a4ce cpython-ext: add ways to zero-copy Vec<u8> into a Python object
Summary:
I need to convert `Vec<u8>` to a Python object in a zero-copy way for rustlz4
performance.

Assuming Python and Rust use the same memory allocator, it's possible to transfer
the control of a malloc-ed pointer from Rust to Python. Use this to implement
zero-copy. PyByteArrayObject is chosen because its struct contains such a pointer.
PyBytes cannot be used as it embeds the bytes, without using a pointer.

Sadly there are no CPython APIs to do this job. So we have to write to the raw
structures. That means the code will crash if python is replaced by
python-debug (due to the Python object header change). However, that seems less
of an issue given the performance wins. If python-debug does become a problem,
we can try vendoring libpython directly.

I didn't implement a feature-rich `PyByteArray` Rust object. It's not easy to
do so outside the cpython crate. Most helper macros to declare types cannot be
reused, because they refer to `::python`, which is not available in the current
crate.

Reviewed By: DurhamG

Differential Revision: D13516209

fbshipit-source-id: 9aa089b309beb71d4d21f6c63fcb97dbc798b5f8
2018-12-20 17:54:22 -08:00
Jun Wu
35c85018cd lz4-pyframe: add a benchmark
Summary:
This gives some sense about how fast it is.

Background: I was trying to get rid of python-lz4, by exposing this to Python.
However, I noticed it's 10x slower than python-lz4. Therefore I added some
benchmark here to test if it's the wrapper or the Rust lz4 code.

It does not seem to be this crate:

```
  # Pure Rust
  compress (100M)                77.170 ms
  decompress (~100M)             67.043 ms

  # python-lz4
  In [1]: import lz4, os
  In [2]: b=os.urandom(100000000);
  In [3]: %timeit lz4.compress(b)
  10 loops, best of 3: 87.4 ms per loop
```

Reviewed By: DurhamG

Differential Revision: D13516205

fbshipit-source-id: f55f94bbecc3b49667ed12174f7000b1aa29e7c4
2018-12-20 17:54:21 -08:00
Jun Wu
b3893b3d3c indexedlog: add methods on Log to do prefix lookups
Summary:
This exposes the underlying lookup functions from `Index`.

Alternatively we can allow access to `Index` and provide an `iter_started_from`
method on `Log` which takes a raw offset. I have been trying to avoid exposing
raw offsets in public interfaces, as they would change after `flush()` and cause
problems.

Reviewed By: markbt

Differential Revision: D13498303

fbshipit-source-id: 8b00a2a36a9383e3edb6fd7495a005bc985fd461
2018-12-20 15:50:55 -08:00
Jun Wu
3237b77e4c indexedlog: add APIs to lookup by prefix
Summary:
This is the missing API before `indexedlog::Index` can fit in the
`changelog.partialmatch` case. It's actually more flexible as it can provide
some example commit hashes while the existing revlog.c or radixbuf
implementations just error out saying "ambiguous prefix".

It can also be "abused" for the semantics of sorted "sub-keys": by replacing
"key" with "key + subkey" when inserting into the index, looking up using "key"
returns a lazy result list (`PrefixIter`) sorted by "subkey". Note:
the radix tree is NOT efficient (both in time and space) when there are common
prefixes, so this use case needs care.

Reviewed By: markbt

Differential Revision: D13498301

fbshipit-source-id: 637856ebd761734d68b20c15866424b1d4518ad6
2018-12-20 15:50:55 -08:00
Jun Wu
562b7a1704 indexedlog: add a function to convert base16 to base256
Summary: This will be used in prefix lookups.
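
A minimal sketch of such a conversion, assuming the input is a slice of base16 digits (values 0 to 15, one per byte) and that an odd trailing digit fills the high nibble of the last byte. The function name, signature and odd-length behavior are assumptions, not the crate's API.

```rust
// Hypothetical sketch: pack pairs of base16 digits into base256 bytes.
// Each input byte is expected to be a digit value in 0..=15, not an ASCII
// character.
fn base16_to_base256(digits: &[u8]) -> Vec<u8> {
    digits
        .chunks(2)
        .map(|pair| {
            let high = pair[0] << 4;
            // An odd trailing digit has no partner; pad the low nibble with 0.
            let low = pair.get(1).copied().unwrap_or(0);
            high | low
        })
        .collect()
}
```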

Reviewed By: markbt

Differential Revision: D13498300

fbshipit-source-id: 3db7a21d6f35a18699d9dc3a0eca71a5410e0e61
2018-12-20 15:50:55 -08:00
Jun Wu
443a8f33b3 indexedlog: move binary indexedlog_dump out
Summary:
Keeping the binary in this crate duplicates testing: `cargo test` runs tests on 2 entry points,
lib.rs and indexedlog_dump.rs.  Move it to a separate crate to solve the issue.

Reviewed By: markbt

Differential Revision: D13498266

fbshipit-source-id: 8abf07c1272dfa825ec7701fd8ea9e0d1310ec5f
2018-12-18 08:17:21 -08:00
Jun Wu
61b1a5f475 indexedlog: fix rustc warnings
Summary: `write!` result needs to be used.

Reviewed By: markbt

Differential Revision: D13471967

fbshipit-source-id: d48752bcac05dd33b112679d7faf990eb8ddd651
2018-12-17 12:10:52 -08:00
Xavier Deguillard
79164e920c revisionstore: replace rand::chacha with rand_chacha
Summary: The former is deprecated and thus compiling revisionstore shows many warnings.

Reviewed By: markbt

Differential Revision: D13379278

fbshipit-source-id: d4b4662a1ad00997de4c46274deaf22f48487328
2018-12-17 12:07:22 -08:00
Mark Thomas
ca135cd33f cpython-failure: Integrate cpython PyResult with the failure crate
Summary:
Adds a new crate `cpython-result`, which provides a `ResultExt` trait, which
extends the failure `Result` type to allow conversion to `PyResult` by
converting the error to an appropriate Python Exception.

Reviewed By: quark-zju

Differential Revision: D12980782

fbshipit-source-id: 44a63d31f9ecf2f77efa3b37c68f9a99eaf6d6fa
2018-12-14 06:43:40 -08:00
Mark Thomas
cf4b52c19c mutationstore: add mutationstore
Summary:
The mutationstore is a new store for recording records of commit mutations for
commits that are not in the local repository.

It uses an indexedlog to store the data.  Each mutation entry contains
the information about the mutation that led to the creation of a particular
commit, which is recorded as the successor in the entry.

Entries can come from three possible places:

* `Commit` metadata for a commit not available locally
* `Obsmarkers` for repos that have been migrated from evolution tracking
* `Synthetic` for entries created synthetically, e.g. by a pullcreatemarkers
  implementation.

The other commits referred to in an entry must predate the successor commit.
For entries that originated from commits, this is ensured, as the successor
commit hash includes the other commit hashes.  For other entry types, it is
an error to refer to later commits, and any entry that causes a cycle will
be ignored.

Reviewed By: quark-zju

Differential Revision: D12980773

fbshipit-source-id: 040d3f7369a113e710ed8c9f61fabec6c5ec9258
2018-12-14 06:43:40 -08:00
Mark Thomas
1346ff92c4 types: implement Debug for Node
Summary:
The derived debug for Node prints out each byte as a decimal number.  Instead,
make the Debug output for nodes look like `Node("hexstring")`.
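
A hand-written `Debug` implementation producing this format could look like the sketch below. The `Node` defined here is a stand-in that assumes a 20-byte array; it is not the real type from the `types` crate.

```rust
use std::fmt;

// Stand-in for the real `Node`; the 20-byte layout is an assumption.
struct Node([u8; 20]);

impl fmt::Debug for Node {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Node(\"")?;
        // Two lowercase hex digits per byte, zero-padded.
        for byte in &self.0 {
            write!(f, "{:02x}", byte)?;
        }
        write!(f, "\")")
    }
}
```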

Reviewed By: DurhamG

Differential Revision: D12980775

fbshipit-source-id: 042cbf6eade8403759684969e1f69f7f4e335582
2018-12-14 06:43:40 -08:00
Mark Thomas
88ab626e9a types: Add Nodes::random_distinct to randomly generate sets of nodes
Summary:
Add a utility function for tests to generate a vector of random nodes.  This
will be used in future tests.

Reviewed By: DurhamG

Differential Revision: D12980784

fbshipit-source-id: 73fc8643503e11a46a845671df94c912a5e49d23
2018-12-14 06:43:40 -08:00
Mark Thomas
d0c03f6aaf types: Add WriteNodeExt and ReadNodeExt
Summary:
Add traits that extend `std::io::Read` and `std::io::Write` to implement new
`read_node` and `write_node` methods, allowing simple reading and writing of
binary nodes from and to streams.
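
The extension-trait pattern can be sketched as follows. The `Node` type here is a stand-in assumed to be 20 raw bytes, and the method bodies are illustrative rather than the crate's implementation.

```rust
use std::io::{self, Read, Write};

// Stand-in for the real `Node`; assumed to be 20 raw bytes.
#[derive(Debug, PartialEq)]
struct Node([u8; 20]);

trait ReadNodeExt: Read {
    fn read_node(&mut self) -> io::Result<Node> {
        let mut buf = [0u8; 20];
        self.read_exact(&mut buf)?; // fails on a short read
        Ok(Node(buf))
    }
}

// Blanket impl: every reader gains `read_node`.
impl<R: Read + ?Sized> ReadNodeExt for R {}

trait WriteNodeExt: Write {
    fn write_node(&mut self, node: &Node) -> io::Result<()> {
        self.write_all(&node.0)
    }
}

// Blanket impl: every writer gains `write_node`.
impl<W: Write + ?Sized> WriteNodeExt for W {}
```

With both traits in scope, a `Vec<u8>`, a `File` or a `Cursor` can call `write_node` and `read_node` directly.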

Reviewed By: DurhamG

Differential Revision: D12980778

fbshipit-source-id: fc6751cd43a1693a5a5a3ac93aea74aec5fda4fe
2018-12-14 06:43:40 -08:00
Xavier Deguillard
5307fd8867 revisionstore: implement basic repack in rust
Summary:
The future of mercurial is Rust, and one of the missing pieces is repacking of data/history packs. For now, let's implement a very basic packing strategy that just pulls all the packs into one, with one small optimization that puts all the delta chains close together in the output file.

At first, it's expected that this code will be driven by the existing Python code, but more and more will be done in Rust as time goes on.

Reviewed By: DurhamG

Differential Revision: D13363853

fbshipit-source-id: ad1ac2039e1732f7141d99abf7f01804a9bde097
2018-12-12 12:44:03 -08:00
Jun Wu
421c7b3f45 indexedlog: add a tool to dump indexedlog content
Summary: The tool can dump indexedlog content. Useful for manually investigating issues.

Reviewed By: DurhamG

Differential Revision: D13051387

fbshipit-source-id: 8687a1aa9dfb54776e80f184208c49da2492c34d
2018-12-06 14:57:52 -08:00
Jun Wu
54dc931140 indexedlog: use inlined leaf entries to further reduce index size
Summary:
Add a new entry type - INLINE_LEAF, which embeds the EXT_KEY and LINK entries
to save space.

The index size for referred keys is significantly reduced with little overhead:

  index insertion (owned key)     3.732 ms
  index insertion (referred key)  3.604 ms
  index flush                    11.868 ms
  index lookup (memory)           1.159 ms
  index lookup (disk, no verify)  2.175 ms
  index lookup (disk, verified)   4.303 ms
  index size (5M owned keys)     216626039
  index size (5M referred keys)   96616431
    11.87s user 2.96s system 98% cpu 15.107 total

The breakdown of the "5M referred keys" size is:

  type          count     bytes
  radixes       1729472   33835772
  inline_leafs  5000000   62780651

There are no other kinds of entries stored.

Previously, the index size of referred keys is:

  index size (5M referred keys)  136245815 bytes

So it's 136MB -> 96MB, a 29% decrease.

Reviewed By: DurhamG

Differential Revision: D13036801

fbshipit-source-id: 27e68e4b6c332c1dc419abc6aba69271952e4b3d
2018-12-06 14:57:52 -08:00
Jun Wu
a4958163ee indexedlog: optimize size of radix entries (BC)
Summary:
Replace the 20-byte "jump table" with 3-byte "flag + bitmap". This saves space
for indexes less than 4GB. There are some reserved bits in the "flag" so if we
run into space issues when indexes are larger than 4GB, we can try adding
6-byte integer, or VLQ back without breaking backwards-compatibility.
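
The general bitmap idea can be illustrated as below. This is only a sketch of the technique: the struct, its in-memory `Vec` of offsets and the method name are assumptions and do not reflect the actual on-disk layout of the index.

```rust
// Illustrative only: a radix node with up to 16 children, one per hex digit.
// A 16-bit bitmap records which children exist, and only those children's
// offsets are stored, in ascending digit order.
struct CompactRadix {
    bitmap: u16,
    offsets: Vec<u64>,
}

impl CompactRadix {
    fn child(&self, digit: u8) -> Option<u64> {
        debug_assert!(digit < 16);
        let bit = 1u16 << digit;
        if self.bitmap & bit == 0 {
            return None; // no child for this digit
        }
        // The number of present children with a smaller digit is the index
        // of this child in the packed offset list.
        let index = (self.bitmap & (bit - 1)).count_ones() as usize;
        self.offsets.get(index).copied()
    }
}
```

Compared with a fixed table of 16 slots, absent children cost one bit each instead of a full slot.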

It seems to hurt flush performance a bit, because we have to scan the child
array twice. However, lookup (the most important performance) does not change
much. And the index is more compact.

After:

  index flush                    19.644 ms
  index lookup (disk, no verify)  2.220 ms
  index lookup (disk, verified)   4.067 ms
  index size (5M owned keys)     216626039 bytes
  index size (5M referred keys)  136245815 bytes

Before:

  index flush                    16.764 ms
  index lookup (disk, no verify)  2.205 ms
  index lookup (disk, verified)   4.030 ms
  index size (5M owned keys)     240838647 bytes
  index size (5M referred keys)  160458423 bytes

For the "referred key" case, it's 160MB -> 136MB, a 15% decrease.

A detailed break down of components of index is:

After:

  type       count     bytes (using owned keys)
  radixes    1729472   33835772
  links      5000000   27886336
  leafs      5000000   44629384
  keys       5000000  110000000

  type       count     bytes (using referred keys)
  radixes    1729472   33835772
  links      5000000   27886336
  leafs      5000000   44629384
  ext_keys   5000000   29894315

Before:

  type       count     bytes (using owned keys)
  radixes    1729472   58048380
  links      5000000   27886336
  leafs      5000000   44903923
  keys       5000000  110000000

  type       count     bytes (using referred keys)
  radixes    1729472   58048380
  links      5000000   27886336
  leafs      5000000   44629384
  ext_keys   5000000   29894315

Leaf nodes are taking too much space. It seems the next big optimization might
be inlining ext_keys into leafs.

Reviewed By: DurhamG, markbt

Differential Revision: D13028196

fbshipit-source-id: 6043b16fd67a497eb52d20a17e153fcba5cb3e81
2018-12-06 14:57:52 -08:00
Jun Wu
d8117b3b04 indexedlog: increase key count for size test
Summary:
Since the size test only runs once, we can use a larger number of keys. This is
closer to some production use-cases.

`cargo bench size` shows:

  index size (5M owned keys)     240838647
  index size (5M referred keys)  160458423

It currently uses 32 bytes per key for 5M referred keys.

Reviewed By: markbt

Differential Revision: D13027880

fbshipit-source-id: 726f5fb2da056e77ab93d82fda9f1afa500d0a8d
2018-12-06 14:57:52 -08:00
Jun Wu
55b6331aa4 indexedlog: add more benchmarks
Summary:
Add benchmarks about index sizes, and a benchmark of insertion using key
references.

An example `cargo bench` result running on my devserver looks like:

  index insertion (owned key)     3.551 ms
  index insertion (referred key)  3.713 ms
  index flush                    20.648 ms
  index lookup (memory)           1.087 ms
  index lookup (disk, no verify)  2.041 ms
  index lookup (disk, verified)   4.347 ms
  index size (owned key)            886010
  index size (referred key)         534298

Reviewed By: markbt

Differential Revision: D13027879

fbshipit-source-id: 70644c504026ffee2122d857d5035f5b7eea4f42
2018-12-06 14:57:52 -08:00
Jun Wu
d7129256d4 indexedlog: switch checksum table to little endian (BC)
Summary:
For checksum values like xxhash, there is no benefit to using big endian.
Switch to little endian so it's slightly faster on the major platforms we
care about.
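
A small sketch of what little-endian checksum storage looks like, using only
the standard library. The function names are hypothetical and this is not the
checksum table code itself.

```rust
/// Serialize a 64-bit checksum in little-endian byte order. On little-endian
/// CPUs such as x86-64 this matches the in-memory layout, so no byte swap
/// is needed.
fn write_checksum_le(checksum: u64, out: &mut Vec<u8>) {
    out.extend_from_slice(&checksum.to_le_bytes());
}

/// Read back a checksum written by `write_checksum_le`.
/// Returns `None` if fewer than 8 bytes are available.
fn read_checksum_le(buf: &[u8]) -> Option<u64> {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(buf.get(..8)?);
    Some(u64::from_le_bytes(bytes))
}
```

With this layout, `0x0102030405060708` is stored as the bytes
`08 07 06 05 04 03 02 01`.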

This is a breaking change. However, the format is not used in production yet.
So there is no migration code.

Reviewed By: markbt

Differential Revision: D13015465

fbshipit-source-id: ca83d19b3328370d089b03a33e848e64b728ef2a
2018-12-06 14:57:52 -08:00
Jun Wu
75b4f92c44 indexedlog: support different checksum functions for Log entries (BC)
Summary:
Previously, the format of a Log entry was hard-coded: length, xxhash, and
content. The xxhash always took 8 bytes.

For small (e.g. 40-byte) entries, xxhash32 is actually faster and takes less
disk space.

Introduce the "entry flags" concept so we can store some metadata about what
checksum function to use. The concept could potentially be used to support
other per-entry format changes in the future.

While we're here, also support data without checksums. That can be useful for
content with its own checksum, like a blob store with its own SHA1 integrity
check.
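
A rough sketch of per-entry flags selecting the checksum function. The layout
(1 flag byte, 4-byte length, optional checksum, content), the flag values, and
all names are assumptions, not the real Log format. FNV-1a is used as a
stand-in for xxhash so the example needs only the standard library.

```rust
/// Hypothetical per-entry flag values selecting the checksum function.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ChecksumType {
    NoChecksum = 0,
    Hash32 = 1,
    Hash64 = 2,
}

/// Stand-in 32-bit checksum (FNV-1a), used here in place of xxhash32.
fn hash32(data: &[u8]) -> u32 {
    data.iter()
        .fold(0x811c_9dc5u32, |h, &b| (h ^ b as u32).wrapping_mul(0x0100_0193))
}

/// Stand-in 64-bit checksum (FNV-1a), used here in place of the 8-byte xxhash.
fn hash64(data: &[u8]) -> u64 {
    data.iter().fold(0xcbf2_9ce4_8422_2325u64, |h, &b| {
        (h ^ b as u64).wrapping_mul(0x0000_0100_0000_01b3)
    })
}

/// Encode an entry as: flag byte, 4-byte length, optional checksum, content.
fn encode_entry(kind: ChecksumType, content: &[u8]) -> Vec<u8> {
    let mut buf = vec![kind as u8];
    buf.extend_from_slice(&(content.len() as u32).to_le_bytes());
    match kind {
        ChecksumType::NoChecksum => {}
        ChecksumType::Hash32 => buf.extend_from_slice(&hash32(content).to_le_bytes()),
        ChecksumType::Hash64 => buf.extend_from_slice(&hash64(content).to_le_bytes()),
    }
    buf.extend_from_slice(content);
    buf
}

/// Decode an entry, returning its content only if the checksum (if any) matches.
fn decode_entry(buf: &[u8]) -> Option<&[u8]> {
    let kind = match *buf.first()? {
        0 => ChecksumType::NoChecksum,
        1 => ChecksumType::Hash32,
        2 => ChecksumType::Hash64,
        _ => return None, // unknown flags: refuse to interpret the entry
    };
    let mut len_bytes = [0u8; 4];
    len_bytes.copy_from_slice(buf.get(1..5)?);
    let len = u32::from_le_bytes(len_bytes) as usize;
    let checksum_len = match kind {
        ChecksumType::NoChecksum => 0,
        ChecksumType::Hash32 => 4,
        ChecksumType::Hash64 => 8,
    };
    let stored = buf.get(5..5 + checksum_len)?;
    let content = buf.get(5 + checksum_len..)?.get(..len)?;
    let ok = match kind {
        ChecksumType::NoChecksum => true,
        ChecksumType::Hash32 => stored == &hash32(content).to_le_bytes()[..],
        ChecksumType::Hash64 => stored == &hash64(content).to_le_bytes()[..],
    };
    if ok { Some(content) } else { None }
}
```

In this sketch, choosing the 32-bit checksum saves 4 bytes per entry compared
with the 64-bit one, and `NoChecksum` saves another 4 for content that is
already verified elsewhere.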

Performance-wise, log insertion is slower (but the majority of insertion
overhead would be in the index part). Iteration is a little bit faster, perhaps
because the log can use less data.

Before:

  log insertion                  15.874 ms
  log iteration (memory)          6.778 ms
  log iteration (disk)            6.830 ms

After:

  log insertion                  18.114 ms
  log iteration (memory)          6.403 ms
  log iteration (disk)            6.307 ms

Reviewed By: DurhamG, markbt

Differential Revision: D13051386

fbshipit-source-id: 629c251633ecf85058ee7c3ce7a9f576dfac7bdf
2018-12-06 14:57:52 -08:00
Jun Wu
049cd99f05 indexedlog: use non-VLQ encoding for xxhash (BC)
Summary:
An xxhash result usually won't have leading zeros, so VLQ encoding is not an
efficient choice. Use non-VLQ encoding instead.
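
To illustrate why, here is a sketch comparing one common VLQ variant
(LEB128-style, 7 payload bits per byte) with a plain 8-byte encoding. The
function names are hypothetical, and the exact VLQ variant and byte order used
by indexedlog are not assumed here.

```rust
/// Encode `value` as a VLQ: 7 payload bits per byte, low bits first,
/// with the high bit set on every byte except the last.
fn encode_vlq(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80);
    }
}

/// Encode `value` as a fixed 8 bytes (byte order chosen arbitrarily here).
fn encode_fixed(value: u64) -> [u8; 8] {
    value.to_le_bytes()
}
```

Small integers benefit from VLQ (`encode_vlq(1)` is a single byte), but any
value of 2^56 or more needs at least 9 VLQ bytes, and a value with the top bit
set needs 10. A well-distributed 64-bit hash is almost always in that range, so
the fixed 8-byte form is both smaller and simpler to read and write.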

Performance-wise, this is noticeably faster than before:

  log insertion                  14.161 ms
  log insertion with index      102.724 ms
  log flush                      11.336 ms
  log iteration (memory)          6.351 ms
  log iteration (disk)            7.922 ms
    10.18s user 3.66s system 97% cpu 14.218 total
  log insertion                  13.377 ms
  log insertion with index       97.422 ms
  log flush                      11.792 ms
  log iteration (memory)          6.890 ms
  log iteration (disk)            7.139 ms
    10.20s user 3.56s system 97% cpu 14.117 total
  log insertion                  14.573 ms
  log insertion with index       94.216 ms
  log flush                      18.993 ms
  log iteration (memory)          7.867 ms
  log iteration (disk)            7.567 ms
    9.85s user 3.73s system 96% cpu 14.073 total
  log insertion                  15.526 ms
  log insertion with index       98.868 ms
  log flush                      19.600 ms
  log iteration (memory)          7.533 ms
  log iteration (disk)            7.150 ms
    10.13s user 4.02s system 96% cpu 14.647 total
  log insertion                  14.629 ms
  log insertion with index      100.449 ms
  log flush                      20.997 ms
  log iteration (memory)          7.299 ms
  log iteration (disk)            7.518 ms
    10.14s user 3.65s system 96% cpu 14.274 total

This is a format-breaking change. Fortunately, we haven't really used the old
format in production yet.

Reviewed By: DurhamG, markbt

Differential Revision: D13015463

fbshipit-source-id: 6e7e4f7a845ea8dbf0904b3902740b65cc7467d5
2018-12-06 14:57:52 -08:00
Jun Wu
42c3ef6eb6 indexedlog: add benchmark for "log"
Summary:
Add a simple benchmark for "log". The initial result running on my devserver
looks like:

  log insertion                  33.146 ms
  log insertion with index      106.449 ms
  log flush                       9.623 ms
  log iteration (memory)         10.644 ms
  log iteration (disk)           11.517 ms
    13.75s user 3.61s system 97% cpu 17.778 total
  log insertion                  27.906 ms
  log insertion with index      107.683 ms
  log flush                      19.204 ms
  log iteration (memory)         10.239 ms
  log iteration (disk)           11.118 ms
    12.89s user 3.55s system 97% cpu 16.924 total
  log insertion                  31.645 ms
  log insertion with index      109.403 ms
  log flush                       9.416 ms
  log iteration (memory)         10.226 ms
  log iteration (disk)           10.757 ms
    13.07s user 3.02s system 97% cpu 16.423 total
  log insertion                  31.848 ms
  log insertion with index      109.332 ms
  log flush                      18.345 ms
  log iteration (memory)         10.709 ms
  log iteration (disk)           11.346 ms
    13.12s user 3.70s system 97% cpu 17.276 total
  log insertion                  29.665 ms
  log insertion with index      106.041 ms
  log flush                      16.159 ms
  log iteration (memory)         10.367 ms
  log iteration (disk)           11.110 ms
    12.99s user 3.27s system 97% cpu 16.717 total

Reviewed By: markbt

Differential Revision: D13015464

fbshipit-source-id: 035fee6c8b6d0bea4cfe194eed3d58ba4b5ebcb8
2018-12-06 14:57:52 -08:00