61 Commits

Author SHA1 Message Date
Arun Kulshreshtha
95fe3c583a manifest: add matcher support to bfs diff
Summary: Add matcher support for BFS diff, bringing feature parity with the non-BFS implementation.

Reviewed By: xavierd

Differential Revision: D17276969

fbshipit-source-id: f63194508e10950a691df0202a68211e2876f742
2019-09-12 10:24:46 -07:00
Arun Kulshreshtha
76e31ae145 manifest: add initial bfs diff implementation
Summary: This diff provides an implementation of the diff operation for trees which processes directories in BFS order (i.e., layer by layer). This allows the iterator to perform a bulk prefetch of the changed nodes in each layer at the start of each layer of the traversal. This should hopefully provide a more efficient fetch pattern than the existing implementation, which requires a full prefetch of both trees upfront for reasonable performance.
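
The layer-by-layer traversal described above can be sketched as follows. This is a minimal illustration, not the code from this commit: `Store`, `Entry`, `Directory`, `dir`, and `bfs_diff` are hypothetical names, and the in-memory store only records which ids each bulk prefetch would request.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

type Id = u64;

/// Hypothetical directory entry: a file or a subdirectory, identified by id.
#[derive(Clone, Copy, PartialEq)]
enum Entry {
    File(Id),
    Dir(Id),
}

type Directory = BTreeMap<String, Entry>;

/// Hypothetical store that records every bulk prefetch request.
struct Store {
    dirs: HashMap<Id, Directory>,
    prefetch_calls: Vec<Vec<Id>>,
}

impl Store {
    fn prefetch(&mut self, ids: &[Id]) {
        self.prefetch_calls.push(ids.to_vec());
    }

    fn get(&self, id: Id) -> Directory {
        self.dirs.get(&id).cloned().unwrap_or_default()
    }
}

/// Helper for building a directory from (name, entry) pairs.
fn dir(entries: &[(&str, Entry)]) -> Directory {
    entries.iter().map(|(name, entry)| (name.to_string(), *entry)).collect()
}

/// Returns the paths of files that differ, visiting directories in BFS order.
fn bfs_diff(store: &mut Store, left: Id, right: Id) -> Vec<String> {
    let mut changed = Vec::new();
    let mut layer = vec![(String::new(), Some(left), Some(right))];
    while !layer.is_empty() {
        // One bulk prefetch for every directory needed by this layer.
        let ids: Vec<Id> = layer
            .iter()
            .flat_map(|(_, l, r)| l.iter().chain(r.iter()).copied())
            .collect();
        store.prefetch(&ids);

        let mut next = Vec::new();
        for (prefix, l, r) in layer {
            let ldir = l.map(|id| store.get(id)).unwrap_or_default();
            let rdir = r.map(|id| store.get(id)).unwrap_or_default();
            let names: BTreeSet<&String> = ldir.keys().chain(rdir.keys()).collect();
            for name in names {
                let (le, re) = (ldir.get(name).copied(), rdir.get(name).copied());
                if le == re {
                    continue; // identical entries: nothing below can differ
                }
                let path = if prefix.is_empty() {
                    name.clone()
                } else {
                    format!("{}/{}", prefix, name)
                };
                // A file on either side means this path itself changed.
                if matches!(le, Some(Entry::File(_))) || matches!(re, Some(Entry::File(_))) {
                    changed.push(path.clone());
                }
                // Differing directories are queued for the next layer.
                let ld = match le { Some(Entry::Dir(id)) => Some(id), _ => None };
                let rd = match re { Some(Entry::Dir(id)) => Some(id), _ => None };
                if ld.is_some() || rd.is_some() {
                    next.push((path, ld, rd));
                }
            }
        }
        layer = next;
    }
    changed
}
```

In this sketch only directories whose ids differ are ever requested, so the number of prefetch calls equals the depth of the changed region rather than the size of either tree.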

Reviewed By: xavierd

Differential Revision: D17276971

fbshipit-source-id: 284f1d458f43cb76befe27e85f53a641f29d7550
2019-09-12 10:24:46 -07:00
Arun Kulshreshtha
dbf37904d4 manifest: add prefetch method to store interface
Summary:
Add a `prefetch` method to the `TreeStore` trait. This will be used by code using the store to signal that certain keys will be accessed soon. The default implementation is a no-op, but in the case of stores where prefetching makes sense (such as stores backed by remote servers), the default implementation can be overridden to include the appropriate prefetching logic.

For now, this change is a no-op, but later in this stack it will be used to signal to the underlying Python data store to perform the appropriate tree fetches via the Eden API. This will be used to support a more efficient pattern of bulk tree fetches during the diff operation.
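
A trait method with a no-op default, overridden only where prefetching makes sense, might look roughly like this. The key type, the error type, and both store structs are assumptions made for illustration; the real `TreeStore` signature is not shown in this log.

```rust
use std::cell::RefCell;

/// Hypothetical key type standing in for the crate's real key.
type Key = String;

trait TreeStore {
    fn get(&self, key: &Key) -> Option<Vec<u8>>;

    /// Hint that `keys` will be accessed soon. The default does nothing.
    fn prefetch(&self, _keys: &[Key]) -> Result<(), String> {
        Ok(())
    }
}

/// A purely local store keeps the default no-op.
struct LocalStore;

impl TreeStore for LocalStore {
    fn get(&self, _key: &Key) -> Option<Vec<u8>> {
        None
    }
}

/// A store backed by a remote server overrides `prefetch`.
/// Here it only records the request instead of doing network I/O.
struct RemoteStore {
    requested: RefCell<Vec<Key>>,
}

impl TreeStore for RemoteStore {
    fn get(&self, _key: &Key) -> Option<Vec<u8>> {
        None
    }

    fn prefetch(&self, keys: &[Key]) -> Result<(), String> {
        self.requested.borrow_mut().extend_from_slice(keys);
        Ok(())
    }
}
```

Callers can invoke `prefetch` unconditionally; stores that have nothing to gain from it pay no cost.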

Reviewed By: sfilipco

Differential Revision: D17276970

fbshipit-source-id: 22a5d847e5be5dbf1b0a74b47587a98d840b8cdc
2019-09-12 10:24:45 -07:00
Stefan Filip
edfeb9f529 treemanifest: remove delta storing of manifest entries when autocreate
Summary:
It is somewhat difficult to fetch the raw entry on the p1 side in the Rust
Manifests. These entries are used to write deltas to revlogs or to datapacks.

Reviewed By: xavierd

Differential Revision: D17143551

fbshipit-source-id: 6624116324664354d199d5f6ac55712c8ed29b9d
2019-09-03 15:08:34 -07:00
Thomas Orozco
817fb8839c rust-crates-io: add gotham_derive
Summary: This updates rust-crates-io to add gotham_derive.

Reviewed By: StanislavGlebik

Differential Revision: D17163951

fbshipit-source-id: 27c6d728fee9b837ae504b51362feaa71e69d66f
2019-09-03 14:57:52 -07:00
Stefan Filip
490a99230c manifest: preserve p1 and p2 order in finalize
Summary:
I had assumed that we store p1 and p2 in the same order that they are used in
Node computation. That is incorrect. In general p1 and p2 are assumed to have
an ordering that matters and it's Node computation that is specific.
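
To illustrate the distinction, here is a small sketch that assumes the usual Mercurial convention of hashing the two parents in sorted order followed by the text. The `Parents` struct and `hash_preimage` are hypothetical, and the SHA-1 step itself is left out to keep the example dependency-free.

```rust
type Node = [u8; 20];

/// Parents are stored exactly in the order they were given.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Parents {
    p1: Node,
    p2: Node,
}

/// Builds the bytes that would be fed to the hash function.
/// Only here are the parents sorted; the stored order is untouched.
fn hash_preimage(parents: &Parents, text: &[u8]) -> Vec<u8> {
    let (lo, hi) = if parents.p1 <= parents.p2 {
        (parents.p1, parents.p2)
    } else {
        (parents.p2, parents.p1)
    };
    let mut buf = Vec::with_capacity(40 + text.len());
    buf.extend_from_slice(&lo);
    buf.extend_from_slice(&hi);
    buf.extend_from_slice(text);
    buf
}
```

Swapping `p1` and `p2` yields the same preimage, yet the two `Parents` values remain distinct, which is why the stored order has to be preserved separately.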

Reviewed By: quark-zju

Differential Revision: D17125743

fbshipit-source-id: 3a2673d9c243e2d2103aba0cb4fd8f536386efa7
2019-08-30 10:50:06 -07:00
Stefan Filip
abf02ee9ca manifest: update finalize to work on durable trees
Summary:
In some cases the finalize algorithm is used to persist data that is received
in a bundle. The process is that it constructs a store from the bundle and
goes to construct a tree with the root node received. It then goes through
finalize to generate the entries that need to be written to local storage.

Reviewed By: quark-zju

Differential Revision: D17125149

fbshipit-source-id: de5a1e922a6aebe48e238d8473177a8d3f7a9ef5
2019-08-30 10:50:06 -07:00
Stefan Filip
f8fa8d9367 manifest: add list directory functionality
Summary:
`listdir` returns the contents of a directory in the manifest. The format
is pretty simple, containing only the simple names of the files and/or
directories. I don't know if this is something that eden can use because
it seems too simple. In other words, we have something but we may want
to iterate on it before we market it broadly.

Reviewed By: quark-zju

Differential Revision: D17098082

fbshipit-source-id: d6aa42c96781cf1f8b2e916fa10bb275593bdc65
2019-08-30 10:50:05 -07:00
Stefan Filip
ffb563f1bb manifest: add subdir_diff compatibility function for gettreepacks
Summary:
The C++ manifest implements walksubdirtrees which is used to compute the packs
that a "client" wants for a prefetch. In terms of interface the function is very
annoying and couples with storage and tree representations without being part
of any of them.

We reproduce that functionality as a means to replace the C++ implementation.
The long term goal is to do lazy fetches using an iteration style that plays
nicer with batching downloads.

This change also includes fastmanifest updates because they are required to
enable the walksubdirtrees functionality in our tests.

Reviewed By: quark-zju

Differential Revision: D17086669

fbshipit-source-id: 6c1f9fbf975814f0a2071f8d1c8e022e5ad58e29
2019-08-30 10:50:05 -07:00
Stefan Filip
8a402cc939 manifest: update insert validation
Summary:
The insert code was unclear about what kind of issue it ran into when
inserting files. Sometimes the file we want to insert is a directory and
other times it wants to traverse a directory. This change makes those
situations clear along with some other corner case behaviors.

Reviewed By: quark-zju

Differential Revision: D16775354

fbshipit-source-id: 50ab6bc52b70cc5cef013d11050eb3cdf5b160a5
2019-08-26 10:48:08 -07:00
Stefan Filip
d4614d568d manifest: update manifest.remove to return None for dirs
Summary:
Updating the manifest implementation for remove with the intended API.
When I originally implemented remove I wasn't confident what was the
best way to implement remove. As I've gained more experience, I feel
confident that doing two iterations over the tree is a good approach
for this method. The first iteration should validate that the file
exists then the second iteration will actually traverse down updating
the nodes to mutable ephemerals.
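
The two-pass shape can be sketched on a simplified in-memory tree. `Node`, `lookup`, `remove_unchecked`, and `remove` are hypothetical names, and the durable/ephemeral distinction of the real manifest is not modeled here.

```rust
use std::collections::BTreeMap;

/// Simplified tree: a file carries an id, a directory carries children.
enum Node {
    File(u32),
    Dir(BTreeMap<String, Node>),
}

/// Pass 1 helper: read-only walk down to `path`.
fn lookup<'a>(root: &'a Node, path: &[&str]) -> Option<&'a Node> {
    let mut cur = root;
    for component in path {
        match cur {
            Node::Dir(children) => cur = children.get(*component)?,
            Node::File(_) => return None, // cannot descend through a file
        }
    }
    Some(cur)
}

/// Pass 2 helper: mutable walk that detaches the final component.
fn remove_unchecked(node: &mut Node, path: &[&str]) {
    let children = match node {
        Node::Dir(children) => children,
        Node::File(_) => return,
    };
    let (first, rest) = match path.split_first() {
        Some(parts) => parts,
        None => return,
    };
    if rest.is_empty() {
        children.remove(*first);
    } else if let Some(child) = children.get_mut(*first) {
        remove_unchecked(child, rest);
    }
}

/// Returns the removed file's id, or `None` for missing paths and directories.
fn remove(root: &mut Node, path: &[&str]) -> Option<u32> {
    // Pass 1: validate without mutating anything.
    let removed = match lookup(root, path)? {
        Node::File(id) => *id,
        Node::Dir(_) => return None,
    };
    // Pass 2: only now touch the tree.
    remove_unchecked(root, path);
    Some(removed)
}
```

Because validation happens first, a failed removal leaves the tree untouched. This sketch does not prune directories that become empty.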

Reviewed By: quark-zju

Differential Revision: D16775353

fbshipit-source-id: 8ebee9ca347efcb694a6d27c1eeae2c149643766
2019-08-26 10:48:08 -07:00
Stefan Filip
2e3860a3d9 manifest: fix get_link to return None for children of files
Summary:
`get_link` started as a test function that broadened in scope but did not have
its behavior updated as it started to be used more broadly.
There is no reason to error out when we request a path that has parent files in the
manifest.

Reviewed By: quark-zju

Differential Revision: D16775356

fbshipit-source-id: a320926100378f16d723ca204746906e79c7752e
2019-08-26 10:48:07 -07:00
Stefan Filip
5bc614c767 bindings: add finalize implementation for tree manifests
Summary: Matching more of the existing API.

Reviewed By: quark-zju

Differential Revision: D16607233

fbshipit-source-id: 7a71f22089067ecfccbfcb2ad072fbf21e360439
2019-08-06 14:24:31 -07:00
Stefan Filip
2cb92f487b manifest: update manifest::get to return optional enum
Summary:
This allows us to say that the queried path was a directory and simplifies
the API a little bit.

The Python/C++ Manifest APIs provide inspection of directories through several
functions. We want the Rust manifest to provide the same functionality.
This approach looks to have more consistency as the API evolves.

`FsNode` is a structure that is forward looking. We will want to add an API
that lists a directory and an iterator of FsNode fits that well.
I also felt the need for it over the development of the manifest code.
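
As a rough sketch of an optional-enum return value: the variant names, the fields of `FileMetadata`, and the flat `Manifest` below are assumptions for illustration, not the crate's actual definitions.

```rust
use std::collections::BTreeMap;

/// Hypothetical metadata; small enough to be `Copy`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct FileMetadata {
    node: [u8; 20],
    executable: bool,
}

/// What a path resolves to when it exists.
#[derive(Clone, Copy, Debug, PartialEq)]
enum FsNode {
    Directory,
    File(FileMetadata),
}

/// Toy manifest that stores files by full path.
struct Manifest {
    files: BTreeMap<String, FileMetadata>,
}

impl Manifest {
    /// `None` means the path is absent; `Some` says whether it is a file or a directory.
    fn get(&self, path: &str) -> Option<FsNode> {
        if let Some(meta) = self.files.get(path) {
            return Some(FsNode::File(*meta));
        }
        let prefix = format!("{}/", path);
        if self.files.keys().any(|key| key.starts_with(prefix.as_str())) {
            return Some(FsNode::Directory);
        }
        None
    }
}
```

A single `get` then answers "missing", "file", and "directory" without separate inspection functions.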

Reviewed By: quark-zju

Differential Revision: D16609253

fbshipit-source-id: f826d7b21e3001f4bef43a35b9d1a9bc5a59eda9
2019-08-06 10:43:01 -07:00
Stefan Filip
7bbf5c30db manifest: add fmt::Debug implementation for Tree
Summary:
Prints out a tree that can make it easy to determine the data
that is in memory.

Reviewed By: quark-zju

Differential Revision: D16571839

fbshipit-source-id: dee8a0c6564853d49a72fa29bd53c6b09a7f3ddf
2019-08-01 10:35:52 -07:00
Stefan Filip
a585faf2db manifest: update get to return FileMetadata copy
Summary:
There is no good reason to return a reference for FileMetadata. It is a relatively
small object that implements the `Copy` trait so returning a copy is
appropriate.

Reviewed By: quark-zju

Differential Revision: D16571838

fbshipit-source-id: 0c315c9f405e425832d39da5c67809dd15b4ab5e
2019-08-01 10:35:51 -07:00
Stefan Filip
427ad4416e bindings: add diff implementation for rust tree manifest
Summary:
I wasn't sure how to test this. I implemented `diff` then got it to work
with `hg show` by adding the missing methods.

Reviewed By: quark-zju

Differential Revision: D16497354

fbshipit-source-id: 727979ad8ce4a4615e85ea96c3fe6413aa20b267
2019-08-01 10:35:51 -07:00
Stefan Filip
5a301d20c4 manifest: add matcher integration for diff algorithm
Summary:
Diffs can be filtered by a matcher to narrow down the number of files returned.
Adding this capability to the rust tree manifest implementation.

Reviewed By: kulshrax

Differential Revision: D16497355

fbshipit-source-id: fee07112b5bcff63c7c4115f28dade79f41fe6bc
2019-07-31 10:06:48 -07:00
Stefan Filip
7c511ba610 manifest: add matcher filtering to the files iterator
Summary:
Most operations do not iterate over all the files of the repository.
Most operations filter the data set in some way. In many cases this filtering
is to a set of files and in some cases using patterns or subdirectories.

The Python code uses `match.py` to represent this filtering. The parallel
data structure in the rust code is `pathmatcher::Matcher`.

This diff adds `files` integration with `pathmatcher::Matcher`.
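
A minimal sketch of matcher-filtered iteration follows. The `Matcher` trait here is a stand-in with a single hypothetical `matches_file` method; the real `pathmatcher::Matcher` interface is not shown in this log and may differ.

```rust
/// Hypothetical stand-in for a path matcher.
trait Matcher {
    fn matches_file(&self, path: &str) -> bool;
}

/// Matches every path under a given directory prefix.
struct PrefixMatcher(String);

impl Matcher for PrefixMatcher {
    fn matches_file(&self, path: &str) -> bool {
        path.starts_with(self.0.as_str())
    }
}

/// Yields only the files accepted by the matcher.
fn files<'a>(
    all: &'a [&'a str],
    matcher: &'a dyn Matcher,
) -> impl Iterator<Item = &'a str> + 'a {
    all.iter().copied().filter(move |path| matcher.matches_file(path))
}
```

A tree-aware implementation could also ask the matcher about directories and skip whole subtrees; this flat sketch filters file by file.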

Reviewed By: quark-zju

Differential Revision: D16352527

fbshipit-source-id: 8b61ac7399f581773bf61ff648634cbc6e1a27b6
2019-07-22 13:03:02 -07:00
Stefan Filip
e05693d74b manifest: move file description code from lib.rs to file.rs
Summary:
Title. Small clean up. I think that this makes sense because the code in
file.rs is self contained. It provides a distraction free environment for
adding methods to the structures in there.

Reviewed By: quark-zju

Differential Revision: D16352531

fbshipit-source-id: c23e943198e0a4b50aa00c75e67b13bc4c3ee976
2019-07-22 13:03:01 -07:00
Stefan Filip
cb57f5d7ac bindings: add manifest python classes
Summary: Bindings so that the rust manifest code can be used in Python.

Reviewed By: quark-zju

Differential Revision: D16352532

fbshipit-source-id: 34d4522f5e084f531f31bcd21770950f15f2fe13
2019-07-22 13:03:00 -07:00
David Tolnay
f7011a3993 rust: Head start on some upcoming warnings
Summary:
This diff sets two Rust lints to warn in fbcode:

```
[rust]
  warn_lints = bare_trait_objects, ellipsis_inclusive_range_patterns
```

and fixes occurrences of those warnings within common/rust, hg, and mononoke.

Both of these lints are set to warn by default starting with rustc 1.37. Enabling them early avoids writing even more new code that needs to be fixed when we pull in 1.37 in six weeks.
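
For reference, the kind of rewrite each lint asks for looks like this. The functions are made-up examples; the older spellings appear only in comments.

```rust
use std::fmt::Display;

// `bare_trait_objects`: write `&dyn Display` instead of the bare `&Display`.
fn describe(value: &dyn Display) -> String {
    format!("value: {}", value)
}

// `ellipsis_inclusive_range_patterns`: write `0..=9` instead of `0...9`.
fn classify(n: u32) -> &'static str {
    match n {
        0..=9 => "single digit",
        _ => "larger",
    }
}
```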

Upstream tracking issue: https://github.com/rust-lang/rust/issues/54910

Reviewed By: Imxset21

Differential Revision: D16200291

fbshipit-source-id: aca11a7a944e9fa95f94e226b52f6f053b97ec74
2019-07-12 00:55:53 -07:00
Arun Kulshreshtha
4e824ba5d7 manifest: bump version of once_cell to 0.2.0
Summary: A recent update to the crates in tp2 bumped the `once_cell` crate to version 0.2.0. This broke the build because the `Cargo.toml` for the `manifest` crate specified version 0.1.8. Apparently just changing the crate version to 0.2.0 fixes the build, so we weren't affected by whatever breaking changes were made to the crate.

Reviewed By: DurhamG

Differential Revision: D15492142

fbshipit-source-id: 552b0a751ab7c2aab5f0fbcb1124de4ea427790c
2019-05-23 22:38:37 -07:00
Stefan Filip
8a99978193 manifest: improve deserialization error reporting
Summary:
The old code did not provide enough information to start debugging
a problem in the serialization code.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D15380388

fbshipit-source-id: 9da51cfe4d735de3961e840bd9cd2a1595131cd9
2019-05-17 10:24:16 -07:00
Stefan Filip
e0aaaf7cbb manifest: update flush to hash content only
Summary:
Flush encapsulates the closed abstraction approach.
Finalize addresses the implementations that need to handle
history.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D15290173

fbshipit-source-id: 2cfd7a3f7ca0de1c2566ce3ae680d5b1451e4f91
2019-05-17 10:24:16 -07:00
Stefan Filip
3044a0d869 manifest: add finalize method
Summary:
The plan is to have finalize do the complicated history managing stuff and
have flush be a light weight content only hashing approach to storage.
Flush will be used in tests in the short term.
Finalize would be used in Python to write to Revlogs and handle linknodes.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D15290171

fbshipit-source-id: 54ed5c791254507fd9f2e874315284b70bc6b597
2019-05-17 10:24:15 -07:00
Stefan Filip
bfaba6f7c3 manifest: move Store behind Arc
Summary:
We want manifests to share their durable nodes.
The path that I want to take is to enable cloning for individual manifests.
What to do with the storage then? Share it in an `Arc`.
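
Sharing the store through an `Arc` so that cloned manifests point at the same storage could be sketched like this. `MemoryStore` and the fields of `Tree` are illustrative assumptions, as is the use of a trait object.

```rust
use std::collections::HashMap;
use std::sync::Arc;

trait TreeStore {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}

/// Hypothetical in-memory store.
struct MemoryStore(HashMap<String, Vec<u8>>);

impl TreeStore for MemoryStore {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.0.get(key).cloned()
    }
}

/// Cloning a `Tree` copies the handle, not the underlying store.
#[derive(Clone)]
struct Tree {
    store: Arc<dyn TreeStore>,
    root: String,
}
```

Cloning a `Tree` only bumps the reference count, so every clone reads from the same durable data.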

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D15290170

fbshipit-source-id: 621d5d70700c4372c3b40a246fc45d9446b13ac5
2019-05-17 10:24:15 -07:00
Stefan Filip
80c26ce702 manifest: rename manifest::tree::Store to TreeStore
Summary:
This enables more clear exporting so that it can be implemented
externally.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju, xavierd

Differential Revision: D15290172

fbshipit-source-id: 25541c88abab228a0083e5a27e74db31e6819230
2019-05-17 10:24:15 -07:00
Stefan Filip
4e9787b879 manifest: handle missing line feed in tree manifest entries
Summary:
A tree manifest entry must always end with a line feed. It is somewhat
redundant but that's how the serialization is defined. Sometimes that last
line feed is missing in our data. I don't know why.
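
A parser that tolerates the missing final line feed might look like the sketch below. It assumes the common Mercurial layout of `name`, a NUL byte, a 40-character hex node, and an optional one-letter flag per line, with `t` marking a directory. `Element` and `parse_entry` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
struct Element {
    name: String,
    hex_node: String,
    is_directory: bool,
}

fn parse_entry(data: &str) -> Result<Vec<Element>, String> {
    let mut elements = Vec::new();
    // `split_terminator` treats '\n' as a terminator, so input with or
    // without the final line feed yields the same lines.
    for line in data.split_terminator('\n') {
        let (name, rest) = line
            .split_once('\0')
            .ok_or_else(|| format!("missing NUL separator in {:?}", line))?;
        let hex_node = rest
            .get(..40)
            .ok_or_else(|| format!("node too short in {:?}", line))?;
        // Anything after the 40 hex characters is the flag.
        let flag = &rest[40..];
        elements.push(Element {
            name: name.to_string(),
            hex_node: hex_node.to_string(),
            is_directory: flag == "t",
        });
    }
    Ok(elements)
}
```

Other flags are not modeled here; they are simply treated as "not a directory".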

Reviewed By: quark-zju

Differential Revision: D15110860

fbshipit-source-id: c4ac5075e22a8b8851f6b246d22af8ab68f42a74
2019-05-08 10:07:28 -07:00
Stefan Filip
a8bc9fc3a7 manifest: use dynamic dispatch for tree manifest store
Summary:
This is a quality of life improvement for working with the storage layer.
We probably don't gain a whole lot by statically linking the store and it is
useful to have some flexibility in the storage layer.

Differential Revision: D15110859

fbshipit-source-id: 6102acafa21dd1dbaeed0f8fc3147538a8c301d1
2019-05-08 10:07:27 -07:00
Stefan Filip
d6ad49db5b manifest: migrate to types::testutil node
Summary: migration

Differential Revision: D14660306

fbshipit-source-id: 71df6814d93f8b9f814aedaa4ceb558a8b69cdf6
2019-04-08 16:21:08 -07:00
Stefan Filip
1e2816f7c6 types: add testutil module to help writing tests
Summary:
Building test objects can be tedious using various of our bottom bytes.
This diff addresses that issue by adding helper functions in a new module
in the types crate.

Handling this case could be improved in rust.

Differential Revision: D14660307

fbshipit-source-id: a866c1f3ede60ba1b87eb17d35817b8a8d7674a4
2019-04-08 16:21:07 -07:00
Stefan Filip
02851845a9 manifest: Fix skip_subtree on Leaf
Summary:
This diff fixes the behavior of `skip_subtree` when called on a Leaf. The bug is
that the path is not correctly handled in this case. The name of the file
continues to stay in the path resulting in incorrect path names for all
subsequent calls to `path()`.
The high level perspective is that `skip_subtree` is a no-op in a Leaf node.
To fix this, clarify the behavior, and improve readability of the code, we add a new
state that handles popping elements from the path.

Durham noticed this bug when reviewing D14347655.

Reviewed By: quark-zju

Differential Revision: D14654557

fbshipit-source-id: 625278366e492a3048dddc44f9234a06d6928b7e
2019-04-01 11:51:16 -07:00
Stefan Filip
41e75fce3f manifest: fix infinite loop when cursor encounters error
Summary:
quark-zju noticed in code review that `Cursor` could get into an infinite loop when
its results were collected into a Vec<_>. That motivated me to
update `Cursor` to transition to `State::Done` when the cursor
encounters an error. Previously I felt that users of `Cursor` would only be
empowered by having the ability to retry the failure.
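
The failure mode and the fix can be illustrated with a toy iterator. This `Cursor` replays a scripted list of results and is not the real implementation; the `State` enum is reduced to the two variants needed here.

```rust
#[derive(Debug, PartialEq)]
enum State {
    Running,
    Done,
}

/// Toy cursor that replays scripted steps. An error does not advance `pos`,
/// which models a failed fetch that would be hit again on retry.
struct Cursor {
    steps: Vec<Result<u32, String>>,
    pos: usize,
    state: State,
}

impl Cursor {
    fn new(steps: Vec<Result<u32, String>>) -> Self {
        Cursor { steps, pos: 0, state: State::Running }
    }
}

impl Iterator for Cursor {
    type Item = Result<u32, String>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.state == State::Done {
            return None;
        }
        match self.steps.get(self.pos).cloned() {
            Some(Ok(value)) => {
                self.pos += 1;
                Some(Ok(value))
            }
            Some(Err(error)) => {
                // Without this transition the same error would be yielded
                // forever, and collecting into a Vec would never finish.
                self.state = State::Done;
                Some(Err(error))
            }
            None => {
                self.state = State::Done;
                None
            }
        }
    }
}
```

With the transition in place, collecting yields the items up to and including the first error and then stops.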

Reviewed By: quark-zju

Differential Revision: D14393590

fbshipit-source-id: b3e0974ac15d62f3f17790229121c0dec3a6149e
2019-03-11 15:27:52 -07:00
Stefan Filip
b7dee64bd2 manifest: fix tree entry serialization
Summary:
Follow up from D14178264.

Two changes:
 * tree manifest entries must end with a line feed
 * `t` is the byte that flags a directory

Reviewed By: DurhamG

Differential Revision: D14368316

fbshipit-source-id: b0b46c876649b8f25bf0ecdb1266527dbeb33796
2019-03-07 17:51:39 -08:00
Stefan Filip
660992a50a manifest: add tree::diff(Tree, Tree)
Summary:
`manifest::tree::diff()` returns an iterator over the differences between two
tree manifests.

I chose a function that takes two parameters over a method on Tree because it
felt more clear to write `left` and `right`. Also because I am not sure how
iterators would be abstracted on a trait.

Differential Revision: D14347656

fbshipit-source-id: 537574070cd18b08c77b3cd1cf4cff38d77fbf81
2019-03-07 17:46:44 -08:00
Stefan Filip
2deb0e6e42 manifest: add tree::Cursor and Tree::files()
Summary:
Cursor is a utility for iterating over a manifest tree. In this diff it is used
to implement Files. In the future it will be used to do a diff between two tree
manifests.

I am not sure how to describe an iterator return value in the Manifest trait so
I kept the function on the tree only for now. Looking forward to hearing your
suggestions.

Differential Revision: D14347655

fbshipit-source-id: ffd856443d8abe3ebd0557a096bf7a5ec46312d3
2019-03-07 17:46:44 -08:00
Stefan Filip
c305e13566 manfiest: mark FileMetadata as Copy
Summary:
`Node` is marked as `Copy`. `FileMetadata` is not much more than `Node` so it
seems pretty clear that it should be marked `Copy`.

Reviewed By: DurhamG

Differential Revision: D14347657

fbshipit-source-id: 939abf88087bc8c6f942047a08d6a4a0d61e053f
2019-03-07 11:20:07 -08:00
Stefan Filip
5b370ffb72 manifest: move tree link to a separate file
Summary:
Cleaning up the `mod.rs` file so that it provides more signal.
`Link` is an internal implementation detail that other internal components may depend on so it is a great candidate to be moved to a dedicated file.

Differential Revision: D14347654

fbshipit-source-id: e5b5a42faf1e9f9c4a0591e5bd94182391ed511f
2019-03-07 11:20:07 -08:00
Stefan Filip
6d9dc154ca manifest: add flush function to manifests
Summary:
Save, finalize, flush, they mean about the same thing.

The first thing to note is that this implementation is not complete because
the parents are not correctly passed into the hashing function.

The second thing is that store failures make the code a little more complex
than it would have been otherwise.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292713

fbshipit-source-id: 807d7a385a62cb5f4948f1781d3146eaa6502ca9
2019-03-05 16:12:48 -08:00
Stefan Filip
25edcc014b manifest: inline store_entry_to_links
Summary:
This function is a bit on its own with the removal of the pair conversion.
Since it is used in only one place it makes sense to inline it.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292712

fbshipit-source-id: abbf1dc70d61c0ad039f5bc5ed5277d0770e3899
2019-03-05 16:12:48 -08:00
Stefan Filip
c5cc253234 manifest: refactor tests to use store::Entry::from_elements
Summary:
Working on the save mechanism I realized that links_to_store_entry is not that
useful because we can avoid the failure states where we would try to serialize
an ephemeral node. I am removing that function and converting the code that was
using that function to using the Entry constructor directly.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292714

fbshipit-source-id: 54ef46670319c27d90fc78511a1eb6abf47d3acf
2019-03-05 16:12:48 -08:00
Stefan Filip
765659d505 manifest: update tree manifest to take owneship of Store
Summary:
This is what Rust is telling us to do. The situation that triggers this update is
writing to the store. Particularly when the store is an in memory hashmap we
need to have a mutable borrow to the hashmap to insert into it. From a general
point of view this means that any sharing of the store between different
instances of a manifest will have to be handled by the struct that implements
the `Store` trait.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292716

fbshipit-source-id: 6e789527dbdf3cd3ffe967f4900251bf31f7d6b2
2019-03-05 16:12:47 -08:00
Stefan Filip
14f26aa355 manifest: add remove implementation for tree
Summary:
Removes a file from the manifest. Nothing special for it.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14276645

fbshipit-source-id: 85e8ffd6cffee426c73eb627484dfa5a866a364b
2019-03-05 16:12:47 -08:00
Stefan Filip
1e9b7fafe9 manifest: refactor get_link out of manifest::get to allow code reuse
Summary:
It is going to be useful in tests to check how certain internal nodes change
so adding an api that allows fetching an internal node.

(Note: this ignores all push blocking failures!)

Differential Revision: D14276642

fbshipit-source-id: 9a3e488be6031f7b4727a8643f64970dcec8c400
2019-03-05 16:12:47 -08:00
Stefan Filip
e2b06628cd manifest: update Tree::get() to use RepoPath::parents()
Summary:
This removes the need for the local buffer for the parent.

(Note: this ignores all push blocking failures!)

Differential Revision: D14276648

fbshipit-source-id: a9378ea592d502ddf2dcdc35fe6ffa9ba213bc14
2019-03-05 16:12:47 -08:00
Stefan Filip
3c2c09f431 manifest: update tree::Manifest::insert
Summary:
Using the recently added path utilities so that we don't keep a secondary
parent buffer around.
Updating the file insert logic so that it is readable and intuitive.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14276649

fbshipit-source-id: 8e7e835814f0039645601abbf1b701e8c1ed3697
2019-03-05 16:12:47 -08:00
Stefan Filip
82bbd0326f manifest: improve store imports in tree/mod.rs
Summary:
We currently alias many of the imports from the store submodule. It is nicer
if we just import the submodule name and prefix with the submodule.

Differential Revision: D14276638

fbshipit-source-id: 6661df7f9cb5d976b11153003f653a1f66301c9a
2019-03-03 12:35:50 -08:00
Stefan Filip
6825832b19 manifest: order imports
Summary:
The order of imports we're trying to follow is: std, remote crates
(ie: crates.io), local crates (ie: fbcode crates), and crate local. A new line
in between each group can be used to prevent rustfmt from re-ordering them.

Reviewed By: singhsrb

Differential Revision: D14243162

fbshipit-source-id: 6fc2cceb3d6834b602be20b8b8f74e0f61b227e1
2019-02-27 10:00:40 -08:00
Stefan Filip
b20f22f0a7 manifest: implement tree entry serialization/deserialization
Summary:
This diff focuses on adding deserialization. Because the most effective way
of testing deserialization is doing round-trip conversions we also implement
serialization.

`manifest::tree::store::Entry` is the structure that is in charge of performing
serialization and deserialization. We update the Store trait to interface with
this new object.

Differential Revision: D14178264

fbshipit-source-id: bb12262c181a518ba4111d40c079d6836ec44301
2019-02-27 10:00:40 -08:00