Summary: These structs are generally useful when writing code that traverses manifests. As such, let's move them out of the BFS module into the top level of the tree module so that other code can use them.
Reviewed By: xavierd
Differential Revision: D17352891
fbshipit-source-id: b390ec84a29604dc6eef31a95dba976a5224f5e9
Summary: For symmetry with the BFS diff implementation, move the DFS diff implementation into its own module. This will help unclutter mod.rs.
Reviewed By: xavierd
Differential Revision: D17352892
fbshipit-source-id: 61709cd3e430c8676c529fbbbb76a9775c05053d
Summary: Add support for calling the new BFS diff implementation from Python. This diff adds the appropriate glue code to the bindings crate and adds a config option (`treemanifest.bfsdiff`) to enable the new functionality.
Reviewed By: xavierd
Differential Revision: D17334739
fbshipit-source-id: 24aac21910e74a42d625c93bed7fa3aa08e167c0
Summary: This diff provides an implementation of the diff operation for trees which processes directories in BFS order (i.e., layer by layer). This allows the iterator to perform a bulk prefetch of the changed nodes in each layer at the start of each layer of the traversal. This should hopefully provide a more efficient fetch pattern than the existing implementation, which requires a full prefetch of both trees upfront for reasonable performance.
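The layer-by-layer idea can be sketched roughly as follows. This is a minimal illustration and not the code in this diff: `Store`, `Child`, `NodeId`, and `bfs_diff` are hypothetical names, directories are modelled as in-memory maps, and `prefetch` only records what was requested so the batching is visible.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

// Hypothetical stand-ins for the real manifest types.
type NodeId = u32;

#[derive(Clone, Copy, PartialEq)]
enum Child {
    File(NodeId),
    Dir(NodeId),
}

type Dir = BTreeMap<String, Child>;

#[derive(Default)]
struct Store {
    dirs: HashMap<NodeId, Dir>,
    // Records each bulk prefetch request (one per layer).
    prefetch_batches: Vec<Vec<NodeId>>,
}

impl Store {
    fn prefetch(&mut self, keys: Vec<NodeId>) {
        self.prefetch_batches.push(keys);
    }

    // A missing side is treated as an empty directory.
    fn get(&self, id: Option<NodeId>) -> Dir {
        id.and_then(|id| self.dirs.get(&id).cloned()).unwrap_or_default()
    }
}

/// Returns the paths of files that differ between the two trees.
fn bfs_diff(store: &mut Store, left: NodeId, right: NodeId) -> Vec<String> {
    let mut changed = Vec::new();
    if left == right {
        return changed;
    }
    let mut layer = vec![(String::new(), Some(left), Some(right))];
    while !layer.is_empty() {
        // One bulk prefetch for every directory in this layer.
        let keys: Vec<NodeId> = layer
            .iter()
            .flat_map(|(_, l, r)| l.iter().chain(r.iter()).copied())
            .collect();
        store.prefetch(keys);

        let mut next = Vec::new();
        for (path, l, r) in layer {
            let (ldir, rdir) = (store.get(l), store.get(r));
            let names: BTreeSet<&String> = ldir.keys().chain(rdir.keys()).collect();
            for name in names {
                let (lc, rc) = (ldir.get(name).copied(), rdir.get(name).copied());
                if lc == rc {
                    continue; // identical entries: nothing below can differ
                }
                let child_path = format!("{}{}", path, name);
                // Changed directories are queued for the next layer.
                let ld = match lc { Some(Child::Dir(id)) => Some(id), _ => None };
                let rd = match rc { Some(Child::Dir(id)) => Some(id), _ => None };
                if ld.is_some() || rd.is_some() {
                    next.push((format!("{}/", child_path), ld, rd));
                }
                // Changed files are reported immediately.
                let lf = matches!(lc, Some(Child::File(_)));
                let rf = matches!(rc, Some(Child::File(_)));
                if lf || rf {
                    changed.push(child_path);
                }
            }
        }
        layer = next;
    }
    changed
}
```

In this sketch only directories whose ids differ are ever queued, so each prefetch batch contains just the changed nodes of one layer.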
Reviewed By: xavierd
Differential Revision: D17276971
fbshipit-source-id: 284f1d458f43cb76befe27e85f53a641f29d7550
Summary:
Add a `prefetch` method to the `TreeStore` trait. This will be used by code using the store to signal that certain keys will be accessed soon. The default implementation is a no-op, but in the case of stores where prefetching makes sense (such as stores backed by remote servers), the default implementation can be overridden to include the appropriate prefetching logic.
For now, this change is a no-op, but later in this stack it will be used to signal to the underlying Python data store to perform the appropriate tree fetches via the Eden API. This will be used to support a more efficient pattern of bulk tree fetches during the diff operation.
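A rough sketch of the shape described above, assuming a simplified trait: the `Key` alias, the `get` signature, the `Result<(), String>` return type, and both store structs are hypothetical and only illustrate a default no-op method that a remote-backed store overrides.

```rust
use std::cell::RefCell;

// Hypothetical key type; the real store uses its own key struct.
type Key = String;

trait TreeStore {
    fn get(&self, key: &Key) -> Option<Vec<u8>>;

    /// Hint that `keys` will be accessed soon. The default does nothing.
    fn prefetch(&self, _keys: &[Key]) -> Result<(), String> {
        Ok(())
    }
}

/// A purely local store has nothing to prefetch and keeps the default.
struct LocalStore;

impl TreeStore for LocalStore {
    fn get(&self, _key: &Key) -> Option<Vec<u8>> {
        None
    }
}

/// A store backed by a remote server overrides `prefetch`.
/// Here it only records the request instead of doing network I/O.
struct RemoteStore {
    requested: RefCell<Vec<Key>>,
}

impl TreeStore for RemoteStore {
    fn get(&self, _key: &Key) -> Option<Vec<u8>> {
        None
    }

    fn prefetch(&self, keys: &[Key]) -> Result<(), String> {
        self.requested.borrow_mut().extend_from_slice(keys);
        Ok(())
    }
}
```

Callers can invoke `prefetch` unconditionally; stores that do not benefit from it pay nothing.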
Reviewed By: sfilipco
Differential Revision: D17276970
fbshipit-source-id: 22a5d847e5be5dbf1b0a74b47587a98d840b8cdc
Summary:
It is somewhat difficult to fetch the raw entry on the p1 side in the Rust
Manifests. These entries are used to write deltas to revlogs or to datapacks.
Reviewed By: xavierd
Differential Revision: D17143551
fbshipit-source-id: 6624116324664354d199d5f6ac55712c8ed29b9d
Summary:
I had assumed that we store p1 and p2 in the same order that they are used in
Node computation. That is incorrect. In general p1 and p2 are assumed to have
an ordering that matters and it's Node computation that is specific.
Reviewed By: quark-zju
Differential Revision: D17125743
fbshipit-source-id: 3a2673d9c243e2d2103aba0cb4fd8f536386efa7
Summary:
In some cases the finalize algorithm is used to persist data that is received
in a bundle. The process is that it constructs a store from the bundle and
goes to construct a tree with the root node received. It then goes through
finalize to generate the entries that need to be written to local storage.
Reviewed By: quark-zju
Differential Revision: D17125149
fbshipit-source-id: de5a1e922a6aebe48e238d8473177a8d3f7a9ef5
Summary:
`listdir` returns the contents of a directory in the manifest. The format
is pretty simple, containing only the simple names of the files and/or
directories. I don't know if this is something that eden can use because
it seems too simple. In other words, we have something but we may want
to iterate on it before we market it broadly.
Reviewed By: quark-zju
Differential Revision: D17098082
fbshipit-source-id: d6aa42c96781cf1f8b2e916fa10bb275593bdc65
Summary:
The C++ manifest implements walksubdirtrees which is used to compute the packs
that a "client" wants for a prefetch. In terms of interface the function is very
annoying and couples with storage and tree representations without being part
of any of them.
We reproduce that functionality as a means to replace the C++ implementation.
The long term goal is to do lazy fetches using an iteration style that plays
nicer with batching downloads.
This change also includes fastmanifest updates because they are required to
enable the walksubdirtrees functionality in our tests.
Reviewed By: quark-zju
Differential Revision: D17086669
fbshipit-source-id: 6c1f9fbf975814f0a2071f8d1c8e022e5ad58e29
Summary:
The insert code was unclear about what kind of issue it ran into when
inserting files. Sometimes the file we want to insert is a directory and
other times it wants to traverse a directory. This change makes those
situations clear along with some other corner case behaviors.
Reviewed By: quark-zju
Differential Revision: D16775354
fbshipit-source-id: 50ab6bc52b70cc5cef013d11050eb3cdf5b160a5
Summary:
Updating the manifest implementation for remove with the intended API.
When I originally implemented remove I wasn't confident about the
best way to implement it. As I've gained more experience, I feel
confident that doing two iterations over the tree is a good approach
for this method. The first iteration validates that the file
exists, then the second iteration actually traverses down, updating
the nodes to mutable ephemerals.
Reviewed By: quark-zju
Differential Revision: D16775353
fbshipit-source-id: 8ebee9ca347efcb694a6d27c1eeae2c149643766
Summary:
`get_link` started as a test function that broadened in scope but did not have
its behavior updated as it started to be used more broadly.
There is no reason to error out when we request a path that has parent files in
the manifest.
Reviewed By: quark-zju
Differential Revision: D16775356
fbshipit-source-id: a320926100378f16d723ca204746906e79c7752e
Summary: Matching more of the existing API.
Reviewed By: quark-zju
Differential Revision: D16607233
fbshipit-source-id: 7a71f22089067ecfccbfcb2ad072fbf21e360439
Summary:
This allows us to say that the queried path was a directory and simplifies
the API a little bit.
The Python/C++ Manifest API provide inspection of directories through several
functions. We want the Rust manifest to provide the same functionality.
This approach looks to have more consistency as the API evolves.
`FsNode` is a structure that is forward looking. We will want to add an API
that lists a directory and an iterator of FsNode fits that well.
I also felt the need for it over the development of the manifest code.
Reviewed By: quark-zju
Differential Revision: D16609253
fbshipit-source-id: f826d7b21e3001f4bef43a35b9d1a9bc5a59eda9
Summary:
Prints out a tree that can make it easy to determine the data
that is in memory.
Reviewed By: quark-zju
Differential Revision: D16571839
fbshipit-source-id: dee8a0c6564853d49a72fa29bd53c6b09a7f3ddf
Summary:
There is no good reason to return a reference for FileMetadata. It is a relatively
small object that implements the `Copy` trait so returning a copy is
appropriate.
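As a small illustration of the difference, assuming a simplified layout: the fields of `FileMetadata`, the `FileType` variants, and the `Manifest::get` lookup below are hypothetical stand-ins for the real types.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq)]
struct Node([u8; 20]);

#[derive(Clone, Copy, Debug, PartialEq)]
enum FileType {
    Regular,
    Executable,
    Symlink,
}

// Small and `Copy`, so handing out a value is cheap.
#[derive(Clone, Copy, Debug, PartialEq)]
struct FileMetadata {
    node: Node,
    file_type: FileType,
}

struct Manifest {
    files: HashMap<String, FileMetadata>,
}

impl Manifest {
    // Returns an owned copy rather than `Option<&FileMetadata>`, so the
    // result does not keep the manifest borrowed.
    fn get(&self, path: &str) -> Option<FileMetadata> {
        self.files.get(path).copied()
    }
}
```

Because the returned value is owned, the caller can keep it while mutating the manifest afterwards.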
Reviewed By: quark-zju
Differential Revision: D16571838
fbshipit-source-id: 0c315c9f405e425832d39da5c67809dd15b4ab5e
Summary:
I wasn't sure how to test this. I implemented `diff` then got it to work
with `hg show` by adding the missing methods.
Reviewed By: quark-zju
Differential Revision: D16497354
fbshipit-source-id: 727979ad8ce4a4615e85ea96c3fe6413aa20b267
Summary:
Diffs can be filtered by a matcher to narrow down the number of files returned.
Adding this capability to the rust tree manifest implementation.
Reviewed By: kulshrax
Differential Revision: D16497355
fbshipit-source-id: fee07112b5bcff63c7c4115f28dade79f41fe6bc
Summary:
Most operations do not iterate over all the files of the repository.
Most operations filter the data set in some way. In many cases this filtering
is to a set of files and in some cases using patterns or subdirectories.
The Python code uses `match.py` to represent this filtering. The parallel
data structure in the rust code is `pathmatcher::Matcher`.
This diff adds `files` integration with `pathmatcher::Matcher`.
Reviewed By: quark-zju
Differential Revision: D16352527
fbshipit-source-id: 8b61ac7399f581773bf61ff648634cbc6e1a27b6
Summary:
Title. Small clean up. I think that this makes sense because the code in
file.rs is self contained. It provides a distraction free environment for
adding methods to the structures in there.
Reviewed By: quark-zju
Differential Revision: D16352531
fbshipit-source-id: c23e943198e0a4b50aa00c75e67b13bc4c3ee976
Summary: Bindings so that the rust manifest code can be used in Python.
Reviewed By: quark-zju
Differential Revision: D16352532
fbshipit-source-id: 34d4522f5e084f531f31bcd21770950f15f2fe13
Summary:
This diff sets two Rust lints to warn in fbcode:
```
[rust]
warn_lints = bare_trait_objects, ellipsis_inclusive_range_patterns
```
and fixes occurrences of those warnings within common/rust, hg, and mononoke.
Both of these lints are set to warn by default starting with rustc 1.37. Enabling them early avoids writing even more new code that needs to be fixed when we pull in 1.37 in six weeks.
Upstream tracking issue: https://github.com/rust-lang/rust/issues/54910
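For reference, the kind of change these two lints ask for looks roughly like this. The `Shape` trait and helper functions are made up for illustration; the old spellings appear only in comments.

```rust
trait Shape {
    fn area(&self) -> u32;
}

struct Square(u32);

impl Shape for Square {
    fn area(&self) -> u32 {
        self.0 * self.0
    }
}

// bare_trait_objects: `Box<Shape>` becomes `Box<dyn Shape>`.
fn make_shape(side: u32) -> Box<dyn Shape> {
    Box::new(Square(side))
}

// ellipsis_inclusive_range_patterns: `0...9` becomes `0..=9`.
fn is_digit(n: u32) -> bool {
    match n {
        0..=9 => true,
        _ => false,
    }
}
```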
Reviewed By: Imxset21
Differential Revision: D16200291
fbshipit-source-id: aca11a7a944e9fa95f94e226b52f6f053b97ec74
Summary: A recent update to the crates in tp2 bumped the `once_cell` crate to version 0.2.0. This broke the build because the `Cargo.toml` for the `manifest` crate specified version 0.1.8. Apparently just changing the crate version to 0.2.0 fixes the build, so we weren't affected by whatever breaking changes were made to the crate.
Reviewed By: DurhamG
Differential Revision: D15492142
fbshipit-source-id: 552b0a751ab7c2aab5f0fbcb1124de4ea427790c
Summary:
The old code did not provide enough information to start debugging
a problem in the serialization code.
(Note: this ignores all push blocking failures!)
Reviewed By: quark-zju
Differential Revision: D15380388
fbshipit-source-id: 9da51cfe4d735de3961e840bd9cd2a1595131cd9
Summary:
Flush encapsulates the closed abstraction approach.
Finalize addresses the implementations that need to handle
history.
(Note: this ignores all push blocking failures!)
Reviewed By: quark-zju
Differential Revision: D15290173
fbshipit-source-id: 2cfd7a3f7ca0de1c2566ce3ae680d5b1451e4f91
Summary:
The plan is to have finalize do the complicated history managing stuff and
have flush be a lightweight content-only hashing approach to storage.
Flush will be used in tests in the short term.
Finalize would be used in Python to write to Revlogs and handle linknodes.
(Note: this ignores all push blocking failures!)
Reviewed By: quark-zju
Differential Revision: D15290171
fbshipit-source-id: 54ed5c791254507fd9f2e874315284b70bc6b597
Summary:
We want Manifests to share their durable nodes.
The path that I want to take is to enable cloning for individual manifests.
What to do with the storage then? Share it in an `Arc`.
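A minimal sketch of that arrangement, with hypothetical `Store` and `Manifest` structs standing in for the real ones: cloning a manifest copies its own state but only bumps the reference count on the shared store.

```rust
use std::sync::Arc;

// Stand-in for the real storage backend.
struct Store {
    blobs: Vec<String>,
}

// Deriving `Clone` works even though `Store` is not `Clone`,
// because only the `Arc` pointer is cloned.
#[derive(Clone)]
struct Manifest {
    store: Arc<Store>,
    root: usize,
}

impl Manifest {
    fn new(store: Arc<Store>, root: usize) -> Self {
        Manifest { store, root }
    }

    fn root_blob(&self) -> Option<&String> {
        self.store.blobs.get(self.root)
    }
}
```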
(Note: this ignores all push blocking failures!)
Reviewed By: quark-zju
Differential Revision: D15290170
fbshipit-source-id: 621d5d70700c4372c3b40a246fc45d9446b13ac5
Summary:
This enables more clear exporting so that it can be implemented
externally.
(Note: this ignores all push blocking failures!)
Reviewed By: quark-zju, xavierd
Differential Revision: D15290172
fbshipit-source-id: 25541c88abab228a0083e5a27e74db31e6819230
Summary:
A tree manifest entry must always end with a line feed. It is somewhat
redundant but that's how the serialization is defined. Sometimes that last
line feed is missing in our data. I don't know why.
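One way to tolerate the missing final line feed when reading is sketched below. It assumes a simplified text form of the entry format (`name`, a NUL byte, a hex node, an optional `t` flag for directories, then a line feed); `parse_entries` is a hypothetical helper, not the parser in this crate.

```rust
/// Parses entries into `(name, hex_node, is_directory)` tuples.
/// The last entry is accepted with or without its trailing line feed.
/// Lines without a NUL separator are skipped in this sketch.
fn parse_entries(data: &str) -> Vec<(String, String, bool)> {
    data.split('\n')
        // A well-formed buffer ends in '\n', which leaves one empty
        // trailing piece; a buffer missing it leaves none.
        .filter(|line| !line.is_empty())
        .filter_map(|line| {
            let mut parts = line.splitn(2, '\0');
            let name = parts.next()?;
            let rest = parts.next()?;
            let is_dir = rest.ends_with('t');
            // Hex digits never include 't', so trimming it is safe.
            let node = rest.trim_end_matches('t');
            Some((name.to_string(), node.to_string(), is_dir))
        })
        .collect()
}
```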
Reviewed By: quark-zju
Differential Revision: D15110860
fbshipit-source-id: c4ac5075e22a8b8851f6b246d22af8ab68f42a74
Summary:
This is a quality of life improvement for working with the storage layer.
We probably don't gain a whole lot by statically linking the store and it is
useful to have some flexibility in the storage layer.
Differential Revision: D15110859
fbshipit-source-id: 6102acafa21dd1dbaeed0f8fc3147538a8c301d1
Summary:
Building test objects can be tedious using various of our bottom bytes.
This diff addresses that issue by adding helper functions in a new module
in the types crate.
Handling this case could be improved in rust.
Differential Revision: D14660307
fbshipit-source-id: a866c1f3ede60ba1b87eb17d35817b8a8d7674a4
Summary:
This diff fixes the behavior of `skip_subtree` when called on a Leaf. The bug is
that the path is not correctly handled in this case. The name of the file
continues to stay in the path resulting in incorrect path names for all
subsequent calls to `path()`.
The high level perspective is that `skip_subtree` is a no-op in a Leaf node.
To fix this, clarify the behavior, and improve the readability of the code, we
add a new state that handles popping elements from the path.
Durham noticed this bug when reviewing D14347655.
Reviewed By: quark-zju
Differential Revision: D14654557
fbshipit-source-id: 625278366e492a3048dddc44f9234a06d6928b7e
Summary:
quark-zju noticed in code review that `Cursor` could get into an infinite loop when
its results were collected into a Vec<_>. That was the motive that I
needed to update `Cursor` to transition to `State::Done` when the cursor
encounters an error. Previously I felt that users of `Cursor` would only be
empowered by having the ability to retry the failure.
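The failure mode and the fix can be illustrated with a toy iterator. This `Cursor` is a hypothetical stand-in that counts upward and fails at a fixed position; only the `State::Done` transition mirrors the change described here.

```rust
enum State {
    Running,
    Done,
}

struct Cursor {
    position: u32,
    fail_at: u32,
    state: State,
}

impl Iterator for Cursor {
    type Item = Result<u32, String>;

    fn next(&mut self) -> Option<Self::Item> {
        if let State::Done = self.state {
            return None;
        }
        if self.position == self.fail_at {
            // Without this transition every later call would return the
            // same error again, and collecting into a Vec would never end.
            self.state = State::Done;
            return Some(Err(format!("failed to load entry {}", self.position)));
        }
        let item = self.position;
        self.position += 1;
        Some(Ok(item))
    }
}
```

The trade-off is that callers can no longer retry by calling `next()` again after an error.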
Reviewed By: quark-zju
Differential Revision: D14393590
fbshipit-source-id: b3e0974ac15d62f3f17790229121c0dec3a6149e
Summary:
Follow up from D14178264.
Two changes:
* tree manifest entries must end with a line feed
* `t` is the byte that flags a directory
Reviewed By: DurhamG
Differential Revision: D14368316
fbshipit-source-id: b0b46c876649b8f25bf0ecdb1266527dbeb33796
Summary:
`manifest::tree::diff()` returns an iterator over the differences between two
tree manifests.
I chose a function that takes two parameters over a method on Tree because it
felt more clear to write `left` and `right`. Also because I am not sure how
iterators would be abstracted on a trait.
Differential Revision: D14347656
fbshipit-source-id: 537574070cd18b08c77b3cd1cf4cff38d77fbf81
Summary:
Cursor is a utility for iterating over a manifest tree. In this diff it is used
to implement Files. In the future it will be used to do a diff between two tree
manifests.
I am not sure how to describe an iterator return value in the Manifest trait so
I kept the function on the tree only for now. Looking forward to hearing your
suggestions.
Differential Revision: D14347655
fbshipit-source-id: ffd856443d8abe3ebd0557a096bf7a5ec46312d3
Summary:
`Node` is marked as `Copy`. `FileMetadata` is not much more than `Node` so it
seems pretty clear that it should be marked `Copy`.
Reviewed By: DurhamG
Differential Revision: D14347657
fbshipit-source-id: 939abf88087bc8c6f942047a08d6a4a0d61e053f
Summary:
Cleaning up the `mod.rs` file so that it provides more signal.
`Link` is an internal implementation detail that other internal components may depend on so it is a great candidate to be moved to a dedicated file.
Differential Revision: D14347654
fbshipit-source-id: e5b5a42faf1e9f9c4a0591e5bd94182391ed511f
Summary:
Save, finalize, flush, they mean about the same thing.
The first thing to note is that this implementation is not complete because
the parents are not correctly passed into the hashing function.
The second thing is that store failures make the code a little more complex
than it would have been otherwise.
(Note: this ignores all push blocking failures!)
Reviewed By: quark-zju
Differential Revision: D14292713
fbshipit-source-id: 807d7a385a62cb5f4948f1781d3146eaa6502ca9
Summary:
This function is a bit on its own with the removal of the pair conversion.
Since it is used in only one place it makes sense to inline it.
(Note: this ignores all push blocking failures!)
Reviewed By: quark-zju
Differential Revision: D14292712
fbshipit-source-id: abbf1dc70d61c0ad039f5bc5ed5277d0770e3899
Summary:
Working on the save mechanism I realized that links_to_store_entry is not that
useful because we can avoid the failure states where we would try to serialize
an ephemeral node. I am removing that function and converting the code that was
using that function to using the Entry constructor directly.
(Note: this ignores all push blocking failures!)
Reviewed By: quark-zju
Differential Revision: D14292714
fbshipit-source-id: 54ef46670319c27d90fc78511a1eb6abf47d3acf
Summary:
This is what Rust is telling us to do. The situation that triggers this update is
writing to the store. Particularly when the store is an in memory hashmap we
need to have a mutable borrow to the hashmap to insert into it. From a general
point of view this means that any sharing of the store between different
instances of a manifest will have to be handled by the struct that implements
the `Store` trait.
(Note: this ignores all push blocking failures!)
Reviewed By: quark-zju
Differential Revision: D14292716
fbshipit-source-id: 6e789527dbdf3cd3ffe967f4900251bf31f7d6b2
Summary:
Removes a file from the manifest. Nothing special for it.
(Note: this ignores all push blocking failures!)
Reviewed By: quark-zju
Differential Revision: D14276645
fbshipit-source-id: 85e8ffd6cffee426c73eb627484dfa5a866a364b
Summary:
It is going to be useful in tests to check how certain internal nodes change
so adding an api that allows fetching an internal node.
(Note: this ignores all push blocking failures!)
Differential Revision: D14276642
fbshipit-source-id: 9a3e488be6031f7b4727a8643f64970dcec8c400
Summary:
This removes the need for the local buffer for the parent.
(Note: this ignores all push blocking failures!)
Differential Revision: D14276648
fbshipit-source-id: a9378ea592d502ddf2dcdc35fe6ffa9ba213bc14
Summary:
Using the recently added path utilities so that we don't keep a secondary
parent buffer around.
Updating the file insert logic so that it is readable and intuitive.
(Note: this ignores all push blocking failures!)
Reviewed By: quark-zju
Differential Revision: D14276649
fbshipit-source-id: 8e7e835814f0039645601abbf1b701e8c1ed3697