Commit Graph

143 Commits

Author SHA1 Message Date
Arun Kulshreshtha
9c6b914a22 types: move Key and NodeInfo out of revisionstore
Summary:
In order to move the types in `edenapi-types` (containing types shared between Mercurial and Mononoke) to the `types` crate, we need to move a few types from the `revisionstore` crate into this crate first, because `revisionstore` depends on `types`, which would create a circular dependency since `edenapi-types` uses types from `revisionstore`.

In particular, this diff moves the `Key` and `NodeInfo` types into their own modules in the `types` crate.

Reviewed By: quark-zju

Differential Revision: D14114166

fbshipit-source-id: 8f9e78d610425faec9dc89ecc9e450651d24177a
2019-02-15 22:51:04 -08:00
Xavier Deguillard
76316fbf9d revisionstore: verify repacked keys before deleting pack files
Summary:
During repack, the repacked files are deleted without any verification. Since
Adam saw some data loss, it's possible that somehow repack didn't fully repack
a packfile but it was deleted. Let's verify that the entire packfile was
repacked before deleting it.

Since repack is mostly a background operation, we don't have a way to notify
the user, but we can log the error to a scuba table to analyse further.
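
A minimal sketch of the verification step described above, assuming a simplified `Key` type and that each pack can list its keys. The names `missing_keys` and `safe_to_delete` are hypothetical and are not taken from the actual change.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a pack key (file path plus node hash).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Key {
    pub path: Vec<u8>,
    pub node: [u8; 20],
}

/// Returns the keys of `old_pack` that are absent from `new_pack`.
/// An empty result means the old pack was fully repacked.
pub fn missing_keys(old_pack: &[Key], new_pack: &[Key]) -> Vec<Key> {
    let repacked: HashSet<&Key> = new_pack.iter().collect();
    old_pack
        .iter()
        .filter(|key| !repacked.contains(*key))
        .cloned()
        .collect()
}

/// Decides whether the old pack may be deleted. On failure the caller
/// would log the missing keys instead of deleting the file.
pub fn safe_to_delete(old_pack: &[Key], new_pack: &[Key]) -> Result<(), Vec<Key>> {
    let missing = missing_keys(old_pack, new_pack);
    if missing.is_empty() {
        Ok(())
    } else {
        Err(missing)
    }
}
```

Returning the missing keys in the error gives the background job something concrete to log when it cannot notify the user.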

Reviewed By: DurhamG

Differential Revision: D14069766

fbshipit-source-id: 4358a87deeb9732eec1afdfb742e8d81db41cd87
2019-02-14 13:03:09 -08:00
Xavier Deguillard
e5a7da32da revisionstore: rename the packfile before removal on windows
Summary:
Removing files on Windows is hard. It can fail for many reasons, many of which
involve another process having the file open in some way. One way around this
is to rename the file first, since renaming isn't as restrictive as removing.

Since hg repack will attempt removing any temporary files it will also try to
remove the packfiles that we failed to remove earlier.
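
The rename-then-remove idea can be sketched with the standard library alone. The `.to-delete` suffix and the function name are assumptions made for illustration; the suffix the real code uses is not stated here.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Renames `path` to a sibling with a hypothetical `.to-delete` suffix and
/// then tries to remove it. If removal fails, the renamed path is returned
/// so that a later cleanup pass can retry.
pub fn rename_then_remove(path: &Path) -> io::Result<Option<PathBuf>> {
    let mut renamed = path.as_os_str().to_owned();
    renamed.push(".to-delete");
    let renamed = PathBuf::from(renamed);

    // The rename takes the file out of the set of live packfiles even if
    // the removal below does not succeed.
    fs::rename(path, &renamed)?;
    match fs::remove_file(&renamed) {
        Ok(()) => Ok(None),
        Err(_) => Ok(Some(renamed)),
    }
}
```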

Reviewed By: DurhamG

Differential Revision: D14030445

fbshipit-source-id: 1f3799e021c2e0451943a1d5bd4cd25ed608ffb6
2019-02-14 10:34:52 -08:00
Xavier Deguillard
8c40ed3a71 revisionstore: ignore AlreadyExists errors when persisting a mutable pack
Summary:
Packfiles are named based on their content, so having an on-disk file with the
same name means that they have the same content. If that happens, let's simply
continue without failing.
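
A sketch of the same policy using only `std`, not the code from this diff: the file is created with `create_new`, and an `AlreadyExists` error is treated as success. `persist_if_absent` is a hypothetical name.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Writes `contents` to `dest` unless a file with that name already exists.
/// Returns Ok(true) if the file was written and Ok(false) if it was already
/// there, which for content-addressed names means the content is identical.
pub fn persist_if_absent(dest: &Path, contents: &[u8]) -> io::Result<bool> {
    match OpenOptions::new().write(true).create_new(true).open(dest) {
        Ok(mut file) => {
            file.write_all(contents)?;
            Ok(true)
        }
        // Same name implies same content, so this is not an error.
        Err(err) if err.kind() == io::ErrorKind::AlreadyExists => Ok(false),
        Err(err) => Err(err),
    }
}
```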

Reviewed By: DurhamG

Differential Revision: D14030446

fbshipit-source-id: f04c15507c89b2fca19c95a7b41d8e65c88da019
2019-02-14 10:34:52 -08:00
Xavier Deguillard
73aed5c3d2 revisionstore: do not attempt repacking one packfile
Summary:
Repacking one packfile will yield the same packfile, so we can save some IO by
not trying to repack.

Differential Revision: D14013789

fbshipit-source-id: 8069840cc7cb1837eb94cea97e50b3bbaa548873
2019-02-12 11:21:34 -08:00
Arun Kulshreshtha
cd9197c25d revisionstore: fix import ordering
Summary:
We've settled on the following grouping for imports:

- standard library
- 3rd party crates
- internal crates
- modules within same crate

This diff updates revisionstore accordingly.

Reviewed By: singhsrb

Differential Revision: D14030243

fbshipit-source-id: 74a7897342e39eb1d80202c8aae8c149bf08fc41
2019-02-11 15:47:36 -08:00
Xavier Deguillard
a4311ec1df memcache: implement get_hist
Summary:
We can now fetch history data stored in memcache and write it to a history
pack.

Reviewed By: DurhamG

Differential Revision: D13975308

fbshipit-source-id: 2196328ad60a55d1e2b39d88d939f434e496837a
2019-02-08 12:56:06 -08:00
Xavier Deguillard
d9153f1565 memcache: proper serde serialization
Summary:
The initial get_data/set_data only sent the full-text to memcache, which is
just enough for non-LFS data. Let's use Serde to serialize/deserialize the data
that we send to memcache. This will make it simple to add checksumming, or more
metadata to it.

Reviewed By: DurhamG

Differential Revision: D13974714

fbshipit-source-id: 41a235e1d1e8128b14f00b668745f4f9a070a360
2019-02-08 12:56:06 -08:00
Xavier Deguillard
9439d09d10 revisionstore: implement IterableStore for UnionStore
Summary:
The IterableStore trait allows iterating over all the keys of a DataStore.
Since this is applicable to a UnionStore, let's implement it there. We can now
use it in their async variants.

Regarding the async variants, the code effectively builds a Vec of Key, which
may use a lot of memory. A better alternative would be to use a Stream of Key;
this will be tackled later.
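
A simplified sketch of what implementing the trait on a union could look like. The trait signature, the `String` key type and `MemStore` are all assumptions for illustration; the real trait in revisionstore may differ.

```rust
/// Hypothetical, simplified key type.
pub type Key = String;

/// Simplified stand-in for the trait: list every key a store holds.
pub trait IterableStore {
    fn iter_keys(&self) -> Vec<Key>;
}

/// A union of several stores of the same type.
pub struct UnionStore<T> {
    stores: Vec<T>,
}

impl<T> UnionStore<T> {
    pub fn new(stores: Vec<T>) -> Self {
        UnionStore { stores }
    }
}

impl<T: IterableStore> IterableStore for UnionStore<T> {
    /// Concatenates the keys of every underlying store, in store order.
    /// Duplicates are not removed in this sketch.
    fn iter_keys(&self) -> Vec<Key> {
        self.stores
            .iter()
            .flat_map(|store| store.iter_keys())
            .collect()
    }
}

/// Toy in-memory store used only for the example.
pub struct MemStore(pub Vec<Key>);

impl IterableStore for MemStore {
    fn iter_keys(&self) -> Vec<Key> {
        self.0.clone()
    }
}
```

Like the async variants mentioned above, this builds a full `Vec` of keys in memory.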

Reviewed By: DurhamG

Differential Revision: D13951905

fbshipit-source-id: 15944b18d7ffea08d191e5dc7e1b8e2b783f69d1
2019-02-08 12:56:06 -08:00
Xavier Deguillard
747cc15fbf revisionstore: remove Rc from UnionDataStore and UnionHistoryStore
Summary:
The Rc is required by the c_api, but there is no longer a reason for
UnionDataStore and UnionHistoryStore to use an Rc, so let's move the Rc into
c_api.

Reviewed By: DurhamG

Differential Revision: D13928332

fbshipit-source-id: a93b54e022d539dc4df9144a8c59e9ffbe3453e0
2019-02-04 09:30:23 -08:00
Xavier Deguillard
f11c7fbf26 revisionstore: remove Clone requirement from UnionStore
Summary:
By specifying the IntoIterator differently, we can avoid the clone requirement.
Since Clone isn't implemented on either DataPack or HistoryPack, this will
simplify the callers a bit.

Reviewed By: DurhamG

Differential Revision: D13928274

fbshipit-source-id: f0261c50d73868689ebb3ae226f84d41c4c40925
2019-02-04 09:30:23 -08:00
Xavier Deguillard
82af74b019 revisionstore: add blanket HistoryStore implementation for Rc, Arc and Box
Summary: This way, HistoryStore type constraint will work with these types.

Reviewed By: DurhamG

Differential Revision: D13928128

fbshipit-source-id: aaa9f2633166c137dca5fc2b1f44caab92b57a80
2019-02-04 09:30:23 -08:00
Xavier Deguillard
fb2b0f48d3 revisionstore: add blanket DataStore implementation for Rc, Arc and Box
Summary: This way, DataStore type constraint will work with these types.
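
A minimal sketch of such forwarding implementations, assuming a one-method `DataStore` trait (the real trait has a different, larger interface). `OneEntry` and `fetch` are hypothetical helpers.

```rust
use std::rc::Rc;
use std::sync::Arc;

/// Simplified stand-in for the real trait.
pub trait DataStore {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}

// Each smart pointer forwards to the store it points at.
impl<T: DataStore + ?Sized> DataStore for Box<T> {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        (**self).get(key)
    }
}

impl<T: DataStore + ?Sized> DataStore for Rc<T> {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        (**self).get(key)
    }
}

impl<T: DataStore + ?Sized> DataStore for Arc<T> {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        (**self).get(key)
    }
}

/// Toy store holding a single entry under the key "a".
pub struct OneEntry;

impl DataStore for OneEntry {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        if key == "a" {
            Some(b"data".to_vec())
        } else {
            None
        }
    }
}

/// A generic function with a `DataStore` bound now accepts the wrappers too.
pub fn fetch<S: DataStore>(store: S, key: &str) -> Option<Vec<u8>> {
    store.get(key)
}
```

The `?Sized` bound also lets `Box<dyn DataStore>` satisfy the constraint.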

Reviewed By: DurhamG

Differential Revision: D13928090

fbshipit-source-id: 1567556e3ffea2901acbc754b3bd67491e23056b
2019-02-04 09:30:23 -08:00
Xavier Deguillard
4c4e2a6909 revisionstore: remove RefCell from UnionStore
Summary: The UnionStore doesn't need internal mutability, so let's simplify it.

Reviewed By: DurhamG

Differential Revision: D13928058

fbshipit-source-id: f0ba085ff8401dcc99fc69c3eb6f5e20c071d650
2019-02-04 09:30:23 -08:00
Arun Kulshreshtha
e80ea448d2 revisionstore: reexport Key at top level
Summary: title

Differential Revision: D13858151

fbshipit-source-id: 9f188c2a21382de65eb7febc45a46e10763771b3
2019-01-29 11:45:23 -08:00
Arun Kulshreshtha
872ecdaf30 revisionstore: derive Serialize and Deserialize for Key
Summary: Similar to previous diff in this stack, make this type serializable so we can send it as part of an HTTP request.

Reviewed By: singhsrb

Differential Revision: D13858440

fbshipit-source-id: 9173a3e76bcfa6a6600d30ada39d65475f95bc5e
2019-01-29 04:44:16 -08:00
Durham Goode
725eb4da33 windows: fix the build
Summary:
The conditional if statement did not prevent the logic inside the
condition from being compiled, which in this case fails on windows. Instead of
using an if, let's just define two functions and conditionally compile the
functions.
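
The difference is that `if cfg!(unix) { ... }` still type-checks both branches on every platform, while `#[cfg(...)]` removes the item entirely. A sketch of the two-function approach, with a hypothetical `make_readonly` helper and an assumed mode of `0o444`:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Unix-only version: uses `PermissionsExt`, which does not exist on
/// Windows, so this function must not be compiled there.
#[cfg(unix)]
pub fn make_readonly(path: &Path) -> io::Result<()> {
    use std::os::unix::fs::PermissionsExt;
    // 0o444: readable by everyone, writable by no one.
    fs::set_permissions(path, fs::Permissions::from_mode(0o444))
}

/// Fallback for other platforms: only the portable read-only flag.
#[cfg(not(unix))]
pub fn make_readonly(path: &Path) -> io::Result<()> {
    let mut perms = fs::metadata(path)?.permissions();
    perms.set_readonly(true);
    fs::set_permissions(path, perms)
}
```

Callers use `make_readonly` unconditionally; the compiler only ever sees the variant for the target platform.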

Reviewed By: ikostia

Differential Revision: D13855560

fbshipit-source-id: ac417e6bd8fb272106fe8f3b9a8b7db57214ad88
2019-01-29 02:41:38 -08:00
Xavier Deguillard
5485ecc185 revisionstore: proper permissions for pack files
Summary:
The tempfile rust crate opens the file with RW permissions for the user only,
but once written out to disk, the file needs to be readable by everyone.
Unfortunately, rust doesn't have a portable way of doing this, so we have to
resort to using `if cfg!(unix)` conditions for doing this.

Reviewed By: DurhamG

Differential Revision: D13703406

fbshipit-source-id: 688bc679b5c1a7943ceab723c1f649d555b61a7a
2019-01-25 09:42:39 -08:00
Xavier Deguillard
da0999c2f8 revisionstore: move mutable packs close logic to a MutablePack trait
Summary:
This allows de-duplicating the logic for setting proper permissions on the
files. Most of the changes are code movement and rustfmt formatting.

Reviewed By: DurhamG

Differential Revision: D13703392

fbshipit-source-id: 28be85ef2d4b440202cf4885e50e62ac3c41f774
2019-01-25 09:42:39 -08:00
Arun Kulshreshtha
c7b9d822a4 revisionstore: use Vec<u8> instead of boxed slice for key names
Summary: Boxed slices are difficult to use in practice, so use `Vec<u8>` instead. (No need for `Bytes` here since there is no reference counting required.)

Reviewed By: DurhamG

Differential Revision: D13770055

fbshipit-source-id: 78f48ac32a4da9c105bf05eb44889c1f492721a8
2019-01-22 16:02:13 -08:00
Arun Kulshreshtha
a642954e27 revisionstore: use Bytes instead of Rc<Box<[u8]>> in loosefiles module
Summary: Use `Bytes` instead of `Rc<Box<[u8]>>` since the former is a nicer type to represent a reference counted heap allocated byte buffer. (Note that `Rc<Box<[u8]>>` should have originally been `Rc<[u8]>` -- the former introduces an unnecessary allocation and layer of indirection.)

Differential Revision: D13769306

fbshipit-source-id: 5f3e788426e28c7e9ccc478f993c717b23663f56
2019-01-22 14:03:17 -08:00
Arun Kulshreshtha
d3839ffb07 revisionstore: use Bytes instead of Box<[u8]> in Delta and DataEntry
Summary: Boxed byte slices (e.g., `Box<[u8]>`, `Rc<[u8]>`) are not very ergonomic to use and are somewhat unusual in Rust code. Use the more common and easier to use `Bytes` type instead. Since this type supports shallow, reference-counted copies, there shouldn't be any new O(n) copying behavior compared to `Rc<[u8]>`.

Reviewed By: markbt

Differential Revision: D13754730

fbshipit-source-id: d5fbc8e39c84c56d30174f4bb194ee21a14bf944
2019-01-22 14:03:17 -08:00
Arun Kulshreshtha
96fee34104 revisionstore: migrate to rust 2018
Summary: Migrate crate to Rust 2018 by running `cargo fix --edition --edition-idioms`, removing `extern crate` declarations, and fixing all new warnings.

Reviewed By: singhsrb

Differential Revision: D13754392

fbshipit-source-id: 3343a07e7d8b332e15475084a8a8ddff06f6d13b
2019-01-21 18:00:57 -08:00
Arun Kulshreshtha
aefe1ba8f8 revisionstore: regroup imports
Summary:
Previously, `use` statements were inconsistently and arbitrarily grouped. This diff groups them in the following order:

- 3rd party crates from crates.io
- local crates
- std library imports (collapsed into a single multiline `use` statement)
- modules within current crate

This new ordering ensures that upon migration to Rust 2018, all imports from within the current crate will be grouped together with the `crate::` prefix.

Reviewed By: singhsrb

Differential Revision: D13754393

fbshipit-source-id: e774c09e0547066afa5f797c1a9c2e5ec4190834
2019-01-21 18:00:57 -08:00
Arun Kulshreshtha
37a74966a2 revisionstore: rustfmt
Summary: Run the latest version of rustfmt over the code to ensure consistent style.

Reviewed By: singhsrb

Differential Revision: D13754394

fbshipit-source-id: 6cf5937bcb642530bdf41aaf83399366a9ba3c9a
2019-01-21 18:00:57 -08:00
Arun Kulshreshtha
bfe737d1fb revisionstore: fix dead code warnings
Summary: There were some warnings about unused private fields in various structs in this crate. Add `#[allow(dead_code)]` as needed to suppress these warnings.

Reviewed By: singhsrb

Differential Revision: D13754234

fbshipit-source-id: ca95a2afbfc67ddb66e7c7436c81cde0fa59f06c
2019-01-21 18:00:57 -08:00
Mark Thomas
a1a2eafd95 revisionstore: use Fallible
Summary:
Use the `Fallible` type alias provided by `failure` rather than defining our
own.

Differential Revision: D13732298

fbshipit-source-id: 2577bc4c34da5b7a88ae2703f9b898bc2a83b816
2019-01-21 03:37:19 -08:00
Xavier Deguillard
33688947c6 revisionstore: sort pack files in list_packs
Summary:
Directory listing is different in every OS, and due to the current repack
implementation, this directly affects the order in which the packfiles are added
to the new one. Since the resulting packfile name depends on the hash of its
content, the name was influenced by the directory order.

By sorting the files in list_packs, the packfile name will be independent of
the directory listing and thus be the same for all the OSes.
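
A sketch of a sorted listing using `std::fs::read_dir`. The signature shown here (directory plus extension) is an assumption and may not match the real `list_packs`.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Lists the files in `dir` that have the given extension, sorted by path
/// so the result does not depend on the OS directory order.
pub fn list_packs(dir: &Path, extension: &str) -> io::Result<Vec<PathBuf>> {
    let mut packs = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.extension().map_or(false, |ext| ext == extension) {
            packs.push(path);
        }
    }
    // read_dir makes no ordering guarantee, so sort explicitly.
    packs.sort();
    Ok(packs)
}
```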

Reviewed By: singhsrb

Differential Revision: D13700935

fbshipit-source-id: 01e055a0c1bcf7fb2dc4faf614dfb20cd4499017
2019-01-16 15:18:24 -08:00
Xavier Deguillard
87cf0f533b revisionstore: Add a basic rust incremental repack.
Summary: For now, combine all files smaller than 100MB that accumulate to less than 4GB.
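
One way to read that selection rule, as a sketch. It assumes binary units for the two limits, takes packs in the order given, and stops at the first pack that would reach the total limit; none of these details are confirmed by the summary. `select_packs` is a hypothetical name.

```rust
/// Assumed limits (binary units are a guess).
pub const MAX_PACK_SIZE: u64 = 100 * 1024 * 1024;
pub const MAX_TOTAL_SIZE: u64 = 4 * 1024 * 1024 * 1024;

/// Picks the packs to combine: each must be smaller than `max_pack`, and
/// the cumulative size of the selection must stay below `max_total`.
/// `packs` holds (name, size in bytes) pairs.
pub fn select_packs(packs: &[(String, u64)], max_pack: u64, max_total: u64) -> Vec<String> {
    let mut total = 0u64;
    let mut selected = Vec::new();
    for (name, size) in packs {
        if *size >= max_pack {
            // Large packs are left alone.
            continue;
        }
        if total + *size >= max_total {
            // Assumption: stop once the budget would be reached.
            break;
        }
        total += *size;
        selected.push(name.clone());
    }
    selected
}
```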

Reviewed By: DurhamG

Differential Revision: D13603760

fbshipit-source-id: 3fa74f1ced3d3ccd463af8f187ef5e0254e1820b
2019-01-16 09:47:09 -08:00
Xavier Deguillard
2525a6e9ee revisionstore: Use PackWriter to write to {data,history}packs.
Summary: Use the newly introduced PackWriter to write the {data,history}packs.

Reviewed By: markbt

Differential Revision: D13603759

fbshipit-source-id: 528a6af7c4ac3321aeec0559805de12114224cfd
2019-01-16 09:47:09 -08:00
Xavier Deguillard
e6a60b68f3 revisionstore: Add an efficient pack writer.
Summary:
The packfiles are currently being written via an unbuffered file. This is
inefficient as every write to the file results in a write(2) syscall.
By buffering these writes we can reduce the number of syscalls and thus
increase the throughput of pack writing operations.
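
The effect of buffering can be illustrated with `std::io::BufWriter`. This is not the pack writer from this diff; `CountingWriter` is a hypothetical stand-in for a file in which each call to `write` would correspond to one write(2) syscall.

```rust
use std::io::{self, BufWriter, Write};

/// Counts how often the underlying writer is called and keeps the bytes.
pub struct CountingWriter {
    pub writes: usize,
    pub data: Vec<u8>,
}

impl Write for CountingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes += 1;
        self.data.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes `entries` 4-byte records through a BufWriter and returns the
/// inner writer so the number of underlying writes can be inspected.
pub fn write_buffered(entries: usize) -> io::Result<CountingWriter> {
    let inner = CountingWriter { writes: 0, data: Vec::new() };
    let mut writer = BufWriter::new(inner);
    for i in 0..entries {
        writer.write_all(&(i as u32).to_be_bytes())?;
    }
    // into_inner flushes whatever is still buffered.
    writer.into_inner().map_err(|err| err.into_error())
}
```

With many small records, the number of calls reaching the underlying writer is far lower than the number of `write_all` calls.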

Reviewed By: markbt

Differential Revision: D13603758

fbshipit-source-id: 649186a852d427a1473695b1d32cc9cd87a74a75
2019-01-16 09:47:09 -08:00
Arun Kulshreshtha
28e20c5997 Reexport public types from public submodules
Summary:
The revisionstore crate currently consists of several public submodules,
each exposing several public types. The APIs exposed by each of the modules
require using types from the other modules. As such, users of this crate are
forced to have complex nested imports to use any of its functionality.

This diff helps ease this problem by reexporting the public types exposed from
each of the public submodules at the top level, thereby allowing crate users to
`use` all of the required types without needing nested imports.

Reviewed By: singhsrb

Differential Revision: D13686913

fbshipit-source-id: 9fb3cce8783787aa5f3f974c7168afada5952712
2019-01-15 21:20:03 -08:00
Xavier Deguillard
e6135fa88e revisionstore: Use get_missing instead of get_delta in repack.
Summary:
The latter tries to read from the disk, while the former is purely in memory and
thus more efficient.

Reviewed By: DurhamG, markbt

Differential Revision: D13603757

fbshipit-source-id: 5fd120ba4065d6a65cb2982db9ab81db3ea26524
2019-01-15 17:02:38 -08:00
Xavier Deguillard
f170cceea2 revisionstore: Repackable::delete now takes the ownership of self.
Summary:
On some platforms, removing a file can fail if it's still mapped or opened. In
mercurial, this can happen during repack as the datapacks are removed while
still being mapped.
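
Taking `self` by value lets the method release the handle before the file is removed. A sketch with a hypothetical `Pack` type that holds a plain `File` where the real packs hold a memory map:

```rust
use std::fs::{self, File};
use std::io;
use std::path::PathBuf;

/// Hypothetical pack; `file` stands in for the memory-mapped data.
pub struct Pack {
    path: PathBuf,
    file: File,
}

impl Pack {
    pub fn open(path: PathBuf) -> io::Result<Self> {
        let file = File::open(&path)?;
        Ok(Pack { path, file })
    }

    /// Size of the underlying file in bytes.
    pub fn size(&self) -> io::Result<u64> {
        Ok(self.file.metadata()?.len())
    }

    /// Consumes the pack so that the handle is closed before removal and
    /// nobody can use the pack afterwards.
    pub fn delete(self) -> io::Result<()> {
        let Pack { path, file } = self;
        drop(file);
        fs::remove_file(&path)
    }
}
```

Because `delete` consumes the value, using the pack after deletion is a compile-time error rather than a runtime one.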

Reviewed By: DurhamG

Differential Revision: D13615938

fbshipit-source-id: fdc1ff9370e2767e52ee1828552f4598105f784f
2019-01-14 21:14:13 -08:00
Xavier Deguillard
da3dd2319f revisionstore: remove repacked pack files
Summary:
After repacking the data/history packs, we need to clean up the
repacked files. This was an omission from D13363853.

Reviewed By: markbt

Differential Revision: D13577592

fbshipit-source-id: 36e7d5b8e86affe47cdd10d33a769969f02b8a62
2019-01-11 16:54:15 -08:00
Xavier Deguillard
ce16778656 remotefilelog: set proper file permissions on closed mutable packs.
Summary:
The python version of the mutable packs sets the permissions to read-only after
writing them, while the rust version keeps them writeable. Let's make the rust
one more consistent.

Reviewed By: markbt

Differential Revision: D13573572

fbshipit-source-id: 61256994562aa09058a88a7935c16dfd7ddf9d18
2019-01-11 16:54:15 -08:00
Xavier Deguillard
79164e920c revisionstore: replace rand::chacha with rand_chacha
Summary: The former is deprecated and thus compiling revisionstore shows many warnings.

Reviewed By: markbt

Differential Revision: D13379278

fbshipit-source-id: d4b4662a1ad00997de4c46274deaf22f48487328
2018-12-17 12:07:22 -08:00
Xavier Deguillard
5307fd8867 revisionstore: implement basic repack in rust
Summary:
The future of mercurial is rust, and one of the missing pieces is repacking of data/history packs. For now, let's implement a very basic packing strategy that just pulls all the packs into one, with one small optimization that puts all the delta chains close together in the output file.

At first, it's expected that this code will be driven by the existing python code, but more and more will be done in rust as time goes on.

Reviewed By: DurhamG

Differential Revision: D13363853

fbshipit-source-id: ad1ac2039e1732f7141d99abf7f01804a9bde097
2018-12-12 12:44:03 -08:00
Jun Wu
61f0a3da45 tests: add a test-check test that runs fix-code.py
Summary:
Add "--dry-run" for fix-code.py and use it in test-check.
This avoids license header and version = "*" issues.

Reviewed By: ikostia

Differential Revision: D10213070

fbshipit-source-id: 9fdd49ead3dfcecf292d5f42c028f20e5dde65d3
2018-11-15 18:54:06 -08:00
Jun Wu
616306543b codemod: use explicit versions in Cargo.toml
Summary:
This is done by running `fix-code.py`. Note that those strings are
semvers so they do not pin down the exact version. An API-compatible upgrade
is still possible.

Reviewed By: ikostia

Differential Revision: D10213073

fbshipit-source-id: 82f90766fb7e02cdeb6615ae3cb7212d928ed48d
2018-11-15 18:54:06 -08:00
Wez Furlong
caad413499 load blobs using hg's rust config and datapack code
Summary:
This diff implements getBlob on top of the mercurial rust
datapack code.  It adds a C++ binding on top of the rust code to
make it easier to use and hooks it up in the hg backing store.

Need to figure this out for our opensource and windows builds:

* Need to teach them how to build and link the rust code
* Need to add a Windows version of the methods that accept paths;
  this is just a matter of adding a WCHAR version of the functions.

Reviewed By: strager

Differential Revision: D10433450

fbshipit-source-id: 45ce34fb9c383ea6018a0ca858581e0fe11ef3b5
2018-10-31 17:58:17 -07:00
Mark Thomas
8c076978ff revisionstore: handle truncated packfiles better
Summary:
If the rust pack stores are used to access truncated pack files, currently they
panic.  Instead, return a proper error showing what's wrong.
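
The general pattern is to replace panicking slice indexing with a checked lookup. A sketch, assuming a big-endian `u32` field and a hypothetical `TruncatedError` type; the real error type and pack layout are not shown here.

```rust
use std::fmt;

/// Hypothetical error describing a read past the end of the file.
#[derive(Debug, PartialEq)]
pub struct TruncatedError {
    pub offset: usize,
    pub wanted: usize,
    pub available: usize,
}

impl fmt::Display for TruncatedError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "pack file truncated: wanted {} bytes at offset {}, file has {} bytes",
            self.wanted, self.offset, self.available
        )
    }
}

impl std::error::Error for TruncatedError {}

/// Reads a big-endian u32 at `offset`. `buf[offset..offset + 4]` would
/// panic on a short buffer; `get` returns None instead.
pub fn read_u32(buf: &[u8], offset: usize) -> Result<u32, TruncatedError> {
    let bytes = offset
        .checked_add(4)
        .and_then(|end| buf.get(offset..end))
        .ok_or(TruncatedError { offset, wanted: 4, available: buf.len() })?;
    let mut raw = [0u8; 4];
    raw.copy_from_slice(bytes);
    Ok(u32::from_be_bytes(raw))
}
```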

Reviewed By: quark-zju

Differential Revision: D10868299

fbshipit-source-id: 57fe5ec1ee4ee2a7bb10d2d5c5ca7082dc34125d
2018-10-27 08:58:24 -07:00
Jun Wu
3adc813687 codemod: add copyright headers
Summary: This is just the result of running `./contrib/fix-code.py $(hg files .)`

Reviewed By: ikostia

Differential Revision: D10213075

fbshipit-source-id: 88577c9b9588a5b44fcf1fe6f0082815dfeb363a
2018-10-26 15:09:12 -07:00
Durham Goode
79a60403f7 histpack: sort history entries before writing them
Summary:
The histpack format requires that entries in each file section be
written in topological order, so that future readers can compute ancestors by
just linearly scanning. Let's make the rust mutable history pack support this.

Technically the rust historypack reader does not require this for now, but the python
one does, so we need to enforce it.

Reviewed By: kulshrax

Differential Revision: D10441286

fbshipit-source-id: dfdb57182909270b760bd79a100873aa3903a2a5
2018-10-23 17:16:01 -07:00
Durham Goode
3f06e4734e histpack: fix exponential time bug in rust history pack
Summary:
During an ancestor traversal, we were adding items to the queue if they
hadn't been processed yet. In a highly merge-y history this could result in adding
an exponential number of items to the queue since we aren't preventing items
from being added until they are actually consumed.

The fix is to just add the items to the seen set as we add them to the queue.
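
A sketch of the fixed traversal, assuming history is represented as a map from a numeric node id to its parent ids (the real code works on history pack entries). The key point is that a node is marked as seen when it is enqueued, not when it is dequeued.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Walks the ancestors of `start` breadth-first and returns each reachable
/// node exactly once. `parents` is a hypothetical node -> parents map.
pub fn ancestors(parents: &HashMap<u32, Vec<u32>>, start: u32) -> Vec<u32> {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();
    let mut result = Vec::new();

    seen.insert(start);
    queue.push_back(start);
    while let Some(node) = queue.pop_front() {
        result.push(node);
        if let Some(node_parents) = parents.get(&node) {
            for &parent in node_parents {
                // `insert` returns false if the node was already seen, so
                // each node enters the queue at most once.
                if seen.insert(parent) {
                    queue.push_back(parent);
                }
            }
        }
    }
    result
}
```

If the seen set were only updated on dequeue, a node with many merge descendants could be queued once per path leading to it.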

Reviewed By: quark-zju

Differential Revision: D10434655

fbshipit-source-id: 430b51adb2d24a99d8c780031f3dbf22c56b9347
2018-10-17 15:00:21 -07:00
Jun Wu
7752e9e81f rustlib: move Node to a separate "types" crate
Summary:
The `Node` type will be used in multiple places. Let's move it to a standalone
crate so new libraries depending on it won't need to pull in all of
revisionstore's dependencies.

Note: I'd also like the `types` crate to only define clean types. Given the
fact NULL_ID is not a great design in Mercurial (`Option<Node>` is a better
choice in Rust), it probably does not belong to the formal Rust `Node` type.
This diff is merely about moving things with minimal changes. NULL_ID will
be decoupled from `Node` in a follow-up.

Reviewed By: markbt

Differential Revision: D10132047

fbshipit-source-id: 5d05c5e0ac06a2d58556c4db11775503f9495626
2018-10-03 18:19:27 -07:00
Harvey Hunt
c507d4e818 Implement Display trait for revisionstore Node
Summary:
Copy functions from Mononoke to implement the Display trait
for a Node.
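
A sketch of a `Display` implementation that renders a 20-byte node as 40 lowercase hex characters. The struct layout and the `from_bytes` constructor are assumptions, and this is not the code copied from Mononoke.

```rust
use std::fmt;

/// Hypothetical 20-byte node hash.
pub struct Node([u8; 20]);

impl Node {
    pub fn from_bytes(bytes: [u8; 20]) -> Self {
        Node(bytes)
    }
}

impl fmt::Display for Node {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Two hex digits per byte, zero-padded.
        for byte in self.0.iter() {
            write!(f, "{:02x}", byte)?;
        }
        Ok(())
    }
}
```

Implementing `Display` also provides `to_string()` and lets a node be used directly in `format!` and `println!`.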

Reviewed By: quark-zju

Differential Revision: D9768566

fbshipit-source-id: 6961026a9e4cdaf4a0f2592dc9284abebadb0aa3
2018-09-20 05:05:08 -07:00
Durham Goode
a0d0a75c44 revisionstore: add unit tests for ancestor logic
Summary:
This was meant to be in a prior diff but was forgotten. This also
exposes an issue where we aren't producing ancestors in topological order.

Reviewed By: quark-zju

Differential Revision: D9380009

fbshipit-source-id: 6a49f0f31c3e107353f9192ca15cda0b1b9c3693
2018-08-17 12:49:57 -07:00
Durham Goode
6dfc0351f4 revisionstore: don't allow loading non-v1 historypacks
Summary:
v0 history packs require more complicated and slow logic for looking up
a node.  Instead of complicating our rust implementation, let's just not support
v0.

Reviewed By: quark-zju

Differential Revision: D9373395

fbshipit-source-id: 6d28a3684966b55a617619e3cae765b2944919a0
2018-08-17 09:39:36 -07:00
Durham Goode
58b15fd23c revisionstore: make get_ancestors return an error if it can't find the key
Summary:
When calling get_ancestors with 'partial' enabled, we want to return a
key error if the first key can't be found, but not if later keys can't be found.
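
A sketch of that behaviour over a hypothetical node -> parents map. The real function works on history store entries and returns richer data; `KeyError` and `get_ancestors_partial` are illustrative names only.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical error for a key missing from the store.
#[derive(Debug, PartialEq)]
pub struct KeyError(pub u32);

/// Returns the ancestors of `start` that are present in `store`. A missing
/// `start` is an error; ancestors missing further down are skipped.
pub fn get_ancestors_partial(
    store: &HashMap<u32, Vec<u32>>,
    start: u32,
) -> Result<Vec<u32>, KeyError> {
    if !store.contains_key(&start) {
        return Err(KeyError(start));
    }

    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();
    let mut found = Vec::new();
    seen.insert(start);
    queue.push_back(start);

    while let Some(node) = queue.pop_front() {
        // Later keys that the store does not know about are not an error.
        let parents = match store.get(&node) {
            Some(parents) => parents,
            None => continue,
        };
        found.push(node);
        for &parent in parents {
            if seen.insert(parent) {
                queue.push_back(parent);
            }
        }
    }
    Ok(found)
}
```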

Reviewed By: singhsrb

Differential Revision: D9367477

fbshipit-source-id: 0e9ad7ea82f83db7326392accab96bd31318f28e
2018-08-17 09:39:36 -07:00