Commit Graph

51 Commits

Author SHA1 Message Date
Katherine McKinley
5c1e5825b0 Convert NodeHash to ChangesetId in BlobChangeset
Summary: Change BlobChangeset and callers to use ChangesetId instead of NodeId

Reviewed By: lukaspiatkowski

Differential Revision: D6835450

fbshipit-source-id: 7b20359837632aef4803e40965380c38f54c9d0a
2018-01-31 13:36:45 -08:00
Xiaotian Wu
2c4d93ceb8 modify entryid related code
Summary: modify entryid code

Reviewed By: lukaspiatkowski

Differential Revision: D6722202

fbshipit-source-id: 1df45071709f4a425374a87a29553830071b5d2d
2018-01-16 07:57:25 -08:00
Xiaotian Wu
92c0528b1f modify manifestid related code
Summary: nodehash -> manifestid

Reviewed By: lukaspiatkowski

Differential Revision: D6719378

fbshipit-source-id: 1ec59b33270e389da8e74b3864c37a86c9d89f81
2018-01-16 07:57:25 -08:00
Stanislau Hlebik
5c43ec42c7 mononoke: add --inmemory-logs-capacity option
Summary:
Adds an option that sets the number of filelogs and revlogs that will be loaded
in memory. That lets us use blobimporting in memory-constrained
environments.

Reviewed By: jsgf

Differential Revision: D6532734

fbshipit-source-id: b748478ec80e75f56a8e07ae1532b0d69c4a5e16
2017-12-13 06:58:24 -08:00
Simon Farnsworth
4d6f3fa51b Remove generics from Linknode
Summary:
Like the other BlobState components, Linknode was too generic -
reduce down to a practical set for live implementations.

Error handling is not great here or in Bookmarks, but I'm going to await the
decision on moving to Failure before I improve it.

Reviewed By: jsgf

Differential Revision: D6459012

fbshipit-source-id: 00314309f62ba070b5908a28f5174a31b6dd0d84
2017-12-11 07:05:53 -08:00
Simon Farnsworth
28caf3f938 Remove Error, GetBlob and PutBlob from BlobStore
Summary:
Remove the last associated types from BlobStore - this means that
BlobStore now has an associated trait object type.

Reviewed By: jsgf

Differential Revision: D6425414

fbshipit-source-id: 7186dab9b56593dd1d70be732d4ad56d1e7b3c63
2017-12-11 07:05:53 -08:00
Jeremy Fitzhardinge
ac31713c84 rust: failure cleanup pass
Summary:
Don't use failure's bail!() and ensure!() macros.

Instead, failure_ext provides:
- bail_err!(err) - Converts its single parameter to the expected error and returns; ie `return Err(From::from(err));`
- bail_msg!(fmt, ...) - takes format string parameters and returns a `failure::err_msg()` error
- ensure_err!(), ensure_msg!() - corresponding changes

Also:
- remove all stray references to error-chain
- remove direct references to failure_derive (it's reexported via failure and failure_ext)
- replace uses of `Err(foo)?;` with `bail_err!()` (since `bail_err` unconditionally returns, but `Err(x)?` does not in principle, which can affect type inference)
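
To illustrate the `bail_err!` behaviour described above, here is a minimal std-only sketch. The macro body follows the expansion quoted in the summary (`return Err(From::from(err));`); the `ParseError` and `AppError` types and the `check_len` function are hypothetical names invented for this example and are not part of failure_ext.

```rust
// Hypothetical low-level error, used only for this illustration.
#[derive(Debug, PartialEq)]
struct ParseError(String);

// Hypothetical error type that the function is expected to return.
#[derive(Debug, PartialEq)]
enum AppError {
    Parse(String),
}

impl From<ParseError> for AppError {
    fn from(e: ParseError) -> Self {
        AppError::Parse(e.0)
    }
}

// Sketch of a bail_err!-style macro: convert the argument into the
// function's error type and return unconditionally.
macro_rules! bail_err {
    ($e:expr) => {
        return Err(From::from($e))
    };
}

fn check_len(s: &str) -> Result<usize, AppError> {
    if s.is_empty() {
        // Always returns, so the compiler never has to infer a type
        // for a "success" path, unlike `Err(x)?`.
        bail_err!(ParseError("empty input".to_string()));
    }
    Ok(s.len())
}
```

Because the macro expands to a plain `return`, the target error type is fixed by the enclosing function's signature.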

Reviewed By: kulshrax

Differential Revision: D6507717

fbshipit-source-id: 635fb6f8c96d185b195dff171ea9c8db9e83af10
2017-12-07 14:10:17 -08:00
Stanislau Hlebik
2bf9290758 mononoke: add skip parameter to blobimport
Summary:
Make it possible to skip a number of commits.
Also change type from usize to u64, to make sure it works the same on 32-bit
platforms (although that shouldn't matter much).

Reviewed By: jsgf

Differential Revision: D6395743

fbshipit-source-id: 88a12583de2b23d4f55115d696c5398f6814c2da
2017-12-06 01:21:11 -08:00
Jeremy Fitzhardinge
dc5e78c1c1 rust: mass convert scm/mononoke/... to use failure
Summary:
Convert scm/mononoke to use failure, and update common/rust crates it depends on as well.

What it looks like is a lot of deleted code...

General strategy:
- common/rust/failure_ext adds some things that are in git failure that aren't yet in crates.io (`bail!` and `ensure!`, `Result<T, Error>`)
- everything returns `Result<T, failure::Error>`
- crates with real error get an error type, with a derived Fail implementation
  - replicate error-chain by defining an `enum ErrorKind` where the fields match the declared errors in the error! macro
- crates with dummy error-chain (no local errors) lose it
- `.chain_err()` -> `.context()` or `.with_context()`

So far the only place I've needed to extract an error is in a unit test.
Having a single unified error type has simplified a lot of things, and removed a lot of error type parameters, error conversion, etc, etc.

Reviewed By: sid0

Differential Revision: D6446584

fbshipit-source-id: 744640ca2997d4a85513c4519017f2e2e78a73f5
2017-12-05 18:11:13 -08:00
Simon Farnsworth
16da012250 Remove ValueIn/ValueOut from the BlobStore generic arguments.
Summary:
BlobStore is entirely generic, and puts no limits on its
implementations. Remove ValueIn and ValueOut type parameters, and insist that
all blobs are Bytes (as per production setups)
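
As a rough picture of what a non-generic blob store looks like once keys are `String` and values are bytes, consider the sketch below. It is synchronous and uses `Vec<u8>` as a stand-in for `Bytes`; the trait name `Blobstore`, the `MemBlobstore` type and the `roundtrip` helper are assumptions for illustration, not the actual Mononoke definitions.

```rust
use std::collections::HashMap;

// Hypothetical trait with no type parameters: keys are strings,
// values are raw bytes (Vec<u8> stands in for Bytes here).
trait Blobstore {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
    fn put(&mut self, key: String, value: Vec<u8>);
}

// Simple in-memory implementation, for illustration only.
#[derive(Default)]
struct MemBlobstore {
    blobs: HashMap<String, Vec<u8>>,
}

impl Blobstore for MemBlobstore {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.blobs.get(key).cloned()
    }

    fn put(&mut self, key: String, value: Vec<u8>) {
        self.blobs.insert(key, value);
    }
}

// With no generics left, callers can take a plain trait object.
fn roundtrip(store: &mut dyn Blobstore, key: &str, value: &[u8]) -> Option<Vec<u8>> {
    store.put(key.to_string(), value.to_vec());
    store.get(key)
}
```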

Reviewed By: StanislavGlebik

Differential Revision: D6425413

fbshipit-source-id: 455e526d8baebd0d0f1906941648acca89be4881
2017-12-04 10:22:09 -08:00
Simon Farnsworth
c1ee28dbc7 Remove Key from the BlobStore generic arguments.
Summary:
BlobStore is entirely generic, and puts no limits on its
implementations. Remove the "Key" type parameter, and insist that all keys are
String (as per production setups)

Reviewed By: StanislavGlebik

Differential Revision: D6425412

fbshipit-source-id: 1f1229bf8e001bf780964e883c6beb071e9ef1d8
2017-12-04 10:22:09 -08:00
Stanislau Hlebik
074fdbdc83 mononoke: make OUTPUT parameter optional in blobimport
Summary:
In some cases the output path is not necessary at all - for example, if we put
blobs into the remote storage and we don't care about heads.

Let's make OUTPUT parameter optional for these cases.

Reviewed By: jsgf

Differential Revision: D6397168

fbshipit-source-id: 06ee3b2bba038ff5076040a01c9d73c2b6e2b5fc
2017-12-03 08:37:01 -08:00
Simon Farnsworth
16615b3749 Make Heads a parameterless trait
Summary:
As part of removing excess genericism, make Heads a trait with no
associated types or type parameters.

Reviewed By: StanislavGlebik

Differential Revision: D6352727

fbshipit-source-id: df9ef87e0e0abe43c30e7318da38d7f930c37c6e
2017-11-23 07:05:35 -08:00
Siddharth Agarwal
a6c5093cc8 blobimport: write out linknodes
Summary:
This makes it quite easy to write out linknodes.

Also regenerate linknodes for our test fixtures -- the next commit will bring
them in.

Reviewed By: jsgf

Differential Revision: D6214033

fbshipit-source-id: 3b930fe9eda45a1b7bc6f0b3f81dd8af102061fc
2017-11-13 22:01:55 -08:00
Jeremy Fitzhardinge
66a5fa4362 mononoke: mothball hgqlserve
Summary:
It's an interesting prototype, but awkward to keep running and we
only need it for reference.

Reviewed By: StanislavGlebik

Differential Revision: D6306296

fbshipit-source-id: 10b5bf3631debcb9de258d4d68089ff709dc1329
2017-11-13 17:45:49 -08:00
Stanislau Hlebik
83f2eb90a5 mononoke: remove retries
Summary:
Putting retries on this layer is not very good, because it requires every
client to add RetryingBlobstore.

Reviewed By: kulshrax

Differential Revision: D6298254

fbshipit-source-id: dbdce7fe141f9e1511322e74a1258d3819a68eb5
2017-11-13 03:21:31 -08:00
Stanislau Hlebik
ab834dd388 blobimport: add more stats
Reviewed By: lukaspiatkowski

Differential Revision: D6207104

fbshipit-source-id: c124a39618a95bbb42cff22ce9120e517c26489f
2017-11-07 10:43:50 -08:00
Siddharth Agarwal
acda30f2fb replace AsRef<Path> with Into<PathBuf> in a few spots
Summary:
We need ownership of the buffer in all of these cases, and
`AsRef<Path>` could potentially create unnecessary copies.
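
The difference can be sketched as follows. The `Store` type and its two constructors are hypothetical; they only contrast a signature that must copy the path with one that can take ownership of a `PathBuf` the caller already has.

```rust
use std::path::{Path, PathBuf};

// Hypothetical type that needs to own its root path.
struct Store {
    root: PathBuf,
}

impl Store {
    // With AsRef<Path> we only get a borrowed view, so storing the
    // path always requires a fresh allocation and copy.
    fn from_ref<P: AsRef<Path>>(root: P) -> Self {
        Store {
            root: root.as_ref().to_path_buf(),
        }
    }

    // With Into<PathBuf> an owned PathBuf is moved in without copying;
    // borrowed inputs such as &str are still accepted and converted.
    fn from_owned<P: Into<PathBuf>>(root: P) -> Self {
        Store { root: root.into() }
    }
}
```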

Reviewed By: jsgf

Differential Revision: D6214034

fbshipit-source-id: 806a87bfe3b125febaaaaf26c8b8dcac407de145
2017-11-02 13:10:56 -07:00
Stanislau Hlebik
e64ef44fda mononoke: add an option to limit blob size in blobimport
Summary:
This option can be used with non-production-ready blobstores that can't yet
handle big blobs.

Reviewed By: farnz

Differential Revision: D6189922

fbshipit-source-id: fa4df5b49c6d1126d3b3114e9ebe376931947917
2017-11-01 13:01:58 -07:00
Stanislau Hlebik
8cc33676aa mononoke: add --commits-limit option to blobimport
Summary:
It's quite a useful option for testing, and I had to reimplement it a
couple of times. It's time to land it.

Reviewed By: farnz

Differential Revision: D6172230

fbshipit-source-id: ec1b7c0453a3a612a173aec87978a4917568cd7b
2017-11-01 13:01:57 -07:00
Lukas Piatkowski
9712530c0e blobstore: add RetryingBlobstore that retries failed put/get operations with delay
Reviewed By: StanislavGlebik

Differential Revision: D6203017

fbshipit-source-id: 277fa267e86d2cb5eede241bf80dd8d1c90a3b96
2017-10-31 20:53:08 -07:00
Siddharth Agarwal
c8d6e7f954 use RepoPath instead of MPath in a few places
Summary:
`RepoPath` represents any absolute path -- root, directory or file. There's a
lot of code that manually switches between directory and file entries --
abstract all of that away.

Reviewed By: farnz

Differential Revision: D6201383

fbshipit-source-id: 0047023a67a5484ddbdd00bb57bca3bfb7d4dd3f
2017-10-31 14:26:39 -07:00
Siddharth Agarwal
3269b802a4 factor out code to put manifest and file blobs
Summary:
copy_changeset was getting a bit too long, and linknode stuff would
have made it even longer.

Reviewed By: StanislavGlebik

Differential Revision: D6097840

fbshipit-source-id: 00800cf9516adf69f2ca19244d3e14268f148ae4
2017-10-19 09:39:17 -07:00
Siddharth Agarwal
92eb9c6e6a blobimport: use (effectively) named params for conversion
Summary:
Going to add more params here, and this is becoming quite hard to
read.

Reviewed By: StanislavGlebik

Differential Revision: D6096419

fbshipit-source-id: 50f0b99bb6b1804fc01f6a99fc0297c1695dbaa5
2017-10-19 09:39:17 -07:00
Siddharth Agarwal
a1c6c494fd blobimport: nicer manifest method names
Summary: Namespacing really helps here!

Reviewed By: StanislavGlebik

Differential Revision: D6095128

fbshipit-source-id: f7e4b6d011c2d55c3976a6ba9a20f17384c62b92
2017-10-19 09:39:17 -07:00
Siddharth Agarwal
675c1e0b56 blobimport: factor out conversion code into separate files
Summary:
I'm adding linknode support to this store, and with that and without
this refactoring the code becomes quite unmanageable.

I've recorded this as copies to preserve blame info for the bits that will
remain untouched. This seems to work pretty well.

Reviewed By: StanislavGlebik

Differential Revision: D6094812

fbshipit-source-id: f7a7a1d3546d4ef2dbfa33a0a8e97d47b44f51a5
2017-10-19 09:39:17 -07:00
Siddharth Agarwal
7205d847f5 blobimport: factor out error definitions into another module
Reviewed By: StanislavGlebik

Differential Revision: D6094813

fbshipit-source-id: fa58b9c69c82c506435d7c7d3b259c8d9cb46545
2017-10-19 09:39:17 -07:00
Siddharth Agarwal
2fbdf6d207 move blobimport code into a subdirectory
Summary: Will factor this out into several files in upcoming patches.

Reviewed By: StanislavGlebik

Differential Revision: D6094811

fbshipit-source-id: cd354888882aff2552e61dea788aeb5426e08f4d
2017-10-19 09:39:17 -07:00
Stanislau Hlebik
553343bc0d mononoke: filter the same manifest entries in blobimporting
Summary:
There is no need to insert the same entries twice. Let's filter them.
Note that while it's possible to have the same manifest entries (for example,
file or dirs with the same content), all changeset entries should be unique,
because each changeset in the repo is unique and is processed exactly once.
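
One simple way to filter repeated entries is to remember the keys already seen. The sketch below assumes entries are `(key, bytes)` pairs and uses a hypothetical `unique_entries` helper; the real blobimport code may filter differently.

```rust
use std::collections::HashSet;

// Keep only the first occurrence of each key, preserving input order.
// The (String, Vec<u8>) entry shape is an assumption for illustration.
fn unique_entries<I>(entries: I) -> Vec<(String, Vec<u8>)>
where
    I: IntoIterator<Item = (String, Vec<u8>)>,
{
    let mut seen = HashSet::new();
    entries
        .into_iter()
        // HashSet::insert returns false if the key was already present.
        .filter(|(key, _)| seen.insert(key.clone()))
        .collect()
}
```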

Reviewed By: farnz

Differential Revision: D6076667

fbshipit-source-id: 64bdf25a21884eb2faf43f32590f7cbb8f8dd300
2017-10-18 01:36:22 -07:00
Stanislau Hlebik
13823f0f7e mononoke: add separate io thread to blobimport
Summary:
Let's move all IO to a separate thread. This helps quite a lot when used with
a slow blobstore, because parser threads are not blocked on IO -
importing upstream mercurial repo went from 20 mins to 9 mins.
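
The general shape of this pattern, with parser threads handing blobs to a single IO thread over a channel, might look like the sketch below. It uses only `std::sync::mpsc` and `std::thread`; the function name, the blob shape and the in-memory "write" are assumptions, and the real tool's threading code may be structured differently.

```rust
use std::sync::mpsc;
use std::thread;

// Parser threads produce (key, bytes) blobs; one IO thread consumes them.
// Here "writing" just collects blobs into a Vec for demonstration.
fn import_with_io_thread(num_parsers: usize, items_per_parser: usize) -> Vec<(String, Vec<u8>)> {
    let (tx, rx) = mpsc::channel::<(String, Vec<u8>)>();

    // Dedicated IO thread: the only place where (slow) writes happen.
    let io = thread::spawn(move || {
        let mut written = Vec::new();
        for blob in rx {
            written.push(blob);
        }
        written
    });

    // Parser threads never block on IO; they only send to the channel.
    let parsers: Vec<_> = (0..num_parsers)
        .map(|p| {
            let tx = tx.clone();
            thread::spawn(move || {
                for i in 0..items_per_parser {
                    let blob = (format!("blob-{}-{}", p, i), vec![p as u8, i as u8]);
                    tx.send(blob).unwrap();
                }
            })
        })
        .collect();

    // Drop the original sender so the IO loop ends once parsers finish.
    drop(tx);
    for handle in parsers {
        handle.join().unwrap();
    }
    io.join().unwrap()
}
```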

Reviewed By: lukaspiatkowski

Differential Revision: D6050992

fbshipit-source-id: c3877b123bad993d819495247135544a141eab10
2017-10-18 01:36:22 -07:00
Arun Kulshreshtha
be57d5440b Change default Manifold bucket to mononoke_prod in blobimport
Summary: Change the default bucket for blobimport to be mononoke_prod, a higher capacity bucket than the previous mononoke bucket. Also make it possible to specify the bucket via the CLI rather than hardcoding it.

Reviewed By: jsgf

Differential Revision: D6073745

fbshipit-source-id: 11dcf0c8bbef0b7c3f5971cf0676cf6325f276a6
2017-10-17 00:36:28 -07:00
Stanislau Hlebik
024650d13d mononoke: fix blobimporting warnings
Reviewed By: kulshrax

Differential Revision: D5995891

fbshipit-source-id: d8e396da8689cc1d07856a21e0f023a83e222877
2017-10-12 04:07:45 -07:00
Lukas Piatkowski
0a76eb6f45 blobimport: add optional thrift service and stats counters
Reviewed By: farnz

Differential Revision: D6030720

fbshipit-source-id: 26971f13061e9a3a1e65d339fc7bdc444b5165fd
2017-10-11 10:58:08 -07:00
Lukas Piatkowski
a3579ee80d blobimport: add more descriptive error messages on manifold failure
Reviewed By: kulshrax

Differential Revision: D6022078

fbshipit-source-id: 4504524cb49dbd9b013ac44229107be5024dfcad
2017-10-11 05:50:50 -07:00
Lukas Piatkowski
565afd9615 blobimport: use drain from slog-glog-fmt for logging
Summary: The glog drain does not swallow, for example, the backtraces of error_chain errors, so the tool is a bit easier to debug.

Reviewed By: farnz

Differential Revision: D6021671

fbshipit-source-id: 32bfe01bfd77d85c37a2a446cb3e5d000763c689
2017-10-11 05:50:46 -07:00
Siddharth Agarwal
db28a15305 add deny(warnings)/allow(deprecated) to a few crates
Summary:
Realized that we were missing a few crates from the Tokio cleanup because those crates
didn't have `#![deny(warnings)]`.

This also caused a bunch of files to be rustfmted, which is fine.

Reviewed By: kulshrax

Differential Revision: D6024628

fbshipit-source-id: 55032d20f3676c92ef124d861e1edcd34126ab55
2017-10-10 15:23:25 -07:00
Stanislau Hlebik
c474b57a9e blobimport: add an option to postpone compaction
Summary: Compaction can slow down blobimporting a lot. Let's add an option to postpone it till the end

Reviewed By: farnz

Differential Revision: D5882003

fbshipit-source-id: 0611a8e94b3d7331bdacf909d820526f547414a0
2017-09-22 08:33:25 -07:00
Siddharth Agarwal
de4b1f6c93 use BlobHash in RawNodeBlob
Summary: Also ensure that `blobimport` doesn't use its own copy.

Reviewed By: jsgf

Differential Revision: D5847604

fbshipit-source-id: 5390848cd5fab8abd967ef9701720491d703c0f1
2017-09-18 00:35:52 -07:00
Jeremy Fitzhardinge
2f72747e85 mononoke: blobimport: more use of impl Future
Summary:
This seemed to run up against a compiler issue
(https://users.rust-lang.org/t/impl-future-lost-fact-that-a-closure-is-fnonce/12870/16)
that can be worked around by adding an apparently unnecessary `.boxed()`.

Reviewed By: sid0

Differential Revision: D5843292

fbshipit-source-id: 0a82760bf6afbf7ba5f04541ca57bedcc935d411
2017-09-15 15:58:33 -07:00
Jeremy Fitzhardinge
8f701222aa mononoke: blobimport: first use of impl Future
Summary: Use `impl Future` rather than a boxed future.

Reviewed By: sid0

Differential Revision: D5829773

fbshipit-source-id: 40c4339e96f7194544f416534952b78a23d93fa6
2017-09-15 15:24:05 -07:00
Jeremy Fitzhardinge
d584bc31b3 mononoke: blobimport: rustfmt
Reviewed By: sid0

Differential Revision: D5831880

fbshipit-source-id: 9d9d7b18214241336e97219c48bf281dd08b5ab9
2017-09-15 15:24:05 -07:00
Arun Kulshreshtha
c9d8af3f1c Add ManifoldBlob support to blobimport
Summary: Add the `--blobstore manifold` option to blobimport to make it write blobs to Manifold.

Reviewed By: jsgf

Differential Revision: D5758930

fbshipit-source-id: a14a3c155b5d8d7b171ed7a4e53f8569539cb2e9
2017-09-05 13:38:47 -07:00
Jeremy Fitzhardinge
0745e97bfe mononoke: don't use : in blob key names
Summary:
`:` is a reserved character for Windows paths, so Mercurial rejects
them from being committed. Use `-` instead, so that we can commit file blob
repo test fixtures.

Reviewed By: kulshrax

Differential Revision: D5731525

fbshipit-source-id: 8d14fc03f1b135cbc4d42aeaf2f3a0ae6d13f956
2017-09-04 15:20:06 -07:00
Siddharth Agarwal
c1a30e25c9 make repo error types require std::error::Error
Summary: This gets us `Display` support as well.

Reviewed By: lukaspiatkowski

Differential Revision: D5734383

fbshipit-source-id: 1485cf80bb310cdd282b4546bed56c60082be8ec
2017-08-31 13:51:24 -07:00
Siddharth Agarwal
8202a344b4 rustfmt mercurial-types and blobimport
Summary: Just a few minor changes that make our lives easier overall.

Reviewed By: lukaspiatkowski

Differential Revision: D5737854

fbshipit-source-id: da951d7872433bffa8fc64d15cd0e917f77144b5
2017-08-31 13:51:24 -07:00
Stanislau Hlebik
dbd845ba0e mononoke blobimport: split work between threads by linkrev
Summary:
We want to avoid putting the same entries twice in the blobstore. Even more, we want to avoid generating the list of these entries at all in the first place.

The first approach was to add a `Mutex<HashSet>` that worker threads would use to filter out entries that were already imported. It turned out that this Mutex kills almost all the speedup from concurrency.
But since we have linkrevs, for each entry we know in which commit it was created [1]. That means that all of the entries are already nicely split between the threads, so no synchronization is needed.

It gives a good speedup - from ~7min to 2min when importing the hg upstream treemanifest repo using the file blobstore.

Note: there is still lock contention - the tree revlogs and file revlogs maps are protected by a mutex. We can optimize it later if needed.

[1] There is a well-known linkrev issue in mercurial. It shouldn't affect our case at all.
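
The idea can be sketched as follows: each worker handles some commits and imports an entry only when the entry's linkrev equals the commit it is processing, so no shared set of "already imported" keys is needed. The `Entry` struct, the `import_by_linkrev` function and the round-robin assignment of commits to workers are assumptions made for this illustration.

```rust
use std::thread;

// Hypothetical entry: a blob key plus the revision that introduced it.
#[derive(Clone, Debug, PartialEq)]
struct Entry {
    key: String,
    linkrev: u64,
}

// `commits` holds (rev, entries reachable from that commit's manifest).
// Returns the sorted keys that were imported.
fn import_by_linkrev(commits: Vec<(u64, Vec<Entry>)>, num_workers: usize) -> Vec<String> {
    let num_workers = num_workers.max(1);

    // Distribute commits between workers (round-robin, as an assumption).
    let mut buckets: Vec<Vec<(u64, Vec<Entry>)>> = vec![Vec::new(); num_workers];
    for (i, commit) in commits.into_iter().enumerate() {
        buckets[i % num_workers].push(commit);
    }

    let handles: Vec<_> = buckets
        .into_iter()
        .map(|bucket| {
            thread::spawn(move || {
                let mut imported = Vec::new();
                for (rev, entries) in bucket {
                    for entry in entries {
                        // Only the commit that introduced the entry imports it,
                        // so no cross-thread deduplication is required.
                        if entry.linkrev == rev {
                            imported.push(entry.key);
                        }
                    }
                }
                imported
            })
        })
        .collect();

    let mut all = Vec::new();
    for handle in handles {
        all.extend(handle.join().unwrap());
    }
    all.sort();
    all
}
```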

Reviewed By: jsgf

Differential Revision: D5650074

fbshipit-source-id: c4f9e2763127ffe4402417dd3963f1f450d7b325
2017-08-23 05:13:50 -07:00
Stanislau Hlebik
747e4a2ed9 mononoke: rustfmt on blobimport
Reviewed By: sid0

Differential Revision: D5649665

fbshipit-source-id: 553ef550e7465ab5f3bb129cf4d0c282128fa24d
2017-08-23 05:13:50 -07:00
Stanislau Hlebik
b72949e3d1 mononoke: blob importing of tree manifest repo
Summary: Main part is `get_stream_of_manifest_entries` that creates a stream of all tree manifest entries by recursively going through all of them.

Reviewed By: jsgf

Differential Revision: D5622490

fbshipit-source-id: 4a8b2707df0300a37931c465bafb1ed54d6d4d25
2017-08-18 09:51:17 -07:00
Stanislau Hlebik
c33e4afddf mononoke: same storage keys for manifest entries as for the file entries
Summary:
A preparation step before blob importing of tree manifest repos to blobrepo.

The `get_parents()` method of BlobEntry reads parents from the blobstore. It works fine for file entries because file entries can store their parents in the blobstore. With tree manifests, BlobEntry can also contain tree manifest entries, which means that the parents of tree manifest entries should also be stored somewhere in the blobstore.

I suggest using the same logic for the tree manifest entries as for the file entries. File and manifest entries have two blobstore entries - one stores the hash of the content and the parents, the other stores the actual content.

To do this I moved `RawNodeBlob` and `get_node()` to the separate module and made fields public.

Reviewed By: jsgf

Differential Revision: D5622342

fbshipit-source-id: c9f0c446107d4697b042544ff8b37a159064f061
2017-08-15 10:37:34 -07:00
Stanislau Hlebik
91d587e052 mononoke: change Path implementation
Summary:
Instead of storing `Vec<u8>`, let's store `Vec<PathComponent>`, where PathComponent is Vec<u8> without b'\'.
To make sure len() is still `O(1)` let's store it too.
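
A minimal sketch of such a representation is shown below. The type names, the constructor, and the choice of `b'/'` as the separator byte are assumptions for illustration only; the point is that the total byte length is computed once and cached so that `len()` stays `O(1)`.

```rust
// One path element; by construction it contains no separator byte.
#[derive(Debug, Clone, PartialEq)]
struct PathComponent(Vec<u8>);

// Hypothetical path type: components plus a cached byte length.
#[derive(Debug, Clone, PartialEq)]
struct MPathSketch {
    components: Vec<PathComponent>,
    byte_len: usize,
}

impl MPathSketch {
    // Split raw bytes on the separator (assumed to be b'/' here),
    // dropping empty components.
    fn new(raw: &[u8]) -> Self {
        let components: Vec<PathComponent> = raw
            .split(|&b| b == b'/')
            .filter(|c| !c.is_empty())
            .map(|c| PathComponent(c.to_vec()))
            .collect();
        // Length of the joined path: component bytes plus separators.
        let byte_len = components.iter().map(|c| c.0.len()).sum::<usize>()
            + components.len().saturating_sub(1);
        MPathSketch { components, byte_len }
    }

    // O(1): returns the cached length instead of re-summing components.
    fn len(&self) -> usize {
        self.byte_len
    }
}
```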

Reviewed By: sid0

Differential Revision: D5573721

fbshipit-source-id: 91967809284d79bf0fcdcabcae9fd787a37c318b
2017-08-10 05:24:42 -07:00