55 Commits

Author SHA1 Message Date
Johan Schuijt-Li
50b7d93014 rustfmt before editing
Summary: Code format.

Reviewed By: StanislavGlebik

Differential Revision: D14131431

fbshipit-source-id: 1b6f89382bbbe8b5bb033e3ecf7620d161d587e8
2019-02-21 06:27:53 -08:00
Lukas Piatkowski
515a2909eb mononoke hashes: remove usages of borrows of hashes which are Copy
Summary: The Copy trait means that something is so cheap to copy that you don't even need to explicitly call `.clone()` on it. Just as it doesn't make much sense to pass &i64, it doesn't make much sense to pass &<Something that is Copy>, so I have removed all the occurrences of passing one of our Copy hashes by reference.
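
As a rough illustration of the change described in this summary, the sketch below passes a small `Copy` hash by value instead of by reference. `NodeHash`, `from_byte`, `first_byte` and `is_null` are hypothetical stand-ins invented for this example, not the types or functions touched by the commit.

```rust
/// Hypothetical 20-byte hash type standing in for the real hash types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NodeHash([u8; 20]);

impl NodeHash {
    /// Build a hash whose 20 bytes all equal `b` (illustration only).
    pub fn from_byte(b: u8) -> Self {
        NodeHash([b; 20])
    }

    /// Takes `self` by value: the type is `Copy`, so the caller keeps its copy.
    pub fn first_byte(self) -> u8 {
        self.0[0]
    }
}

/// Before such a change this might have been `fn is_null(hash: &NodeHash)`.
/// After it, the hash is simply passed by value.
pub fn is_null(hash: NodeHash) -> bool {
    hash.0.iter().all(|&b| b == 0)
}
```

Because `NodeHash` is `Copy`, a caller can pass the same value to `is_null` several times without `.clone()` and without losing ownership.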

Reviewed By: fanzeyi

Differential Revision: D13974622

fbshipit-source-id: 89efc1c1e29269cc2e77dcb124964265c344f519
2019-02-06 15:11:35 -08:00
Stanislau Hlebik
c608944cfd mononoke: remove unused code
Summary:
Removed:

Command-line tools for filenodes and bookmarks. These should be a part of
the mononoke_admin script

Outdated docs folder

Commitsim crate, because it's replaced by real pushrebase

unused hooks_old crate

storage crate which wasn't used

Reviewed By: aslpavel

Differential Revision: D13301035

fbshipit-source-id: 3ae398752218915dc4eb85c11be84e48168677cc
2018-12-05 05:58:07 -08:00
Anastasiya Zhyrkevich
bb31c81995 RevIdx flags ExtStored parsing
Summary:
Mononoke blobimports filenodes from Mercurial host machines to its blobstore.
It parses revlogs to retrieve the flags used for parsing.
Revlogs contain the flag that marks ext-stored files. This flag means that the file is not stored on the Mercurial host machine, but somewhere remote.
This diff provides the API to get the extstored flag from a revlog.

Reviewed By: farnz

Differential Revision: D13081950

fbshipit-source-id: ba5bc04ad3659d4880960995d1cc46594d89e220
2018-11-22 10:33:58 -08:00
Mark Thomas
e0a21d4deb add storerequirements support
Summary:
Add support for the storerequirements feature of Mercurial repositories, which
requires the reader to additionally check the store/requires file for store
requirements.
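
A minimal sketch of the check described above, assuming the requires files hold one requirement name per line and that the store-level file is consulted only when the repo-level file lists `storerequirements`. The function names and the in-memory string inputs are assumptions made for this example; this is not the code from the commit.

```rust
use std::collections::HashSet;

/// Parse the contents of a requires file: one name per line, blanks ignored.
pub fn parse_requires(contents: &str) -> HashSet<String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(String::from)
        .collect()
}

/// Combine repo-level and store-level requirements.
/// `store_requires` (the contents of store/requires, if that file exists) is
/// only consulted when the repo-level file lists `storerequirements`.
pub fn effective_requirements(requires: &str, store_requires: Option<&str>) -> HashSet<String> {
    let mut reqs = parse_requires(requires);
    if reqs.contains("storerequirements") {
        if let Some(store) = store_requires {
            reqs.extend(parse_requires(store));
        }
    }
    reqs
}
```

A reader would then compare the resulting set against the requirements it knows and refuse to open the repository if any name is unknown.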

Reviewed By: StanislavGlebik

Differential Revision: D9850335

fbshipit-source-id: 557ea0f90f3d138d1df56edd94ee23760b9fd849
2018-09-17 04:52:40 -07:00
Stanislau Hlebik
3a9da5606a mononoke: add new requirement
Summary:
Hg has added it; it doesn't impact Mononoke.
Without it Mononoke can't read the config repo

Reviewed By: farnz

Differential Revision: D8595497

fbshipit-source-id: a5d9f1cc9b00dde805c744a7f876dbdd86bd4a55
2018-06-22 14:51:40 -07:00
Rain
3665c48fc7 remove the old blobimport tool
Summary:
The old blobimport tool will not be able to import commits with the new Thrift serialization they'll be switching to.

`blobrepo::utils::RawNodeBlob` is also used by the admin tool, and it will go away once we start using Thrift serialization.

Reviewed By: farnz

Differential Revision: D8372455

fbshipit-source-id: d02a37e33e1ccd4dd1f695e38dbb40851dd51cd6
2018-06-12 15:40:10 -07:00
Lukas Piatkowski
4f44c3f130 mercurial_types: remove D* types and move mercurial types around
Summary:
Now it is as it should be: mercurial_types have the types, mercurial has revlog related structures
burnbridge

Reviewed By: farnz

Differential Revision: D8319906

fbshipit-source-id: 256e73cdd1b1a304c957b812b227abfc142fd725
2018-06-07 13:19:16 -07:00
Lukas Piatkowski
10857e6a37 CODEMOD: rename mercurial::BlobNode to HgBlobNode
Reviewed By: sid0

Differential Revision: D7620628

fbshipit-source-id: 8d616c3b9cd3342f71155f11bcef4feb760ddf0e
2018-04-16 03:40:25 -07:00
Lukas Piatkowski
2e0522f884 CODEMOD: rename mercurial::EntryId to HgEntryId
Reviewed By: sid0

Differential Revision: D7620299

fbshipit-source-id: 9bc1505473567528d1f174ae0f0f1312af187814
2018-04-16 03:40:25 -07:00
Lukas Piatkowski
03255529fa CODEMOD: rename mercurial::NodeHash to HgNodeHash
Reviewed By: sid0

Differential Revision: D7619973

fbshipit-source-id: 229fea891788c33eb1f45446ba2333e945ca5553
2018-04-16 03:40:25 -07:00
Lukas Piatkowski
d0e66cc5f7 mercurial: distinguish between NodeHash used in mercurial Revlogs and in Mononoke Blobstore
Summary:
This codemod tries not to change the existing behavior of system, only introduce new types specific to Mercurial Revlogs.
It introduces a lot of copypasta intentionally and it will be cleaned in following diffs.

Reviewed By: farnz

Differential Revision: D7367191

fbshipit-source-id: 0a915f427dff431065e903b5f6fbd3cba6bc22a7
2018-03-22 12:24:35 -07:00
Siddharth Agarwal
d462fc8fa9 revlogrepo: use Option<MPath> for trees
Summary: Represent the root tree as None.
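
One way to picture this representation is sketched below: a path type that can never be empty, with the root tree expressed as `None`. The `MPath` here is a simplified, hypothetical stand-in (a list of strings), and `describe_tree` is invented for the example.

```rust
/// Hypothetical stand-in for MPath: a non-empty list of path elements.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct MPath(Vec<String>);

impl MPath {
    /// Returns `None` for an empty path, so an `MPath` is never empty.
    pub fn new(path: &str) -> Option<MPath> {
        let elems: Vec<String> = path
            .split('/')
            .filter(|e| !e.is_empty())
            .map(String::from)
            .collect();
        if elems.is_empty() {
            None
        } else {
            Some(MPath(elems))
        }
    }
}

/// Describe which tree a path refers to; `None` is the root tree.
pub fn describe_tree(path: Option<&MPath>) -> String {
    match path {
        None => "<root>".to_string(),
        Some(p) => p.0.join("/"),
    }
}
```

With this shape the type system forces callers to handle the root case explicitly instead of relying on an empty-path convention.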

Reviewed By: farnz

Differential Revision: D7354168

fbshipit-source-id: 5d71a3bd43c19e86ecf7d53a3f721547acabe080
2018-03-21 15:59:31 -07:00
Lukas Piatkowski
a4cd4005df mercurial: refactor RevlogRepo to expose less methods publicly
Summary:
RevlogRepo exposes a ton of methods that are almost equivalent to taking Revlog directly and ignoring the RevlogRepo abstraction above it.
This diff cleans this up a bit, there are still some methods that the "old" blobimport uses, but the "new" one shouldn't need to do that.

Reviewed By: StanislavGlebik

Differential Revision: D7289445

fbshipit-source-id: ac7130fe41c4e4484d6986fe5b19d5adc751369a
2018-03-20 11:53:09 -07:00
Siddharth Agarwal
76027dfac0 verify that MPathElement instances are sane
Summary:
While writing Thrift deserialization code I realized there was nothing
that actually checked that MPathElement instances don't have embedded nulls or
slashes.
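
A minimal sketch of such a check, assuming a path element is stored as raw bytes. The `ElementError` enum and the constructor shape are assumptions for illustration, not the validation code added by this commit.

```rust
/// Hypothetical stand-in for a single path component.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct MPathElement(Vec<u8>);

/// Reasons a byte string is rejected as a path element.
#[derive(Debug, PartialEq, Eq)]
pub enum ElementError {
    EmbeddedNull,
    EmbeddedSlash,
}

impl MPathElement {
    /// Validate on construction so every existing instance is known to be sane.
    pub fn new(bytes: Vec<u8>) -> Result<MPathElement, ElementError> {
        if bytes.contains(&0) {
            return Err(ElementError::EmbeddedNull);
        }
        if bytes.contains(&b'/') {
            return Err(ElementError::EmbeddedSlash);
        }
        Ok(MPathElement(bytes))
    }
}
```

Routing deserialization through the same constructor means data read from storage gets the same checks as data built in memory.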

Reviewed By: farnz

Differential Revision: D7296838

fbshipit-source-id: 6a23d559da11e5e935e23d7b9a13f58894efaf62
2018-03-16 10:15:05 -07:00
Siddharth Agarwal
b338897dc4 prefix ChangesetId, ManifestId and BlobHash with Hg
Summary:
Mononoke will introduce its own ChangesetId, ManifestId and BlobHash, and it
would be good to rename these before that lands.

Reviewed By: farnz

Differential Revision: D7293334

fbshipit-source-id: 7d9d5ddf1f1f45ad45f04194e4811b0f6decb3b0
2018-03-15 17:45:29 -07:00
Dino Wernli
fa2b14cd8b Remove the generic types from Blob and BlobNode in favor of Bytes
Summary: Replace the generic types of `Blob` and `BlobNode` with `Bytes`.

Reviewed By: lukaspiatkowski

Differential Revision: D7115361

fbshipit-source-id: 924d347377569c6d1b3b4aed14d584510598da7b
2018-03-02 09:45:04 -08:00
Katherine McKinley
5c1e5825b0 Convert NodeHash to ChangesetId in BlobChangeset
Summary: Change BlobChangeset and callers to use ChangesetId instead of NodeId

Reviewed By: lukaspiatkowski

Differential Revision: D6835450

fbshipit-source-id: 7b20359837632aef4803e40965380c38f54c9d0a
2018-01-31 13:36:45 -08:00
Julian Priestley
67e908efe0 Use ChangesetId rather than NodeHash in bookmarks
Summary: Update the bookmarks module to use ChangesetId to represent bookmarks, rather than NodeHash.

Reviewed By: lukaspiatkowski

Differential Revision: D6774650

fbshipit-source-id: 1742e4e78798ad68a7f17ebd345eef14a7de2cec
2018-01-22 10:23:36 -08:00
Xiaotian Wu
2c4d93ceb8 modify entryid related code
Summary: modify entryid code

Reviewed By: lukaspiatkowski

Differential Revision: D6722202

fbshipit-source-id: 1df45071709f4a425374a87a29553830071b5d2d
2018-01-16 07:57:25 -08:00
Xiaotian Wu
ed94c6702c modify changesetid related code
Summary: modify the parameter type

Reviewed By: lukaspiatkowski

Differential Revision: D6695443

fbshipit-source-id: fafcdc83773cb86c08cdcf3a8d80c1c9a498eca5
2018-01-16 07:57:25 -08:00
Simon Farnsworth
ebafde00b0 Remove Repo trait completely
Summary:
We're never going to serve RevlogRepo in production, and we're down to
a single BlobRepo type that will have different backing stores. Remove the
unused trait, and use BlobRepo everywhere bar blobimport and repo_config
(because we previously hardcoded revlog here - we want to change to a BlobRepo
once blobimport is full-fidelity).

Reviewed By: jsgf

Differential Revision: D6596164

fbshipit-source-id: ba6e76e78c495720792cbe77ae6037f7802ec126
2018-01-15 06:37:27 -08:00
Stanislau Hlebik
5c43ec42c7 mononoke: add --inmemory-logs-capacity option
Summary:
Adds an option that sets the number of filelogs and revlogs that will be loaded
in memory. That lets us use blobimport in memory-constrained
environments.

Reviewed By: jsgf

Differential Revision: D6532734

fbshipit-source-id: b748478ec80e75f56a8e07ae1532b0d69c4a5e16
2017-12-13 06:58:24 -08:00
Simon Farnsworth
6ed57cea63 Learn about the treedirstate requires
Summary:
We don't read the dirstate, but treedirstate has a new repo requires
as it changes dirstate format. Learn to ignore it.

Reviewed By: jsgf

Differential Revision: D6509774

fbshipit-source-id: 46faedcece308e2ebc34d87a62d2391a68eeea38
2017-12-07 10:37:17 -08:00
Jeremy Fitzhardinge
dc5e78c1c1 rust: mass convert scm/mononoke/... to use failure
Summary:
Convert scm/mononoke to use failure, and update common/rust crates it depends on as well.

What it looks like is a lot of deleted code...

General strategy:
- common/rust/failure_ext adds some things that are in git failure that aren't yet in crates.io (`bail!` and `ensure!`, `Result<T, Error>`)
- everything returns `Result<T, failure::Error>`
- crates with real error get an error type, with a derived Fail implementation
  - replicate error-chain by defining an `enum ErrorKind` where the fields match the declared errors in the error! macro
- crates with dummy error-chain (no local errors) lose it
- `.chain_err()` -> `.context()` or `.with_context()`

So far the only place I've needed to extract an error is in a unit test.
Having a single unified error type has simplified a lot of things, and removed a lot of error type parameters, error conversion, etc, etc.

Reviewed By: sid0

Differential Revision: D6446584

fbshipit-source-id: 744640ca2997d4a85513c4519017f2e2e78a73f5
2017-12-05 18:11:13 -08:00
Stanislau Hlebik
cda6ccb287 mononoke: add hgsql requirement
Summary:
It doesn't do anything specific, but at least now it doesn't
fail if you try to open hgsql repo

Reviewed By: jsgf

Differential Revision: D6405323

fbshipit-source-id: d844f723ffe4cb8dcd2d2d71351d43524db51201
2017-12-03 08:37:01 -08:00
Stanislau Hlebik
3130f826c3 mononoke: add simple_fsencode
Summary:
There is a complex logic of path encoding in mercurial. Previously only one
case was implemented, when both 'fncache' and 'store' requirements are present.
This commit adds implementation for the case when 'store' requirement is
present, but 'fncache' is not.
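
For orientation, here is a sketch of the per-byte filename mapping as I understand Mercurial's plain 'store' encoding: uppercase letters become `_` plus the lowercase letter, `_` becomes `__`, and control bytes, bytes from 126 upward and the characters `\ : * ? " < > |` become `~` plus two hex digits. The function name `encode_filename` is an assumption, the directory-suffix escaping step is left out, and this is not the `simple_fsencode` code from the commit.

```rust
/// Encode a store path byte by byte (sketch; directory-suffix escaping omitted).
pub fn encode_filename(path: &str) -> String {
    let mut out = String::with_capacity(path.len());
    for &b in path.as_bytes() {
        match b {
            // Uppercase letters: underscore followed by the lowercase letter.
            b'A'..=b'Z' => {
                out.push('_');
                out.push(b.to_ascii_lowercase() as char);
            }
            // A literal underscore is doubled so decoding stays unambiguous.
            b'_' => out.push_str("__"),
            // Control bytes, bytes >= 126 and reserved characters: ~XX in hex.
            0..=31 | 126..=255 | b'\\' | b':' | b'*' | b'?' | b'"' | b'<' | b'>' | b'|' => {
                out.push_str(&format!("~{:02x}", b));
            }
            // Everything else is kept as-is.
            _ => out.push(b as char),
        }
    }
    out
}
```

The mapping is reversible, which is what lets case-insensitive filesystems store paths that differ only in case.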

Reviewed By: jsgf

Differential Revision: D6405322

fbshipit-source-id: 3b4a0c5b0fd22f43593ffff54dfe748589294012
2017-12-03 08:37:01 -08:00
Stanislau Hlebik
c98c151de2 mononoke: rename fsencode to fncache_fsencode
Summary:
There are a few different path encodings in mercurial. Next diff will add
another one, so in this let's rename fsencode to fncache_fsencode

Reviewed By: jsgf

Differential Revision: D6405324

fbshipit-source-id: 3e67d972b02ca41f29fe24250fb227dd384ea0da
2017-12-03 08:37:01 -08:00
Simon Farnsworth
36de85eb2e Stop repo returning Bookmarks objects directly
Summary:
Returning the bookmarks object gets in the way of degenericising
bookmarks. It's also not in line with other methods on the Repo trait - the
repo handles querying the underlying storage, not the user.

Switch to providing pass-through interfaces for bookmarks.

Reviewed By: StanislavGlebik

Differential Revision: D6408644

fbshipit-source-id: 2808850a070b7bcc478cd40d824bdc8d3acb8b0f
2017-11-27 04:21:05 -08:00
Siddharth Agarwal
a6c5093cc8 blobimport: write out linknodes
Summary:
This makes it quite easy to write out linknodes.

Also regenerate linknodes for our test fixtures -- the next commit will bring
them in.

Reviewed By: jsgf

Differential Revision: D6214033

fbshipit-source-id: 3b930fe9eda45a1b7bc6f0b3f81dd8af102061fc
2017-11-13 22:01:55 -08:00
Siddharth Agarwal
c8d6e7f954 use RepoPath instead of MPath in a few places
Summary:
`RepoPath` represents any absolute path -- root, directory or file. There's a
lot of code that manually switches between directory and file entries --
abstract all of that away.
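
The idea can be sketched as a three-variant enum. The variant names, the `String` payload and the helper methods below are assumptions for illustration; the real `RepoPath` may be shaped differently.

```rust
/// Hypothetical sketch of a path that is a root, a directory or a file.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum RepoPath {
    Root,
    Directory(String),
    File(String),
}

impl RepoPath {
    /// True for paths that name a tree (the root or a directory).
    pub fn is_tree(&self) -> bool {
        matches!(self, RepoPath::Root | RepoPath::Directory(_))
    }

    /// The path as a string; the root is the empty string in this sketch.
    pub fn as_str(&self) -> &str {
        match self {
            RepoPath::Root => "",
            RepoPath::Directory(p) | RepoPath::File(p) => p.as_str(),
        }
    }
}
```

Callers that used to branch on "directory or file" by hand can instead match on the enum once, or call a helper such as `is_tree`.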

Reviewed By: farnz

Differential Revision: D6201383

fbshipit-source-id: 0047023a67a5484ddbdd00bb57bca3bfb7d4dd3f
2017-10-31 14:26:39 -07:00
Sam Fu
1e5225b5c7 remove deprecated Tokio APIs from fbcode/scm/mononoke/mercurial
Reviewed By: jsgf

Differential Revision: D6151579

fbshipit-source-id: 6cb9f3a96916d46664bc8d52a7064419324934a6
2017-10-31 11:22:05 -07:00
Stanislau Hlebik
97fedb6daa mononoke: clear revlogs when we have too many of them
Summary:
Blobimporting huge repos can use a huge amount of memory and may end in OOM errors.
Let's use a stupid but, in practice, quite effective method: clear the revlogs when we have too many of them.

We can't use the lru_cache crate (http://contain-rs.github.io/lru-cache/lru_cache/) because it doesn't have immutable methods, so we wouldn't be able to use the ReadWriteLock optimization we've added before.
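
A rough sketch of the approach, assuming the revlogs live in a map behind an `RwLock` and are loaded on demand. `RevlogCache`, `get_or_load` and the `Vec<u8>` payload are hypothetical names chosen for this example.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical cache of loaded revlogs, keyed by path.
pub struct RevlogCache {
    capacity: usize,
    revlogs: RwLock<HashMap<String, Arc<Vec<u8>>>>,
}

impl RevlogCache {
    pub fn new(capacity: usize) -> Self {
        RevlogCache { capacity, revlogs: RwLock::new(HashMap::new()) }
    }

    /// Number of revlogs currently held in memory.
    pub fn len(&self) -> usize {
        self.revlogs.read().unwrap().len()
    }

    pub fn get_or_load<F>(&self, path: &str, load: F) -> Arc<Vec<u8>>
    where
        F: FnOnce() -> Vec<u8>,
    {
        // Fast path: a shared read lock, so concurrent readers do not block.
        {
            let map = self.revlogs.read().unwrap();
            if let Some(hit) = map.get(path) {
                return Arc::clone(hit);
            }
        }
        // Slow path: take the write lock and re-check, since another thread
        // may have loaded the same revlog in the meantime.
        let mut map = self.revlogs.write().unwrap();
        if let Some(hit) = map.get(path) {
            return Arc::clone(hit);
        }
        // The crude eviction policy: drop everything once the map is full.
        if map.len() >= self.capacity {
            map.clear();
        }
        let loaded = Arc::new(load());
        map.insert(path.to_string(), Arc::clone(&loaded));
        loaded
    }
}
```

Lookups only need `&self` and a read lock, which is the property an LRU structure would lose because every hit has to update its recency order.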

Reviewed By: lukaspiatkowski

Differential Revision: D5953892

fbshipit-source-id: 9d78b0065ee9901d35567972cf0014c3c00c3c77
2017-10-18 01:36:21 -07:00
Siddharth Agarwal
86f4489e7c move bookmarks::Version into a separate crate
Summary:
This Version type is going to form the basis for other key-value
stores like linknodes, so it needs to be moved into a separate crate.

I've chosen `storage_types` as the name because it seems to be the most obvious
candidate.

Reviewed By: jsgf

Differential Revision: D6015772

fbshipit-source-id: 52de7866d68fdec2a4908626679a6f08c5f73402
2017-10-10 07:29:30 -07:00
Siddharth Agarwal
7351b25695 rustfmt a few files
Summary:
Will be making functional changes to these files separately and don't
want the two to interfere.

Reviewed By: jsgf

Differential Revision: D6015773

fbshipit-source-id: 26529ce4075ac47e5f0e80177319e1beb90c2076
2017-10-10 07:29:30 -07:00
Stanislau Hlebik
d904a5c13f mononoke: handle null manifest pointer specially
Summary:
Previously blob importing failed if a commit had a null manifest pointer. This diff
fixes it by returning an empty blob.
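
The special case can be sketched as follows, assuming the null pointer is the all-zero 20-byte hash. `load_manifest` and its `fetch` callback are hypothetical names; the real code paths are not shown in this log.

```rust
/// The null id: twenty zero bytes.
pub const NULL_HASH: [u8; 20] = [0; 20];

/// Load manifest content, treating the null pointer as an empty manifest.
/// `fetch` stands in for whatever actually reads the manifest revlog.
pub fn load_manifest<F>(manifest_id: [u8; 20], fetch: F) -> Result<Vec<u8>, String>
where
    F: FnOnce(&[u8; 20]) -> Result<Vec<u8>, String>,
{
    if manifest_id == NULL_HASH {
        // Nothing to look up: a null pointer means "no files".
        return Ok(Vec::new());
    }
    fetch(&manifest_id)
}
```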

Reviewed By: lukaspiatkowski

Differential Revision: D5953225

fbshipit-source-id: 196e4dc3aaf2820ddeee366f20966e598aee97cb
2017-10-03 09:05:52 -07:00
Stanislau Hlebik
1ab5dcaacb mononoke: use RwLock in RevlogRepo
Summary: The bottleneck when blobimporting a huge repo is lock contention. Replacing the lock with RwLock speeds up the fbsource import from 20 hours down to 6 hours.

Reviewed By: farnz

Differential Revision: D5891097

fbshipit-source-id: bbac2e113896958d6f2da270837c9787e701b5cb
2017-09-25 05:43:38 -07:00
Siddharth Agarwal
a79b8267ae rename Path to MPath
Summary:
`Path` has the potential to be confused with `std::path::Path`.
`MPath` is nice, concise, and clearly different from `Path`.

Reviewed By: jsgf

Differential Revision: D5895665

fbshipit-source-id: dc5ed5c3866b227d753c6d904d3c6d213c882cd7
2017-09-22 17:27:03 -07:00
Siddharth Agarwal
1165fb00e0 rustfmt a few crates
Summary: Going to make a few changes here.

Reviewed By: jsgf

Differential Revision: D5895642

fbshipit-source-id: 79483e15087d4c552b6bc9801ad3fe0aaba071d6
2017-09-22 17:27:03 -07:00
Stanislau Hlebik
c6cfba9cc5 mononoke: encode datapath too
Summary: Encoding only the index path is not enough. It works fine for now because we don't use hash encoding. The next diff adds hash encoding, so we need to encode the datapath too.

Reviewed By: jsgf

Differential Revision: D5719574

fbshipit-source-id: 4f2a4a75baad73313e80ffb81031166d4bab3e29
2017-09-17 12:15:07 -07:00
Stanislau Hlebik
6372225acf mononoke: add separate fsencode function instead of fsencode_dir and fsencode_file
Summary:
Let's use a separate function fsencode instead of two methods fsencode_dir and fsencode_file.

There are a few reasons for that:
1) This is similar to upstream mercurial code - it also uses a separate function, not a method of the class.
2) Path is supposed to represent a file in the mercurial vfs. Previously we joined this file with "00manifest.i" - it creates a file that doesn't exist in mercurial vfs. This point is debatable though, so I'm fine with making it a method of the class. But probably it doesn't matter that much.
3) We never actually need to encode a directory - even in the tree manifest case we use fsencode to find the location of `00manifest.*` files. That means that we don't really need a separate fsencode_dir function, so I was wrong when I added them in the first place.
4) Special hash encoding is used to encode paths that are longer than 120 chars (will be added in the next diff). `00manifest.i` and `00manifest.d` are used in the hash digest, and that means that one fsencode_dir() method is not enough - we'd need to add one method to fsencode the idx file and another to fsencode the data file.

Reviewed By: jsgf

Differential Revision: D5719576

fbshipit-source-id: ca6b38dd7d0c6c0c5a345d8fcbe1b0d6fa10a062
2017-09-17 12:15:07 -07:00
Siddharth Agarwal
e5b075b8ec rustfmt mercurial and mercurial-types
Summary: Going to be making significant changes to these files soon.

Reviewed By: kulshrax

Differential Revision: D5796735

fbshipit-source-id: 879fbca3fc936a538c95e50a3333fc2c312de15b
2017-09-12 11:50:20 -07:00
Jeremy Fitzhardinge
9501d33699 rust: asyncmemo: allow fill function to be recursive
Summary:
Pass a reference to the cache to the fill function. This allows the
function to be recursive based on memoized values.

This also required quite a bit of restructuring to make sure that locks
and ownership are handled properly during recursive calls. Specifically,
a new `Slot` state - `Polling` - is used to indicate when a thread/task
is currently calling `.poll()` on the future. This contains a list of
futures Tasks which are interested in the state of the slot which can be
notified when it changes state.

Also removed unused Entry API code.

Reviewed By: sid0

Differential Revision: D5652704

fbshipit-source-id: 29cd3fe37d4eb9316235872b7e2e228bf10a016f
2017-08-29 12:36:18 -07:00
Stanislau Hlebik
48776b34a3 mononoke: move data/ and meta/ prefixes to fsencode_* functions
Summary:
Core mercurial takes "data/" and "meta/" prefixes into account when it does
fsencode.
It doesn't make a difference now, but it will make a difference when we add
hashencode to the fsencode() function.

Reviewed By: jsgf

Differential Revision: D5670748

fbshipit-source-id: 661974c25e00979eedffb30b432518135f0dc631
2017-08-23 05:13:50 -07:00
Stanislau Hlebik
dbd845ba0e mononoke blobimport: split work between threads by linkrev
Summary:
We want to avoid putting the same entries twice in the blobstore. And even more - we want to avoid generating list of these entries at all in the first place.

The first approach was to add a `Mutex<HashSet>` that worker threads would use to filter out entries that were already imported. It turned out that this Mutex kills almost all the speedup from concurrency.
But since we have linkrevs, for each entry we know in which commit it was created [1]. That means that all of the entries are already nicely split between the threads, so no synchronization is needed.

It gives a good speedup - from ~7min to 2min of importing of hg upstream treemanifest repo using file blobstore.

Note: there is still a lock contention - tree revlogs and file revlogs maps are protected by mutex. We can optimize it later if needed.

[1] There is a well-known linkrev issue in mercurial. It shouldn't affect our case at all.
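
The partitioning idea can be sketched like this, under the assumption that each worker owns a subset of commits and imports an entry only when the entry's linkrev equals the commit being processed. `Entry`, `import_parallel` and the round-robin split are hypothetical choices for the example, not the blobimport implementation.

```rust
use std::thread;

/// Hypothetical manifest entry: a name plus the rev of the commit that created it.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Entry {
    pub name: String,
    pub linkrev: usize,
}

/// `commits[rev]` lists the entries referenced by commit `rev`; the same entry
/// may be referenced by many commits. Returns the sorted names that were imported.
pub fn import_parallel(commits: Vec<Vec<Entry>>, workers: usize) -> Vec<String> {
    let workers = workers.max(1);
    let mut handles = Vec::new();
    for w in 0..workers {
        // Each worker takes every `workers`-th commit (round-robin by rev).
        let mine: Vec<(usize, Vec<Entry>)> = commits
            .iter()
            .enumerate()
            .filter(|(rev, _)| rev % workers == w)
            .map(|(rev, entries)| (rev, entries.clone()))
            .collect();
        handles.push(thread::spawn(move || {
            let mut imported = Vec::new();
            for (rev, entries) in mine {
                for entry in entries {
                    // Import only entries created by this commit. No shared
                    // set is needed: exactly one commit matches each linkrev.
                    if entry.linkrev == rev {
                        imported.push(entry.name);
                    }
                }
            }
            imported
        }));
    }
    let mut all: Vec<String> = handles
        .into_iter()
        .flat_map(|h| h.join().unwrap())
        .collect();
    all.sort();
    all
}
```

Each entry is imported exactly once even though several commits reference it, and the workers never touch shared mutable state.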

Reviewed By: jsgf

Differential Revision: D5650074

fbshipit-source-id: c4f9e2763127ffe4402417dd3963f1f450d7b325
2017-08-23 05:13:50 -07:00
Stanislau Hlebik
a28f3ab216 mononoke: add method to RevlogRepo to get the content of tree manifest entry
Summary: We'll need it to implement blob importing from revlogrepo to blobrepo

Reviewed By: jsgf

Differential Revision: D5622462

fbshipit-source-id: c57f016711bcec7c0bd432d40881588ebdce6f7f
2017-08-21 03:11:40 -07:00
Stanislau Hlebik
8b03d3909e revlogrepo: handle tree manifests
Summary: Repos with a flat manifest and repos with a tree manifest use different paths. Let's open the tree manifest file if it's present and the flat manifest otherwise

Reviewed By: sid0

Differential Revision: D5583566

fbshipit-source-id: e35eef4b1f8067c2a91ebfc62718fec100f19e2e
2017-08-18 04:40:24 -07:00
Stanislau Hlebik
90b418c719 mononoke: add get_tree_revlog to RevlogRepo
Summary:
Add function similar to get_file_revlog to get tree manifest revlog.
Will be used in the next diffs.

Reviewed By: jsgf

Differential Revision: D5563895

fbshipit-source-id: 0ff84c458eb071763cbdc6d98bcf92e9b8ccc1b8
2017-08-18 04:40:23 -07:00
Stanislau Hlebik
5849f4687d mononoke: better fsencode in mercurial_types::Path
Summary:
Previously `fsencode()` worked incorrectly if Path was a directory. We didn't notice it before because we've never used Path to store directories, but we will use it for TreeManifest.

I considered two options when implementing it.
1) Put some kind of flag `isDir` inside the Path struct. But that would create complications with the `join()` method. For example, you can't join anything to a file - what should we do in this case? panic? return a result?
2) Add another `fsencode_dir()` method. Clients need to know what kind of Path they have. I chose this option because it requires fewer changes and brings fewer complications than option 1

Reviewed By: sid0

Differential Revision: D5574847

fbshipit-source-id: c4c476a7fc3b884de847c431a56ff5f313c1389f
2017-08-18 04:40:23 -07:00
Stanislau Hlebik
a21bb2cac4 mononoke: move changelog and manifest from under lock
Summary:
Both changelog and manifest are revlogs and they are both protected by their
own mutexes. There is no need to protect them with the RevlogRepo mutex as well.

It requires making `Revlog::get_heads()` method accept immutable `&self`. This
is fine since all `Revlog` methods are protected by mutex.
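
The reasoning can be illustrated with a small sketch: when the mutable state sits behind a `Mutex` inside the type, a method can take `&self` and still update internal state. The `Revlog` below, with its parent list and cached head set, is a hypothetical toy and not the real structure.

```rust
use std::collections::BTreeSet;
use std::sync::Mutex;

/// Mutable state kept behind the mutex.
struct RevlogInner {
    /// `parents[rev]` is the parent of `rev`, if any (single-parent toy model).
    parents: Vec<Option<usize>>,
    heads_cache: Option<BTreeSet<usize>>,
}

/// Hypothetical revlog whose methods only need `&self`.
pub struct Revlog {
    inner: Mutex<RevlogInner>,
}

impl Revlog {
    pub fn new(parents: Vec<Option<usize>>) -> Self {
        Revlog { inner: Mutex::new(RevlogInner { parents, heads_cache: None }) }
    }

    /// Takes `&self`: the mutex provides the exclusive access needed to
    /// fill the cache, so callers do not need `&mut Revlog`.
    pub fn get_heads(&self) -> BTreeSet<usize> {
        let mut inner = self.inner.lock().unwrap();
        if let Some(cached) = &inner.heads_cache {
            return cached.clone();
        }
        // A head is a rev that is nobody's parent.
        let mut heads: BTreeSet<usize> = (0..inner.parents.len()).collect();
        for parent in inner.parents.iter().flatten() {
            heads.remove(parent);
        }
        inner.heads_cache = Some(heads.clone());
        heads
    }
}
```

Since the method no longer needs `&mut self`, an outer lock around the whole repo object is not required just to call it.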

Reviewed By: farnz

Differential Revision: D5618949

fbshipit-source-id: 0511d547360e9785cb6e2cefadf8c10626a433c4
2017-08-15 10:37:34 -07:00