Summary:
There are a few separate steps during blobimport. One of them is inserting the
blobs and another is inserting the changesets. That worked fine if RevlogRepo was
static. However if RevlogRepo has changed between first and second step we
could've inserted blobs for N commits, but insert (N + X) changesets into
changesets table. That means that a few commits would have no blobs at all.
This diff fixes it by fixing the changesets that we want to blobimport
beforehand.
Reviewed By: farnz
Differential Revision: D7615133
fbshipit-source-id: 1a66907e34a65588b101199c8f59abda53f7bc20
Summary: As with filenodes, we also want to write changesets in a real store.
Reviewed By: lukaspiatkowski
Differential Revision: D7615101
fbshipit-source-id: 269deb8fc3c1f58afb82f453a68ea4d8a3f1f63d
Summary:
Use asyncmemo to cache Changesets.
Unfortunately currently we are using separate asyncmemo cache, so we have to
specify the size for the caches separately. Later we'll have a single cache for
everything, and the number of config knobs will go down.
Reviewed By: lukaspiatkowski
Differential Revision: D7685376
fbshipit-source-id: efe8a3a95fcc72fab4f4af93564e706cd1540c2f
Summary:
Let's use it! Pass config option that set's the cache max memory usage (don't
put a limit on the number of entries, it's useless in that case).
Currently we'll set a separate size for each of the caches that we use
(blobstore, changesets, filenodes, etc). Later we'll have just one single option that
sets the cache size for all of them.
Reviewed By: lukaspiatkowski
Differential Revision: D7671814
fbshipit-source-id: f9571078e6faaa80ea4c31c76a9eebcc24d8a68a
Summary:
We know that the hashes for non-root-tree-manifests and filenodes
should always be consistent. Verify that.
Reviewed By: farnz
Differential Revision: D7704087
fbshipit-source-id: 7f6207878c5cd372b272aa6970506dd63b5a3c7c
Summary:
As the comment explains, sometimes the hashes don't match the
contents. Accept such pushes.
Reviewed By: farnz
Differential Revision: D7699930
fbshipit-source-id: 376f01b6cf03f6cad84c2c878d192d55f8d81812
Summary:
We're going to keep this around for now as part of double-writing.
All the hashes here are definitely Mercurial hashes, so use them that way.
Reviewed By: lukaspiatkowski
Differential Revision: D7683890
fbshipit-source-id: 270091cd11f3cec7ef4cf565de5ef913fcf7adea
Summary:
file::File works entirely in the Mercurial domain, so these
conversions are good.
Reviewed By: StanislavGlebik
Differential Revision: D7665973
fbshipit-source-id: 8a192c5d1886492ad21593693b080c8e5ddf8f7e
Summary:
This is because these Mercurial entries are (at least currently) going
to be stored as they come in, and this data structure is entirely in the
Mercurial domain.
Reviewed By: lukaspiatkowski
Differential Revision: D7664972
fbshipit-source-id: 9de5475eed0d7ab7085c29fd0282f205043cfe5a
Summary:
I was trying to debug something with the new blobimport, and this was
getting in the way.
Reviewed By: StanislavGlebik
Differential Revision: D7664660
fbshipit-source-id: 2ec4ee79fbe13584f35e7dcd9e8df2b8bdf181c0
Summary:
The comment doesn't quite look right, and `HgBlob` will always have
content available now.
Reviewed By: StanislavGlebik
Differential Revision: D7663200
fbshipit-source-id: 614b8bc97ece99aaefdc8fa6eaf36fe66779be13
Summary:
The list of arguments is becoming too long, and I need to add even
more here.
Reviewed By: StanislavGlebik, farnz
Differential Revision: D7652096
fbshipit-source-id: 62a4631e163e95cf5c950a949e72facab629ea54
Summary:
Currently, any sort of `Bytes` can be stored in the blobstore. That
caused me to make several mistakes while writing the code to store bonsai
changesets, because I'd just end up storing the wrong set of `Bytes`.
Introduce stronger typing so that only types that explicitly implement
`BlobStorable` can be stored in the blobstore.
Currently, these sorts of blobs can be stored in the blob store:
* `ChangesetBlob` and `ContentBlob` in `mononoke-types` (these are Thrift-serialized structures)
* The envelope `RawNodeBlob` and `RawCSBlob` types in `blobrepo`, once converted to `EnvelopeBlob` instances
* `HgBlob`, which contains revlog data (manifests or files) exactly as serialized by Mercurial
Reviewed By: StanislavGlebik
Differential Revision: D7627290
fbshipit-source-id: d1bcbde8881e365dec99618556e7b054985bccf7
Summary:
Pass mysql tier name to the BlobRepo, so that we can use it to connect to mysql
based storages like mysql changeset storage, filenodes storage etc.
Note that currently Filenodes storage only connects to master region. This will
be fixed in the later diffs
Reviewed By: lukaspiatkowski
Differential Revision: D7585191
fbshipit-source-id: 168082abfeb7ccba549c7a49e6269cc01c490c14
Summary:
Now that `BlobNode` no longer returns `None`:
* don't expose the `BlobNode` API outside the crate because it turns out to not be very useful (it should probably go away eventually?)
* make the `File` API not return `Option` types
* Add a new `file_contents` that returns a brand-new `FileContents` (this is the first time we're tying together Mercurial and Mononoke data structures!)
Also remove a `Symlink` API that isn't really correct honestly.
Reviewed By: StanislavGlebik
Differential Revision: D7624729
fbshipit-source-id: 38443093b8bfea91384c959f3425cf355fac9f65
Summary: Having an implicit `From` parsing makes it hard to track the exceptional places where RevlogChangest can be directly translated to BlobChangeset. Make it explicit for better tracking of this behavior
Reviewed By: StanislavGlebik
Differential Revision: D7637247
fbshipit-source-id: 781341315102ea6b2265c33bb09a89aae3d0c329
Summary:
I'm going to put in some stronger typing around what can be stored in
the blob store. Centralizing the management here makes that much easier.
Reviewed By: StanislavGlebik
Differential Revision: D7619519
fbshipit-source-id: a428679018f0a1571e54bc01bb5483ba9fdf1cb5
Summary: The Hg prefix is unique now so let's not use verbose mercurial::
Reviewed By: sid0
Differential Revision: D7620112
fbshipit-source-id: 0aece310ed817445fef4c94b32f78fda3a3b1c49
Summary: mercurial_types::DBlobNode should be replaced by types from mononoke_types or mercurial in most cases. This rename should help with tracking this
Reviewed By: sid0
Differential Revision: D7619793
fbshipit-source-id: 261fd92acae825dc4bc8011c3716c5585eb0413c
Summary: mercurial_types::DParent should be replaced by types from mononoke_types or mercurial in most cases. This rename should help with tracking this
Reviewed By: sid0
Differential Revision: D7619686
fbshipit-source-id: 5ad105113779387f1408c806860483e06ed5fb3d
Summary: mercurial_types::DFileNodeId should be replaced by types from mononoke_types in most cases. This rename should help with tracking this
Reviewed By: sid0
Differential Revision: D7619290
fbshipit-source-id: aa6a8e55ae3810c4531028c3b3db2e5730fe7846
Summary: mercurial_types::DChangesetId should be replaced by types from mononoke_types in most cases and by mercurial::HgChangesetId in others. This rename should help with tracking this
Reviewed By: sid0
Differential Revision: D7618897
fbshipit-source-id: 78904f57376606be99b56662164e0c110e632c64
Summary: Migrates some uses of `.map_err()` and `.context(format!())` usage in Mononoke to `.with_context()`
Reviewed By: lukaspiatkowski
Differential Revision: D7607935
fbshipit-source-id: 551538c78a1755f7aa0716532ab437a1baf6dd89
Summary:
This is a cleanup of NodeHash API. There were few unused methods and few ways to convert between mercurial and mercurial_types hashes. With this diff it is very easy to identify the places where this converstion happens.
A followup of this diff will be to use this new API to easily replace the NodeHash convertions in places where it requires remapping.
Reviewed By: sid0
Differential Revision: D7592876
fbshipit-source-id: 6875aa6df1a3708ce54ca5724f6eb960d179192b
Summary:
Let's fill cs table even if we import only part of the repo. This let's us
import new changesets incrementally.
That can be dangerous since we don't check if parent commits are present.
However this blobimport is a temporary measure until we get a full-fidelity
blobimport that uses a commit API.
Reviewed By: jsgf
Differential Revision: D7485495
fbshipit-source-id: 63ba91bad4eb1c1662db73293c76a506f48a4753
Summary: They are replaced by filenodes
Reviewed By: farnz
Differential Revision: D7443320
fbshipit-source-id: 13c7d07bc00dcbaa991663c8da8a07fcb0de1332
Summary:
This will probably go away soon, but for now I want to be able to
disambiguate the new Thrift-encoded blobs in Mononoke from these.
Reviewed By: StanislavGlebik
Differential Revision: D7565808
fbshipit-source-id: d61f3096fa368b934a923dee54a0ea1e3469ae0d
Summary:
Since `FileType` now exists, the `Type` enum can use it instead of
defining its own stuff.
Reviewed By: farnz
Differential Revision: D7526046
fbshipit-source-id: 3b8eb5502bee9bc410ced811dc019c1ce757633f
Summary:
Previously we were able to create just sqlite filenodes, now let's make it
possible to create a mysql filenodes.
Reviewed By: jsgf
Differential Revision: D7485098
fbshipit-source-id: b9156e51d41a570f9e6aaf9eaa9e476222257bca
Summary:
We'll need filenodes in blobimport when we'll add filenodes to the BlobRepo.
The implementation is not great - it creates a separate thread for the
filenodes (see "filenodeinserts"). New filenodes are sent via UnboundedSender
from the parsing cpupool
However it doesn't worth the effort to clean up the code that we are
going to deprecate in a couple of weeks.
Reviewed By: farnz
Differential Revision: D7429440
fbshipit-source-id: 4a9220915bd27f5c1c2028ec604afd700bb8a509
Summary:
The diff adds an extension of streams to the `failures_ext` crate, allowing streams with error type `failure::Error` or `failure::Fail` to store a context.
As a proof-of-concept, the resulting `context()` function is applied to a stream in use in mononoke.
Reviewed By: lukaspiatkowski
Differential Revision: D7336012
fbshipit-source-id: 822c9dcd5b6c0a60470e8fd98fecd569928be7d1
Summary:
There's no point passing it by reference since callers don't need to
retain it, and the async implementation needs to move it into another context.
Reviewed By: farnz
Differential Revision: D7350001
fbshipit-source-id: 5947557a84621afae801dc20e3994496244e3a10
Summary:
This codemod tries not to change the existing behavior of system, only introduce new types specific to Mercurial Revlogs.
It introduces a lot of copypasta intentionally and it will be cleaned in following diffs.
Reviewed By: farnz
Differential Revision: D7367191
fbshipit-source-id: 0a915f427dff431065e903b5f6fbd3cba6bc22a7
Summary:
We're going to get rid of empty MPaths very soon, so stop using them
here.
Reviewed By: StanislavGlebik, farnz
Differential Revision: D7358023
fbshipit-source-id: 2d5fff40eae03cc63ef5514faee1a8505b3b2bd0
Summary:
Lots of places specifically want a non-empty `MPath`, and earlier those cases
weren't type-checked. Now `Option<MPath>` stands for an empty `MPath`. In the
next diff `MPath::empty()` will go away and the only way to represent an
`MPath` that doesn't exist will be with `Option<MPath>`.
Reviewed By: farnz
Differential Revision: D7350970
fbshipit-source-id: 1612aec67134e7a0ebad15dbaa93b5ea972f8ddf
Summary:
The `Option<&MPathElement>` type is more general -- it's easy to
convert from `&Option<MPathElement>` to it, but the other way around can
require a clone.
Reviewed By: farnz
Differential Revision: D7339161
fbshipit-source-id: 0c8ab57a19bc330245c612e3e0e3651e368ab8cb
Summary:
The new_blobimport as opposed to the old one do two things differently:
1. It uses a better structured API of RevlogRepo. The old one reads the Revlogs directly and does not verify if the data it has read is correct or it does not let us fix it into canonical form easily (once we have a canonical form different from Revlog's).
2. It uses BlobRepo's Commit API instead of writing directly to storage. This ensures consistency in our code and let's us leverage the validation that is incorporated in Commit API.
Reviewed By: farnz
Differential Revision: D7041976
fbshipit-source-id: fe592524533955f364f1b037109b3b5b5bab6b02
Summary:
RevlogRepo exposes a ton of methods that are almost equvalent to taking Revlog directly and ignoring the RevloRepo abstraction above it.
This diff cleans this up a bit, there are still some methods that the "old" blobimport uses, but the "new" one shouldn't need to do that.
Reviewed By: StanislavGlebik
Differential Revision: D7289445
fbshipit-source-id: ac7130fe41c4e4484d6986fe5b19d5adc751369a
Summary:
Mononoke will introduce its own ChangesetId, ManifestId and BlobHash, and it
would be good to rename these before that lands.
Reviewed By: farnz
Differential Revision: D7293334
fbshipit-source-id: 7d9d5ddf1f1f45ad45f04194e4811b0f6decb3b0
Summary: Replace the generic types if `Blob` and `BlobNode` with `Bytes`.
Reviewed By: lukaspiatkowski
Differential Revision: D7115361
fbshipit-source-id: 924d347377569c6d1b3b4aed14d584510598da7b
Summary: Update to include num_cpu and blake2. Also update bincode to 1.0.0
Reviewed By: StanislavGlebik
Differential Revision: D7098292
fbshipit-source-id: 67793a6f458d50fc049781f34abaf313c8ff7a79
Summary:
Provide an API to ask BlobRepo to create changesets for you from
pieces that you either have to hand, or have created via upload_entry().
Parallelism is maintained in as far as possible - if you commit N changesets,
they should all upload blobs in parallel, but the final completion future
depends on the parents, so that completion order can be maintained.
The ultimate goal of this API is to ensure that only valid commits are added to the `BlobRepo` - this means that, once the future returned by `create_changeset` resolves, you have a repo with commits and blobs in place. Until then, all the pieces can be uploaded, but are not guaranteed to be accessible to clients.
Still TODO is teaching this to use the complete changesets infra so that we
simply know which changesets are fully uploaded.
Reviewed By: StanislavGlebik
Differential Revision: D6743004
fbshipit-source-id: 813329058d85c022d75388890181b48b78d2acf3
Summary:
Iff all the inserts finished successfully, then it's safe to mark changesets as complete.
This diff fills up changesets store after blobimport successfully finishes.
For simplicity if --commit-limit or --skip is set then we skip filling up the changeset store.
Reviewed By: sid0
Differential Revision: D7043831
fbshipit-source-id: 8ae864b45222d52281c885a49c2dca44ba577137
Summary:
As we discussed before, let's add get_name() method that returns MPathElement,
and remove get_path() and get_mpath().
Except for renaming, diff also make repoconfig work with tree manifest, and
fixes linknodes creation in blobimport - previously basename was used instead
of the whole path.
Reviewed By: jsgf
Differential Revision: D6857097
fbshipit-source-id: c09f3ff40d38643bd44aee8b4488277d658cf4f6
Summary: Change BlobChangeset and callers to use ChangesetId instead of NodeId
Reviewed By: lukaspiatkowski
Differential Revision: D6835450
fbshipit-source-id: 7b20359837632aef4803e40965380c38f54c9d0a