Summary:
Bookmarks point to bonsai changesets, yet previously we were fetching the
bonsai changeset for a bookmark, converting it to an hg changeset in the
`get_bookmark` method, and then converting it back to bonsai in `pushrebase.rs`.
This diff adds a `get_bonsai_bookmark()` method that removes these useless
conversions.
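A minimal sketch of the round trip being removed, with bookmark storage reduced to in-memory maps; the types and method shapes are illustrative stand-ins, not Mononoke's real API:
```
use std::collections::HashMap;

// Illustrative stand-ins for Mononoke's id types.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct ChangesetId(u64); // bonsai changeset id
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct HgChangesetId(u64); // mercurial changeset id

struct Bookmarks {
    book_to_bonsai: HashMap<String, ChangesetId>,
    bonsai_to_hg: HashMap<ChangesetId, HgChangesetId>,
    hg_to_bonsai: HashMap<HgChangesetId, ChangesetId>,
}

impl Bookmarks {
    // Old path: the bookmark already points to a bonsai changeset, but
    // `get_bookmark` converted it to an hg changeset...
    fn get_bookmark(&self, name: &str) -> Option<HgChangesetId> {
        let bonsai = self.book_to_bonsai.get(name)?;
        self.bonsai_to_hg.get(bonsai).copied()
    }

    // ...and pushrebase.rs then mapped the hg id straight back to bonsai:
    fn old_pushrebase_lookup(&self, name: &str) -> Option<ChangesetId> {
        let hg = self.get_bookmark(name)?;
        self.hg_to_bonsai.get(&hg).copied()
    }

    // New method: return the stored bonsai id directly, skipping both
    // conversions.
    fn get_bonsai_bookmark(&self, name: &str) -> Option<ChangesetId> {
        self.book_to_bonsai.get(name).copied()
    }
}
```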
Reviewed By: farnz
Differential Revision: D10427433
fbshipit-source-id: 1b15911fc5d77483b5a135a8d4484fccff23c774
Summary:
getfiles implementation for LFS.
The implementation works as follows (see the sketch after this list):
- get the file size from the file envelope (retrieved from Manifold by HgNodeId)
- if the file size is above the threshold from the LFS config:
  - fetch the file into memory and compute the SHA-256 of its contents. This will be fixed later, as this approach consumes a lot of memory, but we don't yet have any mapping from SHA-256 to Blake2 [T35239107](https://our.intern.facebook.com/intern/tasks/?t=35239107)
  - generate an LFS metadata file according to the [LfsPlan](https://www.mercurial-scm.org/wiki/LfsPlan)
  - set the metakeyflag (REVID_STORED_EXT) in the file header
- if the file size is below the threshold, process it the usual way
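A minimal sketch of the size gate, assuming the `sha2` and `hex` crates; the threshold constant is illustrative (really it comes from the LFS config), and the real code reads the size from the file envelope rather than the buffer:
```
use sha2::{Digest, Sha256};

const LFS_THRESHOLD: u64 = 1024 * 1024; // illustrative; really lfs config

fn getfiles_content(file: &[u8]) -> Vec<u8> {
    if file.len() as u64 > LFS_THRESHOLD {
        // Hash the whole file in memory (the memory-heavy step tracked in
        // T35239107) and return an LFS metadata file per the LfsPlan format.
        // The caller would also set the REVID_STORED_EXT metakeyflag in the
        // file header so the client knows these bytes are metadata.
        let oid = hex::encode(Sha256::digest(file));
        format!(
            "version https://git-lfs.github.com/spec/v1\noid sha256:{}\nsize {}\n",
            oid,
            file.len()
        )
        .into_bytes()
    } else {
        file.to_vec() // below the threshold: serve the content as usual
    }
}
```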
Reviewed By: StanislavGlebik
Differential Revision: D10335988
fbshipit-source-id: 6a1ba671bae46159bcc16613f99a0e21cf3b5e3a
Summary: Reverting the myrouter-based filenodes for now, as they cause some problems.
Reviewed By: jsgf
Differential Revision: D10405364
fbshipit-source-id: 07da917455ae5af9ef81a24d99f516171101c8a7
Summary:
According to the [Mercurial LFS Plan](https://www.mercurial-scm.org/wiki/LfsPlan), on push the hg client sends LFS metadata instead of the actual file contents for files whose size is above the threshold (the lfs.threshold config). The main part of the LFS metadata is the SHA-256 of the file content (the oid).
The format requires the following mandatory fields: version, oid, size.
When LFS metadata is sent instead of the real file content, the lfs_ext_stored flag is set in the request's revflags.
If this flag is set, we ignore the SHA-1 hash verification inconsistency.
Later we check that the content was actually loaded into the blobstore, create a filenode envelope from it, and load the envelope into the blobstore.
The filenode envelope requires the following info:
- size: retrieved when fetching the actual data from the blobstore.
- copy_from: retrieved from the file sent by the hg client.
Mononoke still does the same checks for an LFS push as for a non-LFS push (i.e. it checks that all the necessary manifests/filelogs were uploaded by the client).
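Since the push path hinges on those mandatory fields, here is a minimal sketch of pulling them out of the metadata the client sends; the function name is made up and error handling is reduced to `Option`:
```
// Extract the mandatory oid and size fields from an LFS metadata file;
// returns None if either is missing or malformed.
fn parse_lfs_metadata(text: &str) -> Option<(String, u64)> {
    let mut oid = None;
    let mut size = None;
    for line in text.lines() {
        if let Some(v) = line.strip_prefix("oid sha256:") {
            oid = Some(v.to_string());
        } else if let Some(v) = line.strip_prefix("size ") {
            size = v.parse::<u64>().ok();
        }
        // the version line and optional keys (e.g. copy info) would be
        // handled here
    }
    Some((oid?, size?))
}
```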
Reviewed By: StanislavGlebik
Differential Revision: D10255314
fbshipit-source-id: efc8dac4c9f6d6f9eb3275d21b7b0cbfd354a736
Summary:
Mercurial stores the executable bit as part of the manifest, so if a changeset changes only that attribute of a file, hg reuses the file hash. Mononoke, however, has been creating an additional filenode, so this change handles that special case. Note that this kind of reuse only happens if the file has a single parent [P60183653](P60183653).
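A sketch of why the hash can be reused, with hg's manifest entry reduced to a struct; the names are illustrative, not hg's actual types:
```
// The filenode id is computed from the file's parents and content only;
// the executable bit lives in the manifest entry's flags. A commit that
// flips Regular -> Executable therefore changes `flags` but leaves
// `filenode` untouched (in the single-parent case), so hg reuses the hash
// while Mononoke used to mint a fresh filenode.
struct ManifestEntry {
    filenode: [u8; 20], // hash over (p1, p2, content); mode not included
    flags: FileFlag,
}

enum FileFlag {
    Regular,
    Executable, // 'x' in the manifest
    Symlink,    // 'l' in the manifest
}
```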
Some of our fixture repos were affected, hence these hashes were replaced with updated ones:
```
396c60c14337b31ffd0b6aa58a026224713dc07d => a5ab070634ab9cbdfc92404b3ec648f7e29547bc
339ec3d2a986d55c5ac4670cca68cf36b8dc0b82 => c10443fa4198c6abad76dc6c69c1417b2e821508
b47ca72355a0af2c749d45a5689fd5bcce9898c7 => 6d0c1c30df4acb4e64cb4c4868d4c974097da055
```
Reviewed By: farnz
Differential Revision: D10357440
fbshipit-source-id: cdd56130925635577345b08d8ed0ae6e229a82a7
Summary: Make get_manifest_by_nodeid accept HgManifestId and correct all calls to get_manifest_by_nodeid.
Reviewed By: StanislavGlebik
Differential Revision: D10298425
fbshipit-source-id: 932e2a896657575c8998e5151ae34a96c164e2b2
Summary:
PUT request upload to the Mononoke API.
The hg client sends a PUT request to store a file in the blobstore during a push supporting LFS.
Uploading a file by alias is divided into 2 parts:
- Put alias: blobstore key
- Put blobstore_key: contents
Keep in mind that the file content is thrift-encoded.
The host address for the batch request comes from the command line flags: -H for host, -p for port.
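A sketch of the two-part write, with the blobstore reduced to a map; the alias key follows the `aliases.sha256.*` layout described elsewhere in this log, not the production key scheme verbatim:
```
use std::collections::HashMap;

// Stand-in for the blobstore: key -> raw bytes.
type Blobstore = HashMap<String, Vec<u8>>;

fn upload_by_alias(
    store: &mut Blobstore,
    content_sha256_hex: &str,
    blob_key: String,
    thrift_encoded_contents: Vec<u8>, // file content is thrift-encoded
) {
    // Part 1: alias -> blobstore key.
    store.insert(
        format!("aliases.sha256.{}", content_sha256_hex),
        blob_key.clone().into_bytes(),
    );
    // Part 2: blobstore key -> contents.
    store.insert(blob_key, thrift_encoded_contents);
}
```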
Reviewed By: StanislavGlebik
Differential Revision: D10026683
fbshipit-source-id: 6c2726c7fee2fb171582bdcf7ce86b22b0130660
Summary:
JSON blobs let other users of Mononoke learn what they need to know
about commits. When we receive a commit, log a JSON blob to Scribe that those users can pick up.
Because Scribe does not guarantee ordering and can sometimes lose messages, each message includes enough data to allow a tailer that wants to know about all commits to follow history backwards and detect lost messages (and thus fix them up locally). It's expected that tailers will either sample this data or keep their own state that they can use to detect missing commits.
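A hypothetical shape for such a message, assuming the `serde`/`serde_json` crates; the summary does not spell out the field set, so everything beyond "enough data to follow backwards" is an assumption:
```
use serde::Serialize;

// Hypothetical field set: the key property is that each message names its
// parents, so a tailer can walk history backwards and detect (then locally
// backfill) messages that Scribe reordered or dropped.
#[derive(Serialize)]
struct CommitLogMessage {
    repo_name: String,
    changeset_id: String,
    parents: Vec<String>,
}

fn to_scribe_json(msg: &CommitLogMessage) -> serde_json::Result<String> {
    serde_json::to_string(msg)
}
```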
Reviewed By: StanislavGlebik
Differential Revision: D9995985
fbshipit-source-id: 527b6b8e1ea7f5268ce4ce4490738e085eeeac72
Summary:
We want to be able to do post-commit logging for all changesets. Set
up the data structures I intend to use for now, and arrange to discard all
logging.
A later diff will add logging to a ScribeClient.
Reviewed By: StanislavGlebik
Differential Revision: D9995984
fbshipit-source-id: 796b390f6b83ace576f73a217ac564c4251d7ec5
Summary:
This diff adds a real implementation for CachingChangesetFetcher. Now it
fetches the data for the cache from the blobstore.
The rest is explained in the comments.
Reviewed By: farnz
Differential Revision: D9908320
fbshipit-source-id: 5427f3ed312cb7753434161423cb27b48744347f
Summary:
Initial implementation of a ChangesetsFetcher that will use the cache more
intelligently. At the moment it doesn't do anything special, but in the next
diffs it will pre-warm the cache when it gets a lot of cache misses (that's
why it has to hold a reference to the cachelib CachePool).
Reviewed By: farnz
Differential Revision: D9908319
fbshipit-source-id: 6377a947696bae6b060de5a441722c28309b341c
Summary:
High-level goal: we want to make certain big getbundle requests faster. To do
that, we'd store blobs of commits that are close to each other in the blobstore
and fetch them only when we hit too many cache misses. All this logic will be
hidden behind the ChangesetFetcher trait implementation. A ChangesetFetcher
will be created per request (hence the factory).
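A rough sketch of the trait-plus-factory arrangement; the real trait is asynchronous and blobstore-backed, and all the names below are illustrative stubs:
```
type ChangesetId = [u8; 32]; // stand-in for the bonsai id type

// All fetching logic (including any blob-backed pre-warming on cache
// misses) hides behind this trait.
trait ChangesetFetcher {
    fn get_parents(&self, cs_id: ChangesetId) -> Vec<ChangesetId>;
}

// A fresh fetcher per getbundle request, so implementations can keep
// request-local state such as a cache-miss counter.
trait ChangesetFetcherFactory {
    fn create_fetcher(&self) -> Box<dyn ChangesetFetcher>;
}
```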
Reviewed By: farnz
Differential Revision: D9869659
fbshipit-source-id: 9e3ace3188b3c13f83ef1bd61b668d4f22103f74
Summary:
WIP: Mononoke API download for LFS.
Supports GET requests of the form:
curl http://127.0.0.1:8000/{repo_name}/lfs/download/{sha256}
Reviewed By: StanislavGlebik
Differential Revision: D9850413
fbshipit-source-id: 4d756679716893b2b9c8ee877433cd443df52285
Summary:
Let's check that new case conflicts are not added by a commit.
This diff also fixes the check_case_conflict_in_manifest function: it needs to
take into account that if one of the conflicting files was removed, then there
is no case conflict.
There should be a way to disable this check, because we sometimes need to allow
broken commits, for example during blobimport.
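A sketch of the corrected check with paths as plain strings; the point is that removals must be applied before new paths are tested, which is the fix described above:
```
use std::collections::HashMap;

// Map from lowercased path to the path as it actually exists in the
// manifest. Apply removals first: a file deleted by the commit can no
// longer conflict with a newly added one.
fn check_case_conflicts(
    lowercased: &mut HashMap<String, String>,
    removed: &[&str],
    added: &[&str],
) -> Result<(), String> {
    for path in removed {
        lowercased.remove(&path.to_lowercase());
    }
    for path in added {
        let key = path.to_lowercase();
        if let Some(existing) = lowercased.get(&key) {
            if existing.as_str() != *path {
                return Err(format!("case conflict: {} vs {}", path, existing));
            }
        }
        lowercased.insert(key, path.to_string());
    }
    Ok(())
}
```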
Reviewed By: aslpavel
Differential Revision: D9789809
fbshipit-source-id: ca09ee2d3e5340876a8dbf57d13e5135344d1d36
Summary: Use `ChangesetId` in `DifferenceOfUnionsOfAncestorsNodeStream` instead of `HgNodeHash`. This avoids several bonsai lookups of parent nodes.
Reviewed By: StanislavGlebik
Differential Revision: D9631341
fbshipit-source-id: 1d1be7857bf4e84f9bf5ded70c28ede9fd3a2663
Summary:
An additional 2-step reference for a blob:
For each file, add an additional blob with:
key = aliases.sha256.sha256(raw_file_contents)
value = blob_key
Note that the SHA-256 hash is taken from the raw file content, not from the blob content.
The additional blob is sent together with the file content blob.
Reviewed By: lukaspiatkowski, StanislavGlebik
Differential Revision: D9775509
fbshipit-source-id: 4cc997ca5903d0a991fa0310363d6af929f8bbe7
Summary:
In `fetch_file_contents()`, `blobstore_bytes.into()` converted the bytes to
`Blob<Id>`. This code calls `MononokeId::from_data()`, which does blake2
hashing. It turns out this causes big problems for the many large files that
getfiles can return.
Since this hash is not used at all, let's avoid generating it.
Reviewed By: jsgf
Differential Revision: D9786549
fbshipit-source-id: 65de6f82c1671ed64bdd74b3a2a3b239f27c9f17
Summary:
Profiling showed that because we insert objects into the blobstore
sequentially, it takes a lot of time for long stacks of commits. Let's do it in
parallel.
Note that we still insert sequentially into the changesets table.
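A sketch with today's `futures` 0.3 combinators (the code at the time used futures 0.1); `put_blob` is a made-up stand-in for the real blobstore call:
```
use futures::stream::{self, StreamExt, TryStreamExt};

// Hypothetical single-blob upload.
async fn put_blob(_key: String, _data: Vec<u8>) -> Result<(), String> {
    Ok(())
}

// Upload blobs with bounded concurrency instead of one after another; the
// changesets-table inserts stay sequential, as noted above.
async fn upload_all(blobs: Vec<(String, Vec<u8>)>) -> Result<(), String> {
    stream::iter(blobs)
        .map(|(key, data)| put_blob(key, data))
        .buffer_unordered(100) // parallelism limit
        .try_collect::<Vec<()>>()
        .await?;
    Ok(())
}
```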
Reviewed By: farnz
Differential Revision: D9683037
fbshipit-source-id: 8f9496b97eaf265d9991b94243f0f14133f463da
Summary:
Use .chain_err() where appropriate to give context to errors coming up from
below. This requires the outer errors to be proper Fail-implementing errors (or
failure::Error), so leave the string wrappers as Context.
Reviewed By: lukaspiatkowski
Differential Revision: D9439058
fbshipit-source-id: 58e08e6b046268332079905cb456ab3e43f5bfcd
Summary:
This diff fills in the missing parts of the push-rebase implementation:
- `find_closest_root` - find the closest root to the specified bookmark
- `find_changed_files` - find the files affected by changesets between the provided `ancestor` and `descendant`
- `intersect_changed_files` - reject the pushrebase if any conflicts are found (sketched below)
- `create_rebased_changes` - support for merges
- `do_pushrebase` - return the updated bookmark value
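A sketch of the conflict rejection at the heart of the flow: the files the client's stack touches are intersected with the files changed on the server since the common root, and any overlap aborts the pushrebase. Paths are plain strings here, unlike the real path types:
```
use std::collections::HashSet;

struct ConflictError(Vec<String>);

// `client` = files changed between the closest root and the pushed head;
// `server` = files changed between that root and the bookmark being
// pushrebased onto. Any overlap means the rebase cannot be done safely.
fn intersect_changed_files(
    client: &HashSet<String>,
    server: &HashSet<String>,
) -> Result<(), ConflictError> {
    let conflicts: Vec<String> = client.intersection(server).cloned().collect();
    if conflicts.is_empty() {
        Ok(())
    } else {
        Err(ConflictError(conflicts))
    }
}
```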
Reviewed By: StanislavGlebik
Differential Revision: D9458416
fbshipit-source-id: c0cb53773eba6e966f1a5928c43ebdec761a78d3
Summary: This commit changes `HgBlob` from an enum into a struct that contains a single Bytes field, completely removes `HgBlobHash`, and changes the methods of `HgBlob` from returning `Option`s to returning results directly.
Reviewed By: farnz
Differential Revision: D9317851
fbshipit-source-id: 48030a621874d628602b1c5d3327e635d721facf
Summary: Since this data is specific to TimedStream and not TimedFuture, I split the Stats struct into FutureStats and StreamStats.
Reviewed By: StanislavGlebik
Differential Revision: D9355421
fbshipit-source-id: cc2055706574756e2e53f3ccc57abfc50c3a02ba
Summary:
Revsets must use ChangesetId, not HgNodeHash. I'm going to use
`RangeNodeStream` in pushrebase, so I thought it was a good time to change it.
Reviewed By: farnz
Differential Revision: D9338827
fbshipit-source-id: 50bbe8f73dba3526d70d3f816ddd93507db99be5
Summary: Store where it was copied from
Reviewed By: farnz
Differential Revision: D9132560
fbshipit-source-id: a7a73e1f3de08340f5add5fffa32dd0373eb27fa
Summary:
The issues were fixed, and thrift Manifold also works better during bulk
blobimport.
Reviewed By: farnz
Differential Revision: D9132384
fbshipit-source-id: ab4a04eeff86bb4968b80af00c404fad710db183
Summary: - Make `Bookmarks` work with `ChangesetId` instead of `HgChangesetId`
Reviewed By: StanislavGlebik, farnz
Differential Revision: D9297139
fbshipit-source-id: e18661793d144669354e509271044410caa3502a
Summary: As a bonus, this diff also unifies the linknode family of methods (they all now accept arguments by reference) and adds better tracing for the get_files request.
Reviewed By: StanislavGlebik
Differential Revision: D9031283
fbshipit-source-id: 4526a8446984904bce7d4dcef240088c7f2ffaa3
Summary:
Asyncmemo has two issues for our use:
1. It has a memory pool separate from the cachelib caches.
2. Future fusion means that a `get` that should succeed will fail because there
was an earlier get still in progress.
The second is acceptable for memoization, where the worst case of a failed get is
extra CPU work, but not so good for caching. Replace uses of Asyncmemo for
caches with a cachelib-based cache.
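A sketch of the cache-through shape that replaces it, with the cache reduced to a map (cachelib's real API differs); the point is that a failed fill caches nothing, so a later `get` retries instead of inheriting the earlier failure:
```
use std::collections::HashMap;

fn get_or_fill<F>(
    cache: &mut HashMap<String, Vec<u8>>,
    key: &str,
    fill: F,
) -> Result<Vec<u8>, String>
where
    F: FnOnce() -> Result<Vec<u8>, String>,
{
    if let Some(hit) = cache.get(key) {
        return Ok(hit.clone()); // served from the cachelib-style pool
    }
    // On Err nothing is inserted, so the next caller retries the fill
    // rather than observing a failure fused from an in-flight get.
    let value = fill()?;
    cache.insert(key.to_owned(), value.clone());
    Ok(value)
}
```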
Reviewed By: StanislavGlebik
Differential Revision: D9013679
fbshipit-source-id: b85d4eec7294e0c8ee08faa671d26901b35cf1fc
Summary:
Alas, the diff is huge. One part is changing Changesets to use ChangesetId.
This is actually quite straightforward, but in order to do it we need to
adapt our test fixtures to also use bonsai changesets. Modifying the existing
test fixtures to work with bonsai changesets is very tricky. Besides, the
existing test fixtures are a big pile of tech debt anyway, so I used this
chance to get rid of them.
Now the test fixtures use the `generate_new_fixtures` binary to generate actual
Rust code that creates a BlobRepo. This Rust code creates a bonsai changeset,
which is converted to an hg changeset later.
In many cases this results in the same hg hashes as the old test fixtures.
However, there are a couple of cases where the hashes differ:
1) In the case of a merge we generate different hashes because of a different
changed-file list (lukaspiatkowski, aslpavel, is this expected?). This is the case
for test fixtures like merge_even, merge_uneven and so on.
2) The old test fixtures used flat manifest hashes, while the new test fixtures
are treemanifest-only.
Reviewed By: jsgf
Differential Revision: D9132296
fbshipit-source-id: 5c4effd8d56dfc0bca13c924683c19665e7bed31
Summary: To make them more explicit
Reviewed By: matthewdippel
Differential Revision: D9132294
fbshipit-source-id: a365d7b58ba095d11fb0570e5ab6994a158873b3
Summary:
This is a split from D8893504. It just enables the functionality to create bonsai changesets.
The split was done so that I could land the biggest chunk of the work.
Reviewed By: farnz
Differential Revision: D9081430
fbshipit-source-id: 7437c7789998f5691afe83d5b16a8f2c5faac8b4
Summary:
It was using the manifest hash as the parent hash. One more reason to fix all
the types in Mononoke.
Reviewed By: aslpavel
Differential Revision: D9123653
fbshipit-source-id: 0841f7ac64e50e9234d80040b7f286930af53420