Summary:
add backsyncing to rewrite file paths:
After setting the variables for the large repo (D23294833 (d6895d837d)), we try to import the git commits into the large repo and rewrite the file paths.
Following this, the repo import tool should back-sync the commits into small_repo.
Next step: derive all the data types for both the small and large repos. Currently, we only derive them for the large repo.
==============
remove backup file:
The backup file was a last-minute addition when trying to import a repo for the first time.
Removed it, because we shouldn't write to external files. The future plan is to add
better process recoverability across the whole tool, not just the rewrite file paths functionality.
Reviewed By: StanislavGlebik
Differential Revision: D23452571
fbshipit-source-id: bda39694fa34788218be795319dbbfd014ba85ff
Summary:
That's one of the sev followups. Before redacting a file content, let's check if
it exists in "main-bookmark" (which by default is master), and refuse to redact
if it actually exists.
If this check passes (i.e. the content we are about to redact is not reachable
from master), that doesn't mean that we are 100% safe. E.g. this content can be
in an ancestor of master, or in any other repo, or it can be added in the next
commit.
This check is a best-effort check to prevent shooting ourselves in the foot.
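As a rough illustration, here is a minimal self-contained sketch of that best-effort check (the names and types are illustrative stand-ins, not the real Mononoke redaction API):

```
use std::collections::HashSet;

// Refuse to redact any content key that is still reachable from the head of
// the main bookmark. The caller is assumed to have computed the reachable set.
fn check_redaction_safe(
    keys_to_redact: &HashSet<String>,
    reachable_from_main_bookmark: &HashSet<String>,
    main_bookmark: &str,
) -> Result<(), String> {
    let still_reachable: Vec<&String> = keys_to_redact
        .intersection(reachable_from_main_bookmark)
        .collect();
    if still_reachable.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "refusing to redact {} key(s) still reachable from {}",
            still_reachable.len(),
            main_bookmark
        ))
    }
}
```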
Reviewed By: aslpavel
Differential Revision: D23476278
fbshipit-source-id: 5a4cd10964a65b8503ba9a6391f17319f0ce37d8
Summary:
If we exceed a rate limit, we probably don't want to just drop 100% of traffic.
This would create a sawtooth pattern where we allow a bunch of traffic, update
our counters, drop a bunch of traffic, update our counters again, allow a bunch
of traffic, etc.
To fix this, let's make limits probabilistic. This lets us say "beyond X GB/s,
drop Y% of traffic", which is closer to a sane rate limit.
It might also make sense to eventually change this to use ratelim. Initially,
we didn't do this because we needed our rate limiting decisions to be local to
a single host (because different hosts served different traffic), but now that
we spread the load for popular blobs across the whole tier, we should be able
to just delegate to ratelim.
For now, however, let's finish this bit of functionality so we can turn it
on.
The corresponding Configerator change is here: D23472683
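As a sketch of what "beyond X GB/s, drop Y% of traffic" looks like in practice (names and parameters are illustrative, not the real limiter code):

```
// Probabilistic limit: above the threshold we drop only a configured fraction
// of requests, instead of dropping everything and creating a sawtooth pattern.
fn should_drop(
    observed_bytes_per_sec: f64,
    threshold_bytes_per_sec: f64,
    drop_fraction: f64, // e.g. 0.2 to drop 20% of traffic beyond the threshold
    random_unit: f64,   // uniform sample in [0, 1)
) -> bool {
    observed_bytes_per_sec > threshold_bytes_per_sec && random_unit < drop_fraction
}

fn main() {
    // Example: 12 GB/s observed against a 10 GB/s threshold, dropping 20% of traffic.
    let drop = should_drop(12e9, 10e9, 0.2, 0.07);
    println!("drop this request: {}", drop); // true here, since 0.07 < 0.2
}
```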
Reviewed By: aslpavel
Differential Revision: D23472945
fbshipit-source-id: f7d985fded3cdbbcea3bc8cef405224ff5426a25
Summary:
Mononoke throws an error if we request the nullid. In the long term we
want to get rid of the concept of the nullid entirely, so let's just add some
Python level blocks to prevent us from attempting to fetch it. This way we can
start to limit how much Rust has to know about these concepts.
Reviewed By: sfilipco
Differential Revision: D23332359
fbshipit-source-id: 8a67703ba1197ead00d4984411f7ae0325612605
Summary:
Without the `--noproxy localhost` flag, curl will obey the `https_proxy` env
variable but will not respect the `no_proxy` env variable or `curlrc`.
This means that tests running in a shell with `https_proxy` set will likely fail.
The failures may manifest differently depending on what logic is running at the time.
Reviewed By: kulshrax
Differential Revision: D23360744
fbshipit-source-id: 0383a141e848bd2257438697e699f727d79dd5d2
Summary:
If a service is configured with no permitted paths, ensure we deny any writes
that might affect any path. This is not hugely useful, and probably means a
configuration error, but it's the safe choice.
In a similar vein, if a service is permitted to modify any path, there's not
much point in checking all the commits, so skip the path checks to save some
time.
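A minimal self-contained sketch of that logic, with illustrative types rather than the real config structs:

```
// A service's permitted paths: either everything, or a list of path prefixes.
enum PermittedPaths {
    All,
    Prefixes(Vec<String>),
}

fn writes_allowed(permitted: &PermittedPaths, touched_paths: &[String]) -> bool {
    match permitted {
        // Any path is allowed: no point checking every commit, skip the path checks.
        PermittedPaths::All => true,
        // No permitted paths configured: deny any write that touches any path.
        PermittedPaths::Prefixes(prefixes) if prefixes.is_empty() => touched_paths.is_empty(),
        // Otherwise every touched path must match one of the permitted prefixes.
        PermittedPaths::Prefixes(prefixes) => touched_paths
            .iter()
            .all(|path| prefixes.iter().any(|prefix| path.starts_with(prefix.as_str()))),
    }
}
```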
Reviewed By: StanislavGlebik
Differential Revision: D23316392
fbshipit-source-id: 3d9bf034ce496540ddc4468b7128657e446059c6
Summary:
This tests creating, moving and deleting bookmarks using the source control
service, making sure that hooks and service write restrictions are applied
appropriately.
Reviewed By: StanislavGlebik
Differential Revision: D23287999
fbshipit-source-id: bd7e66ec3668400a617f496611e4f24f33f8083e
Summary:
The error message for fast-forward failure is wrong. The correct way to allow
non-fast-forward moves is with the NON_FAST_FORWARD pushvar.
Reviewed By: StanislavGlebik
Differential Revision: D23243542
fbshipit-source-id: 554cdee078cd712f17441bd10bd7968b0674bbfe
Summary:
When bookmarks are moved or created, work out what additional changesets
should have the hooks run on them. This may apply to plain pushes,
force pushrebases, or bookmark only pushrebases.
At first, this will run in logging-only mode where we will count how many
changesets would have hooks run on them (up to a tunable limit). We can
enable running of hooks with a tunable killswitch later on.
Reviewed By: StanislavGlebik
Differential Revision: D23194240
fbshipit-source-id: 8031fdc1634168308c7fe2ad3c22ae4389a04711
Summary:
Move the running of hooks from `repo_client` to `bookmarks_movement`.
For pushrebase and plain push we still only run hooks on the new commits the client has sent.
Bookmark-only pushrebases, or moves where some commits were already known, do not run
the hooks on the omitted changesets. That will be addressed next.
The push-redirector currently runs hooks on the large repo. Since hook running has now been moved
later in the process, hooks will automatically be run on the large repo, so the push-redirector instead runs them on
the small repo, to ensure they are run on both.
There's some additional complication with translating hook rejections in the push-redirector. Since a
bookmark-only push can result in hook rejections for commits that are not translated, we fall back to
using the large-repo commit hash in those scenarios.
Reviewed By: StanislavGlebik
Differential Revision: D23077551
fbshipit-source-id: 07f66a96eaca4df08fc534e335e6d9f6b028730d
Summary:
If the imported commit has a manifest id of all zeros (an empty commit), the blobimport job can't find it in the blobstore and returns an error (D23266254).
Add an early return when the manifest_id is NULL_HASH.
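A minimal sketch of the early return (the all-zeros constant and function names are illustrative):

```
const NULL_HASH: [u8; 20] = [0u8; 20];

// Returns the manifest entries, or None where the real job would have to
// consult the blobstore (the lookup itself is elided in this sketch).
fn manifest_entries(manifest_id: [u8; 20]) -> Option<Vec<String>> {
    if manifest_id == NULL_HASH {
        // Early return: an empty commit has no manifest blob, so there is
        // nothing to look for in the blobstore.
        return Some(Vec::new());
    }
    None // placeholder for the real blobstore lookup
}
```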
Reviewed By: StanislavGlebik
Differential Revision: D23266254
fbshipit-source-id: b8a3c47edfdfdc9d8cc8ea032fb96e27a04ef911
Summary:
This largely moves connection accepting from old style bytes, futures and tokio
to updated versions, while keeping some parts at old bytes/futures in order to
remain compatible with the rest of the Mononoke codebase.
The division lies at `Stdio`, which maintains the old channels, streams and futures,
while the socket handling, connection acceptance and wire encoding are updated.
With the updated futures, we now wait for the forwarding stream to have
succeeded before considering a connection fully handled.
Other notable changes:
- futures_ext now provides a mini codec Decoder instead of relying on NetstringDecoder,
which has been updated to use bytes 0.5
- hgcli has been modified to use updated NetstringDecoder
- netstring now requires the updated bytes 0.5 crate
- the part of connection_acceptor that was handling repo/security logic is now part of repo_handler (as it should have been); connection_acceptor now only handles networking and framing
- tests now verify that the shutdown handler is triggered
Reviewed By: krallin
Differential Revision: D22526867
fbshipit-source-id: 34e43af4a0c8b84de0000f2093d7fffd3fb0e20d
Summary: Some repos are push-redirected: pushes go to another repo, but are then synced back into this repo. Because of this, when we import a repo into a smaller repo that push-redirects to a large repo, we need to make sure we don't break the large repo with the imported code, since merges, pushes, imports etc. are redirected to the large repo. For now, in order to avoid breaking the large repo, we added a simple check that returns an error if the small repo push-redirects to the large one.
Reviewed By: ikostia
Differential Revision: D23158826
fbshipit-source-id: f722790441d641f67293e78c5d1ea5d1102bbb9b
Summary: When running manual scrub for a large repo with one empty store, we are doing one peek per key. For keys that have existed for some time this is unnecessary, as we know the key should exist, and it slows down the scrub.
Reviewed By: farnz
Differential Revision: D23054582
fbshipit-source-id: d2222350157ca37aa31b7792214af4446129c692
Summary: Refactor control of movement of non-scratch bookmarks through pushrebase.
Reviewed By: krallin
Differential Revision: D22920694
fbshipit-source-id: 347777045b4995b69973118781511686cf34bdba
Summary:
Refactor control of movement of non-scratch bookmarks through force-pushrebase
or bookmark-only pushrebase. These are equivalent to ordinary pushes, and so
can use the same code path for moving the bookmarks.
This has the side-effect of enabling some patterns that were previously not
possible, like populating git mappings with a force-pushrebase.
Reviewed By: ikostia
Differential Revision: D22844828
fbshipit-source-id: 4ef71fa4cef69cc2f1d124837631e8304644ca06
Summary: Refactor control of movement of non-scratch bookmarks through plain pushes.
Reviewed By: krallin
Differential Revision: D22844829
fbshipit-source-id: 2f1a89e1d0f69880f74b7bc135144bfb305a918e
Summary:
Refactor control of movement of scratch bookmarks to a new `bookmark_movement` crate
that will contain all bookmark movement controls.
Reviewed By: krallin
Differential Revision: D22844830
fbshipit-source-id: 56d25ad45a9328eaa079c13466b4b802f033d1dd
Summary:
The primary change is in `eden/scm/lib/edenapi/types`:
* Split `DataEntry` into `FileEntry` and `TreeEntry`.
* Split `DataError` into `FileError` and `TreeError`. Remove `Redacted` error variant from `TreeError` and `MaybeHybridManifest` error variant from `FileError`.
* Split `DataRequest`, `DataResponse` into appropriate File and Tree types.
* Refactor `data.rs` into `file.rs` and `tree.rs`.
* Lift `InvalidHgId` error, used by both File and Tree, into `lib.rs`.
* Bugfix: change `MaybeHybridManifest` to be returned only for hash mismatches with empty paths, to match documented behavior.
Most of the remaining changes are straightforward fallout of this split. Notable changes include:
* `eden/scm/lib/edenapi/tools/read_res`: I've split the "data" commands into "file" and "tree", but I've left the identical arguments sharing the same argument structs. These can be refactored later if / when they diverge.
* `eden/scm/lib/types/src/hgid.rs`: Moved `compute_hgid` from `eden/scm/lib/edenapi/types/src/data.rs` into a new `from_content` constructor on the `HgId` struct.
* `eden/scm/lib/revisionstore/src/datastore.rs`: Split `add_entry` method on `HgIdMutableDeltaStore` trait into `add_file` and `add_tree` methods.
* `eden/scm/lib/revisionstore/src/edenapi`
* `mod.rs`: Split `prefetch` method on `EdenApiStoreKind` into `prefetch_files` and `prefetch_trees`, which are given a default implementation that fails with `unimplemented!`.
* `data.rs`: Replace blanket trait implementations for `EdenApiDataStore<T>` with specific implementations for `EdenApiDataStore<File>` and `EdenApiDataStore<Tree>` which call the appropriate fetch and add functions.
* `data.rs` `test_get_*`: Replace dummy hashes with real hashes. These tests were only passing because the hash mismatches were (incorrectly) being considered `MaybeHybridManifest` errors and allowed to pass.
Reviewed By: kulshrax
Differential Revision: D22958373
fbshipit-source-id: 788baaad4d9be20686d527f819a7342678740bc3
Summary:
After creating the merge commit (D23028163 (f267bec3f7)) from the imported commit head and the destination bookmark's head, we need to push the commit onto that bookmark. This diff adds the push functionality to the repo_import tool.
Note: GlobalrevPushrebaseHook is a pushrebase hook that assigns globalrevs to commits in order to preserve the ordering of the commits.
Reviewed By: StanislavGlebik
Differential Revision: D23072966
fbshipit-source-id: ff815467ed0f96de86da3de9a628fd45743eb167
Summary:
Once we have revealed the commits to the user (D22864223 (578207d0dc), D22762800 (f1ef619284)), we need to merge the imported branch into the destination branch (specified by dest-bookmark). To do this, we extract the latest commit of the destination branch, then compare the two commits to see if we have merge conflicts. If we have merge conflicts, we inform the user so they can resolve them. Otherwise, we create a new bonsai with the two commits as parents.
Next step: pushrebase the merge commit
Minor refactor: moved app setup to a separate file for better readability.
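A minimal self-contained sketch of the merge step described above (the types are illustrative stand-ins for bonsai changesets, not the real repo_import code):

```
#[derive(Debug)]
struct MergeCommit {
    parents: Vec<String>,
    message: String,
}

// Build a merge of the imported head and the destination bookmark's head,
// refusing when the two sides would conflict so the user can resolve it.
fn create_merge_commit(
    imported_head: &str,
    dest_head: &str,
    conflicting_paths: &[String], // paths changed differently on both sides
) -> Result<MergeCommit, String> {
    if !conflicting_paths.is_empty() {
        return Err(format!(
            "merge conflicts between {} and {} on: {}",
            imported_head,
            dest_head,
            conflicting_paths.join(", ")
        ));
    }
    Ok(MergeCommit {
        parents: vec![imported_head.to_string(), dest_head.to_string()],
        message: format!("merging imported commits into {}", dest_head),
    })
}
```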
Reviewed By: StanislavGlebik
Differential Revision: D23028163
fbshipit-source-id: 7f3e2a67dc089e6bbacbe71b5e4ef5f6eed2a9e1
Summary:
Matches the `getcommitdata` SSH endpoint.
This is going to be used to remove the requirement that client repositories
need to have all commits locally.
Reviewed By: krallin
Differential Revision: D22979458
fbshipit-source-id: 75d7265daf4e51d3b32d76aeac12207f553f8f61
Summary:
Added more logs when running the binary to be able to track the progress more easily.
Saved bonsai hashes into a file. In case we fail at deriving data types, we can still try to derive them manually with the saved hashes and avoid running the whole tool again.
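A minimal sketch of the recovery aid (the file name and hashes are placeholders):

```
use std::{fs, io};

// Persist the bonsai hashes of the imported commits so derivation can be
// retried manually later without re-running the whole tool.
fn save_bonsai_hashes(hashes: &[String], path: &str) -> io::Result<()> {
    fs::write(path, hashes.join("\n"))
}

fn main() -> io::Result<()> {
    let hashes = vec![
        "b1a2c3...".to_string(), // placeholder bonsai hashes
        "d4e5f6...".to_string(),
    ];
    save_bonsai_hashes(&hashes, "imported_bonsai_hashes.txt")
}
```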
Reviewed By: StanislavGlebik
Differential Revision: D22943309
fbshipit-source-id: e03a74207d76823f6a2a3d92a1e31929a39f39a5
Summary:
Argument names should be `snake_case`. Long options should be `--kebab-case`.
Retain the old long options as aliases for compatibility.
Reviewed By: HarveyHunt
Differential Revision: D22945600
fbshipit-source-id: a290b3dc4d9908eb61b2f597f101b4abaf3a1c13
Summary:
Add `--log-interval` to log every N commits, so that it can be seen to be
making progress in the logs.
The default is set to 500, which logs about once every 10 seconds on my devserver.
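A minimal sketch of the interval-based progress logging (names are illustrative):

```
// Log a progress line every `log_interval` commits (default 500) so that
// long-running walks visibly make progress in the logs.
fn process_commits(commit_ids: &[u64], log_interval: usize) {
    let interval = log_interval.max(1);
    for (i, _commit) in commit_ids.iter().enumerate() {
        // ... process the commit here ...
        if (i + 1) % interval == 0 {
            println!("processed {} out of {} commits", i + 1, commit_ids.len());
        }
    }
    println!("finished processing {} commits", commit_ids.len());
}
```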
Reviewed By: HarveyHunt
Differential Revision: D22945599
fbshipit-source-id: 7fc09b907793ea637289c9018958013d979d6809
Summary: Update all the remaining steps in the walker to use the new early checks, so as to prune unnecessary edges earlier in the walk.
Reviewed By: farnz
Differential Revision: D22847412
fbshipit-source-id: 78c499a1870f97df7b641ee828fb8ec58303ebef
Summary:
This tool can be used in tandem with the pre_merge_delete tool to merge one large
repository into another in a controlled manner - the size of the working copy
will be increased gradually.
Reviewed By: ikostia
Differential Revision: D22894575
fbshipit-source-id: 0055d3e080c05f870cfd0026174365813b0eb253
Summary:
In several places in `library.sh` we had `--mononoke-config-path
mononoke-config`. This meant that we could not run such commands from
non-`$TESTTMP` directories. Let's fix that.
Reviewed By: StanislavGlebik
Differential Revision: D22901668
fbshipit-source-id: 657bce27ce6aee8a88efb550adc2ee5169d103fa
Summary:
This adds a CLI for the functionality added in the previous diff. In addition, this adds an integration test, which tests this deletion functionality.
The output of this tool is meant to be stored in a file. It simulates a simple DAG, and it should be fairly easy to automatically parse the "to-merge" commits out of this output. In theory, it could have been enough to just print the "to-merge" commits alone, but it may sometimes be convenient to quickly examine the delete commits.
Reviewed By: StanislavGlebik
Differential Revision: D22866930
fbshipit-source-id: 572b754225218d2889a3859bcb07900089b34e1c
Summary:
Related diff: D22816538 (3abc4312af)
In the repo_import tool, once we move a bookmark to reveal commits to users, we want to check if hg_sync has received the commits. To do this, we extract the largest log id from bookmarks_update_log and compare it with the mutable_counters value related to hg_sync. If the counter value is larger than or equal to the log id, we can move the bookmark to the next batch of commits. Otherwise, we sleep, retry fetching the mutable_counters value and compare the two again.
mutable_counters is an SQL table that can track bookmark update log entries with a counter.
This diff adds the functionality to extract the mutable_counters value for hg_sync.
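A minimal self-contained sketch of that wait loop; the closures stand in for the real queries against bookmarks_update_log and mutable_counters:

```
use std::{thread, time::Duration};

// Proceed once the hg_sync counter has caught up with the largest bookmark
// update log id; otherwise sleep and retry, up to max_attempts.
fn wait_for_hg_sync(
    largest_log_id: impl Fn() -> u64,
    hg_sync_counter: impl Fn() -> Option<u64>,
    sleep: Duration,
    max_attempts: usize,
) -> bool {
    let target = largest_log_id();
    for _ in 0..max_attempts {
        if let Some(counter) = hg_sync_counter() {
            if counter >= target {
                return true; // hg_sync has replayed this bookmark move
            }
        }
        thread::sleep(sleep);
    }
    false // give up; the caller decides what to do next
}
```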
======================
SQL query fix:
In the previous diff (D22816538 (3abc4312af)) we didn't cover the case where we might not get an ID, which should return None. This diff fixes that.
Reviewed By: StanislavGlebik
Differential Revision: D22864223
fbshipit-source-id: f3690263b4eebfe151e50b01a13b0193009e3bfa
Summary:
This is expected to fix flakiness in test-walker-corpus.t.
The problem was that if a FileContent node was reached via an Fsnode, it did not have a path associated. This is a race condition that I've not managed to reproduce locally, but I think it is highly likely to be the reason for the flaky failures on CI.
Reviewed By: ikostia
Differential Revision: D22866956
fbshipit-source-id: ef10d92a8a93f57c3bf94b3ba16a954bf255e907
Summary:
There have been lots of issues with the user experience around authentication
and its help messages.
Just one of them:
certs are configured to be used for authentication but they are invalid, yet the `hg cloud auth`
command will provide a help message about the certs and then ask to copy and
paste a token from the interactive token-obtaining flow.
Another one: if certs are configured to be used, there was no easy way to
set up a token for the Scm Daemon, which can still be on tokens even if cloud
sync uses certs.
Now it is possible with the `hg auth -t <token>` command.
Now everything should be cleaner and all the messages should be clearer as well.
The certs-related help message has also been improved.
Also, authentication setup was removed from all tests except for the main
test, to simplify the tests.
Reviewed By: mitrandir77
Differential Revision: D22866731
fbshipit-source-id: 61dd4bffa6fcba39107be743fb155be0970c4266
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/40
Those tools are being used in some integration tests, make them public so that the tests might pass
Reviewed By: ikostia
Differential Revision: D22844813
fbshipit-source-id: 7b7f379c31a5b630c6ed48215e2791319e1c48d9
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/41
As of D22098359 (7f1588131b) the default locale used by integration tests is en_US.UTF-8, but as the comment in the code mentions:
```
The en_US.UTF-8 locale doesn't behave the same on all systems and trying to run
commands like "sed" or "tr" on non-utf8 data will result in "Illegal byte
sequence" error.
That is why we are forcing the "C" locale.
```
Additionally, I've changed the test-walker-throttle.t test to use "/bin/date" directly. Previously it was using "/usr/bin/date", but "/bin/date" is a more standard path, as it also works on macOS.
Reviewed By: krallin
Differential Revision: D22865007
fbshipit-source-id: afd1346e1753df84bcfc4cf88651813c06933f79
Summary: It fails now for an unknown reason; will work on it later.
Reviewed By: mitrandir77, ikostia
Differential Revision: D22865324
fbshipit-source-id: c0513bfa2ce9f6baffebff472053e8a5d889c9ba
Summary:
This is the (almost) final diff to introduce WarmBookmarksCache in repo_client.
A lot of this code is to pass through the config value, but a few things I'd
like to point out:
1) Warm bookmark cache is enabled from config, but it can be killswitched using
a tunable.
2) WarmBookmarksCache in scs derives all derived data, but for repo_client I
decided to derive just hg changeset. The main motivation is to not change the
current behaviour, and to make mononoke server more resilient to failures in
other derived data types.
3) Note that WarmBookmarksCache doesn't obsolete SessionBookmarksCache that was
introduced earlier, but rather it complements it. If WarmBookmarksCache is
enabled, then SessionBookmarksCache reads the bookmarks from it and not from
db.
4) There's one exception to point #3 - if we just did a push then we read
bookmarks from db rather than from bookmarks cache (see
update_publishing_bookmarks_after_push() method). This is done intentionally -
after push is finished we want to return the latest updated bookmarks to the
client (because the client has just moved a bookmark after all!).
I'd argue that the current code is a bit sketchy already - it doesn't read from
master but from replica, which means we could still see outdated bookmarks.
Reviewed By: krallin
Differential Revision: D22820879
fbshipit-source-id: 64a0aa0311edf17ad4cb548993d1d841aa320958
Summary:
The debug print abuses the `linkmapper`. The Rust commit add logic does not
use `linkmapper`. So let's remove the debug message to be consistent with
the Rust logic.
Reviewed By: DurhamG
Differential Revision: D22657189
fbshipit-source-id: 2e92087dbb5bfce2f00711dcd62881aba64b0279
Summary: When a TLS connection fails due to a missing client certificate, the `curl` command may fail with either code 35 or 56 depending on the TLS version used. With TLS v1.3, the error is explicitly reported as a missing client certificate, whereas in TLS v1.2, it is reported as a generic handshake failure. This is because TLS v1.3 defines an explicit [`certificate_required`](https://tools.ietf.org/html/rfc8446#section-4.4.2.4) alert, which is [not present](https://github.com/openssl/openssl/issues/6804) in earlier TLS versions.
Reviewed By: krallin
Differential Revision: D22834527
fbshipit-source-id: a15d6a169d35ece6ed5a54b37b8ca9bbc506b3da
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/38
The tool is used in some integration tests, make it public so that the tests might pass
Reviewed By: ikostia
Differential Revision: D22815283
fbshipit-source-id: 76da92afb8f26f61ea4f3fb949044620a57cf5ed
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/37
mononoke_hg_sync_job is used in integration tests, make it public
Reviewed By: krallin
Differential Revision: D22795881
fbshipit-source-id: 7a32c8e8adf723a49922dbb9e7723ab01c011e60
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/36
This command is used in some integration tests, make it public.
Reviewed By: krallin
Differential Revision: D22792846
fbshipit-source-id: 39ac89b1a674ea63dc924cafa07107dbf8e5a098
Summary:
Once we start moving the bookmark across the imported commits (D22598159 (c5e880c239)), we need to check dependent systems to avoid overloading them when parsing the commits. In this diff we added the functionality to check Phabricator. We use an external service (jf graphql - find discussion here: https://fburl.com/nr1f19gs) to fetch commits from Phabricator. Each commit id starts with "r", followed by a call sign (e.g. FBS for fbsource) and the commit hash (https://fburl.com/qa/9pf0vtkk). If we try to fetch an invalid commit id (e.g. one without a call sign), we should receive an error. Otherwise, we should receive a JSON.
An imported commit should have the following query result: https://fburl.com/graphiql/20txxvsn - nodes has one result with the imported field true.
If the commit hasn't been recognised by Phabricator yet, the nodes array will be empty.
If the commit has been recognised, but not yet parsed, the imported field will be false.
If Phabricator hasn't parsed the batch, we will check it again after sleeping for a couple of seconds.
Once it has parsed the batch of commits, we move the bookmark to the next batch.
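A minimal self-contained sketch of that polling loop; the closure stands in for the graphql call, and the status enum mirrors the three outcomes described above:

```
use std::{thread, time::Duration};

#[derive(Clone, Copy, PartialEq)]
enum PhabricatorStatus {
    Unknown,  // nodes array is empty: Phabricator has not recognised the commit yet
    Pending,  // recognised, but the imported field is still false
    Imported, // recognised and parsed
}

// Move the bookmark on only when every commit in the batch has been parsed;
// otherwise sleep and retry, up to max_attempts.
fn wait_for_batch_parsed(
    batch: &[&str],
    check_status: impl Fn(&str) -> PhabricatorStatus,
    sleep: Duration,
    max_attempts: usize,
) -> bool {
    for _ in 0..max_attempts {
        if batch
            .iter()
            .all(|id| check_status(id) == PhabricatorStatus::Imported)
        {
            return true; // safe to move the bookmark to the next batch
        }
        thread::sleep(sleep);
    }
    false
}
```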
Reviewed By: krallin
Differential Revision: D22762800
fbshipit-source-id: 5c02262923524793f364743e3e1b3f46c921db8d