Summary: This will help move more Python logic to Rust.
Reviewed By: sfilipco
Differential Revision: D21854224
fbshipit-source-id: b03cbacedc11d77e8c56262437a8d10bd9a89e59
Summary: This was discovered while using it from Python.
Reviewed By: sfilipco
Differential Revision: D22323186
fbshipit-source-id: 295811e0950b94ad2ad73ad242228b6a3f9765d0
Summary: Adding the same commit multiple times is a no-op.
Reviewed By: sfilipco
Differential Revision: D22323190
fbshipit-source-id: 61a06335581a9cad32dc7e929b841ec69b551a9c
Summary: This adds some test coverage for the revlog DagAlgorithm implementation.
Reviewed By: sfilipco
Differential Revision: D22249157
fbshipit-source-id: a1d347b4d90d0e7f8fb229c317cc75c2b8e16242
Summary:
This makes RevlogIndex compatible with the generic DAG testing API from the dag
crate.
Reviewed By: sfilipco
Differential Revision: D22249156
fbshipit-source-id: 54a3c458e85804968964174eab674e494a6fa8a2
Summary: Some DAG implementations do not support it.
Reviewed By: sfilipco
Differential Revision: D22249158
fbshipit-source-id: ebcdf164677ee647ef44aa1ee3cfd318bac658b0
Summary:
Different implementations might return different orders. All of them should be
considered correct.
Reviewed By: sfilipco
Differential Revision: D22249159
fbshipit-source-id: 36e4cadf814366f7ee2ed8a778948ff810760550
Summary: This makes it possible to run tests for other DAGs, like the revlog.
Reviewed By: sfilipco
Differential Revision: D22249155
fbshipit-source-id: 205579eeaccd42a21297d965973957168bb8726e
Summary:
For revlog, calculating `only` can have some fast paths that do not scan the
entire changelog.
Reviewed By: sfilipco
Differential Revision: D21944136
fbshipit-source-id: 58391636350f8f19643d59c46d663f55861d6de3
Summary:
This will be used to maintain narrow-heads phase calculation and to sunset the
revlog-specific changelog.index2.
Reviewed By: sfilipco
Differential Revision: D21944131
fbshipit-source-id: a8bbd1fd24546f4891ffa677476bff750c3faf5f
Summary: The values of `pending_nodes_index` should start from 0 instead of 1.
Reviewed By: sfilipco
Differential Revision: D21944133
fbshipit-source-id: a2a332868f16b398037289c81bf8076d1400c0a7
Summary:
This drops the `file` parameter from the `raw_data` API, making
RevlogIndex easier to use.
Reviewed By: sfilipco
Differential Revision: D21854228
fbshipit-source-id: 259726524d1cc6a1f9d00783e22f9502c7decdeb
Summary:
The reverse, `to_id_set`, already exists.
It turns out that Python code wants this in many places.
Reviewed By: sfilipco
Differential Revision: D22240175
fbshipit-source-id: b6a3a3a3869dc0c521a21b1d86394421b816632b
Summary:
This provides a way for implementations to optimize the operation.
For segmented changelog, the default implementation is good enough.
For revlog, `only` can have a fast path that does not iterate through the
entire changelog.
A related API `only_both` is added. For revlog it has multiple use-cases,
including narrow-heads phase calculation and revlog.findcommonmissing used by
discovery.
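As a rough illustration of the unoptimized semantics, `only(a, b)` can be read as `ancestors(a) - ancestors(b)`. The sketch below is a Python stand-in on a toy DAG stored as a dict of parents, not the Rust API; the helper names, the `parents` argument, and the assumption that `only_both` additionally returns `ancestors(b)` are illustrative.

```python
def ancestors(parents, heads):
    """Return every commit reachable from `heads` (inclusive) via parent links."""
    seen = set()
    stack = list(heads)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents.get(node, ()))
    return seen


def only(parents, reachable, unreachable):
    """ancestors(reachable) - ancestors(unreachable), computed the slow way."""
    return ancestors(parents, reachable) - ancestors(parents, unreachable)


def only_both(parents, reachable, unreachable):
    """Assumed shape: (only(reachable, unreachable), ancestors(unreachable))."""
    unreachable_ancestors = ancestors(parents, unreachable)
    return (
        ancestors(parents, reachable) - unreachable_ancestors,
        unreachable_ancestors,
    )


# Toy DAG (hypothetical): A <- B <- C, and B <- D.
PARENTS = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}
```

This version walks all ancestors of both sets, which is the full scan a revlog fast path would try to avoid.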
Reviewed By: markbt
Differential Revision: D21944132
fbshipit-source-id: d11660dae85ea6158977eb00d1ceaceddf1d8234
Summary:
Sometimes you want to fetch a file. Using curl and the LFS server works, but
this really should be part of Mononoke admin.
Reviewed By: ikostia
Differential Revision: D22397472
fbshipit-source-id: 17decf4aa2017a2c1be52605a254692f293d1bcd
Summary:
This got broken when we moved to Tokio 0.2. Let's fix it and add a test to make
sure it does not regress.
Reviewed By: ikostia
Differential Revision: D22396261
fbshipit-source-id: a8359aee33b4d6d840581f57f91af6c03125fd6a
Summary: Use `thiserror` to provide a more ergonomic API for `DataEntry` hash checking. The `.data()` method now simply returns a `Result` rather than a tuple with an ad-hoc enum.
Reviewed By: quark-zju
Differential Revision: D22376164
fbshipit-source-id: fc39cb212ec1ee5830292db4aa5eca18f2c16a2b
Summary:
The streaming pager expects bytes as inputs, but we were sending
strings to the progress buffer. This fixes it.
Reviewed By: quark-zju
Differential Revision: D22394395
fbshipit-source-id: 4acbfc08ca624ca3c794e6e369df669e370e5b42
Summary:
The test relies on Python revlog implementation details which
do not exist in the Rust revlog implementation.
Reviewed By: DurhamG
Differential Revision: D22240183
fbshipit-source-id: b245b35e561c3364618a0e199244df030cc47942
Summary:
The original test is unmaintainable. Rewrite it to test key features.
I dropped detailed tests about merge conflict / content handling.
In the future we probably will have a clean Rust implementation of "applying
diff between X and Y to Z" which can replace various unmaintainable patch
application logic in Python. We can test that Rust library extensively and
commands will just use the clean library (ex. revert, backout).
Reviewed By: sfilipco
Differential Revision: D22240184
fbshipit-source-id: 4d6c65fe02ccc92e64c62a48f702187678973086
Summary:
`debugstrip` is an operation that depends on multiple legacy components (revlog
strip, truncate-based transaction). They are incompatible with modern configs
(no truncation, heads-based visibility, metalog-based transaction).
Avoid using it in the test.
Reviewed By: DurhamG
Differential Revision: D22240187
fbshipit-source-id: ec215d75fb766957a3d6f58e491ef815f5bedbdc
Summary:
Changed by `fix-revnum.py`. This is part of the tests that use `hg debugstrip`,
which I'm trying to avoid.
Reviewed By: DurhamG
Differential Revision: D22240181
fbshipit-source-id: a569b712fe4b985378e5c61c000deecccefbc488
Summary:
Tests do not use it, and we have long disabled it in production, so
let's just disable the command unconditionally.
Reviewed By: DurhamG
Differential Revision: D22368834
fbshipit-source-id: 7ebc5b07c4044b6809defc06437cda7256cb2ebf
Summary:
`hg rollback` has long been disabled in production setups. It has weird behavior
and is likely incompatible with modern transaction frameworks. Remove its usage
in tests.
Reviewed By: DurhamG
Differential Revision: D22240180
fbshipit-source-id: 453684ebbc77132e09b1b717b6ad1e106dcad214
Summary:
End-users have been using visibleheads + narrowheads for a while, and hgsql
does not require any filtering, and most tests are migrated to modern configs
(visibility + narrow heads). Now it's time to consider removing the repoview
layer.
This removes complexities around `changelog.filteredrevs` and various different
`repoview` objects with caching problems (ex. I have seen `repo` and `unfi`
have inconsistent phasecaches, causing them to calculate phases differently,
which is quite hard to reason about confidently).
This will also make it easier to migrate to segmented changelog.
Reviewed By: DurhamG
Differential Revision: D22201084
fbshipit-source-id: 3661c26dd72a64b5005d86e164af4da5a6895649
Summary:
This diff adds two new bits of functionality to `LiveCommitSyncConfig`:
- getting all possible versions of `CommitSyncConfig` for a given repo
- getting `CommitSyncConfig` for a repo by version name
These bits are meant to be used in:
- `commit_validator` and `bookmarks_validator`, which would
need to run validation against a specific config version
- `mononoke_admin`, which would need to be able to query all versions,
display the version used to sync two commits and so on
Reviewed By: StanislavGlebik
Differential Revision: D22235381
fbshipit-source-id: 42326fe853b588849bce0185b456a5365f3d8dff
Summary:
For various reasons (ex. wrong configs, such as when investigating test repos)
the initialization can fail. Ignore such failures.
Reviewed By: DurhamG
Differential Revision: D22368942
fbshipit-source-id: ae01dcc499f63f373b0f7bec00554ea8074ae7cf
Summary:
This updates the virtually_sharded_blobstore to deduplicate puts only if the
data being put is actually the data we have put in the past. This is done by
keeping track of the hash of things we've put in the presence cache.
This has 2 benefits:
- This is safer. We only dedupe puts we 100% know succeeded (because this
particular instance was the one to attempt the put).
- This creates fewer surprises: notably, it lets us overwrite data in the
backing store (if we are writing something different).
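A minimal sketch of the idea, assuming a hypothetical in-memory presence cache that maps each key to the hash of the value this instance last put successfully. `DedupingStore`, the dict-like `backing` store, and the use of SHA-256 are illustrative assumptions, not the actual `virtually_sharded_blobstore` code.

```python
import hashlib


class DedupingStore:
    """Toy wrapper that skips a put only when it already put the same bytes."""

    def __init__(self, backing):
        self.backing = backing   # dict-like stand-in for the backing store
        self.presence = {}       # key -> hash of the data we put ourselves
        self.backing_puts = 0    # how many puts actually reached the backing store

    def put(self, key, data):
        """Return True if the put reached the backing store, False if deduplicated."""
        digest = hashlib.sha256(data).digest()
        if self.presence.get(key) == digest:
            # We know this exact data was written by this instance: skip it.
            return False
        self.backing[key] = data
        self.backing_puts += 1
        # Record the hash only after the put went through.
        self.presence[key] = digest
        return True
```

Putting the same bytes twice hits the backing store once, while putting different bytes under the same key goes through and overwrites the stored value.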
Reviewed By: StanislavGlebik
Differential Revision: D22392809
fbshipit-source-id: d76a49baa9a5749b0fb4865ee1fc1aa5016791bc
Summary:
Running those on my devserver, I noticed they can be a bit flaky. They're
racy on purpose, but let's relax them a bit.
We have a lot of margin here: our blobstore is rate limited at one request
every 10ms, and we need to do 100 requests (the goal is to show that they don't
all wait). If every request were rate limited, they would take about 1s in
total, so 100ms is fine to prove that they're not rate limited when sharing
the same data.
Reviewed By: StanislavGlebik
Differential Revision: D22392810
fbshipit-source-id: 2e3c9cdf19b0e4ab979dfc000fbfa8da864c4fd6
Summary:
When we look up how a commit was synced, we frequently need to know which version of `CommitSyncConfig` was used to sync it. Specifically, this is useful for admin tooling and commit validator, which I am planning to migrate to use versioned `CommitSyncConfig` in the near future.
Later I will also include this information in the `RewrittenAs` variant of `CommitSyncOutcome`, so that we expose it to real users. I did not do it in this diff to keep it small and easy to review. And because the other part is not ready :P
Reviewed By: StanislavGlebik
Differential Revision: D22255785
fbshipit-source-id: 4312e9b75e2c5f92ba018ff9ed9149efd3e7b7bc
Summary: When I implemented this method, I didn't test that it preserves the order of the input changesets. I noticed my mistake while testing the scmquery part.
Reviewed By: StanislavGlebik
Differential Revision: D22374981
fbshipit-source-id: 4529f01370798377b27e4b6a706fc192a1ea928e
Summary:
Add the `scsc list-bookmarks` command, which lists bookmarks in a repository.
If a commit id is also provided, `list-bookmarks` will be limited to bookmarks that
point to that commit or one of its descendants.
Reviewed By: mitrandir77
Differential Revision: D22361240
fbshipit-source-id: 17067ba47f9285b8137a567a70a87fadcaabec80
Summary:
There is inevitably interaction between caching, deduplication and rate
limiting:
- You don't want the rate limiting to be above caching (in the blobstore stack,
that is), because you shouldn't rate limit cache hits (this is where we are
today).
- You don't want the rate limiting to be below deduplication, because then you
get priority inversion where a low-priority rate-limited request might hold the
semaphore while a higher-priority, non rate limited request wants to do the
same fetch (we could have moved rate limiting here prior to introducing
deduplication, but I didn't do it earlier because I wanted to eventually
introduce deduplication).
So, now that we have caching and deduplication in the same blobstore, let's
also incorporate rate limiting there!
Note that this also brings a potential motivation for moving Memcache into this
blobstore, in case we don't want rate limiting to apply to requests before they
go to the _actual_ blobstore (I did not do this in this diff).
The design here when accessing the blobstore is as follows:
- Get the semaphore
- Check if the data is in cache, if so release the semaphore and return the
data.
- Otherwise, check if we are rate limited.
Then, if we are rate limited:
- Release the semaphore
- Wait for our turn
- Acquire the semaphore again
- Check the cache again (someone might have put the data we want while we were
waiting).
- If the data is there, then return our rate limit token.
- If the data isn't there, then proceed to query the blobstore.
If we aren't rate limited, then we just proceed to query the blobstore.
There are a couple subtle aspects of this:
- If we have a "late" cache hit (i.e. after we waited for rate limiting), then
we'll have waited but we won't need to query the blobstore.
- This is important when a large number of requests for the same key
arrive at the same time and get rate limited. If we don't do this second
cache check or if we don't return the token, then we'll consume a rate
limiting token for each request (instead of 1 for the first request).
- If a piece of data isn't cacheable, we should treat it like a cache hit with
regard to semaphores (i.e. release early), but like a miss with regard to
rate limits (i.e. wait).
Both of those are captured in the code by returning the `Ticket` on a
cache hit. We can then choose to either return the ticket on a cache hit, or wait
for it on a cache miss.
(All of this logic is covered by unit tests: we can remove any of the blocks
there in `Shards::acquire` and a test will fail.)
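The flow described above can be sketched roughly as follows. This is a single-threaded Python model, not the Rust code: `Limiter`, its `on_wait` hook (which stands in for other requests running while we wait), and the `get` helper are hypothetical names, and a plain `threading.Lock` plays the role of the per-shard semaphore.

```python
import threading


class Limiter:
    """Hypothetical rate limiter: `available` tokens can be taken without waiting."""

    def __init__(self, available, on_wait=None):
        self.available = available
        self.on_wait = on_wait  # hook simulating work done by others while we wait
        self.taken = 0          # tokens currently consumed

    def try_take(self):
        """Take a token if one is available right now (i.e. we are not rate limited)."""
        if self.available > 0:
            self.available -= 1
            self.taken += 1
            return True
        return False

    def wait_take(self):
        """Real code would block until our turn; the sketch only runs the hook."""
        if self.on_wait is not None:
            self.on_wait()
        self.taken += 1

    def give_back(self):
        """Return a token we ended up not needing."""
        self.taken -= 1


def get(key, cache, backing, shard_lock, limiter):
    shard_lock.acquire()                   # get the semaphore
    try:
        if key in cache:                   # early cache hit: no token needed
            return cache[key]
        if not limiter.try_take():         # we are rate limited
            shard_lock.release()           # do not hold the semaphore while waiting
            try:
                limiter.wait_take()        # wait for our turn
            finally:
                shard_lock.acquire()       # acquire the semaphore again
            if key in cache:               # late cache hit
                limiter.give_back()        # return the rate limit token
                return cache[key]
        value = backing[key]               # query the blobstore
        cache[key] = value
        return value
    finally:
        shard_lock.release()
```

With `on_wait` filling the cache during the wait, the late cache hit returns the data and leaves `limiter.taken` at 0, which is the "return our rate limit token" case. Non-cacheable data is not modeled here.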
Reviewed By: farnz
Differential Revision: D22374606
fbshipit-source-id: c3a48805d3cdfed2a885bec8c47c173ee7ebfe2d
Summary:
Sometimes we take a token then realize we don't want it. In this case, giving it back is convenient.
This adds a way to do that!
Reviewed By: farnz
Differential Revision: D22374607
fbshipit-source-id: ccf47e6c75c37d154704645c9e826f514d6f49f6
Summary:
This is a mirror image of a diff, which made backsyncer use `LiveCommitSyncConfig`: we want to use configerator-based live configs, when we run in the continuous tailing mode.
As no-op iteration time used to be 10s and that's a bit wasteful for tests, this diff changes it to be configurable.
Finally, because of instantiating various additional `CommitSyncerArgs` structs, this diff globs out some of the `using repo` logs (which aren't very useful as test signals anyway, IMO).
Reviewed By: StanislavGlebik
Differential Revision: D22209205
fbshipit-source-id: fa46802418a431781593c41ee36f468dee9eefba
Summary: Tidy up some comments in this file.
Reviewed By: ikostia
Differential Revision: D22376165
fbshipit-source-id: ce4760776048aa8e72809b4f828d0ea426fcf878
Summary: This diff actually starts to use the option.
Reviewed By: krallin
Differential Revision: D22373943
fbshipit-source-id: fe23da9c3daa1f9f91a5ee5e368b33e0091aa9c1
Summary:
Previously if a blame request was rejected (e.g. because a file was too large)
then we returned BlameError::Error.
This doesn't look correct, because there's BlameError::Rejected. This diff
makes it so that the fetch_blame function returns BlameError::Rejected.
Reviewed By: aslpavel
Differential Revision: D22373948
fbshipit-source-id: 4859809dc315b8fd66f94016c6bd5156cffd7cc2
Summary:
In the next diffs we'll need to read override_blame_filesize_limit from derived
data config, and this config is stored in BlobRepo.
This diff makes a small refactoring to pass BlobRepo to fetch_full_file_content.
Reviewed By: krallin
Differential Revision: D22373946
fbshipit-source-id: b209abce82c0279d41173b5b25f6761659a92f3d
Summary: This will make adding the blame file size limit override in the next diffs easier.
Reviewed By: krallin
Differential Revision: D22373945
fbshipit-source-id: 4857e43c5d80596340878753ea90bf31d7bb3367
Summary:
We're always yielding zero or one child during traversal, so bounded traversal
is unnecessary here.
Differential Revision: D22242148
fbshipit-source-id: b4c8a1279ef7bd15e9d0b3b2063683f45e30a97a
Summary:
Let's use the new option in the CLI. Unfortunately we can't easily accept commit
ids in named params, so it has to be a positional one.
Differential Revision: D22234412
fbshipit-source-id: a9c27422fa65ae1c42cb1c243c7694507a957437
Summary:
If anything were to go wrong, we'd be happy to know which puts we ignored. So,
let's log them.
Reviewed By: farnz
Differential Revision: D22356714
fbshipit-source-id: 5687bf0fc426421c5f28b99a9004d87c97106695