Commit Graph

6082 Commits

Author SHA1 Message Date
Jun Wu
ea67a2168e hgcommits: new crate for hybrid commit data + dag structure
Summary: This will help move more Python logic to Rust.

Reviewed By: sfilipco

Differential Revision: D21854224

fbshipit-source-id: b03cbacedc11d77e8c56262437a8d10bd9a89e59
2020-07-06 15:50:59 -07:00
Jun Wu
a0c5b1b3a5 revlogindex: is_ancestor(x, x) should return true
Summary: This was discovered by using it in the Python world.

Reviewed By: sfilipco

Differential Revision: D22323186

fbshipit-source-id: 295811e0950b94ad2ad73ad242228b6a3f9765d0
2020-07-06 15:50:59 -07:00
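As an illustration of the behavior described in a0c5b1b3a5, here is a minimal sketch of a reflexive ancestor check. The `parents` mapping and both function names are hypothetical stand-ins for this example, not the revlogindex API.

```python
def ancestors(parents, x):
    """Return the set of ancestors of x, including x itself."""
    seen = set()
    stack = [x]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, []))
    return seen


def is_ancestor(parents, a, b):
    """True if a is an ancestor of b; every vertex counts as its own ancestor."""
    return a in ancestors(parents, b)


# Hypothetical linear history: A <- B <- C
parents = {"A": [], "B": ["A"], "C": ["B"]}
```

With this definition, `is_ancestor(parents, "A", "A")` is true because the ancestor set of a vertex always contains the vertex itself.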
Jun Wu
52668752d8 revlogindex: de-duplicate insertions
Summary: Adding the same commit multiple times is a no-op.

Reviewed By: sfilipco

Differential Revision: D22323190

fbshipit-source-id: 61a06335581a9cad32dc7e929b841ec69b551a9c
2020-07-06 15:50:59 -07:00
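The de-duplication in 52668752d8 can be pictured with a toy in-memory index. `TinyIndex` and its fields are assumptions made for this sketch and do not mirror the real `RevlogIndex` layout.

```python
class TinyIndex:
    """Toy index where inserting a known node is a no-op."""

    def __init__(self):
        self.nodes = []        # (node, parents) in insertion order
        self.rev_by_node = {}  # node -> position in self.nodes

    def insert(self, node, parents):
        # A node that is already present keeps its existing position.
        if node in self.rev_by_node:
            return self.rev_by_node[node]
        rev = len(self.nodes)
        self.nodes.append((node, tuple(parents)))
        self.rev_by_node[node] = rev
        return rev
```

Calling `insert` twice with the same node returns the same position both times and leaves `nodes` with a single entry.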
Jun Wu
b86f3bd6e2 revlogindex: use tests from the dag crate
Summary: This adds some test coverage for the revlog DagAlgorithm implementation.

Reviewed By: sfilipco

Differential Revision: D22249157

fbshipit-source-id: a1d347b4d90d0e7f8fb229c317cc75c2b8e16242
2020-07-06 15:50:59 -07:00
Jun Wu
fcbe821dd1 revlogindex: impl DagAddHeads for RevlogIndex
Summary:
This makes RevlogIndex compatible with the generic DAG testing API from the dag
crate.

Reviewed By: sfilipco

Differential Revision: D22249156

fbshipit-source-id: 54a3c458e85804968964174eab674e494a6fa8a2
2020-07-06 15:50:59 -07:00
Jun Wu
cf1bc37007 dag: avoid using > 2 parents in generic DAG tests
Summary: Some DAG implementations do not support it.

Reviewed By: sfilipco

Differential Revision: D22249158

fbshipit-source-id: ebcdf164677ee647ef44aa1ee3cfd318bac658b0
2020-07-06 15:50:59 -07:00
Jun Wu
9a17be7ce0 dag: do not test the order of vertexes in generic tests
Summary:
Different implementations might return different orders. They should all be
considered correct.

Reviewed By: sfilipco

Differential Revision: D22249159

fbshipit-source-id: 36e4cadf814366f7ee2ed8a778948ff810760550
2020-07-06 15:50:58 -07:00
Jun Wu
f24dc621cb dag: make part of the tests generic
Summary: This makes it possible to run tests for other DAGs, like the revlog.

Reviewed By: sfilipco

Differential Revision: D22249155

fbshipit-source-id: 205579eeaccd42a21297d965973957168bb8726e
2020-07-06 15:50:58 -07:00
Jun Wu
5d9baa2f07 revlogindex: implement fast path for only
Summary:
For revlog, calculating `only` can have some fast paths that do not scan the
entire changelog.

Reviewed By: sfilipco

Differential Revision: D21944136

fbshipit-source-id: 58391636350f8f19643d59c46d663f55861d6de3
2020-07-06 15:50:58 -07:00
Jun Wu
7d75f6046f revlogindex: implement fast path for only_both
Summary:
This will be used to maintain narrow-heads phase calculation and to sunset the
revlog-specific changelog.index2.

Reviewed By: sfilipco

Differential Revision: D21944131

fbshipit-source-id: a8bbd1fd24546f4891ffa677476bff750c3faf5f
2020-07-06 15:50:58 -07:00
Jun Wu
40cd0f8f06 revlogindex: fix an offset-by-one error
Summary: The values of `pending_nodes_index` should start from 0 instead of 1.

Reviewed By: sfilipco

Differential Revision: D21944133

fbshipit-source-id: a2a332868f16b398037289c81bf8076d1400c0a7
2020-07-06 15:50:58 -07:00
Jun Wu
bd0a35f2a0 revlogindex: do not raise errors on ambiguous prefix
Summary: This matches the interface of segmented changelog.

Reviewed By: sfilipco

Differential Revision: D21944134

fbshipit-source-id: 75f68b2838de4abe95f13cb3c62dc68af132fff7
2020-07-06 15:50:58 -07:00
Jun Wu
4670200c21 revlogindex: maintain revlog.d file handler transparently
Summary:
This drops the `file` parameter from the `raw_data` API, making
RevlogIndex easier to use.

Reviewed By: sfilipco

Differential Revision: D21854228

fbshipit-source-id: 259726524d1cc6a1f9d00783e22f9502c7decdeb
2020-07-06 15:50:58 -07:00
Jun Wu
137fa3cd34 revlogindex: implement writing to revlog data
Summary: Extend RevlogIndex to support writing to revlog data.

Reviewed By: sfilipco

Differential Revision: D21854227

fbshipit-source-id: 11b6bf3b706b316f23c33ab07144530c9db92d58
2020-07-06 15:50:58 -07:00
Jun Wu
f005d92f07 revlogindex: implement reading from revlog data
Summary: Extend RevlogIndex to support reading from revlog data.

Reviewed By: sfilipco

Differential Revision: D21854229

fbshipit-source-id: 4cbc08762fd236a97370d5d55c59a222f935b262
2020-07-06 15:50:58 -07:00
Jun Wu
2bc4dd01ca dag: add a trait to convert IdSet to Set
Summary:
The reverse `to_id_set` exists.
It turns out that Python land wants this in many places.

Reviewed By: sfilipco

Differential Revision: D22240175

fbshipit-source-id: b6a3a3a3869dc0c521a21b1d86394421b816632b
2020-07-06 15:50:58 -07:00
Jun Wu
07b3d60f80 dag: add "only(x, y)" to DagAlgorithm
Summary:
This provides a way for implementations to optimize the operation.

For segmented changelog, the default implementation is good enough.

For revlog, `only` can have a fast path that does not iterate through the
entire changelog.

A related API `only_both` is added. For revlog it has multiple use-cases,
including narrow-heads phase calculation and revlog.findcommonmissing used by
discovery.

Reviewed By: markbt

Differential Revision: D21944132

fbshipit-source-id: d11660dae85ea6158977eb00d1ceaceddf1d8234
2020-07-06 15:50:57 -07:00
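To make the set semantics in 07b3d60f80 concrete, here is a naive sketch that walks all ancestors, which is exactly the full scan a fast path would avoid. It assumes `only(x, y)` means `ancestors(x) - ancestors(y)` and that `only_both` additionally returns `ancestors(y)`. The `parents` mapping and function signatures are hypothetical, not the `DagAlgorithm` trait.

```python
def ancestors(parents, heads):
    """Return all ancestors of the given heads, including the heads."""
    seen = set()
    stack = list(heads)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, []))
    return seen


def only(parents, x, y):
    """Commits reachable from x but not from y."""
    return ancestors(parents, x) - ancestors(parents, y)


def only_both(parents, x, y):
    """Assumed shape: (only(x, y), ancestors(y)), computed in one pass over y."""
    anc_y = ancestors(parents, y)
    return ancestors(parents, x) - anc_y, anc_y


# Hypothetical graph: A <- B <- C, and B <- D
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}
```

Here `only(parents, ["C"], ["D"])` is `{"C"}`, since A and B are reachable from both heads.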
Thomas Orozco
7e8c9174be mononoke/admin: add a filestore fetch subcommand
Summary:
Sometimes you want to fetch a file. Using curl and the LFS server works, but
this really should be part of Mononoke admin.

Reviewed By: ikostia

Differential Revision: D22397472

fbshipit-source-id: 17decf4aa2017a2c1be52605a254692f293d1bcd
2020-07-06 14:56:08 -07:00
Thomas Orozco
46def15c4f mononoke/admin: fix filestore store subcommand
Summary:
This got broken when we moved to Tokio 0.2. Let's fix it and add a test to make
sure it does not regress.

Reviewed By: ikostia

Differential Revision: D22396261

fbshipit-source-id: a8359aee33b4d6d840581f57f91af6c03125fd6a
2020-07-06 14:56:08 -07:00
Arun Kulshreshtha
7ae5344bb2 edenapi_types: improve DataEntry hash check API
Summary: Use `thiserror` to provide a more ergonomic API for `DataEntry` hash checking. The `.data()` method now simply returns a `Result` rather than a tuple with an ad-hoc enum.

Reviewed By: quark-zju

Differential Revision: D22376164

fbshipit-source-id: fc39cb212ec1ee5830292db4aa5eca18f2c16a2b
2020-07-06 14:47:48 -07:00
Durham Goode
f02c7e85f0 py3: fix streampager progress bar
Summary:
The streaming pager expects bytes as inputs, but we were sending
strings to the progress buffer. This fixes it.

Reviewed By: quark-zju

Differential Revision: D22394395

fbshipit-source-id: 4acbfc08ca624ca3c794e6e369df669e370e5b42
2020-07-06 14:07:59 -07:00
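The class of bug fixed in f02c7e85f0 can be sketched as follows: under Python 3, a bytes-only buffer rejects `str`, so text has to be encoded first. `write_progress` and the UTF-8 choice are assumptions for this example, not the actual progress code.

```python
import io


def write_progress(buffer, text):
    """Encode str to bytes before writing to a bytes-only buffer."""
    if isinstance(text, str):
        text = text.encode("utf-8")  # encoding is an assumption of this sketch
    buffer.write(text)


buf = io.BytesIO()
write_progress(buf, "50% done")
```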
Durham Goode
dc09a79a6b py3: fix checkout with conflicts in eden
Summary: It now works

Reviewed By: quark-zju

Differential Revision: D22394103

fbshipit-source-id: e525192079bef66d36f731c754c4b73f5fc7cb11
2020-07-06 14:07:59 -07:00
Jun Wu
253584bf2e test-generaldelta: disable modern features
Summary:
The test relies on Python revlog implementation details which
do not exist in the Rust revlog implementation.

Reviewed By: DurhamG

Differential Revision: D22240183

fbshipit-source-id: b245b35e561c3364618a0e199244df030cc47942
2020-07-06 14:04:28 -07:00
Jun Wu
1a5de54ab0 test-bookmarks-strip: do not use debugstrip
Reviewed By: DurhamG

Differential Revision: D22240182

fbshipit-source-id: b0bf34e84f46a0593b6390c6c97a47110f8d94d2
2020-07-06 14:04:28 -07:00
Jun Wu
ddfb3c5dba test-backout: rewrite the test in a basic form
Summary:
The original test is unmaintainable. Rewrite it to test key features.

I dropped detailed tests about merge conflict / content handling.
In the future we will probably have a clean Rust implementation of "applying
the diff between X and Y to Z" which can replace various unmaintainable patch
application logic in Python. We can test that Rust library extensively, and
commands will just use the clean library (e.g. revert, backout).

Reviewed By: sfilipco

Differential Revision: D22240184

fbshipit-source-id: 4d6c65fe02ccc92e64c62a48f702187678973086
2020-07-06 14:04:28 -07:00
Jun Wu
24ab8f5b4b test-amend-split: do not use hg debugstrip
Summary:
`debugstrip` is an operation that depends on multiple legacy components (revlog
strip, truncate-based transaction).  They are incompatible with modern configs
(no truncation, heads-based visibility, metalog-based transaction).
Avoid using it in the test.

Reviewed By: DurhamG

Differential Revision: D22240187

fbshipit-source-id: ec215d75fb766957a3d6f58e491ef815f5bedbdc
2020-07-06 14:04:28 -07:00
Jun Wu
4da3addbee tests: migrate off rev numbers for more tests
Summary:
Changed by `fix-revnum.py`. Some of the tests use `hg debugstrip`,
which I'm trying to avoid.

Reviewed By: DurhamG

Differential Revision: D22240181

fbshipit-source-id: a569b712fe4b985378e5c61c000deecccefbc488
2020-07-06 14:04:28 -07:00
Jun Wu
ac60550de6 commands: disable rollback unconditionally
Summary:
Tests do not use it, and we have long disabled it in production, so
let's just disable the command unconditionally.

Reviewed By: DurhamG

Differential Revision: D22368834

fbshipit-source-id: 7ebc5b07c4044b6809defc06437cda7256cb2ebf
2020-07-06 14:04:27 -07:00
Jun Wu
24363ad52f tests: remove usage of hg rollback
Summary:
`hg rollback` was long disabled in production setup. It has weird behavior and
is likely incompatible with modern transaction frameworks. Remove its usage in
tests.

Reviewed By: DurhamG

Differential Revision: D22240180

fbshipit-source-id: 453684ebbc77132e09b1b717b6ad1e106dcad214
2020-07-06 14:04:27 -07:00
Jun Wu
dabce28285 repoview: further remove repoview references
Summary:
Since repoview is removed, those concepts are useless. Therefore remove them.

This includes:
- repo.unfiltered(), repo.filtered(), repo.filtername
- changelog.filteredrevs
- error.FilteredIndexError, error.FilteredLookupError,
  error.FilteredRepoLookupError
- repo.unfilteredpropertycache, repo.filteredpropertycache,
  repo.unfilteredmethod
- index.headsrevsfiltered

Reviewed By: DurhamG

Differential Revision: D22367600

fbshipit-source-id: d133b8aaa136176b4c9f7f4b0c52ee60ac888531
2020-07-06 14:04:27 -07:00
Jun Wu
021fa7eba5 repoview: remove repoview layer
Summary:
End-users have been using visibleheads + narrowheads for a while, and hgsql
does not require any filtering, and most tests are migrated to modern configs
(visibility + narrow heads). Now it's time to consider removing the repoview
layer.

This removes complexities around `changelog.filteredrevs` and the various
`repoview` objects with caching problems (e.g. I have seen `repo` and `unfi`
hold inconsistent phasecaches, causing them to calculate phases differently,
which is quite hard to reason about confidently).

This will also make it easier to migrate to segmented changelog.

Reviewed By: DurhamG

Differential Revision: D22201084

fbshipit-source-id: 3661c26dd72a64b5005d86e164af4da5a6895649
2020-07-06 14:04:27 -07:00
Kostia Balytskyi
6d5b3ac1f2 live_commit_sync_config: add versions accessors
Summary:
This diff adds two new bits of functionality to `LiveCommitSyncConfig`:
- getting all possible versions of `CommitSyncConfig` for a given repo
- getting `CommitSyncConfig` for a repo by version name

These bits are meant to be used in:
- `commit_validator` and `bookmarks_validator`, which would
  need to run validation against a specific config version
- `mononoke_admin`, which would need to be able to query all versions,
  display the version used to sync two commits and so on

Reviewed By: StanislavGlebik

Differential Revision: D22235381

fbshipit-source-id: 42326fe853b588849bce0185b456a5365f3d8dff
2020-07-06 14:00:36 -07:00
Jun Wu
ef913a9914 debugshell: ignore error setting up commitcloud service
Summary:
For various reasons (e.g. wrong configs, like when investigating test repos) the
initialization can fail. Ignore such failures.

Reviewed By: DurhamG

Differential Revision: D22368942

fbshipit-source-id: ae01dcc499f63f373b0f7bec00554ea8074ae7cf
2020-07-06 12:54:52 -07:00
Thomas Orozco
ce0af2d591 mononoke/virtually_sharded_blobstore: deduplicate puts based on data being put
Summary:
This updates the virtually_sharded_blobstore to deduplicate puts only if the
data being put is actually the data we have put in the past. This is done by
keeping track of the hash of things we've put in the presence cache.

This has 2 benefits:

- This is safer. We only dedupe puts we 100% know succeeded (because this
  particular instance was the one to attempt the put).
- This creates fewer surprises; notably, it lets us overwrite data in the
  backing store (if we are writing something different).

Reviewed By: StanislavGlebik

Differential Revision: D22392809

fbshipit-source-id: d76a49baa9a5749b0fb4865ee1fc1aa5016791bc
2020-07-06 12:10:46 -07:00
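A minimal sketch of the idea in ce0af2d591: a put is skipped only when the same instance has already put identical bytes under that key. `DedupingStore`, the dict-based backing store, and the use of SHA-256 are assumptions for illustration; the summary does not say which hash is used.

```python
import hashlib


class DedupingStore:
    """Toy store that skips a put only if identical data was put before."""

    def __init__(self, backing):
        self.backing = backing  # dict standing in for the real blobstore
        self.presence = {}      # key -> digest of the data this instance put
        self.skipped = 0

    def put(self, key, data):
        digest = hashlib.sha256(data).digest()
        if self.presence.get(key) == digest:
            self.skipped += 1   # same key, same bytes: nothing to do
            return
        self.backing[key] = data
        self.presence[key] = digest
```

Putting different bytes under an existing key still reaches the backing store, which is the overwrite behavior the summary calls out.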
Thomas Orozco
19b31ead9d mononoke/virtually_sharded_blobstore: make race tests a little more forgiving
Summary:
Running those on my devserver, I noticed they can be a bit flaky. They are
racy on purpose, but let's relax them a bit.

We have a lot of margin here: our blobstore is rate limited at one request
every 10ms, and we need to do 100 requests (the goal is to show that they don't
all wait), so 100ms is fine to prove that they're not rate limited when sharing
the same data.

Reviewed By: StanislavGlebik

Differential Revision: D22392810

fbshipit-source-id: 2e3c9cdf19b0e4ab979dfc000fbfa8da864c4fd6
2020-07-06 12:10:46 -07:00
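The margin argued for in 19b31ead9d can be checked with a little arithmetic. The three constants come from the summary; the variable names are ours.

```python
RATE_LIMIT_INTERVAL_MS = 10  # one request every 10ms
REQUESTS = 100               # requests issued by the test
BUDGET_MS = 100              # time bound quoted in the summary

# If every request had to wait for its own rate-limit slot:
serialized_ms = REQUESTS * RATE_LIMIT_INTERVAL_MS

# How much slower the fully rate-limited case is than the budget:
margin = serialized_ms / BUDGET_MS
```

If every request were rate limited, the run would take about 1000ms, ten times the 100ms bound, so finishing within the bound shows the requests did not all wait.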
Kostia Balytskyi
f223ca6e6e synced commit mapping: expose version in get query
Summary:
When we look up how a commit was synced, we frequently need to know which version of `CommitSyncConfig` was used to sync it. Specifically, this is useful for admin tooling and commit validator, which I am planning to migrate to use versioned `CommitSyncConfig` in the near future.

Later I will also include this information in the `RewrittenAs` variant of `CommitSyncOutcome`, so that we expose it to real users. I did not do it in this diff to keep it small and easy to review. And because the other part is not ready :P

Reviewed By: StanislavGlebik

Differential Revision: D22255785

fbshipit-source-id: 4312e9b75e2c5f92ba018ff9ed9149efd3e7b7bc
2020-07-06 11:23:31 -07:00
Mateusz Kwapich
7b3aa42459 fix the problem with ordering in into_response
Summary: When I implemented this method, I didn't test that it preserves the order of the input changesets. I noticed my mistake while testing the scmquery part.

Reviewed By: StanislavGlebik

Differential Revision: D22374981

fbshipit-source-id: 4529f01370798377b27e4b6a706fc192a1ea928e
2020-07-06 08:32:03 -07:00
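One common way to get the ordering property described in 7b3aa42459 is to index the fetched items by id and then walk the request order. This is only a sketch of that general technique; `in_input_order` and the dict-shaped items are hypothetical and not the actual `into_response` code.

```python
def in_input_order(requested_ids, fetched):
    """Return fetched items ordered as in requested_ids, skipping missing ids."""
    by_id = {item["id"]: item for item in fetched}
    return [by_id[i] for i in requested_ids if i in by_id]


# Hypothetical data: the backend returned items in a different order.
requested = ["c3", "c1", "c2"]
fetched = [{"id": "c1"}, {"id": "c2"}, {"id": "c3"}]
ordered = in_input_order(requested, fetched)
```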
Mark Thomas
bcaaba1e9c add list-bookmarks command
Summary:
Add the `scsc list-bookmarks` command, which lists bookmarks in a repository.

If a commit id is also provided, `list-bookmarks` will be limited to bookmarks that
point to that commit or one of its descendants.

Reviewed By: mitrandir77

Differential Revision: D22361240

fbshipit-source-id: 17067ba47f9285b8137a567a70a87fadcaabec80
2020-07-06 07:01:24 -07:00
Thomas Orozco
07907b2b26 mononoke/virtually_sharded_blobstore: merge in the context_concurrency_blobstore
Summary:
There is inevitably interaction between caching, deduplication and rate
limiting:

- You don't want the rate limiting to be above caching (in the blobstore stack,
  that is), because you shouldn't rate limit cache hits (this is where we are
  today).
- You don't want the rate limiting to be below deduplication, because then you
  get priority inversion where a low-priority rate-limited request might hold
  the semaphore while a higher-priority, non-rate-limited request wants to do
  the same fetch (we could have moved rate limiting here prior to introducing
  deduplication, but I didn't do it earlier because I wanted to eventually
  introduce deduplication).

So, now that we have caching and deduplication in the same blobstore, let's
also incorporate rate limiting there!

Note that this also brings a potential motivation for moving Memcache into this
blobstore, in case we don't want rate limiting to apply to requests before they
go to the _actual_ blobstore (I did not do this in this diff).

The design here when accessing the blobstore is as follows:

- Get the semaphore
- Check if the data is in cache, if so release the semaphore and return the
  data.
- Otherwise, check if we are rate limited.

Then, if we are rate limited:

- Release the semaphore
- Wait for our turn
- Acquire the semaphore again
- Check the cache again (someone might have put the data we want while we were
  waiting).
    - If the data is there, then return our rate limit token.
    - If the data isn't there, then proceed to query the blobstore.

If we aren't rate limited, then we just proceed to query the blobstore.

There are a couple subtle aspects of this:

- If we have a "late" cache hit (i.e. after we waited for rate limiting), then
  we'll have waited but we won't need to query the blobstore.
    - This is important when a large number of requests for the same key
      arrive at the same time and get rate limited. If we don't do this second
      cache check or if we don't return the token, then we'll consume a rate
      limiting token for each request (instead of 1 for the first request).
- If a piece of data isn't cacheable, we should treat it like a cache hit with
  regard to semaphores (i.e. release early), but like a miss with regard to
  rate limits (i.e. wait).

Both of those are captured in the code by returning the `Ticket` on a
cache hit. We can then choose to either return the ticket on a cache hit, or wait
for it on a cache miss.

(All of this logic is covered by unit tests: we can remove any of the blocks
there in `Shards::acquire` and a test will fail.)

Reviewed By: farnz

Differential Revision: D22374606

fbshipit-source-id: c3a48805d3cdfed2a885bec8c47c173ee7ebfe2d
2020-07-06 04:38:31 -07:00
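The access sequence described in 07907b2b26 can be sketched with asyncio. This is an illustration under simplifying assumptions: `Limiter` only counts tokens and never really waits, `get` and `demo` are hypothetical names, and the non-cacheable case is left out. It is not the `Shards::acquire` implementation.

```python
import asyncio


class Limiter:
    """Toy rate limiter: counts tokens, with a yield standing in for the wait."""

    def __init__(self):
        self.consumed = 0

    async def access(self):
        self.consumed += 1
        await asyncio.sleep(0)  # placeholder for waiting our turn

    def cancel(self):
        self.consumed -= 1      # give the token back


async def get(key, cache, backing, semaphore, limiter=None):
    await semaphore.acquire()
    try:
        if key in cache:                   # early hit: no rate limiting
            return cache[key]
        if limiter is not None:
            semaphore.release()            # don't hold it while waiting
            try:
                await limiter.access()
            finally:
                await semaphore.acquire()  # always re-acquire before going on
            if key in cache:               # late hit: return the token
                limiter.cancel()
                return cache[key]
        data = backing[key]                # query the "blobstore"
        cache[key] = data
        return data
    finally:
        semaphore.release()


async def demo():
    cache, backing = {}, {"k": b"v"}
    semaphore = asyncio.Semaphore(1)
    limiter = Limiter()
    tasks = [get("k", cache, backing, semaphore, limiter) for _ in range(5)]
    results = await asyncio.gather(*tasks)
    return results, limiter.consumed


results, consumed = asyncio.run(demo())
```

In this run five concurrent requests for the same key end up consuming a single token: one request performs the fetch, and the others hit the cache on the second check and give their tokens back.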
Thomas Orozco
6153dab328 mononoke/async_limiter: add support for cancelling access
Summary:
Sometimes we take a token and then realize we don't want it. In this case, giving it back is convenient.

This diff adds support for that.

Reviewed By: farnz

Differential Revision: D22374607

fbshipit-source-id: ccf47e6c75c37d154704645c9e826f514d6f49f6
2020-07-06 04:38:31 -07:00
Kostia Balytskyi
b7cf1dcbdb x-repo sync job: use LiveCommitSyncConfig
Summary:
This is a mirror image of the diff that made backsyncer use `LiveCommitSyncConfig`: we want to use configerator-based live configs when we run in continuous tailing mode.

As the no-op iteration time used to be 10s, which is a bit wasteful for tests, this diff makes it configurable.

Finally, because of instantiating various additional `CommitSyncerArgs` structs, this diff globs out some of the `using repo` logs (which aren't very useful as test signals anyway, IMO).

Reviewed By: StanislavGlebik

Differential Revision: D22209205

fbshipit-source-id: fa46802418a431781593c41ee36f468dee9eefba
2020-07-03 13:36:18 -07:00
Arun Kulshreshtha
1b5283aa5a edenapi_types: improve comments
Summary: Tidy up some comments in this file.

Reviewed By: ikostia

Differential Revision: D22376165

fbshipit-source-id: ce4760776048aa8e72809b4f828d0ea426fcf878
2020-07-03 12:29:19 -07:00
Arun Kulshreshtha
dc98c085ad edenapi_types: split redaction tombstone string across multiple lines
Summary: Make this line less long.

Reviewed By: ikostia

Differential Revision: D22372492

fbshipit-source-id: cfc1ab6a296aa2056a908bf786e4f498f3a688b4
2020-07-03 12:15:01 -07:00
Stanislau Hlebik
2cfc23770c mononoke: use override_blame_filesize_limit option
Summary: This diff actually starts to use the option.

Reviewed By: krallin

Differential Revision: D22373943

fbshipit-source-id: fe23da9c3daa1f9f91a5ee5e368b33e0091aa9c1
2020-07-03 09:58:46 -07:00
Stanislau Hlebik
06f2a420d1 mononoke: correctly return BlameError::Rejected
Summary:
Previously, if a blame request was rejected (e.g. because a file was too large),
we returned BlameError::Error.

This doesn't look correct, because there's BlameError::Rejected. This diff
makes the fetch_blame function return BlameError::Rejected in that case.

Reviewed By: aslpavel

Differential Revision: D22373948

fbshipit-source-id: 4859809dc315b8fd66f94016c6bd5156cffd7cc2
2020-07-03 09:58:46 -07:00
Stanislau Hlebik
2a732f2626 mononoke: pass BlobRepo in fetch_full_file_content
Summary:
In the next diffs we'll need to read override_blame_filesize_limit from derived
data config, and this config is stored in BlobRepo.

This diff makes a small refactoring to pass BlobRepo to fetch_full_file_content.

Reviewed By: krallin

Differential Revision: D22373946

fbshipit-source-id: b209abce82c0279d41173b5b25f6761659a92f3d
2020-07-03 09:58:46 -07:00
Stanislau Hlebik
b703f11685 mononoke: asyncify fetch_full_file_content
Summary: This will make adding the blame file size limit override in the next diffs easier.

Reviewed By: krallin

Differential Revision: D22373945

fbshipit-source-id: 4857e43c5d80596340878753ea90bf31d7bb3367
2020-07-03 09:58:46 -07:00
Mateusz Kwapich
f21b459c99 remove dependency on bounded_traversal
Summary:
We're always yielding zero or one child during traversal, so bounded traversal
is unnecessary here.

Differential Revision: D22242148

fbshipit-source-id: b4c8a1279ef7bd15e9d0b3b2063683f45e30a97a
2020-07-03 08:02:25 -07:00
Mateusz Kwapich
7ff7c931a8 add option for limiting the log to descendants of single node
Summary:
Let's use the new option in the CLI. Unfortunately we can't easily accept commit
ids in named params, so it has to be a positional one.

Differential Revision: D22234412

fbshipit-source-id: a9c27422fa65ae1c42cb1c243c7694507a957437
2020-07-03 08:02:25 -07:00
Thomas Orozco
de731a89fc mononoke/virtually_sharded_blobstore: log deduplicated puts
Summary:
If anything were to go wrong, we'd be happy to know which puts we ignored. So,
let's log them.

Reviewed By: farnz

Differential Revision: D22356714

fbshipit-source-id: 5687bf0fc426421c5f28b99a9004d87c97106695
2020-07-03 05:53:11 -07:00