Summary:
Make blobstore_factory PutBehaviour aware by layering all except the final multiplex as BlobstorePutOps
This makes it so all the components that go into a multiplex are BlobstorePutOps, which is a prerequisite for making the multiplex logging include the Overwrite status.
Reviewed By: StanislavGlebik
Differential Revision: D24109289
fbshipit-source-id: 23f4cedbaebadae194e41cfbff9ef46b651e3fd4
Summary:
log whether puts overwrite keys by implementing BlobstoreWithPutBehaviour for Logblob.
It logs a count of each type of overwrite per put so we can sum them up.
Reviewed By: StanislavGlebik
Differential Revision: D24079272
fbshipit-source-id: 81944d92a56b0d3349ef390eb83f9e5bf4ee3d39
Summary:
If Phabricator takes more than 10 seconds to respond, a `socket.timeout`
exception may be thrown. Treat this like other networking errors, and simply
report the diff's Phabricator status as `Error`. Previously this exception was
unhandled, causing the entire command to abort.
Reviewed By: singhsrb
Differential Revision: D24272274
fbshipit-source-id: f646d111a91f901e09d9f94a1e0102d6dd4d0952
Summary:
This change introduces two new metadata types, Category and Transience, and a mechanism for Category to provide a default Fault and Transience, which can be overridden by the user.
Also introduces a mechanism for attempting to log exceptions which occur during exception logging, falling back to the previous behavior of just swallowing the exception on failure.
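The fallback chain can be sketched as follows; `log_exception`, `logger`, and `fallback` are hypothetical names for illustration, not the actual API:

```python
def log_exception(exc, logger, fallback):
    """Try to log exc; if logging itself raises, try to log that failure;
    if even that fails, fall back to the old behavior of swallowing it."""
    try:
        logger(exc)
    except Exception as logging_exc:
        try:
            fallback(logging_exc)
        except Exception:
            pass  # previous behavior: swallow the exception on failure
```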
Reviewed By: DurhamG
Differential Revision: D22677565
fbshipit-source-id: 1cf75ca1e2a65964a0ede1f072439378a46bd391
Summary:
It only has benchmark code that led to the use of mincode. Now hgcommits is the
main crate for commit storage. `commitstore` without `hg` in its name was
initially planned to support other kinds of commits including git and bonsai.
However, we don't have an immediate goal for that at present. So let's just remove
the commitstore directory.
Reviewed By: singhsrb
Differential Revision: D24263618
fbshipit-source-id: 84b4861ae490817377e69d8c2006c63331e3db1f
Summary:
On the hg servers we're seeing crashes due to signals during syscalls.
Rolling back to prior to the signal changes seems to have fixed it, though we
haven't bisected enough to be sure this is the cause.
The ui.threaded option is already used to opt-out of running on a background
thread, let's also use it to opt-out of the custom signal registering in hopes
that it fixes the server issue, until they can be deprecated.
Reviewed By: quark-zju
Differential Revision: D24254804
fbshipit-source-id: 50e4fc8c7e3a88b5baa49394f6f1edffc946187d
Summary:
Since D23247941, .clang-format is no longer behind a symlink, which means that
we can get clang-format niceness on Windows without having our own
.clang-format file.
Thus, let's just remove eden/scm/.clang-format
Reviewed By: ahornby
Differential Revision: D24255987
fbshipit-source-id: 8bbb966949cf0d1c0ad76053f699dd524700183e
Summary: Need to add new quickcheck tests; verify that removing `Serialize` from `TreeEntry` is okay.
Reviewed By: kulshrax
Differential Revision: D23457777
fbshipit-source-id: aa94ed7aa81b41924eba4a8bd1bdc2c737365b77
Summary:
Change
abort: repository repo: timed out waiting for lock held by <lockinfo ...>
to:
abort: repository repo: timed out waiting for lock held by process '842210' on host 'hostname'
Reviewed By: singhsrb
Differential Revision: D24214462
fbshipit-source-id: 65056ebb9764651b2f0126061fafdfdefaa4e9c6
Summary: The rev numbers are almost gone, remove them from the test.
Reviewed By: sfilipco
Differential Revision: D24255156
fbshipit-source-id: 5cbc4a71c2d3f773c5b474d1edce84ceceb50bf9
Summary: Added sqlblob to the common blobstore tests to make sure it doesn't diverge from expectations for other stores.
Reviewed By: StanislavGlebik
Differential Revision: D24079254
fbshipit-source-id: 49ef1c372171a1594ba298c66d3473ef682d24cc
Summary:
Add CountedBlobstoreOps so that blobstore layers that need BlobstoreOps can still use counting
This unblocks adding sqlblob to blobstore-test in next diff in stack
Reviewed By: farnz
Differential Revision: D24079256
fbshipit-source-id: 6a6505aff8c8405353a1f10d79f6e6e08911228a
Summary: Add BlobstorePutOps so that blobstore layers that need BlobstorePutOps can still use PrefixBlob as a wrapper.
Reviewed By: farnz
Differential Revision: D24109298
fbshipit-source-id: 710571e6c30fa8a432d463eedfab5fcc0389baa3
Summary:
Add predicate based PutBehaviour logic to manifoldblob.
This will prevent overwrites of keys when in IfAbsent mode, and will generate useful logging in OverwriteAndLog and IfAbsent modes.
This change factors out part of the put logic into put_check_conflict, so that it can be re-used by each of the PutBehaviour cases.
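The mode split can be sketched like this (a Python model of the Rust logic; `put_with_behaviour`, the dict store, and the log list are illustrative stand-ins for the blobstore's put path and its logging):

```python
from enum import Enum

class PutBehaviour(Enum):
    OVERWRITE = "overwrite"
    IF_ABSENT = "if_absent"
    OVERWRITE_AND_LOG = "overwrite_and_log"

def put_with_behaviour(store, key, value, behaviour, log):
    """Shared conflict-check path, analogous to the put_check_conflict split."""
    present = key in store  # stands in for the blobstore's is_present check
    if behaviour is PutBehaviour.IF_ABSENT and present:
        log.append((key, "skipped"))  # do not overwrite existing keys
        return False
    if behaviour is not PutBehaviour.OVERWRITE:
        log.append((key, "overwrote" if present else "new"))
    store[key] = value
    return True
```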
Reviewed By: StanislavGlebik
Differential Revision: D24021170
fbshipit-source-id: d2e71afadada3d5e661634449108e6c9f8dc5907
Summary:
We don't have any Preserved entries anymore - now all preserved entries will be
rewritten with "noop" mapping.
This diff removes it completely
Reviewed By: mitrandir77, ikostia
Differential Revision: D24173538
fbshipit-source-id: f2d6238633cea8dc3c06f2e607b2abd76edfca6b
Summary: This state is going to be removed soon, so no need for tests anymore
Reviewed By: mitrandir77
Differential Revision: D24221363
fbshipit-source-id: 19dce04549ccbfe59255463a73e56c70f1c8bc4d
Summary: Just as with the previous diff, let's remove it from cross_repo_sync_test_utils
Reviewed By: mitrandir77
Differential Revision: D24220618
fbshipit-source-id: 95c5ddc955f720101f6576b34e9787435b6deb4c
Summary:
This will allow Eden to control what data is in Mercurial's memory and
what is on disk. This will let it persist the hg_import_helper process longer,
and prevent slow startup times when needing to restart it.
Reviewed By: xavierd
Differential Revision: D24231131
fbshipit-source-id: a4f743740b44957e8d2dd93f07e9831eadfee7ab
Summary:
I observed an integration test fail because Mercurial aborted with an
error about not handling EINTR, but it had no traceback. Always run
Mercurial with --traceback to try to debug these.
Reviewed By: xavierd
Differential Revision: D24204308
fbshipit-source-id: 44960bc645e5f97f61761e511b372328430fcda7
Summary:
This was added in D16078908 (1bc6ecf8fe). It turns out fsync followed by a hardlink
on btrfs results in very slow hardlink performance (100-600ms). Since fewer and
fewer of our files use this atomic write code path and since this affects almost
every hg write command, let's roll this back.
This will increase the chance of data loss during a hard reboot, but commit
cloud is good enough to allow recovering from that in almost every situation.
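The atomic write path in question follows the familiar temp-file-plus-rename pattern; this is an illustrative sketch (not Mercurial's actual code), with the fsync step shown as the optional durability knob this commit rolls back:

```python
import os
import tempfile

def atomic_write(path, data, fsync=False):
    """Write data to path via a temp file + rename, so readers never observe
    a partially written file. The optional fsync is the durability step being
    removed here: on btrfs it made the subsequent link/rename very slow."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            if fsync:
                f.flush()
                os.fsync(f.fileno())
        os.rename(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise
```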
Reviewed By: sfilipco
Differential Revision: D24230056
fbshipit-source-id: aae1a5612eda5f62bb5ec1442b1312ed45c42786
Summary:
With invalidatelinkrev, linkrev stored in revlog shouldn't be used.
This makes some tests pass with segmented changelog.
Reviewed By: singhsrb
Differential Revision: D24201944
fbshipit-source-id: 6473c30266c79aa97a955b1a6c867411cc67de2b
Summary:
The old code assumes `0..len(repo)` are valid revs, which is no longer true
with segmented changelog.
Reviewed By: singhsrb
Differential Revision: D24201948
fbshipit-source-id: b882a215701c57ccdf4af340c889586b040772da
Summary:
Copy segmented changelog, metalog and mutation store for local clones.
This mainly affects tests.
Reviewed By: singhsrb
Differential Revision: D24201941
fbshipit-source-id: c60da9e2bf982a6f66004415e45178749157745e
Summary: This is needed to move our hg servers to python 3.
Reviewed By: quark-zju
Differential Revision: D24204056
fbshipit-source-id: cbaf97893f8f77b535952ac290766f0fd5e14f0c
Summary:
From an OSS perspective, I think that the log tables have a place. However, from a
daily-use perspective, next to Scuba they don't add much except retention, and
instead feel more heavyweight to manage. This change probably simplifies
things and makes the Segmented Changelog component easier to maintain.
Reviewed By: krallin
Differential Revision: D24213548
fbshipit-source-id: 48a4ea57e3f3911c3bf82b0cc51f118d72119e19
Summary:
This diff is a first step in a preparation for removing Preserved state from
CommitSyncOutcome. It does two things:
1) Remove unsafe_preserve_commit method and instead rewrite commit with noop
config version - this is exactly how we are going to do it in production
2) While there also select correct mover for validation - previously we were
always picking the current mover
I've also added a bunch of print statements, since it was tricky to debug what
was going on in the test.
Reviewed By: ikostia
Differential Revision: D24216168
fbshipit-source-id: 86e81aea61e638d93bdb33e7c9fd713f7b5e6c3b
Summary:
All the new entries in our mapping tables should have their version set. Let's
enforce it in the code
Reviewed By: ikostia
Differential Revision: D24217688
fbshipit-source-id: 95f01d8929a9c3a19b84434c91db6d08a6e5f863
Summary:
To give a bit of context - we are getting rid of Preserved state in
CommitSyncOutcome completely since it can be fully replaced with
EquivalentWorkingCopyAncestor/RewrittenAs states.
ikostia@ did a similar change in D24142837 (2035a34a0e) to consider Preserved only if
mapping is None. However, that diff did it only for the mapping.get() method. Let's do
the same for mapping.get_equivalent_working_copy().
Reviewed By: ikostia
Differential Revision: D24216455
fbshipit-source-id: f1f8d46263de54cb2e11d33b6c17f371b79e80f9
Summary: This was probably copied from backsyncer some time ago.
Reviewed By: StanislavGlebik
Differential Revision: D24198742
fbshipit-source-id: 3d8fad0ddc94185acd28ede7163b43424935830d
Summary:
This diff adds tests for `sync_merge` version-determination logic:
- when both parents were rewritten with the same version and it's identical to the current one
- when both parents were rewritten with the same version and it's different from the current one
- when both parents are Preserved
- when one parent is Preserved
Reviewed By: StanislavGlebik
Differential Revision: D24104680
fbshipit-source-id: 075eb40e6f76d4f3271fdf243a5728322698ff46
Summary: Do not show revnum or "changeset" after commit (with --debug or --verbose).
Reviewed By: singhsrb
Differential Revision: D24201942
fbshipit-source-id: 2f0d15711df67070e50d4bf30f0b1b4401d85524
Summary:
With segmented changelog, linkrev can exceed the i32 range and cause "pack" (aka.
bundle) to fail. Work around it by packing nullrev instead.
With this change, ~30 tests now pass with segmented changelog.
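The workaround amounts to substituting nullrev whenever the revision number no longer fits the wire format's 32-bit field; a minimal sketch (function name is illustrative):

```python
NULL_REV = -1          # hg's "null" revision number
I32_MAX = 2**31 - 1

def linkrev_for_pack(linkrev):
    """Sketch of the workaround: segmented changelog can assign revision
    numbers beyond what the bundle format's 32-bit field can hold, so fall
    back to nullrev for those."""
    if linkrev > I32_MAX:
        return NULL_REV
    return linkrev
```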
Reviewed By: singhsrb
Differential Revision: D24201940
fbshipit-source-id: 5f27c185837cd3c1fbe9b65d21ef3cd641eec2e5
Summary:
debugbuilddag assumes revs created start from 0, 1, 2..., which is
no longer true with segmented changelog.
Change it to resolve revs using the local `nodeids` array so it's
compatible with segmented changelog.
Reviewed By: singhsrb
Differential Revision: D24191790
fbshipit-source-id: ca7d1cccbba664128c227d66071b166c799cdf49
Summary:
A previous change updated all tests to remove the use of rev numbers. The
update to the list of results missed this test case.
Reviewed By: quark-zju
Differential Revision: D24208236
fbshipit-source-id: 289136f31e66eb74b106d7ea9401419fc369d59f
Summary:
Print a message for each EdenAPI method call to stderr if the user has `edenapi.debug` set.
These messages are already logged to `tracing`, but also printing them out when `edenapi.debug` is set makes the debug output more useful, since it provides context for the download stats. This is especially useful when reading through EdenFS logs.
Reviewed By: quark-zju
Differential Revision: D24204381
fbshipit-source-id: 37b47eed8b89438cdf510443e917a5c8660eb43b
Summary: Use a `HashMap` to store user-specified additional HTTP headers. This allows headers to be set in multiple places (whereas previously, setting new headers would replace all previously set headers).
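The merge-instead-of-replace behavior can be sketched in a few lines (a Python model of the dict-backed approach; the class and method names are illustrative, not the real API):

```python
class RequestConfig:
    """Sketch of the change: keep extra HTTP headers in a dict so that
    setting headers in several places merges them, instead of each call
    replacing the previously set headers."""
    def __init__(self):
        self.headers = {}

    def set_headers(self, headers):
        self.headers.update(headers)  # merge; same-named headers are overridden
        return self
```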
Reviewed By: quark-zju
Differential Revision: D24200833
fbshipit-source-id: 93147cf334a849c4d2fc4f29849018a4c7565143
Summary: The panics can happen when the input sets are out of range.
Reviewed By: kulshrax
Differential Revision: D24191789
fbshipit-source-id: efbcbd7f6f69bd262aa979afa4f44acf9681d11e
Summary: Change fold and metaedit to not show revnum in editor message.
Reviewed By: kulshrax
Differential Revision: D24191787
fbshipit-source-id: 140ec58c8eb00c067c6e40e1a18187f7801246e9
Summary: Looks like we got a land-time conflict.
Reviewed By: krallin
Differential Revision: D24196362
fbshipit-source-id: 27da83a2f86cc7fe5f59fe583d4b719f69df0248
Summary:
We've run into a high CPU usage issue on commit_file_diffs requests.
Looks like the problem is in the fact that ChangesetPathContext does an fsnode
traversal for each path, which is very expensive if we have a lot of paths.
Note - this is a similar problem to D20766465 (2eebab89c5)
Reviewed By: mitrandir77
Differential Revision: D24194056
fbshipit-source-id: e808ff7c63990260c1eb2f70a8bba11c2add395c
Summary:
Mononoke command for running the SegmentedChangelogSeeder for an existing
repository. The result is going to be a new IdMap version in the metadata
store and a new IdDag stored in the blobstore resulting in a brand new
SegmentedChangelog bundle.
Reviewed By: krallin
Differential Revision: D24096963
fbshipit-source-id: 1eaf78392d66542d9674a99ad0a741f24bc2cb1b
Summary:
The SegmentedChangelogSeeder has the role of constructing a new IdMap for a
given repository. That would happen when a repository is onboarded or when
algorithm improvements are made.
This change comes with small refactoring. We had the Dag which did a bit of
everything. Now the on_demand_update and the seeder functionalities are in
their separate files. The tests from `dag.rs` were moved to the `tests.rs` and
updated to use the seeder and on_demand_update structures.
`SegmentedChangelogSeeder::run` is the main logic added in this diff.
Reviewed By: quark-zju
Differential Revision: D24096965
fbshipit-source-id: 0f655e8c226ca0051f3e925342e92b1e7979aab2
Summary:
The IdDagStore provides the ability to save and later load prebuilt instances
of the IdDag.
This is going to be used in the clone API where we send one of these blobs to
the client. It is also going to be used by servers starting up.
Right now the serialization is naive, relying on serde::Serialize. The key
schema would provide the means for evolving the serialization format in cases
where we would require breaking changes.
Reviewed By: quark-zju
Differential Revision: D24096967
fbshipit-source-id: 2c883e5e82c05bec03c429c3c2a2d545170a8c05
Summary:
This IdMapVersionStore determines which is the latest IdMapVersion that commit
"tailing" processes should use when building new Dag bundles. The "seed"
process will update the versions of the IdMap. The plan for the "seed" process
is to write a new IdMap version to Sql then update the store with a new entry.
New "tailer" processes will then start to use the newly built IdMapVersion.
The tailing processes will build fresh IdDags for general consumption.
These IdDags will be used by the clone operation. These dags will also be used
by server instances spinning up.
DagBundles specify an (IdDag version, IdMap version) pair. This pair specifies a
prebuilt Segmented Changelog that is ready to be loaded.
Reviewed By: quark-zju
Differential Revision: D24096968
fbshipit-source-id: 413f49ed185a770a73afd17dfbc952901ab53b42
Summary:
This allows for more flexibility in structuring the code that wants to read all
the public changesets.
The usecase I have in mind is the SegmentedChangelog Seeder. The logic is
defined in the segmented_changelog crate. Constructing the Seeder is more
straightforward if it doesn't have to take a direct dependency on SqlPhases and
SqlChangesets.
Reviewed By: quark-zju
Differential Revision: D24096966
fbshipit-source-id: dffa909cd27d6c05d745fd0fe0609114a50f1892
Summary:
Some sort of serialization for the Dag is useful for saving the IdDag produced
by offline jobs and loading it when a Mononoke server starts.
Reviewed By: quark-zju
Differential Revision: D24096964
fbshipit-source-id: 5fac40f9c10a5815fbf5dc5e2d9855cd7ec88973
Summary:
Adding a simple `From` implementation to the error struct allows us to avoid
instantiating the variant explicitly.
Reviewed By: krallin
Differential Revision: D24161695
fbshipit-source-id: cb6e4c1e2cb21bd17ddff0df89a53d3f0604f562
Summary: Let's use more references when we can
Reviewed By: krallin
Differential Revision: D24161694
fbshipit-source-id: 5cf7edf843fa8dcf0d24ca114c3d520263d92a3b
Summary:
This change enables the filler job to work on all repos available instead of a
single one. We are still going to be able to dedicate the job to a certain repo
(by crafting a config with a single repo enabled) but we can put the entire
long tail for low-traffic repos under a single job.
This requires D24110335 to land in configerator to work.
Reviewed By: krallin
Differential Revision: D24136239
fbshipit-source-id: 4b77d1667c37cc55f11c3087b02a09dbae29db0f
Summary: Allow bookmark to be optional - again, will be used in the next diffs
Reviewed By: ahornby
Differential Revision: D24163608
fbshipit-source-id: e037731117181d0b1bbe4eb273301245142b507d
Summary: This functionality will be used in the next diffs.
Reviewed By: ahornby
Differential Revision: D24163517
fbshipit-source-id: 36e5c9646e21913f0e0d79d77dd11862f5aa5331
Summary:
This diff fixes how syncing of merge commits decides on the `CommitSyncConfigVersion` to use. The old, incorrect behavior just always used the current version from `LiveCommitSyncConfig`. The desired behavior is to reuse the version with which parent commits are synced, and manually sync commits when version changes are needed.
For merges it is more interesting, as merges have multiple parents. The overarching idea is to force all of the parents to have the same version and bail on the merge if this is not the case. However, that is an ideal, and we are not there yet, because:
- there are `NotSyncCandidate` parents, which can (and should at the moment) be safely excluded from the list of parents of the synced commit.
- there are `Preserved` parents (which will turn into the ones synced with a `noop` version)
- there are `RewrittenAs` and `EquivalentWorkingCopy` parents, which don't have an associated version.
So until the problems above are solved:
- absent `RewrittenAs`/`EquivalentWorkingCopy` versions are replaced with the current version
- `Preserved` merge parents cause merge sync to fail.
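These rules can be sketched directly (an illustrative Python model; the kind strings mirror the `CommitSyncOutcome` variants but this is not the production code):

```python
def merge_sync_version(parent_outcomes, current_version):
    """Sketch of the version-determination rules above. Each outcome is a
    (kind, version) pair."""
    versions = set()
    for kind, version in parent_outcomes:
        if kind == "NotSyncCandidate":
            continue  # safely excluded from the synced merge's parents
        if kind == "Preserved":
            raise ValueError("merge sync fails on Preserved parents")
        # RewrittenAs / EquivalentWorkingCopy may have no recorded version
        versions.add(version if version is not None else current_version)
    if len(versions) > 1:
        raise ValueError("parents were synced with different versions")
    return versions.pop() if versions else current_version
```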
Reviewed By: StanislavGlebik
Differential Revision: D24033905
fbshipit-source-id: c1c98b3e7097513af980b5a9f00cc62d248fc03b
Summary:
Our higher-level goal is to get rid of `CommitSyncOutcome::Preserved` altogether. This diff is a step in that direction. Specifically, this diff removes the creation of "accidental" Preserved commits: the ones where the hashes are identical, although a `Mover` of some version has been applied. There are a few sides to this fix:
- `get_commit_sync_outcome` now returns `Preserved` only when the source and target hashes are identical, plus the stored version is `None` (previously it would only look at hashes).
- `sync_commit_no_parents` now records the `Mover` version it used to rewrite the commit (previously it did not, which would sometimes create `Preserved` roots)
- there are now just two ways to sync commits as `Preserved`:
- `unsafe_preserve_commit` (when the caller explicitly asks for it). The idea is to only remove it once we remove the callers of this method, of course.
- `sync_commit_single_parent` when the parent is also `Preserved`. Note that automatically upgrading from `Preserved` parent to a rewritten changeset is incorrect for now: `Preserved` does not have an associated version by definition, so we would have to use a current version, which may corrupt the repo. Once we get rid of `Preserved`, this case will naturally go away.
- as we now have `update_mapping_with_version` and `update_mapping` (which consumes current version), we need to add explicit `update_mapping_no_version` for preserved commits we are still creating (again, recording a current version is a mistake here, same reason as above)
NB: I've added/changed a bunch of `println`s in tests, leaving them here, as they are genuinely useful IMO and not harmful.
Reviewed By: StanislavGlebik
Differential Revision: D24142837
fbshipit-source-id: 2153d3c5cc406b3410eadbdfca370f79d01471f9
Summary:
There's a bug in Thrift-py3 streaming support, where interrupting
iterating over an async stream leaves Thrift objects in a broken
state. Futures get dropped (and warnings are printed to the console)
but the destructors hang. Don't even try to garbage collect the heap
in that case.
Reviewed By: genevievehelsel
Differential Revision: D24032229
fbshipit-source-id: 5f16667fe6cfd27de1b39cc2974028729e18b214
Summary: Thrift-py3 supports streaming, so give the new client access to APIs from streamingeden.thrift.
Reviewed By: wez
Differential Revision: D24032144
fbshipit-source-id: 44f350b5cfa943154084b8d64f6c696e315e6b88
Summary: Eden mounts are identified by paths, not by strings, so fix the Thrift signature.
Reviewed By: wez
Differential Revision: D23774513
fbshipit-source-id: c0fb82c48eee5ce4e8fbffef5623f9016ef76e40
Summary:
We're seeing an issue on the hg servers where the filecache assertion
that if a value is in obj.__dict__ it's also in obj._filecache is broken. This
occurred about 10% of the time in sandcastle jobs. The diff that caused this
went in in April (D21148446 (73c5cb89de)), so it's unclear why it's only cropping up now.
This is caused by the following steps:
1. repo._bookmarks is accessed while _bookmarks is in the _filecache but not in
the __dict__
2. This causes construction of _bookmarks, before it can set it to __dict__
3. Construction of _bookmarks calls repo.invalidate(clearfilecache=True), which
deletes _bookmarks from _filecache.
4. _bookmarks construction completes, and gets set to __dict__ (but now it's
missing from _filecache, so the invariant will fail next time someone checks).
5. Someone accesses _bookmarks later, and the assertion fires.
The fix is to just not clear the filecache during bookmark construction. The
main purpose of this invalidate was to let the changelog be reloaded, and I
think that will still happen since, if there are any new commits in the
changelog, the file size and time will change, triggering a reload next time the
_filecache entry is checked.
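The five steps above can be reproduced in a toy model (this mirrors the description, not Mercurial's implementation; the invariant is that anything cached in `__dict__` must also be in `_filecache`):

```python
class RepoModel:
    def __init__(self):
        # step 1 precondition: _bookmarks is in _filecache but not in __dict__
        self._filecache = {"_bookmarks": "cache entry"}

    def load_bookmarks(self, clear_filecache_during_construction):
        # step 2: construction starts because __dict__ lacks _bookmarks
        if clear_filecache_during_construction:
            # step 3: invalidate(clearfilecache=True) drops the entry
            self._filecache.pop("_bookmarks", None)
        # step 4: construction completes and lands in __dict__
        self.__dict__["_bookmarks"] = ["mybook"]
        # step 5: the invariant that the assertion checks
        return "_bookmarks" in self._filecache
```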
Reviewed By: quark-zju
Differential Revision: D24182914
fbshipit-source-id: fb49137e28d9224c6617d9c84faaf2f9de363aaf
Summary:
Adds a subcommand to `eden debug prefetch_profile` to fetch all the files
for a list of profiles or all the active profiles for a checkout.
These profiles (files) can contain lists of file names, or globs of files, eden
will be able to resolve them since this uses the existing prefetch code.
I opted to put this command under `eden debug prefetch_profile` instead
of `eden prefetch` since the command may change to call into eden without
the list of files (letting eden resolve the active profiles and names). Then
it will no longer resemble prefetch and so long term it may be better homed
here.
Reviewed By: genevievehelsel
Differential Revision: D23771483
fbshipit-source-id: 12af81d40cc495efd381e3c3a2df645d72053ccd
Summary:
Before I make further changes to the Journal, improve the comments and
refactor a few small things.
Reviewed By: kmancini
Differential Revision: D24089530
fbshipit-source-id: de9da2c1e6b1c87b6587781cfa55ae7cc4085eeb
Summary:
Scanning through the functionality provided in ServiceFrameworkLight,
it looks like none of it really applies to the edenfs daemon, so break
the dependency entirely. Removing this complexity would have prevented
the regression where we stopped recording Thrift call statistics.
This should also improve our build times a bit, and maybe resource
consumption.
Reviewed By: genevievehelsel
Differential Revision: D24094784
fbshipit-source-id: fefd1a648c7ecba8484296527ff8100269c176b6
Summary:
Implement BlobstorePutOps for S3Blob. This uses is_present to check the various put behaviours.
While implementing this I noticed get_sharded_key could be updated to take a reference, so I did that as well.
Differential Revision: D24079253
fbshipit-source-id: 16e194076dbdb4da8a7a9b779e0bd5fb60f550a6
Summary: Now that fileblob and memblob support put behaviour logic, update the overwrite test to check the overwrite result.
Differential Revision: D24021167
fbshipit-source-id: d9578630205cf5d79999a459cc29481968d5717d
Summary: Update memblob to be PutBehaviour aware by changing implementation from Blobstore to BlobstoreOps
Differential Revision: D24021166
fbshipit-source-id: 04dd25c5535769ea507120c1886592b808a7bbc6
Summary: Update Memblob::new callsites to ::default() in preparation for adding arguments to ::new() to specify the put behaviour desired
Differential Revision: D24021173
fbshipit-source-id: 07bf4e6c576ba85c9fa0374d5aac57a533132448
Summary: Add put behaviour handling to fileblob so that it can prevent overwrites if requested.
Differential Revision: D23933228
fbshipit-source-id: 8e74ac96b232be841174f6ad2bd2fccf92aaa90d
Summary:
Add put behaviour to BlobstoreOptions in preparation for passing in the put behaviour through blobstore_factory.
Later in the stack a command line option is added to set this non-None so that we can turn on overwrite logging for particular jobs.
Reviewed By: StanislavGlebik
Differential Revision: D24021169
fbshipit-source-id: 5692e2d3912ebde07b0d7bcce54b79df188a9f16
Summary: Add a reliable, lightweight TraceBus class for publishing events to a background thread. Subscribers can be registered for observing events or computing telemetry about them.
Reviewed By: wez
Differential Revision: D23404525
fbshipit-source-id: 3539466421b0821ffb918ea862168d3cccd19b15
Summary: To be more Windows compatible, we should move away from using `sh -c`. We don't use shell=True because that is susceptible to shell injection. As long as we don't close stdin until we're done, using Popen raw should be fine.
Reviewed By: xavierd
Differential Revision: D24151425
fbshipit-source-id: c0bcc883af948491862e8ce0cee56bcbe98e04f1
Summary: Add a new "user" column to the EdenAPI server's Scuba data.
Reviewed By: krallin
Differential Revision: D24153765
fbshipit-source-id: 95a3177d9283e5b0e3f47e7f42a1de5142049b99
Summary: Add a method to get the client's username from a client identity. This is helpful for logging, etc.
Reviewed By: krallin
Differential Revision: D24153766
fbshipit-source-id: 2ecf85e5de72918aeb292ce88539a991da4de900
Summary: Allow source control service clients to set pushvars.
Reviewed By: krallin
Differential Revision: D24136870
fbshipit-source-id: 34f9176ec66ca319b363c91015dae9b59a55a244
Summary:
Add the possibility of setting `pushvars` when deleting bookmarks. This makes
it consistent with the other write operations.
Reviewed By: krallin
Differential Revision: D24136869
fbshipit-source-id: f98b74c6c731e50552184000ad697d04748711fd
Summary:
Previously all mutable_counters reads were going to leader. This might be
useful for some cases, but in the case of blobimport warmer this opens too
many connections to sql leader.
Let's read from replica instead
Reviewed By: krallin
Differential Revision: D24160315
fbshipit-source-id: 0cecde3c54a00bbea215a1e0fa63d4a7c3dc9eaa
Summary:
This seems to have broken as a result of a land race between D23999804 (6421dca639) and
D23455274 (bdff69b747). Let's fix it.
Reviewed By: ikostia
Differential Revision: D24158809
fbshipit-source-id: 1d733e2c93eb8a0803395d409fbb15e2e2146bdd
Summary: Adds version of `bounded_traversal_stream` where unfold returns a stream over children instead of an iterator. This function also applies back pressure on children iteration when we have too many unscheduled items.
Reviewed By: krallin
Differential Revision: D23931035
fbshipit-source-id: 2e2806653782d4e646dcdf4b2d4e624fd6543da8
Summary:
Our stdin/stdout bytes/str manipulations caused input() to print
warnings about buffered not being supported. The only reason we do those
manipulations is to handle the case in tests where the prompt answer doesn't come
from stdin, so let's just handle that case via readline instead of prompt.
This is what upstream Mercurial does.
Reviewed By: quark-zju
Differential Revision: D24122909
fbshipit-source-id: ab9d989a66d39990b688c65a1fae80bd48b0f42e
Summary: Add `--debug` flag to `read_res cat` command for debug printing entire entry rather than just the data blob.
Reviewed By: kulshrax
Differential Revision: D23999804
fbshipit-source-id: 6955854edab2643cffbe5fae484a398716b48055
Summary:
Initial support for a backend using edenapi.
Note this is just the first step. Most code paths are not updated to the
streaming API to get commit data, so they will error out with commit not found
errors.
Confirm that commit data can be fetched via edenapi:
$ RUST_LOG=debug lhg dbsh --config experimental.lazy-commit-data=1
In [1]: master= repo['master'].node()
In [3]: cl.inner.getcommitrawtext(master) is None
Out[3]: True
In [4]: s=cl.inner.streamcommitrawtext(repo.nodes('master~10::master'))
In [5]: it=iter(s)
...
[2020-09-25T02:09:16.793Z DEBUG hgcommits::hybrid] >> resolve_remote input=[e6c4e203b66f1416e08dc597a2d63b91e93b1466, bfb610989e9dd701e785b4a3a5998e76d9709cab, 68bbfc79602a153895b761089e9479dd8fa33351, 5366fe39ad538463abae6c648eb5150bbb79d4c7, 5ea45d8ab0f8203837ca1736f36ded4a492571b4, 722da0a32eae12de5e85078beea2ae4b7aafe4a4, 4dbe3eab10d13b30697e1762eb7b9ff3ad0cf630, 430ae91aab8028b6572ccef89f8396dafec622c4, 5abd96c5420f0d512c63e768f8cea83f1c6691c9, c84ab3412cebfade730e95a1bc5ebc9b1dd0747b, 790ed2d40e4a0b08fb22fe9b4246fec0165f8a87]
[2020-09-25T02:09:16.793Z DEBUG hgcommits::hybrid] << resolve_remote input=[e6c4e203b66f1416e08dc597a2d63b91e93b1466, bfb610989e9dd701e785b4a3a5998e76d9709cab, 68bbfc79602a153895b761089e9479dd8fa33351, 5366fe39ad538463abae6c648eb5150bbb79d4c7, 5ea45d8ab0f8203837ca1736f36ded4a492571b4, 722da0a32eae12de5e85078beea2ae4b7aafe4a4, 4dbe3eab10d13b30697e1762eb7b9ff3ad0cf630, 430ae91aab8028b6572ccef89f8396dafec622c4, 5abd96c5420f0d512c63e768f8cea83f1c6691c9, c84ab3412cebfade730e95a1bc5ebc9b1dd0747b, 790ed2d40e4a0b08fb22fe9b4246fec0165f8a87]
...
[2020-09-25T02:09:16.958Z DEBUG zstore::zstore] >> Zstore::contains id=3422a85c3703dd0bf0030d5d4c1bb65775adff90
[2020-09-25T02:09:16.958Z DEBUG zstore::zstore] << Zstore::contains id=3422a85c3703dd0bf0030d5d4c1bb65775adff90
[2020-09-25T02:09:16.958Z DEBUG zstore::zstore] >> Zstore::insert data_len=1010 id=3422a85c3703dd0bf0030d5d4c1bb65775adff90
[2020-09-25T02:09:16.958Z DEBUG zstore::zstore] << Zstore::insert data_len=1010 id=3422a85c3703dd0bf0030d5d4c1bb65775adff90
...
[2020-09-25T02:09:16.959Z INFO zstore::zstore] >> Zstore::flush
[2020-09-25T02:09:16.959Z DEBUG indexedlog::log] >> Log::sync dirty_bytes=7048
[2020-09-25T02:09:16.959Z DEBUG indexedlog::log] << Log::sync dirty_bytes=7048
[2020-09-25T02:09:16.959Z INFO zstore::zstore] << Zstore::flush
In [6]: list(it)
Out[6]: [...]
The logs about `hgcommits::hybrid ... resolve_remote` show that the remote fetching
is working. The logs about `Zstore::insert` and `Zstore::flush` show that the
commit data was written to disk.
Reviewed By: sfilipco
Differential Revision: D23924148
fbshipit-source-id: a3d77999e29395ce5c603fe66412936947456534
Summary:
Support constructing the "hybrid" commits backend, which is similar to
"doublewrite" but read commit text from edenapi via the `streamcommitrawtext`
method.
Reviewed By: sfilipco
Differential Revision: D23924149
fbshipit-source-id: cb15ee4be7953af7798d460557ba2ae2d4f24a52
Summary:
The hybrid backend is similar to the doublewrite backend, except that it does
not use revlog to read commit data, but uses EdenAPI instead.
Note:
- The non-stream API will not fetch commit data from EdenAPI.
- The commit hashes are not lazy yet.
Reviewed By: sfilipco
Differential Revision: D23924147
fbshipit-source-id: eb2cf8d3a7e1704b4efb13ad3ad86f8b6a1b31d0
Summary:
This can be used like:
In [1]: s=cl.inner.streamcommitrawtext(repo.nodes('.%%master')) # repo.nodes returns a generator, becomes stream
In [2]: s
Out[2]: <stream at 0x7f5eec742df0>
In [3]: list(s)
Out[3]: [{'vertex': ..., 'raw_text': ...}, ...]
In [4]: s.typename()
Out[4]: 'cpython_ext::convert::Serde<hgcommits::ParentlessHgCommit>'
Reviewed By: sfilipco
Differential Revision: D23911870
fbshipit-source-id: f54959a551d446ed5b8086a2235fe74e47b29e70
Summary:
This makes it convertible to `PyObject` via `cpython_ext::convert::Serde`
without additional code or dependencies.
Reviewed By: sfilipco
Differential Revision: D23966993
fbshipit-source-id: 74d83524a7c0701cde7aa6d61bb930ff4a1c90f5
Summary:
This API allows us to stream the data. If callsites only use this API, we'll
be more confident that there are no 1-by-1 fetches.
Reviewed By: sfilipco
Differential Revision: D23911865
fbshipit-source-id: 4c7dd8c2b5be33be5a55822845d55345797bacdf
Summary:
The API is basically to resolve `input_stream` to `output_stream`, with a
stateful "resolver" that can resolve locally and remotely.
Reviewed By: sfilipco
Differential Revision: D23915775
fbshipit-source-id: 14a3a37fc897c8229514acac5c91c7e46b270896
Summary:
Introduce `FileMetadata` and `DirectoryMetadata` to `Treeentry`, along with corresponding request API.
Move `metadata.flags` to `file_metadata.revisionstore_flags`, as it is never populated for trees. Do not use `metadata.size` on the wire, as it is never currently populated.
Leaving `DirectoryMetadata` commented out temporarily because serde round trips fail for unit structs. It is re-introduced with fields in the next change in this stack.
Reviewed By: DurhamG
Differential Revision: D23455274
fbshipit-source-id: 57f440d5167f0b09eef2ea925484c84f739781e2
Summary:
EdenAPI always checks the integrity of filenode hashes before returning file data to the application. In the case of LFS files, this resulted in errors because the filenode hash is computed using the full file content, but the blob from the server only contains an LFS pointer.
Fix the bug by exempting LFS blobs from filenode integrity checks. (If integrity checks for LFS blobs are desired, the LFS code should be able to do this on its own since LFS blobs are content-addressed.)
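The exemption can be sketched in Python (a hypothetical helper, not the actual Rust implementation; the prefix shown is the standard git-lfs pointer header):

```python
# An LFS pointer file starts with this version line per the git-lfs spec.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"

def should_verify_filenode(blob: bytes) -> bool:
    # Skip filenode hash verification when the server returned an LFS
    # pointer: the filenode is computed over the full file content,
    # which the pointer blob does not contain.
    return not blob.startswith(LFS_POINTER_PREFIX)
```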
Reviewed By: quark-zju
Differential Revision: D24145027
fbshipit-source-id: d7d86e2b912f267eba4120d1f5186908c3f4e9e3
Summary:
`cpython_ext` provides utilities to implement From/ToPyObject directly for
serde types. Let's use it to simplify the code and set up an example.
debugshell:
In [2]: s,f=api.commitdata(repo.name, list(repo.nodes('master')))
In [3]: list(s)
Out[3]:
[{'hgid': (7, 61, 22, ...), 'revlog_data': '...'}]
Note: `HgId` serialization should probably be changed to use `serde_bytes` somehow
so it does not translate to a Python tuple. That will be fixed later.
Reviewed By: kulshrax
Differential Revision: D23966987
fbshipit-source-id: 9278ccae6f543c387eafe401d4ef8d6ce96d370f
Summary:
This can be used to automate Python/Rust conversions for complex structures
like `CommitRevlogData`.
Reviewed By: kulshrax
Differential Revision: D23966988
fbshipit-source-id: 17a19d38270e6ef0952c13a1cd778487e84a94ff
Summary:
The goal is to implement `FromPyObject` and `ToPyObject` more easily.
Today crates have to depend on `cpython` to implement `From/ToPyObject`,
which is somewhat unwanted for pure Rust crates.
The `ser` module used to ignore the `variant` field for non-unit enum variants.
This has been fixed so the serialized value can be deserialized correctly.
For example, `enum E { A, B(T) }` will be serialized to `"A"` for `E::A`, and
`{"B": T}` for `E::B`.
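The convention can be mirrored in a small Python sketch (a hypothetical helper, not the actual Rust `ser` module):

```python
def serialize_variant(name, *payload):
    # Unit variants serialize to just their name; data-carrying variants
    # serialize to a single-key map from the variant name to its payload.
    if not payload:
        return name              # E::A -> "A"
    return {name: payload[0]}    # E::B(t) -> {"B": t}
```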
Reviewed By: kulshrax
Differential Revision: D23966994
fbshipit-source-id: c50d57bf313caeec65a604ed9b05a5729f3b3635
Summary:
Switch from the default tuple deserialization which only understands the tuple
format, to "bytes" deserialization, which understands not only the existing
"tuple" format (therefore compatible with old data), but also "bytes" and "hex"
formats (for CBOR).
This will unblock us from switching to bytes serialization in the future.
Note: This is a breaking change for mincode serialization. Mincode + HgId users
(zstore, metalog) have switched to explicit tuple serialization, so they don't
use the default deserialization and remain unaffected.
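A Python analogue of the widened deserializer might look like this (hypothetical helper; the actual change is in the Rust `HgId` Deserialize impl):

```python
def decode_hgid(value):
    """Accept the legacy tuple format, raw bytes, or a hex string
    (e.g. from CBOR), and normalize to raw bytes."""
    if isinstance(value, (tuple, list)):      # legacy tuple of ints
        return bytes(value)
    if isinstance(value, (bytes, bytearray)): # bytes format
        return bytes(value)
    if isinstance(value, str):                # hex format
        return bytes.fromhex(value)
    raise TypeError("unsupported HgId encoding: %r" % (value,))
```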
Reviewed By: kulshrax
Differential Revision: D23966995
fbshipit-source-id: 83dd53f57bd4e6098de054f46a1d47f8b48133d0
Summary: This will unblock us from switching HgId to bytes serialization by default.
Reviewed By: kulshrax
Differential Revision: D24009039
fbshipit-source-id: a277869ec24652af428cda581faffa62c25d32c4
Summary: Similar to D23966992 (2a2971a4c7), add support to serialize Key differently.
Reviewed By: DurhamG
Differential Revision: D24009041
fbshipit-source-id: 2ecf1610b989a04083196d180bc62307b5162c2f
Summary: Similar to D23966992 (2a2971a4c7), add support to serialize Sha256 differently.
Reviewed By: DurhamG
Differential Revision: D24009040
fbshipit-source-id: b77f6732802f95507e1540f0bbde4d5a92d13cac
Summary:
Add a way to specify different merge tools for interactive and non-interactive
mode.
This will be used for the default `editmerge` merge tool, which pops up the
`EDITOR` (vim) regardless of interactive mode, causing various user complaints
and hangs, including `arc pull` or VS Code running a rebase that triggers the
editor, plus some other rebase hangs and vim errors.
Reviewed By: DurhamG
Differential Revision: D24069105
fbshipit-source-id: ec16fdc704cab6daeedb0c23d4028b4309d96d3f
Summary:
This diff makes it so that pushrebase fails if it tries to rebase over a commit
with the extra "failpushrebase" set. If a client runs into this issue then they
need to do a manual rebase.
Differential Revision: D24110709
fbshipit-source-id: 82cd771c92b9fb45f4fa8794b2c736f08ac900b1
Summary:
This is the first part of allowing us to update Mononoke blobstore put behaviour to optionally a) log when it is overwriting keys, and b) not overwrite existing keys.
Introduce BlobstorePutOps for blobstore implementations so we can track the overwrite status of a put, and force an explicit PutBehaviour if required. It's intended that only blobstore implementation code and special admin tooling will need to access BlobstorePutOps methods.
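The idea can be sketched in Python (`PutBehaviour` comes from the summary; the rest is a hypothetical analogue of the Rust trait, not its real API):

```python
from enum import Enum

class PutBehaviour(Enum):
    OVERWRITE = "overwrite"   # default: always write
    IF_ABSENT = "if_absent"   # do not clobber existing keys

class OverwriteStatus(Enum):
    NEW = "new"               # key did not exist before
    OVERWROTE = "overwrote"   # key existed and was replaced
    PREVENTED = "prevented"   # key existed; write was skipped

def put_with_behaviour(store: dict, key, value, behaviour):
    # Track the overwrite status of every put so it can be logged.
    if key in store:
        if behaviour is PutBehaviour.IF_ABSENT:
            return OverwriteStatus.PREVENTED
        store[key] = value
        return OverwriteStatus.OVERWROTE
    store[key] = value
    return OverwriteStatus.NEW
```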
Reviewed By: farnz
Differential Revision: D24021168
fbshipit-source-id: 56ae34f9995a93cf1e47fbcfa2565f236c28ae12
Summary:
This passes `--tmpdir` option to `~/fbcode/eden/scm/tests/run-tests.py`
so it's predictable where, for example, Mononoke's logs will be.
Some time ago I was debugging a hanging test. It was very annoying that I couldn't specify the tmpdir manually. It also wasn't printed out (it's only printed out with `--keep-tmpdir` **after** the test finishes).
Now it is possible to specify it.
Reviewed By: krallin
Differential Revision: D24137737
fbshipit-source-id: 6280832517b48ece9b65e443c236035e385efea6
Summary:
This diff adds two things:
- the ability to compute the reverse of a `CommitSyncDataProvider::Test`, useful when creating both small-to-large and large-to-small `CommitSyncer` structs in tests
- the ability to set a current `CommitSyncConfigVersion` in the provider, which can also be useful when simulating current version changes.
NB: I ended up not needing the set-version functionality in my tests (further in the stack), so I could remove it, but I do think it will prove useful eventually.
Reviewed By: StanislavGlebik
Differential Revision: D24103206
fbshipit-source-id: 389169b2984684d83b0f6fdeb3be597d84cc0f12
Summary: Remove unnecessary clone in packblob along with the Clone constraint on the inner blobstore.
Reviewed By: krallin
Differential Revision: D24109293
fbshipit-source-id: b47e68e63b6ffda95d28d974ed6883e4ae31b3a1
Summary:
The `hg unhide` command acquired the repo lock without acquiring the wlock.
This caused locking order problems, as it called other parts of the code that
acquire the `wlock` (such as autopull during revset resolution) while it
was already holding the `lock`.
This can cause `hg unhide` to deadlock with other `hg` commands that acquire
`wlock` before `lock`.
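The ordering rule can be illustrated with a toy Python sketch (hypothetical stand-ins, not the actual hg lock implementation):

```python
order = []

class FakeLock:
    # Records acquisition order; stands in for hg's wlock/lock.
    def __init__(self, name):
        self.name = name
    def __enter__(self):
        order.append(self.name)
        return self
    def __exit__(self, *exc):
        return False

def unhide_fixed():
    # Correct order: wlock (working copy) before lock (store), matching
    # every other command, so two concurrent commands cannot deadlock
    # by each holding one lock while waiting for the other.
    with FakeLock("wlock"), FakeLock("lock"):
        pass  # resolve revsets (may autopull), then unhide
```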
Reviewed By: kulshrax
Differential Revision: D24129559
fbshipit-source-id: cf31ec661123df329f1773d2b67deb474d6476f8
Summary:
Similarly to how we could try invalidating a file that isn't cached, we could
also be trying to invalidate a file whose path isn't cached. Both are
legitimate, and thus we need to ignore both.
Reviewed By: chadaustin
Differential Revision: D24125225
fbshipit-source-id: e8abe5cde5aa3602bb48258abb64aa0cdf60241d
Summary:
Thrift represents the `binary` data type as `std::string` in C++. This method
will help us convert `Hash` into a byte string.
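In Python terms, the conversion amounts to the following (the real method is C++; this only illustrates the hex-to-raw-bytes mapping):

```python
def hash_to_thrift_binary(hex_hash: str) -> bytes:
    # A 40-character hex SHA-1 becomes 20 raw bytes, which Thrift
    # carries in its `binary` type (std::string in the generated C++).
    return bytes.fromhex(hex_hash)
```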
Reviewed By: xavierd
Differential Revision: D24083621
fbshipit-source-id: ae50088db7727d98ca11a017f82b71e942217a17
Summary:
This diff adds a new constructor to `SqliteDatabase` to allow creating an
in-memory SQLite database. This can come in handy in testing.
Reviewed By: xavierd
Differential Revision: D24083579
fbshipit-source-id: ad6dd8b1c20392a882c1f164ef1f8af2f0ba11f8
Summary:
This allows `edenfsctl debug processfetch` to display what processes triggered
some IO in EdenFS, which will be useful for debugging rogue processes walking
the entire repo.
Reviewed By: chadaustin
Differential Revision: D23997665
fbshipit-source-id: 7d92755d0068a4b1819eb0c84b30cbdaa24296f7
Summary:
This will enable us to gather a bit more debugging information regarding which
processes are fetching data. The one missing bit on Windows is collecting the
process name; for now, a "NOT IMPLEMENTED" placeholder is put in place.
Reviewed By: wez
Differential Revision: D23946258
fbshipit-source-id: 9f7642c7b9207c5b48ffff0f4eb0333af00bc7d5
Summary: Instead of returning an error upon receiving an empty request, just return a `Fetch` object that does nothing. This prevents Mercurial from crashing in situations where an empty request somehow makes it to the EdenAPI remote store.
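A minimal Python sketch of the guard (hypothetical names; the real code is the Rust EdenAPI remote store):

```python
class Fetch:
    # Stand-in for the fetch handle returned to the store.
    def __init__(self, items):
        self.items = list(items)
    def __iter__(self):
        return iter(self.items)

def do_request(keys):
    # Placeholder for the real HTTP request (assumed).
    return [(k, b"data") for k in keys]

def fetch_keys(keys):
    if not keys:
        # Empty request: return a no-op Fetch instead of erroring.
        return Fetch([])
    return Fetch(do_request(keys))
```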
Reviewed By: quark-zju
Differential Revision: D24119632
fbshipit-source-id: cf4ec707b4097656c76d7084a55b2d0b3150b679
Summary:
Previously, EdenAPI was using `remotefilelog.debug` to determine whether to print things like download stats. Let's give EdenAPI its own `debug` option that can be configured independently of remotefilelog.
One notable benefit of this change is that download stats will always be printed immediately after the HTTP request completes. This can help rule out network or server issues in situations where Mercurial appears to be hanging during data fetching. (e.g., if hg had downloaded all of the data but was taking a while to process it, the debug output would show this.)
Reviewed By: DurhamG
Differential Revision: D24097942
fbshipit-source-id: bf9b065e7b97fc7ffe50ab74b1b13e2fe364755c
Summary: HostInfoProperties is allocated for every HostInfo and is accessed on every request. There's no reason this should be a unique_ptr, and the pointer indirection is expensive.
Reviewed By: jmswen
Differential Revision: D24009296
fbshipit-source-id: 2034d1c6e61e0dec51ca6ac7bd14ab12e74966d4
Summary:
Previously phase calculation was done via a simple ancestor check. This
was very slow in cases that required going far back into the graph. Going a year
back could take a number of seconds.
To fix it, let's take the Rust phaseset logic and rework it to make only_both
produce an incremental public nodes set. In a later diff we can switch the
phaseset function to use this as well, but right now phaseset returns IdSet, and
that would need to be changed to Set, which may have consequences. So I'll do it
later.
Reviewed By: quark-zju
Differential Revision: D24096539
fbshipit-source-id: 5730ddd45b08cc985ecd9128c25021b6e7d7bc89
Summary:
This is one more fix to use the correct commit sync config version. In particular,
this diff fixes a case where a single parent commit was rewritten out, e.g.
when a large repo commit touches only files that do not remap into the small
repo. In that case we still want to record the correct mapping so that all
descendants use the correct mapping as well.
Reviewed By: ikostia
Differential Revision: D24109221
fbshipit-source-id: bcdbb01b964d70227dff8363e77964716a345261
Summary:
Let's move initialization into a separate function. I'm planning to use it in
the next diff for another test.
Reviewed By: ikostia
Differential Revision: D24109222
fbshipit-source-id: 73142dd46ef3de15ff381670ed6d5e31653c5dd4
Summary:
Previously fetch_bonsai_range returned all commits between `ancestor` and
`descendant`, but `ancestor` was included. This is usually not what we want:
it might be surprising and can lead to subtle bugs. As an example, the next
commit in the stack might have had failing pushrebases when it shouldn't.
This diff changes the semantics of the function to exclude the ancestor. This
function was used for 2 use cases:
1) Find changed files. The find_rebased_set function was manually removing the
ancestor anyway, so there's no change in behaviour
2) To check that there are no case conflicts. Previously we were checking the
case conflicts with ancestor included, but that wasn't necessary. To prove that
let's go over the two possible situations:
i) This is a first iteration of the pushrebase
```
CB
SB |
| ...
... CA
SA
| /
root
```
In that case the files introduced by the root commit would be used to check
whether we have case conflicts or not. But this is not necessary, because the
pushrebase assumption is that CA::CB should not introduce any new case
conflicts. Besides, even if they did add a case conflict, checking against just
the files changed by the root commit would not be enough to verify that.
Similar logic applies to the SA::SB commits: checking whether root has any
conflicts with SA::SB commits doesn't make sense.
ii) This is not the first iteration of the pushrebase
```
CB
SB |
| ...
... CA
SA
|
O <- latest pushrebase attempt
... <- we rebased over these commits on the previous attempts
| /
root
```
In this case it's even easier. Commit O was verified on the previous iteration,
so no need to add it here again.
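As a toy illustration of the new semantics (hypothetical helper over a first-parent map, not Mononoke's actual fetch_bonsai_range):

```python
def fetch_range_exclusive(parents, ancestor, descendant):
    # Walk first-parent links from descendant up to, but excluding,
    # ancestor -- i.e. the revset `ancestor::descendant - ancestor`.
    out = []
    node = descendant
    while node != ancestor:
        out.append(node)
        node = parents[node]
    return out
```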
Reviewed By: aslpavel
Differential Revision: D24110710
fbshipit-source-id: 90dff253cba0013e9d5e401474132a152d473cae
Summary:
The SpawnedProcess tests were failing on my macOS machine because pwd
and getcwd returned slightly different paths. Normalize them before
comparing.
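A Python equivalent of the normalization (assumed approach; the actual tests are C++): resolve symlinks on both sides before comparing, since on macOS `/tmp` is a symlink to `/private/tmp`.

```python
import os

def paths_equal(a: str, b: str) -> bool:
    # realpath resolves symlinks and redundant separators, so e.g.
    # /tmp and /private/tmp compare equal on macOS.
    return os.path.realpath(a) == os.path.realpath(b)
```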
Reviewed By: genevievehelsel
Differential Revision: D24094634
fbshipit-source-id: aacf802280b1dd1de19797604bfe359d7e60cbf8
Summary:
A couple of files were moved but test-check-code.t wasn't updated to reflect
this, causing it to fail.
Reviewed By: DurhamG
Differential Revision: D24113079
fbshipit-source-id: 9a0c0b6f07a6532715bf5ee401036ded0a05b16a
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/65
Using $LOCALIP will ensure more consistent behavior when setting up the server with IPv4 or IPv6.
The LOCALIP variable was also abused when it was used to override the ssh client address, so the SSH_IP_OVEERIDE env variable was created here.
Lastly, the result of the `curl` call is now printed whenever the test fails, to verify that Mononoke is running.
Reviewed By: farnz
Differential Revision: D24108186
fbshipit-source-id: e4b68dd2c5dd368851f0b00064088ffc442e31e0
Summary: D24070707: `[Thrift] Provide sorted fields to read_field_begin` made a change to the generated rust thrift files, so the eden/scm thrift files have to be regenerated to fix the build.
Reviewed By: farnz
Differential Revision: D24109655
fbshipit-source-id: e8575a76642673a11514fdce8e30f13ca28151f0
Summary:
Normally, sync logic infers the `CommitSyncConfigVersion` to use from parent commits (or from the current version for root commits). However, for test purposes it is convenient to force a version override. This logic does not change any of the production behaviors, and will be used in a later diff.
TODO: can it ever be needed beyond tests? I've thought about using this for "version boundary" commits, but those would probably just be constructed while completely bypassing the sync logic.
TBH, I am not certain this diff is a good change. But I've spent a very large amount of time crafting the repos used in the `sync_merge` tests later in this stack, so I am proposing to land this, then spend some time refactoring sync tests (and hopefully making it easier to craft test repos), then removing this logic. Obviously, this logic should only be landed if we land the tests in the first place.
Reviewed By: StanislavGlebik
Differential Revision: D24104101
fbshipit-source-id: 0825f04ed74532e89fd5f1fbebeee5f2001fedcd
Summary: It is sometimes very convenient to just inject new DAGs into existing repos.
Reviewed By: StanislavGlebik
Differential Revision: D24103164
fbshipit-source-id: abdfa18acb2f2fb1475b601a7eccb57e006982ec