Summary: Mercurial Infinitepush normally records received bundles into the `forwardfillerqueue`, which is later tailed by the commit cloud `forwardfiller` in order to be replayed onto Mononoke. Now I am adding a reverse filler, which means that we will have two such queues and sync in two directions. Therefore, in order to avoid infinite loops, we need to distinguish cross-backend bundle replay from genuine pushes. I propose to use the `crossbackendsync` bundle2 param to indicate that no recording is needed.
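A minimal sketch of the intended decision, assuming the bundle2 params are available as a plain dict. The helper names and the list-based queue are hypothetical and only illustrate the idea; they are not the actual infinitepush code:

```python
# Name of the bundle2 param proposed in this diff.
CROSS_BACKEND_SYNC_PARAM = "crossbackendsync"


def should_record_bundle(bundle_params):
    """Return True if a received bundle should be recorded in the filler queue.

    Hypothetical helper: a bundle carrying the cross-backend param is a replay
    from the other backend, so recording it again would create a loop.
    """
    return CROSS_BACKEND_SYNC_PARAM not in bundle_params


def receive_bundle(bundle_id, bundle_params, queue):
    """Toy receive path: `queue` stands in for the filler queue table."""
    if should_record_bundle(bundle_params):
        queue.append(bundle_id)
    return queue
```

With this shape, a genuine push is queued for replay, while a bundle that was itself replayed from the other backend is accepted but not queued.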
Reviewed By: krallin
Differential Revision: D21255446
fbshipit-source-id: 70f6efe1331bd3c7fd3aca61d486d350d93086dc
Summary:
Previously we weren't showing help at all when running "hg cloud --help". This
diff should fix it.
Reviewed By: ikostia
Differential Revision: D21347825
fbshipit-source-id: 1d9d11e2f9fe18a03b5d2cd8bd316fe9a218347c
Summary:
In VS Code, if you have a well-formed URL you can command-click it
to open it, e.g., when you see it in your terminal. But this URL is not
well-formed because it's missing the protocol. So let's add it.
Reviewed By: quark-zju
Differential Revision: D21336742
fbshipit-source-id: dd2de2d5177d3a2542c4a22f0099f28b97c79d06
Summary: If the bundle file is 1G in size, we probably don't want to waste 1G of memory just to see that it is too big.
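A small sketch of the idea, assuming the bundle lives in a regular file on disk. `bundle_too_big` and `max_size` are illustrative names, not the ones used in the actual change:

```python
import os


def bundle_too_big(path, max_size):
    """Check the bundle size via file metadata instead of reading the file.

    os.path.getsize asks the filesystem for the size, so the file contents
    are never loaded into memory.
    """
    return os.path.getsize(path) > max_size
```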
Reviewed By: markbt
Differential Revision: D21177359
fbshipit-source-id: 026b5ab40e6dbddba3d7142a8e34256d127bf82c
Summary:
Building on the previous two commits, this adds a test which performs the following steps:
- does an infinitepush push to a Mercurial server
- looks into the `forwardfillerqueue`
- runs commitcloud forwardfiller's `fill-one` to replay the bundle to Mononoke
- verifies that this action causes the commit to appear in Mononoke
As this test uses `getdb.sh` from Mercurial test suite, it needs to be whitelisted from network blackholing (Note: we whitelist mononoke tests, which run with `--mysql` automatically, but this one is different, so we need to add it manually. See bottom diff of the stack for why we don't use `--mysql` and ephemeral shards here).
Reviewed By: krallin
Differential Revision: D21325071
fbshipit-source-id: d4d6cbdb10a2bcf955ee371278bf2bbbd5f5122c
Summary:
Cloning a repository with selectivepull enabled temporarily disables
selectivepull for the duration of the initial pull so that we can get a
streaming clone.
Once this is done, hide all of the commits in the repository by clearing the
visible heads. Selective pull will then populate the remote bookmarks with the
public heads that we do want.
Reviewed By: quark-zju
Differential Revision: D21301037
fbshipit-source-id: 565ae50439ed5405ce940a5675caeba912fe7083
Summary: If a scratch remotebookmark points to a draft commit that is not available locally (i.e., has been hidden on a different machine), then don't pull it into the local repository. Instead, just omit the remote bookmark.
Reviewed By: quark-zju
Differential Revision: D21287886
fbshipit-source-id: 84e7c6b52250709f7c88b07fccdbb61e044370c8
Summary:
Selective pull subscriptions are lost during a push. This is because
`remotenames.expush` calls `selectivepullbookmarknames` with the
remote, not the remote name.
`remotenames.exclone` does the same, but in that case there is
no need for the remote name as the destination can't have any
subscriptions other than the default set.
Reviewed By: quark-zju
Differential Revision: D21283029
fbshipit-source-id: 52c2e7e301b8e95b4a252cbfe2ad9de168e81044
Summary:
Commit cloud remote bookmarks sync has several edge cases that don't work correctly.
This test demonstrates some that we will fix.
Reviewed By: quark-zju
Differential Revision: D21283028
fbshipit-source-id: 4e746476a0f3bf0ca7d7088f510451632a2ee075
Summary: Fix permission issues we are seeing with the latest Mercurial release.
Reviewed By: xavierd
Differential Revision: D21294499
fbshipit-source-id: bcfb13dd005258b2e3b74fa281dbd8df36133ef6
Summary: `termios` is not available on Windows. Do not import it.
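One common way to guard such an import is sketched below. This is an illustration of the pattern only; the actual change may instead check the platform before importing:

```python
try:
    # termios is part of the standard library on POSIX platforms only.
    import termios
except ImportError:
    # On Windows the module does not exist; callers must handle None.
    termios = None


def has_termios():
    """Return True if terminal attribute handling via termios is possible."""
    return termios is not None
```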
Reviewed By: DurhamG
Differential Revision: D21258999
fbshipit-source-id: f4390b69fe9abceea8b1959e7506c1558778f980
Summary:
I wanted to figure out "who added this visible head", "what is the difference
between this metalog root and that root". Those are actually source control
operations (blame, diff). Add a git export feature so we can export metalog
to git to run those queries.
Choosing git here as we don't have native Rust utilities to create a more
efficient hg repo yet.
Ideally we can also make hg operate on a metalog directory as a "metalogrepo"
directly. However that seems to be quite difficult right now due to poor
abstractions.
Reviewed By: DurhamG
Differential Revision: D21213073
fbshipit-source-id: 4cc0331fbad6e1586907c0a66c18bcc25608ea49
Summary:
This would avoid issues where the Phabricator commit hash points
to a filtered commit locally. There are 2 user complaints already.
Reviewed By: DurhamG
Differential Revision: D21180833
fbshipit-source-id: 7374e236bcf5e3f3e62bae59fa53604869e22c6f
Summary:
This makes metalog easier to use in debugshell context. For example, to
investigate the "bookmarks" in the past, the code gets simplified from:
roots = b.metalog.metalog.listroots(repo.svfs.join('metalog'))
past_ml = b.metalog.metalog(repo.svfs.join('metalog'), roots[10])
past_ml.get("bookmarks")
to:
roots = ml.roots()
past_ml = ml.checkout(roots[10])
past_ml.get("bookmarks")
Reviewed By: DurhamG
Differential Revision: D21162568
fbshipit-source-id: 7cc5581afe596a3d2696311a36ac11caa718428a
Summary: This uploads more details that might be useful for debugging.
Reviewed By: DurhamG
Differential Revision: D21179514
fbshipit-source-id: ba24395fc4fd153c3ceb4957f822aab70821e6ef
Summary: This allows the Python world to obtain the root ID for logging purpose.
Reviewed By: DurhamG
Differential Revision: D21179513
fbshipit-source-id: 3f289c06d3d470ff492de39fa985203b3facbf00
Summary:
This would hopefully give us more insights about when and what host uploads
10k+ remote bookmarks.
Reviewed By: DurhamG
Differential Revision: D21179515
fbshipit-source-id: 3e5c5559e2e739268add05e40f71bea08c29662f
Summary:
We removed the feature in D20704618 and it does not cause complaints.
Let's remove the code supporting the chown feature.
Reviewed By: DurhamG
Differential Revision: D21170307
fbshipit-source-id: c845016219e8c681930bb1780b94e6d31ca99730
Summary:
While the change looks fairly mechanical and simple, the why is a bit tricky.
If we follow the calls of `ContentStore::get`, we can see that it first goes
through every on-disk store, and then switches to the remote ones. Thanks to
that, when we reach the remote stores there is no reason to believe that the
local store attached to them contains the data we're fetching. Thus the
code used to always prefetch the data, before reading from the store what was
just written.
While this is true for regular stores (packstore, indexedlog, etc), it starts
to break down for the LFS store. The reason being that the LFS store is
internally represented as 2 halves: a pointer store, and a blob store. It is
entirely possible that the LFS store contains a pointer, but not the actual
blob. In that case, the `get` executed on the LFS store will simply return
`Ok(None)` as the blob just isn't present, which will cause us to fallback to
the remote stores. Since we do have the pointer locally, we shouldn't try to
refetch it from the remote store, which is why a `get_missing` needs to be run
before fetching from the remote store.
As I was writing this, I realized that all of this subtle behavior is basically
the same between all the stores, but unfortunately, doing a:
impl<T: RemoteDataStore + ?Sized> HgIdDataStore for T
conflicts with the one for `Deref<Target=HgIdDataStore>`. Macros could be used
to avoid code duplication, but for now let's not stray into them.
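The pointer/blob situation described above can be illustrated with a toy model. This is a sketch in Python rather than the real Rust code; the class names, the tagged `("file", ...)` / `("blob", ...)` keys, and the server dictionaries are all assumptions made for illustration:

```python
class ToyLfsStore:
    """Local store with two halves: pointers (file -> blob id) and blobs."""

    def __init__(self):
        self.pointers = {}
        self.blobs = {}

    def get(self, name):
        blob_id = self.pointers.get(name)
        if blob_id is None:
            return None
        # Returns None when the pointer is present but the blob is not.
        return self.blobs.get(blob_id)

    def get_missing(self, names):
        """Report what actually has to be fetched for each requested file."""
        missing = []
        for name in names:
            blob_id = self.pointers.get(name)
            if blob_id is None:
                missing.append(("file", name))      # nothing known locally
            elif blob_id not in self.blobs:
                missing.append(("blob", blob_id))   # pointer known, blob absent
        return missing


class ToyRemoteStore:
    """Remote store that asks the local store what is missing before fetching."""

    def __init__(self, local, server_pointers, server_blobs):
        self.local = local
        self.server_pointers = server_pointers
        self.server_blobs = server_blobs
        self.fetched = []  # record of what was requested from the server

    def get(self, name):
        for kind, value in self.local.get_missing([name]):
            self.fetched.append((kind, value))
            if kind == "blob":
                self.local.blobs[value] = self.server_blobs[value]
            else:
                blob_id = self.server_pointers[value]
                self.local.pointers[value] = blob_id
                self.local.blobs[blob_id] = self.server_blobs[blob_id]
        # Read back from the local store what was just written.
        return self.local.get(name)
```

In this model, a file whose pointer is already local results in a blob-only fetch, and the pointer is not requested again.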
Reviewed By: DurhamG
Differential Revision: D21132667
fbshipit-source-id: 67a2544c36c2979dbac70dac5c1d055845509746
Summary: Implement the get() functions on the various LocalDataStore interface implementations.
Reviewed By: quark-zju
Differential Revision: D21220723
fbshipit-source-id: d69e805c40fb47db6970934e53a7cc8ac057b62b
Summary:
This makes debugshell easier to use, especially when the script wants to access
`__file__`.
Reviewed By: DurhamG
Differential Revision: D21169556
fbshipit-source-id: 88b3ebb1ca9a39fe26bc7cc5ea8e250c28fa0d6f
Summary:
Similar to part of D20804854, normalize infinitepush/ to default/ in
remotenames code paths.
This might not be strictly necessary, but we never want to write out
infinitepush/ remote bookmarks and we have seen that OnDemand writes out
infinitepush/ remote bookmarks (which got synced to commit cloud and caused
user complaints) recently.
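The normalization itself amounts to a prefix rewrite. A minimal sketch, where `normalize_remote_name` is a hypothetical helper name and not the function touched by this diff:

```python
def normalize_remote_name(name):
    """Rewrite an infinitepush/ remote bookmark name to its default/ form."""
    prefix = "infinitepush/"
    if name.startswith(prefix):
        return "default/" + name[len(prefix):]
    return name
```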
Reviewed By: DurhamG
Differential Revision: D21233658
fbshipit-source-id: cf8c12ac40348c7b475a7f3d5d5223775491650e
Summary:
The python script run using `debugpython` within the test was importing
Mercurial internal modules which causes failures on some platforms like
Windows. Instead, let's use the `debugshell` command which already imports the
common objects like `repo` correctly.
Reviewed By: xavierd
Differential Revision: D21251818
fbshipit-source-id: db1b9e92df99b736a28bc9e89fb08ae77d6e82fc
Summary:
Memcache isn't available for Mac, but we can build the revisionstore with Buck
on macOS when building EdenFS. Let's only use Memcache for fbcode builds on
Linux for now.
Reviewed By: chadaustin
Differential Revision: D21235247
fbshipit-source-id: 5943ad84f6442e4dabbd2a44ae105457f5bb9d21
Summary:
When creating directories, we sometimes want to make sure other users within the same group have write access to them to enable data sharing. Previously we relied on setting umask for the entire process to make sure the newly created directories have the correct permission bits. This is kind of fragile and error-prone when running in a multi-threaded environment.
This diff introduces an internal function `create_dir_with_mode` to create a directory with a specified permission mode. It first creates a temporary directory within the parent of the directory being created, sets up the correct permission bits, then attempts to rename the temporary directory to the desired name. This ensures that we never leave a directory without the correct permissions in the place we need, and without changing umask for the process.
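The create-then-rename approach can be sketched as follows. This is a Python illustration of the technique on a POSIX system, not the implementation added by this diff; handling of an already-existing target is left out:

```python
import os
import tempfile


def create_dir_with_mode(path, mode):
    """Create `path` so that it only ever appears with permission bits `mode`."""
    parent = os.path.dirname(os.path.abspath(path))
    # The temporary directory lives in the same parent, so the rename below
    # stays on one filesystem.
    tmp = tempfile.mkdtemp(prefix=".tmp-", dir=parent)
    try:
        # chmod sets the bits explicitly and is not affected by the umask.
        os.chmod(tmp, mode)
        os.rename(tmp, path)
    except OSError:
        # On failure the temporary directory still exists; clean it up.
        os.rmdir(tmp)
        raise
```

Because the permission bits are set before the rename, other processes never observe the final path with the wrong mode.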
Reviewed By: xavierd
Differential Revision: D21188903
fbshipit-source-id: 381bff7d3aaca097b9d50150e86cbbf70a90a0a5
Summary:
Adds initial python bindings for the rust pending changes. This is not
ready for production usage yet, but having the bindings lets me test changes
more easily until we're ready for automated tests.
Reviewed By: xavierd
Differential Revision: D20546896
fbshipit-source-id: c0ad7155e5068f45bf9c987030746e6c5f35c26a
Summary:
The second phase of pending changes is to iterate over the treestate
and figure out what files were not seen in the filesystem walk. This diff
implements that.
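Conceptually, this phase is a difference between the tracked files and the files seen by the walk. A toy sketch, assuming both are available as plain collections of paths (the real code iterates the treestate in Rust):

```python
def files_missing_from_walk(treestate_files, seen_in_walk):
    """Return tracked files that the filesystem walk did not encounter."""
    seen = set(seen_in_walk)
    return sorted(path for path in treestate_files if path not in seen)
```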
Reviewed By: xavierd
Differential Revision: D20546899
fbshipit-source-id: 3523fbc7e31ef0ed09c4937c72264b64e2a3db5b
Summary:
The first phase of pending changes is inspecting the filesystem for
changes. This diff adds that logic.
Reviewed By: xavierd
Differential Revision: D20546909
fbshipit-source-id: 1c2c0fa7f700dbff4acfce4d5271b4472a13571f
Summary: This ensures that the lagged master issue does not happen by pulling a single commit.
Reviewed By: DurhamG
Differential Revision: D20845384
fbshipit-source-id: 3ba16c07fe264fe2b6aecd494bbb832af7b390a0
Summary:
Patterns are more powerful than prefixes. Some full revset names are
obviously not valid remote names - for example, `remote/master :: .`.
They cannot be ruled out using prefixes but can using patterns.
As we're here, add some caching for the regex compiler.
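A sketch of the difference between prefix and pattern matching, with a cached compile step. The pattern shown is a made-up example and not the one configured in Mercurial; `compile_pattern` and `matches_any` are hypothetical names:

```python
import functools
import re


@functools.lru_cache(maxsize=None)
def compile_pattern(pattern):
    """Compile a regex once and reuse the compiled object on later calls."""
    return re.compile(pattern)


def matches_any(name, patterns):
    """Return True if `name` fully matches at least one pattern."""
    return any(compile_pattern(p).fullmatch(name) for p in patterns)


# Hypothetical pattern: a remote name without spaces or revset operators.
EXAMPLE_PATTERNS = (r"remote/[A-Za-z0-9_./-]+",)
```

Here `"remote/master :: ."` starts with the `remote/` prefix, so a prefix check would accept it, while `matches_any` with the example pattern rejects it.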
Reviewed By: DurhamG
Differential Revision: D20831015
fbshipit-source-id: af8c4eed4a3153fd71480b8972c55feed4641392
Summary:
For names like `a-b-c`, it can be parsed in multiple ways:
- `"a-b-c"`
- `"a-b" - "c"`
- `"a" - "b-c"`
Mercurial uses `repo.lookup` in the parser to accept names like `"a-b-c"`.
Do it for the whole revset expression too.
But do not do it for every lookup (ex. testing `"a-b"` or `"b-c"` in the above
case), because that can be exceedingly expensive.
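A toy sketch of the lookup policy described above: one lookup for the whole expression, and no lookups for the partial spans. `classify` and `is_known_name` are illustrative names, and the fallback split is a simplification of real revset parsing:

```python
def classify(expr, is_known_name):
    """Decide whether `expr` is a single name or a dash-separated expression."""
    # A single lookup for the whole expression, e.g. "a-b-c".
    if is_known_name(expr):
        return ("name", expr)
    # No lookups for sub-spans such as "a-b" or "b-c": treat dashes as operators.
    return ("minus", [part.strip() for part in expr.split("-")])
```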
Reviewed By: DurhamG
Differential Revision: D20831014
fbshipit-source-id: f507e04ce24c953b096ccd836c356f50f11d2006
Summary:
Add a function to figure out unknown symbols of a revset AST without executing
the revset, then try to autopull unknown symbols.
This has pros and cons compared to the runtime (stringset) approach.
Pros:
- The fullreposet restriction is lifted.
- We can have some ideas about whether the symbols are "probably" locally
known. If everything is locally known, then we can skip all remote lookups
(see the next diff for details).
- We can fetch multiple symbols in batch if we want. Although it's difficult
to do so right now due to different paths (paths.default, infinitepush,
infinitepushbookmark, etc). Revisit once the paths are unified.
Cons:
- Functions and argument types must be statically known. Might miss
some functions added by some extensions.
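The static approach can be sketched on a toy AST. The tuple shapes below (`("symbol", name)`, `("func", ("symbol", name), args...)`, and binary operator nodes) are loosely modelled on revset parse trees but are an assumption of this example, as is the `unknown_symbols` name:

```python
def unknown_symbols(tree, is_known):
    """Collect symbols in a toy revset AST that are not locally known.

    The tree is only inspected, never executed.
    """
    if not isinstance(tree, tuple) or not tree:
        return set()
    op = tree[0]
    if op == "symbol":
        name = tree[1]
        return set() if is_known(name) else {name}
    if op == "func":
        # tree[1] is the function name, not a revision symbol; skip it.
        children = tree[2:]
    else:
        children = tree[1:]
    result = set()
    for child in children:
        result |= unknown_symbols(child, is_known)
    return result
```

The returned set can then be fetched in one go, or skipped entirely when it is empty.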
Reviewed By: DurhamG
Differential Revision: D20909535
fbshipit-source-id: a567ca0aa80aa7f2930dd75637ef3ff8cef1bdef
Summary:
Dashes are also revset operators. Add a test so we can verify if we can also
autopull those special names.
Reviewed By: DurhamG
Differential Revision: D20831016
fbshipit-source-id: 97e772053dae873ebaa529ac9eb84ea9d04f0f63
Summary:
Similar to the previous diff, let's use the new autopull logic, which
uses the tech-debt free repo.pull API.
In rare cases this can cause more pulls due to the path selection (ex. use
"infinitepush" for "remote/scratch/x" but "default" for "remote/stable/x").
With Mononoke serving repos, we might be able to remove the special paths,
and just use "default" for all cases.
If it goes well, we can then delete the old code.
Reviewed By: DurhamG
Differential Revision: D20804854
fbshipit-source-id: 75e68582a29b613c8626a119b85064e3c0ba9462
Summary:
The new auto pull logic can replace the one in remotenames. If it goes well, we
can then remove the code in remotenames doing the auto pull.
Reviewed By: sfilipco
Differential Revision: D20804853
fbshipit-source-id: c87b6b382f4cce3b306648b305a7b6bbaec05df1
Summary:
Attempt to auto pull bookmark names without the `remote/` or `default/` prefix
if hoist is set. This would hopefully be enough to allow us to enable
selectivepull globally without breaking existing users.
Reviewed By: sfilipco
Differential Revision: D20804856
fbshipit-source-id: 72601ac5e3545523cbfd7087d1fc822ef33c2f2e
Summary:
On repack, when the Rust stores are in use, the repack code relies on
ContentStore::commit_pending to return the path of a newly created packfile, so
it won't delete it when going over the repacked ones. When LFS is enabled, both
the shared and the local stores are behind the LfsMultiplexer store that
unfortunately would always return `Ok(None)`. In this situation, the repack
code would delete all the repacked packfiles, which usually is the expected
behavior, unless only one packfile is being repacked, in which case the repack
code effectively re-creates the same packfile, which is then subsequently
deleted.
The solution is for the multiplex stores to properly return a path if one was
returned from the underlying stores.
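A toy model of the fix, written in Python rather than Rust. The class names are hypothetical, and returning the first non-empty path is an assumption of this sketch, since the summary does not say how multiple paths are combined:

```python
class ToyStore:
    """Stand-in for an underlying store with an optional pending packfile."""

    def __init__(self, pending_path=None):
        self.pending_path = pending_path

    def commit_pending(self):
        return self.pending_path


class ToyMultiplexer:
    """Forwards commit_pending to every store and propagates a returned path."""

    def __init__(self, stores):
        self.stores = list(stores)

    def commit_pending(self):
        result = None
        for store in self.stores:
            path = store.commit_pending()  # every store is still flushed
            if result is None:
                result = path
        return result
```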
Reviewed By: DurhamG
Differential Revision: D21211981
fbshipit-source-id: 74e4b9e5d2f5d9409ce732935552a02bdde85b93
Summary:
This adds the proper hooks in the right place to upload the LFS blobs and write
to the bundle as LFS pointers. That last part is a bit hacky as we're writing
the pointer manually, but until that code is fully Rust, I don't really see a
good way of doing it.
Reviewed By: DurhamG
Differential Revision: D20843139
fbshipit-source-id: f2ef7b045c6604398b89580b468c354d14de1660
Summary:
Add two utility programs for ad-hoc debugging of EdenAPI. EdenAPI requests and responses are encoded as CBOR, which is not easy to work with manually on the command line. In order to allow debugging the HTTP API using tools like `curl`, we need tools that can generate raw request payloads and interpret CBOR responses.
The utility programs included in this diff are:
- `make_req` - Can construct EdenAPI request payloads from a human-editable JSON representation.
- `data_util` - Can list, validate, and extract the contents of an EdenAPI data response.
These tools can be used by themselves or as part of a pipeline. See test plan for examples.
Reviewed By: xavierd
Differential Revision: D21136575
fbshipit-source-id: d1ac8d92964614005078a6ac76dd0835c29a80a5
Summary:
`scmutil.revsingle` resolves the commit using the revset language layer,
which will trigger auto pull logic. The import helper does not want any
kind of auto pull logic. Therefore use `repo[rev]` instead, which resolves
the commit hash directly and bypasses the revset / auto pull logic.
Reviewed By: singhsrb
Differential Revision: D21196495
fbshipit-source-id: 5a51057a731523bbb643c7e264d6902dcfbb9059
Summary: Move the MutationEntry type to the Mercurial types crate. This will allow us to use it from Mononoke.
Reviewed By: quark-zju
Differential Revision: D20871338
fbshipit-source-id: 8de3bb8a2673673bc4c8a6cc7578a0a76358c14a
Summary:
The part of status that lists what files have changed is called
PendingChanges. This diff introduces the initial stub for PendingChanges. The
pending changes algorithm involves three parts:
1. Looking at files on the filesystem for changes.
2. Looking at files in the dirstate map for changes.
3. Looking at the content for any files that we were unsure of during steps 1
and 2.
This diff puts the basic state machine in place, and accepts the basic
information about the working copy (the root and what type of filesystem it is).
In the future we might have it detect what type of filesystem it is, but for now
this makes it easy.
Reviewed By: xavierd
Differential Revision: D20546898
fbshipit-source-id: a3030b7c846b3cb2fcba805b7fe4744df7c5764e
Summary:
treestate.get_filtered_keys passes directory paths to the filter
function and returns directory matches with a trailing '/' on the end. This
makes it difficult to act as a path normalization function when the caller
doesn't know if the path is a file or directory.
It seems like we can just strip the trailing '/' before exposing the strings to
the caller (both as filter inputs and as get_filtered_keys outputs).
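A sketch of the proposed behavior on a flat list of keys. The real `get_filtered_keys` operates on the treestate; this version only illustrates stripping the trailing '/' on both the filter input and the returned match:

```python
def strip_dir_slash(path):
    """Drop a single trailing '/' so files and directories look alike."""
    return path[:-1] if path.endswith("/") else path


def get_filtered_keys(keys, filter_fn):
    """Return keys accepted by `filter_fn`, without trailing '/' on directories."""
    matches = []
    for key in keys:
        visible = strip_dir_slash(key)
        # The filter sees the stripped form, and so does the caller.
        if filter_fn(visible):
            matches.append(visible)
    return matches
```

A caller doing case normalization can then compare against `"src"` without knowing whether the entry is a file or a directory.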
This is useful in the following diff that adds a case normalization crate.
Reviewed By: xavierd
Differential Revision: D20880881
fbshipit-source-id: 6e9f419178b4e278844244bd6aff2fc10e09d2cd
Summary:
This logic will be used in a variety of places (update workers, status,
etc). Let's move it somewhere common.
Reviewed By: xavierd
Differential Revision: D20771623
fbshipit-source-id: b4de7c1d20055a10bbc1143d44c55ea1045ec62a
Summary:
In a later diff we'll need to be able to hand a reference to the
TreeState to the pending changes iterator. We'd like to be able to hand a
Rc<RefCell<TreeState>> but cpython requires that its fields implement Send. The
simplest solution is to use Arc<Mutex<_>>. Once we finish Rustifying all of this
code we can drop the cpython requirement that this work across threads and
downgrade this to a Rc<RefCell<_>>.
Reviewed By: xavierd
Differential Revision: D20546904
fbshipit-source-id: a4a1ce6973f53b3bb95f227616149f98fcd780e0