Summary:
configs.allowedlocations restricts config loading to a certain set of
files. This will enable us to deprecate all old config locations.
This diff adds Python support and a high level test.
Reviewed By: quark-zju
Differential Revision: D25539736
fbshipit-source-id: fa2544379b65672227e0d9cf08dad7016d6bbac8
Summary:
We want to start disallowing non-approved config files from being
loaded. To do that, let's update the config verifier to accept an optional list
of allowed locations. If it's provided, we delete any values that came from a
disallowed location.
This will enable us to prune our config sources down to rust configs,
configerator configs, .hg/hgrc, and ~/.hgrc.
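As a rough sketch of the pruning described above (names and data shapes are illustrative, not the real verifier's):

```python
def prune_disallowed(values, allowed_locations=None):
    """Drop config values that came from a disallowed location.

    `values` is a list of (section, name, value, source) tuples -- a
    hypothetical stand-in for the real config value list. When
    `allowed_locations` is None (i.e. not provided), all values are
    kept, so the behavior stays opt-in.
    """
    if allowed_locations is None:
        return values
    allowed = set(allowed_locations)
    return [v for v in values if v[3] in allowed]
```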
Reviewed By: quark-zju
Differential Revision: D25539738
fbshipit-source-id: 0ece1c7038e4a563c92140832edfa726e879e498
Summary:
The goal of this stack is to start logging commits to scribe even if a commit was
introduced by scs create_commit/move_bookmark api. Currently we don't do that.
Initially I had bigger plans: I wanted to log to scribe only from bookmarks_movement and remove scribe logging from unbundle/processing.rs, but that turned out to be trickier to implement. In general, the approach we use right now, where logging to scribe requires putting a `log_commit_to_scribe` call in every place that can possibly create commits or move bookmarks, seems wrong, but changing it is hard. So for now I decided to solve the main problem we have: we don't get scribe logs from repos where bookmarks are moved via scs methods.
To fix that I added an additional option to the CreateBookmark/UpdateBookmark structs. If this option is set to true, then before moving/creating the bookmark we find all draft commits that are going to be made public by the bookmark move, i.e. all draft ancestors of the new bookmark destination. It is unfortunate that we have to do this traversal on the critical path of the move_bookmark call, but in practice I hope it won't be too bad, since we do a similar traversal to record bonsai<->git mappings. In case my hopes are wrong, we have scuba logging, which should make it clear that this is an expensive operation, and we also have a tunable to disable this behaviour.
Also note that we don't use PushParams::commit_scribe_category. This is intentional: PushParams::commit_scribe_category doesn't seem useful at all, and I plan to deprecate it later in the stack. Even later it would make sense to deprecate PushrebaseParams::commit_scribe_category as well, and put the commit_scribe_category option in another place in the config.
Reviewed By: markbt
Differential Revision: D25558248
fbshipit-source-id: f7dedea8d6f72ad40c006693d4f1a265977f843f
Summary:
Those messages like "pulling from ...", "added n commits ..." belong to stderr.
This makes it possible for us to turn on verbose output for auto pull, without
breaking tools that parse stdout.
Reviewed By: sfilipco
Differential Revision: D25315955
fbshipit-source-id: 933f631610840eb5f603ad817f7560c78b19e4ad
Summary:
It turns out `Arc::ptr_eq` is becoming unreliable, which causes fast paths
to go unused, and extreme slowness in some cases (ex. `public & nodes`
iterating through everything in `public`).
This diff adds an API for an IdMap to tell us its identity. That identity is
then used to replace the unreliable `Arc::ptr_eq`.
For an in-memory map, we just assign a unique number (per process) for its
identity on initialization. For an on-disk map, we use the type + path to
represent it.
Note: strictly speaking, this could cause false positives about
"maps are compatible", because two maps initially cloned from each other
can be mutated differently and their map_id do not change. That will
be addressed in upcoming diffs introducing a more complex but precise way to
track compatibility.
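A minimal sketch of the identity scheme described above (the actual change is in Rust; the class and field names here are illustrative):

```python
import itertools

_next_mem_id = itertools.count()

class InMemoryIdMap:
    """In-memory map: identity is a per-process unique number."""
    def __init__(self):
        self.map_id = ("in-memory", next(_next_mem_id))

class OnDiskIdMap:
    """On-disk map: identity is the type plus the path."""
    def __init__(self, path):
        self.map_id = ("on-disk", path)

def maps_compatible(a, b):
    """Fast-path check replacing pointer equality. As noted above, this
    can report false positives: two maps cloned from each other keep the
    same map_id even if they are later mutated differently."""
    return a.map_id == b.map_id
```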
Reviewed By: sfilipco
Differential Revision: D25598076
fbshipit-source-id: 98c58f367770adaa14edcad20eeeed37420fbbaa
Summary:
On Windows, platform.version appears to be misleading, as it returns 1903 for
Windows 10 1909... The registry has the correct build number, so let's use that instead.
Reviewed By: kmancini
Differential Revision: D25577939
fbshipit-source-id: f47032906d02669bd1cb1a48304733c1e3499f81
Summary:
With selectivepull we can tell visibility directly about which heads to add,
instead of adding all nodes in a changegroup.
Note: This does not affect the pull command, nor exclude public commits yet.
Reviewed By: markbt
Differential Revision: D25562090
fbshipit-source-id: aa5f346f33058dfdb3b2f23f175e35b5d3c30a1d
Summary: They will be used in core later.
Reviewed By: markbt
Differential Revision: D25562093
fbshipit-source-id: 4402a629a09920fd4c6f85cb8e777446bb218a37
Summary:
When tailing to fill or backfill derived data, omit checking the heads from the
previous round of derivation, as we know for sure they've been derived.
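The optimization amounts to something like this (hypothetical helper; the real tailer is Rust):

```python
def heads_to_check(current_heads, previously_derived):
    """Skip heads that were derived in the previous round of tailing:
    they are known to be derived already, so re-checking them is wasted
    work."""
    derived = set(previously_derived)
    return [h for h in current_heads if h not in derived]
```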
Reviewed By: krallin
Differential Revision: D25465445
fbshipit-source-id: 384c7e67e99c561ce6aae324070e7c274c56b736
Summary:
Rustfmt gives up on formatting if strings are too long. Split the long help
strings so that the formatter works again.
Tidy up some of the help text while we're at it.
Reviewed By: krallin
Differential Revision: D25465443
fbshipit-source-id: 360dbedc1e3e2ffbc489a9d9cba008835bce506f
Summary:
ChangesetInfo derivation doesn't depend on the parent ChangesetInfo being
available, so we can derive them in batches very easily.
Reviewed By: krallin
Differential Revision: D25470721
fbshipit-source-id: cc8ce305990eb6c9846158f0e9e3917cf35e169d
Summary:
Add documentation comments to the derived data crate to describe how they fit
together.
Reviewed By: krallin
Differential Revision: D25432449
fbshipit-source-id: b62440bcecae900ad75d74245ce175bd9e07a894
Summary:
Using `backfill-all` on very large repositories is slow to get started and slow
to resume, as it must traverse the repository history all the way to the start
before it can even begin.
Make this more usable by using the skiplist index to slice the repository into
reasonably sized slices of heads with the same range of generation numbers.
Each slice is then derived in turn. If interrupted, derivation can continue
at the next slice more quickly.
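Roughly, the slicing might be sketched like this (illustrative Python; the real implementation uses the skiplist index in Rust):

```python
def slice_heads_by_generation(heads, gen, slice_size):
    """Group heads into slices whose generation numbers fall in the
    same range, ordered oldest range first, so each slice can be derived
    in turn and derivation can resume at a slice boundary. `gen` maps a
    head to its generation number."""
    slices = {}
    for head in heads:
        slices.setdefault(gen[head] // slice_size, []).append(head)
    # Derive the oldest slices first, so ancestors in earlier slices
    # are derived before descendants in later ones.
    return [slices[bucket] for bucket in sorted(slices)]
```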
Reviewed By: krallin
Differential Revision: D25371968
fbshipit-source-id: f150ea847f9fbbe84852587d620ae37ba2c58f28
Summary:
Right now, when we upload a hg commit, we check that we have all the content
the client is referencing.
The only problem is we do this by checking that the filenodes the client
mentions exist, but the way we store filenodes right now is we write them
concurrently with content blobs, so it is in fact possible to have a filenode
that references a piece of content that doesn't actually exist.
That isn't quite what one might call satisfactory when it comes to checking that
the content does in fact exist, so this diff updates our content checking.
In practice, with the way Mononoke works right now this should be quite free:
the client uploads everything all the time, and then we check later, so this
will just hit in the blobstore cache.
In a future world where clients don't upload stuff they already know we have,
that could be slower, but doing it the way we did it before is simply not
correct, so that's not much better. The ways to make it faster would be:
- Trust that we'll hit in cache when checking for presence (could work out).
- Have the client prove to us that we have this content, and thread it through.
To do the latter, IMO the code here could actually look at the entries that
were actually uploaded, and not check them for presence again, but right now we
have a few layers of indirection that make this a bit tricky (technically, when
`process_one_entry` gets called, that means "I uploaded this", but nothing in
the signature of any of the functions involved really enforces that).
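The stricter check described above boils down to something like this (hypothetical names; the real code checks against Mononoke's blobstore):

```python
def missing_content(filenodes, filenode_to_content, blobstore):
    """Return the filenodes whose referenced content blob is absent.
    Checking filenode existence alone is not enough, because filenodes
    are written concurrently with content blobs, so a filenode can
    reference content that was never written."""
    missing = []
    for fnode in filenodes:
        content_id = filenode_to_content.get(fnode)
        if content_id is None or content_id not in blobstore:
            missing.append(fnode)
    return missing
```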
Reviewed By: StanislavGlebik
Differential Revision: D25422596
fbshipit-source-id: 3cf34d38bd6ed1cd83d93c778f04395c942b26c0
Summary:
It always takes a bit of time to find the logs. Since we do have the scm daemon ones in the output of `hg cloud status`,
it would be nice to have the ones from background backup as well.
Reviewed By: markbt
Differential Revision: D25560145
fbshipit-source-id: cdf5d76c7c3ebb1492559d32935f9301452a1cd5
Summary: The order has been incorrect and led to a confusing message
Reviewed By: krallin
Differential Revision: D25559963
fbshipit-source-id: 4fcb3e53cedcb08675b60b25cbb5da2ca52c08ed
Summary:
Add a resolve_repos function to cmdlib::args for use by jobs that run for multiple repos at once.
Planning to use this from the walker to scrub multiple small repos from a single process.
Differential Revision: D25422755
fbshipit-source-id: 40e5d499cf1068878373706fdaa72effd27e9625
Summary:
In a future diff, paths will be validated to make sure they are valid utf8. The
path sanity checker needs to be constexpr to construct global paths, but the
utf8 functions aren't, so let's write one that is.
Reviewed By: chadaustin
Differential Revision: D25562681
fbshipit-source-id: e48ec835c2cc9dc01090918cc7ee8f61b6c05a20
Summary:
When compiling with Buck, it tries to create hardlinks, which rightfully
fail, but that error then triggers a spurious notification. For now, let's
only notify the user on timeout, as those are very likely to be network issues.
Reviewed By: kmancini
Differential Revision: D25500364
fbshipit-source-id: 95b609ae901fa6207c8edba26cd8e6a21ddfe3ac
Summary:
This diff does a small refactoring to hopefully make the code a bit clearer.
Previously we were calling log_commits_to_scribe in force_pushrebase and in
run_pushrebase, so log_commits_to_scribe was called twice. It didn't mean that
the commits were logged twice but it was hard to understand just by looking at
the code.
Let's make it a bit clearer.
Reviewed By: krallin
Differential Revision: D25555712
fbshipit-source-id: bed9754b1645008846a86da665b6f3f3483f30da
Summary:
I'm going to do a refactoring later in the stack, so let's add a test to avoid
regressions.
Reviewed By: krallin
Differential Revision: D25535655
fbshipit-source-id: 5ec6633c9c8c25d1affcede0adbc27dd43c48736
Summary:
IPython is incompatible with Python 3 demandimport. Disable demandimport to
make it work.
Reviewed By: singhsrb
Differential Revision: D25542394
fbshipit-source-id: 293880dff62e98895bc1ae2d3328d4af25b8218f
Summary:
Debug output belongs to stderr.
This makes it possible to turn on debug output without breaking programs
parsing stdout.
Reviewed By: singhsrb
Differential Revision: D25315954
fbshipit-source-id: c7813a824fbf6640cb5b80b5ed2d947e7059d53e
Summary:
With `collapse-obsolete`, `.` can be obsoleted in the middle of a stack and
not show up. That can be confusing. Make the smartlog revset always show the
`heads` passed in. If `.` is in `heads` (the default), then show it.
Reviewed By: DurhamG
Differential Revision: D24696595
fbshipit-source-id: 7deab109d0e0ae5e703928252bc63312d936955f
Summary:
Add open_existing_sqlite_path so we don't do create_dir_all when we know the db already exists.
Noticed it in passing while investigating something else.
Reviewed By: markbt
Differential Revision: D25469502
fbshipit-source-id: 9810489c84220927937c037d69f5e8e70f2d9038
Summary: * suggest connecting to the VPN if hg update doesn't work
Reviewed By: sfilipco
Differential Revision: D25551017
fbshipit-source-id: 575f29cce4ab2719f2faae86616fdd9aac739f5f
Summary:
If Rust LFS is in use, we currently don't upload LFS blobs to commit cloud.
This is problematic because if you're going to Mononoke that means you can't
upload, and if you're going to Mercurial that means you're silently not backing
up data.
Reviewed By: StanislavGlebik
Differential Revision: D25537672
fbshipit-source-id: fd61f5a69450c97a0bc0895193f67fd22c9773fb
Summary: These show up when compiling with Buck, let's silence them.
Reviewed By: chadaustin
Differential Revision: D25513672
fbshipit-source-id: 277afae30059114f3646cdf4feedac442a4ee1b6
Summary:
On Windows, the GUID of the mount point identifies the virtualization instance,
that GUID is then propagated automatically to the created placeholders when
these are created as a response to a getPlaceholderInfo callback.
When the placeholders are created by EdenFS while invalidating directories, we
have to pass a GUID. The documentation isn't clear about whether that GUID needs
to be identical to the mount point GUID, but for a very long time these have
been mismatched, due to the mount point GUID being generated at startup time
and not re-used.
One of the most common issues that users have reported is that operations on
the repository sometimes start failing with the error "The provider that
supports file system virtualization is temporarily unavailable". Looking at the
output of `fsutil reparsepoint query` for all the directories from the file
that triggers the error up to the root of the repository shows that one of the
folders and its descendants don't share the same GUID; removing it solves the
issue.
It's not clear to me why this issue doesn't always reproduce when restarting
EdenFS, but a simple step we can take to solve it is to always re-use the
GUID, which will hopefully keep the GUID the same and make the error go away.
Reviewed By: fanzeyi
Differential Revision: D25513122
fbshipit-source-id: 0058dedbd7fd8ccae1c9527612ac220bc6775c69
Summary:
The config verifier would remove items from the values list if they
were disallowed. To do this, it iterated through the values list backwards,
removing bad items. In some cases it stored the index of a bad value for later
use, but because it was iterating backwards and removing things, the indexes it
stored might no longer be correct by the time the loop was done. To fix this,
let's go back to iterating forwards.
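A distilled example of the bug class and the fix (names are illustrative): with backwards removal from the input list, a stored input index could be shifted by later removals at smaller indices. Iterating forwards and building a new list sidesteps that:

```python
def filter_values(values, allowed, remember):
    """Iterate forwards, building a new list, so the index recorded for
    `remember` refers to the output list and stays valid regardless of
    what was removed before or after it.

    `values` is a list of (value, source) pairs, a hypothetical
    stand-in for the real config value list."""
    kept = []
    remembered_index = None
    for value, source in values:
        if source in allowed:
            kept.append((value, source))
            if value == remember:
                remembered_index = len(kept) - 1
    return kept, remembered_index
```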
Reviewed By: quark-zju
Differential Revision: D25539737
fbshipit-source-id: 87663f3c162c690f3961b8075814f3467916cb4b
Summary:
The old case-conflict checks were more lenient, and only triggered if a commit
introduced a case conflict compared to its first parent.
This means that commits could still be landed to bookmarks that already had
pre-existing case conflicts.
Relax the new case-conflict checks to allow this same scenario.
Note that we're still a bit more strict: the previous checks ignored other
parents, and would not reject a commit if the act of merging introduces a case
conflict. The new case conflict checks only permit case conflicts in the case
where all conflicting files were present in one of the parents.
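In sketch form (hypothetical helper; the real check is in Mononoke's Rust code), the relaxed rule is:

```python
def introduces_case_conflict(conflicting_paths, parents_paths):
    """A case conflict is permitted only when all of the conflicting
    files were already present together in at least one parent;
    otherwise the commit introduces a new conflict and is rejected.
    This also rejects conflicts created purely by merging, since no
    single parent contained all the conflicting files."""
    conflict = set(conflicting_paths)
    return not any(conflict <= set(parent) for parent in parents_paths)
```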
Reviewed By: StanislavGlebik
Differential Revision: D25508845
fbshipit-source-id: 95f4db1300ee73b8e6495ba8b5c1c2ce5a957d1a
Summary: Spotted this in passing. Was able to remove a call to fetch_root_manifest_id.
Reviewed By: StanislavGlebik
Differential Revision: D25472678
fbshipit-source-id: d450cb97630464be13d22fb37c3356611dc2e1b6
Summary: This makes it easier to run full walks on small repos.
Reviewed By: StanislavGlebik
Differential Revision: D25469485
fbshipit-source-id: 6e5b1426837a396d939e47a5b353e615437ae7cb
Summary:
If .buckd/pid is empty or invalid, assume that means Buck is not
running.
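A sketch of that behavior (illustrative helper; the real check is in EdenFS):

```python
def read_buck_pid(pid_path):
    """Return Buck's pid as an int, or None if the pid file is missing,
    empty, or not a valid integer -- all of which are treated as "Buck
    is not running"."""
    try:
        with open(pid_path) as f:
            contents = f.read().strip()
        return int(contents) if contents else None
    except (OSError, ValueError):
        return None
```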
Reviewed By: genevievehelsel
Differential Revision: D25544725
fbshipit-source-id: 101ef67e17ff3e06f428cd7dbf51b2587fee4627
Summary:
Restore the behavior disabled by D25350916 (49c6f86325). This time it no longer runs Python
logic in background threads.
Reviewed By: sfilipco
Differential Revision: D25513054
fbshipit-source-id: 0220ccb37e658518d105bba04f45424c9fcfe142
Summary:
Make `get_commit_raw_text` aware of hg's hardcoded commit hashes: NULL_ID and
WDIR_ID. Previously, only `stream_commit_raw_text` was aware of them.
This makes it a bit more compatible when used in more places.
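For illustration (the real change is in Rust, and the store interface here is hypothetical), hg's hardcoded hashes are the all-zero null id and the all-0xff working-directory id, and the check happens before consulting the store:

```python
NULL_ID = b"\x00" * 20  # hg's hardcoded null commit hash
WDIR_ID = b"\xff" * 20  # hg's hardcoded working-directory hash

def get_commit_raw_text(store, commit_id):
    """Handle hg's hardcoded hashes up front, as stream_commit_raw_text
    already did. How WDIR_ID is reported is an assumption here; the
    point is that neither hash should reach the backing store."""
    if commit_id == NULL_ID:
        return b""  # the null commit has empty raw text
    if commit_id == WDIR_ID:
        raise ValueError("working directory has no commit text")
    return store[commit_id]
```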
Reviewed By: sfilipco
Differential Revision: D25515006
fbshipit-source-id: 08708734a28f43acf662494df69694988a5b9ca0
Summary:
Unlike streamcommitrawtext, the new API does not put Python logic on a
background thread. This will make it easier to reason about the Python logic,
as it does not need to be thread-safe, and we don't need to think about Python
GIL deadlocks in the Rust async world.
Reviewed By: sfilipco
Differential Revision: D25513057
fbshipit-source-id: 4b30d7bab27070badd205ac1a9d54bae7f1f8cec
Summary:
Previously, only the batch fetching or stream fetching APIs would
actually fetch commits remotely. The 1-commit fetching API did not have
the network side effect, in the hope that we could migrate all use-cases
to stream or batch fetching.
Practically it's quite difficult to migrate all use-cases, and the Python
layer has to have a fallback 1-by-1 fetching path. Now let's just move that
fallback to Rust to simplify the code. The fallback in the Rust code
is the default impl of get_commit_raw_text.
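A sketch of the default-impl fallback (method names mirror the summary but the structure is illustrative; the real trait is in Rust):

```python
class CommitTextReader:
    """The 1-commit API defers to the batch API with a single id by
    default, so a backend only needs to implement batch fetching to get
    network-capable single fetches for free."""

    def get_commit_raw_text_list(self, ids):
        raise NotImplementedError

    def get_commit_raw_text(self, commit_id):
        # Default impl: fall back to a 1-element batch fetch.
        return self.get_commit_raw_text_list([commit_id])[0]
```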
Reviewed By: sfilipco
Differential Revision: D25513056
fbshipit-source-id: b3c615397d33b8d35876dc23ca7b95173783ef80
Summary: The API will be used in Python bindings to avoid running Python in background threads.
Reviewed By: sfilipco
Differential Revision: D25513055
fbshipit-source-id: a108b55115271a256c0d43e0ff7b82c0b209be81
Summary:
Previously only `iterctx` does prefetch. Make `__iter__` do prefetch via `iterctx`.
The old `__iter__` without prefetching was renamed to `iterrev`.
Reviewed By: sfilipco
Differential Revision: D24365404
fbshipit-source-id: db5c687066794257719bb64c673dc384b5460ff1
Summary:
Now smartset has a reference to the repo. It does not need `repo` from an
external source.
Reviewed By: sfilipco
Differential Revision: D24365405
fbshipit-source-id: 8a43697b7b84a8a41691ed8f095c271107a90f16
Summary:
This will allow `__iter__` to do proper prefetch, and make it possible
for `__iter__` to return nodes instead of revs. To avoid cycles, a weakref is used.
The smartset types are used widely, and it's hard to migrate all callsites at once.
For now, `repo` is optional. Later, it will be required.
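A minimal sketch of the weakref arrangement (illustrative names):

```python
import weakref

class SmartSet:
    """Holds the repo behind a weakref to avoid a repo -> set -> repo
    reference cycle. `repo` is optional for now while callsites
    migrate."""

    def __init__(self, revs, repo=None):
        self._revs = revs
        self._reporef = weakref.ref(repo) if repo is not None else None

    def repo(self):
        repo = self._reporef() if self._reporef is not None else None
        if repo is None:
            raise RuntimeError("smartset has no live repo reference")
        return repo
```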
Reviewed By: DurhamG
Differential Revision: D24365400
fbshipit-source-id: 5dd40e3d930893c39f16da8f3169b026c8933bd2