Commit Graph

8546 Commits

Author SHA1 Message Date
Durham Goode
cfd7ccc828 configs: add Python support for allowed_locations list
Summary:
configs.allowedlocations restricts what configs can be loaded to a
certain set of files. This will enable us to deprecate all old config locations.

This diff adds Python support and a high level test.

Reviewed By: quark-zju

Differential Revision: D25539736

fbshipit-source-id: fa2544379b65672227e0d9cf08dad7016d6bbac8
2020-12-17 06:37:54 -08:00
Durham Goode
dce0faa9b1 configparser: add allowed_location criteria to config verifier
Summary:
We want to start disallowing non-approved config files from being
loaded. To do that, let's update the config verifier to accept an optional list
of allowed locations. If it's provided, we delete any values that came from a
disallowed location.

This will enable us to prune our config sources down to rust configs,
configerator configs, .hg/hgrc, and ~/.hgrc.

Reviewed By: quark-zju

Differential Revision: D25539738

fbshipit-source-id: 0ece1c7038e4a563c92140832edfa726e879e498
2020-12-17 06:37:54 -08:00
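The allowed-location filtering described in dce0faa9b1 can be sketched roughly as follows. This is a minimal illustration in Python, not the actual configparser code: `ConfigValue`, `filter_by_location`, and the example paths are hypothetical names introduced here, and the real verifier may track sources differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ConfigValue:
    """Hypothetical record: a config value plus the file it was loaded from."""
    value: str
    location: str


def filter_by_location(
    values: Dict[str, List[ConfigValue]],
    allowed_locations: Optional[List[str]],
) -> Dict[str, List[ConfigValue]]:
    """Drop every value that came from a location outside the allowed list.

    When no list is provided, nothing is filtered (the list is optional).
    """
    if allowed_locations is None:
        return values
    allowed = set(allowed_locations)
    filtered = {}
    for key, candidates in values.items():
        kept = [v for v in candidates if v.location in allowed]
        if kept:  # keys with no surviving values disappear entirely
            filtered[key] = kept
    return filtered
```

With an allowed list of `["~/.hgrc"]`, a value loaded from any other file would be dropped, while passing `None` leaves the configuration untouched.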
Stanislau Hlebik
79dda5ed04 mononoke: log public commits to scribe from scs move/create_bookmark method
Summary:
The goal of this stack is to start logging commits to scribe even if a commit was
introduced by scs create_commit/move_bookmark api.  Currently we don't do that.

Initially I had bigger plans and I wanted to log to scribe only from bookmarks_movement and remove scribe logging from unbundle/processing.rs, but it turned out to be trickier to implement. In general, the approach we use right now, where in order to log to scribe we need to put a `log_commit_to_scribe` call in all the places that can possibly create commits/move bookmarks, seems wrong, but changing it is a bit hard. So for now I decided to solve the main problem we have, which is the fact that we don't get scribe logs from repos where bookmarks are moved via scs methods.

To fix that I added an additional option to CreateBookmark/UpdateBookmark structs. If this option is set to true then before moving/creating the bookmark it finds all draft commits that are going to be made public by the bookmark move, i.e. all draft ancestors of the new bookmark destination. It is unfortunate that we have to do this traversal on the critical path of the move_bookmark call, but in practice I hope it won't be too bad since we do a similar traversal to record bonsai<->git mappings. In case my hopes are wrong we have scuba logging which should make it clear that this is an expensive operation, and we also have a tunable to disable this behaviour.

Also note that we don't use PushParams::commit_scribe_category. This is intentional - PushParams::commit_scribe_category doesn't seem useful at all, and I plan to deprecate it later in the stack. Even later it would make sense to deprecate PushrebaseParams::commit_scribe_category as well, and put the commit_scribe_category option in another place in the config.

Reviewed By: markbt

Differential Revision: D25558248

fbshipit-source-id: f7dedea8d6f72ad40c006693d4f1a265977f843f
2020-12-17 00:19:00 -08:00
Jun Wu
63bb40f9c4 ui: move push/pull messages to stderr
Summary:
Those messages like "pulling from ...", "added n commits ..." belong to stderr.

This makes it possible for us to turn on verbose output for auto pull, without
breaking tools that parse stdout.

Reviewed By: sfilipco

Differential Revision: D25315955

fbshipit-source-id: 933f631610840eb5f603ad817f7560c78b19e4ad
2020-12-16 20:12:04 -08:00
Jun Wu
aef162f7a1 dag: add an API for IdMap identity
Summary:
It turns out `Arc::ptr_eq` is becoming unreliable, which will cause fast paths
to be not used, and extreme slowness in some cases (ex. `public & nodes`
iterating everything in `public`).

This diff adds an API for an IdMap to tell us its identity. That identity is
then used to replace the unreliable `Arc::ptr_eq`.

For an in-memory map, we just assign a unique number (per process) for its
identity on initialization. For an on-disk map, we use the type + path to
represent it.

Note: strictly speaking, this could cause false positives about
"maps are compatible", because two maps initially cloned from each other
can be mutated differently and their map_id does not change. That will
be addressed in upcoming diffs introducing a more complex but precise way to
track compatibility.

Reviewed By: sfilipco

Differential Revision: D25598076

fbshipit-source-id: 98c58f367770adaa14edcad20eeeed37420fbbaa
2020-12-16 20:08:41 -08:00
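The identity scheme from aef162f7a1 (a per-process unique number for in-memory maps, type plus path for on-disk maps) can be illustrated with a small Python sketch. The real implementation is in Rust; the class names and the `maps_compatible` helper below are hypothetical and only show the idea.

```python
import itertools

# Process-wide counter; every in-memory map draws a fresh number from it.
_next_map_number = itertools.count(1)


class MemIdMap:
    """Hypothetical in-memory map: identity is a unique number per process."""

    def __init__(self):
        self._map_id = ("mem", next(_next_map_number))

    def map_id(self):
        return self._map_id


class DiskIdMap:
    """Hypothetical on-disk map: identity is the type name plus the path."""

    def __init__(self, path):
        self.path = path

    def map_id(self):
        return (type(self).__name__, self.path)


def maps_compatible(a, b):
    # Compare declared identities instead of object addresses.
    return a.map_id() == b.map_id()
```

As the commit message notes, an identity of this kind stays the same when a cloned map is later mutated, so equality of identities can overstate compatibility.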
Xavier Deguillard
c37954fa29 rage: use registry to find Windows version
Summary:
On Windows, platform.version appears to be misleading as it returns 1903 for
Windows 10 1909... The registry has the correct build, so let's use that instead.

Reviewed By: kmancini

Differential Revision: D25577939

fbshipit-source-id: f47032906d02669bd1cb1a48304733c1e3499f81
2020-12-16 17:21:13 -08:00
Jun Wu
c9a701d104 pull: update visibility efficiently via repo.pull API
Summary:
With selectivepull we can tell visibility directly about which heads to add,
instead of adding all nodes in a changegroup.

Note: This does not affect the pull command, nor exclude public commits yet.

Reviewed By: markbt

Differential Revision: D25562090

fbshipit-source-id: aa5f346f33058dfdb3b2f23f175e35b5d3c30a1d
2020-12-16 17:12:55 -08:00
Jun Wu
35094bad4e remotenames: move a few methods to core
Summary: They will be used in core later.

Reviewed By: markbt

Differential Revision: D25562093

fbshipit-source-id: 4402a629a09920fd4c6f85cb8e777446bb218a37
2020-12-16 17:12:55 -08:00
Mark Juggurnauth-Thomas
91f2d07dbe tests: add test for derived data tailer
Reviewed By: krallin

Differential Revision: D25562158

fbshipit-source-id: 2ff917c4ae2f7c4b273b91d3f742bab6b05d8b46
2020-12-16 10:38:48 -08:00
Mark Juggurnauth-Thomas
5eaf0cfde2 backfill_derived_data: don't try to rederive heads we just derived
Summary:
When tailing to fill or backfill derived data, omit checking the heads from the
previous round of derivation, as we know for sure they've been derived.

Reviewed By: krallin

Differential Revision: D25465445

fbshipit-source-id: 384c7e67e99c561ce6aae324070e7c274c56b736
2020-12-16 10:38:48 -08:00
Mark Juggurnauth-Thomas
fb38295e3e backfill_derived_data: add option to backfill as well as tail
Reviewed By: krallin

Differential Revision: D25465444

fbshipit-source-id: b675ec47055e382bbb61a4ff0d6c109b0a16ac7d
2020-12-16 10:38:47 -08:00
Mark Juggurnauth-Thomas
feb9cbc143 backfill_derived_data: use concat to split long help strings
Summary:
Rustfmt gives up on formatting if strings are too long.  Split the long help
strings so that the formatter works again.

Tidy up some of the help text while we're at it.

Reviewed By: krallin

Differential Revision: D25465443

fbshipit-source-id: 360dbedc1e3e2ffbc489a9d9cba008835bce506f
2020-12-16 10:38:47 -08:00
Mark Juggurnauth-Thomas
590b84f62d changeset_info: implement batch_derive
Summary:
ChangesetInfo derivation doesn't depend on the parent ChangesetInfo being
available, so we can derive them in batches very easily.

Reviewed By: krallin

Differential Revision: D25470721

fbshipit-source-id: cc8ce305990eb6c9846158f0e9e3917cf35e169d
2020-12-16 10:38:47 -08:00
Mark Juggurnauth-Thomas
18653985b5 derived_data: add some docs describing how the traits fit together
Summary:
Add documentation comments to the derived data crate to describe how they fit
together.

Reviewed By: krallin

Differential Revision: D25432449

fbshipit-source-id: b62440bcecae900ad75d74245ce175bd9e07a894
2020-12-16 10:38:47 -08:00
Mark Juggurnauth-Thomas
8b6bc6dc8a backfill_derived_data: slice large repositories when backfilling
Summary:
Using `backfill-all` on very large repositories is slow to get started and slow
to resume, as it must traverse the repository history all the way to the start
before it can even begin.

Make this more usable by using the skiplist index to slice the repository into
reasonably sized slices of heads with the same range of generation numbers.

Each slice is then derived in turn.  If interrupted, derivation can continue
at the next slice more quickly.

Reviewed By: krallin

Differential Revision: D25371968

fbshipit-source-id: f150ea847f9fbbe84852587d620ae37ba2c58f28
2020-12-16 10:38:47 -08:00
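The slicing idea in 8b6bc6dc8a can be approximated in a few lines of Python. This sketch assumes generation numbers are already known for each head and simply buckets them into fixed-width ranges; it does not use a skiplist index, and `slice_by_generation` is a hypothetical helper, not a function from the backfill tool.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def slice_by_generation(
    generations: Dict[str, int], slice_size: int
) -> List[Tuple[int, List[str]]]:
    """Group commits into slices covering equal ranges of generation numbers.

    Returns (slice_start, commits) pairs ordered from the oldest slice to the
    newest, so each slice can be derived in turn.
    """
    buckets = defaultdict(list)
    for commit, generation in generations.items():
        buckets[generation // slice_size].append(commit)
    return [
        (index * slice_size, sorted(buckets[index]))
        for index in sorted(buckets)
    ]
```

Because each slice is identified by its starting generation, a process that records the last completed slice could resume from the next one rather than re-walking the whole history.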
Thomas Orozco
6c6d7705be mononoke: actually check that we have all the content in client uploads
Summary:
Right now, when we upload a hg commit, we check that we have all the content
the client is referencing.

The only problem is we do this by checking that the filenodes the client
mentions exist, but the way we store filenodes right now is we write them
concurrently with content blobs, so it is in fact possible to have a filenode
that references a piece of content that doesn't actually exist.

That isn't quite what one might call satisfactory when it comes to checking the
content does in fact exist, so this diff updates our content checking.

In practice, with the way Mononoke works right now this should be quite free:
the client uploads everything all the time, and then we check later, so this
will just hit in the blobstore cache.

In a future world where clients don't upload stuff they already know we have,
that could be slower, but doing it the way we did it before is simply not
correct, so that's not much better. The ways to make it faster would be:

- Trust that we'll hit in cache when checking for presence (could work out).
- Have the client prove to us that we have this content, and thread it through.

To do the latter, IMO the code here could actually look at the entries that
were actually uploaded, and not check them for presence again, but right now we
have a few layers of indirection that make this a bit tricky (technically, when
`process_one_entry` gets called, that means "I uploaded this", but nothing in
the signature of any of the functions involved really enforces that).

Reviewed By: StanislavGlebik

Differential Revision: D25422596

fbshipit-source-id: 3cf34d38bd6ed1cd83d93c778f04395c942b26c0
2020-12-16 07:29:00 -08:00
Liubov Dmitrieva
d10229e533 add log path for the background backups to hg cloud status output
Summary:
It always takes a bit of time to find the logs. Since we do have the scm daemon ones in the output of `hg cloud status`,
it would be nice to have the ones from background backup as well.

Reviewed By: markbt

Differential Revision: D25560145

fbshipit-source-id: cdf5d76c7c3ebb1492559d32935f9301452a1cd5
2020-12-16 06:23:00 -08:00
Liubov Dmitrieva
7076cda874 fix order of arguments in error message
Summary: The order has been incorrect and led to a confusing message

Reviewed By: krallin

Differential Revision: D25559963

fbshipit-source-id: 4fcb3e53cedcb08675b60b25cbb5da2ca52c08ed
2020-12-16 06:23:00 -08:00
Alex Hornby
46e7757557 mononoke: add resolve_repos function to cmdlib::args
Summary:
Add a resolve_repos function to cmdlib::args for use from jobs that will run for multiple repos at once.

Planning to use this from the walker to scrub multiple small repos from a single process.

Differential Revision: D25422755

fbshipit-source-id: 40e5d499cf1068878373706fdaa72effd27e9625
2020-12-16 01:33:55 -08:00
Xavier Deguillard
a3115dacd8 utils: add a constexpr utf8 checker
Summary:
In a future diff, paths will be validated to make sure they are valid utf8. The
path sanity checker needs to be constexpr to construct global paths, but the
utf8 functions aren't, so let's write one that is.

Reviewed By: chadaustin

Differential Revision: D25562681

fbshipit-source-id: e48ec835c2cc9dc01090918cc7ee8f61b6c05a20
2020-12-16 01:03:32 -08:00
Xavier Deguillard
bc56270443 prjfs: only send a notification on future timeout
Summary:
When compiling with Buck, it tries to create hardlinks, which are rightfully
failing, but that error then triggers a spurious notification. For now, let's
only notify the user on timeout, as these are very likely to be network issues.

Reviewed By: kmancini

Differential Revision: D25500364

fbshipit-source-id: 95b609ae901fa6207c8edba26cd8e6a21ddfe3ac
2020-12-15 18:13:01 -08:00
Xavier Deguillard
1e8ac1405f service: fix build
Summary: The getConnectionContext is deprecated, use getRequestContext instead.

Reviewed By: chadaustin

Differential Revision: D25572462

fbshipit-source-id: 49aedfc6b2c8e79d8daa4bd3792f869fee6a4538
2020-12-15 15:44:09 -08:00
generatedunixname89002005307016
1e1a7fa10d suppress errors in eden - batch 1
Differential Revision: D25562362

fbshipit-source-id: ec62c64396f61335f6cfe8f355ba977bfd1da031
2020-12-15 15:22:22 -08:00
Stanislau Hlebik
86fce15e99 mononoke: refactor log_commits_to_scribe to forcepushrebase
Summary:
This diff does a small refactoring to hopefully make the code a bit clearer.
Previously we were calling log_commits_to_scribe in force_pushrebase and in
run_pushrebase, so log_commits_to_scribe was called twice. It didn't mean that
the commits were logged twice but it was hard to understand just by looking at
the code.

Let's make it a bit clearer.

Reviewed By: krallin

Differential Revision: D25555712

fbshipit-source-id: bed9754b1645008846a86da665b6f3f3483f30da
2020-12-15 15:04:35 -08:00
Stanislau Hlebik
202d3affcf mononoke: add post-push logging test for infinitepush
Summary:
I'm going to do a refactoring later in the stack, so let's add a test to avoid
regressions.

Reviewed By: krallin

Differential Revision: D25535655

fbshipit-source-id: 5ec6633c9c8c25d1affcede0adbc27dd43c48736
2020-12-15 15:04:35 -08:00
Jun Wu
7c1d264b58 py3: fix debugshell compatibility without chg / with demandimport
Summary:
IPython is incompatible with Python 3 demandimport. Disable demandimport to
make it work.

Reviewed By: singhsrb

Differential Revision: D25542394

fbshipit-source-id: 293880dff62e98895bc1ae2d3328d4af25b8218f
2020-12-15 14:52:26 -08:00
Jun Wu
ec0b533381 ui: make ui.debug write to stderr
Summary:
Debug output belongs to stderr.

This makes it possible to turn on debug output without breaking programs
parsing stdout.

Reviewed By: singhsrb

Differential Revision: D25315954

fbshipit-source-id: c7813a824fbf6640cb5b80b5ed2d947e7059d53e
2020-12-15 12:07:47 -08:00
Jun Wu
d7f7bc0181 smartlog: always show "."
Summary:
With `collapse-obsolete`, `.` can be obsoleted and in the middle of a stack, and
therefore not shown. That can be confusing. Make the smartlog revset always show the
`heads` passed in. If `.` is in `heads` (the default), then show it.

Reviewed By: DurhamG

Differential Revision: D24696595

fbshipit-source-id: 7deab109d0e0ae5e703928252bc63312d936955f
2020-12-15 11:52:02 -08:00
Alex Hornby
ba99662d83 mononoke: reduce number of sqlite directory and db creations
Summary:
Add open_existing_sqlite_path so we don't do create_dir_all when we know the db already exists.

Noticed it in passing while investigating something else.

Reviewed By: markbt

Differential Revision: D25469502

fbshipit-source-id: 9810489c84220927937c037d69f5e8e70f2d9038
2020-12-15 11:06:37 -08:00
Robert Balicki
380cec1a54 Suggest connecting to the VPN if hg update doesn't work
Summary: * suggest connecting to the VPN if hg update doesn't work

Reviewed By: sfilipco

Differential Revision: D25551017

fbshipit-source-id: 575f29cce4ab2719f2faae86616fdd9aac739f5f
2020-12-15 08:50:38 -08:00
Thomas Orozco
30f6b1de5d infinitepush: submit LFS blobs via remotefilelog
Summary:
If Rust LFS is in use, we currently don't upload LFS blobs to commit cloud.
This is problematic because if you're going to Mononoke that means you can't
upload, and if you're going to Mercurial that means you're silently not backing
up data.

Reviewed By: StanislavGlebik

Differential Revision: D25537672

fbshipit-source-id: fd61f5a69450c97a0bc0895193f67fd22c9773fb
2020-12-15 08:20:29 -08:00
Xavier Deguillard
4001bb11b9 win: silence a handful of warnings.
Summary: These show up when compiling with Buck, let's silence them.

Reviewed By: chadaustin

Differential Revision: D25513672

fbshipit-source-id: 277afae30059114f3646cdf4feedac442a4ee1b6
2020-12-15 08:07:49 -08:00
Xavier Deguillard
34edb7b618 win: re-use guid for the lifetime of the checkout
Summary:
On Windows, the GUID of the mount point identifies the virtualization instance,
that GUID is then propagated automatically to the created placeholders when
these are created as a response to a getPlaceholderInfo callback.

When the placeholders are created by EdenFS when invalidating directories we
have to pass GUID. The documentation isn't clear about whether that GUID needs
to be identical to the mount point GUID, but for a very long time these have
been mismatching due to the mount point GUID being generated at startup time
and not re-used.

One of the most common issues that users have reported is that sometimes
operations on the repository start failing with the error "The provider that
supports file system virtualization is temporarily unavailable". Looking at the
output of `fsutil reparsepoint query` for all the directories from the file
that triggers the error to the root of the repository shows that one of the
folders and its descendants don't share the same GUID; removing it solves the
issue.

It's not clear to me why this issue doesn't always reproduce when restarting
EdenFS, but a simple step that we can take to solve this is to always re-use
the GUID, and that hopefully will lead to the GUID always being the same and
the error to go away.

Reviewed By: fanzeyi

Differential Revision: D25513122

fbshipit-source-id: 0058dedbd7fd8ccae1c9527612ac220bc6775c69
2020-12-15 08:07:49 -08:00
Durham Goode
d04c074d89 configs: fix issue with config removal
Summary:
The config verifier would remove items from the values list if they
were disallowed. To do this, it iterated through the values list backwards,
removing bad items.  In some cases it stored the index of a bad value for later
use, but because it was iterating backwards and removing things, the index it
stored might not be correct by the time the loop is done. To fix this, let's go
back to iterating forwards.

Reviewed By: quark-zju

Differential Revision: D25539737

fbshipit-source-id: 87663f3c162c690f3961b8075814f3467916cb4b
2020-12-15 08:03:50 -08:00
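The class of bug described in d04c074d89 (an index stored during a backwards remove-as-you-go loop going stale) can be reproduced in a small Python example. This illustrates the general pitfall only, not the actual verifier code: both functions and the "remember the last good value" bookkeeping are hypothetical.

```python
def prune_backwards(values, is_bad):
    """Buggy pattern: remove in place while walking backwards.

    The remembered index refers to the list as it was when the item was
    visited; later removals at lower positions shift everything down.
    """
    values = list(values)
    remembered = None
    for i in range(len(values) - 1, -1, -1):
        if is_bad(values[i]):
            del values[i]
        elif remembered is None:
            remembered = i  # may be stale by the time the loop finishes
    return values, remembered


def prune_forwards(values, is_bad):
    """Safer pattern: walk forwards and build the kept list.

    Stored indices point into the new list, so they stay valid.
    """
    kept = []
    remembered = None
    for value in values:
        if not is_bad(value):
            remembered = len(kept)
            kept.append(value)
    return kept, remembered
```

For the input `["ok1", "bad1", "ok2", "bad2"]`, the backwards version returns index 2 for a two-element result, while the forwards version returns index 1, which correctly points at `"ok2"`.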
Mark Juggurnauth-Thomas
f8190ccadd bookmarks_movement: allow commits to land if there are pre-existing case conflicts
Summary:
The old case-conflict checks were more lenient, and only triggered if a commit
introduced a case conflict compared to its first parent.

This means that commits could still be landed to bookmarks that already had
pre-existing case conflicts.

Relax the new case-conflict checks to allow this same scenario.

Note that we're still a bit more strict: the previous checks ignored other
parents, and would not reject a commit if the act of merging introduces a case
conflict.  The new case conflict checks only permit case conflicts in the case
where all conflicting files were present in one of the parents.

Reviewed By: StanislavGlebik

Differential Revision: D25508845

fbshipit-source-id: 95f4db1300ee73b8e6495ba8b5c1c2ce5a957d1a
2020-12-15 01:53:39 -08:00
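The relaxed rule in f8190ccadd (a case conflict is tolerated only when all of the conflicting files were already present in one of the parents) might be modelled like this. It is a simplified Python sketch: paths are compared by lowercasing the whole string, file-versus-directory conflicts are ignored, and `case_conflicts` and `commit_allowed` are hypothetical names rather than Mononoke functions.

```python
from collections import defaultdict
from typing import Iterable, List, Set


def case_conflicts(paths: Iterable[str]) -> List[Set[str]]:
    """Return groups of paths that differ only by case."""
    groups = defaultdict(set)
    for path in paths:
        groups[path.lower()].add(path)
    return [group for group in groups.values() if len(group) > 1]


def commit_allowed(
    commit_paths: Iterable[str], parents_paths: List[Iterable[str]]
) -> bool:
    """Allow a conflict only if one single parent already contained all of it."""
    parent_sets = [set(paths) for paths in parents_paths]
    for group in case_conflicts(commit_paths):
        if not any(group <= parent for parent in parent_sets):
            return False
    return True
```

Under this model a merge whose parents each contain one side of the conflict is rejected, because no single parent held every conflicting file.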
Alex Hornby
a39070bbcd mononoke: small simplification for filenode derivation
Summary: Spotted this in passing.  Was able to remove a call to fetch_root_manifest_id.

Reviewed By: StanislavGlebik

Differential Revision: D25472678

fbshipit-source-id: d450cb97630464be13d22fb37c3356611dc2e1b6
2020-12-15 00:48:03 -08:00
Alex Hornby
b94b1ba21e mononoke: add "all" as an option for walker NodeType and EdgeType args
Summary: This makes it easier to run full walks on small repos.

Reviewed By: StanislavGlebik

Differential Revision: D25469485

fbshipit-source-id: 6e5b1426837a396d939e47a5b353e615437ae7cb
2020-12-15 00:48:03 -08:00
Chad Austin
54cbc57962 doctor: handle invalid contents of .buckd/pid
Summary:
If .buckd/pid is empty or invalid, assume that means Buck is not
running.

Reviewed By: genevievehelsel

Differential Revision: D25544725

fbshipit-source-id: 101ef67e17ff3e06f428cd7dbf51b2587fee4627
2020-12-14 23:18:21 -08:00
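The behaviour described in 54cbc57962 amounts to treating an unreadable, empty, or non-numeric pid file as "Buck is not running". A minimal Python sketch, assuming the pid file holds a single decimal integer; `read_buckd_pid` is a hypothetical helper, not the function used by `eden doctor`.

```python
from pathlib import Path
from typing import Optional


def read_buckd_pid(project_root: Path) -> Optional[int]:
    """Return the pid recorded in .buckd/pid, or None if it cannot be trusted.

    A missing, empty, or malformed file is interpreted as "not running".
    """
    pid_file = project_root / ".buckd" / "pid"
    try:
        # int() rejects empty and non-numeric text with ValueError;
        # undecodable bytes raise UnicodeDecodeError, a ValueError subclass.
        pid = int(pid_file.read_text().strip())
    except (OSError, ValueError):
        return None
    return pid if pid > 0 else None
```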
Jun Wu
cb6bd19903 streams: restore commit text prefetching behavior
Summary:
Restore the behavior disabled by D25350916 (49c6f86325). This time it no longer runs Python
logic in background threads.

Reviewed By: sfilipco

Differential Revision: D25513054

fbshipit-source-id: 0220ccb37e658518d105bba04f45424c9fcfe142
2020-12-14 19:10:44 -08:00
Jun Wu
3a371e7aa7 hgcommits: be aware of hardcoded commit hashes in more places
Summary:
Make `get_commit_raw_text` aware of hg's hardcoded commit hashes: NULL_ID and
WDIR_ID. Previously, only `stream_commit_raw_text` is aware of it.

This makes it a bit more compatible when used in more places.

Reviewed By: sfilipco

Differential Revision: D25515006

fbshipit-source-id: 08708734a28f43acf662494df69694988a5b9ca0
2020-12-14 19:10:44 -08:00
Jun Wu
843ef6aab2 pydag: add API to get commit text in batch
Summary:
Unlike streamcommitrawtext, the new API does not run Python logic in a
background thread. This makes it easier to reason about the Python logic, as
it does not need to be thread-safe, and we don't need to think about Python GIL
deadlocks in the Rust async world.

Reviewed By: sfilipco

Differential Revision: D25513057

fbshipit-source-id: 4b30d7bab27070badd205ac1a9d54bae7f1f8cec
2020-12-14 19:10:44 -08:00
Jun Wu
1c2135dd15 hgcommits: add fallback 1-by-1 fetching for the hybrid backend
Summary:
Previously, only the batch fetching or the stream fetching APIs would
actually fetch commits remotely. The 1-commit fetching API does not have
the network side effect, with the hope that we can migrate all usecases
to stream or batch fetching.

Practically it's quite difficult to migrate all use-cases, and the Python
layer has to have a fallback 1-by-1 fetching. Now let's just move that
fallback to Rust to simplify the code. The fallback in the Rust code
is by the default impl of get_commit_raw_text.

Reviewed By: sfilipco

Differential Revision: D25513056

fbshipit-source-id: b3c615397d33b8d35876dc23ca7b95173783ef80
2020-12-14 19:10:44 -08:00
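The "fallback lives in a default implementation" idea from 1c2135dd15 can be illustrated in Python with a base class. This is one plausible shape, assuming the single-commit method delegates to a batch method; the actual code is a Rust trait, and `CommitTextStore`, `DictStore`, and `get_commit_raw_text_list` are hypothetical names (only `get_commit_raw_text` appears in the commit message).

```python
from typing import Dict, List, Optional


class CommitTextStore:
    """Hypothetical interface: backends implement only the batch method."""

    def get_commit_raw_text_list(self, ids: List[str]) -> List[Optional[bytes]]:
        raise NotImplementedError

    def get_commit_raw_text(self, commit_id: str) -> Optional[bytes]:
        # Default implementation: fetch one commit as a batch of size one,
        # so callers using the 1-by-1 API still reach the backend.
        return self.get_commit_raw_text_list([commit_id])[0]


class DictStore(CommitTextStore):
    """Toy backend used only to exercise the default method."""

    def __init__(self, data: Dict[str, bytes]):
        self._data = dict(data)

    def get_commit_raw_text_list(self, ids: List[str]) -> List[Optional[bytes]]:
        return [self._data.get(commit_id) for commit_id in ids]
```

Keeping the fallback in the base type means individual backends and their callers do not each need their own 1-by-1 code path.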
Jun Wu
8d66ade1de hgcommits: add API to get commit text in batch
Summary: The API will be used in Python bindings to avoid running Python in background threads.

Reviewed By: sfilipco

Differential Revision: D25513055

fbshipit-source-id: a108b55115271a256c0d43e0ff7b82c0b209be81
2020-12-14 19:10:44 -08:00
Jun Wu
1a5755de2c smartset: make __iter__ do prefetch
Summary:
Previously only `iterctx` does prefetch. Make `__iter__` do prefetch via `iterctx`.
The old `__iter__` without prefetching was renamed to `iterrev`.

Reviewed By: sfilipco

Differential Revision: D24365404

fbshipit-source-id: db5c687066794257719bb64c673dc384b5460ff1
2020-12-14 13:12:43 -08:00
Jun Wu
4cde87b34b smartset: drop repo from iterctx
Summary:
Now smartset has a reference to repo. It does not need `repo` from an external
source.

Reviewed By: sfilipco

Differential Revision: D24365405

fbshipit-source-id: 8a43697b7b84a8a41691ed8f095c271107a90f16
2020-12-14 13:12:42 -08:00
Jun Wu
1a1b3a3cb5 smartset: make repo required for baseset
Reviewed By: sfilipco

Differential Revision: D24365401

fbshipit-source-id: 4d0ee6d27717c1aa966086af68492295aa6ed372
2020-12-14 13:12:42 -08:00
Jun Wu
99c2440f9e smartset: make repo required for nameset
Reviewed By: sfilipco

Differential Revision: D24365402

fbshipit-source-id: 72282634dfba862eab39cef176f0073350921ffc
2020-12-14 13:12:42 -08:00
Jun Wu
0ef3c1aa32 smartset: make repo required for generatorset
Reviewed By: sfilipco

Differential Revision: D24365403

fbshipit-source-id: 8cdee2d344b6af02804c6c0dca3243057df0a643
2020-12-14 13:12:42 -08:00
Jun Wu
01553ffe03 smartset: make repo required for idset
Reviewed By: DurhamG

Differential Revision: D24365406

fbshipit-source-id: 17cbe8e5fc1ea6025b5006737dedb6f744c009a4
2020-12-14 13:12:41 -08:00
Jun Wu
1327b6ec3b smartset: add optional repo reference to smartset objects
Summary:
This will enable `__iter__` to do proper prefetch, or make it possible
for `__iter__` to return node instead of rev. To avoid cycles, weakref is used.

The smartset types are used widely. It's hard to migrate all callsites at once.
For now, `repo` is optional. Later, it will be required.

Reviewed By: DurhamG

Differential Revision: D24365400

fbshipit-source-id: 5dd40e3d930893c39f16da8f3169b026c8933bd2
2020-12-14 13:12:41 -08:00
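The optional, weakly referenced `repo` described in 1327b6ec3b can be sketched as below. `Repo` and `SmartSet` here are hypothetical stand-ins, not the real smartset classes; the point is only that a `weakref` lets the set reach its repo without creating a reference cycle.

```python
import weakref


class Repo:
    """Hypothetical stand-in for a repository object."""

    def __init__(self, name):
        self.name = name


class SmartSet:
    """Hypothetical set of revisions holding an optional weak repo reference."""

    def __init__(self, revs, repo=None):
        self._revs = list(revs)
        # Store a weak reference so the set never keeps the repo alive.
        self._reporef = weakref.ref(repo) if repo is not None else None

    @property
    def repo(self):
        # Returns None if no repo was given or it has been garbage collected.
        return self._reporef() if self._reporef is not None else None

    def __iter__(self):
        return iter(self._revs)
```

Callers that have not been migrated yet can keep constructing the set without a repo, in which case `repo` is simply `None`.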