Commit Graph

10050 Commits

Author SHA1 Message Date
Jun Wu
775899c0f2 dag: make protocol use IdSet instead of Vec<Id>
Summary: IdSet is more compact. This changes the order a bit.

Reviewed By: sfilipco

Differential Revision: D27339279

fbshipit-source-id: e9b50a47beba081b892eccd7711dbd6ab5c3a886
2021-04-05 12:55:40 -07:00
Jun Wu
c515d1f54f dag: show AnestorPath batch size in debug output
Summary: This will be used by the next change.

Reviewed By: sfilipco

Differential Revision: D27406591

fbshipit-source-id: fcacc35a9ae8ed96cebb2af804d26d1e5e83ad9e
2021-04-05 12:55:40 -07:00
Jun Wu
95ece1d6fe dag: add a way to flush the overlay map
Summary:
Add a way to flush the overlay map to disk so we can avoid network fetches over
and over.

Reviewed By: sfilipco

Differential Revision: D27406592

fbshipit-source-id: 7086ad665119cc3a0834f533690325c7a2363442
2021-04-05 12:55:40 -07:00
Jun Wu
dd042424f3 dag: move (x~n, name) -> (id, name) calculation to a function
Summary: It will be reused elsewhere.

Reviewed By: sfilipco

Differential Revision: D27406593

fbshipit-source-id: 296cf5f50830bb7285e0cb9c7c15a9b374689819
2021-04-05 12:55:40 -07:00
Jun Wu
5326b18c2b dag: track x~n paths in NameDag
Summary:
I spent some time thinking about how to flush the "overlay_map" to disk.
It is a bit tricky because the on-disk IdMap might have changed in an
incompatible way. I tried a few ways to verify the on-disk IdMap remains
compatible and didn't find a way that looks good (either slow - calculating
universal_ids, or is not 100% correct in corner cases).

Now I come up with this "just track x~n" idea. It trades memory usage (~2x
overlay_map) for easy-to-verify correctness, and efficient overlay_map
flush.

Reviewed By: sfilipco

Differential Revision: D27406583

fbshipit-source-id: 0b7fb3186a9c15f376c1dc4afe7f0516c25d3dec
2021-04-05 12:55:39 -07:00
Jun Wu
e6d231818d dag: add more comments about NameDag locking
Summary: It is not obvious. So let's add more comments.

Reviewed By: sfilipco

Differential Revision: D27406584

fbshipit-source-id: 9ce1215efc1a6d4849180c6693616613c08f2a51
2021-04-05 12:55:39 -07:00
Jun Wu
2b5f78d0ac dag: add a test about sparse dag
Summary:
A sparse dag does not have full IdMap. Its IdMap only contains "universally known" entries.

Add a basic test about cloning from a sparse clone data and resolve vertex <-> id mapping
on the fly.

Reviewed By: sfilipco

Differential Revision: D27352018

fbshipit-source-id: 4a3f5f50be52e91bf7b2021cdc858bcab9c99e80
2021-04-05 12:55:39 -07:00
Jun Wu
d5b5e1ea93 dag: make import_clone_data flush the dag directly
Summary:
The `NameDag::flush` API will actually rebuild the graph using a "parent" function.
That is not necessary if we got clone data, and won't work well for a lazy graph
(since the parent function talks about vertex names and some names are missing).

Let's bypass the `flush` function and write data directly in `import_clone_data`.

Reviewed By: sfilipco

Differential Revision: D27352019

fbshipit-source-id: a79569d25d858447b8c5eb86902b8d39ae0429a3
2021-04-05 12:55:39 -07:00
Jun Wu
cdbc0b9bb1 dag: add ways to use a NameDag as an implementation of the remote protocols
Summary: This will be used in tests.

Reviewed By: sfilipco

Differential Revision: D27343882

fbshipit-source-id: 5a2d94a9f755eed0fc27e5a11093b55c810dc8da
2021-04-05 12:55:39 -07:00
Jun Wu
13c9880eca dag: add logic to export clone data
Summary:
Implement logic to export the clone data. This allows us to construct a sparse/lazy
dag via export + import CloneData.

Reviewed By: sfilipco

Differential Revision: D27343885

fbshipit-source-id: 71dc0d31e36876a8b6a8c3d7f3498be3262ce297
2021-04-05 12:55:39 -07:00
Jun Wu
ec2c1a7928 dag: add verification importing clone data
Summary:
Clone data can only be imported to an empty Dag and universally known vertexes
should be present in the IdMap.

Reviewed By: sfilipco

Differential Revision: D27343888

fbshipit-source-id: ba150d6afdbe15f0902ec20ff150a70657e24c80
2021-04-05 12:55:39 -07:00
Jun Wu
429b0e1e15 dag: make import_clone_data async
Summary: It'll use some async functions.

Reviewed By: sfilipco

Differential Revision: D27406585

fbshipit-source-id: e757796f712a5f95f1227f88e797e43551039f0b
2021-04-05 12:55:38 -07:00
Jun Wu
fed4cdbe50 dag: implement Id prefetch for IdStatic set
Summary: Make IdStatic prefetch Id -> Names on iteration.

Reviewed By: sfilipco

Differential Revision: D27343886

fbshipit-source-id: 7957b574c8c14cfea476b9c42cbf9f11fefa39be
2021-04-05 12:55:38 -07:00
Jun Wu
8364f23186 dag: make IdConvert batch API on NameDag use less network requests
Summary: Collect "missing" items and only use one request to fetch them.

Reviewed By: sfilipco

Differential Revision: D27406588

fbshipit-source-id: a5cd091b39d90c1ad0e7c5d509673c4665232304
2021-04-05 12:55:38 -07:00
Jun Wu
55b0132518 dag: implement ExactSizeIterator on SpanSet
Summary: This allows something like `iter.rev().take(n).rev()`.

Reviewed By: sfilipco

Differential Revision: D27343887

fbshipit-source-id: 06c095eb448272dca6add0e707cdf38f0daee252
2021-04-05 12:55:38 -07:00
Jun Wu
0326d38c75 dag: add batch id <-> name API in IdConvert
Summary:
This will be used by upcoming changes. Sparse/Lazy NameSet will override it
to reduce network round-trips.

Reviewed By: sfilipco

Differential Revision: D27406590

fbshipit-source-id: a44a73b4aec6e14d6e82d55285fe1cfc0fcfd482
2021-04-05 12:51:38 -07:00
Jun Wu
7c8056ae38 dag: add APIs to test if Id or Vertex is present in IdMap locally
Summary: This will be used by upcoming changes.

Reviewed By: sfilipco

Differential Revision: D27343884

fbshipit-source-id: 0938b1fb3d90b35f9d51c468cffca53e3f421bb8
2021-04-05 12:51:38 -07:00
Jun Wu
47e1c97dc9 streams: drop mut from resolve_remote
Summary: The `mut` was unused. Remove it to make the interface more flexible.

Reviewed By: sfilipco

Differential Revision: D27406594

fbshipit-source-id: 1cfa4921015fc89b6c71ed4a97d9c351f56c7370
2021-04-05 12:51:37 -07:00
Jun Wu
0ff3c9109d dag: make NameDag::IdConvert resolve vertexes remotely
Summary:
Remove some TODOs. This serves as fallback paths where batch prefetch didn't happen.
I'd expect most use-cases will trigger IdStatic set's batch prefetch logic (to be
added).

I haven't decided what to do exactly with "contains". Fetching remotely there seems
to require some kind of negative cache (ex. in mutation records there might be nodes
not in the graph). But it _might_ be okay to say the "contains" is a local-only
operation, too. I leave it as-is and we'll see how the Python world uses "contains"
later.

Reviewed By: sfilipco

Differential Revision: D27339275

fbshipit-source-id: ba70b3c84a391a8e395c73ccd1d7e08f92b0cbd0
2021-04-05 12:51:37 -07:00
Jun Wu
b7c63d192d dag: add methods to resolve id <-> names remotely on NameDag
Summary:
Put everything together. I used "programming error" extensively to
provide more context if we have to investigate issues in the future.

Reviewed By: sfilipco

Differential Revision: D27339278

fbshipit-source-id: 574a2c048dc1d24dbe690f862fec3e5078cb067a
2021-04-05 12:51:37 -07:00
Jun Wu
8a381893db dag: improve error message in protocol
Summary: Provide more context about what invariants we expect. Not just show "vertex not found".

Reviewed By: sfilipco

Differential Revision: D27339273

fbshipit-source-id: 1c6c92537ff37666ff603783adfd8f9ea770fbaa
2021-04-05 12:51:37 -07:00
Jun Wu
c85c750baa dag: add a remote protocol field to NameDag
Summary:
Makes NameDag own the state (logic) about how to send remote requests.
So NameDag can send requests on its own.

Reviewed By: sfilipco

Differential Revision: D27339282

fbshipit-source-id: 3cb6327dfeaefae45d4e7b88a3535463a84b195b
2021-04-05 12:48:34 -07:00
Jun Wu
79b40c5ce8 dag: define remote protocol
Summary: Define a trait for implementing the remote protocol elsewhere.

Reviewed By: sfilipco

Differential Revision: D27339281

fbshipit-source-id: da5b316d98863507361d3bde4988fd6c9098f48c
2021-04-05 12:48:34 -07:00
Jun Wu
31ab817ba6 dag: make NameDag IdConvert consider overlay IdMap
Summary: This will make the overlay IdMap effective when query against the NameDag.

Reviewed By: sfilipco

Differential Revision: D27339276

fbshipit-source-id: 80712bf651beb6c7e9f23bd4233c6d916101696a
2021-04-05 12:48:34 -07:00
Stiopa Koltsov
babc84bed6 Add sandcastle_instance_id column to scuba
Summary: To be able to quickly find all activity related to given Sandcastle job as discussed in [this workplace post](https://fb.workplace.com/groups/2120196508269853/permalink/2899040243718805).

Reviewed By: kmancini

Differential Revision: D27541803

fbshipit-source-id: a55900064bbee92da902de785ebe0c0e8738c3a2
2021-04-03 00:11:47 -07:00
Xavier Deguillard
5fae7d2ad5 merge: allow specifiying a number of threads of remove/writer phase
Summary:
When watchman is in use, removing tons of files very quickly leads to FSEvents
losing events, causing watchman to recrawl the entire repository. It's not
entirely clear today what the maximum number of threads to use is, let's make
it configurable.

Reviewed By: andll

Differential Revision: D27536577

fbshipit-source-id: 58f2bc453321fd4e7468d3324ee4df2b79b96d5d
2021-04-02 20:19:05 -07:00
Morgan Newman
5debd246b9 add --prefetch-metadata flag to prefetch command
Summary: As titled.

Reviewed By: kmancini

Differential Revision: D27519428

fbshipit-source-id: 00da2c8dc6647d87c935ee2db0bb11fe2f7f8103
2021-04-02 14:33:59 -07:00
Durham Goode
82f39a44fb checkout: check HgId during resumable checkout
Summary:
We already checked size and mtime, which should allow us to identify if
a file has changed since the last update, but we need to check hgid as well, to
make sure the previously-written content is supposed to be the same as the
about-to-be-written content

Reviewed By: andll

Differential Revision: D26955401

fbshipit-source-id: 859ad35b008e68d699601217f2a4398c6789913e
2021-04-02 12:58:35 -07:00
Durham Goode
d596d1284a checkout: allow resuming an interrupted checkout
Summary:
Updates native checkout to store which files have already been written
in .hg/upgradeprogress. Then enables it to load that file and skip writing those
files if they're already on disk and their mtime and size match the previously
written values. In theory we could record and check file hashes as well, but
that'd likely slow things down quite a bit.

Future diffs will add:
- Recording and checking the hgid that was written before vs what is about to be
  written. Just an hgid comparison, not a full hash computation.
- Some UX to inform the user when hg checkout can be continued, and possibly to
  implement 'hg checkout --continue'.

Reviewed By: andll

Differential Revision: D26830249

fbshipit-source-id: 88a75080966dae5241550ed7eedbc057c65966dd
2021-04-02 12:58:35 -07:00
Stiopa Koltsov
ec1e6b5bea log session id on startup
Summary:
Currently eden on startup prints:

```
Starting edenfs (dev build), pid 190
Opening local RocksDB store...
Opened RocksDB store in 0.073 seconds.
Could not parse config.json file: couldn't read /var/twsvcscm/local/.eden/config.json: No such file or directory
Skipping remount step.
Started edenfs (pid 190)
Logs available at /var/twsvcscm/local/.eden/logs/edenfs.log
```

Would be convenient if it also printed session id, to be able to query scuba straight away.

Reviewed By: chadaustin

Differential Revision: D27522665

fbshipit-source-id: d7d4cf6c97bc551061761f2653375f208e393498
2021-04-02 11:36:13 -07:00
Stiopa Koltsov
afddf66676 Move getSessionId to a separate file
Summary: Refactoring to make the following diff smaller.

Reviewed By: chadaustin

Differential Revision: D27522581

fbshipit-source-id: 8f858714fcbfe4b8f8b1c3678bb2003623abbd94
2021-04-02 11:36:13 -07:00
Stiopa Koltsov
fe00ae209e Do not inherit stderr copy in scribe_cat
Summary:
Still trying to enable Scuba logging in ovrsource TD to figure out what caused Buck regression when we started working with multiple watchman instances.

Did not try previous fix of pgrp (deploying stuff on Sandcastle is not trivial), although now I'm not sure that pgrp was the issue. Hard to say.

But now I'm launching CI differently and observe different symptoms.

`eden start` command finishes in 30 seconds (according to logs), but Sandcastle waits for something for 10 minutes and then somebody kills `scribe_cat` and Sandcastle continues. I don't really know what that means, but there's another issue I discovered:

`scribe_cat` inherits a copy of stderr fs created during daemon startup.

This diff fixes the issue.

Reviewed By: kmancini

Differential Revision: D27494520

fbshipit-source-id: 069f4e9ea1efb553cf7a7f18e20ae92c27da808d
2021-04-02 09:57:16 -07:00
Meyer Jacobs
b6bae3550c scmstore: python bindings for constructing treescmstore object
Summary: Basic python constructor bindings for tree scmstore.

Reviewed By: DurhamG

Differential Revision: D27246806

fbshipit-source-id: 169adef7f01e8b6d96f35614b414f8190a40ab8c
2021-04-01 18:07:31 -07:00
Stiopa Koltsov
c247025da9 Log how long it took for eden to start
Summary:
I see in Sandcastle logs it was 10 minutes between `eden start` command and the following command.

So I'm curious, is it eden took 10 minutes to start or Sandcastle did something else but did not log it?

This little message in log will help to understand that.

Reviewed By: kmancini

Differential Revision: D27491709

fbshipit-source-id: 796c8db5abe49b056bd81b03ea57585a761c3cb6
2021-04-01 10:32:18 -07:00
Mark Juggurnauth-Thomas
69896e90b5 bonsai_hg_mapping: construct rendezvous connections in a blocking closure
Summary:
`RendezVousConnection::new` can block for some time doing work on the CPU,
specifically creating the stats objects.  This causes problems for other
futures during repo construction.

Instead, move rendez-vous construction to a `spawn_blocking` closure, so that
it doesn't interfere with the other futures.

Since `SqlBonsaiHgMapping::from_sql_connections` is not async, and is part of
the SqlConstruct trait, we must convert this to the builder pattern so that we
can defer rendez-vous construction to later on.

Reviewed By: farnz

Differential Revision: D27501915

fbshipit-source-id: 9c58c32411301128424985deeab127d052c43532
2021-04-01 08:27:15 -07:00
Alex Hornby
d2f915d4dd mononoke: compress manual_scrub output files if requested
Summary: Use async-compression to optionally zstd compress key and error files.

Reviewed By: farnz

Differential Revision: D27467761

fbshipit-source-id: e1ccb7dc32e4c41eaba82a3716cf4d13f64f71ea
2021-04-01 07:02:06 -07:00
Alex Hornby
9f9521dd4e rust: update zstd crates to zstd 1.4.8
Summary: Vendor updated zstd crates. Started on this as a prerequisite for vendoring updated async-compression crate, however dependency resolution means this actually updates async-compression as well.

Reviewed By: farnz

Differential Revision: D27467760

fbshipit-source-id: 74ca9e4da9f336cf609cf06c62559c9ef913c9a2
2021-04-01 06:20:12 -07:00
Durham Goode
dee3c000dd extensions: fix double loading extensions
Summary:
The extensions loading logic built up an _order list to track what order extensions should be loaded in. It also treated that list as the list of what has already been loaded, if we load extensions multiple times (like once at dispatch, then later for repo creation). Unfortunately some extensions, like tweakdefaults, modify the _order list, so when extensions uses len(_order) to decide which extensions are new, it sometimes gets the wrong results because tweakdefaults has inserted values at the beginning of _order which messes up the offset.

This causes some extension's uisetup and extsetup to be invoked twice. With dynamicconfig on Windows, I was seeing pushrebase load twice, which caused an error due to double registering bundle2 parts.

Extension loading is a mess, so this whole ordering concept really doesn't make sense, but for now let's fix it by not relying on the _order offset.

Reviewed By: quark-zju

Differential Revision: D27470828

fbshipit-source-id: e7ab8f87dfde22d82e5e080e8c01b884458b30c3
2021-03-31 21:32:37 -07:00
Stiopa Koltsov
18957b3401 Call setpgrp early in daemon init
Summary:
We call `setsid` too late, [on successful startup](https://fburl.com/diffusion/p88am07t), after `scribe_cat` binary launched.

As a result, `scribe_cat` does not belong to process group of eden, it belongs to the process group of launcher process, so it is killed by `timeout` command in ovrsource CI.

To deal with it, this diff proposes additionally calling `setpgrp` call early during initialization.

Reviewed By: chadaustin

Differential Revision: D27436474

fbshipit-source-id: acb168c1ab5b78b1598c34ebece88847a9a6480d
2021-03-31 18:34:55 -07:00
Stefan Filip
58996f610d segmented_changelog: increase wait time before starting periodic update
Summary:
Let's give some time for bookmarks to warm up before we start checking the
location of the master bookmark.

Reviewed By: quark-zju

Differential Revision: D27472520

fbshipit-source-id: f20e6f04acadaf8b7caf6aa38a6ab9d244c6f8ea
2021-03-31 17:16:01 -07:00
Xavier Deguillard
82720fa0b8 privhelper: make NFS mount soft
Summary:
In the case where EdenFS crashes, any client performing IO to EdenFS would be
stuck in the D state waiting for the server to be back up and running. Since on
restart EdenFS will chose different ports, this makes it hard/impossible to
actually recover and will keep some process unkillable, often time requiring
rebooting a host.

To avoid this, we can make the mount "soft" which aborts the IO after some time
has elapsed and the server isn't back up. This should enable us to kill these
processes and recover more gracefully.

Reviewed By: chadaustin

Differential Revision: D27402817

fbshipit-source-id: ff81a4360900c4c94665174e3c0cf63402f1533e
2021-03-31 13:34:20 -07:00
Arun Kulshreshtha
3295f0f1fe hg-http: add support for --insecure flag
Summary:
Add a `set_global_config` function to `hg-http` that gets called prior to command execution. This function can be used to configure the global behavior of all HTTP requests by setting up a callback that modifies every `Request` prior to sending.

Right now, the use case for this is to support the `--insecure` flag, as can be seen in the code.

Reviewed By: quark-zju

Differential Revision: D27243638

fbshipit-source-id: 61a52c1f4b56fffd860c2a7e9c7b03e146b2240a
2021-03-31 12:53:46 -07:00
Arun Kulshreshtha
e26bc3ffa4 clidispatch: add --insecure global flag
Summary: Add the `--insecure` flag to the list of global hg flags. This flag is already supported by Python, and adding it here will make it available to Rust code as well. This is used later in the stack.

Reviewed By: quark-zju

Differential Revision: D27243029

fbshipit-source-id: 150d42ee96e1e3194ff1be6a33d9b36887d86f2c
2021-03-31 12:53:46 -07:00
Mark Juggurnauth-Thomas
555375e20f smartlog: focused smartlog should include only draft related commits
Summary:
The `focusedbranch` revset should only include commits that are related via
draft commits.  If a commit is landed, its public successor shouldn't cause
focused smartlog to include descendants of the landed commit.

Reviewed By: quark-zju

Differential Revision: D27461850

fbshipit-source-id: c0eb4fbcd6cf1cd78a88e3f796b5a16257dfb190
2021-03-31 12:12:43 -07:00
Mark Mendoza
31282ad8e7 Have scratch set_file_owner on all created parents of created directory
Summary:
When running `sudo mkscratch path /home/markamendoza`, the folder `/data/users/markamendoza/scratch/homeZmarkamendoza` will be created, including all parents that did not already exist.  All these new directories will be created with owner = root.

Anticipating this kind of call, `mkscratch` already goes in and manually modifies the owner of the leaf directory (e.g. `/data/users/markamendoza/scratch/homeZmarkamendoza`), **but will not do so for any of the parent folders it created**.

This is normally not a problem, as long as `/data/users/markamendoza/scratch/` already exists.  If it doesn't then these directories are created with the wrong ownership, which causes problems if we (for example) call `mkscratch` again, without sudo.

This diff remedies this by calling the same `set_file_owner` function (that's already called on the leaf directory) on all parents that were created by `mkscratch`

Reviewed By: chadaustin

Differential Revision: D27378153

fbshipit-source-id: 26c40f4ca478cacf9117093c7b70cbedd679cea6
2021-03-31 11:50:03 -07:00
Liubov Dmitrieva
ea1362321f erase the state of the current workspace when switch to a new workspace
Summary:
Preserving that state causing some bugs.

Also it leads to accumulating lots of .eden-backing-repos/fbsource/.hg/store/commitcloudstate.user files.

It is cleaner to remove it.

Reviewed By: markbt

Differential Revision: D27461467

fbshipit-source-id: 21eebe799afcb919539ae059fc8b707abc74971c
2021-03-31 11:21:14 -07:00
Gus Wynn
fc46c24e8f update tokio to 1.4.0
Summary:
https://github.com/tokio-rs/tokio/releases/tag/tokio-1.4.0
I want the `biased;` option in `tokio::select!`

Reviewed By: ahornby

Differential Revision: D27435341

fbshipit-source-id: c29ca954c319327f62466131ae04483ad091bf49
2021-03-31 10:44:20 -07:00
Stefan Filip
f16a99af95 segmented_changelog: add IdMapVersion check to VersionStore
Summary:
We want to protect against races where we may have multiple writers with
different IdMap versions. We want IdMapVersions to be monotonous increasing.

The common race scenario is between tailers and seeder. We may have a tailer
that starts to updates while a seeder writes a version with a new idmap. The
tailer may try to write a version that has an old version of the idmap. We want
the tailer to fail in this case.

Reviewed By: krallin

Differential Revision: D27210035

fbshipit-source-id: 168c7c6f25622587071337df1172a12b799e0285
2021-03-31 10:29:36 -07:00
Liubov Dmitrieva
580c02a480 implementing hiding of scratch bookmarks with hg hide -B <name>
Summary: Alow hiding scratch bookmarks if selective pull is enabled

Reviewed By: quark-zju

Differential Revision: D27160251

fbshipit-source-id: 0ba839fc5c26189309c92e62293ee85acd8cb218
2021-03-31 10:05:16 -07:00
Harvey Hunt
bf5b2f4c6a mononoke: admin: Update help text of bonsai-fetch
Summary:
This subcommand had the same help text as `content-fetch`. Update the
help text to describe what the command actually does.

Reviewed By: ahornby

Differential Revision: D27463645

fbshipit-source-id: ac4fba62f7836b55c94410889aafbf76a3cad74f
2021-03-31 09:42:11 -07:00