Summary: IdSet is more compact. This changes the order a bit.
Reviewed By: sfilipco
Differential Revision: D27339279
fbshipit-source-id: e9b50a47beba081b892eccd7711dbd6ab5c3a886
Summary: This will be used by the next change.
Reviewed By: sfilipco
Differential Revision: D27406591
fbshipit-source-id: fcacc35a9ae8ed96cebb2af804d26d1e5e83ad9e
Summary:
Add a way to flush the overlay map to disk so we can avoid network fetches over
and over.
Reviewed By: sfilipco
Differential Revision: D27406592
fbshipit-source-id: 7086ad665119cc3a0834f533690325c7a2363442
Summary: It will be reused elsewhere.
Reviewed By: sfilipco
Differential Revision: D27406593
fbshipit-source-id: 296cf5f50830bb7285e0cb9c7c15a9b374689819
Summary:
I spent some time thinking about how to flush the "overlay_map" to disk.
It is a bit tricky because the on-disk IdMap might have changed in an
incompatible way. I tried a few ways to verify the on-disk IdMap remains
compatible and didn't find a way that looks good (either slow - calculating
universal_ids, or is not 100% correct in corner cases).
Now I come up with this "just track x~n" idea. It trades memory usage (~2x
overlay_map) for easy-to-verify correctness, and efficient overlay_map
flush.
Reviewed By: sfilipco
Differential Revision: D27406583
fbshipit-source-id: 0b7fb3186a9c15f376c1dc4afe7f0516c25d3dec
Summary: It is not obvious. So let's add more comments.
Reviewed By: sfilipco
Differential Revision: D27406584
fbshipit-source-id: 9ce1215efc1a6d4849180c6693616613c08f2a51
Summary:
A sparse dag does not have full IdMap. Its IdMap only contains "universally known" entries.
Add a basic test about cloning from a sparse clone data and resolve vertex <-> id mapping
on the fly.
Reviewed By: sfilipco
Differential Revision: D27352018
fbshipit-source-id: 4a3f5f50be52e91bf7b2021cdc858bcab9c99e80
Summary:
The `NameDag::flush` API will actually rebuild the graph using a "parent" function.
That is not necessary if we got clone data, and won't work well for a lazy graph
(since the parent function talks about vertex names and some names are missing).
Let's bypass the `flush` function and write data directly in `import_clone_data`.
Reviewed By: sfilipco
Differential Revision: D27352019
fbshipit-source-id: a79569d25d858447b8c5eb86902b8d39ae0429a3
Summary: This will be used in tests.
Reviewed By: sfilipco
Differential Revision: D27343882
fbshipit-source-id: 5a2d94a9f755eed0fc27e5a11093b55c810dc8da
Summary:
Implement logic to export the clone data. This allows us to construct a sparse/lazy
dag via export + import CloneData.
Reviewed By: sfilipco
Differential Revision: D27343885
fbshipit-source-id: 71dc0d31e36876a8b6a8c3d7f3498be3262ce297
Summary:
Clone data can only be imported to an empty Dag and universally known vertexes
should be present in the IdMap.
Reviewed By: sfilipco
Differential Revision: D27343888
fbshipit-source-id: ba150d6afdbe15f0902ec20ff150a70657e24c80
Summary: Collect "missing" items and only use one request to fetch them.
Reviewed By: sfilipco
Differential Revision: D27406588
fbshipit-source-id: a5cd091b39d90c1ad0e7c5d509673c4665232304
Summary:
This will be used by upcoming changes. Sparse/Lazy NameSet will override it
to reduce network round-trips.
Reviewed By: sfilipco
Differential Revision: D27406590
fbshipit-source-id: a44a73b4aec6e14d6e82d55285fe1cfc0fcfd482
Summary: This will be used by upcoming changes.
Reviewed By: sfilipco
Differential Revision: D27343884
fbshipit-source-id: 0938b1fb3d90b35f9d51c468cffca53e3f421bb8
Summary: The `mut` was unused. Remove it to make the interface more flexible.
Reviewed By: sfilipco
Differential Revision: D27406594
fbshipit-source-id: 1cfa4921015fc89b6c71ed4a97d9c351f56c7370
Summary:
Remove some TODOs. This serves as fallback paths where batch prefetch didn't happen.
I'd expect most use-cases will trigger IdStatic set's batch prefetch logic (to be
added).
I haven't decided what to do exactly with "contains". Fetching remotely there seems
to require some kind of negative cache (ex. in mutation records there might be nodes
not in the graph). But it _might_ be okay to say the "contains" is a local-only
operation, too. I leave it as-is and we'll see how the Python world uses "contains"
later.
Reviewed By: sfilipco
Differential Revision: D27339275
fbshipit-source-id: ba70b3c84a391a8e395c73ccd1d7e08f92b0cbd0
Summary:
Put everything together. I used "programming error" extensively to
provide more context if we have to investigate issues in the future.
Reviewed By: sfilipco
Differential Revision: D27339278
fbshipit-source-id: 574a2c048dc1d24dbe690f862fec3e5078cb067a
Summary: Provide more context about what invariants we expect. Not just show "vertex not found".
Reviewed By: sfilipco
Differential Revision: D27339273
fbshipit-source-id: 1c6c92537ff37666ff603783adfd8f9ea770fbaa
Summary:
Makes NameDag own the state (logic) about how to send remote requests.
So NameDag can send requests on its own.
Reviewed By: sfilipco
Differential Revision: D27339282
fbshipit-source-id: 3cb6327dfeaefae45d4e7b88a3535463a84b195b
Summary: This will make the overlay IdMap effective when query against the NameDag.
Reviewed By: sfilipco
Differential Revision: D27339276
fbshipit-source-id: 80712bf651beb6c7e9f23bd4233c6d916101696a
Summary: To be able to quickly find all activity related to given Sandcastle job as discussed in [this workplace post](https://fb.workplace.com/groups/2120196508269853/permalink/2899040243718805).
Reviewed By: kmancini
Differential Revision: D27541803
fbshipit-source-id: a55900064bbee92da902de785ebe0c0e8738c3a2
Summary:
When watchman is in use, removing tons of files very quickly leads to FSEvents
losing events, causing watchman to recrawl the entire repository. It's not
entirely clear today what the maximum number of threads to use is, let's make
it configurable.
Reviewed By: andll
Differential Revision: D27536577
fbshipit-source-id: 58f2bc453321fd4e7468d3324ee4df2b79b96d5d
Summary:
We already checked size and mtime, which should allow us to identify if
a file has changed since the last update, but we need to check hgid as well, to
make sure the previously-written content is supposed to be the same as the
about-to-be-written content
Reviewed By: andll
Differential Revision: D26955401
fbshipit-source-id: 859ad35b008e68d699601217f2a4398c6789913e
Summary:
Updates native checkout to store which files have already been written
in .hg/upgradeprogress. Then enables it to load that file and skip writing those
files if they're already on disk and their mtime and size match the previously
written values. In theory we could record and check file hashes as well, but
that'd likely slow things down quite a bit.
Future diffs will add:
- Recording and checking the hgid that was written before vs what is about to be
written. Just an hgid comparison, not a full hash computation.
- Some UX to inform the user when hg checkout can be continued, and possibly to
implement 'hg checkout --continue'.
Reviewed By: andll
Differential Revision: D26830249
fbshipit-source-id: 88a75080966dae5241550ed7eedbc057c65966dd
Summary:
Currently eden on startup prints:
```
Starting edenfs (dev build), pid 190
Opening local RocksDB store...
Opened RocksDB store in 0.073 seconds.
Could not parse config.json file: couldn't read /var/twsvcscm/local/.eden/config.json: No such file or directory
Skipping remount step.
Started edenfs (pid 190)
Logs available at /var/twsvcscm/local/.eden/logs/edenfs.log
```
Would be convenient if it also printed session id, to be able to query scuba straight away.
Reviewed By: chadaustin
Differential Revision: D27522665
fbshipit-source-id: d7d4cf6c97bc551061761f2653375f208e393498
Summary: Refactoring to make the following diff smaller.
Reviewed By: chadaustin
Differential Revision: D27522581
fbshipit-source-id: 8f858714fcbfe4b8f8b1c3678bb2003623abbd94
Summary:
Still trying to enable Scuba logging in ovrsource TD to figure out what caused Buck regression when we started working with multiple watchman instances.
Did not try previous fix of pgrp (deploying stuff on Sandcastle is not trivial), although now I'm not sure that pgrp was the issue. Hard to say.
But now I'm launching CI differently and observe different symptoms.
`eden start` command finishes in 30 seconds (according to logs), but Sandcastle waits for something for 10 minutes and then somebody kills `scribe_cat` and Sandcastle continues. I don't really know what that means, but there's another issue I discovered:
`scribe_cat` inherits a copy of stderr fs created during daemon startup.
This diff fixes the issue.
Reviewed By: kmancini
Differential Revision: D27494520
fbshipit-source-id: 069f4e9ea1efb553cf7a7f18e20ae92c27da808d
Summary:
I see in Sandcastle logs it was 10 minutes between `eden start` command and the following command.
So I'm curious, is it eden took 10 minutes to start or Sandcastle did something else but did not log it?
This little message in log will help to understand that.
Reviewed By: kmancini
Differential Revision: D27491709
fbshipit-source-id: 796c8db5abe49b056bd81b03ea57585a761c3cb6
Summary:
`RendezVousConnection::new` can block for some time doing work on the CPU,
specifically creating the stats objects. This causes problems for other
futures during repo construction.
Instead, move rendez-vous construction to a `spawn_blocking` closure, so that
it doesn't interfere with the other futures.
Since `SqlBonsaiHgMapping::from_sql_connections` is not async, and is part of
the SqlConstruct trait, we must convert this to the builder pattern so that we
can defer rendez-vous construction to later on.
Reviewed By: farnz
Differential Revision: D27501915
fbshipit-source-id: 9c58c32411301128424985deeab127d052c43532
Summary: Vendor updated zstd crates. Started on this as a prerequisite for vendoring updated async-compression crate, however dependency resolution means this actually updates async-compression as well.
Reviewed By: farnz
Differential Revision: D27467760
fbshipit-source-id: 74ca9e4da9f336cf609cf06c62559c9ef913c9a2
Summary:
The extensions loading logic built up an _order list to track what order extensions should be loaded in. It also treated that list as the list of what has already been loaded, if we load extensions multiple times (like once at dispatch, then later for repo creation). Unfortunately some extensions, like tweakdefaults, modify the _order list, so when extensions uses len(_order) to decide which extensions are new, it sometimes gets the wrong results because tweakdefaults has inserted values at the beginning of _order which messes up the offset.
This causes some extension's uisetup and extsetup to be invoked twice. With dynamicconfig on Windows, I was seeing pushrebase load twice, which caused an error due to double registering bundle2 parts.
Extension loading is a mess, so this whole ordering concept really doesn't make sense, but for now let's fix it by not relying on the _order offset.
Reviewed By: quark-zju
Differential Revision: D27470828
fbshipit-source-id: e7ab8f87dfde22d82e5e080e8c01b884458b30c3
Summary:
We call `setsid` too late, [on successful startup](https://fburl.com/diffusion/p88am07t), after `scribe_cat` binary launched.
As a result, `scribe_cat` does not belong to process group of eden, it belongs to the process group of launcher process, so it is killed by `timeout` command in ovrsource CI.
To deal with it, this diff proposes additionally calling `setpgrp` call early during initialization.
Reviewed By: chadaustin
Differential Revision: D27436474
fbshipit-source-id: acb168c1ab5b78b1598c34ebece88847a9a6480d
Summary:
Let's give some time for bookmarks to warm up before we start checking the
location of the master bookmark.
Reviewed By: quark-zju
Differential Revision: D27472520
fbshipit-source-id: f20e6f04acadaf8b7caf6aa38a6ab9d244c6f8ea
Summary:
In the case where EdenFS crashes, any client performing IO to EdenFS would be
stuck in the D state waiting for the server to be back up and running. Since on
restart EdenFS will chose different ports, this makes it hard/impossible to
actually recover and will keep some process unkillable, often time requiring
rebooting a host.
To avoid this, we can make the mount "soft" which aborts the IO after some time
has elapsed and the server isn't back up. This should enable us to kill these
processes and recover more gracefully.
Reviewed By: chadaustin
Differential Revision: D27402817
fbshipit-source-id: ff81a4360900c4c94665174e3c0cf63402f1533e
Summary:
Add a `set_global_config` function to `hg-http` that gets called prior to command execution. This function can be used to configure the global behavior of all HTTP requests by setting up a callback that modifies every `Request` prior to sending.
Right now, the use case for this is to support the `--insecure` flag, as can be seen in the code.
Reviewed By: quark-zju
Differential Revision: D27243638
fbshipit-source-id: 61a52c1f4b56fffd860c2a7e9c7b03e146b2240a
Summary: Add the `--insecure` flag to the list of global hg flags. This flag is already supported by Python, and adding it here will make it available to Rust code as well. This is used later in the stack.
Reviewed By: quark-zju
Differential Revision: D27243029
fbshipit-source-id: 150d42ee96e1e3194ff1be6a33d9b36887d86f2c
Summary:
The `focusedbranch` revset should only include commits that are related via
draft commits. If a commit is landed, its public successor shouldn't cause
focused smartlog to include descendants of the landed commit.
Reviewed By: quark-zju
Differential Revision: D27461850
fbshipit-source-id: c0eb4fbcd6cf1cd78a88e3f796b5a16257dfb190
Summary:
When running `sudo mkscratch path /home/markamendoza`, the folder `/data/users/markamendoza/scratch/homeZmarkamendoza` will be created, including all parents that did not already exist. All these new directories will be created with owner = root.
Anticipating this kind of call, `mkscratch` already goes in and manually modifies the owner of the leaf directory (e.g. `/data/users/markamendoza/scratch/homeZmarkamendoza`), **but will not do so for any of the parent folders it created**.
This is normally not a problem, as long as `/data/users/markamendoza/scratch/` already exists. If it doesn't then these directories are created with the wrong ownership, which causes problems if we (for example) call `mkscratch` again, without sudo.
This diff remedies this by calling the same `set_file_owner` function (that's already called on the leaf directory) on all parents that were created by `mkscratch`
Reviewed By: chadaustin
Differential Revision: D27378153
fbshipit-source-id: 26c40f4ca478cacf9117093c7b70cbedd679cea6
Summary:
Preserving that state causing some bugs.
Also it leads to accumulating lots of .eden-backing-repos/fbsource/.hg/store/commitcloudstate.user files.
It is cleaner to remove it.
Reviewed By: markbt
Differential Revision: D27461467
fbshipit-source-id: 21eebe799afcb919539ae059fc8b707abc74971c
Summary:
We want to protect against races where we may have multiple writers with
different IdMap versions. We want IdMapVersions to be monotonous increasing.
The common race scenario is between tailers and seeder. We may have a tailer
that starts to updates while a seeder writes a version with a new idmap. The
tailer may try to write a version that has an old version of the idmap. We want
the tailer to fail in this case.
Reviewed By: krallin
Differential Revision: D27210035
fbshipit-source-id: 168c7c6f25622587071337df1172a12b799e0285
Summary:
This subcommand had the same help text as `content-fetch`. Update the
help text to describe what the command actually does.
Reviewed By: ahornby
Differential Revision: D27463645
fbshipit-source-id: ac4fba62f7836b55c94410889aafbf76a3cad74f