Summary:
Include changes since the last progress update (if any) in the final delta and total time logged.
Without chunking this was a minor inaccuracy, but when chunking in small chunks it meant a lot of stats were missing for each chunk: if a chunk took less than 5 seconds there might not even have been a progress update before this end state.
Reviewed By: StanislavGlebik
Differential Revision: D25852274
fbshipit-source-id: 6cea76e3abd37908475052947794eed442a1ac82
Summary:
When the root type is e.g. Changeset, allow edges from ChangesetToX.
This is important for chunking; doing it independently to check it doesn't break non-chunked behaviour.
Reviewed By: krallin
Differential Revision: D25742294
fbshipit-source-id: ea4c989e9f61b30094d0fd83e543fe14a38254fd
Summary:
Calling the count Final doesn't make sense when chunking as it can appear more than once per tail, so update it to something more appropriate.
Seen is the count of Nodes seen in the stream. Loaded is the count of nodes where NodeData was loaded. When chunking some deferred Nodes will be seen but not Loaded.
Reviewed By: krallin
Differential Revision: D25742283
fbshipit-source-id: 1f10007d94ad2dbd750bfa53bab3e46a2caad7fa
Summary: This logging was being globbed away in tests. Disable it to make tests easier to read
Reviewed By: krallin
Differential Revision: D25742287
fbshipit-source-id: dfba05a688c9b1a56d6ab8746df2df3ea885c239
Summary: Add ability to turn on or off various types of logging, allows removing boilerplate in test cases.
Reviewed By: krallin
Differential Revision: D25742284
fbshipit-source-id: bbbe7f477156fc49ff6779f9a09e1b397ff6f618
Summary: This logging was always being globbed away, so remove it
Reviewed By: krallin
Differential Revision: D25742292
fbshipit-source-id: 75e004d3fdadc617f479beee44999692c267d2a9
Summary:
Add arguments to cmdlib so we can filter log messages by the slog tag, using new Drains added in slog_ext.
To use tagging from slog the form is:
```
const FOO_TAG: &str = "foo";
info!(logger, #FOO_TAG, "hello foo!");
```
Reviewed By: krallin
Differential Revision: D25837627
fbshipit-source-id: b164d508a2e82a80c4ff6f5f35c0c722257b9a2a
Summary:
override_ctx() sets different SessionClass while data is derived. This should
reduce the number of entries we write to blobstore sync queue. See previous
diff for more motivation.
This diff uses override_ctx() for batch_derive() method
Reviewed By: krallin
Differential Revision: D25910465
fbshipit-source-id: b8a3e729c059cad5716b1b09bd2f1cc618273627
Summary:
Previously we weren't initializing tunables at all in derived_data_tailer and
probably in lots of other binaries as well.
This is unfortunate, since derived_data_tailer runs for a long time and it
would be good to be able to control its behaviour.
Let's fix it by using init_mononoke() function.
Reviewed By: ikostia
Differential Revision: D25912156
fbshipit-source-id: eabd1c56120d087a169746077c8a7d36855c2b84
Summary:
We had issues with Mononoke writing too many blobstore sync queue entries while
deriving data for large commits. We've contemplated a few solutions, and
decided to give this one a go.
This approach forces derive data to use Background SessionClass which has an
effect of not writing data to blobstore sync queue if the write to blobstore
was successful (it still writes data to the queue otherwise). This should
reduce the number of entries we write to the blobstore sync queue
significantly. The downside is that writes might get a bit slower - our
assumption is that this slowdown is acceptable. If that's not the case we can
always disable this option.
This diff overrides SessionClass for normal ::derive() method. However there's
also batch_derive() - this one will be addressed in the next diff.
One thing to note - we still write the derived data mapping to the blobstore sync queue. That should be fine as we have a constant number of writes per commit.
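The queueing behaviour described above can be sketched as follows (all names are illustrative, not the real Mononoke API):

```python
def multiplexed_put(put, enqueue_for_healing, key, value, background):
    """Illustrative sketch: Background sessions skip the sync-queue entry
    when the blobstore put succeeds; a failed put is always enqueued so the
    healer can retry it later."""
    try:
        put(key, value)
    except Exception:
        enqueue_for_healing(key)
        return False
    if not background:
        enqueue_for_healing(key)
    return True
```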
Reviewed By: krallin
Differential Revision: D25910464
fbshipit-source-id: 4113d00bc0efe560fd14a5d4319b743d0a100dfa
Summary:
We ran into issues when we write too much data to our blobstore sync queue. The
idea we want to try is to prefer to not write to the queue if puts were
successful.
In the next diff we'll start overriding session class when we derive data.
This diff makes it possible to do so.
Reviewed By: krallin
Differential Revision: D25910463
fbshipit-source-id: 2ead5291acbce4014fbb833cac1dc53a6eb61b13
Summary:
This might be controversial, so I'd like to hear opinions.
We are running low on retention once again, and there's client_ip and
source_hostname field that have largely duplicated info and together use ~14%
of space.
I suggest logging source_hostname if it is set and client_ip otherwise. This should save ~7% of space since most of the time both of these fields are set.
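A tiny sketch of the proposed logic (hypothetical field and function names, not the real logging code):

```python
def add_client_identity(sample, client_ip, source_hostname):
    # Log source_hostname when present and fall back to client_ip only
    # when it is missing, instead of always logging both fields.
    if source_hostname:
        sample["source_hostname"] = source_hostname
    elif client_ip:
        sample["client_ip"] = client_ip
    return sample
```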
Reviewed By: krallin
Differential Revision: D25900993
fbshipit-source-id: f3db59e0dde44a5117059c829df7a1a1c811f641
Summary: Allow us to return arg parsing errors rather than panicking
Reviewed By: krallin
Differential Revision: D25837626
fbshipit-source-id: 87e39de140b1dcd3b13a529602fdafc31233175d
Summary: The ChangesetEntry query is a more constrained query where we know the ids, so we can issue queries in chunks separately
Differential Revision: D25804024
fbshipit-source-id: 6627fa17ee155182285997cb0642c7d7f033da26
Summary:
Allow use of the read replica when fetching bulk ids. Bulkops clients not needing the most up-to-date bounds can use this mode, provided they are not checkpointing the repo min/max ids.
Existing default behaviour is unchanged.
Differential Revision: D25804028
fbshipit-source-id: ca14e929ea94c351e27eed2aa012fe914c8c691e
Summary: I was seeing mysql timeouts (where client doesn't consume query results within 10s) in walker runs using fetch_id(). Spawning the query means the results are collected and ready when we do end up polling.
Differential Revision: D25804025
fbshipit-source-id: 443dd87028fe68de16c174deb7b017d7ce5439d1
Summary: Simplify the code by allowing us to remove the old windowing logic.
Differential Revision: D25804022
fbshipit-source-id: 1f2837c2f83adcb3afdb453a9220ac68509a36ec
Summary:
When querying for changesets in a repo, there are often very few changesets inside a 65536-wide range of ids, which means multiple round trips to the database.
This change adds a LIMIT-based mysql query that can always return up to the specified limit of rows if the repo has them, then uses bounded_traversal_stream to unfold from the highest id loaded in one chunk to the next chunk to be loaded.
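The unfold can be sketched like this (Python pseudocode for the Rust stream; `fetch_chunk` is a hypothetical stand-in for the LIMIT query):

```python
def fetch_in_chunks(fetch_chunk, start_id, limit):
    # fetch_chunk(lower_bound, limit) stands in for a query like
    # "... WHERE id >= lower_bound ORDER BY id LIMIT limit".
    lower = start_id
    while True:
        rows = fetch_chunk(lower, limit)
        if not rows:
            return
        yield from rows
        # Unfold from the highest id loaded to the next chunk.
        lower = rows[-1] + 1
```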
Reviewed By: StanislavGlebik
Differential Revision: D25804023
fbshipit-source-id: 46df2ea48d01bc4143d96642e45066f520faa4d6
Summary: **expected_sha1** was declared but not used in **xattr_test.py**; remove this unused variable and the variables that were used to create **expected_sha1**
Reviewed By: xavierd
Differential Revision: D25899401
fbshipit-source-id: 26f0bb06d2c96e7f6754a4b821ffe4cf59a2f35e
Summary: Small improvement, can pass by reference
Reviewed By: krallin
Differential Revision: D25898714
fbshipit-source-id: f1012b7d947e1ead00cd9c032fea3f3aa04a3072
Summary:
We seem to get CPU spikes. The theory is that they happen when a commit with a
low generation number lands, which triggers a slow path in getbundle code. Note that I've landed two
optimizations (D23824204 (609c2ac257) and D23599866 (54d43b7f95)) which *should* help; however, at the
moment the threshold for what to consider a low generation number is too low, so the
optimization doesn't kick in.
I'd like to verify this theory, hence adding this logging.
Reviewed By: ahornby
Differential Revision: D25884345
fbshipit-source-id: 9686933726ff0a3ae11b541b3738eb08d011abe0
Summary:
In this diff I've replaced the non-transparent error definition error("{0}") with error(transparent).
The reason is that non-transparent errors print the same message as the original errors:
```
Error: failed to complete task
Caused by: other error description <-- this duplicate shouldn't be here
Caused by: other error description
```
Reviewed By: krallin
Differential Revision: D25899411
fbshipit-source-id: e586af86b635a7e2fbf8952297171c546b859300
Summary:
Use the new API to clean up stale logs at open time. This hopefully helps
releasing disk space on Windows if the normal rotation fails to remove
old files being mmap-ed by other processes.
Reviewed By: xavierd
Differential Revision: D25894282
fbshipit-source-id: a3d8247b737dd451ee68b58cc5a38fdd2822c0c3
Summary:
Previously rotation only happens at flush time and file deletion is a best
effort (it might fail on Windows). For use-cases that are sensitive about
space usage that's suboptimal. This diff adds an API to remove old files
manually so high level logic can choose to clean up old files after open().
Reviewed By: xavierd
Differential Revision: D25894283
fbshipit-source-id: fbffff426544b39349ddf3537d46954d3cab5d12
Summary:
Previously, the fsmonitor state update logic would skip updating the treestate if the wlock
could not be obtained. D17468790 (8d4d0a66a2) made it wait for the wlock for the painful "watchman fresh
instance" case. But things can still suck if it's not a "fresh instance" but there are just
too many nonnormal files.
This diff makes exceeding a threshold of nonnormal files trigger an fsmonitor
state write as an attempt to reduce the number of nonnormal files. In addition,
`--debug` was changed to print more internal states for debugging.
This would hopefully address issues where people have a large "nonnormal"
treestate, suffers from the bad performance issue and cannot recover from it
automatically.
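The triggering logic above amounts to a simple threshold check (threshold value and names are illustrative, not Mercurial's actual configuration):

```python
NONNORMAL_THRESHOLD = 1000  # illustrative value; the real knob differs

def maybe_flush_treestate(nonnormal_count, write_state):
    # Exceeding the threshold triggers an fsmonitor state write, shrinking
    # the nonnormal set so later status calls stay fast.
    if nonnormal_count > NONNORMAL_THRESHOLD:
        write_state()
        return True
    return False
```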
Reviewed By: DurhamG
Differential Revision: D25794083
fbshipit-source-id: 741426cf31484d9318f9cfcab11d38da33ab5067
Summary:
We are going to write docs on how to unbind repositories and bind them later.
This integration test is a "test for documentation" i.e. it should run the same
steps as we'd run ourselves whenever we need to rebind.
The key part here is that we do a merge of unbound small repo into large repo commits
Reviewed By: ikostia
Differential Revision: D25883873
fbshipit-source-id: fac1d7871e52f4e1aa7f15d32d39f2755b803cd3
Summary:
Add a command that will unconditionally rewrite commits from one repo to
another. This is similar to what the x-repo lookup thrift API does; however, this
command skips a few safety checks (in particular, it allows rewriting
public commits).
This is going to be used for unbinding procedure.
Reviewed By: ikostia
Differential Revision: D25883874
fbshipit-source-id: a40eda9aa40ef7ad63e2403d956871940dd1601d
Summary: We don't use obsstore. Its size is irrelevant.
Reviewed By: singhsrb
Differential Revision: D25876066
fbshipit-source-id: 5657c3ca08f5ed1cb5a3d1a5b3395ab74756b7e5
Summary: It's used for malloc / realloc which take size_t instead of unsigned int.
Reviewed By: StanislavGlebik, xavierd
Differential Revision: D25890562
fbshipit-source-id: e2787786e9b995431c50b411d77cbce438a82c98
Summary: This is a bit nicer to read
Reviewed By: ahornby
Differential Revision: D25881919
fbshipit-source-id: 3c97390a96410a18e8fdc6cb6279b2d46e407cd3
Summary:
I think it's nice to group them together. Also while there I sorted both groups
lexicographically
Reviewed By: ahornby
Differential Revision: D25881918
fbshipit-source-id: 05bb9f07ee7799c3d5c19a8ffaabadaca635fef2
Summary: Certain doctor items are not meaningful outside Facebook. This diff adds the ability to have Facebook-specific doctors.
Reviewed By: xavierd
Differential Revision: D25789275
fbshipit-source-id: 94160be741a8fc3e8d01e86beaa0d2428952db21
Summary:
Before this diff, we did DNS lookups using a crate called `dns_lookup`. This crate is a thin layer over libc DNS lookups. Those lookups are blocking (i.e. they hold a thread), so they're not very friendly to asynchronous code. We currently offload them on a dedicated thread pool to mitigate the issue, but this isn't ideal: if we experience e.g. slow DNS responses, we could saturate this thread pool pretty easily.
I updated it to use the trust-dns crate, which provides an asynchronous implementation of DNS lookups, and is currently used in other parts of Mononoke.
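For illustration, here is how a blocking lookup gets offloaded to a thread pool in Python's asyncio (a sketch of the mitigation described above, not Mononoke's Rust code; a natively async resolver such as trust-dns avoids holding a thread at all):

```python
import asyncio
import socket

async def resolve(host, blocking_lookup=socket.gethostbyname):
    # Run the blocking libc-style lookup on the default thread pool so the
    # event loop is not held up; slow DNS responses can still saturate
    # that pool, which is the problem an async resolver avoids.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_lookup, host)
```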
Reviewed By: krallin
Differential Revision: D25849872
fbshipit-source-id: 826ab4e5618844f3b48e5def4ad9bd163753ebb1
Summary:
The `pull` command has a lot of tech debt (with issues like inefficiency, race
conditions, named branches, etc.). The new `repo.pull` API isn't designed to
support all legacy use cases.
This diff switches a subset of `pull` command that the new API can support to
the new API. It should solve race condition or `visibility.add` inefficiency
issues (ex. 20s on addgroup, 187s on visibility.add - P154836357).
Logic related to remotenames was ported from the remotenames extension.
The selectivepull tests seem strong enough to check their behaviors.
The new pull API has been used by commit cloud for many months, so I turned the
new code path on by default. It can be turned off by changing
`commands.new-pull`.
There are a few test changes. The new behavior seems more correct overall:
- test-commitcloud-switch-workspace.t
- "M/F" swap is caused by rev number: both are correct
- "S: public remote/stable" is more correct than "S: draft"
- test-commitcloud-sync-rb-deletion.t
- "draft1: draft remote/scratch/draft1" is more correct because
"remote/scratch/draft1" is listed in the "book --list-subs"
command above.
- test-commitcloud-sync-rb-enabling.t
- "public1: draft" not showing up is more correct.
- test-fb-hgext-remotefilelog-lfs-prefetch.t
- Difference seems to be caused by rev order.
Reviewed By: markbt
Differential Revision: D25562089
fbshipit-source-id: ac22b2f0492ab53517d580d706dfbc823fd0e0cc
Summary:
Make the `pull` API update visible heads even if all commits exist locally.
This is a more expected behavior, and will make the "pull" command using the
pull API simpler.
Reviewed By: markbt
Differential Revision: D25562091
fbshipit-source-id: 8a43cfe4afd31d3cb9ad5369a6081de120043168
Summary:
Filters out some obviously public heads at the end of repo.pull so they don't
get passed to `visibility.add`. Note: this just removes some obviously "public"
commits without considering the graph. A stronger guarantee would be removing
public heads at metalog commit time.
Reviewed By: markbt
Differential Revision: D25562092
fbshipit-source-id: cc6a19252fcfe305e3a14895b61ab0d6b84a007e
Summary: Update the name to match the usage of Try as a tri-state, since this method also throws if the Try is empty
Reviewed By: yfeldblum
Differential Revision: D25737810
fbshipit-source-id: a4166153362f07353d212216fbaf7105867eef2a
Summary:
On Windows, since Mercurial doesn't yet build with Buck, we need to test
against the system Mercurial; thus, remove the dependencies on //eden/scm:hg for
the tests. Also remove various dependencies that don't build yet on Windows.
This allows for the tests to run, but fail while trying to execute edenfsctl.par.
Reviewed By: kmancini
Differential Revision: D25807727
fbshipit-source-id: c2533eedc361cc6db9fdf2190476c3d52833139d
Summary: Add a method to get just the ids. This is a new best case for the fetching, so updated the benchmark as well.
Reviewed By: StanislavGlebik
Differential Revision: D25804027
fbshipit-source-id: ccc4573c8a4ebc07db854a0ffa737f572087019e
Summary: Simplify the tests for bulkops: no need to map to hg to check the expected changesets; we can compare to the expected bonsai directly instead, as it's simpler
Reviewed By: StanislavGlebik
Differential Revision: D25804020
fbshipit-source-id: eb4381c37ed6a4fb1e213f0397ffb2136ddee473
Summary: When scrubbing repos it is preferable to scrub newest data first. This diff adds Direction::NewestFirst to bulkops for use in scrubbing and updates existing call sites to Direction::OldestFirst so as not to change behaviour
Reviewed By: StanislavGlebik
Differential Revision: D25742279
fbshipit-source-id: 363a4854b14e9aa970b2c1ec491dcaccac7a6ec9
Summary: Preparation for adding OldestFirst vs NewestFirst direction of fetching in next diff.
Reviewed By: StanislavGlebik
Differential Revision: D25742281
fbshipit-source-id: 655f297efc2094d4325717d97cce53e697c35597
Summary: Add a benchmark tool for bulkops that uses the criterion benchmarking framework. This is so we can measure effect of optimizations later in stack.
Differential Revision: D25804026
fbshipit-source-id: 71b8addf1145c0ecb69d6392b4602172f2b52080
Summary:
We used to get those in the old (Python) LFS extension, but didn't have them in
the new one. However, this is helpful to correlate requests to LFS with data in
hg logs. It's also convenient to be able to identify whether a set of requests
are part of the same session or not.
This diffs threads the client correlator through to the LFS store from the
Python, similarly to how it's done in EdenAPI.
Reviewed By: DurhamG
Differential Revision: D25804930
fbshipit-source-id: a5d5508617fa4184344834bbd8e3423816aa7668
Summary: The dependency that we want to share is 0.14 instead of 0.15.
Reviewed By: singhsrb
Differential Revision: D25871110
fbshipit-source-id: 16e9f8a858ee04a47867c2916909edfc996f8bc4
Summary:
Add `manifest_node` to the crdump output, which is the root manifest node of the commit.
This is useful for detecting commits that have the same tree content but
different metadata (e.g. if only the commit message has been edited).
Reviewed By: singhsrb
Differential Revision: D25782674
fbshipit-source-id: dfdf426833533140b676eee82e123a0cba23c77a
Summary:
We had a mix of callsites using fetch_all_public_changesets directly vs those using the PublicChangesetBulkFetch::fetch.
Update callsites to use PublicChangesetBulkFetch::fetch for consistency. This also has the nice side effect of removing some explicit config_store usage.
Reviewed By: StanislavGlebik
Differential Revision: D25804019
fbshipit-source-id: 5a88888dd915d1d693fb26ffe3bb359c9e918d5c
Summary:
Right now, we have zero visibility into HTTP-level errors (such as broken pipes),
because this is all set up on Gotham with no logging. This diff moves the
bootstrapping from Gotham to us to fix this.
There's a bit of code I'd like to deduplicate here, but it's tricky to do given
that the code is a little different in the HTTP vs. HTTPS branches. For now,
this will give us some logging we need without too much effort. We can make it
more robust (and route it to Scuba or give it session IDs) if this proves
useful.
Reviewed By: StanislavGlebik
Differential Revision: D25851426
fbshipit-source-id: 4ca5d1ecb3931715f04af735aa1b7cfdac87846d
Summary:
This updates Gotham. Under the hood I rebased our fork, you can see the diff
here: P161171514.
The stuff that is relevant is that Gotham got rid of its dependency on
`failure` and now uses `anyhow` instead, and I also added a little bit to our
existing socket data patch by making a few things public so that we can get
access to a few more internals.
Reviewed By: StanislavGlebik
Differential Revision: D25850262
fbshipit-source-id: 25ebf5d63be39e3e93208705d91abc5c61c90453
Summary:
Commit cloud commands shouldn't fail if the set is empty.
The bug was introduced by using the function bookmarks.selectivepullbookmarknames, which has a check at the end that the set is not empty; commit cloud doesn't need that check.
The issue was introduced in D25802193 (713faa71cd) via code refactoring.
Differential Revision: D25853480
fbshipit-source-id: 7d3f057dead097b86269e7b03d78f5523e8f1ec7
Summary:
Depending on the thrift definition, `thrift_library` targets may also depend on `ref-cast`.
Add this to the `Cargo.toml`.
Reviewed By: lukaspiatkowski
Differential Revision: D25636872
fbshipit-source-id: 8263395db2bb31127528f5c66c4cc5dd9180d89f
Summary:
This diff adds a debug command that allows inserting different kinds of mapping
entries:
1) Rewritten, meaning that source repo commit rewrites into a target repo
commit
2) Equivalent working copy, meaning that source repo commit doesn't rewrite
into a target repo commit, but one of its ancestors does
3) NotSyncCandidate, meaning that large repo commit shouldn't be remapped into
a small repo
Reviewed By: ahornby
Differential Revision: D25844996
fbshipit-source-id: 1ba64540cf511da8cc50c80a5bee822a950707be
Summary: Let's be a bit more consistent and use ARG_VERSION_NAME
Reviewed By: krallin
Differential Revision: D25844995
fbshipit-source-id: c09be5a38ef97bc491b324f49a2c7d0b6a47212e
Summary:
Rough progress reporting. The progress bars are straight coming from the
`indicatif` crate. Integrating with the `IO` object is not trivial because
we only have a reference. It gets tricky. I think that it makes sense for
us to expand the IO object to something that is more than a `Box<dyn Write>`.
We have about 3 scenarios:
1. Write object that we need to implement interior mutability for to give out
clones.
2. Stdin/Stdout which have their own implementation for interior mutability.
3. PyObject which has locking already implemented.
(Note: this ignores all push blocking failures!)
Reviewed By: quark-zju
Differential Revision: D25840469
fbshipit-source-id: 87f466f06f2c5d4c63ccb3bbc5c009fae41ed002
Summary:
Right now we don't log handler errors to Scuba. This can make debugging a
little tricky: if a client is sending an invalid request, we'll see that we
sent them a 400, but we won't know what was invalid about the request.
This diff updates our logging to actually log that.
Reviewed By: sfilipco
Differential Revision: D25826522
fbshipit-source-id: 89486014e0eeaac5c9b149224601db54a26080d9
Summary:
There is a little bug here. We produce a stream of futures of futures, then we
buffer it, which gives us a stream of futures, and then we await the futures
one by one, here:
```
while let Some(next) = stream.next().await {
next.await?
}
```
This is not really correct, because it means we don't actually do fetches
concurrently at all (we just instantiate futures concurrently, but that's not
really async work).
This fixes that by removing one layer of future-ing.
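An analogous sketch of the bug and the fix in Python asyncio (illustrative only; the actual code is Rust futures):

```python
import asyncio

async def fetch(i):
    await asyncio.sleep(0)   # stand-in for real async fetch work
    return i * 2

async def make_fetch(i):
    # An outer future that merely *creates* the inner future: buffering
    # these runs the cheap instantiation concurrently, not the fetches.
    return fetch(i)

async def buggy(ids):
    inner = await asyncio.gather(*(make_fetch(i) for i in ids))
    return [await fut for fut in inner]   # fetches awaited one by one

async def fixed(ids):
    # One layer of future-ing removed: the fetches themselves are buffered.
    return await asyncio.gather(*(fetch(i) for i in ids))
```

Both return the same results; only `fixed` actually overlaps the fetch work.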
Reviewed By: singhsrb
Differential Revision: D25825895
fbshipit-source-id: 3ad3367f1eb802ce5b9b5288f04fd3705e172537
Summary:
Cross repo sync has two tables to store mapping between commits (this is
something we should probably change, but this is what we have right now).
"map" subcommand was a bit useless because it searched only in a single table,
so for lots of commits it would just return empty response.
Let's instead return CommitSyncOutcome which gives more useful information
Reviewed By: krallin
Differential Revision: D25823629
fbshipit-source-id: afc14f48b6c30bec3714dc9b79cfc4a7d67e38a9
Summary:
This allows adding progress bars tracking downloads from the server.
We could be smarter in this instance if we were to deserialize on the fly.
The first part of the payload contains the number of idmap entries that we need
but it needs more work to make it clear. The progress object right now is
designed for general bytes.
Reviewed By: quark-zju
Differential Revision: D25840470
fbshipit-source-id: c466c8d606b44981fe63c95352db2d8f14d6071b
Summary:
The unbundlereplay command was not implemented in Mononoke, but it is used by the sync job, so let's add it here,
together with an additional integration test for sync between two Mononoke repos. In addition, I'm adding non-fast-forward bookmark movements by specifying a key to the sync job.
Reviewed By: StanislavGlebik
Differential Revision: D25803375
fbshipit-source-id: 6be9e8bfed8976d47045bc425c8c796fb0dff064
Summary:
The segmented changelog tailer is going to run with multiple instances which
may race to update the database. This change adds a test that checks that
concurrent updates keep the IdMap correct.
Reviewed By: ahornby
Differential Revision: D25684783
fbshipit-source-id: a09f6e6c915bde38158d9737dcfdc7adc3f15cb7
Summary:
The most common scenario where we see matcher errors is when we iterate through
a manifest and the user sends SIGTERM to the process. The matcher may be both
Rust and Python code. The Python code handles the interrupt and prevents future
function calls. The iterating Rust code will continue to call matcher functions
through this time so we get matcher errors from the terminated Python stack.
As long as we have Python matcher code, errors are valid.
It is unclear to me whether the matcher trait should have `Result` return
values when all implementations are Rust. It is easy to imagine implementations
that can fail in different circumstances, but the ones we would get by just
porting the Python code wouldn't fail.
All in all, I think that this is a reasonable step forward.
Reviewed By: quark-zju
Differential Revision: D25697099
fbshipit-source-id: f61c80bd0a8caa58040a447ed02d48a1ae84ad60
Summary: These globs were lost as part of D25315954 (ec0b533381).
Reviewed By: quark-zju
Differential Revision: D25814934
fbshipit-source-id: b1896893e37e355a73eb136758f8966666e0ec05
Summary:
On Windows, it's possible that not all files could be removed from the
repository due to some other process holding a reference to it. When that
happens the `edenfsctl rm` operation will fail. Sometimes, instead of failing
with the actual reason for why the removal failed, it throws a cryptic "list
index out of range" error.
The reason was that when the file that can't be removed is actually a
directory, the `errors` list would be empty. Since filtering the folders is a
bit silly, let's not do it.
Reviewed By: fanzeyi
Differential Revision: D25836412
fbshipit-source-id: 36f936ff9d7697dfd2f4c68d4e56bdb18b66b06a
Summary:
Fixed `README.md` so commands in it work now.
Fixed integration_runner.
Reviewed By: lukaspiatkowski
Differential Revision: D25823461
fbshipit-source-id: 0d6784758c9f86bca38beafe014af4766169bee3
Summary:
This unbreaks the test. The reversefiller needs access to SMC to talk to
scmquery (we could set up our own scmquery instance, but I don't think it's worth
it).
Reviewed By: krallin
Differential Revision: D25824395
fbshipit-source-id: 676b3ac1e3af95e8e02bd272f7cb25250e047eed
Summary:
Sometimes we want to rechunk just a few file contents, this diff makes it
possible to do so.
Reviewed By: ahornby
Differential Revision: D25804144
fbshipit-source-id: 6ce69f7cee8616a872531bdf5a48746dd401442d
Summary: There is only one implementation of the trait so remove it and use that impl directly. Removing the trait makes it simpler to work on bulkops in the rest of this stack.
Reviewed By: farnz
Differential Revision: D25804021
fbshipit-source-id: 22fe797cf87656932d383ae236f2f867e788a832
Summary:
Unless we can't update to a public root, there is nothing wrong with combining local changes and the switching workspaces feature.
These are not related: uncommitted changes shouldn't impact switching workspaces.
Reviewed By: mitrandir77
Differential Revision: D25802406
fbshipit-source-id: 3fcb70864002bed11ad32621947294f643ca1fc3
Summary:
Right now we get zero logs from the blobstore healer, which is pretty annoying
because it makes it impossible to really tell what it's doing.
This fixes that.
Reviewed By: HarveyHunt
Differential Revision: D25823800
fbshipit-source-id: ded420753ba809626d6e4291eb3d900dcfbff3d1
Summary:
This was a request from users. The repo could go into a disconnected state, for example, if rejoin in fbclone fails for some reason.
In this case it was confusing that the `hg cloud switch` command doesn't work: users have to run `hg cloud join` first.
If the repo is disconnected but doesn't contain any relevant local changes for commit cloud, it should be fine to switch workspace.
Reviewed By: mitrandir77
Differential Revision: D25802193
fbshipit-source-id: 3216a10c3438463773602b2dfd13740866fb5908
Summary:
In some cases we might have chunked file content in one blobstore component and
unchunked file content in another. Rechunking the second component was
awkward since we never know which version the filestore will fetch - the filestore
can fetch the chunked version and decide that rechunking is not necessary.
This diff makes it possible to rechunk only a single component of a multiplexed
blobstore. It does so by manually creating BlobRepo with the single-component
blobstore.
Reviewed By: krallin
Differential Revision: D25803821
fbshipit-source-id: f2a992b73d0c5fc9d389a4b81e0f3e312c17fdea
Summary:
The cert path isn't correctly set up on all platforms, so this can
cause Mercurial to throw an error complaining about missing certs, even when
edenapi isn't enabled.
Let's back this out for now until we can fix the cert paths or only hit this
path when we actually use edenapi.
Reviewed By: singhsrb
Differential Revision: D25792491
fbshipit-source-id: 022a89a089cabcc709a07934eb62b883082261c2
Summary: Convert `Changsets` trait and all its uses to new type futures
Reviewed By: krallin
Differential Revision: D25638875
fbshipit-source-id: 947423e2ee47a463861678b146641bcc6b899a4a
Summary:
Lots of things can look like CBOR data, such as ... strings representing
errors. Right now, if the data in our CBOR stream is actually an error message,
then we'll just ignore it (see details in T80406893).
This isn't how we normally handle invalid data on the stream (we'd raise an
error) — it only happens with trailing data. This fixes our decoding to raise
an error in this case.
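The decoding fix can be illustrated with JSON standing in for CBOR (a sketch, not the actual Rust decoder):

```python
import json

def decode_all(text):
    # Decode a concatenated stream of values; trailing bytes that are not
    # valid data raise an error instead of being silently ignored.
    decoder = json.JSONDecoder()
    items, pos, end = [], 0, len(text)
    while pos < end:
        while pos < end and text[pos].isspace():
            pos += 1
        if pos == end:
            break
        try:
            item, pos = decoder.raw_decode(text, pos)
        except ValueError:
            raise ValueError("invalid trailing data at offset %d" % pos)
        items.append(item)
    return items
```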
Reviewed By: quark-zju
Differential Revision: D25759082
fbshipit-source-id: c3d8be5007112ec1d2e7f25a102d8caaf0dbba56
Summary:
Enable switching from a draft commit for most cases:
make it possible if the public root of the current commit is an ancestor of the main bookmark.
We need this condition because remote bookmarks can be different for different workspaces, and they define phases.
I think this will cover most workflows.
Reviewed By: mitrandir77
Differential Revision: D25780999
fbshipit-source-id: b1c25b29a7668d51244ca43d6b0c30fa2fc068d9
Summary: Skip some very long configs to make rage output cleaner.
Reviewed By: DurhamG
Differential Revision: D25625452
fbshipit-source-id: 44bf8b9f93d9cb06d065a89f5d0ffa53ad6d6286
Summary:
The StringPiece constructor is untyped and was only used in tests. We can
afford to build the PathComponent in tests instead to avoid future headaches.
Reviewed By: genevievehelsel
Differential Revision: D25434556
fbshipit-source-id: 4b10bf2576870e81412d76c4b9755b45e26986b3
Summary:
Mercurial supports files with `\` in their name, which can't be represented on
Windows due to `\` being the path separator. Currently, EdenFS will throw
errors at the user when such files are encountered; let's simply warn and
continue.
Reviewed By: chadaustin
Differential Revision: D25430523
fbshipit-source-id: 4167b4cd81380226aead8e4f4850a7738087fd95
Summary:
On OSX, if Mercurial is built from fbcode, these environment variables
(which point specifically to Eden's own par file data) can break Mercurial's
ability to load dynamic libraries. Let's unset them.
Reviewed By: xavierd
Differential Revision: D25783552
fbshipit-source-id: 74e6232d225856fedc0382abc6cd223a6c47d8bc
Summary:
All of the strace logging was done in PrjfsChannel except for the notification
callbacks, let's remediate this.
Reviewed By: kmancini
Differential Revision: D25643491
fbshipit-source-id: 7eaed2503557b0e486d7d1b0637c68287ee9df90
Summary:
In a previous diff, chadaustin noted that there was a bunch of duplicated code
prior to calling into the PrjfsChannel; let's use templates to solve this.
One of the non-refactored pieces is BAIL_ON_RECURSIVE_CALL, and I'm not sure
of a good way to move it into runCallback while still being able to tell
which callback is recursive. Previously, the line number from XLOG was
sufficient; moving it into the runCallback function would lose that.
Reviewed By: chadaustin
Differential Revision: D25576860
fbshipit-source-id: 619ed0c9fecf05cda2263dfcdf2fbcbaec85e45a
Summary:
The RcuPtr abstraction allows us to use RCU instead of the significantly more
expensive Synchronized<shared_ptr>. This should reduce the cost of all the
callbacks while not sacrificing the guarantee that unmounting a repository
needs to wait for all the pending callbacks to complete.
A new rcu_domain is used as the pending callbacks may sleep and take a long
time to complete when the servers aren't reachable. To avoid penalizing all the
other RCU clients, it's best to be isolated in its own domain.
Reviewed By: kmancini
Differential Revision: D25351535
fbshipit-source-id: bd40d59056e3e710c28c42d651b79876be496bc3
Summary:
We should not filter based on the parsed level when passing an inner drain into
the `DynamicLevelDrain`: when the binary is run with
`--with-dynamic-observability=true`, this would default the level to `INFO` and
make the inner drain filter on that level, which would essentially make debug
logging impossible. Instead, we should pass an unfiltered inner drain into
`DynamicLevelDrain`, as `DynamicLevelDrain` actually uses
`ObservabilityContext`, which when the binary is called with `--debug` or
`--level=SOMETHING` would [instantiate](https://fburl.com/diffusion/sib8ayrn) a `Static` variant, behaving just like the
current static level filtering.
Note also that this bug does not affect production, as we never actually try to
control the logging levels dynamically: we always run either with `--debug` or
with `--level=SOMETHING`, which again uses `Static` variant of
`ObservabilityContext`, which in turn filters the same way as the inner drain.
Reviewed By: krallin
Differential Revision: D25783488
fbshipit-source-id: 8054863fb655dd66747b6d2306a38c13cbc64443
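To make the drain/context split concrete, here is a minimal Rust sketch of the intended behaviour. `Level`, `ObservabilityContext`, and `should_log` are simplified stand-ins for the real cmdlib/slog types, not the actual API:

```rust
// Minimal sketch (assumed types; not the real Mononoke API).
// The inner drain accepts everything; only the context filters.

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Level {
    Debug,
    Info,
    Warning,
    Error,
}

enum ObservabilityContext {
    /// Pinned at startup by --debug / --level=SOMETHING.
    Static(Level),
    /// Level is looked up dynamically (config-driven); defaults to Info.
    Dynamic,
}

impl ObservabilityContext {
    fn min_level(&self) -> Level {
        match self {
            ObservabilityContext::Static(l) => *l,
            ObservabilityContext::Dynamic => Level::Info, // would consult config here
        }
    }
}

/// The bug was filtering in the inner drain *and* here; the fix makes this
/// the only filter, with an unfiltered inner drain underneath.
fn should_log(ctx: &ObservabilityContext, record_level: Level) -> bool {
    record_level >= ctx.min_level()
}

fn main() {
    let ctx = ObservabilityContext::Static(Level::Debug);
    assert!(should_log(&ctx, Level::Debug)); // --debug works again
    let dynamic = ObservabilityContext::Dynamic;
    assert!(!should_log(&dynamic, Level::Debug));
    assert!(should_log(&dynamic, Level::Warning));
    println!("ok");
}
```

With a pre-filtered inner drain, the `Static(Debug)` case above would still drop debug records, which is exactly the bug described.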
Summary:
This diff adds an (as yet unused) option to log verbose scuba samples.
Here's the high-level overview.
In addition to doing `scuba_sample.log_with_msg` and `scuba_sample.log()`, you can now do `scuba_sample.log_with_msg_verbose()` and `scuba_sample.log_verbose()`. These two methods indicate that the intended sample is verbose and should go through some filtering prior to logging.
By default verbose samples are just not logged, but there are ways to override this via `ScubaObservabilityConfig`. Namely, the config has a "system" `ScubaVerbosityLevel`, which is either `Normal` or `Verbose`. When the level is `Verbose`, all samples are logged (those triggered by `.log_with_msg()`, `.log()`, `.log_with_msg_verbose()` and `.log_verbose()`). In addition to the "system" verbosity level, `ScubaObservabilityConfig` supports a few filtering overrides: a list of verbose sessions, a list of verbose unixnames and a list of verbose hostnames. Whenever a verbose sample's session, unixname or source hostname belongs to a corresponding list, the sample is logged.
`ScubaObservabilityConfig` is a struct, queried from `configerator` without the need to restart a service. Querying/figuring out whether logging is needed is done by the `ObservabilityContext` struct, which was introduced a few diffs earlier.
Note: I also want to add regex-based filtering for hostnames, as it's likely to be more useful than exact-match filtering, but I will do that later.
Reviewed By: StanislavGlebik
Differential Revision: D25232429
fbshipit-source-id: 057af95fc31f70d796063cefac5b8f7c69d7b3ef
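The gating rule above can be sketched in a few lines of Rust. Field and type names below follow the description but are assumptions; the real `ScubaObservabilityConfig` lives in configerator:

```rust
// Sketch of verbose-sample gating (assumed shapes, not the real structs).

#[derive(PartialEq)]
enum ScubaVerbosityLevel {
    Normal,
    Verbose,
}

struct ScubaObservabilityConfig {
    level: ScubaVerbosityLevel,
    verbose_sessions: Vec<String>,
    verbose_unixnames: Vec<String>,
    verbose_hostnames: Vec<String>,
}

struct SampleMeta<'a> {
    session: &'a str,
    unixname: &'a str,
    hostname: &'a str,
}

/// Non-verbose samples always log; verbose ones only when the system level
/// is Verbose or the sample matches one of the override lists.
fn should_log(cfg: &ScubaObservabilityConfig, meta: &SampleMeta, verbose: bool) -> bool {
    if !verbose {
        return true;
    }
    cfg.level == ScubaVerbosityLevel::Verbose
        || cfg.verbose_sessions.iter().any(|s| s == meta.session)
        || cfg.verbose_unixnames.iter().any(|u| u == meta.unixname)
        || cfg.verbose_hostnames.iter().any(|h| h == meta.hostname)
}

fn main() {
    let cfg = ScubaObservabilityConfig {
        level: ScubaVerbosityLevel::Normal,
        verbose_sessions: vec![],
        verbose_unixnames: vec!["alice".to_string()],
        verbose_hostnames: vec![],
    };
    let alice = SampleMeta { session: "s1", unixname: "alice", hostname: "h1" };
    assert!(should_log(&cfg, &alice, true)); // unixname override matches
    let bob = SampleMeta { session: "s2", unixname: "bob", hostname: "h2" };
    assert!(!should_log(&cfg, &bob, true)); // verbose, no override: dropped
    assert!(should_log(&cfg, &bob, false)); // normal samples always log
    println!("ok");
}
```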
Summary:
In the previous diff I had to make the same change in two places, this change
deduplicates the code so we can reuse the change. This isn't 100% equivalent,
since now we have 2 layers of boxing on the stream in `Fetch`.
That being said, that seems quite unlikely to matter considering that this is
ultimately handling responses that came to us over HTTP, so one pointer
traversal seems to be reasonable overhead (also, similar experience in Mononoke
suggests it really does not matter).
Reviewed By: quark-zju
Differential Revision: D25758652
fbshipit-source-id: 399ead1b67ffbb241597615a29129411580cf194
Summary:
This updates the edenapi fetch mechanism to check status codes from the server.
If the server responds with an error, we propagate the error up to the caller.
This is equivalent to what we would do if e.g. the server had just crashed.
Reviewed By: quark-zju
Differential Revision: D25758653
fbshipit-source-id: f44f6384be7944dce670c3825ccbb60b5fa2090a
Summary: This was a bit triggering while looking at logs :p
Reviewed By: StanislavGlebik
Differential Revision: D25781047
fbshipit-source-id: 22ebf1273b8b8d0b765c1bc7df2ba93752bf45e8
Summary:
See D25780870 for a bit of context. Our admin server was failing to start up
because of changesets warmup taking too long, but that's not easy to figure out
if all you have are the logs that don't tell you what we are doing (you'd have
to look at counters to work this out).
Let's just log this stuff.
Reviewed By: StanislavGlebik
Differential Revision: D25781048
fbshipit-source-id: 57a783dadc618956f577f32df3d2ec92ee729d56
Summary:
Like it says in the title. This is helpful with e.g. Mononoke server where the
"server" handle includes a long winded startup sequence. Right now, if we get
an error, then we don't get an error message immediately, even if we have one.
This leaves you with logs like this:
```
0105 04:20:48.563924 995374 [main] eden/mononoke/cmdlib/src/helpers.rs:229] Server has exited! Starting shutdown...
I0105 04:20:48.564076 995374 [main] eden/mononoke/cmdlib/src/helpers.rs:240] Waiting 0s before shutting down server
I0105 04:20:48.564238 995374 [main] eden/mononoke/cmdlib/src/helpers.rs:248] Shutting down...
E0105 04:20:48.564315 995374 [main] eden/mononoke/server/src/main.rs:119] could not send termination signal: ()
```
This isn't great because you might have to wait for a while to see the error,
and if something hangs in the shutdown sequence later, then you might not see
it at all.
The downside is we might log twice if we have a server that crashes like this,
but I guess that's probably better than not logging at all.
Reviewed By: StanislavGlebik
Differential Revision: D25781095
fbshipit-source-id: bf5bf016d7aa36e3ff6302175bef1aab826977bc
Summary:
After the refactoring in the previous diff let's stop using CommitSyncConfig in
PushRedirectorArgs and start using get_common_pushrebase_bookmarks() method.
Reviewed By: mitrandir77
Differential Revision: D25636577
fbshipit-source-id: 126b38860b011c5a9506a38d4568e5d51b2af648
Summary:
At the moment we are in a bit of a mess with cross repo sync configuration,
and this diff will try to clean it up a bit.
In particular, we have LiveCommitSyncConfig which is refreshed automatically,
and also we have CommitSyncConfig which is stored in RepoConfig. The latter is
deprecated and is not supposed to be used, however there are still a few places
that do that. This stack is an attempt to clean it up.
In particular deprecated CommitSyncConfig is used to fetch common pushrebase
bookmarks i.e. bookmarks where pushes from both repos are sent. This diff adds
get_common_pushrebase_bookmarks() method to CommitSyncer so that in the later
diffs we can avoid using CommitSyncConfig for that.
Reviewed By: mitrandir77
Differential Revision: D25636394
fbshipit-source-id: 09b049eb8a54834881d215bc6b9c4150377e387f
Summary: Starting from 3.11.1, OSXFUSE switched to using macOS's major version number for different system versions, so we need to take that into account when calculating the path to the kernel extensions on macOS.
Reviewed By: xavierd
Differential Revision: D25675984
fbshipit-source-id: ea8c76ce7204ba5da3ca98ceca2cfbeb9c84fa8f
Summary:
Make sure we give more explanation to users so they can self-fix any errors
related to certificates that might pop up.
Reviewed By: xavierd
Differential Revision: D25758517
fbshipit-source-id: 3b9929be3d1c0c44a5e13cc9c1e7b2a4f785abf4
Summary:
The introduction of `eden trace` broke the Buck build on Windows due to its use
of streaming thrift which unfortunately doesn't compile on Windows. Since `eden
trace` is not supported on Windows for now, let's only depend on the streaming
thrift on Linux and macOS.
With this, we can now compile edenfsctl on Windows with Buck. This will later
enable integration tests to be run on Windows.
Reviewed By: genevievehelsel
Differential Revision: D25758445
fbshipit-source-id: d4be2cafd9472840f65dcfab63a5fcfb8eceffb7
Summary:
Like it says in the title. Judging by an earlier similar change (D21092866 (15f98fe58c)),
this kind of flakiness in walker tests occurs when a node's children are
reachable via other paths.
Reviewed By: HarveyHunt
Differential Revision: D25756891
fbshipit-source-id: 05bc0697381e068d466ea6dfe85529dbd9ef1a50
Summary:
Like it says in the title. Note that I did *not* retry stuff like resolving
hosts or connecting, so this should only really cover temporary blips in
connectivity. We probably shouldn't go much beyond that at a low level like
this.
Reviewed By: HarveyHunt
Differential Revision: D25615915
fbshipit-source-id: 78c33eff2e9ce380a260708e9fbeb929eede383c
Summary:
This is the goal of this stack: retry errors that occur when Curl detects that
the transfer speed is too low. This should let us eventually set a much higher
timeout on overall request completion, thus ensuring that we don't abort uploads
that make progress, all the while aborting connections early and retrying them
if they are legitimately stuck.
Reviewed By: farnz
Differential Revision: D25615790
fbshipit-source-id: fe294aee090758b1a3aef138788ac2926c741b79
Summary:
Right now, the error handling in LFS doesn't handle e.g. transfer timeouts. I'd
like us to support that, notably so that we can have curl require a minimum
transfer speed and retry if we fail.
To do so, I need to be able to capture the errors and figure out if they're
retryable. Right now, everything is either a `FetchError` that includes a HTTP
status and URL, or just an `Error` that aborts.
This diff introduces a `TransferError` that explains why a transfer failed and
can be used for retry decisions. We later add the request details if we decide
to not retry and turn it into a `FetchError`.
Reviewed By: xavierd
Differential Revision: D25615789
fbshipit-source-id: e4a2f4f16a34ca2f86bd61491bb26e7f328dec63
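The retry decision described here can be sketched as a classification over the error type. The variants and the exact policy below are assumptions for illustration, not the real `TransferError` enum from the LFS client:

```rust
// Hedged sketch of a retry decision over transfer errors (assumed variants).

enum TransferError {
    /// Curl aborted because the transfer speed stayed below the minimum.
    MinTransferSpeed,
    /// The whole request exceeded its deadline.
    Timeout,
    /// The server replied with an HTTP status.
    HttpStatus(u16),
    /// Anything else (TLS failures, protocol errors, ...).
    Other(String),
}

fn is_retryable(err: &TransferError) -> bool {
    match err {
        // Slow or stalled transfers are exactly what we want to retry.
        TransferError::MinTransferSpeed | TransferError::Timeout => true,
        // 5xx may be transient; 4xx is a client error and won't improve.
        TransferError::HttpStatus(status) => *status >= 500,
        TransferError::Other(_) => false,
    }
}

fn main() {
    assert!(is_retryable(&TransferError::MinTransferSpeed));
    assert!(is_retryable(&TransferError::HttpStatus(503)));
    assert!(!is_retryable(&TransferError::HttpStatus(404)));
    assert!(!is_retryable(&TransferError::Other("tls".to_string())));
    println!("ok");
}
```

Once the retry budget is exhausted, the error would be wrapped with the request details into a `FetchError`, as the message describes.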
Summary:
Like it says in the title. This adds support for setting a min-transfer-speed
in Curl. My goal with this is to fix two problems we have:
- a) Uploads that timeout on slow connections. Right now we set a transfer
timeout on requests, but given files to upload can be arbitrarily large, that
can fail. This happened earlier this week to a user (T81365552).
- b) Transfer timeouts in LFS. Right now, we have a very high timeout on
requests and we can't lower it due to this problem with uploads. Besides,
the reason for lowering the timeout would be to retry things, but right now
we don't support this anyway.
Reviewed By: xavierd
Differential Revision: D25615788
fbshipit-source-id: 57d75ee8f522cf8524f9d12103e34b0765b6846a
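In libcurl this feature corresponds to `CURLOPT_LOW_SPEED_LIMIT` / `CURLOPT_LOW_SPEED_TIME`: abort when the average speed over a window stays below a floor. The helper below restates that rule standalone; it is an illustration of the policy, not the client's real code:

```rust
// Re-statement of the low-speed-abort rule (illustrative, not libcurl itself):
// abort when fewer than `min_bytes_per_sec` bytes/sec were averaged over the
// observation window.

fn should_abort(bytes_in_window: u64, window_secs: u64, min_bytes_per_sec: u64) -> bool {
    // Average speed over the window fell below the configured floor.
    window_secs > 0 && bytes_in_window / window_secs < min_bytes_per_sec
}

fn main() {
    // 10 KiB over 30s is ~341 B/s: below a 1 KiB/s floor, so abort and retry.
    assert!(should_abort(10 * 1024, 30, 1024));
    // 60 KiB over 30s is 2 KiB/s: healthy, keep going.
    assert!(!should_abort(60 * 1024, 30, 1024));
    println!("ok");
}
```

This is why it pairs well with a much higher overall completion timeout: a transfer that keeps moving is never killed, while a stalled one is cut off within the window.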
Summary:
I'd like to make it a little easier to add more options without having to
thread them all the way through to the HTTP transfer callsite.
Reviewed By: xavierd
Differential Revision: D25615787
fbshipit-source-id: 4c6274dc2e6b5ba878e0027aae9a08b04f974463
Summary: Extended git-import test to include both `full-repo` and `missing-for-commit` import modes.
Reviewed By: ahornby
Differential Revision: D25675361
fbshipit-source-id: b93e2db963c2060540308bf0477cd891d40e5810
Summary:
Managing tailer processes that run continuously is different from managing ones
that run once. We want separate code paths for running continuously and for running only once.
Reviewed By: quark-zju
Differential Revision: D25684782
fbshipit-source-id: 354b32c1dd73f867d6a7b1bd4518d9dd98e6b9a3
Summary:
The intention was to sort entries by Dag Id. They were instead sorted
lexicographically.
Reviewed By: quark-zju
Differential Revision: D25684784
fbshipit-source-id: 0a3db6398aec7d8df080bbb2366e41660483608c
Summary:
Comments for why we don't need a lock when updating the SqlIdMap with multiple
writers. Structure can definitely be improved but I'll live with this for a
short time.
No fundamental change in logic. I added extra checks to the insert function and
changed from an optimistic insert race logic to a pessimistic version. I
explain in the comments that it's to have an easier time reasoning about what
happens and that it theoretically doesn't matter.
Reviewed By: quark-zju
Differential Revision: D25606290
fbshipit-source-id: ea21915fc797fe759b3fe481e8ad9e8cb594fb6a
Summary: Will reduce the number of jobs needed for small repos
Reviewed By: StanislavGlebik
Differential Revision: D25492059
fbshipit-source-id: de11c06615857ad43f3337e58973849d2026a114
Summary: preparation for multi repo, get the repo name into ErrorKind::NotTraversable
Reviewed By: StanislavGlebik
Differential Revision: D25541444
fbshipit-source-id: 8fd99d5d3f144d8a3a72c7c33205ae58bd5f1ae2
Summary:
In preparation for having the walker able to scrub multiple repos at once, define parameter structs. This also simplifies the code in tail.rs.
The param objects are:
* RepoSubcommandParams - per repo params that can be set up in setup_common and are consumed in the subcommand. They don't get passed through to the walk
* RepoWalkParams - per repo params that can be set up in setup_common and will get passed all the way into the walk.rs methods
* JobWalkParams - per job params that can be set up in setup_common and will get passed all the way into the walk.rs methods
* TypeWalkParams - per repo params that need to be set up in the subcommands, and are passed all the way into walk.rs
Reviewed By: StanislavGlebik
Differential Revision: D25524256
fbshipit-source-id: bfc8e087e386b6ed45121908b48b6535f65debd3
Summary: Parsing of progress options and sampling options was the same in each subcommand; move it to functions in setup.rs
Reviewed By: StanislavGlebik
Differential Revision: D25524255
fbshipit-source-id: a2f48814f24aa9b3a158cb7d4abbfc2c0c338305
Summary: Simplify open_blobrepo_given_datasources parameters to pass fewer arguments, and make it so we can pass the sql_factory by reference.
Reviewed By: krallin
Differential Revision: D25524254
fbshipit-source-id: c324127f42c53a52f388d303e310014f4fa0d7bb
Summary: Allows the walker blobstore code to be used by more than one blobrepo. This is a step to reduce the number of jobs needed to scrub small repos.
Reviewed By: StanislavGlebik
Differential Revision: D25422937
fbshipit-source-id: e2d11239f172f50680bb6e10dd60026c9e6c3c3d
Summary:
By doing the hg to hg steps via bonsai, I can later introduce a check that the bonsai is in the current chunk of commits to be processed, as part of allowing walker checkpoint and restart.
On its own this is a minor change to the number of nodes the walk will cover as seen in the updated tests.
Reviewed By: krallin
Differential Revision: D25394085
fbshipit-source-id: 3e50cf76c7032635ce9e6a7375228979b2e9c930
Summary: This is in preparation for all walker hg to hg steps (e.g. HgChangeset to Parent HgChangeset) going via Bonsai, which without this would continually check if the filenodes are derived
Reviewed By: krallin
Differential Revision: D25394086
fbshipit-source-id: bb75e7ddf5b09f9d13a0f436627f4c3c95e24430
Summary:
`PartialOrd` was suggested by sfilipco. Note `Option<std::cmp::Ordering>` is
similar to `Side` in terms of expressiveness. `PartialOrd` can be written
using shorter symbols (`<=`, etc) so it's easier to understand.
The `compatible` family APIs were replaced by `partial_cmp` APIs.
There are some minor differences:
- Bitwise or used by union set is no longer supported. `Hints::union` was
added as a replacement.
- `Option<T>` implements full order. `Some(T) > None`. This is different
from `compatible_dag` and `compatible_id_map` APIs. Additional `> None`
checks were added for correctness.
Reviewed By: sfilipco
Differential Revision: D25652784
fbshipit-source-id: 51d88948fa556300678050088c06e9dda09cbf98
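The point that `Option<std::cmp::Ordering>` can express "compatible or not" is a textbook partial order. The sketch below is illustrative only (a subset relation, not the real dag `Hints` type): comparable means compatible, `None` means incompatible:

```rust
use std::cmp::Ordering;

// Illustrative only: a subset relation on bit flags, a classic partial order,
// similar in spirit to how `partial_cmp` replaces the `compatible` APIs.

#[derive(PartialEq, Clone, Debug)]
struct FlagSet(u8);

impl PartialOrd for FlagSet {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        match (self.0 & other.0 == self.0, self.0 & other.0 == other.0) {
            (true, true) => Some(Ordering::Equal),
            (true, false) => Some(Ordering::Less),    // self is a subset of other
            (false, true) => Some(Ordering::Greater), // other is a subset of self
            (false, false) => None,                   // incomparable: incompatible
        }
    }
}

fn main() {
    let a = FlagSet(0b001);
    let b = FlagSet(0b011);
    let c = FlagSet(0b100);
    assert!(a <= b); // comparable via the short symbols: compatible
    assert_eq!(a.partial_cmp(&c), None); // incomparable: incompatible
    assert!(!(a <= c) && !(a >= c));
    println!("ok");
}
```

This also shows the `Some(T) > None` caveat from the message: `Option` itself implements a full order, so extra `> None` checks are needed where that distinction matters.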
Summary:
```
warning: variable does not need to be mutable
--> eden/scm/lib/configparser/src/config.rs:448:21
|
448 | let mut values_copy = values.clone();
| ----^^^^^^^^^^^
| |
| help: remove this `mut`
|
= note: `#[warn(unused_mut)]` on by default
```
Reviewed By: sfilipco
Differential Revision: D25625453
fbshipit-source-id: 8475056a87095f9ba633282666e6d3fee864074b
Summary:
Some code paths use an (expensive) snapshot to be compatible with the `Arc::ptr_eq`
compatibility check. With `VerLink` it's more efficient to use `VerLink`
directly. This is potentially more efficient for `VerLink` too because the
`Arc` won't be cloned unnecessarily and `VerLink::bump()` is more likely to
use its optimized path.
Reviewed By: sfilipco
Differential Revision: D25608200
fbshipit-source-id: 1b3ecc5d7ec5d495bdda22d66025bb812f3d68a0
Summary:
Similar to the previous change. `VerLink` tracks compatibility more accurately.
- No false positives compared to the current `map_id` approach.
- Fewer false negatives compared to the previous `Arc::ptr_eq` approach.
The `map_id` is kept for debugging purposes.
Reviewed By: sfilipco
Differential Revision: D25607513
fbshipit-source-id: 7d7c7e3d49f707a584142aaaf0a98cfd3a9b5fe8
Summary:
Previously, snapshots need to be invalidated manually. That is error-prone.
For example, `import_clone_data` forgot to call `invalidate_snapshot`.
With `VerLink`, it's easy to check if a snapshot is up-to-date. So let's just
use that and remove the need to invalidate manually.
`invalidate_snapshot` is still useful to drop `version` in `snapshot` so
`VerLink::bump` might be more efficient. Forgetting about it no longer affects
correctness.
Reviewed By: sfilipco
Differential Revision: D25607514
fbshipit-source-id: 5efb489cda1d4875bcd274c5a197948f67101dc1
Summary:
`VerLink` tracks compatibility more accurately.
- No false positives compared to the current `dag_id` approach.
- Fewer false negatives compared to the previous `Arc::ptr_eq` approach.
The `dag_id` is kept for debugging purposes.
Note: By the current implementation, `dag.flush()` will make `dag`
incompatible from its previous state. This is somewhat expected, as
`flush` might pick up any changes on the filesystem, or reassign non-master ids. Those
can be actually incompatible. This might be improved in the future to detect
reload changes by using some extra information.
Reviewed By: sfilipco
Differential Revision: D25607511
fbshipit-source-id: 3cfc97610504813a3e5bb32ec19a90495551fd3a
Summary:
There are 2 kinds of changes:
- Append-only changes, which are backwards-compatible.
- Non-append-only changes, which are not backwards-compatible.
Previously,
- `Arc::ptr_eq` on the snapshot was too fragile. It treated append-only compatible
changes as incompatible.
- Even worse, because of wrapper types (ex. `Arc::new(Arc::new(dag))` is
different from `dag`), even the same underlying struct could be treated as
incompatible.
- `(map|dag)_id` was too rough. It treated incompatible non-append-only changes
as compatible.
Add `VerLink` to track those 2 different kinds of changes. It basically keeps a
(cheap) tree so backwards compatible changes will be detected precisely.
`VerLink` will replace IdMap and Dag compatibility checks.
Reviewed By: sfilipco
Differential Revision: D25607512
fbshipit-source-id: 478f81deee4d2494b56491ec4a851154ab7ae52d
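The "cheap tree" idea described above can be sketched as a chain of `Arc` nodes: an append-only change bumps to a child node, a non-append-only change starts a fresh chain, and two versions are compatible iff one is an ancestor of the other. Names and details below are assumptions, not the real `VerLink` implementation:

```rust
use std::sync::Arc;

// Sketch of the VerLink idea (assumed shape, not the real dag crate type).

#[derive(Clone)]
struct VerLink(Arc<Node>);

struct Node {
    parent: Option<Arc<Node>>,
}

impl VerLink {
    /// A non-append-only change starts an unrelated chain.
    fn new() -> Self {
        VerLink(Arc::new(Node { parent: None }))
    }

    /// Record an append-only (backwards-compatible) change.
    fn bump(&self) -> Self {
        VerLink(Arc::new(Node { parent: Some(self.0.clone()) }))
    }

    fn is_ancestor_of(&self, other: &VerLink) -> bool {
        let mut cur = Some(other.0.clone());
        while let Some(node) = cur {
            if Arc::ptr_eq(&node, &self.0) {
                return true;
            }
            cur = node.parent.clone();
        }
        false
    }

    /// Compatible iff one version descends from the other.
    fn compatible(&self, other: &VerLink) -> bool {
        self.is_ancestor_of(other) || other.is_ancestor_of(self)
    }
}

fn main() {
    let base = VerLink::new();
    let appended = base.bump(); // append-only: still compatible with base
    let rewritten = VerLink::new(); // non-append-only: fresh chain
    assert!(base.compatible(&appended));
    assert!(!appended.compatible(&rewritten));
    println!("ok");
}
```

Unlike `Arc::ptr_eq`, this survives wrapper types and append-only mutations; unlike a bare `(map|dag)_id`, a fresh chain makes non-append-only changes visibly incompatible.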
Summary: This makes it easier to investigate fast path issues.
Reviewed By: sfilipco
Differential Revision: D25598077
fbshipit-source-id: 27b7042fb9510321c25371f8c5d134e248b3d5d5
Summary:
This makes it easier to check if set operations are using fast paths or not by
setting `RUST_LOG=dag=debug`.
Reviewed By: sfilipco
Differential Revision: D25598075
fbshipit-source-id: 1503a195268c0989d5166596f2c8a66e15201372
Summary:
See the previous diff for context. The new API will be used to check if two
dags are compatible.
Note: It can cause false positives on compatibility checks, which need a
more complex solution. See D25607513 in this stack.
Reviewed By: sfilipco
Differential Revision: D25598079
fbshipit-source-id: f5fc9c03d73b42fadb931038fe2e078881be955f
Summary: The backend is designed to be used by the "debugsegmentclone" command, which does not write revlog.
Reviewed By: sfilipco
Differential Revision: D25624786
fbshipit-source-id: e145128c7b41d78fed495f8da540169f741b674d
Summary: This makes it possible to add new commits in a repo without revlog.
Reviewed By: sfilipco
Differential Revision: D25602527
fbshipit-source-id: 56c27a5f00307bcf35efa4517c7664a865c47a43
Summary:
HowToEven believes that both path and manifestNode might be used after being
moved and thus complains about it as that's often what is intended. However,
in C++17, this lint is spurious as both of these variables will be moved after
being copied properly in the first lambda. To silence the linter, let's just
split the combinator chain in 2.
Reviewed By: genevievehelsel
Differential Revision: D25627413
fbshipit-source-id: 1a93ca039310dfd04a3f11bd9c7de32e93057517
Summary: Because mysql connection pool options had both `conflicts_with(myrouter)` and default values, the binary always failed if the myrouter option was provided.
Differential Revision: D25639679
fbshipit-source-id: 21ebf483d4ee88a05db519a14b7e2561b3089ad1
Summary:
When running `python3 run-tests.py test-run-tests.py`, some bytes were printed
with `b` prefix. Convert them to `str`.
Reviewed By: DurhamG
Differential Revision: D25642164
fbshipit-source-id: f1103b24ad88d0d024f6be546bf632141f06ebd1
Summary:
A bit of history first. For some time we had a problem in our cross repo sync
library where it used the "current" commit sync version, where "current" meant
"the latest commit sync config version that was added". That was incorrect, and
we migrated away from this model, however there were still a few places that
used get_current_mover_DEPRECATED() mover.
Removing this method from a test file is easy, but it's trickier for
sync_diamond_merge tool. This tool is used to sync a diamond merge from a small
repo to a large repo (this is non-trivial to do, so we don't do it
automatically). To make things simpler this diff requires all involved commits
(i.e. both parents, where to rebase (onto) and root of the diamond merge) to
have the same commit sync config versions.
Reviewed By: markbt
Differential Revision: D25612492
fbshipit-source-id: 6483eed9698551920fb1cf240218db7b7e78f7bd
Summary:
The warning will go to debug level logs if the delay is not reached.
The messages about the locks have a profoundly bad effect on attitudes toward commit cloud, even if the delay is just 1 second (which is a reasonable delay).
Reviewed By: quark-zju
Differential Revision: D25587459
fbshipit-source-id: 9a09484d590ba04d17a881e0c9c5d543686b934f
Summary:
The correct workflow for using a multi-threaded connection pool for multiple DBs is to have a single shared pool for all the use-cases. The pool is smart enough to maintain separate "pools" for each DB locator and limit them to a maximum of 100 connections per key.
In this diff I create a `OnceCell` connection pool that is initialized once and reused for every attempt to connect to the DB.
The pool is stored in `MononokeAppData` in order to bind its lifetime to the lifetime of Mononoke app. Then it is passed down as a part of `MysqlOptions`. Unfortunately this makes `MysqlOptions` not copyable, so the diff also contains lots of "clones".
Reviewed By: ahornby
Differential Revision: D25055819
fbshipit-source-id: 21f7d4a89e657fc9f91bf22c56c6a7172fb76ee8
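The "initialize once, reuse everywhere" pattern described here can be sketched with the standard library's `OnceLock` (the real diff uses `OnceCell` and the MySQL client's pool; `SharedPool` below is a hypothetical stand-in):

```rust
use std::sync::{Arc, Mutex, OnceLock};

// Sketch of a process-wide, lazily initialized shared pool (assumed shape).

struct SharedPool {
    // Stand-in for per-DB-locator sub-pools (the real pool caps each key).
    per_key: Mutex<Vec<String>>,
}

static POOL: OnceLock<Arc<SharedPool>> = OnceLock::new();

/// Every caller gets a handle to the same underlying pool; the closure runs
/// at most once, on first use.
fn pool() -> Arc<SharedPool> {
    POOL.get_or_init(|| Arc::new(SharedPool { per_key: Mutex::new(Vec::new()) }))
        .clone()
}

fn main() {
    let a = pool();
    let b = pool();
    // Both handles point at the one shared pool.
    assert!(Arc::ptr_eq(&a, &b));
    a.per_key.lock().unwrap().push("xdb.mydb".to_string());
    assert_eq!(b.per_key.lock().unwrap().len(), 1);
    println!("ok");
}
```

Storing the handle in app-level data (here a `static`; in the diff, `MononokeAppData`) is what binds the pool's lifetime to the app and lets `MysqlOptions` carry a clone of it, which is also why `MysqlOptions` stops being `Copy`.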
Summary:
In the next diff I'm going to add a Mysql connection object to `MysqlOptions` in order to pass it down from `MononokeAppData` to the code that works with sql.
This change will make MysqlOptions un-copyable.
This diff fixes all issues produced by the change.
Reviewed By: ahornby
Differential Revision: D25590772
fbshipit-source-id: 440ae5cba3d49ee6ccd2ff39a93829bcd14bb3f1
Summary:
The benchmark_filestore XDB subcommand uses mysql and has the option of using either myrouter or mysql. In this diff I used the `args::parse_mysql_options` function to parse the arguments instead of manual processing, and get a `MysqlOptions` object.
This is needed later to pass a connection pool object through the `MysqlOptions` struct (see the next diff).
Reviewed By: ahornby
Differential Revision: D25587898
fbshipit-source-id: 66fcfd98ad8f3f9e285ca9635d8f625aa680d7ff
Summary:
Like it says in the title. This is nice to do because we had old futures
wrapping new futures here, so this lets us get rid of a lot of cruft.
Reviewed By: ahornby
Differential Revision: D25502648
fbshipit-source-id: a34973b32880d859b25dcb6dc455c42eec4c2f94
Summary:
This was kinda almost done. Might as well finish it by updating what's left,
i.e. the tests.
Reviewed By: ahornby
Differential Revision: D25498799
fbshipit-source-id: 65b7b144f5cf86d5f1754f5c7dafe373173b5ece
Summary: Let's not spawn too many futures at once
Reviewed By: markbt
Differential Revision: D25612069
fbshipit-source-id: e48901b981b437f66573a1abfba08eb144af2377
Summary: Forgot to add them when I wrote the test. Let me add them now
Differential Revision: D25611802
fbshipit-source-id: 0db7bee2034ad6e1566c5eb6de2e80e18140d757
Summary: Convert all BlobRepoHg methods to new type futures
Reviewed By: StanislavGlebik
Differential Revision: D25471540
fbshipit-source-id: c8e99509d39d0e081d082097cbd9dbfca431637e
Summary:
configs.allowedlocations restricts what configs can be loaded to a
certain set of files. This will enable us to deprecate all old config locations.
This diff adds Python support and a high level test.
Reviewed By: quark-zju
Differential Revision: D25539736
fbshipit-source-id: fa2544379b65672227e0d9cf08dad7016d6bbac8
Summary:
We want to start disallowing non-approved config files from being
loaded. To do that, let's update the config verifier to accept an optional list
of allowed locations. If it's provided, we delete any values that came from a
disallowed location.
This will enable us to prune our config sources down to rust configs,
configerator configs, .hg/hgrc, and ~/.hgrc.
Reviewed By: quark-zju
Differential Revision: D25539738
fbshipit-source-id: 0ece1c7038e4a563c92140832edfa726e879e498
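The pruning rule is simple enough to sketch: if an allowlist is provided, drop values whose source file falls outside it. Names and shapes below are illustrative, not the real configparser API:

```rust
use std::path::Path;

// Sketch of allowed-locations pruning (assumed shapes).

struct ConfigValue {
    name: String,
    source: String, // file the value was loaded from
}

fn prune_disallowed(values: Vec<ConfigValue>, allowed: Option<&[&str]>) -> Vec<ConfigValue> {
    match allowed {
        // No list supplied: behave as before, keep everything.
        None => values,
        Some(list) => values
            .into_iter()
            .filter(|v| {
                let src = Path::new(&v.source);
                // Keep only values sourced from an allowed location.
                list.iter().any(|a| src.ends_with(a))
            })
            .collect(),
    }
}

fn main() {
    let values = vec![
        ConfigValue { name: "ui.user".into(), source: "/repo/.hg/hgrc".into() },
        ConfigValue { name: "x.y".into(), source: "/etc/mercurial/old.rc".into() },
    ];
    let kept = prune_disallowed(values, Some(&[".hg/hgrc"]));
    assert_eq!(kept.len(), 1);
    assert_eq!(kept[0].name, "ui.user");
    println!("ok");
}
```

With the eventual allowlist of rust configs, configerator configs, `.hg/hgrc`, and `~/.hgrc`, a stray system rc file like the one above simply stops contributing values.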
Summary:
The goal of this stack is to start logging commits to scribe even if a commit was
introduced by scs create_commit/move_bookmark api. Currently we don't do that.
Initially I had bigger plans and I wanted to log to scribe only from bookmarks_movement and remove scribe logging from unbundle/processing.rs, but it turned out to be trickier to implement. In general, the approach we use right now, where in order to log to scribe we need to put a `log_commit_to_scribe` call in all the places that can possibly create commits/move bookmarks, seems wrong, but changing it is a bit hard. So for now I decided to solve the main problem we have, which is the fact that we don't get scribe logs from repos where bookmarks are moved via scs methods.
To fix that I added an additional option to the CreateBookmark/UpdateBookmark structs. If this option is set to true then before moving/creating the bookmark it finds all draft commits that are going to be made public by the bookmark move, i.e. all draft ancestors of the new bookmark destination. It is unfortunate that we have to do this traversal on the critical path of the move_bookmark call, but in practice I hope it won't be too bad since we do a similar traversal to record bonsai<->git mappings. In case my hopes are wrong we have scuba logging, which should make it clear that this is an expensive operation, and we also have a tunable to disable this behaviour.
Also note that we don't use PushParams::commit_scribe_category. This is intentional - PushParams::commit_scribe_category doesn't seem useful at all, and I plan to deprecate it later in the stack. Even later it would make sense to deprecate PushrebaseParams::commit_scribe_category as well, and put the commit_scribe_category option in another place in the config.
Reviewed By: markbt
Differential Revision: D25558248
fbshipit-source-id: f7dedea8d6f72ad40c006693d4f1a265977f843f
Summary:
Those messages like "pulling from ...", "added n commits ..." belong to stderr.
This makes it possible for us to turn on verbose output for auto pull, without
breaking tools that parse stdout.
Reviewed By: sfilipco
Differential Revision: D25315955
fbshipit-source-id: 933f631610840eb5f603ad817f7560c78b19e4ad
Summary:
It turns out `Arc::ptr_eq` is becoming unreliable, which will cause fast paths
to be not used, and extreme slowness in some cases (ex. `public & nodes`
iterating everything in `public`).
This diff adds an API for an IdMap to tell us its identity. That identity is
then used to replace the unreliable `Arc::ptr_eq`.
For an in-memory map, we just assign a unique number (per process) for its
identity on initialization. For an on-disk map, we use the type + path to
represent it.
Note: strictly speaking, this could cause false positives about
"maps are compatible", because two maps initially cloned from each other
can be mutated differently while their map_id does not change. That will
be addressed in upcoming diffs introducing a more complex but precise way to
track compatibility.
Reviewed By: sfilipco
Differential Revision: D25598076
fbshipit-source-id: 98c58f367770adaa14edcad20eeeed37420fbbaa
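The identity scheme described above can be sketched directly: a process-unique serial for in-memory maps, and type+path for on-disk maps. Names below are assumptions, not the dag crate's real API:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Sketch of map identity (assumed shape): replaces unreliable Arc::ptr_eq.

static NEXT_IN_MEMORY_ID: AtomicU64 = AtomicU64::new(0);

#[derive(PartialEq, Debug)]
enum MapId {
    /// In-memory maps get a number that is unique within this process.
    InMemory(u64),
    /// On-disk maps are identified by their type and path ("type:path").
    OnDisk(String),
}

fn new_in_memory_id() -> MapId {
    MapId::InMemory(NEXT_IN_MEMORY_ID.fetch_add(1, Ordering::Relaxed))
}

fn on_disk_id(kind: &str, path: &str) -> MapId {
    MapId::OnDisk(format!("{}:{}", kind, path))
}

fn main() {
    // Two fresh in-memory maps never compare equal...
    assert_ne!(new_in_memory_id(), new_in_memory_id());
    // ...while two handles to the same on-disk map do, even where
    // Arc::ptr_eq (through wrapper Arcs) would have said "different".
    assert_eq!(
        on_disk_id("indexedlog", "/repo/segments"),
        on_disk_id("indexedlog", "/repo/segments")
    );
    println!("ok");
}
```

This also makes the stated false-positive concrete: cloning one on-disk map and mutating the copy leaves both with the same `MapId`, which is what the later `VerLink` work addresses.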
Summary:
On Windows, platform.version appears to be misleading as it returns 1903 for
Windows 10 1909... The registry has the correct build, so let's use that instead.
Reviewed By: kmancini
Differential Revision: D25577939
fbshipit-source-id: f47032906d02669bd1cb1a48304733c1e3499f81
Summary:
With selectivepull we can tell visibility directly about which heads to add,
instead of adding all nodes in a changegroup.
Note: This does not affect the pull command, nor exclude public commits yet.
Reviewed By: markbt
Differential Revision: D25562090
fbshipit-source-id: aa5f346f33058dfdb3b2f23f175e35b5d3c30a1d
Summary: They will be used in core later.
Reviewed By: markbt
Differential Revision: D25562093
fbshipit-source-id: 4402a629a09920fd4c6f85cb8e777446bb218a37
Summary:
When tailing to fill or backfill derived data, omit checking the heads from the
previous round of derivation, as we know for sure they've been derived.
Reviewed By: krallin
Differential Revision: D25465445
fbshipit-source-id: 384c7e67e99c561ce6aae324070e7c274c56b736
Summary:
Rustfmt gives up on formatting if strings are too long. Split the long help
strings so that the formatter works again.
Tidy up some of the help text while we're at it.
Reviewed By: krallin
Differential Revision: D25465443
fbshipit-source-id: 360dbedc1e3e2ffbc489a9d9cba008835bce506f
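One way to split a long literal without changing the resulting string is `concat!`, which joins adjacent literals at compile time. Whether the diff used this exact mechanism is an assumption, and the help text below is invented for illustration:

```rust
// Splitting a long help literal keeps rustfmt formatting the file without
// changing the string handed to the argument parser (illustrative text).

const BACKFILL_HELP: &str = concat!(
    "Backfill derived data for all commits reachable from the heads, ",
    "slicing the repository by generation number so interrupted runs ",
    "can resume at the next slice.",
);

fn main() {
    assert!(BACKFILL_HELP.starts_with("Backfill"));
    // The pieces were joined at compile time: no stray newlines in the help.
    assert!(!BACKFILL_HELP.contains('\n'));
    println!("ok");
}
```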
Summary:
ChangesetInfo derivation doesn't depend on the parent ChangesetInfo being
available, so we can derive them in batches very easily.
Reviewed By: krallin
Differential Revision: D25470721
fbshipit-source-id: cc8ce305990eb6c9846158f0e9e3917cf35e169d
Summary:
Add documentation comments to the derived data crate to describe how they fit
together.
Reviewed By: krallin
Differential Revision: D25432449
fbshipit-source-id: b62440bcecae900ad75d74245ce175bd9e07a894
Summary:
Using `backfill-all` on very large repositories is slow to get started and slow
to resume, as it must traverse the repository history all the way to the start
before it can even begin.
Make this more usable by using the skiplist index to slice the repository into
reasonably sized slices of heads with the same range of generation numbers.
Each slice is then derived in turn. If interrupted, derivation can continue
at the next slice more quickly.
Reviewed By: krallin
Differential Revision: D25371968
fbshipit-source-id: f150ea847f9fbbe84852587d620ae37ba2c58f28
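The slicing described above can be sketched as bucketing heads by generation-number range, so each bucket is a resumable unit of derivation. The data model below (head name plus generation number) is a simplification of what the skiplist index provides:

```rust
// Sketch: group heads into slices by generation-number window (assumed model).

fn slice_heads(heads: &[(String, u64)], slice_size: u64) -> Vec<Vec<String>> {
    let max_gen = heads.iter().map(|(_, g)| *g).max().unwrap_or(0);
    let mut slices = Vec::new();
    // Walk generation ranges from oldest to newest so earlier slices are
    // derived (and can be skipped on resume) before later ones.
    let mut lo = 0;
    while lo <= max_gen {
        let hi = lo + slice_size;
        let slice: Vec<String> = heads
            .iter()
            .filter(|(_, g)| *g >= lo && *g < hi)
            .map(|(h, _)| h.clone())
            .collect();
        if !slice.is_empty() {
            slices.push(slice);
        }
        lo = hi;
    }
    slices
}

fn main() {
    let heads = vec![
        ("old".to_string(), 50),
        ("mid".to_string(), 1_500),
        ("tip".to_string(), 1_600),
    ];
    let slices = slice_heads(&heads, 1_000);
    assert_eq!(
        slices,
        vec![vec!["old".to_string()], vec!["mid".to_string(), "tip".to_string()]]
    );
    println!("ok");
}
```

If derivation is interrupted after the first slice, a restart only needs to re-derive from the second window onward, which is exactly the resumability win described.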
Summary:
Right now, when we upload a hg commit, we check that we have all the content
the client is referencing.
The only problem is we do this by checking that the filenodes the client
mentions exist, but the way we store filenodes right now is we write them
concurrently with content blobs, so it is in fact possible to have a filenode
that references a piece of content that doesn't actually exist.
That isn't quite what one might call satisfactory when it comes to checking the
content does in fact exist, so this diff updates our content checking.
In practice, with the way Mononoke works right now this should be quite free:
the client uploads everything all the time, and then we check later, so this
will just hit in the blobstore cache.
In a future world where clients don't upload stuff they already know we have,
that could be slower, but doing it the way we did it before is simply not
correct, so that's not much better. The ways to make it faster would be:
- Trust that we'll hit in cache when checking for presence (could work out).
- Have the client prove to us that we have this content, and thread it through.
To do the latter, IMO the code here could actually look at the entries that
were actually uploaded, and not check them for presence again, but right now we
have a few layers of indirection that make this a bit tricky (technically, when
`process_one_entry` gets called, that means "I uploaded this", but nothing in
the signature of any of the functions involved really enforces that).
Reviewed By: StanislavGlebik
Differential Revision: D25422596
fbshipit-source-id: 3cf34d38bd6ed1cd83d93c778f04395c942b26c0
Summary:
It always takes a bit of time to find the logs. Since we do have scm daemon ones in the output of `hg cloud status`,
it would be nice to have the ones from background backup as well.
Reviewed By: markbt
Differential Revision: D25560145
fbshipit-source-id: cdf5d76c7c3ebb1492559d32935f9301452a1cd5
Summary: The order has been incorrect and led to a confusing message
Reviewed By: krallin
Differential Revision: D25559963
fbshipit-source-id: 4fcb3e53cedcb08675b60b25cbb5da2ca52c08ed
Summary:
Add a resolve_repos function to cmdlib::args for use from jobs that will run for multiple repos at once.
Planning to use this from the walker to scrub multiple small repos from a single process.
Differential Revision: D25422755
fbshipit-source-id: 40e5d499cf1068878373706fdaa72effd27e9625
Summary:
In a future diff, paths will be validated to make sure they are valid utf8. The
path sanity checker needs to be constexpr to construct global paths, but the
utf8 functions aren't, so let's write one that is.
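The actual change is a C++ constexpr helper; the same idea expressed as a Rust `const fn` sketch (simplified: this does not reject overlong encodings or UTF-16 surrogate code points):

```rust
// Simplified compile-time-evaluable UTF-8 validity check.
const fn is_valid_utf8(bytes: &[u8]) -> bool {
    let mut i = 0;
    while i < bytes.len() {
        let b = bytes[i];
        let len: usize = if b < 0x80 {
            1
        } else if (b & 0xE0) == 0xC0 {
            2
        } else if (b & 0xF0) == 0xE0 {
            3
        } else if (b & 0xF8) == 0xF0 {
            4
        } else {
            return false; // invalid leading byte
        };
        if i + len > bytes.len() {
            return false; // truncated sequence
        }
        let mut j = 1;
        while j < len {
            // Continuation bytes must be 0b10xxxxxx.
            if (bytes[i + j] & 0xC0) != 0x80 {
                return false;
            }
            j += 1;
        }
        i += len;
    }
    true
}

// Usable in const context, e.g. to validate global paths at compile time.
const _: () = assert!(is_valid_utf8(b"hello/world"));
```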
Reviewed By: chadaustin
Differential Revision: D25562681
fbshipit-source-id: e48ec835c2cc9dc01090918cc7ee8f61b6c05a20
Summary:
When compiling with Buck, it tries to create hardlinks, which rightfully
fail, but that error then triggers a spurious notification. For now, let's
only notify the user on timeout, as these are very likely to be network issues.
Reviewed By: kmancini
Differential Revision: D25500364
fbshipit-source-id: 95b609ae901fa6207c8edba26cd8e6a21ddfe3ac
Summary:
This diff does a small refactoring to hopefully make the code a bit clearer.
Previously we were calling log_commits_to_scribe in force_pushrebase and in
run_pushrebase, so log_commits_to_scribe was called twice. It didn't mean that
the commits were logged twice but it was hard to understand just by looking at
the code.
Let's make it a bit clearer.
Reviewed By: krallin
Differential Revision: D25555712
fbshipit-source-id: bed9754b1645008846a86da665b6f3f3483f30da
Summary:
I'm going to do a refactoring later in the stack, so let's add a test to avoid
regressions.
Reviewed By: krallin
Differential Revision: D25535655
fbshipit-source-id: 5ec6633c9c8c25d1affcede0adbc27dd43c48736
Summary:
IPython is incompatible with Python 3 demandimport. Disable demandimport to
make it work.
Reviewed By: singhsrb
Differential Revision: D25542394
fbshipit-source-id: 293880dff62e98895bc1ae2d3328d4af25b8218f
Summary:
Debug output belongs to stderr.
This makes it possible to turn on debug output without breaking programs
parsing stdout.
Reviewed By: singhsrb
Differential Revision: D25315954
fbshipit-source-id: c7813a824fbf6640cb5b80b5ed2d947e7059d53e
Summary:
With `collapse-obsolete`, `.` can be obsoleted in the middle of a stack and
not shown. That can be confusing. Make the smartlog revset always show the
`heads` passed in. If `.` is in `heads` (the default), then show it.
Reviewed By: DurhamG
Differential Revision: D24696595
fbshipit-source-id: 7deab109d0e0ae5e703928252bc63312d936955f
Summary:
Add open_existing_sqlite_path so we don't do create_dir_all when we know the db already exists.
Noticed it in passing while investigating something else.
Reviewed By: markbt
Differential Revision: D25469502
fbshipit-source-id: 9810489c84220927937c037d69f5e8e70f2d9038
Summary: * suggest connecting to the VPN if hg update doesn't work
Reviewed By: sfilipco
Differential Revision: D25551017
fbshipit-source-id: 575f29cce4ab2719f2faae86616fdd9aac739f5f
Summary:
If Rust LFS is in use, we currently don't upload LFS blobs to commit cloud.
This is problematic because if you're going to Mononoke that means you can't
upload, and if you're going to Mercurial that means you're silently not backing
up data.
Reviewed By: StanislavGlebik
Differential Revision: D25537672
fbshipit-source-id: fd61f5a69450c97a0bc0895193f67fd22c9773fb
Summary: These show up when compiling with Buck, let's silence them.
Reviewed By: chadaustin
Differential Revision: D25513672
fbshipit-source-id: 277afae30059114f3646cdf4feedac442a4ee1b6
Summary:
On Windows, the GUID of the mount point identifies the virtualization instance,
that GUID is then propagated automatically to the created placeholders when
these are created as a response to a getPlaceholderInfo callback.
When the placeholders are created by EdenFS when invalidating directories, we
have to pass a GUID. The documentation isn't clear about whether that GUID needs
to be identical to the mount point GUID, but for a very long time these have
been mismatching due to the mount point GUID being generated at startup time
and not re-used.
One of the most common issues that users have reported is that sometimes
operations on the repository start failing with the error "The provider that
supports file system virtualization is temporarily unavailable". Looking at the
output of `fsutil reparsepoint query` for all the directories from the file
that triggers the error to the root of the repository shows that one of the
folders and its descendants don't share the same GUID; removing it solves the
issue.
It's not clear to me why this issue doesn't always reproduce when restarting
EdenFS, but a simple step that we can take to solve this is to always re-use
the GUID, and that hopefully will lead to the GUID always being the same and
the error to go away.
Reviewed By: fanzeyi
Differential Revision: D25513122
fbshipit-source-id: 0058dedbd7fd8ccae1c9527612ac220bc6775c69
Summary:
The config verifier would remove items from the values list if they
were disallowed. To do this, it iterated through the values list backwards,
removing bad items. In some cases it stored the index of a bad value for later
use, but because it was iterating backwards and removing things, the indexes it
stored might not be correct by the time the loop is done. To fix this, let's go
back to iterating forwards.
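An illustrative sketch of the forward approach (not the actual verifier code): collecting keepers while going forwards means any index recorded along the way refers to positions that never shift underneath it.

```rust
// Iterate forwards, keeping allowed values and recording the indexes
// of bad values. Because nothing before `i` is removed mid-loop, the
// recorded indexes stay valid into the original list.
fn filter_values(values: &[&str], allowed: fn(&str) -> bool) -> (Vec<String>, Vec<usize>) {
    let mut kept = Vec::new();
    let mut bad_indexes = Vec::new();
    for (i, v) in values.iter().enumerate() {
        if allowed(v) {
            kept.push(v.to_string());
        } else {
            bad_indexes.push(i);
        }
    }
    (kept, bad_indexes)
}
```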
Reviewed By: quark-zju
Differential Revision: D25539737
fbshipit-source-id: 87663f3c162c690f3961b8075814f3467916cb4b
Summary:
The old case-conflict checks were more lenient, and only triggered if a commit
introduced a case conflict compared to its first parent.
This means that commits could still be landed to bookmarks that already had
pre-existing case conflicts.
Relax the new case-conflict checks to allow this same scenario.
Note that we're still a bit more strict: the previous checks ignored other
parents, and would not reject a commit if the act of merging introduced a case
conflict. The new case-conflict checks only permit case conflicts in the case
where all conflicting files were present in one of the parents.
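The relaxed rule could be sketched as (hypothetical helper and names, not the actual check):

```rust
use std::collections::HashSet;

// A case conflict is tolerated only if some parent already contained
// every file involved in the conflict, i.e. the conflict is
// pre-existing rather than introduced by this commit.
fn conflict_permitted(conflicting: &[&str], parents: &[HashSet<&str>]) -> bool {
    parents
        .iter()
        .any(|parent| conflicting.iter().all(|f| parent.contains(f)))
}
```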
Reviewed By: StanislavGlebik
Differential Revision: D25508845
fbshipit-source-id: 95f4db1300ee73b8e6495ba8b5c1c2ce5a957d1a
Summary: Spotted this in passing. Was able to remove a call to fetch_root_manifest_id.
Reviewed By: StanislavGlebik
Differential Revision: D25472678
fbshipit-source-id: d450cb97630464be13d22fb37c3356611dc2e1b6
Summary: This makes it easier to run full walks on small repos.
Reviewed By: StanislavGlebik
Differential Revision: D25469485
fbshipit-source-id: 6e5b1426837a396d939e47a5b353e615437ae7cb
Summary:
If .buckd/pid is empty or invalid, assume that means Buck is not
running.
Reviewed By: genevievehelsel
Differential Revision: D25544725
fbshipit-source-id: 101ef67e17ff3e06f428cd7dbf51b2587fee4627
Summary:
Restore the behavior disabled by D25350916 (49c6f86325). This time it no longer runs Python
logic in background threads.
Reviewed By: sfilipco
Differential Revision: D25513054
fbshipit-source-id: 0220ccb37e658518d105bba04f45424c9fcfe142
Summary:
Make `get_commit_raw_text` aware of hg's hardcoded commit hashes: NULL_ID and
WDIR_ID. Previously, only `stream_commit_raw_text` was aware of it.
This makes it a bit more compatible when used in more places.
Reviewed By: sfilipco
Differential Revision: D25515006
fbshipit-source-id: 08708734a28f43acf662494df69694988a5b9ca0
Summary:
Unlike `stream_commit_raw_text`, the new API does not put Python logic on a
background thread. This will make it easier to reason about Python logic as
they do not need to be thread-safe, and we don't need to think about Python GIL
deadlocks in the Rust async world.
Reviewed By: sfilipco
Differential Revision: D25513057
fbshipit-source-id: 4b30d7bab27070badd205ac1a9d54bae7f1f8cec
Summary:
Previously, only the batch fetching or stream fetching APIs would
actually fetch commits remotely. The 1-commit fetching API does not have
the network side effect, with the hope that we can migrate all use cases
to stream or batch fetching.
Practically it's quite difficult to migrate all use cases, and the Python
layer has to have a 1-by-1 fetching fallback. Now let's just move that
fallback to Rust to simplify the code. The fallback in the Rust code
is the default impl of get_commit_raw_text.
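The shape of that fallback might look like this (illustrative trait and method signatures, not the actual API):

```rust
// The 1-by-1 fetch is a default trait method that goes through the
// batch API, so implementors only provide batch fetching and still
// get single fetches for free.
trait ReadCommitText {
    // Batch API: implementors fetch (possibly remotely) many at once.
    fn get_commits_raw_text(&self, ids: &[u64]) -> Vec<Option<String>>;

    // Default impl: a single fetch is just a batch of one.
    fn get_commit_raw_text(&self, id: u64) -> Option<String> {
        self.get_commits_raw_text(&[id]).into_iter().next().flatten()
    }
}
```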
Reviewed By: sfilipco
Differential Revision: D25513056
fbshipit-source-id: b3c615397d33b8d35876dc23ca7b95173783ef80
Summary: The API will be used in Python bindings to avoid running Python in background threads.
Reviewed By: sfilipco
Differential Revision: D25513055
fbshipit-source-id: a108b55115271a256c0d43e0ff7b82c0b209be81
Summary:
Previously, only `iterctx` did prefetch. Make `__iter__` do prefetch via `iterctx`.
The old `__iter__` without prefetching was renamed to `iterrev`.
Reviewed By: sfilipco
Differential Revision: D24365404
fbshipit-source-id: db5c687066794257719bb64c673dc384b5460ff1
Summary:
Now smartset has a reference to repo. It does not need `repo` from an external
source.
Reviewed By: sfilipco
Differential Revision: D24365405
fbshipit-source-id: 8a43697b7b84a8a41691ed8f095c271107a90f16