Commit Graph

4119 Commits

Author SHA1 Message Date
Xavier Deguillard
751fc53638 types: add an ancestors method to RepoPath
Summary: This returns the ancestors in the reverse order of the parents method.
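
For illustration only, the two traversal orders can be sketched in Python. This assumes `parents` walks from the repository root (the empty path) down to the immediate parent and that `ancestors` yields the same paths nearest-first; the function names mirror the commit title, but this is not the Rust implementation.

```python
from typing import Iterator, List


def parents(path: str) -> Iterator[str]:
    """Yield parent directories from the root ("") down to the immediate parent."""
    if not path:
        return
    components = path.split("/")
    for end in range(len(components)):
        yield "/".join(components[:end])


def ancestors(path: str) -> Iterator[str]:
    """Yield the same directories as parents(), nearest parent first."""
    collected: List[str] = list(parents(path))
    return reversed(collected)
```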

Reviewed By: sfilipco

Differential Revision: D20265277

fbshipit-source-id: 83277cee3d8e9070fc56d20d4c1877e6782c22f7
2020-03-05 09:31:32 -08:00
Katie Mancini
4f0c4a1b04 Track number of imports queued
Summary:
adds a counter to track the imports queued to enable more statistics exposure.

- Add counters to track the number of blob, tree, and prefetch imports that are pending

- have the counters increment (increment in constructor of wrapper struct) when the import is about to be queued

- have counters decrement once the load has completed (decrement in destructor of wrapper struct)
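
The increment-on-construct, decrement-on-destruct pattern described above can be sketched in Python with a context manager standing in for the wrapper struct's constructor and destructor. The class and method names here are hypothetical and are not taken from the EdenFS code.

```python
import threading
from contextlib import contextmanager
from typing import Dict, Iterator


class PendingImportCounters:
    """Hypothetical counters for imports that are queued but not yet finished."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: Dict[str, int] = {"blob": 0, "tree": 0, "prefetch": 0}

    def pending(self, kind: str) -> int:
        with self._lock:
            return self._pending[kind]

    @contextmanager
    def track(self, kind: str) -> Iterator[None]:
        # Entering plays the role of the wrapper's constructor: count the import.
        with self._lock:
            self._pending[kind] += 1
        try:
            yield
        finally:
            # Leaving plays the role of the destructor, even if the import failed.
            with self._lock:
                self._pending[kind] -= 1
```

Tying the decrement to scope exit means a failed import cannot leave the counter stuck above zero.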

Reviewed By: chadaustin

Differential Revision: D20256410

fbshipit-source-id: 5536b46307b30fc19dc5747414727a86961c78e1
2020-03-05 09:03:06 -08:00
Pavel Aslanov
95bf3a32a4 Report bytes sent via perf counters for stream_out_shallow command
Summary: Report bytes sent via perf counters for `stream_out_shallow` command

Reviewed By: krallin

Differential Revision: D20283114

fbshipit-source-id: 1f354904c68322b941ff0c035bb0b811e41e74a1
2020-03-05 08:58:21 -08:00
Liubov Dmitrieva
bb2f81e26b mononoke_api: improve algo for stack calculation
Summary: Improvements aim to minimize number of db queries

Differential Revision: D20280711

fbshipit-source-id: 6cc06f1ac4ed8db9978e0eee956550fcd16bbe8a
2020-03-05 08:31:37 -08:00
Aida Getoeva
db19504972 mononoke: derive changeset info
Summary:
Implementation of derivation logic for the changeset info.

BonsaiDerived is implemented for the ChangesetInfo. `derive_from_parents` just derives an info and BonsaiDerivedMapping then puts it into the blobstore.

```
ChangesetInfo::derive(..) -> ChangesetInfo
```

Reviewed By: krallin

Differential Revision: D20185954

fbshipit-source-id: afe609d1b2711aed7f2740714df6b9417c6fe716
2020-03-05 08:24:38 -08:00
Aida Getoeva
09b03ce1bf mononoke: derived changeset info - data structures
Summary:
Introducing data structures for derived Bonsai changeset info, which is supposed to store all commit metadata except the file changes.

Bonsai changeset consists of the commit metadata and a set of all the file changes associated with the commit.
Some changesets, usually merge commits, include thousands of file changes. That is not a problem by itself; however, when we need some information about a commit apart from its hash, we have to fetch the whole changeset, which can take up to 15-20 seconds.

Changeset info as a separate data structure is needed to speed up changeset fetching when we need the commit metadata but not the file changes.

Reviewed By: markbt

Differential Revision: D20139434

fbshipit-source-id: 4faab267304d987b44d56994af9e36b6efabe02a
2020-03-05 08:24:38 -08:00
Jun Wu
7c9e74aa09 pytracing: make ascii() return bytes on Python 2
Summary: Do not leak unicode to Python 2.

Reviewed By: simpkins

Differential Revision: D20269851

fbshipit-source-id: ebd1b0678b1335a951c9655210601dd80842336e
2020-03-05 07:35:26 -08:00
Liubov Dmitrieva
047862c02c mononoke: add 'repo_stack_info' API
Summary:
The new API is required for migrating Commit Cloud off hg servers and the infinitepush database.

This also can fix phases issues with `hg cloud sl`.

Reviewed By: markbt

Differential Revision: D20221913

fbshipit-source-id: 67ddceb273b8c6156c67ce5bc7e71d679e8999b6
2020-03-05 05:48:32 -08:00
Alex Hornby
cbb3996141 mononoke: walker: fix waiting on tail
Summary:
Fix the tail interval delay; it wasn't triggering.

Took the opportunity to structure the code as a loop as well which simplified it a bit.

Reviewed By: markbt

Differential Revision: D20247077

fbshipit-source-id: 1786ef1528a4b0493f5e454d28450d7198af8ad4
2020-03-05 05:41:02 -08:00
Lukas Piatkowski
ddeeeb65e0 Re-sync with internal repository
2020-03-05 11:56:21 +01:00
Adam Simpkins
a0358352da remove an integration test for handling SIGKILL after SIGSTOP
Summary:
Remove a failing integration test that was testing behavior we don't really
care about.

My changes in D20210708 made this test start failing.  This integration test
was initially added to exercise the code I reverted in D20210708.

This test fails when EdenFS is invoked in the foreground and under sudo.  If
you send SIGSTOP to the EdenFS process sudo happens to notice this and send
the same signal to itself too.  This results in a state where the `sudo`
command is stopped and is never resumed so it never wakes up to reap its child
EdenFS process when EdenFS exits.  The behavior I reverted in D20210708 caused
the edenfsctl CLI code to simply ignore the fact that EdenFS was stuck in a
zombie state, and proceed anyway.  This allowed EdenFS to at least restart,
but it left old zombies stuck forever on the system.

This problem is arguably an issue with how sudo operates, and it's sort of
hard for us to work around.  To solve the problem you need to send SIGCONT to
the sudo process, but since it is running with root privileges you don't
normally have permission to send a signal to it.  It is understandable why
sudo behaves this way, since normally it is desirable for sudo to background
itself when the child is stopped.

In practice this isn't really ever a situation that we care much about
handling.  Normal users shouldn't ever get into this situation (they don't run
EdenFS in the foreground, and they generally don't run it under sudo either).

Reviewed By: genevievehelsel

Differential Revision: D20268924

fbshipit-source-id: d61d0a10ee1e132f00dbd2e4dc135808b7c79345
2020-03-04 22:15:49 -08:00
Durham Goode
e2ff8d5da2 infinitepush: remove transaction that spans pull
Summary:
D18538145 introduced a transaction that spans the entire infinitepush
pull. This has a couple of unfortunate consequences:

1. hg pull --rebase now aborts the entire pull if the rebase hits a conflict,
since it's unable to commit the transaction.
2. If tree prefetching fails, it aborts the entire pull as well.

Tests seem to work fine if we scope down this lock.

Reviewed By: xavierd

Differential Revision: D20260480

fbshipit-source-id: d84228ababdb5572401645f74e78df035bf1461b
2020-03-04 19:49:26 -08:00
Jun Wu
bb1562604a dag: make some test APIs public in crate
Summary: Those will be reused by nameset::DagSet.

Reviewed By: sfilipco

Differential Revision: D20242563

fbshipit-source-id: 944e9a04aeb15439256ecea64355b67e326e5c89
2020-03-04 17:33:25 -08:00
Jun Wu
b8e1477401 nameset: impl Debug for other sets
Summary:
This is useful for `assert_eq!(format!("{:?}", set), "...")` tests.

It will be eventually exposed to Python as `__repr__`, similar to Python's
smartsets.

Reviewed By: sfilipco

Differential Revision: D20242562

fbshipit-source-id: 5373bb180db7cafebf273ace7cf2cb80fbfb8038
2020-03-04 17:33:25 -08:00
Jun Wu
fa069204e3 nameset: impl Debug for StaticSet
Summary:
In the Python world all smartsets have some kind of "debug" information. Let's
do something similar in Rust.

Related code is updated so the test is more readable.

Reviewed By: sfilipco

Differential Revision: D20242564

fbshipit-source-id: 7439c93d82d5d037c7167818f4e1125c5a1e513e
2020-03-04 17:33:24 -08:00
Jun Wu
0ae5a59e9e indexedlog: fix metadata-only updates for Indexes
Summary:
Previously, `flush()` would skip writing the file if there were only metadata
changes. Fix it by detecting metadata changes.

This can potentially fix an issue that certain blackbox indexes are empty,
lagging and require scanning the whole log again and again. In that case,
the index itself is not changed (the root radix entry is not changed), but
only the metadata tracking how many bytes in Log the index covered
changed.

Reviewed By: sfilipco

Differential Revision: D20264627

fbshipit-source-id: 7ee48454a92b5786b847d8b1d738cc38183f7a32
2020-03-04 15:59:12 -08:00
Jun Wu
33d65ac5eb test-doctor: fix test on filesystems without symlink
Summary:
On filesystems without symlinks, the test fails because ln prints errors.

Fix the test by using `#if symlink`.

Reviewed By: DurhamG

Differential Revision: D20260904

fbshipit-source-id: 1d0ffcc7e95d2718087fb01297369ca276b59013
2020-03-04 15:52:53 -08:00
Adam Simpkins
df29669c4b add a process_finder method to get process start time
Summary: Add a `get_process_start_time()` method to the ProcessFinder class.

Reviewed By: genevievehelsel

Differential Revision: D20178481

fbshipit-source-id: ab84e198d113f06f33432159cf9ebcb1a0975279
2020-03-04 14:09:32 -08:00
Jeff Zhang
7061e5d03b Deprecate rust-crypto in eden/mononoke/mercurial
Summary: The `rust-crypto` crate has not been maintained; replacing it with the `sha-1` crate since it's the only algorithm used in this library.

Reviewed By: dtolnay

Differential Revision: D20236029

fbshipit-source-id: 9c4ff25f393b099ec9570a7badbe4b378fbd98af
2020-03-04 13:18:36 -08:00
Stanislau Hlebik
dded155135 mononoke: do not derive while initializing warm bookmark cache
Summary:
Previously warm bookmark cache tried to derive all bookmarks on startup. It slows down the startup time and in some cases it might prevent scs server from starting up at all.

Let's change how warm bookmark cache initializes the bookmarks - instead of trying to derive all of them let's move underived bookmarks back in history.
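
A minimal sketch of "moving a bookmark back in history", assuming a linear history for simplicity. `parent_of` and `is_derived` are hypothetical stand-ins for the real changeset and derived-data lookups, not Mononoke APIs.

```python
from typing import Callable, Dict, Optional


def latest_derived_ancestor(
    bookmark_target: str,
    parent_of: Dict[str, Optional[str]],
    is_derived: Callable[[str], bool],
) -> Optional[str]:
    """Walk back from the bookmark until a changeset with derived data is found."""
    current: Optional[str] = bookmark_target
    while current is not None and not is_derived(current):
        # Nothing is derived here; step to the parent instead of deriving now.
        current = parent_of.get(current)
    return current
```

Startup then only needs cheap lookups, and the cache can catch up to the real bookmark position later.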

Reviewed By: krallin

Differential Revision: D20195211

fbshipit-source-id: 5cb5d8599d3035973175d3063186a7c01536889a
2020-03-04 13:14:32 -08:00
Stanislau Hlebik
2fddb7e1e4 mononoke: replace DelayBlob with DelayedBlobstore
Summary:
We didn't use DelayBlob at all, however we use DelayedBlobstore in benchmark
lib. DelayedBlobstore seems to have more useful options, so let's remove
DelayBlob and use DelayedBlobstore instead.

Reviewed By: farnz

Differential Revision: D20245865

fbshipit-source-id: bd694a0e178367014adc2776185450693f87475d
2020-03-04 12:48:33 -08:00
David Tolnay
c008ba8513 rust: Move tokio-old rdeps to renamed tokio-old
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/

In targets that depend on both 0.1 and 0.2 tokio, this codemod renames the 0.1 dependency to be exposed as tokio_old::. This is in preparation for flipping the 0.2 dependencies from tokio_preview:: to plain tokio::.

This is the tokio version of what D20168958 did for futures.

Codemod performed by:

```
rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
    -x \
    buck query "labels(srcs,
        rdeps(%Ss, fbsource//third-party/rust:tokio-old, 1)
        intersect
        rdeps(%Ss, //common/rust/renamed:tokio-preview, 1)
    )" \
| xargs sed -i 's,\btokio::,tokio_old::,'
```

Reviewed By: k21

Differential Revision: D20235404

fbshipit-source-id: cfb2689a584ad0d73f16d98d8587fb9c44661465
2020-03-04 11:09:30 -08:00
Mark Thomas
8c6f30a688 renderdag: add auto-detection of best lines renderer
Summary:
The `lines` renderer doesn't work if the output encoding doesn't support the
curved line drawing characters.  In this case we should fall back to
`lines-square`.

Rename `lines` to `lines-curved`, and change `lines` to pick the best renderer
to use based on what is possible with the current output encoding.
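
One way to approximate the auto-detection, shown as a Python sketch rather than the actual renderdag code: try to encode the curved corner characters in the output encoding and fall back when that fails. The function name and the exact character set are assumptions.

```python
# Curved corner characters: U+256D, U+256E, U+256F, U+2570.
CURVED_CHARS = "\u256d\u256e\u256f\u2570"


def pick_lines_renderer(output_encoding: str) -> str:
    """Return "lines-curved" if the encoding can represent curved corners."""
    try:
        CURVED_CHARS.encode(output_encoding)
    except (UnicodeEncodeError, LookupError):
        # Unencodable characters or an unknown encoding: use square corners.
        return "lines-square"
    return "lines-curved"
```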

Reviewed By: quark-zju

Differential Revision: D20248022

fbshipit-source-id: dfaf359426528a9cb515fb3e1d366fbfb15162ff
2020-03-04 11:05:29 -08:00
Mark Thomas
eb7f7aacd9 pager: allow overriding of encoding for the pager
Summary:
The pager may accept a different encoding than either the process encoding or
the output encoding.

For example, on Windows:
  * the process encoding may be cp1252 (which is used for all `...A` system calls).
  * the output encoding may be cp436 (which is used for writing directly to the console).
  * the pager encoding may be utf-8 (which is written to the console using more modern system calls).

To fix this, add a `pager.encoding` config option, which, when set, overrides
the output encoding when writing to the pager.

Reviewed By: quark-zju

Differential Revision: D20247650

fbshipit-source-id: 1e4d1246c95f2102763d879f9783d02acc193a73
2020-03-04 11:05:29 -08:00
Mark Thomas
6664cf259b encoding: outputencoding option overrides wrong module attribute
Summary:
The `--outputencoding` option should override `encoding.outputencoding`, not
`util.outputencoding`.

Reviewed By: quark-zju

Differential Revision: D20247651

fbshipit-source-id: b54cee6cd14fb1f3b6d5e8ffc0bf96b7ed924840
2020-03-04 11:05:29 -08:00
Adam Simpkins
385eed0b4e do not treat zombie processes as exited in edenfsctl restart
Summary:
Update `edenfsctl restart` so that it does not treat zombie processes as
stopped.  This effectively reverts the changes added in D9980225.

This behavior was causing `edenfsctl restart` to spuriously fail, as it would
try to start the new EdenFS process too early, before the kernel had fully
cleaned up the old EdenFS process and released all of its locks.  In
particular, the new process would often fail to acquire the RocksDB lock.
Older versions of EdenFS did not always explicitly release this lock during
shutdown, and so it would end up being cleaned up by the kernel after the
process had exited.

I wrote a simple test program to verify this behavior, where one process
would acquire a file lock with an `F_SETLK` `fcntl()` call, and then exit
without releasing it.  Another process that polled for this process to enter
zombie state and then try to acquire the lock.  It would very reliably receive
`EAGAIN` failures if it attempted to acquire the lock immediately after it saw
the first process enter a zombie state.

In practice we shouldn't normally run into issues with EdenFS being stuck in a
zombie state.  The situation described in D9980225 sounds like a corner case
encountered during development while running EdenFS under sudo.

Reviewed By: chadaustin

Differential Revision: D20210708

fbshipit-source-id: cd62b47405d7f3e53bd4a1fb4ff2964596ca3536
2020-03-04 10:29:53 -08:00
Adam Simpkins
09bf63eca1 update some of the systemd tests to wait on subprocess properly
Summary:
Update some of the systemd tests that were using
`eden.cli.daemon.wait_for_process_exit()` and were relying on it to return for
zombie processes that had not been reaped.  This test would spawn a subprocess
and then wait for it using `wait_for_process_exit()` instead of actually just
using `subprocess.Popen.wait()`.

The `wait_for_process_exit()` function is only intended to be used for
non-child processes.  For immediate child processes it is always better to
simply use `wait()`.
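
As a small illustration (not the test code itself), waiting on a direct child with `subprocess.Popen.wait()` both blocks until it exits and reaps it, so no zombie is left behind. The child command is an arbitrary placeholder.

```python
import subprocess
import sys


def run_child_and_wait(timeout: float = 30.0) -> int:
    """Spawn a direct child and reap it with wait(), returning its exit code."""
    child = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"])
    # wait() collects the exit status, so the child never lingers as a zombie.
    return child.wait(timeout=timeout)
```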

This refactors the code so that it uses `subprocess.Popen.wait()` where
appropriate.  This is needed to make these tests work even after D20210708
lands.

Reviewed By: wez

Differential Revision: D20242891

fbshipit-source-id: 0afd3d3d7ee1d733099ea74f7b9b19cbe48b22d4
2020-03-04 10:29:52 -08:00
Stanislau Hlebik
800abb3253 mononoke: use only tokio-preview
Summary: clippy was failing; this diff should hopefully fix it

Reviewed By: krallin

Differential Revision: D20250585

fbshipit-source-id: 6a9becdb84ec293659433fa9078e456d40210b6c
2020-03-04 10:17:50 -08:00
Xavier Deguillard
314d1978ef clidispatch: silence warning on windows
Summary:
Using `if cfg!` instead of `#[cfg]` allows the compiler to understand
that the arguments aren't unused, which silences the warnings.

Reviewed By: quark-zju

Differential Revision: D20242280

fbshipit-source-id: 332dfe17b3a80a1096d15c91c9fb6644bd10e0cd
2020-03-04 09:49:15 -08:00
Xavier Deguillard
7d9d38017c configparser: silence compiler warning
Summary:
Compiling it on Windows produced a bunch of warnings due to
`hgrc_configset_load_path` not being compiled on it. Fixed it so it no longer
depends on Unix specific imports.

Reviewed By: quark-zju

Differential Revision: D20241102

fbshipit-source-id: 3002f961191fbb9bc51aa9ac1154d6d50bd7fe23
2020-03-04 09:49:14 -08:00
Xavier Deguillard
db76b7d52b procinfo: address compiler warning
Summary:
The `.into_iter()` for this object is being deprecated and won't compile in
the future, fix it now.

Reviewed By: quark-zju

Differential Revision: D20241103

fbshipit-source-id: fdee463ed81cd07a65f3cc4c70a96c88928b3b87
2020-03-04 09:49:14 -08:00
Xavier Deguillard
be7ae642ea commitcloudsubscriber: silence compiler warning
Summary:
While compiling on Windows, this file issues a bunch of warnings. Use `if
cfg!` instead of `#[cfg]` to silence them. The behavior is the same, but
`if cfg!` allows the compiler to recognize that some code is not unused.

Reviewed By: quark-zju

Differential Revision: D20241104

fbshipit-source-id: 2cd7f171c7a2f7220cc73bea9be3359260de19b2
2020-03-04 09:49:14 -08:00
Thomas Orozco
275e4eff76 mononoke/mercurial: remove incorrect FileBytes Extend implementation
Summary:
This removes the Extend implementation for FileBytes, which was incorrect (it
discarded existing data!). I had introduced this as a backwards compatibility
shim when doing the Bytes 0.4 to Bytes 0.5 migration :/

We don't really need this shim, considering:

- The only place that really matters that uses this is the remotefilelog crate,
  where we have a content id, and where we should use `filestore::fetch_concat`
  instead.
- The other places are tests (or close to abandonware...), which can do their
  own folding.

Longer term, I'd like to remove the whole `Content` stream in hg entries, so
those callsites can use the filestore methods, which a) have test coverage
(unlike ad-hoc folds, which don't always do), and b) are more efficient since
they know how large the destination buffer needs to be ahead of time, and don't
need to re-allocate.

To make sure this fixes the bug, I also introduced tests for the remotefilelog
crate. As expected, the chunked variant fails without this fix.

Reviewed By: mitrandir77

Differential Revision: D20248978

fbshipit-source-id: 1b554d3e595eb867b6b6cf4204d31f27dd90a111
2020-03-04 08:51:42 -08:00
Thomas Orozco
1bce31dbe1 mononoke/fastreplay: don't sample errors
Summary:
Not sampling errors will make it easier to use Fastreplay as an early alarm
system for errors.

Reviewed By: ahornby

Differential Revision: D20249202

fbshipit-source-id: 92da53d5703b58bcef49cfcdc251f008ae6f25bc
2020-03-04 08:43:26 -08:00
Jun Wu
49464342fd indexedlog: try to use symlink for atomic_write on unix
Summary:
The change is in theory not necessary. However it improves the reliability on
OS crashes a bit, and can potentially work around some bugs in filesystems
(as we saw in production where the atomic-written files are empty and the
system didn't crash).

The idea is, the `symlink` syscall does the file creation and "content" writing
together, while there is no way to create a file and write specific content
in one syscall. Note that the C symlink call uses 0-terminated string, and
the Rust stdlib exports it as accepting `Path`. To be safe, we encode binary
or non-utf8 content using `hex`.

For downgrade safety, the write path does not use symlink by default unless
format.use-symlink-atomic-write is set to true. This makes downgrade possible:
the read path is rolled out first, then we can turn on and off the write path.
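
A Unix-only Python sketch of the idea, not the indexedlog implementation. It always hex-encodes the content and adds a hypothetical `hex:` prefix so the symlink target is never empty; the reader falls back to regular files, mirroring the "read path first" rollout. Symlink targets are length-limited, so this suits only small contents.

```python
import os
import uuid

HEX_PREFIX = "hex:"  # hypothetical marker; also keeps the target non-empty


def atomic_write_symlink(path: str, content: bytes) -> None:
    """Store `content` as the target of a symlink that atomically replaces `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = os.path.join(directory, ".tmp-" + uuid.uuid4().hex)
    # symlink() creates the entry and stores its "content" in a single syscall.
    os.symlink(HEX_PREFIX + content.hex(), tmp_path)
    # rename() atomically replaces any previous file or symlink at `path`.
    os.replace(tmp_path, path)


def atomic_read(path: str) -> bytes:
    """Read content written either as a symlink target or as a regular file."""
    if os.path.islink(path):
        target = os.readlink(path)
        if target.startswith(HEX_PREFIX):
            return bytes.fromhex(target[len(HEX_PREFIX):])
    with open(path, "rb") as f:
        return f.read()
```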

The indexedlog Rust unit tests and test-doctor.t are migrated to use the new
symlink code paths.

Reviewed By: DurhamG

Differential Revision: D20153864

fbshipit-source-id: c31bd4287a8d29575180fbcf7227d2b04c4c1252
2020-03-04 07:23:48 -08:00
Jun Wu
def12896db indexedlog: add a utility function to read files created by atomic_write
Summary:
This makes it possible to implement atomic_write differently (ex. use a
symlink).

Reviewed By: DurhamG

Differential Revision: D20153865

fbshipit-source-id: 07fa78c2f2dac696668f477c75f65cf70950b73f
2020-03-04 07:23:47 -08:00
Mateusz Kwapich
1e33cd40b6 a small tool to backfill git mappings
Summary:
The git mappings are normally populated during blobimport of the repo but we
need something for the repos we've already imported.

Reviewed By: markbt

Differential Revision: D20160768

fbshipit-source-id: 9e37c7d0f12682e73ca9990e56e4d827e9861a9f
2020-03-04 06:08:43 -08:00
Thomas Orozco
16d5ab5066 mononoke/cache_warmup: remove tracing
Summary:
We don't use it, and this tries to write to Manifold from tests, which is
undesirable. Let's remove it.

Reviewed By: farnz

Differential Revision: D20219902

fbshipit-source-id: 2e983bee54cadad257648cc9633695be825a1ef3
2020-03-04 04:02:19 -08:00
Thomas Orozco
f4f96c1100 mononoke/microwave: create repository snapshots for faster cache warmup
Summary:
This introduces a new binary and library (microwave: it makes warmup
faster..!) that can be used to accelerate cache warmup. The idea is the
microwave binary will run cache warmup and capture things that are loaded
during cache warmup, and commit those to a file.

We can then use that file when starting up a host to get a head start on cache
warmup by injecting all those entries into our local cache before actually
starting cache warmup.

Currently, this only supports filenodes, but that's already a pretty good
improvement. Changesets should be easy to add as well. Blobs might require a
bit more work.

Reviewed By: StanislavGlebik

Differential Revision: D20219905

fbshipit-source-id: 82bb13ca487f82ca53b4a68a90ac5893895a96e9
2020-03-04 04:02:18 -08:00
Thomas Orozco
7f044a7b2e mononoke/walker: disable filenodes SQL timeouts
Summary:
The walker has been hitting the filenodes-enforced 5 second SQL timeout when
querying filenodes from MySQL.

It's not clear why that is, but looking at previous run history shows that we
occasionally have queries that take > 30 seconds to complete (none of those
show up in MySQL slow queries, though, and there's no particular load on the
hosts around that time, so it's not clear whether this is happening in MySQL or
our end).

Anyhow, those queries would have worked in the old implementation (after a long
time), but they fail in the new one, since it enforces a 5-second timeout.

We should investigate why this is happening (and Alex has landed diffs to add
more reporting in the walker to that end), but in the meantime, there's no
reason to break the walker.

Reviewed By: farnz

Differential Revision: D20227842

fbshipit-source-id: 5ee5c8225b6474b66c1f48a10b4a2d671ebc79c6
2020-03-04 03:20:26 -08:00
Thomas Orozco
f486c3d190 mononoke/fastreplay: add context on cache warmup failures
Summary: When it fails, it's better to know which repo failed.

Reviewed By: farnz

Differential Revision: D20245375

fbshipit-source-id: 9794911308dbdd67b20673857ac8b7b54f06a217
2020-03-04 03:14:45 -08:00
Stanislau Hlebik
e9f78e0601 mononoke: add context with repoid to cache_warmup error message
Summary: Makes it easier to understand which repo is failing

Reviewed By: krallin

Differential Revision: D20244630

fbshipit-source-id: ca32f7831c5ed4e701103020e9878c459ba6d573
2020-03-04 01:52:11 -08:00
Jun Wu
ea7a8b68a5 run-tests: fail instead of skipping tests for unknown hghave features
Summary:
If hghave fails to check a feature because the feature name is unknown, treat
it as a test failure instead of skipping the entire test. This is especially
useful since `#if feature-name` only affects part of the test and failing to
test the feature should not skip the entire test. It also allows us to capture
issues about mis-spelled feature names or stale feature tests.

This has bitten us twice in the past:

- D18819680 removed `pure` and accidentally disabled tests including
  `test-install.t`, `test-annotate.t` and `test-issue4074.t`. Those tests got
  re-enabled as part of D20155399. While they pass Python 2 tests,
  the Python 3 tests were failing.

- D18088850 removed svn related feature checks, which has caused some issues
  that got fixed by D18713921 and D18713922.

Reviewed By: xavierd

Differential Revision: D20231782

fbshipit-source-id: 6adf99bd79b2a295d4e84ce4da5f9425a100936a
2020-03-03 19:17:59 -08:00
Jun Wu
73f0525b89 test-issue4074: fix py3 compatibility
Summary: There are multiple issues. Fix them.

Reviewed By: kulshrax

Differential Revision: D20231783

fbshipit-source-id: fc6be43fda088822fe8ff9dbd32410aa616c1772
2020-03-03 17:46:34 -08:00
Jun Wu
2e03deb89e test-annotate: fix py3 compatibility
Summary: The encoding.trim function needs update.

Reviewed By: kulshrax

Differential Revision: D20231780

fbshipit-source-id: 82ea022d815fe9077b8b72403f8de1049173956c
2020-03-03 17:46:34 -08:00
Jun Wu
96ef84b2d4 test-install: fix py3 compatibility
Summary: The test should not assert Python version is "2.*".

Reviewed By: kulshrax

Differential Revision: D20231781

fbshipit-source-id: 2e10c37bb4b665bc4d5d4b27329c4c2cb23d54e3
2020-03-03 17:46:33 -08:00
Arun Kulshreshtha
78adda0589 mercurial_types: make envelope functions use generics instead of trait objects
Summary: Make these functions generic so that callers don't need to construct a trait object whenever they want to call them. Passing in a trait object should still work so existing callsites should not be affected.

Reviewed By: krallin

Differential Revision: D20225830

fbshipit-source-id: df0389b0f19aa44aaa89682198f43cb9f1d84b25
2020-03-03 15:11:04 -08:00
Arun Kulshreshtha
f8d0ad25a2 mononoke_api: add history method to HgFileContext
Summary: Add a method to `HgFileContext` to stream the history of the file. Will be used to support EdenAPI history requests.

Reviewed By: krallin

Differential Revision: D20211779

fbshipit-source-id: 49e8c235468d18b23976e64a9205cbcc86a7a1b4
2020-03-03 15:11:04 -08:00
Arun Kulshreshtha
fa999d9de1 mononoke_api: add HgTreeContext
Summary: Add an 'HgTreeContext' struct to the 'hg' module to allow querying for tree data in Mercurial-specific formats. This initial implementation's primary purpose is to enable getting the content of tree nodes in a format that can be written directly to Mercurial's storage.

Reviewed By: krallin

Differential Revision: D20159958

fbshipit-source-id: d229aee4d6c7d9ef45297c18de6e393d2a2dc83f
2020-03-03 15:11:03 -08:00
Genevieve Helsel
93d6f0a3e9 use a NullTelemetryLogger during integration tests
Summary: I was looking in the `edenfs_events` table and saw that sandcastle was logging to this table. Rice was able to identify that the reason was that the integration tests were logging. So if we're running integration tests, we should return a `NullTelemetryLogger`. The daemon currently does not log on sandcastle AFAIK.

Reviewed By: simpkins

Differential Revision: D20203556

fbshipit-source-id: e09175347631478cb366d4fa2c6092d976504dd8
2020-03-03 14:56:49 -08:00
Hezi Zhang
2bbc4e5043 --clean option for eden du
Summary: `buck run edenfsctl -- du --clean` would help reduce the space used by the storage engine.

Reviewed By: chadaustin

Differential Revision: D20200616

fbshipit-source-id: 6ffa588fc71660a6a80d81aef7d58dda08932374
2020-03-03 14:07:53 -08:00
Adam Simpkins
d205829363 update process_finder to also be able to report EdenFS build info
Summary:
Add a `get_build_info()` method to the `EdenFSProcess` objects returned by
the `process_finder` module.  This returns information about the process
version and build time.

Reviewed By: wez

Differential Revision: D20178487

fbshipit-source-id: b1eb41de9184ca59dc1e90d0a92ff1cbc89a6b77
2020-03-03 13:48:55 -08:00
Adam Simpkins
53f15731c6 fix a couple pyre-fixme comments in eden/cli/main.py
Summary:
Store member variables in a local variable so that Pyre will allow unwrapping
it from an `Optional` type.  Pyre refuses to allow member variables to be
extracted from `Optional` since other functions called indirectly could modify
them.
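
A small sketch of the pattern, using a hypothetical class rather than the code in `eden/cli/main.py`: copying the attribute into a local variable lets the `None` check keep holding across intervening method calls.

```python
from typing import Optional


class Session:
    """Hypothetical class with an Optional member variable."""

    def __init__(self, user: Optional[str] = None) -> None:
        self.user = user

    def _audit(self) -> None:
        """Stand-in for any call that might reassign self.user."""

    def greeting(self) -> str:
        user = self.user  # copy to a local so the narrowing below sticks
        if user is None:
            return "hello, anonymous"
        self._audit()  # a checker can no longer assume self.user is non-None
        return "hello, " + user  # but the local `user` is still known to be str
```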

Reviewed By: fanzeyi

Differential Revision: D20212162

fbshipit-source-id: 95655b73b5e469688f48d402c0b587928cbb0a35
2020-03-03 13:41:28 -08:00
Jun Wu
7c5e47bab1 indexedlog: rename chunk_size_log to chunk_size_logarithm
Summary:
This makes it clear that `log` is a math concept, not an append-only file like
`Log`.

Reviewed By: DurhamG

Differential Revision: D20149376

fbshipit-source-id: 67d2e9584b15f48759ca9b6dfce4279a5b1365a0
2020-03-03 13:41:28 -08:00
Jun Wu
49de84398b bindings: use Str for return type of repair()
Summary: This makes it friendly to Python 2.

Reviewed By: sfilipco

Differential Revision: D20162233

fbshipit-source-id: 5beb7a0f52159afc454332ff6e37e13087177cc0
2020-03-03 13:41:27 -08:00
Jun Wu
5ba323af16 doctor: skip unknown visibleheads format
Summary:
When I run `hg doctor` in my www checkout it fails the assertion check of the
first line of visibleheads is "v1". Make it graceful so doctor can check and
fix other components.

Reviewed By: DurhamG

Differential Revision: D20147969

fbshipit-source-id: 6aee2cab962fcd0ef06a0611d288021e86621249
2020-03-03 13:41:27 -08:00
Puneet Kaushik
7e0c6397c4 Update find_eden to work on Windows
Summary: Updated find_eden to find the Eden clone on a Windows system. On Windows we don't use symlinks, which makes the logic different from the POSIX implementation.

Reviewed By: simpkins

Differential Revision: D19953934

fbshipit-source-id: bfbc112c3ccc48735ec6590746d8275cc9850796
2020-03-03 13:27:19 -08:00
Alvaro Leiva Geisse
5bd7c1ad3e Revert D20090620: Type hints
Differential Revision: D20090620

Original commit changeset: 811bb54159ab

fbshipit-source-id: 4d00afda362120c23567244cbbb77a288f05a6dd
2020-03-03 12:34:59 -08:00
Adam Simpkins
f85fe60c31 update the CLI to handle old EdenFS instances without getDaemonInfo()
Summary:
In D20130406 I updated the CLI to call `getDaemonInfo()` to check on the
server status.  However, some very old EdenFS instances do not have this
method.  These instances should all be gone shortly, but for now update the
code to handle the unknown method error and fall back to calling `getPid()`
and `getStatus()` separately.
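
The fallback can be sketched as follows. `UnknownMethodError` and the two fake daemon classes are hypothetical stand-ins for the RPC layer and are not part of the EdenFS client; only the method names `getDaemonInfo()`, `getPid()` and `getStatus()` come from the description above.

```python
from typing import Tuple


class UnknownMethodError(Exception):
    """Hypothetical stand-in for the RPC layer's unknown-method error."""


class OldDaemon:
    """Fake of a very old instance that lacks getDaemonInfo()."""

    def getDaemonInfo(self) -> Tuple[int, str]:
        raise UnknownMethodError("getDaemonInfo")

    def getPid(self) -> int:
        return 1234

    def getStatus(self) -> str:
        return "ALIVE"


class NewDaemon:
    """Fake of a current instance that answers getDaemonInfo() directly."""

    def getDaemonInfo(self) -> Tuple[int, str]:
        return (5678, "ALIVE")


def daemon_info(client) -> Tuple[int, str]:
    """Prefer the combined call; fall back to two separate calls if it is unknown."""
    try:
        return client.getDaemonInfo()
    except UnknownMethodError:
        return client.getPid(), client.getStatus()
```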

I implemented this in our `EdenClient` wrapper class, similar to our existing
wrapper for `getPid()`.

Reviewed By: fanzeyi

Differential Revision: D20212518

fbshipit-source-id: 9d48bdd26822802a7e9776128c5567436d4bb445
2020-03-03 12:15:26 -08:00
Adam Simpkins
11451012a0 make autodeps happy about eden/py
Summary:
Update the import statements so that autodeps works on the `eden/py`
directories.

Reviewed By: fanzeyi

Differential Revision: D20212519

fbshipit-source-id: 37ccabf14dc0dbfe998664260ae9b83c9136ad63
2020-03-03 12:15:25 -08:00
Adam Simpkins
2ea7064ae6 update process_finder to also find the Eden dir through the lock file
Summary:
Update process_finder.py to look for a process's Eden state directory by
looking through its open FDs to find the EdenFS lock file, if it can't find
the state directory from the command line arguments.

At the moment we almost always invoke EdenFS with an explicit `--edenDir`
argument, but this change will allow this code to work even if we remove that in
the future.

Reviewed By: wez

Differential Revision: D20178484

fbshipit-source-id: 361b78f4a2566b8c09ce02fb21c46233d7e2546b
2020-03-03 11:53:50 -08:00
Adam Simpkins
ffa4d589b9 update the process_finder tests to also fake a privhelper process
Summary:
Update the `FakeProcessFinder.add_edenfs()` function to also add a fake
privhelper process in addition to the main edenfs process.  This allows the
tests to more accurately simulate the normal edenfs behavior.

Reviewed By: wez

Differential Revision: D20178482

fbshipit-source-id: edc70ade1b61929b37f13ece77757c7c35aa4eec
2020-03-03 11:53:49 -08:00
Adam Simpkins
931523b160 update process_finder to return processes for all users
Summary:
Update the code in `process_finder.py` to return EdenFS processes owned by all
users.  We now report the `uid` as a field in the returned process info, so
that callers can filter the results based on user ID if they want.  This
allows callers more flexibility when finding processes.

This also updates the `FakeProcessFinder` test utility code to support
providing fake UIDs to allow testing this behavior.

Reviewed By: wez

Differential Revision: D20178490

fbshipit-source-id: 6b76e1109e4835b167c80688fd3ace50f7986a22
2020-03-03 11:53:49 -08:00
Adam Simpkins
da4e49b89b move rogue process detection from process_finder to doctor code
Summary:
Move the code to find rogue EdenFS processes out of the generic
`process_finder` module and into the `check_rogue_edenfs` module that is
specific to the `eden doctor` checks.

The `ProcessFinder` class now exposes a `get_edenfs_processes()` API instead
of `find_rogue_pids()`, which makes it more generically usable outside of just
the doctor code.

Reviewed By: wez

Differential Revision: D20178486

fbshipit-source-id: e289f1673a5d4a666e9d54e8f58f4f00bdde94b7
2020-03-03 11:53:49 -08:00
Katie Mancini
3a035094f8 Record Mercurial tree import time
Summary: - added logging only around the import tree call to capture non-queue related wait time

Reviewed By: chadaustin, fanzeyi

Differential Revision: D20207472

fbshipit-source-id: d88bb34ce224a26ff2be100d7789ddeff608006d
2020-03-03 11:44:28 -08:00
Katie Mancini
52e211fe8e Record Mercurial file import time
Summary:
- added logging only around the import blob call to capture non-queue related wait time
- added to `test_reading_file_gets_file_from_hg` in `integration.stats_test.HgBackingStoreStatsTest` to test import blob logging in addition to the get blob logging

(not yet done for importing trees, will do in next diff)

Reviewed By: chadaustin

Differential Revision: D20201215

fbshipit-source-id: c89281fe7d3d6e89d111ac8cce9014adff44ac40
2020-03-03 11:44:27 -08:00
David Tolnay
e988a88be9 rust: Rename futures_preview:: to futures::
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/

This codemod replaces *all* dependencies on `//common/rust/renamed:futures-preview` with `fbsource//third-party/rust:futures-preview` and their uses in Rust code from `futures_preview::` to `futures::`.

This does not introduce any collisions with `futures::` meaning 0.1 futures because D20168958 previously renamed all of those to `futures_old::` in crates that depend on *both* 0.1 and 0.3 futures.

Codemod performed by:

```
rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
    -x \
    buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:futures-preview, 1))" \
| xargs sed -i 's,\bfutures_preview::,futures::,'

rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| xargs sed -i 's,//common/rust/renamed:futures-preview,fbsource//third-party/rust:futures-preview,'
```

Reviewed By: k21

Differential Revision: D20213432

fbshipit-source-id: 07ee643d350c5817cda1f43684d55084f8ac68a6
2020-03-03 11:01:20 -08:00
Stanislau Hlebik
b90a3e842a common/rust: add fbinit::compat_test
Summary:
While we are transitioning from tokio 0.1 to tokio 0.2 we might need to use
[tokio_compat](https://docs.rs/tokio-compat/0.1.4/tokio_compat/) crate.

Let's add a helper macro similar to fbinit::test that uses tokio_compat
runtime.

Reviewed By: farnz

Differential Revision: D20213814

fbshipit-source-id: 18976e953011c8ada1fa915686e2dcb76ea288d5
2020-03-03 10:18:02 -08:00
Thomas Orozco
83cd9eec54 mononoke/apiserver: run streams on a Tokio 0.2 runtime
Summary:
Well, we don't have a Tokio Compat runtime in Actix. This means Tokio 0.2 code
(e.g. Tokio 0.2 timers) blows up when executed in the API Server.

How do we fix this? By not running Mononoke code on Actix's runtime, and
instead running it on a Mononoke runtime we instantiated.

How do we do that? By passing a Tokio Compat Executor all the way down to the
place where Actix is about to consume our stream ... and at that point, we
spawn the stream on our runtime, and give Actix a dumb receiver that does work
when polled on a Tokio 0.1 runtime.

This feels like the end of the road for the API Server. Nothing about this is
even remotely sane, but it should take us through the API Server's eventual
demise and replacement with the Gotham-based EdenAPI Server, which runs on the
runtime of our choice (i.e. Tokio 0.2).

Reviewed By: farnz

Differential Revision: D20222294

fbshipit-source-id: 1646e35fe05b131b030e4962c8a7f68f72995035
2020-03-03 10:18:02 -08:00
Doug Neal
1e088c0af2 mononoke: lfs_server: add optional client identities to ratelimit config
Summary:
* Added intermediate (de)serializers for config types, so that we generate full Identity objects at config load time
* Implement FromStr for Identity
* Compare configured identities to presented identities in ratelimit middleware in order to decide whether or not to apply the limit

Reviewed By: krallin

Differential Revision: D20139308

fbshipit-source-id: 340c300db549575eb6d06efcbe437c0b1db4927b
2020-03-03 09:33:03 -08:00
Genevieve Helsel
e1e698ccb3 update eden doctor to log vector of problem types
Reviewed By: chadaustin

Differential Revision: D20199631

fbshipit-source-id: 30c770167181db30f956a76ea48327800c4a6ae6
2020-03-03 08:04:29 -08:00
Genevieve Helsel
0351783cbe allow tag support in cli scuba logging
Summary: We should support logging tags as well. I pass this along as a set until json construction because we do not want to have repeat values since tags are expected to be sets
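
A small sketch of the idea, assuming a plain JSON payload: tags stay in a `set` to drop repeats and are converted to a list only when the JSON is built, since JSON has no set type. The `build_log_sample` name and the `event`/`tags` keys are hypothetical, not the real logging schema.

```python
import json
from typing import Iterable, Set


def build_log_sample(event: str, tags: Iterable[str]) -> str:
    """Serialize an event with de-duplicated tags (hypothetical schema)."""
    unique_tags: Set[str] = set(tags)
    # Sort so the serialized output is deterministic.
    return json.dumps({"event": event, "tags": sorted(unique_tags)})
```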

Reviewed By: chadaustin

Differential Revision: D20199632

fbshipit-source-id: 2b5c94f1747a9b30d7a97b605abfd0e39928464c
2020-03-03 08:04:29 -08:00
Stanislau Hlebik
a70ccf6f04 mononoke: make it clearer which repo is accessed in permission error
Summary:
Usually we have only one repo, but in case of xrepo_commit_lookup we actually
have two. It's nice to know which permission failed

Reviewed By: krallin

Differential Revision: D20221509

fbshipit-source-id: ee98845767e72f99027ba18a8c5b374cb6f9f3ab
2020-03-03 07:22:50 -08:00
Alex Hornby
464ffc40eb mononoke: pushrebase: fix casefolding_check usage during changeset creation
Summary: Honor the repo casefolding_check setting as tested by test-pushrebase-allow-casefolding.t

Reviewed By: StanislavGlebik

Differential Revision: D20192411

fbshipit-source-id: 8da72049417015b1f284c115a53b13c26ce3c3f6
2020-03-03 03:57:32 -08:00
Alex Hornby
5491f049a4 mononoke: walker: publish per-node-type stats
Summary: publish per-node-type progress stats so we can correlate storage access/load to type of node traversed

Reviewed By: farnz

Differential Revision: D20181064

fbshipit-source-id: c741b526c50e86a3eee105fab57fd7bc3ecc063b
2020-03-03 03:47:57 -08:00
Alex Hornby
37da3ebd2b mononoke: pushrebase: add tests for casefolding
Summary: Add tests for the existing default block casefolding_check behaviour, plus a test demonstrating the problem with casefolding_check=false

Reviewed By: farnz

Differential Revision: D20192412

fbshipit-source-id: 1aea0fc5581e0c44388a4224ca693698731d3cd5
2020-03-03 02:44:06 -08:00
David Tolnay
fe65402e46 rust: Move futures-old rdeps to renamed futures-old
Summary:
In targets that depend on *both* 0.1 and 0.3 futures, this codemod renames the 0.1 dependency to be exposed as futures_old::. This is in preparation for flipping the 0.3 dependencies from futures_preview:: to plain futures::.

rs changes performed by:

```
rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
    -x \
    buck query "labels(srcs,
        rdeps(%Ss, fbsource//third-party/rust:futures-old, 1)
        intersect
        rdeps(%Ss, //common/rust/renamed:futures-preview, 1)
    )" \
| xargs sed -i 's/\bfutures::/futures_old::/'
```

Reviewed By: jsgf

Differential Revision: D20168958

fbshipit-source-id: d2c099f9170c427e542975bc22fd96138a7725b0
2020-03-02 21:02:50 -08:00
Zeyi (Rice) Fan
7627417ce8 check for interrupted transaction and try to repair it
Summary: Recently there have been more reports of EdenFS's backing repo getting stuck in an interrupted transaction state, where the user has to manually run `hg recover` in the backing repo to fix the problem. This diff teaches `eden doctor` to run that command automatically.

Reviewed By: simpkins

Differential Revision: D20109567

fbshipit-source-id: a7427834e98425be388741c7f214b9d7354ac44e
2020-03-02 18:54:55 -08:00
Adam Simpkins
3c29a20934 move the process_finder CLI code to its own library
Summary:
Enable pyre-strict type checking for `process_finder.py`, and split it into
its own library.

Reviewed By: genevievehelsel

Differential Revision: D20178483

fbshipit-source-id: e6c62ca5d84c7b7e599ae00fb51df6f7e4c55a65
2020-03-02 15:41:37 -08:00
Adam Simpkins
95ec8e042a fix platform checks in the CLI code
Summary:
A couple places in the CLI code (mostly used by `eden doctor`) were checking
`sys.platform` to tell if we were on Linux.  Unfortunately these checks both
expected the value `linux2`.  However, since Python 3.3 `sys.platform` is just
`linux` on Linux, and not `linux2`.  This meant we were always hitting the
non-Linux code paths and skipping these checks.

This updates the code to check `platform.system()`.  Based on the
documentation it sounds like this is intended to give a bit more consistent
behavior across different platforms and OS versions.
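
The difference between the two checks can be sketched as follows. The helper names are hypothetical and only illustrate the comparison described above.

```python
import platform
import sys


def is_linux_legacy() -> bool:
    """The broken check: Python 3.3+ never reports "linux2"."""
    return sys.platform == "linux2"


def is_linux() -> bool:
    """The replacement check based on platform.system()."""
    return platform.system() == "Linux"
```

On Python 3.3 and later, `is_linux_legacy()` is false even on Linux, which is why the Linux-only code paths were being skipped.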

Reviewed By: genevievehelsel

Differential Revision: D20178488

fbshipit-source-id: c908d5133a9c41e6a239a8893742d03f6c08527c
2020-03-02 15:41:36 -08:00
Adam Simpkins
3d18d04475 change the process name for the privhelper to "edenfs_privhelp"
Summary:
Call `folly::setThreadName()` in the privhelper process when it starts.  This
changes the command name reported in `/proc/PID/comm` and in `ps`

The process name is limited to 15 bytes, so this shows up as `edenfs_privhelp`
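
A quick sketch of the truncation, assuming the requested name was `edenfs_privhelper` (the full name is an assumption; only the truncated form appears above). On Linux the name buffer is 16 bytes including the trailing NUL, which leaves 15 usable bytes.

```python
COMM_MAX_BYTES = 15  # 16-byte kernel buffer minus the trailing NUL


def truncated_comm(name: str) -> str:
    """Return the name as it would appear after the 15-byte limit is applied."""
    raw = name.encode("utf-8")[:COMM_MAX_BYTES]
    # Drop any multi-byte character that the cut split in half.
    return raw.decode("utf-8", errors="ignore")
```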

Reviewed By: fanzeyi

Differential Revision: D20199409

fbshipit-source-id: a5349bfab9230174aaa99c87f0db73fe31659186
2020-03-02 15:35:21 -08:00
Doug Huff
1a1c6d7e35 Type hints
Summary: One small step towards typing

Reviewed By: thatch

Differential Revision: D20090620

fbshipit-source-id: 811bb54159ab91e5560d115c20373eaf6542b2f9
2020-03-02 13:49:15 -08:00
Stanislau Hlebik
25c57e445c mononoke: add create_warmer() function
Summary:
Small cleanup that removes a bunch of duplicate code.
That should make it easier to add other types of derived data to the warmer

Reviewed By: krallin

Differential Revision: D20193169

fbshipit-source-id: 437fe7981d8a71164dc9edfcc423e8c41cbe0967
2020-03-02 10:08:09 -08:00
Arun Kulshreshtha
bd4a623ccb mononoke_api: Add HgFileContext::new_check_exists
Summary: Add a `new_check_exists` method to `HgFileContext` to allow looking up potentially nonexistent filenodes.

Reviewed By: xavierd

Differential Revision: D20159085

fbshipit-source-id: f6047f7a25f59594823672373d8b35adb49586e1
2020-03-02 09:41:21 -08:00
Arun Kulshreshtha
8ec76a0bce mononoke_api: add hg module
Summary:
Add a new `hg` module to the `mononoke_api` crate that provides a `HgRepoContext` type, which can be used to query the repo for data in Mercurial-specific formats. This will be used in the EdenAPI server.

Initially, the `HgRepoContext`'s functionality is limited to just getting the content of individual files. It will be expanded to support querying more things in later diffs.

Reviewed By: markbt

Differential Revision: D20117038

fbshipit-source-id: 23dd0c727b9e3d80bd6dc873804e41c7772f3146
2020-03-02 09:41:20 -08:00
Thomas Orozco
0dadca26e7 mononoke/gotham_ext: make MononokeHttpHandler middleware async & allow preemption
Summary:
This updates our middleware stack and introduces two new pieces of functionality:

- Middleware can now be async.
- Middleware can now preempt requests and dispatch a response.

The underlying motivation for this is to allow implementing Mononoke LFS's rate
limiting middleware in our existing middleware stack.
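
The two properties listed above can be sketched in Python with `asyncio`. This is an illustration of the pattern only, not the `gotham_ext` API: the `dispatch`, `rate_limit`, and `handler` names and the dict-based request are all hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional

Request = dict
Response = str
Middleware = Callable[[Request], Awaitable[Optional[Response]]]
Handler = Callable[[Request], Awaitable[Response]]


async def dispatch(
    request: Request, middleware: List[Middleware], handler: Handler
) -> Response:
    """Run each async middleware in order; any of them may preempt the handler."""
    for layer in middleware:
        early_response = await layer(request)
        if early_response is not None:
            # Preemption: respond now and skip the remaining layers and the handler.
            return early_response
    return await handler(request)


async def rate_limit(request: Request) -> Optional[Response]:
    """Hypothetical rate limiter that rejects requests flagged as over the limit."""
    return "429 Too Many Requests" if request.get("over_limit") else None


async def handler(request: Request) -> Response:
    return "200 OK"
```

With this shape, a rate limiter lives in the same stack as other middleware and simply returns a response when it wants to stop a request early.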

Reviewed By: kulshrax

Differential Revision: D20191213

fbshipit-source-id: fc1df7a14eb0bbefd965e32c1fca5557124076b5
2020-03-02 09:28:08 -08:00
Arun Kulshreshtha
615d8392bc mononoke_api: update doc comments on file content methods
Summary: D20121350 changed the methods for accessing file content on `FileContext` to no longer return `Stream`s. We should update the comments accordingly.

Reviewed By: ahornby

Differential Revision: D20160128

fbshipit-source-id: f5bfd7e31bc7e6db63f56b8f4fc238893aa09a90
2020-03-02 09:21:08 -08:00
Shai Szulanski
42456710dd Add some missing transitive dependencies
Summary:
A bunch of files include folly/executors/GlobalExecutors.h transitively through thrift/lib/cpp2/async/Stream.h, which is going away. Explicitly include the header (and add dependency to target) in preparation for deleting Stream.h
drop-conflicts

Reviewed By: vitaut

Differential Revision: D20141838

fbshipit-source-id: 21c58cf82136287fc2d84ba5badec6b872106015
2020-03-02 08:54:49 -08:00
Thomas Orozco
2d04773c23 mononoke/hg_sync_job: update Globalrevs in hgsql
Summary:
This updates the hg_sync_job to update Globalrevs in hgsql before attempting to
sync bundles. This means that if we're syncing successfully, hg is in sync with
Mononoke, and if we fail (which should be very uncommon to begin with!), hg
might skip a little bit ahead, but that's OK.

This only makes sense when generating bundles — when doing pushrebase, hg would
be updating its own globalrevs.

Reviewed By: StanislavGlebik

Differential Revision: D20159262

fbshipit-source-id: 6736f8592682da1001c7c9c4c9444462b71913c2
2020-03-02 08:24:16 -08:00
Genevieve Helsel
528015f9fe allow more hg fastpath cases
Reviewed By: simpkins

Differential Revision: D20143888

fbshipit-source-id: 4b1a73159bde6835626ad1766b2cf9dcd2faf6c4
2020-03-02 07:43:39 -08:00
Stanislau Hlebik
638e637ef6 RFC: mononoke: introduce unodes v2
Summary:
Our previous implementation of unodes had a problem with diamond merges:
because p1 and p2 might have the same file but with different content, unode
derivation will always create a merge unode, which can be unexpected
(the code comment in unodes/derive.rs has more info about it).

This diff fixes the problem by introducing unodes v2. This allows us to import
new repos with new unode implementation while keeping the old repos with unode
v1.

This implementation uses a heuristic which should be fast and should do the
correct thing most of the time. In some cases it might exclude some parts of
the history completely. For example:

     O <- merge commit, doesn't change anything
    / \
   P1  |  <- modified "file.txt" to "B"
   |   P2    <- modified "file.txt" to "B"
   \  /
    ROOT <- created "file.txt" with content "A"

In that case history of "file.txt" starting from merge commit will contain only (P1, ROOT),
but it won't contain P2.

We also considered other options:
1) Move this heuristic to fastlog batch derived data. See D19973553 for more
details about why we decided not to do it.

2) Filter out parent unodes that are ancestors of other parent unodes. This should
always be correct, but it will be hard to implement, and it will be even harder to make
sure it always has good performance.

Reviewed By: krallin

Differential Revision: D19978157

fbshipit-source-id: 445ddd5629669d987e7aa88c35fecf0b34a40da0
2020-03-02 05:27:31 -08:00
Stanislau Hlebik
d7a4ff29b5 mononoke: log derivations to separate scuba table
Summary: I'd like to log all derivations to a single place so that it's easier to understand what was derived and where

Reviewed By: aslpavel

Differential Revision: D20140004

fbshipit-source-id: 305ea533031a04ff95995a6fe2a6e57e95a87026
2020-03-02 04:30:12 -08:00
Alex Hornby
63937e3030 mononoke: walker: log the source node when validating
Summary: Log the source node when validating so that we can more quickly reproduce any issues in a single step via the --walk-root option, rather than needing to run the entire walk again.

Differential Revision: D20098200

fbshipit-source-id: 6b0d7d151c97f25080953d6c0fbf431dc2cec6a8
2020-03-02 02:29:34 -08:00
Jun Wu
c718a5dc19 pathmatcher: add a test about a bug in globset/aho-corasick
Summary:
Also patch aho-corasick to fix the issue.

The issue was introduced by [an optimization path](063ca0d253) added in aho-corasick 0.7 series (used by globset 0.4.3).
aho-corasick 0.6.x (globset 0.4.2) are not affected.

The next aho-corasick release (0.7.9) contains the fix.

See https://github.com/BurntSushi/aho-corasick/issues/53 for more context.

Reported by: yns88

Reviewed By: DurhamG

Differential Revision: D20125697

fbshipit-source-id: 592375b43d7ee494bb3e916a1cb11c18f9ebe425
2020-02-28 22:09:28 -08:00
Jun Wu
c1535925cf pydag: do not take parentfunc at __init__
Summary:
`parentfunc` is only needed when adding new nodes to the DAG.
Move it to `addheads` methods instead.

Reviewed By: sfilipco

Differential Revision: D20155398

fbshipit-source-id: 0bddd5f46e84c44891928b9f598a38206917aecb
2020-02-28 17:45:27 -08:00
Jun Wu
26127e91ec debugstrip: repo.revs -> repo.nodes
Summary: One step towards removing usage of revision numbers.

Reviewed By: sfilipco

Differential Revision: D20155397

fbshipit-source-id: f4f3823146217afd8be75120e46901691fbd24cd
2020-02-28 17:45:27 -08:00
Jun Wu
10bb5a144e revset: replace some repo.revs with repo.nodes
Summary:
Migrate away from some uses of revision numbers.
Some dead code in discovery.py is removed.

I also fixed some test issues when I run tests locally.

Reviewed By: sfilipco

Differential Revision: D20155399

fbshipit-source-id: bfdcb57f06374f9f27be51b0980652ef50a2c8e0
2020-02-28 17:45:26 -08:00
Jun Wu
5d253a75df amend: remove hiddenoverride
Summary:
`hiddenoverride` is a hacky implementation that preserves part of another hacky
`inhibit` extension. With our modern setup (inhibit or narrow-heads),
`hiddenoverride` is less useful. Therefore just remove it.

Reviewed By: sfilipco

Differential Revision: D20148011

fbshipit-source-id: f4a5f05b67ae6f315e9b07d50ef03018d6d05df5
2020-02-28 17:45:26 -08:00
Jun Wu
5b15556e60 pydag: replace SpanSet with NameSet in NameDag public APIs
Summary:
This makes it so that DAG calculations in NameDag are all using commit hashes.

The `id2node`, `node2id` APIs are still using integer ids, and hopefully their
usage can eventually be removed.

Reviewed By: sfilipco

Differential Revision: D20020527

fbshipit-source-id: ee32b1ccacabd5174ff1556e426b5ed32d2b8507
2020-02-28 16:35:25 -08:00
Jun Wu
7c6a84c8f5 pydag: add wrappers for NameSet
Summary:
This exposes the NameSet type to the Python world.
The code is similar to the SpanSet wrapper that exists in pydag.

Reviewed By: sfilipco

Differential Revision: D20020521

fbshipit-source-id: 840e009eadca7154f11ca61561da4c48022088f6
2020-02-28 16:35:25 -08:00
Jun Wu
0220b4a0c3 nameset: make NameIter Send
Summary: This makes it possible to use NameIter in py_class.

Reviewed By: sfilipco

Differential Revision: D20020529

fbshipit-source-id: b9147b7dccb38d18d8361b420507fcbe97e01351
2020-02-28 16:35:25 -08:00
Jun Wu
2bbbd3d956 pydag: handle null node special case
Summary:
Mercurial has a special case that b'\0' * 20 maps to rev -1 and means
"an empty commit". This cannot be cleanly supported by the zstore commit data,
since sha1("") is not '\0' * 20 and zstore does not allow faked SHA1 keys.

Therefore let's add the special case in the bindings layer. It's possible to
do this check in Python, but that'll be slower.
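
For illustration, the special case looks roughly like this when written in Python (the slower option mentioned above). The `node_to_rev` helper and its `lookup` callback are hypothetical names for this sketch.

```python
import hashlib
from typing import Callable

NULL_NODE = b"\0" * 20  # Mercurial's null node
NULL_REV = -1


def node_to_rev(node: bytes, lookup: Callable[[bytes], int]) -> int:
    """Map a node to a rev, handling the null node before touching the store."""
    if node == NULL_NODE:
        return NULL_REV
    return lookup(node)


# The SHA-1 of empty content is not twenty zero bytes, so a content-addressed
# store cannot hold an entry under the null node key.
assert hashlib.sha1(b"").digest() != NULL_NODE
```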

Reviewed By: sfilipco

Differential Revision: D20020520

fbshipit-source-id: 0686832666646f2e201035992e3951b47c32eb5a
2020-02-28 16:35:24 -08:00
Jun Wu
13d6e7c92f pydag: use NameDag
Summary: Use the new NameDag as the backing structure and expose its APIs.

Reviewed By: sfilipco

Differential Revision: D20020528

fbshipit-source-id: ccb49e1a5e757bd35a3f71cfb54ceccfb544664e
2020-02-28 16:35:24 -08:00
Jun Wu
782f2017aa dag: add hex prefix lookup
Summary: This will be used by commit hash prefix lookup.
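
One simple way to picture prefix lookup is a binary search over sorted hex names. This is a swapped-in technique for illustration and an assumption of this sketch, not how the `dag` crate implements it; `lookup_hex_prefix` is a hypothetical name.

```python
import bisect
from typing import List


def lookup_hex_prefix(
    sorted_hex_names: List[str], prefix: str, limit: int = 2
) -> List[str]:
    """Return up to `limit` names starting with `prefix` from a sorted list."""
    matches: List[str] = []
    # In a sorted list, all names sharing a prefix are contiguous and start
    # at the insertion point of the prefix itself.
    index = bisect.bisect_left(sorted_hex_names, prefix)
    while index < len(sorted_hex_names) and len(matches) < limit:
        name = sorted_hex_names[index]
        if not name.startswith(prefix):
            break
        matches.append(name)
        index += 1
    return matches
```

Asking for two matches is enough to tell an unambiguous prefix (one result) from an ambiguous one (two results).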

Reviewed By: sfilipco

Differential Revision: D20020523

fbshipit-source-id: f2905ddf63098704b08dad8eb48272c3ffba7e25
2020-02-28 16:35:24 -08:00
Jun Wu
12441f48bf dag: re-export common types at top-level
Summary: Export common types at the top-level of the crate so it's easier to use.

Reviewed By: sfilipco

Differential Revision: D20020526

fbshipit-source-id: e9a0a8bc3cc91f81d0bc74e7530cd4613fc1dd61
2020-02-28 16:35:23 -08:00
Jun Wu
bc9f72ccf3 dag: implement DAG algorithms on NameDag
Summary: Those just delegate to IdDag for the actual calculation.

Reviewed By: sfilipco

Differential Revision: D20020522

fbshipit-source-id: 272828c520097c993ab50dac6ecc94dc370c8e8b
2020-02-28 16:35:23 -08:00
Jun Wu
b88da34fb0 dag: expose NameDag in tests
Summary: This allows tests to check NameDag APIs.

Reviewed By: sfilipco

Differential Revision: D20020525

fbshipit-source-id: 4ee8e4bcbd0731512ba17068e827b8045fc5d522
2020-02-28 16:35:23 -08:00
Jun Wu
194cd25f4f dag: add Arc<IdMap> to NameDag
Summary: This will be used to produce NameSet.

Reviewed By: sfilipco

Differential Revision: D20020519

fbshipit-source-id: abf6d73f2b985b74560d6b5db2800ff25450f02e
2020-02-28 16:35:22 -08:00
Jun Wu
7a343271b9 dag: rename NameDag::parents to NameDag::parent_names
Summary: This matches IdDag::parents (taking a set) and IdDag::parent_ids.

Reviewed By: sfilipco

Differential Revision: D20020524

fbshipit-source-id: 6e90727c355a7400f9a23e0b25e3392bdc032f49
2020-02-28 16:35:22 -08:00
Jun Wu
e3b28a683c nameset: add fast paths for DagSet
Summary: DagSet's SpanSet has fast paths for set operations. Use them.

Reviewed By: sfilipco

Differential Revision: D19912104

fbshipit-source-id: 24b55aa14d03be2f1be59c923e0b8e79d6bcbe6d
2020-02-28 16:35:22 -08:00
Jun Wu
587b06efee nameset: AllSet
Summary: This is similar to hg's fullreposet. It'll be useful as a dummy "subset".

Reviewed By: sfilipco

Differential Revision: D19912108

fbshipit-source-id: 33a95bcb3cf5931a431a1201d1a1f3c627cec7a1
2020-02-28 16:35:21 -08:00
Jun Wu
d41c55a13b nameset: SortedSet
Summary: SortedSet is a wrapper around another set that marks it as topologically sorted.

Reviewed By: sfilipco

Differential Revision: D19912111

fbshipit-source-id: 2637e8fd29b97f6db0c5bae3f0decd7ac382eeb1
2020-02-28 16:35:21 -08:00
Jun Wu
51bea7aff7 nameset: LazySet
Summary: Similar to Mercurial's smartset.generatorset.

Reviewed By: sfilipco

Differential Revision: D19912110

fbshipit-source-id: 7d940b8578ec7090282e2addb1fde871cddb2b25
2020-02-28 16:35:20 -08:00
Jun Wu
5e451d07b1 nameset: DagSet
Summary:
Wraps SpanSet + IdMap so it only exposes commit names without ids.
There is no equivalent smartset in Mercurial.

Reviewed By: sfilipco

Differential Revision: D19912112

fbshipit-source-id: 0d257de11527dfa8836065ac94f652730a97a468
2020-02-28 16:35:20 -08:00
Jun Wu
e7e7a5b356 nameset: StaticSet
Summary: Similar to Mercurial's smartset.baseset. All names are statically known.

Reviewed By: sfilipco

Differential Revision: D19912105

fbshipit-source-id: e4fcf2d59291adb3ca01b3b90f1ac32c65ad7eaa
2020-02-28 16:35:20 -08:00
Jun Wu
349d1bc33e nameset: IntersectionSet
Summary: Similar to Mercurial's smartset.filterset.

Reviewed By: sfilipco

Differential Revision: D19912113

fbshipit-source-id: 7cf2101b2eb7ba34b542199293cdbfd3973ef72f
2020-02-28 16:35:19 -08:00
Jun Wu
c0a1a3ab22 nameset: DifferenceSet
Summary: Similar to Mercurial's smartset.filterset.

Reviewed By: sfilipco

Differential Revision: D19912107

fbshipit-source-id: a3187c94f8e0c64f6d92e924ba46e83ce74c3e19
2020-02-28 16:35:19 -08:00
Stanislau Hlebik
168b74e38c mononoke: fix logging in bookmarks
Reviewed By: ahornby

Differential Revision: D20161053

fbshipit-source-id: 7c69bf9421dd9e55bc2ca805c2f14b9c4cd0e669
2020-02-28 13:24:29 -08:00
Stanislau Hlebik
9cf34d97ca mononoke: asyncify WarmBookmarksCache
Reviewed By: ikostia

Differential Revision: D20159967

fbshipit-source-id: dab201530416f17da4b4a3be6c4ecc04b2c10950
2020-02-28 13:24:28 -08:00
Xavier Deguillard
ffced54dff packaging: add edenscm/hgext/convert/repo
Summary: The python files were missing in the package, let's add them.

Reviewed By: quark-zju

Differential Revision: D20163637

fbshipit-source-id: 0a7870a21c42d9b92a8b78b51e4954db0d96c593
2020-02-28 12:15:10 -08:00
Durham Goode
a50d0da7fe py3: fix blame tests
Summary:
Blame can use a templater which doesn't support bytes. Let's just force
all blame output to be unicode, since it doesn't make a ton of sense to blame
binary files anyway.

Also fix test-annotate.py

Reviewed By: quark-zju

Differential Revision: D19907530

fbshipit-source-id: a7a47246368ed50f65486e824f93552872adc09a
2020-02-28 11:32:16 -08:00
Durham Goode
54484268fb py3: more commit cloud fixes
Summary:
Notably, we drop all the encoding business when dealing with json
objects, and instead use mercurial.json.

Reviewed By: sfilipco

Differential Revision: D19888130

fbshipit-source-id: 2101c32833484c37ce4376a61220b1b0afeb175a
2020-02-28 11:32:16 -08:00
Durham Goode
84a42f3471 py3: fix a number of commit cloud tests
Reviewed By: xavierd

Differential Revision: D19888131

fbshipit-source-id: ce1bc011bf76e8cf4bb9bdc0930b8c916229d66d
2020-02-28 11:32:15 -08:00
Durham Goode
98ed0fc5b0 py3: fix a few test-dirstate* tests
Reviewed By: xavierd

Differential Revision: D19888129

fbshipit-source-id: 947ea1bd9c5425fe3babcc60d6b885bde8fc4e2f
2020-02-28 11:32:15 -08:00
Thomas Orozco
82027505a0 mononoke/mercurial: add tests for metadata extraction
Summary:
I noticed in my earlier Bytes 0.5 diff that this doesn't have local test
coverage (there might be things somewhere else in the test suite that look for
it). Let's add some.

Reviewed By: ahornby

Differential Revision: D20139437

fbshipit-source-id: c17e4516574d674bb0b009cd1f322008fb3c1a79
2020-02-28 10:54:04 -08:00
Jun Wu
b6cea95ea5 dag: use Bytes to avoid some VertexName copies
Summary:
This is an example of how to use the new Bytes type. The performance change
is not obviously visible in benchmarks since the bottleneck is not at the bytes
copying.

Reviewed By: DurhamG

Differential Revision: D19818720

fbshipit-source-id: a431ae206cfa4fa08b2e162a48b3d7cbcd900f7f
2020-02-28 09:23:59 -08:00
Jun Wu
76ab726056 dag: switch from bytes to minibytes
Summary: The APIs are compatible so the switch is straightforward.

Reviewed By: DurhamG

Differential Revision: D19818713

fbshipit-source-id: 504e9149567c90eb661804e0dad20580a401aa76
2020-02-28 09:23:59 -08:00
Jun Wu
9e3920ca1c dag: fix benchmarks
Summary: D19559127 forgot those files.

Reviewed By: DurhamG

Differential Revision: D19818715

fbshipit-source-id: 92321492eae89ed9f748800b3bfcc306a54aab20
2020-02-28 09:23:59 -08:00
Jun Wu
c417232b1b mutationstore: update lag_threshold
Summary:
D20042045 changes the meaning of "lag_threshold". Update the value in mutation
store accordingly.

Reviewed By: DurhamG

Differential Revision: D20043116

fbshipit-source-id: 154e6dc2aa88ab0a9a9b21929ae5fa6163dcd403
2020-02-28 09:23:59 -08:00
Jun Wu
1962fd5f5b indexedlog: update lagging indexes at open time
Summary:
Previously indexes are only updated at `sync()` time. This diff makes it so
`open()` can also update lagging indexes. This should make index migration
(ex. D19851355) smoother - indexes are built in time and users suffer less from
the absence of indexes.

Reviewed By: DurhamG

Differential Revision: D20042046

fbshipit-source-id: 20412661a0ca4f5f67b671137c47b6373a42981d
2020-02-28 09:23:58 -08:00
Jun Wu
6da3bdadd2 indexedlog: extract logic writing indexes to disk to a method
Summary: The logic is currently only used by `sync()`. I'd like to reuse it at `open()`.

Reviewed By: DurhamG

Differential Revision: D20042044

fbshipit-source-id: 5c9734ff68bdcf8f8c8710c6a821b18d3afeaca0
2020-02-28 09:23:58 -08:00
Jun Wu
afb24f8a8a indexedlog: change IndexDef.lag_threshold from bytes to entries
Summary:
This is more friendly for indexedlog users - deciding lag_threshold by number
of entries is easier than by bytes.

Initially, I thought checking `bytes` is cheaper and checking `entries` is more
expensive. However, practically we will have to build indexes for `entries`
anyway. So we do know the number of entries lagging behind.

Reviewed By: DurhamG

Differential Revision: D20042045

fbshipit-source-id: 73042e406bd8b262d5ef9875e45a3fd5f29f78cf
2020-02-28 09:23:58 -08:00
Jun Wu
55363a78a7 indexedlog: add API to convert &[u8] to zero-copy Bytes
Summary:
This can be useful for users of indexedlog when they want `Bytes` (to get rid
of the lifetime parameter).

This might be useful for storage layer that wants to take the ownership of the
returned bytes.

Reviewed By: xavierd

Differential Revision: D19818714

fbshipit-source-id: cb2d4e7deff921915e07454fee15cb94a3d5c00d
2020-02-28 09:23:57 -08:00
Jun Wu
556850e715 indexedlog: remove unused mmap utility functions
Summary: Those utilities are no longer necessary since the new code uses Bytes.

Reviewed By: xavierd

Differential Revision: D19818717

fbshipit-source-id: 0b43af0f1eae1a4288e84d4170db058b27f80334
2020-02-28 09:23:57 -08:00
Jun Wu
aaf59c569d indexedlog: replace Mmap with Bytes in Log
Summary: This simplifies the code a bit and makes it cheaper to clone the Log.

Reviewed By: xavierd

Differential Revision: D19818716

fbshipit-source-id: bbf07b8b36009d53b63d8066ec422fc3c3796840
2020-02-28 09:23:57 -08:00
Jun Wu
90ee3cb05a indexedlog: remove ChecksumTable
Summary: It's no longer used since Index now has inlined its checksum logic.

Reviewed By: ikostia

Differential Revision: D19850744

fbshipit-source-id: eb134e4c1613573a2d238710b44ad8119c80a5ee
2020-02-28 09:23:56 -08:00
Jun Wu
a1601bfdd9 indexedlog: bump index filename
Summary:
Change index filename and metadata name. This makes sure the new format and old
format are separate so upgrading or downgrading won't have issues.

Reviewed By: DurhamG

Differential Revision: D19851355

fbshipit-source-id: 25dee018073a90040f5818b32b753a3f589c10e0
2020-02-28 09:23:56 -08:00
Jun Wu
6f4bf325d5 indexedlog: write Checksum inline with Log
Summary:
Enhance the index format: The Root entry can be followed by an optional
Checksum entry which replaces the need of ChecksumTable.

The format is backwards compatible since the old format will be just
treated as "there is no ChecksumTable", and the ChecksumTable will be built on
the next "flush".

This change is non-trivial. But the tests are pretty strong - the bitflip test
alone covered a lot of issues, and the dump of Index content helps a lot too.

For the index itself (without the ".sum" checksum), this change is bi-directionally
compatible:
1. New code reading old file will just think the old file does not have the
   checksum entry, similar to new code having checksum disabled.
2. Old code will think the root+checksum slice is the "root" entry. Parsing
   the root entry is fine since it does not complain about unknown data at the
   end.

However, this change dropped the logic updating ".sum" files. That part is an
issue blocking old clients from reading new data.

Reviewed By: DurhamG

Differential Revision: D19850741

fbshipit-source-id: 551a45cd5422f1fb4c5b08e3b207a2ffe3d93dea
2020-02-28 09:23:55 -08:00
Jun Wu
b9e3046a8d indexedlog: add Checksum entry to Index
Summary:
To solve the soundness issue of ChecksumTable raised by the last diff,
I plan to move Checksum logic to Index. This has multiple benefits:
- Solve the soundness issue of ChecksumTable.
- Indexedlog no longer writes the ".sum" files. `atomic_write` can be quite
  slow (tens of milliseconds) on Windows. So this should help perf - with
  many indexes, it can save hundreds of milliseconds on Windows per
  indexedlog sync.

This diff adds the definition and serialization of the new Checksum entry.
The index format is not updated yet.

Reviewed By: markbt

Differential Revision: D19850742

fbshipit-source-id: df6e6ed12a12ef0d2a782dc9d6b4dc5dec3f4b46
2020-02-28 09:23:55 -08:00
Jun Wu
0f09413ed4 indexedlog: add a broken test showing checksum_table is racy
Summary:
With the last change, mmap cost is reduced, but ChecksumTable is unsound in a
corner case: the buffer to check is shorter than what ChecksumTable covers:

    checksum:  |----chunk----|----chunk----|----chunk--|
    buf:       |-------------------------------|       |
                                               ^       ^
                                        logic len     physical len

The checksum table will be unable to verify the last chunk, since it does not
have enough data in buf.
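
The corner case can be reproduced with a toy chunked checksum. This is a minimal sketch, not indexedlog's actual ChecksumTable: the `CHUNK` size, the use of CRC32, and the function names are assumptions made for illustration.

```python
import zlib
from typing import List, Optional

CHUNK = 4  # hypothetical chunk size, kept tiny for illustration


def build_checksums(data: bytes) -> List[int]:
    """Checksum each CHUNK-sized piece of the full (physical) content."""
    return [zlib.crc32(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)]


def verify(buf: bytes, physical_len: int, checksums: List[int]) -> List[Optional[bool]]:
    """Check buf chunk by chunk; None means the chunk cannot be verified."""
    results: List[Optional[bool]] = []
    for idx, expected in enumerate(checksums):
        start = idx * CHUNK
        end = min(start + CHUNK, physical_len)
        if len(buf) < end:
            # buf stops before the end of this chunk, so its checksum
            # cannot be recomputed from the data we were given.
            results.append(None)
        else:
            results.append(zlib.crc32(buf[start:end]) == expected)
    return results


physical = b"abcdefghijk"        # 11 bytes on disk, covered by 3 chunks
table = build_checksums(physical)
logical = physical[:9]           # the buffer only spans the logical length
print(verify(logical, len(physical), table))  # [True, True, None]
```

The trailing `None` is the unsound spot: the last chunk is covered by the table, but the shorter buffer does not hold enough bytes to check it.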

The issue is exposed by stress testing the multithread sync tests. It's not
always easy to reproduce, though.

Reviewed By: markbt

Differential Revision: D19850745

fbshipit-source-id: a1a96080163b7b9b56dcd6c1673d5d8d10e18a2b
2020-02-28 09:23:55 -08:00
Jun Wu
1e10527482 indexedlog: share Bytes between Index and ChecksumTable
Summary: This avoids some extra mmap syscalls by ChecksumTable.

Reviewed By: xavierd

Differential Revision: D19818721

fbshipit-source-id: dace55193f2b4b0f35e3868781faa2d2998d3b58
2020-02-28 09:23:54 -08:00
Jun Wu
1ece621c4d indexedlog: replace Mmap with Bytes in Index
Summary:
This simplifies the code a bit (no special cases about 0-sized mmap buffers)
and makes it cheaper to clone the index buffer (just an Arc::clone, without
another mmap syscall).

Reviewed By: xavierd

Differential Revision: D19818718

fbshipit-source-id: e96d42af74c7f0bb11703c5da31cdfbd5d76c372
2020-02-28 09:23:54 -08:00
Jun Wu
918672b106 tracing-collector: support owned strings in TreeSpans
Summary:
TreeSpans used to use `&str`, which adds a lifetime to the struct, making it
harder to be used in the Python land. Use a type parameter so TreeSpans<String>
can be used.

Reviewed By: DurhamG

Differential Revision: D19797708

fbshipit-source-id: c66429abfaf16d876151ca6f29da976bed91485d
2020-02-28 09:16:14 -08:00
Jun Wu
4cd7df6a01 tracing-collector: rename structs
Summary:
TreeSpan -> RawTreeSpan; TreeSpanWithMeta -> TreeSpanRef.

I'm going to add a non-reference version of TreeSpanRef.

Differential Revision: D19797701

fbshipit-source-id: 42b04c23d4d0ddbe821b94fa2ccb133ce9eafa05
2020-02-28 09:16:14 -08:00
Jun Wu
957617c8b8 tracing-collector: support filtering in TreeSpans
Summary:
The filtering interface allows callsites to select what they want. It's similar
to manifest walk with files or directory matchers in source control.

Reviewed By: DurhamG

Differential Revision: D19784467

fbshipit-source-id: 5cf6e4016d6fa1c90f8aeccc50809baccd4af5ab
2020-02-28 09:16:13 -08:00
Jun Wu
366e701239 tracing-collector: support Events in TreeSpans
Summary: The idea is that instants (events) can be a drop-in replacement for `ui.log`.

Reviewed By: DurhamG

Differential Revision: D19782897

fbshipit-source-id: 795bbba23d921e460f723f19ef529b203aea366a
2020-02-28 09:16:13 -08:00
Jun Wu
d205592d42 tracing-collector: extract logic finding parent span to a function
Summary: This function will be reused by the next diff.

Reviewed By: DurhamG

Differential Revision: D19782895

fbshipit-source-id: 1e636eabee9b0dffd287a1e6784a24ab2259f51f
2020-02-28 09:16:13 -08:00
Jun Wu
8b5fdc01fc tracing-collector: put treespans into a struct
Summary: This allows us to define methods on the treespans, such as filtering APIs.

Reviewed By: DurhamG

Differential Revision: D19782896

fbshipit-source-id: 2e7bd8344c0196e382728c26a8233abf944bbf29
2020-02-28 09:16:12 -08:00
Alex Hornby
938830d3f6 mononoke: walker: add ability to track route to node
Summary: Add the ability to track the route to a node, so that one can report the node from which a failing step started.

Reviewed By: ikostia

Differential Revision: D20097615

fbshipit-source-id: 4f2c000f54bd212225533e7f3570178020f34a9d
2020-02-28 09:01:35 -08:00
Kostia Balytskyi
cec057adc5 mononoke: add some perf counters for hydrated getbundle responses
Summary:
In case this starts to cause problems, let's have a way to correlate those
problems with some exported metrics.

Reviewed By: StanislavGlebik

Differential Revision: D20158822

fbshipit-source-id: 6ac9e25861dbedaecdf04fd92bda835ae66535eb
2020-02-28 08:30:43 -08:00
Kostia Balytskyi
7ed52ee31b mononoke: return hydrated bundles for infinitepush, if config says so
Summary:
## Wider goal
See D20068839

## This diff
This diff actually implements the conditional hydration of `getbundle`
responses, as described in the D20068839.

Note that as well as implementing support for hydrated `getbundle` responses, this diff also implements support for changegroup v3 and LFS in such responses, which is needed if we are to do this kind of thing in an LFS-enabled repository.

Reviewed By: StanislavGlebik

Differential Revision: D20068838

fbshipit-source-id: fbdd3f8f5fb7cd2cb60473a94094553a1d4b4d2f
2020-02-28 08:30:43 -08:00
Alex Hornby
7f09703c4c mononoke: walker: log per-run session id to scuba for validate
Summary:
Extend the session id logging to the validate command by adding ability to set
the progress reporter's scuba builder.

Reviewed By: ikostia

Differential Revision: D20074153

fbshipit-source-id: ceaeebdb7eb976080061ad3b76b22d7a0f7bd891
2020-02-28 04:57:09 -08:00
Alex Hornby
7baf1066ab mononoke: walker: fix performance regression in loading file data for compression-benefit
Summary: Fix performance regression in loading file data in compression-benefit subcommand

Reviewed By: StanislavGlebik

Differential Revision: D20142143

fbshipit-source-id: 0b9d93feaddab1df4b9d5777e0637f35aed2feda
2020-02-28 04:57:08 -08:00
Thomas Orozco
c6957c1f1e mononoke/newfilenodes: use for_sharded_connection()
Summary: I canaried with this but I forgot to fold it in -_-

Reviewed By: HarveyHunt

Differential Revision: D20158157

fbshipit-source-id: 4a570bbca421d8c3e1e66605f164f2b8e2a433f6
2020-02-28 04:53:03 -08:00
Kostia Balytskyi
d5080d20ce mononoke: asyncify get_manifest_and_filenodes in getbundle_response
Summary:
## Wider goal
See D20068839

## This diff
Modernize this particular function

Reviewed By: StanislavGlebik

Differential Revision: D20097802

fbshipit-source-id: fe76aaf2c0b65cf9b47a1dedc66d417d22cad255
2020-02-28 04:36:38 -08:00
Kostia Balytskyi
7755c4c4e6 mononoke: asyncify prepare_filenode_entries_stream in getbundle_response
Summary:
## Wider goal
See D20068839

## This diff
Modernize this particular function.

Reviewed By: krallin

Differential Revision: D20097805

fbshipit-source-id: bbcf371921d3a709cc7178ec50b7729bddf1f630
2020-02-28 02:49:57 -08:00
Thomas Orozco
c680696e40 mononoke: defer hook loading
Summary:
Most binaries don't need hooks. Let's not require them. This might not be very
long lived since Simon is working on removing lua hooks, but this was a trivial
fix.

Reviewed By: johansglock

Differential Revision: D20140026

fbshipit-source-id: cc74b37459f63c5dd550c5779b72aa1d6531202c
2020-02-28 02:03:07 -08:00
Thomas Orozco
515f4a507d mononoke/cachelob: remove Memcache blob write leases
Summary:
(this doesn't remove ad-hoc leases, like derived data)

Let's see if this has any impact on performance. We no longer fail Manifold
writes on conflicts, and

Reviewed By: StanislavGlebik

Differential Revision: D20038572

fbshipit-source-id: 4a972ff09ceb65e69a1d22a643a8f2d9b2ab1b17
2020-02-28 01:59:36 -08:00
David Tolnay
37a8401761 rust/thrift: Un-rename futures-preview dependency
Summary: The Thrift generated code depends only on futures 0.3, not 0.1. Thus it isn't necessary to depend on renamed:futures-preview and we can depend on futures-preview directly, which is exposed to Rust code as `futures::`.

Reviewed By: jsgf

Differential Revision: D20145921

fbshipit-source-id: 5cae94ec6747a374c2bf05f124ab237c798de005
2020-02-27 22:27:58 -08:00
David Tolnay
d8bd00ce36 rust/thrift: Drop unused dependencies on old futures in various places
Summary:
The last uses of futures 0.1 were removed in D18411564 and D18392252.

A later diff will switch thrift from using renamed:futures-preview to plain futures-preview to prepare for eliminating the -preview suffix.

Reviewed By: jsgf

Differential Revision: D20143832

fbshipit-source-id: b7fd79f18368ade59eeba6ed0ac09613000c046b
2020-02-27 22:24:10 -08:00
Adam Simpkins
e7e58e4eb1 make a few additional enhancements to the CLI telemetry code
Summary:
Add a `TelemetrySample.fail()` method to report error information in a sample,
even when it is used in a `with` context that completes without an exception.

Also add a `TestTelemetryLogger` to help check telemetry logging behavior in
unit tests.
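
A rough sketch of how such a `fail()` method could behave inside a `with` block. The class bodies below are guesses for illustration, not the actual EdenFS CLI code, and `RecordingTelemetryLogger` is a hypothetical stand-in playing the role of `TestTelemetryLogger`.

```python
class RecordingTelemetryLogger:
    """Hypothetical in-memory logger used to inspect samples in tests."""

    def __init__(self):
        self.samples = []

    def log(self, fields):
        self.samples.append(dict(fields))


class TelemetrySample:
    """Hypothetical sample that is logged when the `with` block exits."""

    def __init__(self, logger):
        self._logger = logger
        self.fields = {"success": True}

    def fail(self, error):
        # Record a failure even if no exception is raised.
        self.fields["success"] = False
        self.fields["error"] = error

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc is not None:
            self.fail(str(exc))
        self._logger.log(self.fields)
        return False  # never swallow the exception


logger = RecordingTelemetryLogger()
with TelemetrySample(logger) as sample:
    sample.fail("daemon not running")  # block completes normally
print(logger.samples)
```

The logged sample carries the error even though the block finished without raising.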

Reviewed By: genevievehelsel

Differential Revision: D20136170

fbshipit-source-id: ad94d044c7ae0835e3fe17aaa74eb92dfd41bf8e
2020-02-27 19:25:35 -08:00
Jun Wu
b1f8456309 phabdiff: allow reviewers to be used as a string
Summary: It was a list. Make it possible to use it as a string.

Reviewed By: xavierd

Differential Revision: D20144811

fbshipit-source-id: b280c0344215a4c23ab9c63d89f47adf34fb06f3
2020-02-27 19:21:44 -08:00
Jun Wu
ae8f6ff8e8 tests: opt-in DUMMYSSH_STABLE_ORDER for more tests
Summary: This should help reduce test flakiness.

Reviewed By: xavierd

Differential Revision: D19872952

fbshipit-source-id: d66f6c404534b3f47903b478e3cdfdda5ed46284
2020-02-27 17:54:08 -08:00
Durham Goode
becb7da2e3 py3: use 's' instead of 'C' for dirstatetuple parsing
Summary:
The state entry of a dirstate tuple is a single character. In python 3
it's a unicode string. To parse it, previously we used 'C' which takes a single
character unicode string and (little did I know) returns an int. We were storing
this in a char, which causes corruption.

Let's switch to reading the string, and just grabbing the first byte.

Reviewed By: xavierd

Differential Revision: D20143094

fbshipit-source-id: d9946c0cefdafe0941f4bdac070659fac27f30e3
2020-02-27 13:07:13 -08:00
Jeff Zhang
c517e81329 Push compat down deeper into subcommands & make subcommand functions async in eden/mononoke/cmds/admin/main.rs
Summary: Continue to push `compat()` deeper into subcommands. This enables us to refactor each file one at a time and ultimately remove the old futures from our code base.

Reviewed By: farnz

Differential Revision: D20132126

fbshipit-source-id: cc10dde6eda7ddcbf911dbe8d3ebe1713f8ec2ab
2020-02-27 12:39:28 -08:00
Thomas Orozco
b7dfbdd09d mononoke/newfilenodes: stop using i8 internally for is_tree
Summary: Makes the code a little nicer to work with.

Reviewed By: HarveyHunt

Differential Revision: D20138720

fbshipit-source-id: 19f228782ab3582739e35fddcb2b0bf952110641
2020-02-27 12:34:23 -08:00
Thomas Orozco
ed602e6009 mononoke/newfilenodes: retry on master when paths are missing
Summary:
Paths are in a different replica, so they can be missing even if copy info is
present. Let's fallback to master in this case.
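
The fallback could look roughly like this. It is a simplified sketch in which plain dictionaries stand in for the replica and master databases; `fetch_paths` and its arguments are hypothetical names.

```python
from typing import Dict, Iterable


def fetch_paths(
    path_hashes: Iterable[str],
    replica: Dict[str, str],
    master: Dict[str, str],
) -> Dict[str, str]:
    """Look paths up on the replica first, then retry misses on master."""
    wanted = list(path_hashes)
    found = {h: replica[h] for h in wanted if h in replica}
    missing = [h for h in wanted if h not in found]
    if missing:
        # The replica may lag behind, so only the misses go to master.
        found.update({h: master[h] for h in missing if h in master})
    return found
```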

Differential Revision: D20098902

fbshipit-source-id: 838ab1c70a74420c431a2f442f1504c8edd29a2e
2020-02-27 12:34:23 -08:00
Thomas Orozco
4d2932c43b mononoke/newfilenodes: switch to a virtual sharding strategy
Summary:
Locking by physical shard worked earlier in this stack as indicated in the
benchmarks, but after Ondemand restored their fetching for www, it proved
insufficient in terms of parallelism, and resulted in substantially slower
gettreepacks.

Besides, with the "physical sharding" approach, we found ourselves between a rock and a hard place in terms of what to do with paths:

- We could keep holding the semaphore for a filenode while fetching paths. This is undesirable because it further limits our level of concurrency (because fetching a filenode + paths is going to be at least 2x as slow as fetching a filenode).
- We could fetch them without holding a lease at all. This is even more undesirable, because it means that when we release the semaphore for a given shard, we haven't filled the cache yet. This means that if we have a queue of 2 requests for the same bit of data, we're going to fetch twice (task A acquires the lock, goes to MySQL for the filenode, releases the lock and starts going to paths, at which point task B acquires the lock and goes to MySQL again since the filenode hasn't been filled yet).

To fix this, I had to add a dedicated cache for paths, and put it behind semaphores as well. In the example above, this would ensure task B finds a "partial filenode" in the cache and doesn't go to MySQL (instead, it goes straight up to queuing for access to paths, where it will wait behind task A and also won't hit MySQL).

There are a few problems with this:

- It's a lot of extra complexity (because we need to handle half misses where we have the filenode but not the path).
- It ties together our level of concurrency a second time to that of the underlying number of physical shards, which is kinda meaningless when some of this data can be provided by Memcache to begin with.

This diff fixes both problems.

The root cause of our problem is that we're tying our level of concurrency to physical
MySQL shards, whereas what we actually want is a tunable level of concurrency
that matches our work load, yet effectively deduplicates queries.

In this diff, I'm updating our exclusive locking to be purely virtual. This
means that we're still not over-fetching, but we are no longer constrained by
the parallelism of the underlying DB (this does mean we might queue up requests
there, but they won't be duplicate requests).
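
As an illustration of purely virtual locking, here is a Python asyncio analogue (the real code is Rust). `VirtualShardedFetcher`, its `fetch` callback, and the default of 8 virtual shards are all hypothetical.

```python
import asyncio
import zlib


class VirtualShardedFetcher:
    """Deduplicate fetches with a tunable number of virtual locks."""

    def __init__(self, fetch, num_virtual_shards=8):
        # Create this inside a running event loop.
        self._fetch = fetch  # async callable: key -> value (hypothetical)
        self._locks = [asyncio.Lock() for _ in range(num_virtual_shards)]
        self._cache = {}

    def _lock_for(self, key: str) -> asyncio.Lock:
        # The lock is picked from the key alone, not from the MySQL shard.
        return self._locks[zlib.crc32(key.encode()) % len(self._locks)]

    async def get(self, key: str):
        async with self._lock_for(key):
            if key in self._cache:
                return self._cache[key]
            value = await self._fetch(key)
            self._cache[key] = value
            return value
```

The number of virtual shards can be tuned to the workload independently of how many physical shards exist, while concurrent requests for the same key still collapse into a single fetch.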

This also results in simpler code, and opens up the way for further
improvements in the future, such as using Memcache lease-get operations to
further deduplicate calls, if we'd like.

As part of that, I've also updated our remote_cache to use the same CacheKey
entity as the local cache, to avoid spending time producing new keys when we
have perfectly good ones available.

Reviewed By: StanislavGlebik

Differential Revision: D20097821

fbshipit-source-id: 03d7be9082982fc1c6ef365d541c1ed8ae3e6e8d
2020-02-27 12:34:23 -08:00
Thomas Orozco
b4e8201d4c mononoke/newfilenodes: track perf counters appropriately
Summary: Let's record perf counters properly.

Reviewed By: StanislavGlebik

Differential Revision: D20097823

fbshipit-source-id: 0daed281d3c080fcbe7b4fac996fb265bdd6d408
2020-02-27 12:34:22 -08:00
Thomas Orozco
500baffb5c mononoke/newfilenodes: add tests for cache fill behavior
Summary:
This adds a test for our cache fill behavior, which is to fill the remote cache
if we miss in local cache. I hadn't added this earlier and it's a little easier
to add now that the refactor for FilenodeInfo is through.

Reviewed By: ahornby

Differential Revision: D19905396

fbshipit-source-id: 88b5fd83f5d2213e91efc3c5dfb91dfe4e395136
2020-02-27 12:34:22 -08:00
Thomas Orozco
95d463ce47 mononoke/filenodes: Remove path from FilenodeInfo
Summary:
This updates our filenodes implementation to use different types for writing
(`PreparedFilenode`) and reading (`FilenodeInfo`).

The bottom line is that this avoids a bunch of cloning of paths on the read
path, which doesn't need to return the path to the caller, since the caller
already knows it! We can also take it out of Memcache, since we don't need
Memcache to tell us the path for a blob we could only possibly have found by
having the path to begin with.

This does update our filenodes serialization format. I bumped MC_CODEVER
accordingly.

Reviewed By: StanislavGlebik

Differential Revision: D19905400

fbshipit-source-id: 6037802c1773de564cade8e264d36087382ee15a
2020-02-27 12:34:21 -08:00
Thomas Orozco
7fa9607859 mononoke/newfilenodes: remove sqlfilenodes
Summary:
This removes the old sqlfilenodes implementation, since we're now using the new
one. There's also a bit of cruft here and there we can get rid of.

Reviewed By: StanislavGlebik

Differential Revision: D19905395

fbshipit-source-id: 2526b6d65eeb981f5aedda9951b44b389ecec29d
2020-02-27 12:34:21 -08:00
Thomas Orozco
149e15f2ad mononoke: use spawn_future in getpack to fetch history
Summary:
The former implementation would eagerly query Memcache when fetching history
(due to how old futures work) for files in getpack, but the new one does not.
This means the new one loses out on a lot of buffering, which the old one used
to do.

This diff emulates the old behavior by eagerly querying filenodes in getpack,
which improves performance on a very big getpack (32K files) by about 3x, and
makes it 30% faster than the old code, instead of > 2x slower.
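
The lazy versus eager distinction has a close analogue in Python's asyncio, sketched below. It is only an analogy for the Rust behavior described above: `fetch_history` is a hypothetical stand-in for a history lookup, and `asyncio.create_task` plays the role of spawning the future.

```python
import asyncio


async def fetch_history(name, started):
    started.append(name)       # record when the fetch actually begins
    await asyncio.sleep(0.01)  # simulated Memcache / SQL latency
    return name + "-history"


async def lazy(names):
    """Coroutines do nothing until awaited, so fetches run one at a time."""
    started, snapshots = [], []
    pending = [fetch_history(n, started) for n in names]
    for coro in pending:
        await coro
        snapshots.append(len(started))
    return snapshots


async def eager(names):
    """Tasks start right away, so later fetches overlap with the first."""
    started, snapshots = [], []
    tasks = [asyncio.create_task(fetch_history(n, started)) for n in names]
    for task in tasks:
        await task
        snapshots.append(len(started))
    return snapshots


print(asyncio.run(lazy(["a", "b", "c"])))   # [1, 2, 3]
print(asyncio.run(eager(["a", "b", "c"])))  # [3, 3, 3]
```

Each snapshot counts how many fetches had started by the time a result was consumed. In the eager version all three are already in flight when the first result arrives.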

Note that I'm not certain we really want to do this kind of aggressive
buffering in getpack long term, but for now, I'd like to keep this unchanged.

Reviewed By: StanislavGlebik

Differential Revision: D19905398

fbshipit-source-id: 49f9a2cd505a98123fd1dabb835e8e378d45c930
2020-02-27 12:34:21 -08:00
Thomas Orozco
f6866eb97d mononoke: switch to new filenodes implementation
Summary:
This updates Mononoke to use the new filenodes implementation introduced
earlier in this stack.

See the test plan for detailed performance results supporting why I'm making
this change.

Reviewed By: StanislavGlebik

Differential Revision: D19905394

fbshipit-source-id: 8370fd30c9cfd075c3527b9220e4cf4f604705ae
2020-02-27 12:34:20 -08:00
Thomas Orozco
a039745642 mononoke/newfilenodes: introduce timeouts talking to Memcache, MySQL
Summary:
Since we have one connection per shard, it's a good idea to make sure we don't
keep those locked for too long. This diff adds generous timeouts to protect
against this, as well as ODS reporting to track errors.
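
A minimal sketch of the idea using Python's asyncio (the real implementation is Rust). `with_timeout` and the `error_counts` dictionary are hypothetical stand-ins for the timeout wrapper and the error counters.

```python
import asyncio


async def with_timeout(operation, seconds, error_counts, name):
    """Await `operation`, counting and re-raising a timeout."""
    try:
        return await asyncio.wait_for(operation, timeout=seconds)
    except asyncio.TimeoutError:
        # Track the error so it can be reported, then let the caller decide.
        error_counts[name] = error_counts.get(name, 0) + 1
        raise
```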

Reviewed By: StanislavGlebik

Differential Revision: D19905393

fbshipit-source-id: ee4f4d3e33cf48a9002b016e31d37a401c6578f2
2020-02-27 12:34:20 -08:00
Thomas Orozco
c31b7d9ef9 mononoke/newfilenodes: introduce remote caching
Summary:
This introduces caching of filenodes to Memcache as in the old filenodes
implementation. The code was mostly ported over from the existing filenodes
implementation, and converted to async / await. However, one key difference is
that the lookups happen once we hold the semaphore to talk to the underlying
MySQL shard.

The reason for this is:

- Reads to Memcache are really fast. They're often under 1ms. If you're going
  to miss in Memcache and have to go to SQL, it won't make you much slower.
- Reads to Memcache are kinda expensive CPU-wise. Data in Memcache is
  compressed, and we often see a lot of our CPU cycles spent talking to Memcache
  when we're under load.
- Memcache isn't an infinite resource. If we're reading the exact same
  key a hundred times, that's going to hit the same Memcache box. A bit of
  deduplication on our end is a nice thing to strive for. Besides, our own
  thread pool we use to talk to Memcache is limited in size.

From a performance perspective, this doesn't make things any slower, but
reduces CPU usage when we'd otherwise have a lot of duplicate fetching.

Finally, note that this update also includes support for dirty-tracking in our
local cache. We use this to know if we should fill the remote cache (if we 100%
hit in local cache, we don't fill the remote cache).
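
One way to picture the dirty-tracking rule is the simplified sketch below. The `LocalCache` class, the `fetch` helper, and the dictionary standing in for Memcache are assumptions and do not mirror the actual Rust types.

```python
class LocalCache:
    """Hypothetical local cache that remembers whether a lookup missed."""

    def __init__(self):
        self._data = {}
        self.dirty = False

    def get_or_fill(self, key, load):
        if key not in self._data:
            self._data[key] = load(key)
            self.dirty = True  # something new was added locally
        return self._data[key]


def fetch(keys, local, remote, load):
    """Fill the remote cache only if the local cache missed at least once."""
    local.dirty = False
    values = [local.get_or_fill(k, load) for k in keys]
    if local.dirty:
        remote.update(zip(keys, values))
    return values
```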

Reviewed By: StanislavGlebik

Differential Revision: D19905390

fbshipit-source-id: 363f638bb24cf488c7cd3a8ecea43e93f8391d3f
2020-02-27 12:34:19 -08:00
Thomas Orozco
1c94a586f0 mononoke/newfilenodes: introduce local caching
Summary:
This is the meat of the change I'm trying to make here. This updates
newfilenodes to check their cache before dispatching queries to MySQL once they
acquire the connection.

Since we only get one connection per shard, this ensures that we don't query
several times for the same piece of data.

Note that the caching structure is a little different from the old one, which
cached entire filenode info. Instead, this now caches the exact data we'd get
out of MySQL, since we want to map MySQL queries 1-1 to cache lookups.

With this change, we also now have a local cache for file history queries.
Historically, we hadn't cached those at all, but with this change, we can get a
lot of value out of caching them even for a small period of time in order to
de-amplify reads to MySQL and Memcache.

However, they are in separate cache pools to make sure they don't evict point
filenodes, which we use for gettreepack (and have a good hit rate, unlike
history blocks, which have a pretty poor hit rate).

Note that having those semaphored connections might feel a little scary, but
it's worth noting that the exact same bottleneck is implicitly present in the
existing filenodes implementation, since we can only have one active query to
any given shard at a given time. That said, this approach also gives us a little
more future flexibility, if we'd like, since we could map multiple semaphores to
"sub shards" that map N-to-1 to real, physical shards.

Reviewed By: HarveyHunt

Differential Revision: D19905391

fbshipit-source-id: 02b5efaa44789e6afcccdeb9ee2b4791f7c3c824
2020-02-27 12:34:19 -08:00
Thomas Orozco
ab4f7adaeb mononoke/newfilenodes: introduce a queue-conscious filenodes implementation
Summary:
This introduces a new implementation of filenodes that maintains its own
queuing on top of the queuing enforced by the SQL crate.

Later in this stack, the goal is for this implementation to avoid dispatching
duplicate queries when there is a lot of contention talking to MySQL, which
happens when large changes land and suddenly everyone wants the updated code.

The underlying goal is to avoid dispatching a lot of duplicate queries when
there is contention. Indeed, if there is contention, then the latency between
query and response increases. As a result, without visibility in the queue, the
following can happen:

- Task 1 looks for A in the cache. It misses
- Task 1 dispatches a SQL query
- Task 2 looks for A in the cache. It misses
- Task 2 dispatches a SQL query
- Task 3 looks for A in the cache. It misses
- Task 3 dispatches a SQL query
- ...
- Task 1's SQL query finally executes and fills the cache.
- All other queries execute anyway.

The longer the dispatch queue, the longer it takes to run those queries.
Looking at Mononoke's stats in prod, this happens pretty often:
https://pxl.cl/10xxmo (the spike at 3pm was a 10K-files change in fbsource, for
example).

The goal of this stack is to avoid this effect, by checking the cache only once
we know we're ready to go to SQL.
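
The effect can be reproduced in a small asyncio simulation. This is an illustrative sketch rather than the Mononoke code: a semaphore of size one stands in for the single connection to a shard, and all function names are hypothetical.

```python
import asyncio


async def naive_get(key, cache, conn, query):
    if key in cache:      # cache is checked before queuing
        return cache[key]
    async with conn:      # then we wait for the connection
        cache[key] = await query(key)
        return cache[key]


async def queue_conscious_get(key, cache, conn, query):
    async with conn:      # wait for the connection first
        if key in cache:  # check the cache once we are ready to query
            return cache[key]
        cache[key] = await query(key)
        return cache[key]


async def count_queries(getter, tasks=3):
    cache, issued = {}, []
    conn = asyncio.Semaphore(1)  # one connection per shard

    async def query(key):
        issued.append(key)
        await asyncio.sleep(0.01)  # simulated SQL latency
        return "filenode for " + key

    await asyncio.gather(*(getter("A", cache, conn, query) for _ in range(tasks)))
    return len(issued)


print(asyncio.run(count_queries(naive_get)))            # 3
print(asyncio.run(count_queries(queue_conscious_get)))  # 1
```

With the naive ordering every task misses the cache before the first query returns, so all three queries run. Checking the cache after acquiring the connection lets the second and third tasks reuse the first result.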

In this particular diff, what's added is:

- The SQL read and write implementation. This is all implemented using new
  futures, but the logic should be largely unchanged from before (i.e. we store
  filenodes and their associated copy info in shards by the filenode's path —
  not the source path if there is copy info —, and paths in their own shard).
  The queries themselves are largely unchanged from the existing filenodes, with
  only a few tweaks:
  - Filenodes and copy info are now selected in one go.
  - There are types to distinguish path hashes and paths.
- The structs to support this implementation.

Reviewed By: StanislavGlebik

Differential Revision: D19905397

fbshipit-source-id: bec981e7bfb396d62eb06e5ce249c21555afc64b
2020-02-27 12:34:19 -08:00
Thomas Orozco
341b4f1bc3 mononoke/filenodes: expect a Vec of filenodes to insert
Summary:
The API expects a stream of filenodes to insert, but we actually never used
that ability. Instead, every single callsite has a `Vec`, which it converts to
a stream and passes that in.

I'd like to change this for two reasons:

- It's un-necessary
- It makes the code more complex on the Filenodes implementation side, and less
  efficient, since we need to `chunk()` there in small chunks, which might not
  all be in the same shard. If we get the entire `Vec` at once, we can chunk on a
  per-shard basis (this happens later in this stack).

Besides, if we end up having a stream and wanting the old behavior, we can
always call `chunk()` the stream and call `add_filenodes` on each batch (which
is actually nicer because if you have a futures 0.2 stream that isn't static,
you can do this, but you can't turn it into a `BoxStream`!).
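
Per-shard chunking, which becomes possible once the whole `Vec` is available, might look like the following sketch. `chunk_by_shard` and the `shard_of` callback are hypothetical names, and the real code is Rust.

```python
from collections import defaultdict


def chunk_by_shard(filenodes, shard_of, chunk_size):
    """Group items by shard, then split each group into fixed-size batches."""
    by_shard = defaultdict(list)
    for filenode in filenodes:
        by_shard[shard_of(filenode)].append(filenode)

    batches = []
    for shard, items in sorted(by_shard.items()):
        for start in range(0, len(items), chunk_size):
            # Every batch targets exactly one shard.
            batches.append((shard, items[start:start + chunk_size]))
    return batches


print(chunk_by_shard(range(7), lambda n: n % 2, 2))
# [(0, [0, 2]), (0, [4, 6]), (1, [1, 3]), (1, [5])]
```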

Reviewed By: StanislavGlebik

Differential Revision: D19902537

fbshipit-source-id: a4c030c4a51afbb6e9db133b32464009eed197af
2020-02-27 12:34:18 -08:00
Xavier Deguillard
6fac9ebad0 revisionstore: add a get_stripped method to ContentStore
Summary:
This new method returns the content of a blob without the copy-from metadata
header.
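
For context, Mercurial file revisions carry copy metadata in a header delimited by `\x01\n` markers. The sketch below shows how such a header can be stripped. It assumes `get_stripped` does something equivalent, and `strip_metadata` is a hypothetical name.

```python
META_MARKER = b"\x01\n"


def strip_metadata(blob: bytes) -> bytes:
    """Return the file content without a leading copy-from metadata header."""
    if not blob.startswith(META_MARKER):
        return blob  # no header present
    end = blob.find(META_MARKER, len(META_MARKER))
    if end == -1:
        return blob  # malformed header: leave the data untouched
    return blob[end + len(META_MARKER):]


raw = b"\x01\ncopy: old/name\ncopyrev: 1234\n\x01\nhello\n"
print(strip_metadata(raw))  # b'hello\n'
```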

Reviewed By: DurhamG

Differential Revision: D20102889

fbshipit-source-id: e96f636b7d30460b59707a2cb700d667e616116a
2020-02-27 12:29:42 -08:00
Stanislau Hlebik
cc8be5997e mononoke: asyncify derived data
Reviewed By: krallin

Differential Revision: D20139701

fbshipit-source-id: 7f1c8370707eb415dd7e23d94eb923846f7ef59b
2020-02-27 12:17:54 -08:00
Durham Goode
88f9e15086 phrevset: use Mercurial json instead of Python json
Summary:
Python json produces unicode strings in the parsed results. This breaks
when passed to parts of the code that now assert that byte strings are required
(like the wire protocol). Let's switch phabricator stuff to use Mercurial json,
which produces bytes in Python 2 and unicode in Python 3.

Reviewed By: ikostia

Differential Revision: D20123140

fbshipit-source-id: d1b11426736a0f43ff7e74acf709ab1fd70d5bfe
2020-02-27 09:30:43 -08:00
Alex Hornby
e70f3dc76c mononoke: walker: log per-run session id to scuba for scrub
Summary:
Log a per-run session id to distinguish runs more easily.

This diff adds the session id for scrub logging; the following one extends this to validate/progress logging.

So that each tail has a separate session logged, setup is delayed until the start of each tail by passing it in as a function.

Differential Revision: D19907398

fbshipit-source-id: 8e5470918112321866c67c9f94e703fd46e6a16b
2020-02-27 09:00:44 -08:00
Thomas Orozco
f1121ccef6 mononoke: add a @nocommit hook
Reviewed By: HarveyHunt

Differential Revision: D20139540

fbshipit-source-id: 0be6d1aa8ad7ad1197197ec886f0cf44bd6b864d
2020-02-27 08:28:05 -08:00
Thomas Orozco
26ae726af5 mononoke: update internals to Bytes 0.5
Summary:
The Bytes 0.5 update left us in a somewhat undesirable position where every
access to our blobstore incurs an extra copy whenever we fetch data out of our
cache (by turning it from Bytes 0.5 into Bytes 0.4) — we also have quite a few
places where we convert in one direction then immediately into the other.

Internally, we can start using Bytes 0.5 now. For example, this is useful when
pulling data out of our blobstore and deserializing as Thrift (or conversely,
when serializing and putting it into our blobstore).

However, when we interface with Tokio (i.e. decoders & encoders), we still have
to use Bytes 0.4.  So, when needed, we convert our Bytes 0.5 to 0.4 there.

The tradeoff idea is that we deal with more bytes internally than we end up
sending to clients, so doing the Bytes conversion closer to the point of
sending data to clients means fewer copies.

We can also start removing those once we migrate to Tokio 0.2 (and newer
versions of Hyper for HTTP services).

Changes that were required:

- You can't extend new bytes (because that implicitly copies). You need to use
  BytesMut instead, which I did where that was necessary (I also added calls in
  the Filestore to do that efficiently).
- You can't create bytes from a `&'a [u8]`, unless `'a` is  `'static`. You need
  to use `copy_from_slice` instead.
- `slice_to` and `slice_from` have been replaced by a `slice()` function that
  takes ranges.

Reviewed By: StanislavGlebik

Differential Revision: D20121350

fbshipit-source-id: eb31af2051fd8c9d31c69b502e2f6f1ce2190cb1
2020-02-27 08:08:28 -08:00
Thomas Orozco
7698cded43 mononoke/hooks: add a signed source hook
Reviewed By: HarveyHunt

Differential Revision: D20139152

fbshipit-source-id: a0a48d447444cf969162f5f9655ab003e7ca2f76
2020-02-27 08:05:14 -08:00
Mateusz Kwapich
6f9f82767c add git identifiers to Source Control Service
Summary: This allows us to translate git hashes

Reviewed By: markbt

Differential Revision: D19972870

fbshipit-source-id: 871a4cf94d468d987221cb08fe7b6135050bac93
2020-02-27 08:05:14 -08:00
Mateusz Kwapich
5825db21c6 add the git<->bonsai translation to mononoke_api crate
Reviewed By: markbt

Differential Revision: D19972871

fbshipit-source-id: 79c0c59f0bd1bd033bf2a8999dbe56b60a7ac085
2020-02-27 08:05:13 -08:00
Mateusz Kwapich
3ff29a8810 make BonsaiGitMapping repo-specific
Summary:
Nearly all of the Mononoke SQL stores are instantiated once per repo but they don't store the `RepositoryId` anywhere so every method takes it as argument. And because providing the repo_id on every call is not ergonomic we tend to add methods to blob_repo that just call the right method with the right repo_id in one of the underlying stores (see `get_bonsai_from_globalrev` on blobrepo for example).

Because my reviewers [pushed back](https://our.intern.facebook.com/intern/diff/D19972871/?transaction_id=196961774880671&dest_fbid=1282141621983439) when I tried to do the same for bonsai_git_mapping, I decided to make it right by adding the repo_id to the BonsaiGitMapping.

Reviewed By: krallin

Differential Revision: D20029485

fbshipit-source-id: 7585c3bf9cc8fa3cbe59ab1e87938f567c09278a
2020-02-27 08:05:13 -08:00
Jun Wu
bce29c9562 nameset: UnionSet
Summary: Similar to Mercurial's smartset.addset.
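
As a rough picture of what a union of two ordered sets does, here is a Python sketch. The actual `UnionSet` is Rust and its exact iteration order is not stated here, so treat the "first set, then the rest of the second" order as an assumption.

```python
class UnionSet:
    """Lazy union of two ordered collections of names."""

    def __init__(self, first, second):
        self._first = first
        self._second = second

    def __iter__(self):
        # Yield everything from the first set, then only the new names.
        for name in self._first:
            yield name
        for name in self._second:
            if name not in self._first:
                yield name

    def __contains__(self, name):
        return name in self._first or name in self._second


print(list(UnionSet(["a", "b"], ["b", "c"])))  # ['a', 'b', 'c']
```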

Reviewed By: sfilipco

Differential Revision: D19912106

fbshipit-source-id: 0d0c8d0b71d2757259d26295eb4a564fea807dea
2020-02-27 07:34:57 -08:00
Jun Wu
c906a21ce1 nameset: initial NameSet abstraction
Summary:
The NameSet is something similar to SpanSet and Mercurial's smartset but speaks
VertexNames instead of Ids. The idea is, NameSet will be part of NameDag APIs,
and potentially replace Mercurial's smartset layer (just smartset the container
types, not the revset language), in a way that revision numbers are completely
hidden behind the scenes.

This diff adds some basic abstraction around iteration-related operations.
Other operations will be added later.

Reviewed By: sfilipco

Differential Revision: D19912109

fbshipit-source-id: 504a26c074282ec51f260535ca63e943124f688e
2020-02-27 07:34:57 -08:00
Genevieve Helsel
9f6c043bfd handle EdenError in checkout path
Summary: EdenFS is planning on throwing an error if a user requests a checkout while a checkout is already in progress. Often, this is already disallowed by a mercurial repository lock, but there are instances where these calls can still get through. We would like to prevent these calls from queuing, so we will throw an `EdenError` instead. Without this handling, a full stack trace prints, so this just makes it a bit prettier for the user.

Reviewed By: simpkins

Differential Revision: D20106480

fbshipit-source-id: e33df3d0b7aa42867ee752e4c1f3a47b31ade76b
2020-02-27 07:30:35 -08:00
Jun Wu
2996eeb273 test-commitcloud-backup-all: opt-in DUMMYSSH_STABLE_ORDER
Summary:
Stabilize the test. Without the change it's relatively easy to reproduce
test breakage. For example, the following command reproduces the breakage
within 20 seconds:

```
% ./run-tests.py test-commitcloud-backup-all.t --loop
 --- test-commitcloud-backup-all.t
+++ test-commitcloud-backup-all.t.err
@@ -45,10 +45,10 @@

   $ hg cloud backup --traceback
   backing up stack rooted at 64164d1e0f82
+  backing up stack rooted at d0d71d09c927
   remote: pushing 2 commits:
   remote:     64164d1e0f82  A1
   remote:     796f1f48de85  B
-  backing up stack rooted at d0d71d09c927
   remote: pushing 2 commits:
   remote:     d0d71d09c927  A2
   remote:     daeeb2f180d6  C

ERROR: test-commitcloud-backup-all.t output changed
[====================] 47 Passed. 1 Failed. 0 Skipped. -48 Remaining        16.2s
[------------>       ] test-commitcloud-backup-all.t                        1.8s
```

Reviewed By: xavierd

Differential Revision: D19872613

fbshipit-source-id: 4b6a48e2c8987ec0fded73a5b88430c1df1f6fb7
2020-02-27 07:24:37 -08:00
Jun Wu
a5066dd552 dummyssh: add a way to stabilize stdout, stderr order
Summary:
The ssh output order issue is a large contributor to test flakiness.
Example test failures are:

```
 --- test-unbundlereplay.t
+++ test-unbundlereplay.t.respondfully.err
@@ -154,9 +154,9 @@
   remote: [ReplayVerification] Expected: (master_bookmark, c2e526aacb5100b7c1ddb9b711d2e012e6c
69cda). Actual: (master_bookmark, 893d83f11bf81ce2b895a93d51638d4049d56ce2)
   remote: pushkey-abort: prepushkey hook exited with status 1
   remote: transaction abort!
+  replay failed: error:pushkey
+  unbundle replay batch item #0 failed
   remote: rollback completed
-  replay failed: error:pushkey
-  unbundle replay batch item #0 failed
   [1]
   $ cat $TESTTMP/reports.txt
   unbundle replay batch item #0 failed

 --- test-commitcloud-backup-all.t
+++ test-commitcloud-backup-all.t.err
@@ -59,9 +59,9 @@
   remote: pushing 1 commit:
   remote:     eccc11f58a56  D3
   backing up stack rooted at 42952ab62cec
+  backing up stack rooted at 4903fdffd9c6
   remote: pushing 1 commit:
   remote:     42952ab62cec  E1
-  backing up stack rooted at 4903fdffd9c6
   remote: pushing 1 commit:
   remote:     4903fdffd9c6  E2
   commitcloud: backed up 8 commits

test-fb-hgext-lfspushrebase-verify-blobs.t

 --- test-fb-hgext-treemanifest-pushrebase.t
+++ test-fb-hgext-treemanifest-pushrebase.t.err
@@ -127,9 +127,9 @@
   $ hg push --to master -B master --config treemanifest.sendtrees=True
   pushing to ssh://user@dummy/master
   searching for changes
-  remote: baz
   remote: prepushrebase.cat hook exited with status 1
   abort: push failed on remote
+  remote: baz
   [255]

 - Disable the hook
 ```

The order is nondeterministic because the stderr reading thread can read the
content before or after ui.write or ui.write_err in the main thread.

This diff introduces an optional feature in dummyssh that buffers all stderr
output and only writes it after the wrapped hg serve process has exited, at
which time the hg client should also have completed its operations and has no
reason to ui.write or ui.write_err anything nondeterministically. Then the
dummyssh wrapper writes out the buffered stderr so the output order becomes
well defined.
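
The buffering idea can be sketched roughly as follows. This is a minimal
illustration, not the dummyssh code: the function name
`run_buffering_stderr` and its `err_stream` parameter are hypothetical, and
it assumes the wrapped command's stderr fits comfortably in memory.

```python
import subprocess
import sys


def run_buffering_stderr(argv, err_stream=None):
    """Run argv, pass stdin/stdout through, and hold stderr until exit.

    Hypothetical helper: the name and signature are assumptions made
    for illustration only.
    """
    if err_stream is None:
        err_stream = sys.stderr.buffer
    # Only stderr is piped; stdin and stdout are inherited from the wrapper.
    proc = subprocess.Popen(argv, stderr=subprocess.PIPE)
    # read() returns at EOF, i.e. once the child has closed its stderr.
    buffered = proc.stderr.read()
    returncode = proc.wait()
    # The child has exited, so this write can no longer race with it.
    err_stream.write(buffered)
    err_stream.flush()
    return returncode
```

The trade-off is that remote messages only show up once the remote process
has finished, which is presumably why the feature is opt-in for tests.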

Reviewed By: xavierd

Differential Revision: D19872612

fbshipit-source-id: 84710f98a8e6b4a1c283ffecf008585cca12be0a
2020-02-27 07:24:37 -08:00
Jun Wu
c8157cc25a dummyssh: format it
Summary: This makes the next change easier to see.

Reviewed By: xavierd

Differential Revision: D19872609

fbshipit-source-id: 9263a246258ffd18d8d883da7ced435a91fb5ced
2020-02-27 07:24:37 -08:00
Kostia Balytskyi
7ee657f124 mononoke: asyncify signatures of two fns in getbundle_response
Summary:
## Wider goal
See D20068839

## This diff
Asyncifying only signatures allows us to independently work on function bodies, without touching the callsites later in the diff.

Reviewed By: StanislavGlebik

Differential Revision: D20097804

fbshipit-source-id: f1391a055947c7802f719bc99b9eae71a4ac39cd
2020-02-27 05:01:52 -08:00
Kostia Balytskyi
bd90a843a7 mononoke: asyncify diff_with_parents in getbundle_response
Summary:
## Wider goal
See D20068839

## This diff
Let's modernize this particular function.

Reviewed By: StanislavGlebik

Differential Revision: D20097800

fbshipit-source-id: a919b5ad1b544a7b784668ca265e24c375100fa3
2020-02-27 05:01:51 -08:00
Kostia Balytskyi
90b03f5a0d mononoke: call old-style Future OldFuture in getbundle_response
Summary:
## Wider goal
See D20068839

## This diff
This file contains a mix of old and new-style futures. It even has futures,
which have items composed of futures. To be able to convert on one of the
levels and not the other, we need to deal with the confusion.

Let's have old things have `Old` in the name.

Reviewed By: StanislavGlebik

Differential Revision: D20097803

fbshipit-source-id: fedb3669ef34a8328ec389a30ff2c512ab363818
2020-02-27 05:01:51 -08:00
Kostia Balytskyi
4f2993c765 mononoke: move bundle generation bits from hg_sync_job into getbundle_response
Summary:
## Wider goal
We want the flexibility to return hydrated responses for `getbundle` wireproto
requests for draft commits. This means that the responses will contain not
only the commit data (as they do now), but also trees and files.
For context, when an "unhydrated" response is returned for the `getbundle`
request for a draft commit, we expect one of two things to happen later
in the e2e scenario:
- either `hg` client would immediately make another wireproto request
  (`gettreepack`, `getpackv1`) within the same client `hg` command execution
- or a subsequent `hg update` call will cause another wireproto request

In any case, another request is needed before the pulled commit can be used.
This request can hit a different server, sometimes it can even be Mercurial
instead of Mononoke. Specifically, it can be Mercurial instead of Mononoke if the
`fallback` path markers are configured incorrectly. In that case we have a
problem, as Mercurial is incapable of serving `gettreepack` or `getpackv1` for
infinitepush commits.

One way to deal with this is to always have correct path markers, which is
prone to human mistakes. Another way is to guarantee that Mononoke returns
everything in the original `getbundle` request. We don't want to do this for
public commits, as `pull`s of public commits typically fetch thousands of those
commits and never care about tree or file data for all but one of them. Draft
commits are different however, as they are usually exactly what the client
intends to use, so hydrating those is fine. Still, we want this behavior to
be gated behind a config flag.

## This diff
A lot of the needed code is already implemented in the hg-sync job, bundle
generating variant. So prior to implementing the actual behavior described
above, let's move the relevant bits to `getbundle_response`. Later we can comb
them up a bit (asyncify) and use to implement the needed behavior.

Reviewed By: StanislavGlebik

Differential Revision: D20068839

fbshipit-source-id: 0ab63d57b2d167401b7ee8864fe7760f5f65f8ec
2020-02-27 05:01:51 -08:00
Kostia Balytskyi
aac7bff59d mononoke: pull config schema changes from configerator
Summary:
This is the moral equivalent of D20115877 in fbcode. See that diff for
motivation.

Reviewed By: StanislavGlebik

Differential Revision: D20118575

fbshipit-source-id: 8f77f572068e611003b1344be3434f2d04ec56ca
2020-02-27 05:01:50 -08:00
Stanislau Hlebik
d5d3061168 mononoke: distinguish derived data waits with derived data generation
Summary:
Previously it was hard to tell whether the process was actually responsible
for generating derived data or it was just waiting for it to be generated.

Let's make this distinction clearer.

Reviewed By: johansglock

Differential Revision: D20138284

fbshipit-source-id: 52ae12679db2f61869f048baf2a603b456710a71
2020-02-27 03:15:39 -08:00
Adam Simpkins
3d1962ec1e add a context manager API TelemetrySample
Summary:
Add `__enter__()` and `__exit__()` methods to `TelemetrySample` so it can be
used in `with` statements.  It will automatically track the runtime for the
body of the `with` context, and will record this in the `duration` field of
the sample.  It will also set the `success` field to True if the context exits
normally and False if it exits due to an exception.  On an exception the
`error` field will also be populated with the exception message.
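
The behaviour described above could look roughly like this. It is a sketch
under assumptions: the real `TelemetrySample` has more to it, and the
`fields` dictionary used here to hold values is a hypothetical stand-in for
however the class actually stores them.

```python
import time
from typing import Dict, Optional, Union


class TelemetrySample:
    """Illustrative stand-in; only the with-statement behaviour is modelled."""

    def __init__(self) -> None:
        # Hypothetical storage for the sample's fields.
        self.fields: Dict[str, Union[bool, float, str]] = {}
        self._start: Optional[float] = None

    def __enter__(self) -> "TelemetrySample":
        self._start = time.monotonic()
        return self

    def __exit__(self, exc_type, exc_value, exc_tb) -> bool:
        assert self._start is not None
        self.fields["duration"] = time.monotonic() - self._start
        self.fields["success"] = exc_type is None
        if exc_type is not None:
            self.fields["error"] = str(exc_value)
        # Returning False lets the exception keep propagating to the caller.
        return False
```

A caller would then wrap the measured operation in
`with TelemetrySample() as sample:` and read the fields afterwards.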

Reviewed By: genevievehelsel

Differential Revision: D20112723

fbshipit-source-id: d55ac3f1b53c23dc001f92a4f8eae431db8954e1
2020-02-26 21:18:11 -08:00
Adam Simpkins
8ec16c8413 add TelemetryLogger that logs directly using scubadata_py3
Summary:
Add a TelemetryLogger class that logs directly to scuba, and use that if we
are building in a Facebook environment.

Reviewed By: genevievehelsel

Differential Revision: D20112727

fbshipit-source-id: 284ca45d1902d51b753ff9a90debf3dfa8282f82
2020-02-26 21:18:11 -08:00
Adam Simpkins
2557cebfd7 add a TelemetryLogger interface
Summary:
Add a `TelemetryLogger` class that abstracts the mechanism we use to log
telemetry samples.  This makes it possible to plug in alternative
implementations.

This includes 3 initial implementations of this class:
* `ExternalTelemetryLogger` logs samples by calling an external command
* `LocalTelemetryLogger` logs JSON samples to a local file
* `NullTelemetryLogger` simply discards all samples

This also moves some of the helper code for constructing telemetry samples
from the `EdenInstance` class and into `TelemetryLogger`.
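
A rough sketch of how such an interface might be laid out is shown below.
The class names come from this summary, but the `log_sample()` method name,
the constructor arguments, and the way the external command receives the
sample (as a trailing JSON argument) are assumptions for illustration.

```python
import abc
import json
import subprocess
from typing import Dict, List, Union

Sample = Dict[str, Union[bool, int, float, str]]


class TelemetryLogger(abc.ABC):
    """Abstracts the mechanism used to record a telemetry sample."""

    @abc.abstractmethod
    def log_sample(self, sample: Sample) -> None:
        """Record one sample (method name is an assumption)."""


class ExternalTelemetryLogger(TelemetryLogger):
    def __init__(self, cmd: List[str]) -> None:
        self.cmd = cmd

    def log_sample(self, sample: Sample) -> None:
        # Assumption: the command takes the JSON sample as its last argument.
        subprocess.run(self.cmd + [json.dumps(sample)], check=False)


class LocalTelemetryLogger(TelemetryLogger):
    def __init__(self, path: str) -> None:
        self.path = path

    def log_sample(self, sample: Sample) -> None:
        # One JSON object per line, appended to a local file.
        with open(self.path, "a") as f:
            f.write(json.dumps(sample) + "\n")


class NullTelemetryLogger(TelemetryLogger):
    def log_sample(self, sample: Sample) -> None:
        pass  # discard the sample
```

With this shape, callers depend only on `TelemetryLogger`, so tests can
pass a `NullTelemetryLogger` without touching any logging backend.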

Reviewed By: genevievehelsel

Differential Revision: D20112725

fbshipit-source-id: dbe24952a92fe548631fc169f146cc14008a7bb6
2020-02-26 21:18:10 -08:00
Adam Simpkins
0642f1618d report the fb303 status in the getDaemonInfo() result
Summary:
Update the thrift `getDaemonInfo()` call to also return the fb303 status.
This allows the CLI to make a single thrift call instead of 2 when checking if
the EdenFS daemon is healthy.

Reviewed By: genevievehelsel

Differential Revision: D20130406

fbshipit-source-id: 9d25341e1d5f82fb1a921e1d7b1ebd34bcf19dc8
2020-02-26 21:03:52 -08:00
Adam Simpkins
436b5bb258 fix thrift timeouts in eden restart
Summary:
Fix the `check_health()` function to always set a timeout when querying for
EdenFS's health.  Originally we used to always set a default timeout of 60
seconds when creating thrift connections to EdenFS, but this was removed in
D5942205.  In practice we ideally really want a handful of specific thrift
calls (e.g., `checkOutRevision()`, `getScmStatusV2()`) to have extremely high
timeouts, but most other calls should have fairly short timeouts.

For now this ensures that we apply a 3 second timeout by default when checking
for EdenFS health.  The `edenfsctl status` call did explicitly set a 15 second
timeout, but other commands like `edenfsctl clone` and `edenfsctl restart`
would also check for health and were not applying their own timeout.

Also add thrift timeout for the `initiateShutdown()` call when doing a full
restart in `edenfsctl restart`.

Reviewed By: chadaustin

Differential Revision: D20130405

fbshipit-source-id: c59118dbcafc2ed0d29206e33891f1a58da8c05f
2020-02-26 21:03:52 -08:00
Michael Devine
0a46a14017 Repo converter: New class "repomanifest"
Summary:
Right now, all of our manifest parsing and evaluation is in the repo() class, but this is a design mistake. Over a repo's convert lifetime, a single repo will have many different manifests, based on branch, and location in the commit history. What's worse is that the current design makes it hard to build unit tests and new features like include evaluation.

This commit creates a whole new class called repomanifest, that represents a specific manifest (and its included files). It also has unit tests to test the various operations that the manifest performs, such as path and revision mapping. This commit does not modify the existing converter code outside of the class to use this new implementation.

Reviewed By: tchebb

Differential Revision: D19402995

fbshipit-source-id: b97dadcc595c6332f4495460618317194873a780
2020-02-26 17:25:22 -08:00
Jun Wu
251fe1b775 sshpeer: always read all stderr messages
Summary:
In the past I saw test breakages where the stderr from the remote ssh process
becomes incomplete. It's hard to reproduce by running the tests directly.
But inserting a sleep in the background stderr thread exposes it trivially:

```
# sshpeer.py:class threadedstderr
     def run(self):
         # type: () -> None
         while not self._stop:
             buf = self._stderr.readline()
+            import time
+            time.sleep(5)
             if len(buf) == 0:
                 break
```

Example test breakage:

```
 --- a/test-commitcloud-sync.t
+++ b/test-commitcloud-sync.t.err
@@ -167,8 +167,7 @@ Make a commit in the first client, and sync it
   $ hg cloud sync
   commitcloud: synchronizing 'server' with 'user/test/default'
   backing up stack rooted at fa5d62c46fd7
   remote: pushing 1 commit:
-  remote:     fa5d62c46fd7  commit1
   commitcloud: commits synchronized
   finished in * (glob)
....
```

Upon investigation it's caused by 2 factors:
- The connection pool calls pipee.close() before pipeo.close(), to work around
  an issue that I suspect was solved by D19794281.
- The new threaded stderr (pipee)'s close() method does not actually close the
  pipe immediately. Instead, it limits the text to read to one more line at
  most, which causes those incomplete messages.

This diff made the following changes:
- Remove the `pipee.close` workaround in connectionpool.
- Remove `pipee.close`. Embed it in `pipee.join` to prevent misuses.
- Add detailed comments in sshpeer.py for the subtle behaviors.
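
The intended behaviour (read until EOF, and only close the pipe as part of
`join`) can be illustrated with a small sketch. This is not the sshpeer.py
code: the class name `StderrDrainThread` and the `sink` callback are
hypothetical.

```python
import threading


class StderrDrainThread(threading.Thread):
    """Forwards every stderr line to sink until EOF (illustrative only)."""

    def __init__(self, stream, sink):
        super().__init__(daemon=True)
        self._stream = stream
        self._sink = sink

    def run(self):
        # Read until EOF; there is no early-stop flag that could drop lines.
        for line in iter(self._stream.readline, b""):
            self._sink(line)

    def join(self, timeout=None):
        super().join(timeout)
        # Close only once the reader has finished, so nothing is truncated.
        if not self.is_alive():
            self._stream.close()
```

Because there is no separate `close()` to call, a caller cannot cut the
reader off before the remote side has finished writing.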

Reviewed By: xavierd

Differential Revision: D19872610

fbshipit-source-id: 4b61ef8f9db81c6c347ac4a634e41dec544c05d0
2020-02-26 17:08:23 -08:00
Jun Wu
7f38170116 sshpeer: call cleanup on close
Summary:
This makes `peer.close()` actually close the ssh connection if it's an
sshpeer. This affects the `clone` path to actually clean up the ssh connection
so we don't depend on (fragile) `__del__`.

I traced the code back to peerrepository.close in 2011 [1]. At that time it
seems the codebase depends on `__del__`. Nowadays the codebase calls `close()`
properly so I think it's reasonable to make the change.

[1]: https://www.mercurial-scm.org/repo/hg/rev/d747774ca9da.

Reviewed By: ikostia

Differential Revision: D19911393

fbshipit-source-id: ea640d1cd82ffcb786e22f47da8116c7f50a4690
2020-02-26 17:08:23 -08:00
Jun Wu
6465cda913 clone: add a "clonepreclose" function
Summary:
The added function can be used by extensions to run extra logic before the
"clone" function closes the repos or peers.

This is needed to make the next diff work. Otherwise extensions like remotenames will try to write to a closed sshpeer and cause errors.

Reviewed By: DurhamG

Differential Revision: D19911390

fbshipit-source-id: ca1364e808cebb632e051fbbdcfe4bf0dca721bc
2020-02-26 17:08:23 -08:00
David Tolnay
de96589260 autocargo: Strip line comments
Summary:
These comments end up being a source of churn as we roll out D20125635, and anyway are not particularly meaningful after the transformations performed by autocargo. For example:

```
bytes = { version = "0.4", features = ["serde"] } # todo: remove
```

^ This doesn't mean the generated Cargo.toml intends to drop its bytes dependency altogether, but just that it will be migrated to a different version that is present in the third-party/rust/Cargo.toml but not visible in the generated Cargo.toml.

Reviewed By: jsgf

Differential Revision: D20128612

fbshipit-source-id: a9e7b29ddc4b26bc47a626dd73bdaa4771ee7b18
2020-02-26 16:31:52 -08:00
Stanislau Hlebik
98f6d5d1a8 mononoke: fix walker filenode walks
Summary:
Since Mononoke's filenodes were migrated to derived data framework
hg_linknode_populated alarm has been firing. The main reason was that there's
now a delay between hg changeset being generated and filenodes being generated.

This diff fixes it by making sure walker won't visit hg changesets without
generated filenodes (note that walker will visit these changesets later, after filenodes are
generated).

Reviewed By: ahornby

Differential Revision: D20067615

fbshipit-source-id: 285e9a3d8c89b85441491c889a8458c86ca0e3a8
2020-02-26 15:21:53 -08:00
Adam Simpkins
0ffcf3e450 update the Rust print_status() function to take an IO parameter
Summary:
Update the `print_status()` function to take a `clidispatch::io::IO` object as
a parameter, instead of a simple output object.  This will allow us to also
print error messages from this function in a future diff.

Reviewed By: quark-zju

Differential Revision: D19958504

fbshipit-source-id: bf482fdc4420e1350363a730c6a539cd760aef25
2020-02-26 14:54:40 -08:00
Durham Goode
430f047eda py3: fix flat dirstate parsing/packing
Summary: Updates the C code to support unicode filenames and states.

Reviewed By: simpkins

Differential Revision: D19786275

fbshipit-source-id: e7aeb029b792818b1b1a9c5d3028640b56522235
2020-02-26 12:53:25 -08:00
Xavier Deguillard
76dd52a310 infinitepush: only open a transaction when deleting bookmarks
Summary: There is no need to open a transaction otherwise.

Reviewed By: DurhamG

Differential Revision: D20109840

fbshipit-source-id: e47adaaeea2d7565f3629701d8de4a67d4b55182
2020-02-26 10:27:05 -08:00
Durham Goode
f188acb4e0 recover: don't verify the repo
Summary:
Verifying the changelog is quite slow and we've had more users needing
to run hg recover these days. Let's finally get rid of the verify step.

Reviewed By: simpkins

Differential Revision: D20109706

fbshipit-source-id: a512d9e11716514bce986b0e3a26347fe6afd955
2020-02-26 09:07:08 -08:00
Aida Getoeva
8f09d5a51b hg-py3: fix the last amend commands
Summary: Most of the fixes are related to encoding in `patch.py`

Reviewed By: DurhamG

Differential Revision: D19713378

fbshipit-source-id: 66ccbd0fc7826ab2d4c05173c7e9edb96700d106
2020-02-26 08:26:13 -08:00
Aida Getoeva
585899f419 mononoke/scs: use last change in file history
Summary:
There is no need to generate expensive file history stream if only one node is requested.

I refactored code that generated stream of history commits, so it'd first yield the nodes and only then prefetch their parents. That will help to solve latency problem for the history request for only a single commit.

I removed BFS queue and added two state variables: ready nodes and already processed:
* The last are the nodes that were returned as a part of a history stream on the last iteration and now can be used to construct the next BFS layer: prefetch fastlog batches, fill the commit graph, take parents in BFS order to form a new bunch of nodes.
* The first are used if it's the first iteration: there are no processed nodes yet, but there are some that are ready to be returned.

I believe that by removing the queue I simplified the code and logic a little bit.
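
The "yield first, prefetch afterwards" shape can be sketched as a
generator. This is a simplified illustration in Python rather than the
actual implementation: `history_layers` and `fetch_parents` are
hypothetical names, and fastlog batches and the commit graph are reduced to
a single parent lookup.

```python
from typing import Callable, Hashable, Iterable, Iterator, List, Set


def history_layers(
    start: Iterable[Hashable],
    fetch_parents: Callable[[Hashable], Iterable[Hashable]],
) -> Iterator[Hashable]:
    """Yield history nodes layer by layer, fetching parents lazily."""
    ready: List[Hashable] = list(start)
    seen: Set[Hashable] = set(ready)
    while ready:
        # Yield first: a consumer that wants one node never triggers a fetch.
        yield from ready
        processed, ready = ready, []
        # Only now build the next BFS layer from the nodes just returned.
        for node in processed:
            for parent in fetch_parents(node):
                if parent not in seen:
                    seen.add(parent)
                    ready.append(parent)
```

Calling `next()` once on the generator returns the first node without any
call to `fetch_parents`, which is the latency win for single-commit
requests.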

Reviewed By: StanislavGlebik

Differential Revision: D19818100

fbshipit-source-id: c30d28c623464ba3552a00e8542552f7655076ef
2020-02-26 08:09:12 -08:00
Alex Hornby
04e011525a mononoke: walker: test validate scuba logging for non-public commits
Summary: add test for scuba logging for non-public commits

Reviewed By: StanislavGlebik

Differential Revision: D20093721

fbshipit-source-id: eb0792bcae8ea27c11709181390efb0ac0c817ee
2020-02-26 06:16:29 -08:00
Stanislau Hlebik
7076fac933 mononoke: add exponential backoff
Summary:
During our tests we noticed that we can send too many blobstore read requests to the
mapping. Let's add exponential backoff to prevent that
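
As a generic illustration of the technique (not the Mononoke code, which is
Rust), a polling loop with exponential backoff might look like this. The
function name, the default delays, and the injectable `sleep` parameter are
all assumptions.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_with_backoff(
    fetch: Callable[[], Optional[T]],
    *,
    initial_delay: float = 0.1,
    factor: float = 2.0,
    max_delay: float = 5.0,
    max_attempts: int = 8,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call fetch() until it returns a value, waiting longer each time."""
    delay = initial_delay
    for attempt in range(max_attempts):
        value = fetch()
        if value is not None:
            return value
        if attempt + 1 < max_attempts:
            sleep(delay)
            # Grow the wait geometrically, but never beyond max_delay.
            delay = min(delay * factor, max_delay)
    return None
```

Capping the delay keeps the worst-case wait bounded while still cutting the
number of reads issued during a long derivation.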

Reviewed By: ikostia

Differential Revision: D20116043

fbshipit-source-id: 6fecbda4c36a5065b77ba9df561c6d9c6a969089
2020-02-26 05:05:33 -08:00
Thomas Orozco
4ca1333b8a mononoke/hooks: use a smaller test group for faster tests
Reviewed By: ikostia

Differential Revision: D20115985

fbshipit-source-id: 4f69fc84eee352bcc689918527c6d460fcf672ba
2020-02-26 04:44:39 -08:00
Thomas Orozco
c14a88bbef mononoke: convert places that talk to Memcache to Bytes 0.5
Summary:
Memcache doesn't care (because both old and new Bytes do `Into<IOBuf>`), but
Thrift is Bytes 0.5. We have our caching ext layer in the middle, which wants
Bytes 0.4. This means we end up copying things we don't need to copy.

Let's update to fewer copies. I didn't update apiserver, because a) it's going
away, and b) those bytes go into Actix, and Actix isn't upgrading to Bytes 0.5
any time soon! Besides, this doesn't actually need updating besides tests anyway.

Reviewed By: dtolnay

Differential Revision: D20006062

fbshipit-source-id: 42766363a0ff8494f18349bcc822b5238e1ec0cd
2020-02-26 03:30:47 -08:00
Adam Simpkins
08f86af0a4 enable strict type checking in telemetry.py
Summary: Enable `pyre-strict` mode in eden/cli/telemetry.py

Reviewed By: genevievehelsel

Differential Revision: D20102260

fbshipit-source-id: 0e5030f99852eb07dc427ba80cc30334adea4bfb
2020-02-25 19:01:10 -08:00
Adam Simpkins
f0cf7fec98 update the telemetry wrapper to log the current code version
Summary:
Add methods to `version.py` to get the version of the current running Eden CLI
code, rather than looking for the current installed RPM version.  This means
that we no longer have to execute a separate subprocess that examines the RPM
database.  This also makes sure we log the correct version information in
cases where developers are testing local development code even though they
have a different RPM version currently installed.

Reviewed By: genevievehelsel

Differential Revision: D20102259

fbshipit-source-id: ba9eb0c563c7f7c929170b130566946a67f679a5
2020-02-25 19:01:10 -08:00
Adam Simpkins
9ee7b23604 update RPM version code to return Optional[Tuple[str, str]]
Summary:
Update `get_installed_eden_rpm_version_parts()` to simplify the return type
from `Tuple[Optional[str], Optional[str]]` to `Optional[Tuple[str, str]]`

This also improves the output of `get_installed_eden_rpm_version()` when the
RPM is not installed so that it returns `<Not Installed>` rather than
`<Not Installed>-` with a trailing dash.
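
The benefit of the simpler return type can be seen in a small sketch. The
helper name `format_rpm_version` is hypothetical; it only shows how a
single `None` check replaces two per-field checks.

```python
from typing import Optional, Tuple


def format_rpm_version(parts: Optional[Tuple[str, str]]) -> str:
    """Format (version, release), or a placeholder if nothing is installed."""
    if parts is None:
        # One check covers the "not installed" case; no stray dash remains.
        return "<Not Installed>"
    version, release = parts
    return f"{version}-{release}"
```

With the old `Tuple[Optional[str], Optional[str]]` shape, a caller had to
handle each element separately, which is how a trailing dash could slip in.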

Additionally this updates the telemetry logging to include the full
version+release string.  With our current version number scheme there can be
multiple packages with the same version but different release numbers if we
release multiple packages within a single day.

Reviewed By: genevievehelsel

Differential Revision: D20102263

fbshipit-source-id: 24d2df4cdca6ac576267be66b85422c3e50f1229
2020-02-25 19:01:09 -08:00
Adam Simpkins
4ee1a29578 move code to get the running EdenFS version to EdenInstance
Summary:
Move the `get_running_eden_version()` functions from the `version.py` module
into the `EdenInstance` class in `config.py`.  This helps eliminate some
circular dependency cycles in the code, so I can start breaking a few modules
out of the main CLI `lib` library.

I also changed the return type of `get_running_version_parts()` from
`Tuple[Optional[str], Optional[str]]` to just `Tuple[str, str]`.  A dev build
of EdenFS already returns empty strings (rather than `None`) for the version
and release fields).  There shouldn't really be any cases where `None` is
returned here, and even if there were I don't think we would ever care to
distinguish this from the empty string case.

Reviewed By: genevievehelsel

Differential Revision: D20102262

fbshipit-source-id: 564ec5ee820026a0c86c70ad0d7cfd3750ad94f5
2020-02-25 19:01:09 -08:00
Genevieve Helsel
35c8305d13 scuba logging eden full restart
Summary: Log when a user runs a normal (full) restart, including success or not. Success is determined by the return code of `start_daemon()` (which calls `subprocess.call()`), similar to the success criteria for graceful restart logging.

Reviewed By: fanzeyi

Differential Revision: D20098949

fbshipit-source-id: 0c6f4927571f686ed6b678d5c814f76c78322274
2020-02-25 15:31:44 -08:00
Genevieve Helsel
0c908acc0d scuba logging eden doctor calls
Summary: log when a user runs eden doctor, and log how many errors they encounter

Reviewed By: fanzeyi

Differential Revision: D20084617

fbshipit-source-id: 122a062c538931eb906cbfcd515ec1e8093efc38
2020-02-25 15:31:43 -08:00
Genevieve Helsel
88851bc88d add no-op logging to FakeEdenInstance
Summary: This is required for eden doctor cli tests when adding logging to the eden doctor code path. This can just be a stub since we don't consume these scuba log statements during testing

Reviewed By: fanzeyi

Differential Revision: D20087861

fbshipit-source-id: 6805ae8d9c51e33a118cbda76461483962e876f3
2020-02-25 15:31:43 -08:00
Genevieve Helsel
0528daf796 add type annotation in check_filesystems
Summary: the TypeCheck test cases were yelling at me because of this annotation missing when running locally, so adding it to fix those tests.

Reviewed By: fanzeyi

Differential Revision: D20098619

fbshipit-source-id: 630e7bca2b63033b34d72d1c739184819d3d86a3
2020-02-25 15:31:43 -08:00
Jeff Zhang
33140b117c Push compat down one level in eden/mononoke/cmds/admin/main.rs
Summary: Moving `compat` one level down to the call sites of subcommand functions.

Reviewed By: farnz

Differential Revision: D20085398

fbshipit-source-id: 461e147d2ae6e560b3a75fb92fa6b23f9f54d13e
2020-02-25 10:22:03 -08:00
Zeyi (Rice) Fan
2222dbc1a5 fix HgPrefetchTest
Summary:
The problem is that the datapack files are not flushed to disk when it is prefetched. By having a pair of brackets around the `HgBackingStore`, it will ensure the `HgImporter` is closed by the time when we verify the prefetch with `hg cat` since it will terminate the `debugedenimporthelper` process in its destructor, which flushes the datapack files.

The real cause of the test failure is still unclear but I believe this is the correct way of doing this test.

Reviewed By: xavierd

Differential Revision: D20090249

fbshipit-source-id: 8e3966936a402c92311919433282027846d065e8
2020-02-25 10:14:29 -08:00
Puneet Kaushik
2b19eb7c17 Define directory types for Windows
Summary: Windows SDK doesn't define dirent. Defining it here for adding Inodes support on Edenfs on Windows.

Reviewed By: simpkins

Differential Revision: D19956272

fbshipit-source-id: 1bdf9a7563c194fe38008741b09668242ffa64ee
2020-02-25 10:14:29 -08:00
Puneet Kaushik
ca40c6f0f4 Update log level and remove async
Summary:
Logging on Windows doesn't work when the async is set. We haven't debugged it yet. Removing the async mode flag until we fix that.

Also bumping up the log level to 4. This would help to get more info while we are running in beta.

Reviewed By: simpkins

Differential Revision: D19776609

fbshipit-source-id: ccd6a6ed4d81f4a2edd550c6bb7195ac8b8b4d16
2020-02-25 10:14:28 -08:00
Stanislau Hlebik
19e1e94984 mononoke: add lease renewing to derived data
Summary:
During S196197 the lease expired and we were rederiving the same derived data over and over again for a big commit.
This diff adds lease renewal, which should help with this problem.

Reviewed By: HarveyHunt

Differential Revision: D20093323

fbshipit-source-id: d139abf6659722f47ea40d9b2f279daa03623ff4
2020-02-25 09:22:46 -08:00
Stanislau Hlebik
4bd758289b mononoke: async/await derive_may_panic() function
Reviewed By: HarveyHunt

Differential Revision: D20092945

fbshipit-source-id: 70ec1a8e5b9c99f3853a13bebe3657ece5ff9e9e
2020-02-25 09:22:46 -08:00
Genevieve Helsel
887de5105d scuba log eden rage calls
Summary: log when a user runs eden rage

Reviewed By: simpkins

Differential Revision: D20084529

fbshipit-source-id: a92c5472554cd541c9a7d340edcf6845c1c9c0c0
2020-02-25 08:11:28 -08:00
Stanislau Hlebik
3418318883 mononoke: do not generate hgchangesets unnecessarily in FilenodesOnlyPublicMapping
Summary:
fetch_root_filenode is called by FilenodesOnlyPublicMapping to figure out if
filenodes were already derived. Previously it first derived the hg changeset and
then looked up the root manifest in the db. However, if the hg changeset is not
derived then filenodes couldn't possibly be derived either, and we can return an
answer faster.

This is useful in the next diff where I change walker

Reviewed By: ahornby

Differential Revision: D20068819

fbshipit-source-id: 17f066c437e0b1f7bbeb8f6e247eadc9afe94f90
2020-02-25 08:07:07 -08:00
Thomas Orozco
f8fcbc9723 mononoke/blobstore_healer: wait for MyRouter properly
Summary:
The blobstore_healer has never waited for MyRouter before querying for slave
status, but it ended up implicitly working because creating a blobstore
required a SQL factory, and creating a SQL factory would result in waiting for
MyRouter.

Now that creating a blobstore doesn't require SQL factory unless you're going
to actually use it (which the healer isn't: it doesn't use a multiplexblob, it
uses the underlying blobstores instead), we no longer wait properly for
MyRouter, so if MyRouter isn't there when we boot, we crash.

This fixes that.

Reviewed By: ahornby

Differential Revision: D20094829

fbshipit-source-id: 82b7e8d893a01049d1f434ee8dff36a877a0d2f4
2020-02-25 07:03:28 -08:00
Alex Hornby
693e8dee0a mononoke: walker: add support for loading by GitSha1 Aliases
Summary:
Add support for loading by GitSha1 Aliases.  This relies on the change to
Alias::GitSha1 earlier in stack.

Reviewed By: ikostia

Differential Revision: D19903577

fbshipit-source-id: 73cdccc04af61fa524c3683851d8af9ae90d31dc
2020-02-25 03:36:06 -08:00
Adam Simpkins
ef04ccf546 replace a bunch of pyre-fixme comments with pyre-ignore
Summary:
D17135557 added a bunch of `pyre-fixme` comments to the EdenFS integration
tests for cases where Pyre cannot detect that some attributes are initialized
by the test case `setUp()` method.

It looks like Pyre's handling of `setUp()` is somewhat incorrect: it looks
like if a class has a `setUp()` method this currently suppresses all
uninitialized attribute errors (even if some attributes really are never
initialized).  However, Pyre does not detect `setUp()` methods inherited from
parent classes, and always warns about uninitialized attributes in this case
even if they are initialized.

Let's change these comments from `pyre-fixme` to `pyre-ignore` since this
appears to be an issue with Pyre rather than with this code.  T62487924 is
open to track adding support for annotating custom constructor methods, which
might help here.  I've also posted in Pyre Q&A about incorrect handling of
`setUp()` in derived classes.

Reviewed By: grievejia

Differential Revision: D19963118

fbshipit-source-id: 9fd13fc8665367e0780f871a5a0d9a8fe50cc687
2020-02-24 18:55:19 -08:00
Michael Devine
69e9601f71 Refactor convert repo into directory
Summary: As I work, it's getting harder and harder to keep my multiple changes from introducing merge conflicts between different branches. We need to break out the repo_source's implementation into a bunch of different files to make it easier to keep things separate.

Reviewed By: zhonglowu, tchebb

Differential Revision: D20015946

fbshipit-source-id: bf954ac581e5ca9e43c091b6b1b4c539c14471f2
2020-02-24 18:07:11 -08:00
generatedunixname89002005287564
d801a85055 eden/integration/persistence_test.py
Reviewed By: simpkins

Differential Revision: D19995899

fbshipit-source-id: 28cf25cb5a4cde8b15f8a4f3199aaa249aade2a3
2020-02-24 15:42:45 -08:00
Adam Simpkins
b22fc79e4b clean up PathRelativizer API usage of Path vs PathBuf
Summary:
Fix the PathRelativizer APIs to accept `Path` and even `str` arguments instead
of just `PathBuf`.  The old code required a `PathBuf`, which often forced
callers to make a copy of the path data.

Reviewed By: quark-zju

Differential Revision: D19958505

fbshipit-source-id: 6fa40dd4b75df4e3faf9ad2ae4f0e4e6595669f6
2020-02-24 15:38:36 -08:00
Thomas Orozco
2a12e2beb6 mononoke/derived_data: log when we start deriving
Summary:
This should give us a slightly better idea of what hosts are doing to
troubleshoot duplicate derivation.

Also, let's make the logging a bit less confusing.

Reviewed By: StanislavGlebik

Differential Revision: D20070619

fbshipit-source-id: 91cc264b7043b8fc8c21c007832fba328ef0017d
2020-02-24 12:03:41 -08:00
Thomas Orozco
b3bebee0b4 mononoke: include DB config in multiplexed blobstore configuration
Summary:
This updates our multiplexed blobstore configuration to carry its own DB
config. The upshot of this change is that we can move the blobstore sync queue
(a fairly unruly table) to its own DB.

Another nice side effect of this is that it cleans up a bunch of other code, by
finally decoupling the blobstore config from the DB config. For example,
places that need to instantiate a blobstore can now do so even without a DB
config (such as wireproto logging).

Obviously, this cannot land until we update the configs to include this. I'll
do so in Configerator prior to landing the diff.

Reviewed By: HarveyHunt

Differential Revision: D19973905

fbshipit-source-id: 79e4ff92cdb989aab4532decd3fe4fd6c55e2bb2
2020-02-24 11:54:45 -08:00
Thomas Orozco
b7185f0f13 mononoke/metaconfig: tidy up blobstore creation
Summary:
I'd like to refactor our multiplex blob to store its DB using a different
shard. In preparation of doing so, let's:

- Extract parsing DB configs from storage configs
- Tidy up some related places that take a reference when they actually need
  ownership (which is sort of wasteful).

Reviewed By: StanislavGlebik

Differential Revision: D19973906

fbshipit-source-id: 82baceb892e9e24e5fd0349ffa5503884c177a7a
2020-02-24 11:54:44 -08:00
Adam Simpkins
8c9899a197 reduce the glog logging level to info (1)
Summary:
Most of EdenFS's main logging is done through folly::logging, however a number
of libraries that we use do logging through glog.  Previously we set glog's
`--minloglevel` setting to `0`, and we use the default `--v=0` setting.
This enabled glog `VLOG` messages, but only those at VLOG level `0`.

Now that the Rust backing store code can fetch directly from memcache, it
links in some additional memcache library code that has some `VLOG(0)`
messages that are logged fairly frequently.  These aren't useful for us to
have in our logs, so reduce the `minloglevel` to `1` for now, which disables
all `VLOG` messages.

Reviewed By: genevievehelsel

Differential Revision: D20050589

fbshipit-source-id: 167e301d61e46ae3c19975e0c9233eda371495c0
2020-02-24 11:34:01 -08:00
Xavier Deguillard
401d44916b add lfs_protocol to autocargo
Summary: Now that it no longer depends on mononoke_types, we can build it with cargo

Reviewed By: krallin

Differential Revision: D20070438

fbshipit-source-id: 1b2f9cc3640c58fd38e962c7c738d08cbb22a71d
2020-02-24 11:12:45 -08:00
Xavier Deguillard
934b64397b convert to bytes 0.5
Summary:
Bytes 0.5 is a dependency of newer tokio; it's also newer, and thus better.
Staying on 0.4 means that copies between Bytes 0.4 and 0.5 need to be done,
this will be especially bad in the LFS code since 10+MB buffer will have to be
copied...

One main API change is for the configparser. The code used to take Into<Bytes>
for the keys, I switched it to AsRef<[u8]>.

For hg_memcache_client, an extra copy is performed to build a Delta, since this
code uses an old tokio, and is being replaced right now, the effort of
switching to a new tokio and new bytes was not deemed worth it, the copy will
do for now.

Reviewed By: dtolnay

Differential Revision: D20043137

fbshipit-source-id: 395bfc3749a3b1bdfea652262019ac6a086e61e0
2020-02-24 10:28:46 -08:00